acoustics speech communication: Topics by Science.gov

Sample records for acoustics speech communication

Acoustical conditions for speech communication in active elementary school classrooms

NASA Astrophysics Data System (ADS)

Sato, Hiroshi; Bradley, John

2005-04-01

Detailed acoustical measurements were made in 34 active elementary school classrooms with a typical rectangular room shape in schools near Ottawa, Canada. There was an average of 21 students in classrooms. The measurements were made to obtain accurate indications of the acoustical quality of conditions for speech communication during actual teaching activities. Mean speech and noise levels were determined from the distribution of recorded sound levels and the average speech-to-noise ratio was 11 dBA. Measured mid-frequency reverberation times (RT) during the same occupied conditions varied from 0.3 to 0.6 s, and were a little less than for the unoccupied rooms. RT values were not related to noise levels. Octave band speech and noise levels, useful-to-detrimental ratios, and Speech Transmission Index values were also determined. Key results included: (1) the average vocal effort of teachers corresponded to louder than Pearsons' "Raised" voice level; (2) teachers increase their voice level to overcome ambient noise; (3) effective speech levels can be enhanced by up to 5 dB by early reflection energy; and (4) student activity is seen to be the dominant noise source, increasing average noise levels by up to 10 dBA during teaching activities. [Work supported by CLLRnet.]
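As a worked illustration of the level arithmetic behind quantities such as a speech-to-noise ratio in dB and a level gain from early reflection energy, the sketch below sums sound levels on an energy basis. The function names and the example levels are hypothetical and are not data from the study.

```python
import math

def combine_levels(*levels_db):
    """Energy-sum several sound levels given in dB (incoherent addition)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))

def speech_to_noise_ratio(speech_db, noise_db):
    """Speech-to-noise ratio in dB is the difference of the two levels."""
    return speech_db - noise_db

# Hypothetical example values, chosen only for illustration.
direct_db = 55.0   # direct speech level at the listener
early_db = 53.0    # level of early reflection energy
noise_db = 47.0    # ambient noise level

effective_db = combine_levels(direct_db, early_db)
gain = effective_db - direct_db                      # gain due to early reflections
snr = speech_to_noise_ratio(effective_db, noise_db)  # ratio using the effective level

print(f"effective level {effective_db:.1f} dB, gain {gain:.1f} dB, SNR {snr:.1f} dB")
```

Two equal levels combine to a level about 3 dB higher, so a gain approaching 5 dB would require early reflection energy that exceeds the direct energy.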
Shared acoustic codes underlie emotional communication in music and speech: Evidence from deep transfer learning.

PubMed

Coutinho, Eduardo; Schuller, Björn

2017-01-01

Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine Learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra- (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech, intra-domain models achieved the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain.
Disordered speech disrupts conversational entrainment: a study of acoustic-prosodic entrainment and communicative success in populations with communication challenges

PubMed Central

Borrie, Stephanie A.; Lubold, Nichola; Pon-Barry, Heather

2015-01-01

Conversational entrainment, a pervasive communication phenomenon in which dialogue partners adapt their behaviors to align more closely with one another, is considered essential for successful spoken interaction. While well-established in other disciplines, this phenomenon has received limited attention in the field of speech pathology and the study of communication breakdowns in clinical populations. The current study examined acoustic-prosodic entrainment, as well as a measure of communicative success, in three distinctly different dialogue groups: (i) healthy native vs. healthy native speakers (Control), (ii) healthy native vs. foreign-accented speakers (Accented), and (iii) healthy native vs. dysarthric speakers (Disordered). Dialogue group comparisons revealed significant differences in how the groups entrain on particular acoustic–prosodic features, including pitch, intensity, and jitter. Most notably, the Disordered dialogues were characterized by significantly less acoustic-prosodic entrainment than the Control dialogues. Further, a positive relationship between entrainment indices and communicative success was identified. These results suggest that the study of conversational entrainment in speech pathology will have essential implications for both scientific theory and clinical application in this domain. PMID:26321996
Acoustic Differences between Humorous and Sincere Communicative Intentions

ERIC Educational Resources Information Center

Hoicka, Elena; Gattis, Merideth

2012-01-01

Previous studies indicate that the acoustic features of speech discriminate between positive and negative communicative intentions, such as approval and prohibition. Two studies investigated whether acoustic features of speech can discriminate between two positive communicative intentions: humour and sweet-sincerity, where sweet-sincerity involved…
Acoustic differences between humorous and sincere communicative intentions.

PubMed

Hoicka, Elena; Gattis, Merideth

2012-11-01

Previous studies indicate that the acoustic features of speech discriminate between positive and negative communicative intentions, such as approval and prohibition. Two studies investigated whether acoustic features of speech can discriminate between two positive communicative intentions: humour and sweet-sincerity, where sweet-sincerity involved being sincere in a positive, warm-hearted way. In Study 1, 22 mothers read a book containing humorous, sweet-sincere, and neutral-sincere images to their 19- to 24-month-olds. In Study 2, 41 mothers read a book containing humorous or sweet-sincere sentences and images to their 18- to 24-month-olds. Mothers used a higher mean F0 to communicate visual humour as compared to visual sincerity. Mothers used greater F0 mean, range, and standard deviation; greater intensity mean, range, and standard deviation; and a slower speech rate to communicate verbal humour as compared to verbal sweet-sincerity. Mothers used a rising linear contour to communicate verbal humour, but used no specific contour to express verbal sweet-sincerity. We conclude that speakers provide acoustic cues enabling listeners to distinguish between positive communicative intentions. ©2011 The British Psychological Society.
Optimizing acoustical conditions for speech intelligibility in classrooms

NASA Astrophysics Data System (ADS)

Yang, Wonyoung

High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields (approximately diffuse and non-diffuse) using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work; together with a 1/8 scale-model classroom, it was also used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C. [Livermore, CA]; Holzrichter, John F. [Berkeley, CA]; Ng, Lawrence C. [Danville, CA]

2006-08-08

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2004-03-23

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2006-02-14

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Speech intelligibility in noise using throat and acoustic microphones.

PubMed

Acker-Mills, Barbara E; Houtsma, Adrianus J M; Ahroon, William A

2006-01-01

Helicopter cockpits are very noisy and this noise must be reduced for effective communication. The standard U.S. Army aviation helmet is equipped with a noise-canceling acoustic microphone, but some ambient noise still is transmitted. Throat microphones are not sensitive to air molecule vibrations and thus, transmittal of ambient noise is reduced. It is possible that throat microphones could enhance speech communication in helicopters, but speech intelligibility with the devices must first be assessed. In the current study, speech intelligibility of signals generated by an acoustic microphone, a throat microphone, and by the combined output of the two microphones was assessed using the Modified Rhyme Test (MRT). Stimulus words were recorded in a reverberant chamber with ambient broadband noise intensity at 90 and 106 dBA. Listeners completed the MRT task in the same settings, thus simulating the typical environment of a rotary-wing aircraft. Results show that speech intelligibility is significantly worse for the throat microphone (average percent correct = 55.97) than for the acoustic microphone (average percent correct = 69.70), particularly for the higher noise level. In addition, no benefit is gained by simultaneously using both microphones. A follow-up experiment evaluated different consonants using the Diagnostic Rhyme Test and replicated the MRT results. The current results show that intelligibility using throat microphones is poorer than with the use of boom microphones in noisy and in quiet environments. Therefore, throat microphones are not recommended for use in any situation where fast and accurate speech intelligibility is essential.
Underwater speech communications with a modulated laser

NASA Astrophysics Data System (ADS)

Woodward, B.; Sari, H.

2008-04-01

A novel speech communications system using a modulated laser beam has been developed for short-range applications in which high directionality is an exploitable feature. Although it was designed for certain underwater applications, such as speech communications between divers or between a diver and the surface, it may equally be used for air applications. With some modification it could be used for secure diver-to-diver communications in the situation where untethered divers are swimming close together and do not want their conversations monitored by intruders. Unlike underwater acoustic communications, where the transmitted speech may be received at ranges of hundreds of metres omnidirectionally, a laser communication link is very difficult to intercept and also obviates the need for cables that become snagged or broken. Further applications include the transmission of speech and data, including the short message service (SMS), from a fixed installation such as a sea-bed habitat; and data transmission to and from an autonomous underwater vehicle (AUV), particularly during docking manoeuvres. The performance of the system has been assessed subjectively by listening tests, which revealed that the speech was intelligible, although of poor quality due to the speech algorithm used.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOEpatents

Holzrichter, John F.; Ng, Lawrence C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
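The deconvolution step mentioned in the abstract can be illustrated generically: if the acoustic output of one time frame is modeled as the excitation convolved with a transfer function, dividing the two spectra recovers that transfer function. The sketch below is not the patented method; it assumes a noise-free frame, circular convolution, and an excitation whose spectrum has no zeros, and all names and sample values are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * k * m / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]

def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]

def deconvolve(output, excitation, eps=1e-12):
    """Estimate the impulse response by spectral division Y(f) / X(f)."""
    Y, X = dft(output), dft(excitation)
    # Bins where the excitation has (almost) no energy cannot be recovered.
    H = [y / x if abs(x) > eps else 0.0 for y, x in zip(Y, X)]
    return [value.real for value in idft(H)]

# Hypothetical one-frame example: a simple excitation and a short impulse response.
excitation = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
true_response = [1.0, -0.6, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
acoustic_output = circular_convolve(excitation, true_response)

estimated_response = deconvolve(acoustic_output, excitation)
print([round(v, 6) for v in estimated_response])
```

With measured signals the division is ill-conditioned wherever the excitation spectrum is small, so a practical system would need some form of regularization.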
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOEpatents

Holzrichter, J.F.; Ng, L.C.

1998-03-17

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzrichter, J.F.; Ng, L.C.

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
System And Method For Characterizing Voiced Excitations Of Speech And Acoustic Signals, Removing Acoustic Noise From Speech, And Synthesizi

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2006-04-25

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Speech Intelligibility Advantages using an Acoustic Beamformer Display

NASA Technical Reports Server (NTRS)

Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

2015-01-01

A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).
Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

NASA Astrophysics Data System (ADS)

Heracleous, Panikos; Kaino, Tomomi; Saruwatari, Hiroshi; Shikano, Kiyohiro

2006-12-01

We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a [InlineEquation not available: see fulltext.] word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
Dynamic Encoding of Acoustic Features in Neural Responses to Continuous Speech.

PubMed

Khalighinejad, Bahar; Cruzatto da Silva, Guilherme; Mesgarani, Nima

2017-02-22

Humans are unique in their ability to communicate using spoken language. However, it remains unclear how the speech signal is transformed and represented in the brain at different stages of the auditory pathway. In this study, we characterized electroencephalography responses to continuous speech by obtaining the time-locked responses to phoneme instances (phoneme-related potential). We showed that responses to different phoneme categories are organized by phonetic features. We found that each instance of a phoneme in continuous speech produces multiple distinguishable neural responses occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Comparing the patterns of phoneme similarity in the neural responses and the acoustic signals confirms a repetitive appearance of acoustic distinctions of phonemes in the neural data. Analysis of the phonetic and speaker information in neural activations revealed that different time intervals jointly encode the acoustic similarity of both phonetic and speaker categories. These findings provide evidence for a dynamic neural transformation of low-level speech features as they propagate along the auditory pathway, and form an empirical framework to study the representational changes in learning, attention, and speech disorders. SIGNIFICANCE STATEMENT We characterized the properties of evoked neural responses to phoneme instances in continuous speech. We show that each instance of a phoneme in continuous speech produces several observable neural responses at different times occurring as early as 50 ms and as late as 400 ms after the phoneme onset. Each temporal event explicitly encodes the acoustic similarity of phonemes, and linguistic and nonlinguistic information are best represented at different time intervals. Finally, we show a joint encoding of phonetic and speaker information, where the neural representation of speakers is dependent on phoneme category. These findings provide compelling new evidence for
The a priori SDR Estimation Techniques with Reduced Speech Distortion for Acoustic Echo and Noise Suppression

NASA Astrophysics Data System (ADS)

Thoonsaengngam, Rattapol; Tangsangiumvisai, Nisachon

This paper proposes an enhanced method for estimating the a priori Signal-to-Disturbance Ratio (SDR) to be employed in the Acoustic Echo and Noise Suppression (AENS) system for full-duplex hands-free communications. The proposed a priori SDR estimation technique is modified based upon the Two-Step Noise Reduction (TSNR) algorithm to suppress the background noise while preserving speech spectral components. In addition, a practical approach to determine accurately the Echo Spectrum Variance (ESV) is presented based upon the linear relationship assumption between the power spectrum of far-end speech and acoustic echo signals. The ESV estimation technique is then employed to alleviate the acoustic echo problem. The performance of the AENS system that employs these two proposed estimation techniques is evaluated through the Echo Attenuation (EA), Noise Attenuation (NA), and two speech distortion measures. Simulation results based upon real speech signals guarantee that our improved AENS system is able to mitigate efficiently the problem of acoustic echo and background noise, while preserving the speech quality and speech intelligibility.
Ultrasonic speech translator and communications system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Akerman, M.A.; Ayers, C.W.; Haynes, H.D.

1996-07-23

A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system includes an ultrasonic transmitting device and an ultrasonic receiving device. The ultrasonic transmitting device accepts as input an audio signal such as human voice input from a microphone or tape deck. The ultrasonic transmitting device frequency modulates an ultrasonic carrier signal with the audio signal producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output. 7 figs.
Ultrasonic speech translator and communications system

DOEpatents

Akerman, M.A.; Ayers, C.W.; Haynes, H.D.

1996-07-23

A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system includes an ultrasonic transmitting device and an ultrasonic receiving device. The ultrasonic transmitting device accepts as input an audio signal such as human voice input from a microphone or tape deck. The ultrasonic transmitting device frequency modulates an ultrasonic carrier signal with the audio signal producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output. 7 figs.
Ultrasonic speech translator and communications system

DOEpatents

Akerman, M. Alfred; Ayers, Curtis W.; Haynes, Howard D.

1996-01-01

A wireless communication system undetectable by radio frequency methods for converting audio signals, including human voice, to electronic signals in the ultrasonic frequency range, transmitting the ultrasonic signal by way of acoustical pressure waves across a carrier medium, including gases, liquids, or solids, and reconverting the ultrasonic acoustical pressure waves back to the original audio signal. The ultrasonic speech translator and communication system (20) includes an ultrasonic transmitting device (100) and an ultrasonic receiving device (200). The ultrasonic transmitting device (100) accepts as input (115) an audio signal such as human voice input from a microphone (114) or tape deck. The ultrasonic transmitting device (100) frequency modulates an ultrasonic carrier signal with the audio signal producing a frequency modulated ultrasonic carrier signal, which is transmitted via acoustical pressure waves across a carrier medium such as gases, liquids or solids. The ultrasonic receiving device (200) converts the frequency modulated ultrasonic acoustical pressure waves to a frequency modulated electronic signal, demodulates the audio signal from the ultrasonic carrier signal, and conditions the demodulated audio signal to reproduce the original audio signal at its output (250).
Methods and apparatus for non-acoustic speech characterization and recognition

DOEpatents

Holzrichter, John F.

1999-01-01

By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.
Methods and apparatus for non-acoustic speech characterization and recognition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzrichter, J.F.

By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.
Effects and modeling of phonetic and acoustic confusions in accented speech.

PubMed

Fung, Pascale; Liu, Yi

2005-11-01

Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone is found to lie acoustically between two baseform models and can be equally recognized as either one. We propose that it is necessary to analyze and model these confusions separately in order to improve accented speech recognition without degrading standard speech recognition. Since low phonetic confusion units in accented speech do not give rise to automatic speech recognition errors, we focus on analyzing and reducing phonetic and acoustic confusability under high phonetic confusion conditions. We propose using a likelihood ratio test to measure phonetic confusion and an asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are used in an augmented pronunciation dictionary, while phonetic units with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, with a significant 5.7% absolute reduction in word error rate (WER), without degrading standard speech recognition.
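Because the result above is reported as an absolute WER reduction, a short Python sketch may help separate absolute from relative reductions. It uses the standard word-level edit-distance definition of WER. The function names are hypothetical, and any baseline figures passed to them are the reader's own, since the abstract does not give the baseline WER.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # previous[j] = edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)

def absolute_reduction(baseline_wer, new_wer):
    """Difference in WER, e.g. 0.057 for a 5.7% absolute reduction."""
    return baseline_wer - new_wer

def relative_reduction(baseline_wer, new_wer):
    """Reduction expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer
```

The same absolute reduction corresponds to a larger relative reduction when the baseline WER is low, so the two figures should not be compared directly across papers.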
Study of acoustic correlates associate with emotional speech

NASA Astrophysics Data System (ADS)

Yildirim, Serdar; Lee, Sungbok; Lee, Chul Min; Bulut, Murtaza; Busso, Carlos; Kazemzadeh, Ebrahim; Narayanan, Shrikanth

2004-10-01

This study investigates the acoustic characteristics of four different emotions expressed in speech. The aim is to obtain detailed acoustic knowledge on how a speech signal is modulated by changes from neutral to a certain emotional state. Such knowledge is necessary for automatic emotion recognition and classification and emotional speech synthesis. Speech data obtained from two semi-professional actresses are analyzed and compared. Each subject produces 211 sentences with four different emotions: neutral, sad, angry, and happy. We analyze changes in temporal and acoustic parameters such as magnitude and variability of segmental duration, fundamental frequency and the first three formant frequencies as a function of emotion. Acoustic differences among the emotions are also explored with mutual information computation, multidimensional scaling and acoustic likelihood comparison with normal speech. Results indicate that speech associated with anger and happiness is characterized by longer duration, shorter interword silence, higher pitch and rms energy with wider ranges. Sadness is distinguished from other emotions by lower rms energy and longer interword silence. Interestingly, the differences in formant pattern between [happiness/anger] and [neutral/sadness] are better reflected in back vowels such as /a/ (/father/) than in front vowels. Detailed results on intra- and interspeaker variability will be reported.
Speech recognition: Acoustic phonetic and lexical knowledge representation

NASA Astrophysics Data System (ADS)

Zue, V. W.

1983-02-01

The purpose of this program is to develop a speech data base facility under which the acoustic characteristics of speech sounds in various contexts can be studied conveniently; investigate the phonological properties of a large lexicon of, say, 10,000 words, and determine to what extent the phonotactic constraints can be utilized in speech recognition; study the acoustic cues that are used to mark word boundaries; develop a test bed in the form of a large-vocabulary isolated word recognition (IWR) system to study the interactions of acoustic, phonetic and lexical knowledge; and develop a limited continuous speech recognition system with the goal of recognizing any English word from its spelling in order to assess the interactions of higher-level knowledge sources.
Infant-Directed Visual Prosody: Mothers’ Head Movements and Speech Acoustics

PubMed Central

Smith, Nicholas A.; Strader, Heather L.

2014-01-01

Acoustical changes in the prosody of mothers’ speech to infants are distinct and near universal. However, less is known about the visible properties of mothers’ infant-directed (ID) speech, and their relation to speech acoustics. Mothers’ head movements were tracked as they interacted with their infants using ID speech, and compared to movements accompanying their adult-directed (AD) speech. Movement measures along three dimensions of head translation, and three axes of head rotation were calculated. Overall, more head movement was found for ID than AD speech, suggesting that mothers exaggerate their visual prosody in a manner analogous to the acoustical exaggerations in their speech. Regression analyses examined the relation between changing head position and changing acoustical pitch (F0) over time. Head movements and voice pitch were more strongly related in ID speech than in AD speech. When these relations were examined across time windows of different durations, stronger relations were observed for shorter time windows (< 5 sec). However, the particular form of these more local relations did not extend or generalize to longer time windows. This suggests that the multimodal correspondences in speech prosody are variable in form, and occur within limited time spans. PMID:25242907
Acoustics of Clear Speech: Effect of Instruction

ERIC Educational Resources Information Center

Lam, Jennifer; Tjaden, Kris; Wilding, Greg

2012-01-01

Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…
Effect of classroom acoustics on the speech intelligibility of students.

PubMed

Rabelo, Alessandra Terra Vasconcelos; Santos, Juliana Nunes; Oliveira, Rafaella Cristina; Magalhães, Max de Castro

2014-01-01

To analyze the acoustic parameters of classrooms and the relationship among equivalent sound pressure level (Leq), reverberation time (T₃₀), the Speech Transmission Index (STI), and the performance of students in speech intelligibility testing. A cross-sectional descriptive study, which analyzed the acoustic performance of 18 classrooms in 9 public schools in Belo Horizonte, Minas Gerais, Brazil, was conducted. The following acoustic parameters were measured: Leq, T₃₀, and the STI. In the schools evaluated, a speech intelligibility test was performed on 273 students, 45.4% of whom were boys, with an average age of 9.4 years. The results of the speech intelligibility test were compared to the values of the acoustic parameters with the help of Student's t-test. The Leq, T₃₀, and STI tests were conducted in empty and furnished classrooms. Children showed better results in speech intelligibility tests conducted in classrooms with less noise, a lower T₃₀, and greater STI values. The majority of classrooms did not meet the recommended regulatory standards for good acoustic performance. Acoustic parameters have a direct effect on the speech intelligibility of students. Noise contributes to a decrease in their understanding of information presented orally, which can lead to negative consequences in their education and their social integration as future professionals.
The minor third communicates sadness in speech, mirroring its use in music.

PubMed

Curtis, Meagan E; Bharucha, Jamshed J

2010-06-01

There is a long history of attempts to explain why music is perceived as expressing emotion. The relationship between pitches serves as an important cue for conveying emotion in music. The musical interval referred to as the minor third is generally thought to convey sadness. We reveal that the minor third also occurs in the pitch contour of speech conveying sadness. Bisyllabic speech samples conveying four emotions were recorded by 9 actresses. Acoustic analyses revealed that the relationship between the 2 salient pitches of the sad speech samples tended to approximate a minor third. Participants rated the speech samples for perceived emotion, and the use of numerous acoustic parameters as cues for emotional identification was modeled using regression analysis. The minor third was the most reliable cue for identifying sadness. Additional participants rated musical intervals for emotion, and their ratings verified the historical association between the musical minor third and sadness. These findings support the theory that human vocal expressions and music share an acoustic code for communicating sadness.
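The interval between two pitches can be expressed in semitones as 12·log2(f2/f1). A minor third is 3 semitones in equal temperament, or a 6:5 frequency ratio (about 3.16 semitones) in just intonation. The Python sketch below is a minimal illustration of that arithmetic. The function names and the 0.5-semitone tolerance are assumptions, not the authors' analysis procedure.

```python
import math

JUST_MINOR_THIRD = 6.0 / 5.0  # frequency ratio in just intonation

def interval_semitones(f1_hz, f2_hz):
    """Unsigned size of the interval between two pitches, in semitones."""
    return abs(12.0 * math.log2(f2_hz / f1_hz))

def is_near_minor_third(f1_hz, f2_hz, tolerance=0.5):
    """True when the interval lies within `tolerance` semitones of 3."""
    return abs(interval_semitones(f1_hz, f2_hz) - 3.0) <= tolerance
```

For example, `is_near_minor_third(264.0, 220.0)` holds because 264/220 is exactly the 6:5 ratio, whereas a perfect fifth such as 220 Hz to 330 Hz (about 7 semitones) does not qualify.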
Acoustic richness modulates the neural networks supporting intelligible speech processing.

PubMed

Lee, Yune-Sang; Min, Nam Eun; Wingfield, Arthur; Grossman, Murray; Peelle, Jonathan E

2016-03-01

The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high.
An acoustic comparison of two women's infant- and adult-directed speech

NASA Astrophysics Data System (ADS)

Andruski, Jean; Katz-Gershon, Shiri

2003-04-01

In addition to having prosodic characteristics that are attractive to infant listeners, infant-directed (ID) speech shares certain characteristics of adult-directed (AD) clear speech, such as increased acoustic distance between vowels, that might be expected to make ID speech easier for adults to perceive in noise than AD conversational speech. However, perceptual tests of two women's ID productions by Andruski and Bessega [J. Acoust. Soc. Am. 112, 2355] showed that this is not always the case. In a word identification task that compared ID speech with AD clear and conversational speech, one speaker's ID productions were less well-identified than AD clear speech, but better identified than AD conversational speech. For the second woman, ID speech was the least accurately identified of the three speech registers. For both speakers, hard words (infrequent words with many lexical neighbors) were also at an increased disadvantage relative to easy words (frequent words with few lexical neighbors) in speech registers that were less accurately perceived. This study will compare several acoustic properties of these women's productions, including pitch and formant-frequency characteristics. Results of the acoustic analyses will be examined with the original perceptual results to suggest reasons for differences in listeners' accuracy in identifying these two women's ID speech in noise.
Specific acoustic models for spontaneous and dictated style in Indonesian speech recognition

NASA Astrophysics Data System (ADS)

Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.

2018-03-01

The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.
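The two system designs compared in this record can be sketched in Python as follows. Everything here is a hypothetical stand-in: the decoders are stubs returning fixed `(text, confidence)` pairs, the classifier always predicts one style, and none of the names or values come from the paper. Only the control flow of the two strategies is the point.

```python
def decode_by_style(utterance, classify_style, decoders):
    """System 1: route the utterance to the model for its predicted style."""
    style = classify_style(utterance)
    text, _confidence = decoders[style](utterance)
    return text

def decode_by_confidence(utterance, decoders):
    """System 2: decode with every model and keep the most confident result."""
    results = [decode(utterance) for decode in decoders.values()]
    best_text, _best_confidence = max(results, key=lambda r: r[1])
    return best_text

# Hypothetical stubs standing in for style-specific GMM-HMM decoders.
def spontaneous_decoder(utterance):
    return ("uh hello there", 0.62)

def dictated_decoder(utterance):
    return ("hello there", 0.81)

def toy_classifier(utterance):
    return "spontaneous"

DECODERS = {"spontaneous": spontaneous_decoder, "dictated": dictated_decoder}
```

System 1 decodes once but depends on the classifier being right, while System 2 pays for two decoding passes and depends on the confidence scores being comparable across models.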
Fluid-acoustic interactions and their impact on pathological voiced speech

NASA Astrophysics Data System (ADS)

Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.

2011-11-01

Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically-realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented into a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
Careers in Speech Communication.

ERIC Educational Resources Information Center

Speech Communication Association, New York, NY.

Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…
Multilevel Analysis in Analyzing Speech Data

ERIC Educational Resources Information Center

Guddattu, Vasudeva; Krishna, Y.

2011-01-01

The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
Executives' speech expressiveness: analysis of perceptive and acoustic aspects of vocal dynamics.

PubMed

Marquezin, Daniela Maria Santos Serrano; Viola, Izabel; Ghirardi, Ana Carolina de Assis Moura; Madureira, Sandra; Ferreira, Léslie Piccolotto

2015-01-01

To analyze speech expressiveness in a group of executives based on perceptive and acoustic aspects of vocal dynamics. Four male subjects participated in the research study (S1, S2, S3, and S4). The assessments included the Kingdomality test to obtain the keywords of communicative attitudes; perceptive-auditory assessment to characterize vocal quality and dynamics, performed by three judges who are speech language pathologists; perceptive-auditory assessment to judge the chosen keywords; speech acoustics to assess prosodic elements (Praat software); and a statistical analysis. According to the perceptive-auditory analysis of vocal dynamics, S1, S2, S3, and S4 did not show vocal alterations and all of them were judged to have a lowered habitual pitch. S1: pointed out as insecure, nonobjective, nonempathetic, and unconvincing with inappropriate use of pauses that are mainly formed by hesitations; inadequate separation of prosodic groups with breaking of syntagmatic constituents. S2: considered secure, not very objective, empathetic, and convincing, with regular use of pauses for respiratory reload, organization of sentences, and emphasis. S3: pointed out as secure, objective, empathetic, and convincing with regular use of pauses for respiratory reload and organization of sentences and hesitations. S4: the most secure, objective, empathetic, and convincing, with proper use of pauses for respiratory reload, planning, and emphasis; prosodic groups agreed with the statement, without separating the syntagmatic constituents. The speech characteristics and communicative attitudes were highlighted in two subjects in a different manner, in such a way that the slow rate of speech and breaks of the prosodic groups transmitted insecurity, little objectivity, and nonpersuasion.
Shared acoustic codes underlie emotional communication in music and speech—Evidence from deep transfer learning

PubMed Central

Coutinho, Eduardo; Schuller, Björn

2017-01-01

Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine Learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra- (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for Speech intra-domain models achieve the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain. PMID:28658285
Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing.

PubMed

Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K

2018-04-01

The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.

Speech and communication in Parkinson’s disease: a cross-sectional exploratory study in the UK

PubMed Central

Barnish, Maxwell S; Horton, Simon M C; Butterfint, Zoe R; Clark, Allan B; Atkinson, Rachel A; Deane, Katherine H O

2017-01-01

Objective: To assess associations between cognitive status, intelligibility, acoustics and functional communication in PD. Design: Cross-sectional exploratory study of functional communication, including a within-participants experimental design for listener assessment. Setting: A major academic medical centre in the East of England, UK. Participants: Questionnaire data were assessed for 45 people with Parkinson’s disease (PD), who had self-reported speech or communication difficulties and did not have clinical dementia. Acoustic and listener analyses were conducted on read and conversational speech for 20 people with PD and 20 familiar conversation partner controls without speech, language or cognitive difficulties. Main outcome measures: Functional communication assessed by the Communicative Participation Item Bank (CPIB) and Communicative Effectiveness Survey (CES). Results: People with PD had lower intelligibility than controls for both the read (mean difference 13.7%, p=0.009) and conversational (mean difference 16.2%, p=0.04) sentences. Intensity and pause were statistically significant predictors of intelligibility in read sentences. Listeners were less accurate identifying the intended emotion in the speech of people with PD (14.8% point difference across conditions, p=0.02) and this was associated with worse speaker cognitive status (16.7% point difference, p=0.04). Cognitive status was a significant predictor of functional communication using CPIB (F=8.99, p=0.005, η² = 0.15) but not CES. Intelligibility in conversation sentences was a statistically significant predictor of CPIB (F=4.96, p=0.04, η² = 0.19) and CES (F=13.65, p=0.002, η² = 0.43). Read sentence intelligibility was not a significant predictor of either outcome. Conclusions: Cognitive status was an important predictor of functional communication; the role of intelligibility was modest and limited to conversational and not read speech. Our results highlight the importance of focusing on
Acoustic properties of naturally produced clear speech at normal speaking rates

NASA Astrophysics Data System (ADS)

Krause, Jean C.; Braida, Louis D.

2004-01-01

Sentences spoken "clearly" are significantly more intelligible than those spoken "conversationally" for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
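One way to make the notion of modulation depth concrete is to model an intensity envelope as A·(1 + m·cos(2πft)), where m is the modulation depth at modulation frequency f, and to estimate m from a single-bin Fourier projection. The Python sketch below does that. The abstract does not say how modulation depth was measured, so this model, the function names, and the synthetic test envelope are all assumptions for illustration only.

```python
import math

def modulation_depth(envelope, sample_rate, mod_freq_hz):
    """Estimate m for an envelope of the form A * (1 + m*cos(2*pi*f*t))."""
    n = len(envelope)
    mean_level = sum(envelope) / n
    # Single-bin discrete Fourier projection at the modulation frequency.
    step = 2.0 * math.pi * mod_freq_hz / sample_rate
    real = sum(x * math.cos(step * k) for k, x in enumerate(envelope))
    imag = sum(x * math.sin(step * k) for k, x in enumerate(envelope))
    component_amplitude = 2.0 * math.hypot(real, imag) / n
    return component_amplitude / mean_level

def synthetic_envelope(depth, mod_freq_hz, sample_rate, duration_s):
    """Unit-mean test envelope with a known modulation depth."""
    n = int(round(sample_rate * duration_s))
    step = 2.0 * math.pi * mod_freq_hz / sample_rate
    return [1.0 + depth * math.cos(step * k) for k in range(n)]

# Hypothetical example: a 4 Hz modulation sampled at 100 Hz for 2 s.
shallow = synthetic_envelope(0.2, 4.0, 100.0, 2.0)
deep = synthetic_envelope(0.5, 4.0, 100.0, 2.0)
```

The estimate is exact only when the analysis window holds a whole number of modulation cycles, as it does here. On real envelopes a windowed spectrum or a modulation filterbank would be the more usual choice.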
Preserved Acoustic Hearing in Cochlear Implantation Improves Speech Perception

PubMed Central

Sheffield, Sterling W.; Jahn, Kelly; Gifford, René H.

2015-01-01

Background: With improved surgical techniques and electrode design, an increasing number of cochlear implant (CI) recipients have preserved acoustic hearing in the implanted ear, thereby resulting in bilateral acoustic hearing. There are currently no guidelines, however, for clinicians with respect to audiometric criteria and the recommendation of amplification in the implanted ear. The acoustic bandwidth necessary to obtain speech perception benefit from acoustic hearing in the implanted ear is unknown. Additionally, it is important to determine if, and in which listening environments, acoustic hearing in both ears provides more benefit than hearing in just one ear, even with limited residual hearing. Purpose: The purposes of this study were to (1) determine whether acoustic hearing in an ear with a CI provides as much speech perception benefit as an equivalent bandwidth of acoustic hearing in the non-implanted ear, and (2) determine whether acoustic hearing in both ears provides more benefit than hearing in just one ear. Research Design: A repeated-measures, within-participant design was used to compare performance across listening conditions. Study Sample: Seven adults with CIs and bilateral residual acoustic hearing (hearing preservation) were recruited for the study. Data Collection and Analysis: Consonant-nucleus-consonant word recognition was tested in four conditions: CI alone, CI + acoustic hearing in the non-implanted ear, CI + acoustic hearing in the implanted ear, and CI + bilateral acoustic hearing. A series of low-pass filters were used to examine the effects of acoustic bandwidth through an insert earphone with amplification. Benefit was defined as the difference among conditions. The benefit of bilateral acoustic hearing was tested in both diffuse and single-source background noise. Results were analyzed using repeated-measures analysis of variance. Results: Similar benefit was obtained for equivalent acoustic frequency bandwidth in either ear. Acoustic
Acoustic assessment of speech privacy curtains in two nursing units

PubMed Central

Pope, Diana S.; Miller-Klein, Erik T.

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s’ standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered. PMID:26780959
Acoustic assessment of speech privacy curtains in two nursing units.

PubMed

Pope, Diana S; Miller-Klein, Erik T

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered.
Acoustic Analysis of Speech of Cochlear Implantees and Its Implications

PubMed Central

Patadia, Rajesh; Govale, Prajakta; Rangasayee, R.; Kirtane, Milind

2012-01-01

Objectives: Cochlear implantees have improved speech production skills compared with those using hearing aids, as reflected in their acoustic measures. When compared to normal hearing controls, implanted children had fronted vowel space and their /s/ and /ʃ/ noise frequencies overlapped. Acoustic analysis of speech provides an objective index of perceived differences in speech production which can be precursory in planning therapy. The objective of this study was to compare acoustic characteristics of speech in cochlear implantees with those of normal hearing age matched peers to understand implications. Methods: Group 1 consisted of 15 children with prelingual bilateral severe-profound hearing loss (age, 5-11 years; implanted between 4-10 years). Prior to implantation, behind-the-ear hearing aids were used; before and after implantation, subjects received at least 1 year of aural intervention. Group 2 consisted of 15 normal hearing age matched peers. Sustained productions of vowels and words with selected consonants were recorded. Using Praat software for acoustic analysis, digitized speech tokens were measured for F1, F2, and F3 of vowels; centre frequency (Hz) and energy concentration (dB) in burst; voice onset time (VOT in ms) for stops; centre frequency (Hz) of noise in /s/; rise time (ms) for affricates. A t-test was used to find significant differences between groups. Results: Significant differences were found in VOT for /b/, F1 and F2 of /e/, and F3 of /u/. No significant differences were found for centre frequency of burst, energy concentration for stops, centre frequency of noise in /s/, or rise time for affricates. These findings suggest that auditory feedback provided by cochlear implants enables subjects to monitor production of speech sounds. Conclusion: Acoustic analysis of speech is an essential method for discerning characteristics which have or have not been improved by cochlear implantation and thus for planning intervention. PMID:22701768
Speech communications in noise

NASA Technical Reports Server (NTRS)

1984-01-01

The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
Talker Differences in Clear and Conversational Speech: Acoustic Characteristics of Vowels

ERIC Educational Resources Information Center

Ferguson, Sarah Hargus; Kewley-Port, Diane

2007-01-01

Purpose: To determine the specific acoustic changes that underlie improved vowel intelligibility in clear speech. Method: Seven acoustic metrics were measured for conversational and clear vowels produced by 12 talkers--6 who previously were found (S. H. Ferguson, 2004) to produce a large clear speech vowel intelligibility effect for listeners with…
Speech waveform perturbation analysis: a perceptual-acoustical comparison of seven measures.

PubMed

Askenfelt, A G; Hammarberg, B

1986-03-01

The performance of seven acoustic measures of cycle-to-cycle variations (perturbations) in the speech waveform was compared. All measures were calculated automatically and applied on running speech. Three of the measures refer to the frequency of occurrence and severity of waveform perturbations in special selected parts of the speech, identified by means of the rate of change in the fundamental frequency. Three other measures refer to statistical properties of the distribution of the relative frequency differences between adjacent pitch periods. One perturbation measure refers to the percentage of consecutive pitch period differences with alternating signs. The acoustic measures were tested on tape recorded speech samples from 41 voice patients, before and after successful therapy. Scattergrams of acoustic waveform perturbation data versus an average of perceived deviant voice qualities, as rated by voice clinicians, are presented. The perturbation measures were compared with regard to the acoustic-perceptual correlation and their ability to discriminate between normal and pathological voice status. The standard deviation of the distribution of the relative frequency differences was suggested as the most useful acoustic measure of waveform perturbations for clinical applications.
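As a rough illustration of two of the measure families described in this abstract, the following Python sketch computes the standard deviation of the relative frequency differences between adjacent pitch periods and the percentage of consecutive period differences with alternating signs. The function names, the normalization by the mean of the two adjacent frequencies, and the sample period values are assumptions made for illustration; the abstract does not give the paper's exact definitions.

```python
import statistics

def relative_f0_differences(periods_s):
    """Relative frequency difference for each pair of adjacent pitch periods.

    Assumption: each difference is normalized by the mean of the two
    adjacent frequencies; the original paper may normalize differently.
    """
    freqs = [1.0 / p for p in periods_s]
    return [
        (f2 - f1) / ((f1 + f2) / 2.0)
        for f1, f2 in zip(freqs, freqs[1:])
    ]

def perturbation_sd(periods_s):
    """Standard deviation of the relative frequency differences."""
    return statistics.pstdev(relative_f0_differences(periods_s))

def alternating_sign_percentage(periods_s):
    """Percentage of consecutive period differences whose signs alternate."""
    diffs = [b - a for a, b in zip(periods_s, periods_s[1:])]
    pairs = list(zip(diffs, diffs[1:]))
    if not pairs:
        return 0.0
    alternating = sum(1 for d1, d2 in pairs if d1 * d2 < 0)
    return 100.0 * alternating / len(pairs)

# Hypothetical pitch periods in seconds (not data from the study).
steady = [0.0100, 0.0100, 0.0100, 0.0100]
jittery = [0.0100, 0.0104, 0.0099, 0.0105, 0.0098]

print(perturbation_sd(steady), perturbation_sd(jittery))
print(alternating_sign_percentage(jittery))
```

A perfectly periodic signal yields zero for both measures, while a period sequence that lengthens and shortens on every cycle drives the alternating-sign percentage toward 100.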
Applications for Subvocal Speech

NASA Technical Reports Server (NTRS)

Jorgensen, Charles; Betts, Bradley

2007-01-01

A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous-material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.
Department of Cybernetic Acoustics

NASA Astrophysics Data System (ADS)

The development of the theory, instrumentation, and applications of methods and systems for the measurement, analysis, processing, and synthesis of acoustic signals within the audio frequency range is discussed, particularly for the speech signal and the vibro-acoustic signal emitted by technical and industrial equipment treated as noise and vibration sources. The research work, both theoretical and experimental, aims at applications in various branches of science and medicine, such as: acoustical diagnostics and phoniatric rehabilitation of pathological and postoperative states of the speech organ; bilateral "man-machine" speech communication based on the analysis, recognition, and synthesis of the speech signal; and vibro-acoustical diagnostics and continuous monitoring of the state of machines, technical equipment, and technological processes.
Speech Perception in Complex Acoustic Environments: Developmental Effects

ERIC Educational Resources Information Center

Leibold, Lori J.

2017-01-01

Purpose: The ability to hear and understand speech in complex acoustic environments follows a prolonged time course of development. The purpose of this article is to provide a general overview of the literature describing age effects in susceptibility to auditory masking in the context of speech recognition, including a summary of findings related…
From prosodic structure to acoustic saliency: A fMRI investigation of speech rate, clarity, and emphasis

NASA Astrophysics Data System (ADS)

Golfinopoulos, Elisa

Acoustic variability in fluent speech can arise at many stages in speech production planning and execution. For example, at the phonological encoding stage, the grouping of phonemes into syllables determines which segments are coarticulated and, by consequence, segment-level acoustic variation. Likewise phonetic encoding, which determines the spatiotemporal extent of articulatory gestures, will affect the acoustic detail of segments. Functional magnetic resonance imaging (fMRI) was used to measure brain activity of fluent adult speakers in four speaking conditions: fast, normal, clear, and emphatic (or stressed) speech. These speech manner changes typically result in acoustic variations that do not change the lexical or semantic identity of productions but do affect the acoustic saliency of phonemes, syllables and/or words. Acoustic responses recorded inside the scanner were assessed quantitatively using eight acoustic measures and sentence duration was used as a covariate of non-interest in the neuroimaging analysis. Compared to normal speech, emphatic speech was characterized acoustically by a greater difference between stressed and unstressed vowels in intensity, duration, and fundamental frequency, and neurally by increased activity in right middle premotor cortex and supplementary motor area, and bilateral primary sensorimotor cortex. These findings are consistent with right-lateralized motor planning of prosodic variation in emphatic speech. Clear speech involved an increase in average vowel and sentence durations and average vowel spacing, along with increased activity in left middle premotor cortex and bilateral primary sensorimotor cortex. These findings are consistent with an increased reliance on feedforward control, resulting in hyper-articulation, under clear as compared to normal speech. Fast speech was characterized acoustically by reduced sentence duration and average vowel spacing, and neurally by increased activity in left anterior frontal
A method for determining internal noise criteria based on practical speech communication applied to helicopters

NASA Technical Reports Server (NTRS)

Sternfeld, H., Jr.; Doyle, L. B.

1978-01-01

The relationship between the internal noise environment of helicopters and the ability of personnel to understand commands and instructions was studied. A test program was conducted to relate speech intelligibility to a standard measurement called Articulation Index. An acoustical simulator was used to provide noise environments typical of Army helicopters. Speech material (command sentences and phonetically balanced word lists) was presented at several voice levels in each helicopter environment. Recommended helicopter internal noise criteria, based on speech communication, were derived and the effectiveness of hearing protection devices was evaluated.
A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition

NASA Astrophysics Data System (ADS)

Oh, Yoo Rhee; Kim, Hong Kook

In this paper, we propose a hybrid model adaptation approach in which pronunciation and acoustic models are adapted by incorporating the pronunciation and acoustic variabilities of non-native speech in order to improve the performance of non-native automatic speech recognition (ASR). Specifically, the proposed hybrid model adaptation can be performed at either the state-tying or triphone-modeling level, depending on the level at which acoustic model adaptation is performed. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method then adapts pronunciation models and acoustic models by accommodating the pronunciation variants in the pronunciation dictionary and by clustering the states of triphone acoustic models using the acoustic variants, respectively. On the other hand, the triphone-modeling level hybrid method initially adapts pronunciation models in the same way as in the state-tying level hybrid method; however, for the acoustic model adaptation, the triphone acoustic models are then re-estimated based on the adapted pronunciation models and the states of the re-estimated triphone acoustic models are clustered using the acoustic variants. Korean-spoken English speech recognition experiments show that ASR systems employing the state-tying and triphone-modeling level adaptation methods reduce the average word error rates (WERs) for non-native speech by a relative 17.1% and 22.1%, respectively, when compared to a baseline ASR system.
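The 17.1% and 22.1% figures in this abstract are relative reductions, so they express the drop in WER as a fraction of the baseline WER rather than as a difference in percentage points. A minimal Python sketch of that arithmetic follows; the function name and the baseline and adapted WER values are hypothetical, since the abstract does not report absolute error rates.

```python
def relative_wer_reduction(baseline_wer, adapted_wer):
    """Relative WER reduction in percent: 100 * (baseline - adapted) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer

# Hypothetical WERs in percent (not reported in the abstract).
baseline = 20.0
adapted = 15.0

# A 5-point absolute drop from a 20% baseline is a 25% relative reduction.
print(relative_wer_reduction(baseline, adapted))
```

The distinction matters when comparing systems: the same relative reduction corresponds to a smaller absolute gain when the baseline error rate is already low.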
Speech recognition: Acoustic-phonetic knowledge acquisition and representation

NASA Astrophysics Data System (ADS)

Zue, Victor W.

1988-09-01

The long-term research goal is to develop and implement speaker-independent continuous speech recognition systems. It is believed that the proper utilization of speech-specific knowledge is essential for such advanced systems. This research is thus directed toward the acquisition, quantification, and representation of acoustic-phonetic and lexical knowledge, and the application of this knowledge to speech recognition algorithms. In addition, we are exploring new speech recognition alternatives based on artificial intelligence and connectionist techniques. We developed a statistical model for predicting the acoustic realization of stop consonants in various positions in the syllable template. A unification-based grammatical formalism was developed for incorporating this model into the lexical access algorithm. We provided an information-theoretic justification for the hierarchical structure of the syllable template. We analyzed segmental duration for vowels and fricatives in continuous speech. Based on contextual information, we developed durational models for vowels and fricatives that account for over 70 percent of the variance, using data from multiple, unknown speakers. We rigorously evaluated the ability of human spectrogram readers to identify stop consonants spoken by many talkers and in a variety of phonetic contexts. Incorporating the declarative knowledge used by the readers, we developed a knowledge-based system for stop identification. We achieved system performance comparable to that of the readers.
Clear Speech Variants: An Acoustic Study in Parkinson's Disease

ERIC Educational Resources Information Center

Lam, Jennifer; Tjaden, Kris

2016-01-01

Purpose: The authors investigated how different variants of clear speech affect segmental and suprasegmental acoustic measures of speech in speakers with Parkinson's disease and a healthy control group. Method: A total of 14 participants with Parkinson's disease and 14 control participants served as speakers. Each speaker produced 18 different…
Clear Speech Variants: An Acoustic Study in Parkinson's Disease.

PubMed

Lam, Jennifer; Tjaden, Kris

2016-08-01

The authors investigated how different variants of clear speech affect segmental and suprasegmental acoustic measures of speech in speakers with Parkinson's disease and a healthy control group. A total of 14 participants with Parkinson's disease and 14 control participants served as speakers. Each speaker produced 18 different sentences selected from the Sentence Intelligibility Test (Yorkston & Beukelman, 1996). All speakers produced stimuli in 4 speaking conditions (habitual, clear, overenunciate, and hearing impaired). Segmental acoustic measures included vowel space area and first moment (M1) coefficient difference measures for consonant pairs. Second formant slope of diphthongs and measures of vowel and fricative durations were also obtained. Suprasegmental measures included fundamental frequency, sound pressure level, and articulation rate. For the majority of adjustments, all variants of clear speech instruction differed from the habitual condition. The overenunciate condition elicited the greatest magnitude of change for segmental measures (vowel space area, vowel durations) and the slowest articulation rates. The hearing impaired condition elicited the greatest fricative durations and suprasegmental adjustments (fundamental frequency, sound pressure level). Findings have implications for a model of speech production for healthy speakers as well as for speakers with dysarthria. Findings also suggest that particular clear speech instructions may target distinct speech subsystems.
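Vowel space area, one of the segmental measures listed in this abstract, is commonly computed as the area of the polygon formed by corner vowels plotted in the F1-F2 plane. The sketch below uses the shoelace formula for that polygon area. It is an illustration under stated assumptions: the abstract does not specify which vowels or which area computation the authors used, and the function name and formant values are hypothetical.

```python
def vowel_space_area(corner_vowels):
    """Area (Hz^2) of the polygon whose vertices are (F1, F2) points.

    Uses the shoelace formula; vertices must be listed in order around
    the polygon (clockwise or counter-clockwise).
    """
    n = len(corner_vowels)
    twice_area = 0.0
    for i in range(n):
        f1_a, f2_a = corner_vowels[i]
        f1_b, f2_b = corner_vowels[(i + 1) % n]
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0

# Hypothetical (F1, F2) values in Hz for four corner vowels,
# listed in order around the quadrilateral (not data from the study).
habitual = [(300, 2300), (700, 1800), (750, 1100), (350, 900)]

print(vowel_space_area(habitual))
```

Comparing the area obtained for each speaking condition would then show whether a given clear speech instruction expands the vowel space relative to the habitual condition.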
Tongue-Palate Contact Pressure, Oral Air Pressure, and Acoustics of Clear Speech

ERIC Educational Resources Information Center

Searl, Jeff; Evitts, Paul M.

2013-01-01

Purpose: The authors compared articulatory contact pressure (ACP), oral air pressure (Po), and speech acoustics for conversational versus clear speech. They also assessed the relationship of these measures to listener perception. Method: Twelve adults with normal speech produced monosyllables in a phrase using conversational and clear speech.…
Sperry Univac speech communications technology

NASA Technical Reports Server (NTRS)

Medress, Mark F.

1977-01-01

Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

Speech privacy and annoyance considerations in the acoustic environment of passenger cars of high-speed trains.

PubMed

Jeon, Jin Yong; Hong, Joo Young; Jang, Hyung Suk; Kim, Jae Hyeon

2015-12-01

It is necessary to consider not only annoyance of interior noises but also speech privacy to achieve acoustic comfort in a passenger car of a high-speed train because speech from other passengers can be annoying. This study aimed to explore an optimal acoustic environment to satisfy speech privacy and reduce annoyance in a passenger car. Two experiments were conducted using speech sources and compartment noise of a high speed train with varying speech-to-noise ratios (SNRA) and background noise levels (BNL). Speech intelligibility was tested in experiment I, and in experiment II, perceived speech privacy, annoyance, and acoustic comfort of combined sounds with speech and background noise were assessed. The results show that speech privacy and annoyance were significantly influenced by the SNRA. In particular, the acoustic comfort was evaluated as acceptable when the SNRA was less than -6 dB for both speech privacy and noise annoyance. In addition, annoyance increased significantly as the BNL exceeded 63 dBA, whereas the effect of the background-noise level on the speech privacy was not significant. These findings suggest that an optimal level of interior noise in a passenger car might exist between 59 and 63 dBA, taking normal speech levels into account.
Communication in a noisy environment: Perception of one's own voice and speech enhancement

NASA Astrophysics Data System (ADS)

Le Cocq, Cecile

Workers in noisy industrial environments are often confronted with communication problems. Lots of workers complain about not being able to communicate easily with their coworkers when they wear hearing protectors. As a consequence, they tend to remove their protectors, which exposes them to the risk of hearing loss. In fact this communication problem is a double one: first, the hearing protectors modify one's own voice perception; second, they interfere with understanding speech from others. This double problem is examined in this thesis. When wearing hearing protectors, the modification of one's own voice perception is partly due to the occlusion effect, which is produced when an earplug is inserted in the ear canal. This occlusion effect has two main consequences: first, the physiological noises in low frequencies are better perceived; second, the perception of one's own voice is modified. In order to have a better understanding of this phenomenon, the literature results are analyzed systematically, and a new method to quantify the occlusion effect is developed. Instead of stimulating the skull with a bone vibrator or asking the subject to speak, as is usually done in the literature, it has been decided to excite the buccal cavity with an acoustic wave. The experiment has been designed in such a way that the acoustic wave which excites the buccal cavity does not excite the external ear or the rest of the body directly. The measurement of the hearing threshold in open and occluded ear has been used to quantify the subjective occlusion effect for an acoustic wave in the buccal cavity. These experimental results, as well as those reported in the literature, have led to a better understanding of the occlusion effect and an evaluation of the role of each internal path from the acoustic source to the internal ear. The speech intelligibility from others is altered by both the high sound levels of noisy industrial environments and the speech signal attenuation due to hearing
Acoustic Analysis of the Voiced-Voiceless Distinction in Dutch Tracheoesophageal Speech

ERIC Educational Resources Information Center

Jongmans, Petra; Wempe, Ton G.; van Tinteren, Harm; Hilgers, Frans J. M.; Pols, Louis C. W.; van As-Brooks, Corina J.

2010-01-01

Purpose: Confusions between voiced and voiceless plosives and voiced and voiceless fricatives are common in Dutch tracheoesophageal (TE) speech. This study investigates (a) which acoustic measures are found to convey a correct voicing contrast in TE speech and (b) whether different measures are found in TE speech than in normal laryngeal (NL)…
Communication Supports for People with Motor Speech Disorders

ERIC Educational Resources Information Center

Hanson, Elizabeth K.; Fager, Susan K.

2017-01-01

Communication supports for people with motor speech disorders can include strategies and technologies to supplement natural speech efforts, resolve communication breakdowns, and replace natural speech when necessary to enhance participation in all communicative contexts. This article emphasizes communication supports that can enhance…
Teaching Speech-Communication: Guidelines for Teachers of the Required Speech Communication Course in Idaho Schools.

ERIC Educational Resources Information Center

Elliot, Linda; And Others

Designed to aid school districts, administrators, and teachers in meeting the Idaho Department of Education Speech Communication requirement, this pamphlet first defines the learning-teaching environment for the speech communication course, describes who should teach it, and justifies its inclusion in the school curriculum. The main part of the…
Quantified acoustic-optical speech signal incongruity identifies cortical sites of audiovisual speech processing

PubMed Central

Bernstein, Lynne E.; Lu, Zhong-Lin; Jiang, Jintao

2008-01-01

A fundamental question about human perception is how the speech perceiving brain combines auditory and visual phonetic stimulus information. We assumed that perceivers learn the normal relationship between acoustic and optical signals. We hypothesized that when the normal relationship is perturbed by mismatching the acoustic and optical signals, cortical areas responsible for audiovisual stimulus integration respond as a function of the magnitude of the mismatch. To test this hypothesis, in a previous study, we developed quantitative measures of acoustic-optical speech stimulus incongruity that correlate with perceptual measures. In the current study, we presented low incongruity (LI, matched), medium incongruity (MI, moderately mismatched), and high incongruity (HI, highly mismatched) audiovisual nonsense syllable stimuli during fMRI scanning. Perceptual responses differed as a function of the incongruity level, and BOLD measures were found to vary regionally and quantitatively with perceptual and quantitative incongruity levels. Each increase in level of incongruity resulted in an increase in overall levels of cortical activity and in additional activations. However, the only cortical region that demonstrated differential sensitivity to the three stimulus incongruity levels (HI > MI > LI) was a subarea of the left supramarginal gyrus (SMG). The left SMG might support a fine-grained analysis of the relationship between audiovisual phonetic input in comparison with stored knowledge, as hypothesized here. The methods here show that quantitative manipulation of stimulus incongruity is a new and powerful tool for disclosing the system that processes audiovisual speech stimuli. PMID:18495091
The contrast between alveolar and velar stops with typical speech data: acoustic and articulatory analyses.

PubMed

Melo, Roberta Michelon; Mota, Helena Bolli; Berti, Larissa Cristina

2017-06-08

This study used acoustic and articulatory analyses to characterize the contrast between alveolar and velar stops with typical speech data, comparing the parameters (acoustic and articulatory) of adults and children with typical speech development. The sample consisted of 20 adults and 15 children with typical speech development. The analyzed corpus was organized through five repetitions of each target-word (/'kapə/, /'tapə/, /'galo/, and /'daɾə/). These words were inserted into a carrier phrase and the participant was asked to name them spontaneously. Simultaneous audio and video data were recorded (tongue ultrasound images). The data were submitted to acoustic analyses (voice onset time; spectral peak and burst spectral moments; vowel/consonant transition and relative duration measures) and articulatory analyses (proportion of significant axes of the anterior and posterior tongue regions and description of tongue curves). Acoustic and articulatory parameters were effective in indicating the contrast between alveolar and velar stops, mainly in the adult group. Both speech analyses showed statistically significant differences between the two groups. The acoustic and articulatory parameters provided signals to characterize the phonic contrast of speech. One of the main findings in the comparison between adult and child speech was evidence of articulatory refinement/maturation even after the period of segment acquisition.
Acoustic evidence for phonologically mismatched speech errors.

PubMed

Gormley, Andrea

2015-04-01

Speech errors are generally said to accommodate to their new phonological context. This accommodation has been validated by several transcription studies. The transcription methodology is not the best choice for detecting errors at this level, however, as this type of error can be difficult to perceive. This paper presents an acoustic analysis of speech errors that uncovers non-accommodated or mismatch errors. A mismatch error is a sub-phonemic error that results in an incorrect surface phonology. This type of error could arise during the processing of phonological rules or it could be made at the motor level of implementation. The results of this work have important implications for both experimental and theoretical research. For experimentalists, it validates the tools used for error induction and the acoustic determination of errors free of perceptual bias. For theorists, this methodology can be used to test the nature of the processes proposed in language production.
Perceiving speech in context: Compensation for contextual variability during acoustic cue encoding and categorization

NASA Astrophysics Data System (ADS)

Toscano, Joseph Christopher

Several fundamental questions about speech perception concern how listeners understand spoken language despite considerable variability in speech sounds across different contexts (the problem of lack of invariance in speech). This contextual variability is caused by several factors, including differences between individual talkers' voices, variation in speaking rate, and effects of coarticulatory context. A number of models have been proposed to describe how the speech system handles differences across contexts. Critically, these models make different predictions about (1) whether contextual variability is handled at the level of acoustic cue encoding or categorization, (2) whether it is driven by feedback from category-level processes or interactions between cues, and (3) whether listeners discard fine-grained acoustic information to compensate for contextual variability. Separating the effects of cue- and category-level processing has been difficult because behavioral measures tap processes that occur well after initial cue encoding and are influenced by task demands and linguistic information. Recently, we have used the event-related brain potential (ERP) technique to examine cue encoding and online categorization. Specifically, we have looked at differences in the auditory N1 as a measure of acoustic cue encoding and the P3 as a measure of categorization. This allows us to examine multiple levels of processing during speech perception and can provide a useful tool for studying effects of contextual variability. Here, I apply this approach to determine the point in processing at which context has an effect on speech perception and to examine whether acoustic cues are encoded continuously. Several types of contextual variability (talker gender, speaking rate, and coarticulation), as well as several acoustic cues (voice onset time, formant frequencies, and bandwidths), are examined in a series of experiments. The results suggest that (1) at early stages of speech
A speech processing study using an acoustic model of a multiple-channel cochlear implant

NASA Astrophysics Data System (ADS)

Xu, Ying

1998-10-01

A cochlear implant is an electronic device designed to provide sound information for adults and children who have bilateral profound hearing loss. The task of representing speech signals as electrical stimuli is central to the design and performance of cochlear implants. Studies have shown that the current speech-processing strategies provide significant benefits to cochlear implant users. However, the evaluation and development of speech-processing strategies have been complicated by hardware limitations and large variability in user performance. To alleviate these problems, an acoustic model of a cochlear implant with the SPEAK strategy is implemented in this study, in which a set of acoustic stimuli whose psychophysical characteristics are as close as possible to those produced by a cochlear implant are presented to normal-hearing subjects. To test the effectiveness and feasibility of this acoustic model, a psychophysical experiment was conducted to match the performance of a normal-hearing listener using model-processed signals to that of a cochlear implant user. Good agreement was found between an implanted patient and an age-matched normal-hearing subject in a dynamic signal discrimination experiment, indicating that this acoustic model is a reasonably good approximation of a cochlear implant with the SPEAK strategy. The acoustic model was then used to examine the potential of the SPEAK strategy in terms of its temporal and frequency encoding of speech. It was hypothesized that better temporal and frequency encoding of speech can be accomplished by higher stimulation rates and a larger number of activated channels. Vowel and consonant recognition tests were conducted on normal-hearing subjects using speech tokens processed by the acoustic model, with different combinations of stimulation rate and number of activated channels. The results showed that vowel recognition was best at 600 pps and 8 activated channels, but further increases in stimulation rate and
Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)

ERIC Educational Resources Information Center

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…
Vowels in clear and conversational speech: Talker differences in acoustic characteristics and intelligibility for normal-hearing listeners

NASA Astrophysics Data System (ADS)

Hargus Ferguson, Sarah; Kewley-Port, Diane

2002-05-01

Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting "clear" speech is significantly more intelligible than typical conversational speech. Recent work in this lab suggests that talkers vary in how much their intelligibility improves when they are instructed to speak clearly. The few studies examining acoustic characteristics of clear and conversational speech suggest that these differing clear speech effects result from different acoustic strategies on the part of individual talkers. However, only two studies to date have directly examined differences among talkers producing clear versus conversational speech, and neither included acoustic analysis. In this project, clear and conversational speech was recorded from 41 male and female talkers aged 18-45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. Acoustic analyses using stimuli from a subgroup of talkers shown to have a range of speaking style effects will be used to assess specific acoustic correlates of vowel intelligibility in clear and conversational speech. [Work supported by NIHDCD-02229.]
Improving Understanding of Emotional Speech Acoustic Content

NASA Astrophysics Data System (ADS)

Tinnemore, Anna

Children with cochlear implants show deficits in identifying the emotional intent of utterances without facial or body language cues. A known limitation of cochlear implants is the inability to accurately portray the fundamental frequency contour of speech, which carries the majority of the information needed to identify emotional intent. Without reliable access to the fundamental frequency, other cues to vocal emotion, if they can be identified, could be used to guide therapies for training children with cochlear implants to better identify vocal emotion. The current study analyzed recordings of adults speaking neutral sentences with a set array of emotions in a child-directed and adult-directed manner. The goal was to identify acoustic cues that contribute to emotion identification that may be enhanced in child-directed speech, but are also present in adult-directed speech. Results of this study showed that there were significant differences in the variation of the fundamental frequency, the variation of intensity, and the rate of speech among emotions and between intended audiences.
Acoustic Evidence for Phonologically Mismatched Speech Errors

ERIC Educational Resources Information Center

Gormley, Andrea

2015-01-01

Speech errors are generally said to accommodate to their new phonological context. This accommodation has been validated by several transcription studies. The transcription methodology is not the best choice for detecting errors at this level, however, as this type of error can be difficult to perceive. This paper presents an acoustic analysis of…
Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech

PubMed Central

Leong, Victoria; Goswami, Usha

2015-01-01

When acquiring language, young children may use acoustic spectro-temporal patterns in speech to derive phonological units in spoken language (e.g., prosodic stress patterns, syllables, phonemes). Children appear to learn acoustic-phonological mappings rapidly, without direct instruction, yet the underlying developmental mechanisms remain unclear. Across different languages, a relationship between amplitude envelope sensitivity and phonological development has been found, suggesting that children may make use of amplitude modulation (AM) patterns within the envelope to develop a phonological system. Here we present the Spectral Amplitude Modulation Phase Hierarchy (S-AMPH) model, a set of algorithms for deriving the dominant AM patterns in child-directed speech (CDS). Using Principal Components Analysis, we show that rhythmic CDS contains an AM hierarchy comprising 3 core modulation timescales. These timescales correspond to key phonological units: prosodic stress (Stress AM, ~2 Hz), syllables (Syllable AM, ~5 Hz) and onset-rime units (Phoneme AM, ~20 Hz). We argue that these AM patterns could in principle be used by naïve listeners to compute acoustic-phonological mappings without lexical knowledge. We then demonstrate that the modulation statistics within this AM hierarchy indeed parse the speech signal into a primitive hierarchically-organised phonological system comprising stress feet (proto-words), syllables and onset-rime units. We apply the S-AMPH model to two other CDS corpora, one spontaneous and one deliberately-timed. The model accurately identified 72–82% (freely-read CDS) and 90–98% (rhythmically-regular CDS) stress patterns, syllables and onset-rime units. This in-principle demonstration that primitive phonology can be extracted from speech AMs is termed Acoustic-Emergent Phonology (AEP) theory. AEP theory provides a set of methods for examining how early phonological development is shaped by the temporal modulation structure of speech across
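As a rough illustration of how modulation rates can be read off an amplitude envelope, the sketch below extracts a crude envelope (full-wave rectification followed by a moving average) and compares envelope power at the three modulation rates named in the abstract (about 2, 5 and 20 Hz). This is not the S-AMPH algorithm, which the abstract describes only at a high level; the function names, the 10 ms smoothing window, and the synthetic test tone are all assumptions made for illustration.

```python
import math

def amplitude_envelope(signal, fs, window_s=0.01):
    """Crude envelope: full-wave rectification + causal moving average.
    The 10 ms window is an assumed value, not taken from the paper."""
    n = max(1, int(round(window_s * fs)))
    rectified = [abs(x) for x in signal]
    env, acc = [], 0.0
    for i, v in enumerate(rectified):
        acc += v
        if i >= n:
            acc -= rectified[i - n]      # drop the sample leaving the window
        env.append(acc / min(i + 1, n))  # shorter window during start-up
    return env

def modulation_power(env, fs, freq_hz):
    """Power of the mean-removed envelope at one modulation frequency
    (a single-bin discrete Fourier transform)."""
    mean = sum(env) / len(env)
    re = im = 0.0
    for i, v in enumerate(env):
        phase = 2.0 * math.pi * freq_hz * i / fs
        re += (v - mean) * math.cos(phase)
        im += (v - mean) * math.sin(phase)
    return (re * re + im * im) / len(env) ** 2

def dominant_rate(signal, fs, rates_hz=(2.0, 5.0, 20.0)):
    """Return whichever candidate modulation rate carries the most envelope power."""
    env = amplitude_envelope(signal, fs)
    return max(rates_hz, key=lambda f: modulation_power(env, fs, f))

# Synthetic stand-in for speech: a 200 Hz tone amplitude-modulated at 5 Hz.
fs = 1000
tone = [
    (1.0 + 0.8 * math.cos(2 * math.pi * 5.0 * i / fs))
    * math.sin(2 * math.pi * 200.0 * i / fs)
    for i in range(2 * fs)
]
print(dominant_rate(tone, fs))
```

Real child-directed speech would need band-splitting and proper filtering before any such comparison; the single-bin power estimate here only shows the principle of locating energy at a syllable-like rate.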
Differential effects of speech situations on mothers' and fathers' infant-directed and dog-directed speech: An acoustic analysis.

PubMed

Gergely, Anna; Faragó, Tamás; Galambos, Ágoston; Topál, József

2017-10-23

There is growing evidence that dog-directed and infant-directed speech have similar acoustic characteristics, like high overall pitch, wide pitch range, and attention-getting devices. However, it is still unclear whether dog- and infant-directed speech have gender- or context-dependent acoustic features. In the present study, we collected comparable infant-, dog-, and adult-directed speech samples (IDS, DDS, and ADS) in four different speech situations (Storytelling, Task solving, Teaching, and Fixed sentences situations); we obtained the samples from parents whose infants were younger than 30 months of age and who also had a pet dog at home. We found that ADS was different from IDS and DDS, independently of the speakers' gender and the given situation. Higher overall pitch in DDS than in IDS during free situations was also found. Our results show that both parents hyperarticulate their vowels when talking to children but not when addressing dogs: this result is consistent with the goal of hyperspeech in language tutoring. Mothers, however, exaggerate their vowels for their infants under 18 months more than fathers do. Our findings suggest that IDS and DDS have context-dependent features and support the notion that people adapt their prosodic features to the acoustic preferences and emotional needs of their audience.
Suprasegmental Characteristics of Spontaneous Speech Produced in Good and Challenging Communicative Conditions by Talkers Aged 9-14 Years

ERIC Educational Resources Information Center

Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle

2016-01-01

Purpose: This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Method: Recordings were made for 96 children (50 female participants, 46 male…
Normal Aspects of Speech, Hearing, and Language.

ERIC Educational Resources Information Center

Minifie, Fred D., Ed.; And Others

This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…
Acoustic communication in plant-animal interactions.

PubMed

Schöner, Michael G; Simon, Ralph; Schöner, Caroline R

2016-08-01

Acoustic communication is widespread and well-studied in animals but has been neglected in other organisms such as plants. However, there is growing evidence for acoustic communication in plant-animal interactions. While knowledge about active acoustic signalling in plants (i.e. active sound production) is still in its infancy, research on passive acoustic signalling (i.e. reflection of animal sounds) revealed that bat-dependent plants have adapted to the bats' echolocation systems by providing acoustic reflectors to attract their animal partners. Understanding the proximate mechanisms and ultimate causes of acoustic communication will shed light on an underestimated dimension of information transfer between plants and animals. Copyright © 2016 Elsevier Ltd. All rights reserved.

Is Birdsong More Like Speech or Music?

PubMed

Shannon, Robert V

2016-04-01

Music and speech share many acoustic cues, but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate, is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.
DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1

NASA Astrophysics Data System (ADS)

Garofolo, J. S.; Lamel, L. F.; Fisher, W. M.; Fiscus, J. G.; Pallett, D. S.

1993-02-01

The Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems. TIMIT contains speech from 630 speakers representing 8 major dialect divisions of American English, each speaking 10 phonetically-rich sentences. The TIMIT corpus includes time-aligned orthographic, phonetic, and word transcriptions, as well as speech waveform data for each spoken sentence. The release of TIMIT contains several improvements over the Prototype CD-ROM released in December, 1988: (1) full 630-speaker corpus, (2) checked and corrected transcriptions, (3) word-alignment transcriptions, (4) NIST SPHERE-headered waveform files and header manipulation software, (5) phonemic dictionary, (6) new test and training subsets balanced for dialectal and phonetic coverage, and (7) more extensive documentation.
Acoustic landmarks contain more information about the phone string than other frames for automatic speech recognition with deep neural network acoustic model

NASA Astrophysics Data System (ADS)

He, Di; Lim, Boon Pang; Yang, Xuesong; Hasegawa-Johnson, Mark; Chen, Deming

2018-06-01

Most mainstream Automatic Speech Recognition (ASR) systems consider all feature frames equally important. However, acoustic landmark theory is based on a contradictory idea, that some frames are more important than others. Acoustic landmark theory exploits quantal non-linearities in the articulatory-acoustic and acoustic-perceptual relations to define landmark times at which the speech spectrum abruptly changes or reaches an extremum; frames overlapping landmarks have been demonstrated to be sufficient for speech perception. In this work, we conduct experiments on the TIMIT corpus with both GMM- and DNN-based ASR systems and find that frames containing landmarks are more informative for ASR than others. We find that altering the level of emphasis on landmarks by re-weighting acoustic likelihood tends to reduce the phone error rate (PER). Furthermore, by leveraging the landmark as a heuristic, one of our hybrid DNN frame dropping strategies maintained a PER within 0.44% of optimal when scoring less than half (45.8% to be precise) of the frames. This hybrid strategy outperforms other non-heuristic-based methods and demonstrates the potential of landmarks for reducing computation.
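The two ideas in this abstract, emphasising landmark frames when accumulating acoustic scores and scoring only a subset of frames, can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' method: the weight values, the neighbourhood radius, and all function names are hypothetical, and the abstract does not say how the hybrid dropping strategy actually selects frames.

```python
def landmark_weights(num_frames, landmark_frames, radius=1,
                     w_landmark=1.5, w_other=1.0):
    """Per-frame weights: frames within `radius` of a landmark get
    `w_landmark`, all others `w_other` (both values are assumptions)."""
    weights = [w_other] * num_frames
    for lm in landmark_frames:
        lo = max(0, lm - radius)
        hi = min(num_frames, lm + radius + 1)
        for i in range(lo, hi):
            weights[i] = w_landmark
    return weights

def weighted_score(frame_logliks, weights):
    """Weighted sum of per-frame acoustic log-likelihoods."""
    return sum(w * ll for w, ll in zip(weights, frame_logliks))

def frames_to_score(num_frames, landmark_frames, radius=1):
    """Indices of frames kept when every frame far from a landmark is dropped."""
    keep = set()
    for lm in landmark_frames:
        keep.update(range(max(0, lm - radius), min(num_frames, lm + radius + 1)))
    return sorted(keep)

# Toy utterance: 10 frames, hypothetical landmarks at frames 2 and 7.
logliks = [-1.0] * 10
weights = landmark_weights(10, [2, 7])
kept = frames_to_score(10, [2, 7])
print(weighted_score(logliks, weights), len(kept) / 10)
```

In a real decoder the weights would scale the acoustic model's log-likelihoods inside the search, and dropped frames would need their scores replaced by some approximation rather than simply omitted.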
Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters.

PubMed

McNeer, Richard R; Bennett, Christopher L; Horn, Danielle Bodzin; Dudaryk, Roman

2017-06-01

Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m³ (average VR = 131.1 [34.2] m³). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R² = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC, partial correlations were 0.825 (P
Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters

PubMed Central

Bennett, Christopher L.; Horn, Danielle Bodzin; Dudaryk, Roman

2017-01-01

INTRODUCTION: Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs) in which effective communication is crucial are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. METHODS: We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (ie, resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. RESULTS: A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m³ (average VR = 131.1 [34.2] m³). An insignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R² = 0.666, F(2, 37) = 39.9, P < .0001). For Dmax (maximum distance where there is <15% loss of consonant articulation), both VR and VC β coefficients were significant. For RT60 and Dmax, after controlling for VC
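The statement that reverberation time grows with room size can be made concrete with Sabine's classical formula, RT60 ≈ 0.161 · V / A, where V is the room volume in m³ and A is the total absorption area in m² (sabins). The sketch below applies it to the smallest and largest room volumes reported in the abstract. Sabine's formula is a textbook approximation and is not stated to be the method used in the study, which measured impulse responses; the absorption area of 30 m² is a hypothetical value chosen only to show the trend.

```python
def rt60_sabine(volume_m3, absorption_m2):
    """Sabine estimate of reverberation time in seconds.
    0.161 s/m is the usual metric-unit constant for air at room temperature."""
    if volume_m3 <= 0 or absorption_m2 <= 0:
        raise ValueError("volume and absorption area must be positive")
    return 0.161 * volume_m3 / absorption_m2

# Room volumes from the abstract; the absorption area is an assumed constant.
ASSUMED_ABSORPTION_M2 = 30.0
for volume in (71.2, 131.1, 196.4):
    print(volume, round(rt60_sabine(volume, ASSUMED_ABSORPTION_M2), 3))
```

Holding absorption fixed is itself a simplification: larger rooms usually have more surface area and more contents, which is one reason the study treats room content volume (VC) as a separate predictor.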
Career Development for Speech Communication Majors.

ERIC Educational Resources Information Center

Johnson, Arlee W.

A change in the focus of the speech communication program at Oklahoma State University (OSU) resulted from recognition during the late 1960s that the only growth potential for the speech communication field was in preparing students for work in nonacademic settings. This paper presents the current status of the program at OSU and discusses the…
Monaural room acoustic parameters from music and speech.

PubMed

Kendrick, Paul; Cox, Trevor J; Li, Francis F; Zhang, Yonggang; Chambers, Jonathon A

2008-07-01

This paper compares two methods for extracting room acoustic parameters from reverberated speech and music. An approach which uses statistical machine learning, previously developed for speech, is extended to work with music. For speech, reverberation time estimations are within a perceptual difference limen of the true value. For music, virtually all early decay time estimations are within a difference limen of the true value. The estimation accuracy is not good enough in other cases due to differences between the simulated data set used to develop the empirical model and real rooms. The second method carries out a maximum likelihood estimation on decay phases at the end of notes or speech utterances. This paper extends the method to estimate parameters relating to the balance of early and late energies in the impulse response. For reverberation time and speech, the method provides estimations which are within the perceptual difference limen of the true value. For other parameters such as clarity, the estimations are not sufficiently accurate due to the natural reverberance of the excitation signals. Speech is a better test signal than music because of the greater periods of silence in the signal, although music is needed for low frequency measurement.
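For readers unfamiliar with parameters that describe "the balance of early and late energies in the impulse response", clarity is the standard example: C50 is the ratio, in decibels, of the energy arriving in the first 50 ms to the energy arriving afterwards. The sketch below computes it from a sampled impulse response. It assumes the impulse response is already known, whereas the paper estimates such parameters blindly from reverberated speech or music, and the idealised exponentially decaying response used here is synthetic.

```python
import math

def clarity_db(impulse_response, fs, split_ms=50.0):
    """Early-to-late energy ratio in dB (C50 when split_ms is 50)."""
    split = int(round(split_ms * 1e-3 * fs))
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    return 10.0 * math.log10(early / late)

def synthetic_ir(rt60_s, fs, duration_s):
    """Idealised impulse response whose energy falls by 60 dB after rt60_s.
    Amplitude decays as exp(-k t) with k = 3 ln(10) / RT60, so energy
    decays as 10 ** (-6 t / RT60)."""
    k = 3.0 * math.log(10.0) / rt60_s
    return [math.exp(-k * n / fs) for n in range(int(duration_s * fs))]

fs = 8000
ir = synthetic_ir(rt60_s=0.5, fs=fs, duration_s=2.0)
print(round(clarity_db(ir, fs), 2))
```

For this idealised decay the result depends only on the reverberation time, which is why clarity and RT are correlated in simple rooms; measured responses with strong early reflections depart from that relationship.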
Communicating by Language: The Speech Process.

ERIC Educational Resources Information Center

House, Arthur S., Ed.

This document reports on a conference focused on speech problems. The main objective of these discussions was to facilitate a deeper understanding of human communication through interaction of conference participants with colleagues in other disciplines. Topics discussed included speech production, feedback, speech perception, and development of…
Ocean Variability Effects on Underwater Acoustic Communications

DTIC Science & Technology

2010-09-30

fluctuations on noncoherent acoustic communication [11] as well as on phase-coherent communication [12] were investigated for a near-seafloor source over...V. McDonald, and the KauaiEx Group, “Effects of ocean thermocline variability on noncoherent underwater acoustic communications,” J. Acoust. Soc. Am
Acoustic Sources of Accent in Second Language Japanese Speech.

PubMed

Idemaru, Kaori; Wei, Peipei; Gubbins, Lucy

2018-05-01

This study reports an exploratory analysis of the acoustic characteristics of second language (L2) speech which give rise to the perception of a foreign accent. Japanese speech samples were collected from American English and Mandarin Chinese speakers (n = 16 in each group) studying Japanese. The L2 participants and native speakers (n = 10) provided speech samples modeled after six short sentences. Segmental (vowels and stops) and prosodic features (rhythm, tone, and fluency) were examined. Native Japanese listeners (n = 10) rated the samples with regard to degrees of foreign accent. The analyses predicting accent ratings based on the acoustic measurements indicated that one of the prosodic features in particular, tone (defined as high and low patterns of pitch accent and intonation in this study), plays an important role in robustly predicting accent rating in L2 Japanese across the two first language (L1) backgrounds. These results were consistent with the prediction based on phonological and phonetic comparisons between Japanese and English, as well as Japanese and Mandarin Chinese. The results also revealed L1-specific predictors of perceived accent in Japanese. The findings of this study contribute to the growing literature that examines sources of perceived foreign accent.
Acoustics in human communication: evolving ideas about the nature of speech.

PubMed

Cooper, F S

1980-07-01

This paper discusses changes in attitude toward the nature of speech during the past half century. After reviewing early views on the subject, it considers the role of speech spectrograms, speech articulation, speech perception, messages and computers, and the nature of fluent speech.
Do 6-Month-Olds Understand That Speech Can Communicate?

ERIC Educational Resources Information Center

Vouloumanos, Athena; Martin, Alia; Onishi, Kristine H.

2014-01-01

Adults and 12-month-old infants recognize that even unfamiliar speech can communicate information between third parties, suggesting that they can separate the communicative function of speech from its lexical content. But do infants recognize that speech can communicate due to their experience understanding and producing language, or do they…
The Interaction of Temporal and Spectral Acoustic Information with Word Predictability on Speech Intelligibility

NASA Astrophysics Data System (ADS)

Shahsavarani, Somayeh Bahar

High-level, top-down information such as linguistic knowledge is a salient cortical resource that influences speech perception under most listening conditions. But are all listeners able to exploit these resources for speech facilitation to the same extent? It was found that children with cochlear implants showed different patterns of benefit from contextual information in speech perception compared with their normal-hearing peers. Previous studies have discussed the role of non-acoustic factors such as linguistic and cognitive capabilities to account for this discrepancy. Given the fact that the amount of acoustic information encoded and processed by auditory nerves of listeners with cochlear implants differs from normal-hearing listeners and even varies across individuals with cochlear implants, it is important to study the interaction of specific acoustic properties of the speech signal with contextual cues. This relationship has been mostly neglected in previous research. In this dissertation, we aimed to explore how different acoustic dimensions interact to affect listeners' abilities to combine top-down information with bottom-up information in speech perception beyond the known effects of linguistic and cognitive capacities shown previously. Specifically, the present study investigated whether there were any distinct context effects based on the resolution of spectral versus slowly-varying temporal information in perception of spectrally impoverished speech. To that end, two experiments were conducted. In both experiments, a noise-vocoding technique was adopted to generate spectrally-degraded speech to approximate acoustic cues delivered to listeners with cochlear implants. The frequency resolution was manipulated by varying the number of frequency channels. The temporal resolution was manipulated by low-pass filtering of the amplitude envelope with varying low-pass cutoff frequencies. The stimuli were presented to normal-hearing native speakers of American
Postlingual deaf speech and the role of audition in speech production: comments on Waldstein's paper [R.S. Waldstein, J. Acoust. Soc. Am. 88, 2099-2114 (1990)].

PubMed

Sapir, S; Canter, G J

1991-09-01

Using acoustic analysis techniques, Waldstein [J. Acoust. Soc. Am. 88, 2099-2114 (1990)] reported abnormal speech findings in postlingual deaf speakers. She interpreted her findings to suggest that auditory feedback is important in motor speech control. However, it is argued here that Waldstein's interpretation may be unwarranted without addressing the possibility of neurologic deficits (e.g., dysarthria) as confounding (or even primary) causes of the abnormal speech in her subjects.
Fifty years of progress in acoustic phonetics

NASA Astrophysics Data System (ADS)

Stevens, Kenneth N.

2004-10-01

Three events that occurred 50 or 60 years ago shaped the study of acoustic phonetics, and in the following few decades these events influenced research and applications in speech disorders, speech development, speech synthesis, speech recognition, and other subareas in speech communication. These events were: (1) the source-filter theory of speech production (Chiba and Kajiyama; Fant); (2) the development of the sound spectrograph and its interpretation (Potter, Kopp, and Green; Joos); and (3) the birth of research that related distinctive features to acoustic patterns (Jakobson, Fant, and Halle). Following these events there has been systematic exploration of the articulatory, acoustic, and perceptual bases of phonological categories, and some quantification of the sources of variability in the transformation of this phonological representation of speech into its acoustic manifestations. This effort has been enhanced by studies of how children acquire language in spite of this variability and by research on speech disorders. Gaps in our knowledge of this inherent variability in speech have limited the directions of applications such as synthesis and recognition of speech, and have led to the implementation of data-driven techniques rather than theoretical principles. Some examples of advances in our knowledge, and limitations of this knowledge, are reviewed.
Building an Interdepartmental Major in Speech Communication.

ERIC Educational Resources Information Center

Litterst, Judith K.

This paper describes a popular and innovative major program of study in speech communication at St. Cloud University in Minnesota: the Speech Communication Interdepartmental Major. The paper provides background on the program, discusses overall program requirements, presents sample student options, identifies ingredients for program success,…
A physiologically-inspired model reproducing the speech intelligibility benefit in cochlear implant listeners with residual acoustic hearing.

PubMed

Zamaninezhad, Ladan; Hohmann, Volker; Büchner, Andreas; Schädler, Marc René; Jürgens, Tim

2017-02-01

This study introduces a speech intelligibility model for cochlear implant users with ipsilateral preserved acoustic hearing that aims at simulating the observed speech-in-noise intelligibility benefit when receiving simultaneous electric and acoustic stimulation (EA-benefit). The model simulates the auditory nerve spiking in response to electric and/or acoustic stimulation. The temporally and spatially integrated spiking patterns were used as the final internal representation of noisy speech. Speech reception thresholds (SRTs) in stationary noise were predicted for a sentence test using an automatic speech recognition framework. The model was employed to systematically investigate the effect of three physiologically relevant model factors on simulated SRTs: (1) the spatial spread of the electric field, which co-varies with the number of electrically stimulated auditory nerves, (2) the "internal" noise simulating the deprivation of the auditory system, and (3) the upper bound frequency limit of acoustic hearing. The model results show that the simulated SRTs increase monotonically with increasing spatial spread for fixed internal noise, and also increase with increasing internal noise strength for a fixed spatial spread. The predicted EA-benefit does not follow such a systematic trend and depends on the specific combination of the model parameters. Beyond 300 Hz, the upper bound limit for preserved acoustic hearing is less influential on speech intelligibility of EA-listeners in stationary noise. The proposed model-predicted EA-benefits are within the range of EA-benefits shown by 18 out of 21 actual cochlear implant listeners with preserved acoustic hearing. Copyright © 2016 Elsevier B.V. All rights reserved.
Effect of Reflective Practice on Student Recall of Acoustics for Speech Science

ERIC Educational Resources Information Center

Walden, Patrick R.; Bell-Berti, Fredericka

2013-01-01

Researchers have developed models of learning through experience; however, these models are rarely named as a conceptual frame for educational research in the sciences. This study examined the effect of reflective learning responses on student recall of speech acoustics concepts. Two groups of undergraduate students enrolled in a speech science…
Philosophical Perspectives on Values and Ethics in Speech Communication.

ERIC Educational Resources Information Center

Becker, Carl B.

There are three very different concerns of communication ethics: (1) applied speech ethics, (2) ethical rules or standards, and (3) metaethical issues. In the area of applied speech ethics, communications theorists attempt to determine whether a speech act is moral or immoral by focusing on the content and effects of specific speech acts. Specific…
Examining Acoustic and Kinematic Measures of Articulatory Working Space: Effects of Speech Intensity.

PubMed

Whitfield, Jason A; Dromey, Christopher; Palmer, Panika

2018-05-17

The purpose of this study was to examine the effect of speech intensity on acoustic and kinematic vowel space measures and conduct a preliminary examination of the relationship between kinematic and acoustic vowel space metrics calculated from continuously sampled lingual marker and formant traces. Young adult speakers produced 3 repetitions of 2 different sentences at 3 different loudness levels. Lingual kinematic and acoustic signals were collected and analyzed. Acoustic and kinematic variants of several vowel space metrics were calculated from the formant frequencies and the position of 2 lingual markers. Traditional metrics included triangular vowel space area and the vowel articulation index. Acoustic and kinematic variants of sentence-level metrics based on the articulatory-acoustic vowel space and the vowel space hull area were also calculated. Both acoustic and kinematic variants of the sentence-level metrics significantly increased with an increase in loudness, whereas no statistically significant differences in traditional vowel-point metrics were observed for either the kinematic or acoustic variants across the 3 loudness conditions. In addition, moderate-to-strong relationships between the acoustic and kinematic variants of the sentence-level vowel space metrics were observed for the majority of participants. These data suggest that both kinematic and acoustic vowel space metrics that reflect the dynamic contributions of both consonant and vowel segments are sensitive to within-speaker changes in articulation associated with manipulations of speech intensity.
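The two "traditional" vowel-point metrics named in this abstract can be computed directly from corner-vowel formant values. The sketch below uses the shoelace formula for the triangular vowel space area and a commonly cited definition of the vowel articulation index, VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/). The abstract does not give the formulas the authors used, so that definition is an assumption, and the formant values are illustrative round numbers rather than data from the study.

```python
def triangular_vsa(i, a, u):
    """Area (Hz^2) of the triangle spanned by the corner vowels /i/, /a/, /u/.
    Each argument is an (F1, F2) pair in Hz; uses the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = i, a, u
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))

def vowel_articulation_index(i, a, u):
    """VAI under the assumed definition
    (F2i + F1a) / (F1i + F1u + F2u + F2a); larger values suggest
    a more expanded (less centralised) vowel space."""
    f1_i, f2_i = i
    f1_a, f2_a = a
    f1_u, f2_u = u
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)

# Hypothetical (F1, F2) values in Hz, for illustration only.
vowel_i = (300.0, 2300.0)
vowel_a = (750.0, 1300.0)
vowel_u = (350.0, 800.0)

print(triangular_vsa(vowel_i, vowel_a, vowel_u))
print(round(vowel_articulation_index(vowel_i, vowel_a, vowel_u), 3))
```

Both metrics use only three static vowel targets, which may help explain why the study found them less sensitive to loudness changes than sentence-level metrics built from continuous formant traces.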

Language Comprehension in Language-Learning Impaired Children Improved with Acoustically Modified Speech

NASA Astrophysics Data System (ADS)

Tallal, Paula; Miller, Steve L.; Bedi, Gail; Byma, Gary; Wang, Xiaoqin; Nagarajan, Srikantan S.; Schreiner, Christoph; Jenkins, William M.; Merzenich, Michael M.

1996-01-01

A speech processing algorithm was developed to create more salient versions of the rapidly changing elements in the acoustic waveform of speech that have been shown to be deficiently processed by language-learning impaired (LLI) children. LLI children received extensive daily training, over a 4-week period, with listening exercises in which all speech was translated into this synthetic form. They also received daily training with computer "games" designed to adaptively drive improvements in temporal processing thresholds. Significant improvements in speech discrimination and language comprehension abilities were demonstrated in two independent groups of LLI children.
A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels

PubMed Central

2014-01-01

The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract in order to obtain dynamic articulatory parameters during speech production. To resolve image blurring due to the tongue movement during the scanning process, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite the partial blurring of MRI images. Consequently, the articulatory parameters that are effectively measured as tongue movement is observed, and the specific shape of the tongue and its position for all six uttered Malay vowels are determined. Speech rehabilitation procedure demands some kind of visual perceivable prototype of speech articulation. To investigate the validity of the measured articulatory parameters based on acoustic theory of speech production, an acoustic analysis based on the uttered vowels by subjects has been performed. As the acoustic speech and articulatory parameters of uttered speech were examined, a correlation between formant frequencies and articulatory parameters was observed. The experiments reported a positive correlation between the constriction location of the tongue body and the first formant frequency, as well as a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production. PMID:25060583
A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels.

PubMed

Zourmand, Alireza; Mirhassani, Seyed Mostafa; Ting, Hua-Nong; Bux, Shaik Ismail; Ng, Kwan Hoong; Bilgen, Mehmet; Jalaludin, Mohd Amin

2014-07-25

The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract in order to obtain dynamic articulatory parameters during speech production. To resolve image blurring due to the tongue movement during the scanning process, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite the partial blurring of MRI images. Consequently, the articulatory parameters that are effectively measured as tongue movement is observed, and the specific shape of the tongue and its position for all six uttered Malay vowels are determined. Speech rehabilitation procedure demands some kind of visual perceivable prototype of speech articulation. To investigate the validity of the measured articulatory parameters based on acoustic theory of speech production, an acoustic analysis based on the uttered vowels by subjects has been performed. As the acoustic speech and articulatory parameters of uttered speech were examined, a correlation between formant frequencies and articulatory parameters was observed. The experiments reported a positive correlation between the constriction location of the tongue body and the first formant frequency, as well as a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production.
The "Checkers" Speech and Televised Political Communication.

ERIC Educational Resources Information Center

Flaningam, Carl

Richard Nixon's 1952 "Checkers" speech was an innovative use of television for political communication. Like television news itself, the campaign fund crisis behind the speech can be thought of in the same terms as other television melodrama, with the speech serving as its climactic episode. The speech adapted well to television because…
Understanding the abstract role of speech in communication at 12 months.

PubMed

Martin, Alia; Onishi, Kristine H; Vouloumanos, Athena

2012-04-01

Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we test the understanding that speech permits the transfer of information about a Communicator's target object to a Recipient. Initially, the Communicator selectively grasped one of two objects. In test, the Communicator could no longer reach the objects. She then turned to the Recipient and produced speech (a nonsense word) or non-speech (coughing). Infants looked longer when the Recipient selected the non-target than the target object when the Communicator had produced speech but not coughing (Experiment 1). Looking time patterns differed from the speech condition when the Recipient rather than the Communicator produced the speech (Experiment 2), and when the Communicator produced a positive emotional vocalization (Experiment 3), but did not differ when the Recipient had previously received information about the target by watching the Communicator's selective grasping (Experiment 4). Thus infants understand the information-transferring properties of speech and recognize some of the conditions under which others' information states can be updated. These results suggest that infants possess an abstract understanding of the communicative function of speech, providing an important potential mechanism for language and knowledge acquisition. Copyright © 2011 Elsevier B.V. All rights reserved.
Suppressed Alpha Oscillations Predict Intelligibility of Speech and its Acoustic Details

PubMed Central

Weisz, Nathan

2012-01-01

Modulations of human alpha oscillations (8–13 Hz) accompany many cognitive processes, but their functional role in auditory perception has proven elusive: Do oscillatory dynamics of alpha reflect acoustic details of the speech signal and are they indicative of comprehension success? Acoustically presented words were degraded in acoustic envelope and spectrum in an orthogonal design, and electroencephalogram responses in the frequency domain were analyzed in 24 participants, who rated word comprehensibility after each trial. First, the alpha power suppression during and after a degraded word depended monotonically on spectral and, to a lesser extent, envelope detail. The magnitude of this alpha suppression exhibited an additional and independent influence on later comprehension ratings. Second, source localization of alpha suppression yielded superior parietal, prefrontal, as well as anterior temporal brain areas. Third, multivariate classification of the time–frequency pattern across participants showed that patterns of late posterior alpha power allowed best for above-chance classification of word intelligibility. Results suggest that both magnitude and topography of late alpha suppression in response to single words can indicate a listener's sensitivity to acoustic features and the ability to comprehend speech under adverse listening conditions. PMID:22100354
Children's views of communication and speech-language pathology.

PubMed

Merrick, Rosalind; Roulstone, Sue

2011-08-01

Children have the right to express their views and influence decisions in matters that affect them. Yet decisions regarding speech-language pathology are often made on their behalf, and research into the perspectives of children who receive speech-language pathology intervention is currently limited. This paper reports a qualitative study which explored experiences of communication and of speech-language pathology from the perspectives of children with speech, language, and communication needs (SLCN). The aim was to explore their perspectives of communication, communication impairment, and assistance. Eleven school-children participated in the study, aged between 7-10 years. They were recruited through a speech-language pathology service in south west England, to include a range of ages and severity of difficulties. The study used open-ended interviews within which non-verbal activities such as drawing, taking photographs, and compiling a scrapbook were used to create a context for supported conversations. Findings were analysed according to the principles of grounded theory. Three ways of talking about communication emerged. These were in terms of impairment, learning, and behaviour. Findings offer insight into dialogue between children with SLCN and adults; the way communication is talked about has implications for children's view of themselves, their skills, and their participation.
Articulatory-acoustic vowel space: application to clear speech in individuals with Parkinson's disease.

PubMed

Whitfield, Jason A; Goberman, Alexander M

2014-01-01

Individuals with Parkinson disease (PD) often exhibit decreased range of movement secondary to the disease process, which has been shown to affect articulatory movements. A number of investigations have failed to find statistically significant differences between control and disordered groups, and between speaking conditions, using traditional vowel space area measures. The purpose of the current investigation was to evaluate both between-group (PD versus control) and within-group (habitual versus clear) differences in articulatory function using a novel vowel space measure, the articulatory-acoustic vowel space (AAVS). The novel AAVS is calculated from continuously sampled formant trajectories of connected speech. In the current study, habitual and clear speech samples from twelve individuals with PD along with habitual control speech samples from ten neurologically healthy adults were collected and acoustically analyzed. In addition, a group of listeners completed perceptual rating of speech clarity for all samples. Individuals with PD were perceived to exhibit decreased speech clarity compared to controls. Similarly, the novel AAVS measure was significantly lower in individuals with PD. In addition, the AAVS measure significantly tracked changes between the habitual and clear conditions that were confirmed by perceptual ratings. In the current study, the novel AAVS measure is shown to be sensitive to disease-related group differences and within-person changes in articulatory function of individuals with PD. Additionally, these data confirm that individuals with PD can modulate the speech motor system to increase articulatory range of motion and speech clarity when given a simple prompt. 
The reader will be able to (i) describe articulatory behavior observed in the speech of individuals with Parkinson disease; (ii) describe traditional measures of vowel space area and how they relate to articulation; (iii) describe a novel measure of vowel space, the articulatory-acoustic
Perceptual centres in speech - an acoustic analysis

NASA Astrophysics Data System (ADS)

Scott, Sophie Kerttu

Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences. P-centres have been modelled both in speech and non speech (music) domains. The three aims of this thesis were (a) to test out current P-centre models to determine which best accounted for the experimental data; (b) to identify a candidate parameter to map P-centres onto (a local approach), as opposed to the previous global models which rely upon the whole signal to determine the P-centre; and (c) to develop a model of P-centre location which could be applied to speech and non speech signals. The first aim was investigated by a series of experiments examining (a) whether different models could account for variation between speakers, using speech from different speakers; (b) whether rendering the amplitude time plot of a speech signal affects the P-centre of the signal; and (c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was carried out by (a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift; (b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration had more effect on the P-centre than the offset manipulation; and (c) testing whether the duration of a vowel affected the P-centre, if other attributes (amplitude, spectral contents) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers. The P-centres of the stimuli corpus were highly predicted by attributes of
Third International Conference on Acoustic Communication by Animals

DTIC Science & Technology

2011-09-30

communications Invited Speakers Peter Tyack cetacean communications Christopher Clark acoustic environment of whales Whitlow Au sound detection and...echolocation by dolphins Magnus Wahlberg sperm whale acoustics Robert Dooling bird hearing Ronald Hoy communication strategies in insects Peter Narins...frogs (6). Topics covered included cognition/language; song and call classification; rule learning; acoustic ecology; communication in noisy
Acoustic Analysis of PD Speech

PubMed Central

Chenausky, Karen; MacAuslan, Joel; Goldhor, Richard

2011-01-01

According to the U.S. National Institutes of Health, approximately 500,000 Americans have Parkinson's disease (PD), with roughly another 50,000 receiving new diagnoses each year. 70%–90% of these people also have the hypokinetic dysarthria associated with PD. Deep brain stimulation (DBS) substantially relieves motor symptoms in advanced-stage patients for whom medication produces disabling dyskinesias. This study investigated speech changes as a result of DBS settings chosen to maximize motor performance. The speech of 10 PD patients and 12 normal controls was analyzed for syllable rate and variability, syllable length patterning, vowel fraction, voice-onset time variability, and spirantization. These were normalized by the controls' standard deviation to represent distance from normal and combined into a composite measure. Results show that DBS settings relieving motor symptoms can improve speech, making it up to three standard deviations closer to normal. However, the clinically motivated settings evaluated here show greater capacity to impair, rather than improve, speech. A feedback device developed from these findings could be useful to clinicians adjusting DBS parameters, as a means for ensuring they do not unwittingly choose DBS settings which impair patients' communication. PMID:21977333
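The normalization step described in this abstract can be sketched as a z-score against the control group. The abstract does not say how the normalized measures were combined, so the mean absolute z-score used below is purely an assumption for illustration, as are the function names and the sample numbers.

```python
from statistics import mean, stdev


def distance_from_normal(patient_value, control_values):
    """Signed distance of a patient value from the control mean, in control SDs."""
    return (patient_value - mean(control_values)) / stdev(control_values)


def composite_distance(patient, controls):
    """Combine per-measure distances into one number.

    ASSUMPTION: the combination rule (mean absolute z-score) is hypothetical;
    the abstract only states that the measures were combined.
    """
    z_scores = [distance_from_normal(patient[name], controls[name]) for name in patient]
    return mean(abs(z) for z in z_scores)


# Hypothetical data: two measures, three control speakers each.
controls = {"syllable_rate": [4.0, 5.0, 6.0], "vot_variability": [10.0, 20.0, 30.0]}
patient = {"syllable_rate": 7.0, "vot_variability": 10.0}
print(composite_distance(patient, controls))
```

Expressing every measure in control standard deviations puts quantities with different units on a common scale, which is what allows them to be combined at all.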
Reducing language to rhythm: Amazonian Bora drummed language exploits speech rhythm for long-distance communication

NASA Astrophysics Data System (ADS)

Seifart, Frank; Meyer, Julien; Grawunder, Sven; Dentel, Laure

2018-04-01

Many drum communication systems around the world transmit information by emulating tonal and rhythmic patterns of spoken languages in sequences of drumbeats. Their rhythmic characteristics, in particular, have not been systematically studied so far, although understanding them represents a rare occasion for providing an original insight into the basic units of speech rhythm as selected by natural speech practices directly based on beats. Here, we analyse a corpus of Bora drum communication from the northwest Amazon, which is nowadays endangered with extinction. We show that four rhythmic units are encoded in the length of pauses between beats. We argue that these units correspond to vowel-to-vowel intervals with different numbers of consonants and vowel lengths. By contrast, aligning beats with syllables, mora or only vowel length yields inconsistent results. Moreover, we also show that Bora drummed messages conventionally select rhythmically distinct markers to further distinguish words. The two phonological tones represented in drummed speech encode only few lexical contrasts. Rhythm thus appears to crucially contribute to the intelligibility of drummed Bora. Our study provides novel evidence for the role of rhythmic structures composed of vowel-to-vowel intervals in the complex puzzle concerning the redundancy and distinctiveness of acoustic features embedded in speech.
Reducing language to rhythm: Amazonian Bora drummed language exploits speech rhythm for long-distance communication

PubMed Central

Grawunder, Sven; Dentel, Laure

2018-01-01

Many drum communication systems around the world transmit information by emulating tonal and rhythmic patterns of spoken languages in sequences of drumbeats. Their rhythmic characteristics, in particular, have not been systematically studied so far, although understanding them represents a rare occasion for providing an original insight into the basic units of speech rhythm as selected by natural speech practices directly based on beats. Here, we analyse a corpus of Bora drum communication from the northwest Amazon, which is nowadays endangered with extinction. We show that four rhythmic units are encoded in the length of pauses between beats. We argue that these units correspond to vowel-to-vowel intervals with different numbers of consonants and vowel lengths. By contrast, aligning beats with syllables, mora or only vowel length yields inconsistent results. Moreover, we also show that Bora drummed messages conventionally select rhythmically distinct markers to further distinguish words. The two phonological tones represented in drummed speech encode only few lexical contrasts. Rhythm thus appears to crucially contribute to the intelligibility of drummed Bora. Our study provides novel evidence for the role of rhythmic structures composed of vowel-to-vowel intervals in the complex puzzle concerning the redundancy and distinctiveness of acoustic features embedded in speech. PMID:29765620
Acoustic correlates of sexual orientation and gender-role self-concept in women's speech.

PubMed

Kachel, Sven; Simpson, Adrian P; Steffens, Melanie C

2017-06-01

Compared to studies of male speakers, relatively few studies have investigated acoustic correlates of sexual orientation in women. The present investigation focuses on shedding more light on intra-group variability in lesbians and straight women by using a fine-grained analysis of sexual orientation and collecting data on psychological characteristics (e.g., gender-role self-concept). For a large-scale women's sample (overall n = 108), recordings of spontaneous and read speech were analyzed for median fundamental frequency and acoustic vowel space features. Two studies showed no acoustic differences between lesbians and straight women, but there was evidence of acoustic differences within sexual orientation groups. Intra-group variability in median f0 was found to depend on the exclusivity of sexual orientation; F1 and F2 in /iː/ (study 1) and median f0 (study 2) were acoustic correlates of gender-role self-concept, at least for lesbians. Other psychological characteristics (e.g., sexual orientation of female friends) were also reflected in lesbians' speech. Findings suggest that acoustic features indexicalizing sexual orientation can only be successfully interpreted in combination with a fine-grained analysis of psychological characteristics.
Improving the speech intelligibility in classrooms

NASA Astrophysics Data System (ADS)

Lam, Choi Ling Coriolanus

One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can lead to reduced learning efficiency. Second, they can also cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. Besides, inadequate acoustical conditions can induce the usage of public address systems. Improper usage of such amplifiers or loudspeakers can lead to impairment of students' hearing systems. The social costs of poor classroom acoustics, which impair children's learning, are large. This invisible problem has far reaching implications for learning, but is easily solved. Many studies have been carried out, and the research findings on classroom acoustics have been accurately and concisely summarized. However, a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are essentially based on studies of western languages. Although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements have been done in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests, based on English, Mandarin and Cantonese, and a survey were carried out on students aged from 5 to 22 years. The research aims to investigate the differences in intelligibility between English, Mandarin and Cantonese in classrooms in Hong Kong. The significance of the speech transmission index (STI) in relation to Phonetically Balanced (PB) word scores will be developed further. Together with developed empirical relationship between the speech intelligibility in classrooms with the variations
Acoustic analysis of speech under stress.

PubMed

Sondhi, Savita; Khan, Munna; Vijay, Ritu; Salhan, Ashok K; Chouhan, Satish

2015-01-01

When a person is emotionally charged, stress can be discerned in his voice. This paper presents a simplified, non-invasive approach to detect psycho-physiological stress by monitoring the acoustic modifications during a stressful conversation. The voice database consists of audio clips from eight different popular FM broadcasts wherein the host of the show vexes the subjects who are otherwise unaware of the charade. The audio clips are obtained from real-life stressful conversations (no simulated emotions). Analysis is done using PRAAT software to evaluate mean fundamental frequency (F0) and formant frequencies (F1, F2, F3, F4) in both the neutral and stressed states. Results suggest that F0 increases with stress; however, formant frequency decreases with stress. Comparison of Fourier and chirp spectra of a short vowel segment shows that for relaxed speech, the two spectra are similar; however, for stressed speech, they differ in the high frequency range due to increased pitch modulation.
Alternative Speech Communication System for Persons with Severe Speech Disorders

NASA Astrophysics Data System (ADS)

Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

2009-12-01

Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.
Speech and Communication Changes Reported by People with Parkinson's Disease.

PubMed

Schalling, Ellika; Johansson, Kerstin; Hartelius, Lena

2017-01-01

Changes in communicative functions are common in Parkinson's disease (PD), but there are only limited data provided by individuals with PD on how these changes are perceived, what their consequences are, and what type of intervention is provided. To present self-reported information about speech and communication, the impact on communicative participation, and the amount and type of speech-language pathology services received by people with PD. Respondents with PD recruited via the Swedish Parkinson's Disease Society filled out a questionnaire accessed via a Web link or provided in a paper version. Of 188 respondents, 92.5% reported at least one symptom related to communication; the most common symptoms were weak voice, word-finding difficulties, imprecise articulation, and getting off topic in conversation. The speech and communication problems resulted in restricted communicative participation for between a quarter and a third of the respondents, and their speech caused embarrassment sometimes or more often to more than half. Forty-five percent of the respondents had received speech-language pathology services. Most respondents reported both speech and language symptoms, and many experienced restricted communicative participation. Access to speech-language pathology services is still inadequate. Services should also address cognitive/linguistic aspects to meet the needs of people with PD. © 2018 S. Karger AG, Basel.
Psychoacoustic cues to emotion in speech prosody and music.

PubMed

Coutinho, Eduardo; Dibben, Nicola

2013-01-01

There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
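Two of the seven features named in this abstract, spectral centroid and spectral flux, have simple textbook definitions that can be sketched directly. The code below is a minimal sketch using those common definitions (magnitude-weighted mean frequency, and Euclidean distance between consecutive magnitude spectra); the abstract does not state which exact formulations the study used, and the function names and sample spectra are hypothetical.

```python
import math


def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of one spectral frame, in Hz."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


def spectral_flux(prev_magnitudes, curr_magnitudes):
    """Euclidean distance between two consecutive magnitude spectra."""
    return math.sqrt(
        sum((c - p) ** 2 for p, c in zip(prev_magnitudes, curr_magnitudes))
    )


# Hypothetical three-bin spectra for two consecutive frames.
freqs = [100.0, 200.0, 300.0]
frame_a = [1.0, 2.0, 3.0]
frame_b = [1.0, 4.0, 3.0]
print(spectral_centroid(freqs, frame_b), spectral_flux(frame_a, frame_b))
```

Computed frame by frame, features like these give a time series that can be aligned with continuous emotion ratings of the kind collected in the study.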
Trimodal speech perception: how residual acoustic hearing supplements cochlear-implant consonant recognition in the presence of visual cues.

PubMed

Sheffield, Benjamin M; Schuchman, Gerald; Bernstein, Joshua G W

2015-01-01

As cochlear implant (CI) acceptance increases and candidacy criteria are expanded, these devices are increasingly recommended for individuals with less than profound hearing loss. As a result, many individuals who receive a CI also retain acoustic hearing, often in the low frequencies, in the nonimplanted ear (i.e., bimodal hearing) and in some cases in the implanted ear (i.e., hybrid hearing) which can enhance the performance achieved by the CI alone. However, guidelines for clinical decisions pertaining to cochlear implantation are largely based on expectations for postsurgical speech-reception performance with the CI alone in auditory-only conditions. A more comprehensive prediction of postimplant performance would include the expected effects of residual acoustic hearing and visual cues on speech understanding. An evaluation of auditory-visual performance might be particularly important because of the complementary interaction between the speech information relayed by visual cues and that contained in the low-frequency auditory signal. The goal of this study was to characterize the benefit provided by residual acoustic hearing to consonant identification under auditory-alone and auditory-visual conditions for CI users. Additional information regarding the expected role of residual hearing in overall communication performance by a CI listener could potentially lead to more informed decisions regarding cochlear implantation, particularly with respect to recommendations for or against bilateral implantation for an individual who is functioning bimodally. Eleven adults 23 to 75 years old with a unilateral CI and air-conduction thresholds in the nonimplanted ear equal to or better than 80 dB HL for at least one octave frequency between 250 and 1000 Hz participated in this study. Consonant identification was measured for conditions involving combinations of electric hearing (via the CI), acoustic hearing (via the nonimplanted ear), and speechreading (visual cues

Preliminary study of acoustic analysis for evaluating speech-aid oral prostheses: Characteristic dips in octave spectrum for comparison of nasality.

PubMed

Chang, Yen-Liang; Hung, Chao-Ho; Chen, Po-Yueh; Chen, Wei-Chang; Hung, Shih-Han

2015-10-01

Acoustic analysis is often used in speech evaluation but seldom for the evaluation of oral prostheses designed for reconstruction of surgical defect. This study aimed to introduce the application of acoustic analysis for patients with velopharyngeal insufficiency (VPI) due to oral surgery and rehabilitated with oral speech-aid prostheses. The pre- and postprosthetic rehabilitation acoustic features of sustained vowel sounds from two patients with VPI were analyzed and compared with the acoustic analysis software Praat. There were significant differences in the octave spectrum of sustained vowel speech sound between the pre- and postprosthetic rehabilitation. Acoustic measurements of sustained vowels for patients before and after prosthetic treatment showed no significant differences for all parameters of fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, formant frequency, F1 bandwidth, and band energy difference. The decrease in objective nasality perceptions correlated very well with the decrease in dips of the spectra for the male patient with a higher speech bulb height. Acoustic analysis may be a potential technique for evaluating the functions of oral speech-aid prostheses, which eliminates dysfunctions due to the surgical defect and contributes to a high percentage of intelligible speech. Octave spectrum analysis may also be a valuable tool for detecting changes in nasality characteristics of the voice during prosthetic treatment of VPI. Copyright © 2014. Published by Elsevier B.V.
Acoustics in Halls for Speech and Music

NASA Astrophysics Data System (ADS)

Gade, Anders C.

This chapter deals specifically with concepts, tools, and architectural variables of importance when designing auditoria for speech and music. The focus will be on cultivating the useful components of the sound in the room rather than on avoiding noise from outside or from installations, which is dealt with in Chap. 11. The chapter starts by presenting the subjective aspects of the room acoustic experience according to consensus at the time of writing. Then follows a description of their objective counterparts, the objective room acoustic parameters, among which the classical reverberation time measure is only one of many, but still of fundamental value. After explanations on how these parameters can be measured and predicted during the design phase, the remainder of the chapter deals with how the acoustic properties can be controlled by the architectural design of auditoria. This is done by presenting the influence of individual design elements as well as brief descriptions of halls designed for specific purposes, such as drama, opera, and symphonic concerts. Finally, some important aspects of loudspeaker installations in auditoria are briefly touched upon.
Digital signal processing at Bell Labs-Foundations for speech and acoustics research

NASA Astrophysics Data System (ADS)

Rabiner, Lawrence R.

2004-05-01

Digital signal processing (DSP) is a fundamental tool for much of the research that has been carried out at Bell Labs in the areas of speech and acoustics research. The fundamental bases for DSP include the sampling theorem of Nyquist, the method for digitization of analog signals by Shannon et al., methods of spectral analysis by Tukey, the cepstrum by Bogert et al., and the FFT by Tukey (and Cooley of IBM). Essentially all of these early foundations of DSP came out of the Bell Labs Research Lab in the 1930s, 1940s, 1950s, and 1960s. This fundamental research was motivated by fundamental applications (mainly in the areas of speech, sonar, and acoustics) that led to novel design methods for digital filters (Kaiser, Golden, Rabiner, Schafer), spectrum analysis methods (Rabiner, Schafer, Allen, Crochiere), fast convolution methods based on the FFT (Helms, Bergland), and advanced digital systems used to implement telephony channel banks (Jackson, McDonald, Freeny, Tewksbury). This talk summarizes the key contributions to DSP made at Bell Labs, and illustrates how DSP was utilized in the areas of speech and acoustics research. It also shows the vast, worldwide impact of this DSP research on modern consumer electronics.
Perceptual, auditory and acoustic vocal analysis of speech and singing in choir conductors.

PubMed

Rehder, Maria Inês Beltrati Cornacchioni; Behlau, Mara

2008-01-01

Background: the voice of choir conductors. Aim: to evaluate the vocal quality of choir conductors based on the production of a sustained vowel during singing and when speaking in order to observe auditory and acoustic differences. Method: participants of this study were 100 choir conductors, with an equal distribution between genders. Participants were asked to produce the sustained vowel "é" using a singing and speaking voice. Speech samples were analyzed based on auditory-perceptive and acoustic parameters. The auditory-perceptive analysis was carried out by two speech-language pathologists, specialists in this field of knowledge. The acoustic analysis was carried out with the support of the computer software Doctor Speech (Tiger Electronics, SRD, USA, version 4.0), using the Real Analysis module. Results: the auditory-perceptive analysis of the vocal quality indicated that most conductors have adapted voices, presenting more alterations in their speaking voice. The acoustic analysis indicated different values between genders and between the different production modalities. The fundamental frequency was higher in the singing voice, as well as the values for the first formant; the second formant presented lower values in the singing voice, with statistically significant results only for women. Conclusion: the voice of choir conductors is adapted, presenting fewer deviations in the singing voice when compared to the speaking voice. Productions differ based on the voice modality, singing or speaking.
Acoustic foundations of the speech-to-song illusion.

PubMed

Tierney, Adam; Patel, Aniruddh D; Breen, Mara

2018-06-01

In the "speech-to-song illusion," certain spoken phrases are heard as highly song-like when isolated from context and repeated. This phenomenon occurs to a greater degree for some stimuli than for others, suggesting that particular cues prompt listeners to perceive a spoken phrase as song. Here we investigated the nature of these cues across four experiments. In Experiment 1, participants were asked to rate how song-like spoken phrases were after each of eight repetitions. Initial ratings were correlated with the consistency of an underlying beat and within-syllable pitch slope, while rating change was linked to beat consistency, within-syllable pitch slope, and melodic structure. In Experiment 2, the within-syllable pitch slope of the stimuli was manipulated, and this manipulation changed the extent to which participants heard certain stimuli as more musical than others. In Experiment 3, the extent to which the pitch sequences of a phrase fit a computational model of melodic structure was altered, but this manipulation did not have a significant effect on musicality ratings. In Experiment 4, the consistency of intersyllable timing was manipulated, but this manipulation did not have an effect on the change in perceived musicality after repetition. Our methods provide a new way of studying the causal role of specific acoustic features in the speech-to-song illusion via subtle acoustic manipulations of speech, and show that listeners can rapidly (and implicitly) assess the degree to which nonmusical stimuli contain musical structure. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Multireceiver Acoustic Communications in Time-Varying Environments

DTIC Science & Technology

2014-06-01

Canberra, ACT, 2012, pp. 1–7. [7] W. Chen and F. Yanjun, “Physical layer design consideration for underwater acoustic sensor networks,” 3rd IEEE Int... analysis of underwater acoustic MIMO communications,” OCEANS, Sydney, NSW, 2010, pp. 1–8. [9] Wines lab (2013). Wireless networks and embedded... NETWORKS ... 3 B. CHALLENGES OF UNDERWATER ACOUSTIC COMMUNICATIONS
How Speech Communication Training Interfaces with Public Relations Training.

ERIC Educational Resources Information Center

Bosley, Phyllis B.

Speech communication training is a valuable asset for those entering the public relations (PR) field. This notion is reinforced by the 1987 "Design for Undergraduate Public Relations Education," a guide for implementing speech communication courses within a public relations curriculum, and also in the incorporation of oral communication training…
High-speed acoustic communication by multiplexing orbital angular momentum

PubMed Central

Shi, Chengzhi; Dubois, Marc; Wang, Yuan

2017-01-01

Long-range acoustic communication is crucial to underwater applications such as collection of scientific data from benthic stations, ocean geology, and remote control of off-shore industrial activities. However, the transmission rate of acoustic communication is always limited by the narrow-frequency bandwidth of the acoustic waves because of the large attenuation for high-frequency sound in water. Here, we demonstrate a high-throughput communication approach using the orbital angular momentum (OAM) of acoustic vortex beams with one order enhancement of the data transmission rate at a single frequency. The topological charges of OAM provide intrinsically orthogonal channels, offering a unique ability to multiplex data transmission within a single acoustic beam generated by a transducer array, drastically increasing the information channels and capacity of acoustic communication. A high spectral efficiency of 8.0 ± 0.4 (bit/s)/Hz in acoustic communication has been achieved using topological charges between −4 and +4 without applying other communication modulation techniques. Such OAM is a completely independent degree of freedom which can be readily integrated with other state-of-the-art communication modulation techniques like quadrature amplitude modulation (QAM) and phase-shift keying (PSK). Information multiplexing through OAM opens a dimension for acoustic communication, providing a data transmission rate that is critical for underwater applications. PMID:28652341
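The orthogonality of OAM channels described in this record can be illustrated with a small sketch. It is not the authors' implementation: it assumes an idealized ring of 16 receivers, on-off keying of each channel, and the eight nonzero topological charges between −4 and +4 (the abstract gives only the range). Each charge l contributes an azimuthal phase pattern exp(i·l·θ), and projecting the summed field onto each pattern recovers the bit carried by that channel.

```python
import cmath
import math

# Assumption: eight nonzero charges; the abstract states only the range -4..+4.
CHARGES = [-4, -3, -2, -1, 1, 2, 3, 4]
N_SENSORS = 16  # hypothetical ring of equally spaced receivers


def multiplex(bits):
    """Superpose one azimuthal phase pattern exp(i*l*theta) per bit that is 1."""
    field = []
    for k in range(N_SENSORS):
        theta = 2 * math.pi * k / N_SENSORS
        field.append(sum(b * cmath.exp(1j * l * theta)
                         for b, l in zip(bits, CHARGES)))
    return field


def demultiplex(field):
    """Project the sampled field onto each charge's pattern and threshold."""
    bits = []
    for l in CHARGES:
        projection = sum(
            p * cmath.exp(-1j * l * 2 * math.pi * k / N_SENSORS)
            for k, p in enumerate(field)
        ) / N_SENSORS
        bits.append(1 if abs(projection) > 0.5 else 0)
    return bits


sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = demultiplex(multiplex(sent))
print(received == sent)
```

With 16 samples around the ring, any two distinct charges in this set differ by less than 16, so their sampled patterns stay exactly orthogonal; a sparser ring would alias some channels onto each other.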
Research in speech communication.

PubMed

Flanagan, J

1995-10-24

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.
Formant Centralization Ratio: A Proposal for a New Acoustic Measure of Dysarthric Speech

ERIC Educational Resources Information Center

Sapir, Shimon; Ramig, Lorraine O.; Spielman, Jennifer L.; Fox, Cynthia

2010-01-01

Purpose: The vowel space area (VSA) has been used as an acoustic metric of dysarthric speech, but with varying degrees of success. In this study, the authors aimed to test an alternative metric to the VSA--the "formant centralization ratio" (FCR), which is hypothesized to more effectively differentiate dysarthric from healthy speech and register…
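As a rough illustration of the two metrics compared in this record, the sketch below computes a triangular VSA and the FCR from the first two formants of the corner vowels /i/, /a/, and /u/. The FCR expression used here, (F2u + F2a + F1i + F1u) / (F2i + F1a), is the form in which the ratio is usually cited; the truncated abstract does not state it, so treat it as an assumption. The formant values are invented for illustration and are not data from the study.

```python
def triangular_vsa(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1-F2 plane."""
    return 0.5 * abs(
        f1_i * (f2_a - f2_u) + f1_a * (f2_u - f2_i) + f1_u * (f2_i - f2_a)
    )


def formant_centralization_ratio(f1_i, f2_i, f1_a, f2_a, f1_u, f2_u):
    """FCR as commonly cited: rises as the vowels centralize."""
    return (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)


# Hypothetical formant values in Hz, ordered (F1i, F2i, F1a, F2a, F1u, F2u).
typical = (300, 2300, 750, 1300, 350, 900)
centralized = (350, 2000, 650, 1300, 400, 1100)

for label, formants in (("typical", typical), ("centralized", centralized)):
    print(label,
          triangular_vsa(*formants),
          round(formant_centralization_ratio(*formants), 3))
```

In this toy comparison the centralized vowel set yields a smaller VSA and a larger FCR, which is the direction of change the two metrics are meant to capture.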
Examining Acoustic and Kinematic Measures of Articulatory Working Space: Effects of Speech Intensity

ERIC Educational Resources Information Center

Whitfield, Jason A.; Dromey, Christopher; Palmer, Panika

2018-01-01

Purpose: The purpose of this study was to examine the effect of speech intensity on acoustic and kinematic vowel space measures and conduct a preliminary examination of the relationship between kinematic and acoustic vowel space metrics calculated from continuously sampled lingual marker and formant traces. Method: Young adult speakers produced 3…
[Acoustic voice analysis using the Praat program: comparative study with the Dr. Speech program].

PubMed

Núñez Batalla, Faustino; González Márquez, Rocío; Peláez González, M Belén; González Laborda, Irene; Fernández Fernández, María; Morato Galán, Marta

2014-01-01

The European Laryngological Society (ELS) basic protocol for functional assessment of voice pathology includes 5 different approaches: perception, videostroboscopy, acoustics, aerodynamics and subjective rating by the patient. In this study we focused on acoustic voice analysis. The purpose of the present study was to correlate the results obtained by the commercial software Dr. Speech and the free software Praat in 2 fields: 1. Narrow-band spectrogram (the presence of noise according to Yanagihara, and the presence of subharmonics) (semi-quantitative). 2. Voice acoustic parameters (jitter, shimmer, harmonics-to-noise ratio, fundamental frequency) (quantitative). We studied a total of 99 voice samples from individuals with Reinke's oedema diagnosed using videostroboscopy. One independent observer used Dr. Speech 3.0 and a second one used the Praat program (Phonetic Sciences, University of Amsterdam). The spectrographic analysis consisted of obtaining a narrow-band spectrogram from the previous digitalised voice samples by the 2 independent observers. They then determined the presence of noise in the spectrogram, using the Yanagihara grades, as well as the presence of subharmonics. As a final result, the acoustic parameters of jitter, shimmer, harmonics-to-noise ratio and fundamental frequency were obtained from the 2 acoustic analysis programs. The results indicated that the sound spectrogram and the numerical values obtained for shimmer and jitter were similar for both computer programs, even though types 1, 2 and 3 voice samples were analysed. The Praat and Dr. Speech programs provide similar results in the acoustic analysis of pathological voices. Copyright © 2013 Elsevier España, S.L. All rights reserved.
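For readers unfamiliar with the quantitative parameters named in this record, the sketch below shows the usual "local" definition of jitter and shimmer: the mean absolute difference between consecutive cycles divided by the mean value, applied to glottal periods for jitter and to peak amplitudes for shimmer. It assumes the per-cycle periods and amplitudes have already been extracted; neither program's cycle-detection method is reproduced here, and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical per-cycle measurements from a sustained vowel.
periods_s = [0.0100, 0.0101, 0.0099, 0.0100]   # glottal periods in seconds
amplitudes = [0.80, 0.82, 0.79, 0.81]          # peak amplitude per cycle

jitter_percent = 100 * local_perturbation(periods_s)
shimmer_percent = 100 * local_perturbation(amplitudes)
print(round(jitter_percent, 2), round(shimmer_percent, 2))
```

Because both measures depend on how cycles are located in the waveform, two programs can agree on the definition and still report different numbers for the same recording.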
Comparative efficacy of the picture exchange communication system (PECS) versus a speech-generating device: effects on social-communicative skills and speech development.

PubMed

Boesch, Miriam C; Wendt, Oliver; Subramanian, Anu; Hsu, Ning

2013-09-01

The Picture Exchange Communication System (PECS) and a speech-generating device (SGD) were compared in a study with a multiple baseline, alternating treatment design. The effectiveness of these methods in increasing social-communicative behavior and natural speech production was assessed with three elementary school-aged children with severe autism who demonstrated extremely limited functional communication skills. Results for social-communicative behavior were mixed for all participants in both treatment conditions. Relatively little difference was observed between PECS and SGD conditions. Although findings were inconclusive, data patterns suggest that Phase II of the PECS training protocol is conducive to encouraging social-communicative behavior. Data for speech outcomes did not reveal any increases across participants, and no differences between treatment conditions were observed.
Influence of compact disk recording protocols on reliability and comparability of speech audiometry outcomes: acoustic analysis.

PubMed

Di Berardino, F; Tognola, G; Paglialonga, A; Alpini, D; Grandori, F; Cesarani, A

2010-08-01

To assess whether different compact disk recording protocols, used to prepare speech test material, affect the reliability and comparability of speech audiometry testing. We conducted acoustic analysis of compact disks used in clinical practice, to determine whether speech material had been recorded using similar procedures. To assess the impact of different recording procedures on speech test outcomes, normal hearing subjects were tested using differently prepared compact disks, and their psychometric curves compared. Acoustic analysis revealed that speech material had been recorded using different protocols. The major difference was the gain between the levels at which the speech material and the calibration signal had been recorded. Although correct calibration of the audiometer was performed for each compact disk before testing, speech recognition thresholds and maximum intelligibility thresholds differed significantly between compact disks (p < 0.05), and were influenced by the gain between the recording level of the speech material and the calibration signal. To ensure the reliability and comparability of speech test outcomes obtained using different compact disks, it is recommended to check for possible differences in the recording gains used to prepare the compact disks, and then to compensate for any differences before testing.
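The recording gain discussed in this record can be checked with a simple level comparison. The sketch below is a minimal illustration, not the authors' analysis procedure: it assumes the speech material and the calibration signal are available as lists of samples on the same scale, and it reports the difference between their RMS levels in dB. The synthetic tones stand in for real tracks.

```python
import math


def rms_level_db(samples):
    """RMS level in dB relative to a full-scale value of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def recording_gain_db(speech, calibration):
    """Level of the speech material relative to the calibration signal."""
    return rms_level_db(speech) - rms_level_db(calibration)


def tone(amplitude, freq_hz=1000, rate_hz=8000, seconds=1):
    """Synthetic sine tone used here in place of real CD tracks."""
    n = rate_hz * seconds
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate_hz)
            for i in range(n)]


calibration_track = tone(1.0)   # hypothetical calibration signal
speech_track = tone(0.5)        # hypothetical speech at half the amplitude
print(round(recording_gain_db(speech_track, calibration_track), 2))
```

Two disks whose gains differ by several dB would present speech at different levels even after the audiometer is calibrated to each disk's tone, which is the discrepancy the record recommends compensating for.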
[Simulation of speech perception with cochlear implants : Influence of frequency and level of fundamental frequency components with electronic acoustic stimulation].

PubMed

Rader, T; Fastl, H; Baumann, U

2017-03-01

After implantation of cochlear implants with hearing preservation for combined electronic acoustic stimulation (EAS), the residual acoustic hearing ability relays fundamental speech frequency information in the low frequency range. With the help of acoustic simulation of EAS hearing perception the impact of frequency and level fine structure of speech signals can be systematically examined. The aim of this study was to measure the speech reception threshold (SRT) under various noise conditions with acoustic EAS simulation by variation of the frequency and level information of the fundamental frequency f0 of speech. The study was carried out to determine to what extent the SRT is impaired by modification of the f0 fine structure. Using partial tone time pattern analysis an acoustic EAS simulation of the speech material from the Oldenburg sentence test (OLSA) was generated. In addition, determination of the f0 curve of the speech material was conducted. Subsequently, either the parameter frequency or level of f0 was fixed in order to remove one of the two fine contour information of the speech signal. The processed OLSA sentences were used to determine the SRT in background noise under various test conditions. The conditions "f0 fixed frequency" and "f0 fixed level" were tested under two different situations, under "amplitude modulated background noise" and "continuous background noise" conditions. A total of 24 subjects with normal hearing participated in the study. The SRT in background noise for the condition "f0 fixed frequency" was more favorable in continuous noise with 2.7 dB and in modulated noise with 0.8 dB compared to the condition "f0 fixed level" with 3.7 dB and 2.9 dB, respectively. In the simulation of speech perception with cochlear implants and acoustic components, the level information of the fundamental frequency had a stronger impact on speech intelligibility than the frequency information. The method of simulation of transmission of
Effects of Age, Acoustic Challenge, and Verbal Working Memory on Recall of Narrative Speech.

PubMed

Ward, Caitlin M; Rogers, Chad S; Van Engen, Kristin J; Peelle, Jonathan E

2016-01-01

A common goal during speech comprehension is to remember what we have heard. Encoding speech into long-term memory frequently requires processes such as verbal working memory that may also be involved in processing degraded speech. Here the authors tested whether young and older adult listeners' memory for short stories was worse when the stories were acoustically degraded, or whether the additional contextual support provided by a narrative would protect against these effects. The authors tested 30 young adults (aged 18-28 years) and 30 older adults (aged 65-79 years) with good self-reported hearing. Participants heard short stories that were presented as normal (unprocessed) speech or acoustically degraded using a noise vocoding algorithm with 24 or 16 channels. The degraded stories were still fully intelligible. Following each story, participants were asked to repeat the story in as much detail as possible. Recall was scored using a modified idea unit scoring approach, which included separately scoring hierarchical levels of narrative detail. Memory for acoustically degraded stories was significantly worse than for normal stories at some levels of narrative detail. Older adults' memory for the stories was significantly worse overall, but there was no interaction between age and acoustic clarity or level of narrative detail. Verbal working memory (assessed by reading span) significantly correlated with recall accuracy for both young and older adults, whereas hearing ability (better ear pure tone average) did not. The present findings are consistent with a framework in which the additional cognitive demands caused by a degraded acoustic signal use resources that would otherwise be available for memory encoding for both young and older adults. Verbal working memory is a likely candidate for supporting both of these processes.
Acoustics of Clear and Noise-Adapted Speech in Children, Young, and Older Adults

ERIC Educational Resources Information Center

Smiljanic, Rajka; Gilbert, Rachael C.

2017-01-01

Purpose: This study investigated acoustic-phonetic modifications produced in noise-adapted speech (NAS) and clear speech (CS) by children, young adults, and older adults. Method: Ten children (11-13 years of age), 10 young adults (18-29 years of age), and 10 older adults (60-84 years of age) read sentences in conversational and clear speaking…
Accuracy of Perceptual and Acoustic Methods for the Detection of Inspiratory Loci in Spontaneous Speech

PubMed Central

Wang, Yu-Tsai; Nip, Ignatius S. B.; Green, Jordan R.; Kent, Ray D.; Kent, Jane Finley; Ullman, Cara

2012-01-01

The current study investigates the accuracy of perceptually and acoustically determined inspiratory loci in spontaneous speech for the purpose of identifying breath groups. Sixteen participants were asked to talk about simple topics in daily life at a comfortable speaking rate and loudness while connected to a pneumotach and audio microphone. The locations of inspiratory loci were determined based on the aerodynamic signal, which served as a reference for loci identified perceptually and acoustically. Signal detection theory was used to evaluate the accuracy of the methods. The results showed that the greatest accuracy in pause detection was achieved (1) perceptually based on the agreement between at least 2 of the 3 judges; (2) acoustically using a pause duration threshold of 300 ms. In general, the perceptually-based method was more accurate than was the acoustically-based method. Inconsistencies among perceptually-determined, acoustically-determined, and aerodynamically-determined inspiratory loci for spontaneous speech should be weighed in selecting a method of breath-group determination. PMID:22362007
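The acoustic criterion reported in this record, a pause duration threshold of 300 ms, can be sketched as a run-length search over frame-level silence decisions. The frame length of 10 ms and the input format (one boolean per frame) are assumptions made for illustration; how silence is decided for each frame is outside the sketch.

```python
def find_pauses(is_silent, frame_ms=10, min_pause_ms=300):
    """Return (start_ms, duration_ms) for each silent run at or above threshold."""
    pauses = []
    run_start = None
    # A trailing False closes any silent run that reaches the end of the input.
    for i, silent in enumerate(list(is_silent) + [False]):
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            duration_ms = (i - run_start) * frame_ms
            if duration_ms >= min_pause_ms:
                pauses.append((run_start * frame_ms, duration_ms))
            run_start = None
    return pauses


# Hypothetical frame decisions: speech, a 350 ms gap, speech, a 200 ms gap, speech.
frames = [False] * 20 + [True] * 35 + [False] * 10 + [True] * 20 + [False] * 5
print(find_pauses(frames))
```

Only the 350 ms gap is reported as a candidate inspiratory locus; the 200 ms gap falls below the threshold and is treated as part of the same breath group.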
Amplitude Modulations of Acoustic Communication Signals

NASA Astrophysics Data System (ADS)

Turesson, Hjalmar K.

2011-12-01

In human speech, amplitude modulations at 3–8 Hz are important for discrimination and detection. Two different neurophysiological theories have been proposed to explain this effect. The first theory proposes that, as a consequence of neocortical synaptic dynamics, signals that are amplitude modulated at 3–8 Hz are propagated better than un-modulated signals, or signals modulated above 8 Hz. This suggests that neural activity elicited by vocalizations modulated at 3–8 Hz is optimally transmitted, and the vocalizations better discriminated and detected. The second theory proposes that 3–8 Hz amplitude modulations interact with spontaneous neocortical oscillations. Specifically, vocalizations modulated at 3–8 Hz entrain local populations of neurons, which in turn, modulate the amplitude of high frequency gamma oscillations. This suggests that vocalizations modulated at 3–8 Hz should induce stronger cross-frequency coupling. Similar to human speech, we found that macaque monkey vocalizations also are amplitude modulated between 3 and 8 Hz. Humans and macaque monkeys share similarities in vocal production, implying that the auditory systems subserving perception of acoustic communication signals also share similarities. Based on the similarities between human speech and macaque monkey vocalizations, we addressed how amplitude modulated vocalizations are processed in the auditory cortex of macaque monkeys, and what behavioral relevance modulations may have. Recording single-neuron activity, as well as the activity of local populations of neurons, allowed us to test both of the neurophysiological theories presented above. We found that single neuron responses to vocalizations amplitude modulated at 3–8 Hz resulted in better stimulus discrimination than vocalizations lacking 3–8 Hz modulations, and that the effect most likely was mediated by synaptic dynamics. In contrast, we failed to find support for the oscillation-based model proposing a
Research in speech communication.

PubMed Central

Flanagan, J

1995-01-01

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806

Speech Act Theory and Business Communication Conventions.

ERIC Educational Resources Information Center

Ewald, Helen Rothschild; Stine, Donna

1983-01-01

Applies speech act theory to business writing to determine why certain letters and memos succeed while others fail. Specifically, shows how speech act theorist H. P. Grice's rules or maxims illuminate the writing process in business communication. (PD)
A chimpanzee recognizes synthetic speech with significantly reduced acoustic cues to phonetic content.

PubMed

Heimbauer, Lisa A; Beran, Michael J; Owren, Michael J

2011-07-26

A long-standing debate concerns whether humans are specialized for speech perception, which some researchers argue is demonstrated by the ability to understand synthetic speech with significantly reduced acoustic cues to phonetic content. We tested a chimpanzee (Pan troglodytes) that recognizes 128 spoken words, asking whether she could understand such speech. Three experiments presented 48 individual words, with the animal selecting a corresponding visuographic symbol from among four alternatives. Experiment 1 tested spectrally reduced, noise-vocoded (NV) synthesis, originally developed to simulate input received by human cochlear-implant users. Experiment 2 tested "impossibly unspeechlike" sine-wave (SW) synthesis, which reduces speech to just three moving tones. Although receiving only intermittent and noncontingent reward, the chimpanzee performed well above chance level, including when hearing synthetic versions for the first time. Recognition of SW words was least accurate but improved in experiment 3 when natural words in the same session were rewarded. The chimpanzee was more accurate with NV than SW versions, as were 32 human participants hearing these items. The chimpanzee's ability to spontaneously recognize acoustically reduced synthetic words suggests that experience rather than specialization is critical for speech-perception capabilities that some have suggested are uniquely human. Copyright © 2011 Elsevier Ltd. All rights reserved.
Twelve-month-old infants recognize that speech can communicate unobservable intentions.

PubMed

Vouloumanos, Athena; Onishi, Kristine H; Pogue, Amanda

2012-08-07

Much of our knowledge is acquired not from direct experience but through the speech of others. Speech allows rapid and efficient transfer of information that is otherwise not directly observable. Do infants recognize that speech, even if unfamiliar, can communicate about an important aspect of the world that cannot be directly observed: a person's intentions? Twelve-month-olds saw a person (the Communicator) attempt but fail to achieve a target action (stacking a ring on a funnel). The Communicator subsequently directed either speech or a nonspeech vocalization to another person (the Recipient) who had not observed the attempts. The Recipient either successfully stacked the ring (Intended outcome), attempted but failed to stack the ring (Observable outcome), or performed a different stacking action (Related outcome). Infants recognized that speech could communicate about unobservable intentions, looking longer at Observable and Related outcomes than the Intended outcome when the Communicator used speech. However, when the Communicator used nonspeech, infants looked equally at the three outcomes. Thus, for 12-month-olds, speech can transfer information about unobservable aspects of the world such as internal mental states, which provides preverbal infants with a tool for acquiring information beyond their immediate experience.
Twelve-month-old infants recognize that speech can communicate unobservable intentions

PubMed Central

Vouloumanos, Athena; Onishi, Kristine H.; Pogue, Amanda

2012-01-01

Much of our knowledge is acquired not from direct experience but through the speech of others. Speech allows rapid and efficient transfer of information that is otherwise not directly observable. Do infants recognize that speech, even if unfamiliar, can communicate about an important aspect of the world that cannot be directly observed: a person’s intentions? Twelve-month-olds saw a person (the Communicator) attempt but fail to achieve a target action (stacking a ring on a funnel). The Communicator subsequently directed either speech or a nonspeech vocalization to another person (the Recipient) who had not observed the attempts. The Recipient either successfully stacked the ring (Intended outcome), attempted but failed to stack the ring (Observable outcome), or performed a different stacking action (Related outcome). Infants recognized that speech could communicate about unobservable intentions, looking longer at Observable and Related outcomes than the Intended outcome when the Communicator used speech. However, when the Communicator used nonspeech, infants looked equally at the three outcomes. Thus, for 12-month-olds, speech can transfer information about unobservable aspects of the world such as internal mental states, which provides preverbal infants with a tool for acquiring information beyond their immediate experience. PMID:22826217
A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals With Parkinson's Disease

PubMed Central

Choi, Yaelin

2017-01-01

Purpose: The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method: A total of 48 speakers from the 4 speaker groups (AE speakers with PD, Korean speakers with PD, healthy English speakers, and healthy Korean speakers) were asked to read a paragraph in their native languages. Four acoustic variables were analyzed: acoustic vowel space, voice onset time contrast scores, normalized pairwise variability index, and articulation rate. Speech intelligibility scores were obtained from scaled estimates of sentences extracted from the paragraph. Results: The findings indicated that the multiple regression models of speech intelligibility were different in Korean and AE, even with the same set of predictor variables and with speakers matched on speech intelligibility across languages. Analysis of the descriptive data for the acoustic variables showed the expected compression of the vowel space in speakers with PD in both languages, lower normalized pairwise variability index scores in Korean compared with AE, and no differences within or across language in articulation rate. Conclusions: The results indicate that the basis of an intelligibility deficit in dysarthria is likely to depend on the native language of the speaker and listener. Additional research is required to explore other potential predictor variables, as well as additional language comparisons to pursue cross-linguistic considerations in classification and diagnosis of dysarthria types. PMID:28821018
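One of the four predictors in this record, the normalized pairwise variability index (nPVI), has a compact standard definition: 100 times the mean, over successive pairs of intervals, of the absolute duration difference divided by the pair's mean duration. The sketch below implements that definition; whether the study measured vocalic or other intervals is not stated in the abstract, and the durations shown are invented for illustration.

```python
def npvi(durations):
    """Normalized pairwise variability index over successive interval durations."""
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    terms = [
        abs(a - b) / ((a + b) / 2)
        for a, b in zip(durations, durations[1:])
    ]
    return 100 * sum(terms) / len(terms)


# Hypothetical interval durations in milliseconds.
even_timing = [100, 100, 100, 100]
alternating = [100, 200, 100, 200]
print(npvi(even_timing), round(npvi(alternating), 1))
```

Evenly timed intervals give an nPVI of 0, and strongly alternating long and short intervals push it up, so a lower score indicates less durational contrast between neighbouring intervals.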
Speech intelligibility in complex acoustic environments in young children

NASA Astrophysics Data System (ADS)

Litovsky, Ruth

2003-04-01

While the auditory system undergoes tremendous maturation during the first few years of life, it has become clear that in complex scenarios when multiple sounds occur and when echoes are present, children's performance is significantly worse than that of their adult counterparts. The ability of children (3-7 years of age) to understand speech in a simulated multi-talker environment and to benefit from spatial separation of the target and competing sounds was investigated. In these studies, competing sources vary in number, location, and content (speech, modulated or unmodulated speech-shaped noise and time-reversed speech). The acoustic spaces were also varied in size and amount of reverberation. Finally, children with chronic otitis media who received binaural training were tested pre- and post-training on a subset of conditions. Results indicated the following. (1) Children experienced significantly more masking than adults, even in the simplest conditions tested. (2) When the target and competing sounds were spatially separated speech intelligibility improved, but the amount varied with age, type of competing sound, and number of competitors. (3) In a large reverberant classroom there was no benefit of spatial separation. (4) Binaural training improved speech intelligibility performance in children with otitis media. Future work includes similar studies in children with unilateral and bilateral cochlear implants. [Work supported by NIDCD, DRF, and NOHR.]
Speech Adaptation to Kinematic Recording Sensors: Perceptual and Acoustic Findings

ERIC Educational Resources Information Center

Dromey, Christopher; Hunter, Elise; Nissen, Shawn L.

2018-01-01

Purpose: This study used perceptual and acoustic measures to examine the time course of speech adaptation after the attachment of electromagnetic sensor coils to the tongue, lips, and jaw. Method: Twenty native English speakers read aloud stimulus sentences before the attachment of the sensors, immediately after attachment, and again 5, 10, 15,…
The Carolinas Speech Communication Annual, 1997.

ERIC Educational Resources Information Center

McKinney, Bruce C.

1997-01-01

This 1997 issue of "The Carolinas Speech Communication Annual" contains the following articles: "'Bridges of Understanding': UNESCO's Creation of a Fantasy for the American Public" (Michael H. Eaves and Charles F. Beadle, Jr.); "Developing a Communication Cooperative: A Student, Faculty, and Organizational Learning…
Speech acoustic markers of early stage and prodromal Huntington's disease: a marker of disease onset?

PubMed

Vogel, Adam P; Shirbin, Christopher; Churchyard, Andrew J; Stout, Julie C

2012-12-01

Speech disturbances (e.g., altered prosody) have been described in symptomatic Huntington's Disease (HD) individuals, however, the extent to which speech changes in gene positive pre-manifest (PreHD) individuals is largely unknown. The speech of individuals carrying the mutant HTT gene is a behavioural/motor/cognitive marker demonstrating some potential as an objective indicator of early HD onset and disease progression. Speech samples were acquired from 30 individuals carrying the mutant HTT gene (13 PreHD, 17 early stage HD) and 15 matched controls. Participants read a passage, produced a monologue and said the days of the week. Data were analysed acoustically for measures of timing, frequency and intensity. There was a clear effect of group across most acoustic measures, so that speech performance differed in-line with disease progression. Comparisons across groups revealed significant differences between the control and the early stage HD group on measures of timing (e.g., speech rate). Participants carrying the mutant HTT gene presented with slower rates of speech, took longer to say words and produced greater silences between and within words compared to healthy controls. Importantly, speech rate showed a significant correlation to burden of disease scores. The speech of early stage HD differed significantly from controls. The speech of PreHD, although not reaching significance, tended to lie between the performance of controls and early stage HD. This suggests that changes in speech production appear to be developing prior to diagnosis. Copyright © 2012 Elsevier Ltd. All rights reserved.
Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

ERIC Educational Resources Information Center

Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

2014-01-01

Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…
Mobile Communication Devices, Ambient Noise, and Acoustic Voice Measures.

PubMed

Maryn, Youri; Ysenbaert, Femke; Zarowski, Andrzej; Vanspauwen, Robby

2017-03-01

The ability to move with mobile communication devices (MCDs; ie, smartphones and tablet computers) may induce differences in microphone-to-mouth positioning and use in noise-packed environments, and thus influence reliability of acoustic voice measurements. This study investigated differences in various acoustic voice measures between six recording systems in backgrounds with low and increasing noise levels. One chain of continuous speech and sustained vowel from 50 subjects with voice disorders (all separated by silence intervals) was radiated and re-recorded in an anechoic chamber with five MCDs and one high-quality recording system. These recordings were acquired in one condition without ambient noise and in four conditions with increased ambient noise. A total of 10 acoustic voice markers were obtained in the program Praat. Differences between MCDs and noise condition were assessed with Friedman repeated-measures test and posthoc Wilcoxon signed-rank tests, both for related samples, after Bonferroni correction. (1) Except median fundamental frequency and seven nonsignificant differences, MCD samples have significantly higher acoustic markers than clinical reference samples in minimal environmental noise. (2) Except median fundamental frequency, jitter local, and jitter rap, all acoustic measures on samples recorded with the reference system experienced significant influence from room noise levels. Fundamental frequency is resistant to recording system, environmental noise, and their combination. All other measures, however, were impacted by both recording system and noise condition, and especially by their combination, often already in the reference/baseline condition without added ambient noise. Caution is therefore warranted regarding implementation of MCDs as clinical recording tools, particularly when applied for treatment outcomes assessments. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Study of opto-acoustic communication between air and underwater carrier

NASA Astrophysics Data System (ADS)

Zong, Si-Guang; Liu, Tao; Cao, Jing; He, Qi-Yi

2018-02-01

Communicating with underwater targets has become a problem of common concern to militaries around the world. The laser-induced acoustic signal is a new kind of underwater acoustic source, with advantages such as high intensity, short pulse duration, and broad bandwidth. This paper studies an opto-acoustic communication method. The characteristics of the acoustic signal produced by laser-induced breakdown are studied, and the corresponding theoretical model is systematically analyzed. An experimental opto-acoustic communication setup is built from a high-power laser, a water tank, and a high-frequency hydrophone. Characteristics of the acoustic signal, such as intensity and frequency, are analyzed. This work is a step toward establishing the feasibility of laser-acoustic underwater communication.
Sensory-motor relationships in speech production in post-lingually deaf cochlear-implanted adults and normal-hearing seniors: Evidence from phonetic convergence and speech imitation.

PubMed

Scarbel, Lucie; Beautemps, Denis; Schwartz, Jean-Luc; Sato, Marc

2017-07-01

Speech communication can be viewed as an interactive process involving a functional coupling between sensory and motor systems. One striking example comes from phonetic convergence, when speakers automatically tend to mimic their interlocutor's speech during communicative interaction. The goal of this study was to investigate sensory-motor linkage in speech production in postlingually deaf cochlear implanted participants and normal hearing elderly adults through phonetic convergence and imitation. To this aim, two vowel production tasks, with or without instruction to imitate an acoustic vowel, were proposed to three groups: young adults with normal hearing, elderly adults with normal hearing, and post-lingually deaf cochlear-implanted patients. The deviation of each participant's f0 from their own mean f0 was measured to evaluate the ability to converge to each acoustic target. Results showed that cochlear-implanted participants have the ability to converge to an acoustic target, both intentionally and unintentionally, albeit to a lesser degree than young and elderly participants with normal hearing. By providing evidence for phonetic convergence and speech imitation, these results suggest that, as in young adults, perceptuo-motor relationships are efficient in elderly adults with normal hearing and that cochlear-implanted adults recovered significant perceptuo-motor abilities following cochlear implantation. Copyright © 2017 Elsevier Ltd. All rights reserved.
The dance of communication: retaining family membership despite severe non-speech dementia.

PubMed

Walmsley, Bruce D; McCormack, Lynne

2014-09-01

There is minimal research investigating non-speech communication as a result of living with severe dementia. This phenomenological study explores retained awareness expressed through non-speech patterns of communication in a family member living with severe dementia. Further, it describes reciprocal efforts used by all family members to engage in alternative patterns of communication. Family interactions were filmed to observe speech and non-speech relational communication. Participants were four family groups each with a family member living with non-speech communication as a result of severe dementia. Overall there were 16 participants. Data were analysed using thematic analysis. One superordinate theme, Dance of Communication, describes the interactive patterns that were observed during family communication. Two subordinate themes emerged: (a) in-step; characterised by communication that indicated harmony, spontaneity and reciprocity, and; (b) out-of-step; characterised by communication that indicated disharmony, syncopation, and vulnerability. This study highlights that retained awareness can exist at levels previously unrecognised in those living with limited or absent speech as a result of severe dementia. A recommendation for the development of a communication program for caregivers of individuals living with dementia is presented. © The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Acoustic communication in insect disease vectors

PubMed Central

Vigoder, Felipe de Mello; Ritchie, Michael Gordon; Gibson, Gabriella; Peixoto, Alexandre Afranio

2013-01-01

Acoustic signalling has been extensively studied in insect species, which has led to a better understanding of sexual communication, sexual selection and modes of speciation. The significance of acoustic signals for a blood-sucking insect was first reported in the XIX century by Christopher Johnston, studying the hearing organs of mosquitoes, but has received relatively little attention in other disease vectors until recently. Acoustic signals are often associated with mating behaviour and sexual selection and changes in signalling can lead to rapid evolutionary divergence and may ultimately contribute to the process of speciation. Songs can also have implications for the success of novel methods of disease control such as determining the mating competitiveness of modified insects used for mass-release control programs. Species-specific sound “signatures” may help identify incipient species within species complexes that may be of epidemiological significance, e.g. of higher vectorial capacity, thereby enabling the application of more focussed control measures to optimise the reduction of pathogen transmission. Although the study of acoustic communication in insect vectors has been relatively limited, this review of research demonstrates their value as models for understanding both the functional and evolutionary significance of acoustic communication in insects. PMID:24473800
Are you a good mimic? Neuro-acoustic signatures for speech imitation ability

PubMed Central

Reiterer, Susanne M.; Hu, Xiaochen; Sumathi, T. A.; Singh, Nandini C.

2013-01-01

We investigated individual differences in speech imitation ability in late bilinguals using a neuro-acoustic approach. One hundred and thirty-eight German-English bilinguals matched on various behavioral measures were tested for “speech imitation ability” in a foreign language, Hindi, and categorized into “high” and “low ability” groups. Brain activations and speech recordings were obtained from 26 participants from the two extreme groups as they performed a functional neuroimaging experiment which required them to “imitate” sentences in three conditions: (A) German, (B) English, and (C) German with fake English accent. We used recently developed novel acoustic analysis, namely the “articulation space” as a metric to compare speech imitation abilities of the two groups. Across all three conditions, direct comparisons between the two groups, revealed brain activations (FWE corrected, p < 0.05) that were more widespread with significantly higher peak activity in the left supramarginal gyrus and postcentral areas for the low ability group. The high ability group, on the other hand showed significantly larger articulation space in all three conditions. In addition, articulation space also correlated positively with imitation ability (Pearson's r = 0.7, p < 0.01). Our results suggest that an expanded articulation space for high ability individuals allows access to a larger repertoire of sounds, thereby providing skilled imitators greater flexibility in pronunciation and language learning. PMID:24155739
Developmental profile of speech-language and communicative functions in an individual with the preserved speech variant of Rett syndrome.

PubMed

Marschik, Peter B; Vollmann, Ralf; Bartl-Pokorny, Katrin D; Green, Vanessa A; van der Meer, Larah; Wolin, Thomas; Einspieler, Christa

2014-08-01

We assessed various aspects of speech-language and communicative functions of an individual with the preserved speech variant of Rett syndrome (RTT) to describe her developmental profile over a period of 11 years. For this study, we incorporated the following data resources and methods to assess speech-language and communicative functions during pre-, peri- and post-regressional development: retrospective video analyses, medical history data, parental checklists and diaries, standardized tests on vocabulary and grammar, spontaneous speech samples and picture stories to elicit narrative competences. Despite achieving speech-language milestones, atypical behaviours were present at all times. We observed a unique developmental speech-language trajectory (including the RTT typical regression) affecting all linguistic and socio-communicative sub-domains in the receptive as well as the expressive modality. Future research should take into consideration a potentially considerable discordance between formal and functional language use by interpreting communicative acts on a more cautionary note.
Increased pain intensity is associated with greater verbal communication difficulty and increased production of speech and co-speech gestures.

PubMed

Rowbotham, Samantha; Wardy, April J; Lloyd, Donna M; Wearden, Alison; Holler, Judith

2014-01-01

Effective pain communication is essential if adequate treatment and support are to be provided. Pain communication is often multimodal, with sufferers utilising speech, nonverbal behaviours (such as facial expressions), and co-speech gestures (bodily movements, primarily of the hands and arms that accompany speech and can convey semantic information) to communicate their experience. Research suggests that the production of nonverbal pain behaviours is positively associated with pain intensity, but it is not known whether this is also the case for speech and co-speech gestures. The present study explored whether increased pain intensity is associated with greater speech and gesture production during face-to-face communication about acute, experimental pain. Participants (N = 26) were exposed to experimentally elicited pressure pain to the fingernail bed at high and low intensities and took part in video-recorded semi-structured interviews. Despite rating more intense pain as more difficult to communicate (t(25) = 2.21, p = .037), participants produced significantly longer verbal pain descriptions and more co-speech gestures in the high intensity pain condition (Words: t(25) = 3.57, p = .001; Gestures: t(25) = 3.66, p = .001). This suggests that spoken and gestural communication about pain is enhanced when pain is more intense. Thus, in addition to conveying detailed semantic information about pain, speech and co-speech gestures may provide a cue to pain intensity, with implications for the treatment and support received by pain sufferers. Future work should consider whether these findings are applicable within the context of clinical interactions about pain.
Communication acoustics in Bell Labs

NASA Astrophysics Data System (ADS)

Flanagan, J. L.

2004-05-01

Communication acoustics has been a central theme in Bell Labs research since its inception. Telecommunication serves human information exchange. And, humans favor spoken language as a principal mode. The atmospheric medium typically provides the link between articulation and hearing. Creation, control and detection of sound, and the human's facility for generation and perception are basic ingredients of telecommunication. Electronics technology of the 1920s ushered in great advances in communication at a distance, a strong economical impetus being to overcome bandwidth limitations of wireline and cable. Early research established criteria for speech transmission with high quality and intelligibility. These insights supported exploration of means for efficient transmission-obtaining the greatest amount of speech information over a given bandwidth. Transoceanic communication was initiated by undersea cables for telegraphy. But these long cables exhibited very limited bandwidth (order of few hundred Hz). The challenge of sending voice across the oceans spawned perhaps the best known speech compression technique of history-the Vocoder, which parametrized the signal for transmission in about 300 Hz bandwidth, one-tenth that required for the typical waveform channel. Quality and intelligibility were grave issues (and they still are). At the same time parametric representation offered possibilities for encryption and privacy inside a traditional voice bandwidth. Confidential conversations between Roosevelt and Churchill during World War II were carried over high-frequency radio by an encrypted vocoder system known as Sigsaly. Major engineering advances in the late 1940s and early 1950s moved telecommunications into a new regime-digital technology. These key advances were at least three: (i) new understanding of time-discrete (sampled) representation of signals, (ii) digital computation (especially binary based), and (iii) evolving capabilities in microelectronics that
Developmental profile of speech-language and communicative functions in an individual with the Preserved Speech Variant of Rett syndrome

PubMed Central

Marschik, Peter B.; Vollmann, Ralf; Bartl-Pokorny, Katrin D.; Green, Vanessa A.; van der Meer, Larah; Wolin, Thomas; Einspieler, Christa

2018-01-01

Objective We assessed various aspects of speech-language and communicative functions of an individual with the preserved speech variant (PSV) of Rett syndrome (RTT) to describe her developmental profile over a period of 11 years. Methods For this study we incorporated the following data resources and methods to assess speech-language and communicative functions during pre-, peri- and post-regressional development: retrospective video analyses, medical history data, parental checklists and diaries, standardized tests on vocabulary and grammar, spontaneous speech samples, and picture stories to elicit narrative competences. Results Despite achieving speech-language milestones, atypical behaviours were present at all times. We observed a unique developmental speech-language trajectory (including the RTT typical regression) affecting all linguistic and socio-communicative sub-domains in the receptive as well as the expressive modality. Conclusion Future research should take into consideration a potentially considerable discordance between formal and functional language use by interpreting communicative acts on a more cautionary note. PMID:23870013

MURI: Impact of Oceanographic Variability on Acoustic Communications

DTIC Science & Technology

2011-09-01

multiplexing (OFDM), multiple-input/multiple-output (MIMO) transmissions, and multi-user single-input/multiple-output (SIMO) communications. Lastly... MIMO-OFDM communications: Receiver design for Doppler distorted underwater acoustic channels,” Proc. Asilomar Conf. on Signals, Systems, and... MIMO) will be of particular interest. Validating experimental data will be obtained during the ONR acoustic communications experiment in summer 2008
[Effects of acoustic adaptation of classrooms on the quality of verbal communication].

PubMed

Mikulski, Witold

2013-01-01

Voice organ disorders among teachers are caused by excessive voice strain. One of the measures to reduce this strain is to decrease background noise when teaching. Increasing the acoustic absorption of the room is a technical measure for achieving this aim. The absorption level also improves speech intelligibility rated by the following parameters: room reverberation time and speech transmission index (STI). This article presents the effects of acoustic adaptation of classrooms on the quality of verbal communication, aimed at getting the speech intelligibility at the good or excellent level. The article lists the criteria for evaluating classrooms in terms of the quality of verbal communication. The parameters were defined, using the measurement methods according to PN-EN ISO 3382-2:2010 and PN-EN 60268-16:2011. Acoustic adaptations were completed in two classrooms. After completing acoustic adaptations the reverberation time for the frequency of 1 kHz was reduced: in room no. 1 from 1.45 s to 0.44 s and in room no. 2 from 1.03 s to 0.37 s (maximum 0.65 s). At the same time, the speech transmission index increased: in room no. 1 from 0.55 (satisfactory speech intelligibility) to 0.75 (speech intelligibility close to excellent); in room no. 2 from 0.63 (good speech intelligibility) to 0.80 (excellent speech intelligibility). Therefore, it can be stated that prior to completing acoustic adaptations room no. 1 did not comply and room no. 2 barely complied with the criterion (speech transmission index of 0.62). After completing acoustic adaptations both rooms meet the requirements.
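The before/after figures in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the values quoted above. It assumes that the STI criterion of 0.62 is a minimum and that 0.65 s is a maximum reverberation time at 1 kHz. The function and constant names are illustrative, not taken from the article or the cited standards.

```python
# Values quoted in the abstract; the threshold semantics are an assumption.
STI_MIN = 0.62    # criterion quoted in the abstract, read here as a minimum
RT_MAX_S = 0.65   # "maximum 0.65 s" for the 1 kHz reverberation time

ROOMS = {
    "room no. 1": {"before": {"rt_s": 1.45, "sti": 0.55},
                   "after": {"rt_s": 0.44, "sti": 0.75}},
    "room no. 2": {"before": {"rt_s": 1.03, "sti": 0.63},
                   "after": {"rt_s": 0.37, "sti": 0.80}},
}


def meets_sti(sti):
    """True if the speech transmission index reaches the assumed minimum."""
    return sti >= STI_MIN


def meets_rt(rt_s):
    """True if the reverberation time does not exceed the assumed maximum."""
    return rt_s <= RT_MAX_S


def rt_reduction_percent(before_s, after_s):
    """Relative reduction in reverberation time, in percent."""
    return 100.0 * (before_s - after_s) / before_s


if __name__ == "__main__":
    for name, m in ROOMS.items():
        for stage in ("before", "after"):
            v = m[stage]
            print(name, stage,
                  "STI ok:", meets_sti(v["sti"]),
                  "RT ok:", meets_rt(v["rt_s"]))
        cut = rt_reduction_percent(m["before"]["rt_s"], m["after"]["rt_s"])
        print(name, "RT reduced by about", round(cut), "percent")
```

Read this way, room no. 2 passes the STI check before adaptation by only 0.01, which matches the abstract's "barely complied".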
On the importance of early reflections for speech in rooms.

PubMed

Bradley, J S; Sato, H; Picard, M

2003-06-01

This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.
The Oral Communication Competence Dilemma: Are We Communicating Competently about Speech Communication?

ERIC Educational Resources Information Center

Fleuriet, Cathy A.

1997-01-01

Questions survey results which find oral communication education alive and well in higher education. Argues that those outside the discipline need to be educated about the nature of speech communication education and that a concerted effort must be made by faculty and administrators to reinforce the academic credibility of the discipline. (PA)
An acoustical assessment of pitch-matching accuracy in relation to speech frequency, speech frequency range, age and gender in preschool children

NASA Astrophysics Data System (ADS)

Trollinger, Valerie L.

This study investigated the relationship between acoustical measurement of singing accuracy in relationship to speech fundamental frequency, speech fundamental frequency range, age and gender in preschool-aged children. Seventy subjects from Southeastern Pennsylvania; the San Francisco Bay Area, California; and Terre Haute, Indiana, participated in the study. Speech frequency was measured by having the subjects participate in spontaneous and guided speech activities with the researcher, with 18 diverse samples extracted from each subject's recording for acoustical analysis for fundamental frequency in Hz with the CSpeech computer program. The fundamental frequencies were averaged together to derive a mean speech frequency score for each subject. Speech range was calculated by subtracting the lowest fundamental frequency produced from the highest fundamental frequency produced, resulting in a speech range measured in increments of Hz. Singing accuracy was measured by having the subjects each echo-sing six randomized patterns using the pitches Middle C, D, E, F♯, G and A (440), using the solfege syllables of Do and Re, which were recorded by a 5-year-old female model. For each subject, 18 samples of singing were recorded. All samples were analyzed by the CSpeech for fundamental frequency. For each subject, deviation scores in Hz were derived by calculating the difference between what the model sang in Hz and what the subject sang in response in Hz. Individual scores for each child consisted of an overall mean total deviation frequency, mean frequency deviations for each pattern, and mean frequency deviation for each pitch. Pearson correlations, MANOVA and ANOVA analyses, Multiple Regressions and Discriminant Analysis revealed the following findings: (1) moderate but significant (p < .001) relationships emerged between mean speech frequency and the ability to sing the pitches E, F♯, G and A in the study; (2) mean speech frequency also emerged as the strongest
Understanding the Abstract Role of Speech in Communication at 12 Months

ERIC Educational Resources Information Center

Martin, Alia; Onishi, Kristine H.; Vouloumanos, Athena

2012-01-01

Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we test the…
Hierarchical organization in the temporal structure of infant-direct speech and song.

PubMed

Falk, Simone; Kello, Christopher T

2017-06-01

Caregivers alter the temporal structure of their utterances when talking and singing to infants compared with adult communication. The present study tested whether temporal variability in infant-directed registers serves to emphasize the hierarchical temporal structure of speech. Fifteen German-speaking mothers sang a play song and told a story to their 6-months-old infants, or to an adult. Recordings were analyzed using a recently developed method that determines the degree of nested clustering of temporal events in speech. Events were defined as peaks in the amplitude envelope, and clusters of various sizes related to periods of acoustic speech energy at varying timescales. Infant-directed speech and song clearly showed greater event clustering compared with adult-directed registers, at multiple timescales of hundreds of milliseconds to tens of seconds. We discuss the relation of this newly discovered acoustic property to temporal variability in linguistic units and its potential implications for parent-infant communication and infants learning the hierarchical structures of speech and language. Copyright © 2017 Elsevier B.V. All rights reserved.
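The abstract does not name the clustering measure it uses. As an assumed stand-in, the sketch below computes the Allan factor, a standard statistic for how strongly events cluster at a given timescale. It compares event counts in adjacent windows of length `window` and is near 1 for a Poisson process, 0 for perfectly regular events, and larger for bursty events. All names and the example event times are hypothetical.

```python
def allan_factor(event_times, window, duration):
    """Allan factor of a point process at one timescale.

    event_times: event onsets in seconds (e.g., amplitude-envelope peaks)
    window:      window length in seconds (the timescale being probed)
    duration:    total recording length in seconds
    """
    n_windows = int(duration // window)
    if n_windows < 2:
        raise ValueError("need at least two windows")
    counts = [0] * n_windows
    for t in event_times:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the analysed windows")
    # Mean squared difference between counts in adjacent windows,
    # normalized by twice the mean count.
    sq_diffs = [(counts[i + 1] - counts[i]) ** 2 for i in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)


if __name__ == "__main__":
    # Hypothetical event trains, 10 s long.
    regular = [0.05 + 0.1 * i for i in range(100)]         # evenly spaced
    bursty = [s + 0.05 + 0.05 * j                          # bursts every 2 s
              for s in range(0, 10, 2) for j in range(10)]
    for w in (1.0, 2.0):
        print(w, allan_factor(regular, w, 10.0), allan_factor(bursty, w, 10.0))
```

Repeating the calculation over a range of window lengths gives one clustering value per timescale, which is the kind of profile the abstract describes for timescales from hundreds of milliseconds to tens of seconds.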
Speech intelligibility and speech quality of modified loudspeaker announcements examined in a simulated aircraft cabin.

PubMed

Pennig, Sibylle; Quehl, Julia; Wittkowski, Martin

2014-01-01

Acoustic modifications of loudspeaker announcements were investigated in a simulated aircraft cabin to improve passengers' speech intelligibility and quality of communication in this specific setting. Four experiments with 278 participants in total were conducted in an acoustic laboratory using a standardised speech test and subjective rating scales. In experiments 1 and 2 the sound pressure level (SPL) of the announcements was varied (ranging from 70 to 85 dB(A)). Experiments 3 and 4 focused on frequency modification (octave bands) of the announcements. All studies used a background noise with the same SPL (74 dB(A)), but recorded at different seat positions in the aircraft cabin (front, rear). The results quantify speech intelligibility improvements with increasing signal-to-noise ratio and amplification of particular octave bands, especially the 2 kHz and the 4 kHz band. Thus, loudspeaker power in an aircraft cabin can be reduced by using appropriate filter settings in the loudspeaker system.
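The signal-to-noise ratios implied by experiments 1 and 2 follow directly from the levels quoted above. The sketch below assumes the SNR is taken as the simple difference between the A-weighted announcement level and the 74 dB(A) background level. The abstract does not state the step size between 70 and 85 dB(A), so only the two endpoints are evaluated, and the function name is illustrative.

```python
NOISE_DBA = 74.0  # background noise level quoted in the abstract


def snr_db(speech_dba, noise_dba=NOISE_DBA):
    """Signal-to-noise ratio as the difference of two A-weighted levels."""
    return speech_dba - noise_dba


if __name__ == "__main__":
    for level in (70.0, 85.0):  # endpoints of the announcement level range
        print(level, "dB(A) announcement ->", snr_db(level), "dB SNR")
```

On this reading, the lowest announcement level sits 4 dB below the cabin noise and the highest 11 dB above it.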
Combined electric and acoustic hearing performance with Zebra® speech processor: speech reception, place, and temporal coding evaluation.

PubMed

Vaerenberg, Bart; Péan, Vincent; Lesbros, Guillaume; De Ceulaer, Geert; Schauwers, Karen; Daemers, Kristin; Gnansia, Dan; Govaerts, Paul J

2013-06-01

To assess the auditory performance of Digisonic® cochlear implant users with electric stimulation (ES) and electro-acoustic stimulation (EAS) with special attention to the processing of low-frequency temporal fine structure. Six patients implanted with a Digisonic® SP implant and showing low-frequency residual hearing were fitted with the Zebra® speech processor providing both electric and acoustic stimulation. Assessment consisted of monosyllabic speech identification tests in quiet and in noise at different presentation levels, and a pitch discrimination task using harmonic and disharmonic intonating complex sounds (Vaerenberg et al., 2011). These tests investigate place and time coding through pitch discrimination. All tasks were performed with ES only and with EAS. Speech results in noise showed significant improvement with EAS when compared to ES. Whereas EAS did not yield better results in the harmonic intonation test, the improvements in the disharmonic intonation test were remarkable, suggesting better coding of pitch cues requiring phase locking. These results suggest that patients with residual hearing in the low-frequency range still have good phase-locking capacities, allowing them to process fine temporal information. ES relies mainly on place coding but provides poor low-frequency temporal coding, whereas EAS also provides temporal coding in the low-frequency range. Patients with residual phase-locking capacities can make use of these cues.
The Use of Artificial Neural Networks to Estimate Speech Intelligibility from Acoustic Variables: A Preliminary Analysis.

ERIC Educational Resources Information Center

Metz, Dale Evan; And Others

1992-01-01

A preliminary scheme for estimating the speech intelligibility of hearing-impaired speakers from acoustic parameters, using a computerized artificial neural network to process mathematically the acoustic input variables, is outlined. Tests with 60 hearing-impaired speakers found the scheme to be highly accurate in identifying speakers separated by…
General Systems Theory: Application To The Design Of Speech Communication Courses

ERIC Educational Resources Information Center

Tucker, Raymond K.

1971-01-01

General systems theory can be applied to problems in the teaching of speech communication courses. The author describes general systems theory as it is applied to the designing, conducting and evaluation of speech communication courses. (Author/MS)
Spatial acoustic signal processing for immersive communication

NASA Astrophysics Data System (ADS)

Atkins, Joshua

Computing is rapidly becoming ubiquitous as users expect devices that can augment and interact naturally with the world around them. In these systems it is necessary to have an acoustic front-end that is able to capture and reproduce natural human communication. Whether the end point is a speech recognizer or another human listener, the reduction of noise, reverberation, and acoustic echoes are all necessary and complex challenges. The focus of this dissertation is to provide a general method for approaching these problems using spherical microphone and loudspeaker arrays. In this work, a theory of capturing and reproducing three-dimensional acoustic fields is introduced from a signal processing perspective. In particular, the decomposition of the spatial part of the acoustic field into an orthogonal basis of spherical harmonics provides not only a general framework for analysis, but also many processing advantages. The spatial sampling error limits the upper frequency range with which a sound field can be accurately captured or reproduced. In broadband arrays, the cost and complexity of using multiple transducers is an issue. This work provides a flexible optimization method for determining the location of array elements to minimize the spatial aliasing error. The low frequency array processing ability is also limited by the SNR, mismatch, and placement error of transducers. To address this, a robust processing method is introduced and used to design a reproduction system for rendering over arbitrary loudspeaker arrays or binaurally over headphones. In addition to the beamforming problem, the multichannel acoustic echo cancellation (MCAEC) issue is also addressed. An MCAEC must adaptively estimate and track the constantly changing loudspeaker-room-microphone response to remove the sound field presented over the loudspeakers from that captured by the microphones. In the multichannel case, the system is overdetermined and many adaptive schemes fail to converge to
The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.

PubMed

De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

2016-10-01

This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were addressed to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated by an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.
Political Science and Speech Communication--A Team Approach to Teaching Political Communication.

ERIC Educational Resources Information Center

Blatt, Stephen J.; Fogel, Norman

This paper proposes making speech communication more interdisciplinary and, in particular, combining political science and speech in a team-taught course in election campaigning. The goals, materials, activities, and plan of such a course are discussed. The goals include: (1) gaining new insights into the process of contemporary campaigns and…
Study of environmental sound source identification based on hidden Markov model for robust speech recognition

NASA Astrophysics Data System (ADS)

Nishiura, Takanobu; Nakamura, Satoshi

2003-10-01

Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense the acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three states of HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional HMM composition with speech HMMs and a noise (environmental sound) HMM trained using noise periods prior to the target speech in a captured signal. [Work supported by Ministry of Public Management, Home Affairs, Posts and Telecommunications of Japan.]
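The identification step described above amounts to scoring an observation sequence against one HMM per sound class and choosing the class with the highest likelihood. The following is a minimal sketch of that idea, not the authors' system. It assumes discrete observation symbols (0 = low-energy frame, 1 = high-energy frame), whereas a real recognizer would typically use continuous spectral features. The two class names and all probabilities are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete symbols."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_like


# Three-state left-to-right topology shared by both HYPOTHETICAL models.
START = [1.0, 0.0, 0.0]
TRANS = [[0.6, 0.4, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]]

# HYPOTHETICAL classes: emit[state][symbol], symbols are 0 (quiet) and 1 (loud).
MODELS = {
    "impulsive": [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1]],  # quiet, loud, quiet
    "gapped": [[0.2, 0.8], [0.8, 0.2], [0.2, 0.8]],     # loud, quiet, loud
}


def identify(obs):
    """Return the class whose HMM gives the observation the highest likelihood."""
    scores = {
        name: forward_log_likelihood(obs, START, TRANS, emit)
        for name, emit in MODELS.items()
    }
    return max(scores, key=scores.get), scores


best, scores = identify([0, 0, 1, 1, 1, 0, 0])
print(best, scores)
```

The quiet-loud-quiet test sequence matches the emission pattern of the "impulsive" model, so that class receives the higher score.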
Acoustic communications for cabled seafloor observatories

NASA Astrophysics Data System (ADS)

Freitag, L.; Stojanovic, M.

2003-04-01

Cabled seafloor observatories will provide scientists with a continuous presence in both deep and shallow water. In the deep ocean, connecting sensors to seafloor nodes for power and data transfer will require cables and a highly-capable ROV, both of which are potentially expensive. For many applications where very high bandwidth is not required, and where a sensor is already designed to operate on battery power, the use of acoustic links should be considered. Acoustic links are particularly useful for large numbers of low-bandwidth sensors scattered over tens of square kilometers. Sensors used to monitor the chemistry and biology of vent fields are one example. Another important use for acoustic communication is monitoring of AUVs performing pre-programmed or adaptive sampling missions. A high data rate acoustic link with an AUV allows the observer on shore to direct the vehicle in real-time, providing for dynamic event response. Thus both fixed and mobile sensors motivate the development of observatory infrastructure that provides power-efficient, high bandwidth acoustic communication. A proposed system design that can provide the wireless infrastructure, and further examples of its use in networks such as NEPTUNE, are presented.
Augmentative and Alternative Communication in Autism: A Comparison of the Picture Exchange Communication System and Speech-Output Technology

ERIC Educational Resources Information Center

Boesch, Miriam Chacon

2011-01-01

The purpose of this comparative efficacy study was to investigate the Picture Exchange Communication System (PECS) and a speech-generating device (SGD) in developing requesting skills, social-communicative behavior, and speech for three elementary-age children with severe autism and little to no functional speech. Requesting was selected as the…
Acoustic Changes in the Speech of Children with Cerebral Palsy Following an Intensive Program of Dysarthria Therapy

ERIC Educational Resources Information Center

Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

2018-01-01

Background: The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. Aims: To determine if speech intelligibility change following intervention is associated with change in acoustic measures of voice. Methods & Procedures: We recorded 16…
A comparison of recordings of sentences and spontaneous speech: perceptual and acoustic measures in preschool children's voices.

PubMed

McAllister, Anita; Brandt, Signe Kofoed

2012-09-01

A well-controlled recording in a studio is fundamental in most voice rehabilitation. However, this laboratory-like recording method has been questioned because voice use in a natural environment may be quite different. In children's natural environment, high background noise levels are common and are an important factor contributing to voice problems. The primary noise source in day-care centers is the children themselves. The aim of the present study was to compare perceptual evaluations of voice quality and acoustic measures from a controlled recording with recordings of spontaneous speech in children's natural environment in a day-care setting. Eleven 5-year-old children were recorded three times during a day at the day care. The controlled speech material consisted of repeated sentences. Matching sentences were selected from the spontaneous speech. All sentences were repeated three times. Recordings were randomized and analyzed acoustically and perceptually. Statistical analyses showed that fundamental frequency was significantly higher in spontaneous speech (P<0.01) as was hyperfunction (P<0.001). The only characteristic the controlled sentences shared with spontaneous speech was degree of hoarseness (Spearman's rho=0.564). When data for boys and girls were analyzed separately, a correlation was found for the parameter breathiness (rho=0.551) for boys, and for girls the correlation for hoarseness remained (rho=0.752). Regarding acoustic data, none of the measures correlated across recording conditions for the whole group. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Predicting speech intelligibility with a multiple speech subsystems approach in children with cerebral palsy.

PubMed

Lee, Jimin; Hustad, Katherine C; Weismer, Gary

2014-10-01

Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (speech motor impairment [SMI] group) and 9 judged to be free of dysarthria (no SMI [NSMI] group). Data from children with CP were compared to data from age-matched typically developing children. Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and typically developing groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (adjusted R² = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R² analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Children in the SMI group had articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
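The adjusted coefficient of determination reported above penalizes the ordinary coefficient for the number of predictors. The sketch below applies the standard adjustment formula. It assumes that the model was fitted on the 22 children with CP using all nine acoustic variables, which the abstract suggests but does not state outright, so the implied unadjusted value is illustrative only.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def r2_from_adjusted(adj_r2: float, n: int, p: int) -> float:
    """Invert the adjustment to recover the unadjusted R^2."""
    return 1.0 - (1.0 - adj_r2) * (n - p - 1) / (n - 1)


# ASSUMPTION: n = 22 children and p = 9 predictors, as read from the abstract.
N_CHILDREN = 22
N_PREDICTORS = 9
REPORTED_ADJUSTED_R2 = 0.801

implied_r2 = r2_from_adjusted(REPORTED_ADJUSTED_R2, N_CHILDREN, N_PREDICTORS)
print(f"implied unadjusted R^2 = {implied_r2:.3f}")
```

With these assumed values the unadjusted figure comes out near .886, which shows how much the adjustment matters when nine predictors are fitted to 22 observations.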

Acoustic Communications and Navigation for Mobile Under-Ice Sensors

DTIC Science & Technology

2017-02-04

Final Report, 04/02/2017. Title: Acoustic Communications and Navigation for Mobile Under-Ice Sensors...development and fielding of a new acoustic communications and navigation system for use on autonomous platforms (gliders and profiling floats) under the...contact below the ice. Subject terms: Arctic Ocean, Undersea Workstations & Vehicles, Signal Processing, Navigation, Underwater Acoustics
Common cues to emotion in the dynamic facial expressions of speech and song.

PubMed

Livingstone, Steven R; Thompson, William F; Wanderley, Marcelo M; Palmer, Caroline

2015-01-01

Speech and song are universal forms of vocalization that may share aspects of emotional expression. Research has focused on parallels in acoustic features, overlooking facial cues to emotion. In three experiments, we compared moving facial expressions in speech and song. In Experiment 1, vocalists spoke and sang statements each with five emotions. Vocalists exhibited emotion-dependent movements of the eyebrows and lip corners that transcended speech-song differences. Vocalists' jaw movements were coupled to their acoustic intensity, exhibiting differences across emotion and speech-song. Vocalists' emotional movements extended beyond vocal sound to include large sustained expressions, suggesting a communicative function. In Experiment 2, viewers judged silent videos of vocalists' facial expressions prior to, during, and following vocalization. Emotional intentions were identified accurately for movements during and after vocalization, suggesting that these movements support the acoustic message. Experiment 3 compared emotional identification in voice-only, face-only, and face-and-voice recordings. Emotion judgements for voice-only singing were poorly identified, yet were accurate for all other conditions, confirming that facial expressions conveyed emotion more accurately than the voice in song, yet were equivalent in speech. Collectively, these findings highlight broad commonalities in the facial cues to emotion in speech and song, yet highlight differences in perception and acoustic-motor production.
Acoustic MIMO communications in a very shallow water channel

NASA Astrophysics Data System (ADS)

Zhou, Yuehai; Cao, Xiuling; Tong, Feng

2015-12-01

Underwater acoustic channels pose significant difficulty for the development of high speed communication due to highly limited bandwidth as well as hostile multipath interference. Inspired by the rapid progress of multiple input multiple output (MIMO) technologies in wireless communication scenarios, MIMO systems offer a potential solution by enabling multiple spatially parallel communication channels to improve communication performance as well as capacity. For MIMO acoustic communications, deep sea channels offer substantial spatial diversity among multiple channels that can be exploited to address simultaneous multipath and co-channel interference. At the same time, there are increasing requirements for high speed underwater communication in very shallow water areas (for example, a depth less than 10 m). In this paper, a space-time multichannel adaptive receiver consisting of multiple decision feedback equalizers (DFE) is adopted as the receiver for a very shallow water MIMO acoustic communication system. The performance of multichannel DFE receivers with a relatively small number of receiving elements is analyzed and compared with that of the multichannel time reversal receiver to evaluate the impact of limited spatial diversity on multi-channel equalization and time reversal processing. The results of sea trials in a very shallow water channel are presented to demonstrate the feasibility of very shallow water MIMO acoustic communication.
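The core idea of a decision feedback equalizer is to subtract the interference caused by symbols that have already been decided. The following is a minimal sketch of that idea only. It assumes a single receive channel, BPSK symbols, a noise-free two-tap multipath channel and LMS adaptation, none of which is taken from the paper; the multichannel space-time receiver described above is considerably more elaborate.

```python
import random


def simulate_dfe(n_train=500, n_data=1000, mu=0.05, seed=0):
    """Train a 1-tap feedforward / 1-tap feedback DFE with LMS, then detect data."""
    rng = random.Random(seed)
    symbols = [rng.choice((-1.0, 1.0)) for _ in range(n_train + n_data)]

    # ASSUMED channel: direct path plus a stronger delayed echo (no noise).
    h0, h1 = 1.0, 1.2
    received = [
        h0 * symbols[n] + (h1 * symbols[n - 1] if n > 0 else 0.0)
        for n in range(len(symbols))
    ]

    w_ff, w_fb = 0.0, 0.0      # feedforward and feedback taps
    prev_decision = 0.0        # last decided (or known training) symbol
    dfe_errors = 0
    baseline_errors = 0        # errors of a plain sign detector without equalizer

    for n, r in enumerate(received):
        y = w_ff * r - w_fb * prev_decision        # cancel the echo of the last symbol
        hard = 1.0 if y >= 0 else -1.0
        training = n < n_train
        reference = symbols[n] if training else hard  # decision-directed after training
        err = reference - y
        w_ff += mu * err * r                       # LMS updates
        w_fb -= mu * err * prev_decision
        if not training:
            dfe_errors += int(hard != symbols[n])
            baseline_errors += int((1.0 if r >= 0 else -1.0) != symbols[n])
        prev_decision = reference

    return w_ff, w_fb, dfe_errors, baseline_errors


w_ff, w_fb, dfe_errors, baseline_errors = simulate_dfe()
print(f"w_ff={w_ff:.3f} w_fb={w_fb:.3f} "
      f"DFE errors={dfe_errors} unequalized errors={baseline_errors}")
```

Because the echo in this toy channel is stronger than the direct path, a plain sign detector fails whenever consecutive symbols differ, while the feedback tap converges toward the echo gain and removes that interference.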
[Speech perception with electric-acoustic stimulation : Comparison with bilateral cochlear implant users in different noise conditions].

PubMed

Rader, T

2015-02-01

Cochlear implantation with the aim of hearing preservation for combined electric-acoustic stimulation (EAS) is the therapy of choice for patients with residual low-frequency hearing. Preserved residual acoustic hearing has a positive effect on speech intelligibility in difficult noise conditions. The goal of this study was to assess speech reception thresholds in various complex noise conditions for patients with EAS in comparison with patients using bilateral cochlear implants (CI). Speech perception in noise was measured for bilateral CI and EAS patient groups. A total of 22 listeners with normal hearing served as a control group. Speech reception thresholds (SRT) were measured using a closed-set sentence matrix test. Speech was presented with a single source in frontal position; noise was presented in frontal position or in a multisource noise field (MSNF) consisting of a four-loudspeaker array with independent noise sources. Modulated speech-simulating noise and pseudocontinuous noise served respectively as interference signal with different temporal characteristics. The average SRTs in the EAS group were significantly better in all test conditions than those of the group with bilateral CI. Both user groups showed significant improvement in the MSNF condition compared with the frontal noise condition as a result of bilateral interaction. The normal-hearing control group was able to use short temporal gaps in modulated noise to improve speech perception in noise (gap listening). This effect was absent in both implanted user groups. Patients with combined EAS in one ear and a hearing aid in the contralateral ear show significantly improved speech perception in complex noise conditions compared with bilateral CI recipients.
Measures to Evaluate the Effects of DBS on Speech Production

PubMed Central

Weismer, Gary; Yunusova, Yana; Bunton, Kate

2011-01-01

The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066
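As a rough illustration of the F2 slope measure, the sketch below fits a straight line to a second-formant track over a transition and reports its slope in Hz per millisecond. The least-squares definition is an assumption, since operational definitions of F2 slope vary between studies, and the formant values are hypothetical rather than data from the paper.

```python
def least_squares_slope(times_ms, f2_hz):
    """Slope of the best-fit line through (time, F2) points, in Hz per ms."""
    n = len(times_ms)
    if n < 2 or n != len(f2_hz):
        raise ValueError("need at least two matching time/frequency samples")
    mean_t = sum(times_ms) / n
    mean_f = sum(f2_hz) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_ms, f2_hz))
    den = sum((t - mean_t) ** 2 for t in times_ms)
    return num / den


# HYPOTHETICAL formant track sampled every 10 ms across one transition.
times_ms = [0.0, 10.0, 20.0, 30.0, 40.0]
f2_hz = [1200.0, 1350.0, 1500.0, 1650.0, 1800.0]

f2_slope = least_squares_slope(times_ms, f2_hz)
print(f"F2 slope = {f2_slope:.1f} Hz/ms")
```

A shallower slope on the same transition would indicate slower or reduced articulatory movement, which is the kind of inference the abstract attributes to this measure.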
Predicting Speech Intelligibility with A Multiple Speech Subsystems Approach in Children with Cerebral Palsy

PubMed Central

Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

2014-01-01

Purpose Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD). Results Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Conclusions Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems. PMID:24824584
Proceedings of the Speech Communication Association Summer Conference: Mini Courses in Speech Communication (7th, Chicago, July 8-10, 1971).

ERIC Educational Resources Information Center

Jeffrey, Robert C., Ed.

The Speech Communication Association's 1971 summer conference provided instruction in the application of basic research and innovative practices in communication. It was designed to assist elementary, secondary, and college teachers in the enrichment of content and procedures. The proceedings include syllabi, course units, and bibliographic…
Radio speech communication problems reported in a survey of military pilots.

PubMed

Lahtinen, Taija M M; Huttunen, Kerttu H; Kuronen, Pentti O; Sorri, Martti J; Leino, Tuomo K

2010-12-01

Despite technological advances in conveying information, speech communication is still a key safety factor in aviation. Effective radio communication is necessary, for example, in building and maintaining good team situation awareness. However, little has been reported concerning the prevalence and nature of radio communication problems in everyday working environments in military aviation. We surveyed Finnish Defense Forces pilots regarding the prevalence of radio speech communication problems. Of the 225 pilots contacted, 75% replied to our survey. Altogether 138 of the respondents were fixed-wing pilots and 31 were helicopter pilots. Problems in radio communication occurred, on average, during 14% of flight time. The most prevalent problems were multiple speakers on the same radio frequency band causing overlapping speech, missing acknowledgments, high background noise especially during helicopter operations, and technical problems. Of the respondents, 18% (31 pilots) reported having encountered at least one potentially dangerous event caused by problems in radio communication during their military aviation career. If the employer were to offer extra hearing protection, such as custom-made ear plugs, 93% of the pilots indicated that they would use it. Communication can be a flight safety factor especially during intense air combat exercises and other information-loaded flights. During these situations, communication should be clear and focused on the most essential information. So, training and technical improvements are necessary for better communication. High quality radio speech communication also improves operational effectiveness in military aviation.
Re-Evaluation and the Core Curriculum: How Will Speech Communication Fare?

ERIC Educational Resources Information Center

Madson, Lynda P.; Myers, Russel M.

This paper discusses the effects of reestablished or redefined core curriculum requirements on college speech communication programs, based on the survey responses of speech communication faculty members at 104 four-year schools. The following conclusions and recommendations are presented as a result of the survey: Although many employers consider…
Segregation of Whispered Speech Interleaved with Noise or Speech Maskers

DTIC Science & Technology

2011-08-01

range over which the talker can be heard. Whispered speech is produced by modulating the flow of air through partially open vocal folds. Because the...source of excitation is turbulent air flow, the acoustic characteristics of whispered speech differ from those of voiced speech [1, 2]. Despite the acoustic...signals provided by cochlear implants. Two studies investigated the segregation of simultaneously presented whispered vowels [7, 8] in a standard
An acoustic feature-based similarity scoring system for speech rehabilitation assistance.

PubMed

Syauqy, Dahnial; Wu, Chao-Min; Setyawati, Onny

2016-08-01

The purpose of this study is to develop a tool to assist speech therapy and rehabilitation, focusing on automatic scoring based on comparison of the patient's speech with normal speech on several aspects including pitch, vowel, voiced-unvoiced segments, strident fricative and sound intensity. The pitch estimation used a cepstrum-based algorithm for its robustness; the vowel classification used a multilayer perceptron (MLP) to classify vowels from pitch and formants; and the strident fricative detection was based on the major peak spectral intensity, its location and the existence of pitch in the segment. To evaluate the performance of the system, this study analyzed eight patients' speech recordings (four males, four females; 4-58 years old), which had been recorded in a previous study in cooperation with Taipei Veterans General Hospital and Taoyuan General Hospital. The experimental results for the pitch algorithm showed that the cepstrum method had a gross pitch error of 5.3% over a total of 2086 frames. For the vowel classification algorithm, the MLP method provided 93% accuracy (men), 87% (women) and 84% (children). Overall, 156 of the tool's grading results (81%) were consistent with the 192 audio and visual observations made by four experienced respondents. Implications for Rehabilitation: Difficulties in communication may limit the ability of a person to transfer and exchange information. The fact that speech is one of the primary means of communication has driven the need for speech diagnosis and rehabilitation. Advances in computer-assisted speech therapy (CAST) technology improve the quality and time efficiency of the diagnosis and treatment of the disorders. The present study attempted to develop a tool to assist speech therapy and rehabilitation, which provided a simple interface to let the assessment be done even by the patient himself without the need of particular knowledge of speech processing while at the
Acoustic Quality Levels of Mosques in Batu Pahat

NASA Astrophysics Data System (ADS)

Azizah Adnan, Nor; Nafida Raja Shahminan, Raja; Khair Ibrahim, Fawazul; Tami, Hannifah; Yusuff, M. Rizal M.; Murniwaty Samsudin, Emedya; Ismail, Isham

2018-04-01

Every Friday, Muslims are required to perform a special prayer known as the Friday prayer, which involves the delivery of a brief lecture (Khutbah). The intelligibility of the preacher's speech affects the whole congregation and determines the level of acoustic quality in the interior of the mosque. This study therefore assessed the acoustic quality of three public mosques in Batu Pahat. Good acoustic quality is essential for appreciation of the prayers and for increasing khusyu’ during worship, and it is closely tied to speech intelligibility, in keeping with the actual function of the mosque in Islam. The acoustic parameters measured included noise criteria (NC), reverberation time (RT) and speech transmission index (STI), obtained using a sound level meter and sound measurement instruments. The tests were accompanied by physical observation, with space and volume design considered as factors affecting the acoustic parameters. Results from all three mosques showed that the acoustic quality inside these buildings is slightly poor, falling below a coefficient of 0.45 based on the standard. Factors influencing the low acoustic quality include location, building materials, installation of sound absorption material and the number of occupants inside the mosque. In conclusion, the acoustic quality of a mosque depends strongly on physical factors such as architectural design and space volume, besides the other factors identified by this study.
Speech Recognition Using Multiple Features and Multiple Recognizers

DTIC Science & Technology

1991-12-03

2.1 Introduction; 2.2 Human Speech Communication Process...How to Setup ASRT; How to Use Interactive Menus...recognize a word from an acoustic signal. The human ear and brain perform this type of recognition with incredible speed and precision. Even though
Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson's disease.

PubMed

Rusz, J; Cmejla, R; Ruzickova, H; Ruzicka, E

2011-01-01

An assessment of vocal impairment is presented for separating healthy people from persons with early untreated Parkinson's disease (PD). This study's main purpose was to (a) determine whether voice and speech disorders are present from early stages of PD before starting dopaminergic pharmacotherapy, (b) ascertain the specific characteristics of the PD-related vocal impairment, (c) identify PD-related acoustic signatures for the major part of traditional clinically used measurement methods with respect to their automatic assessment, and (d) design new automatic measurement methods of articulation. The varied speech data were collected from 46 Czech native speakers, 23 with PD. Subsequently, 19 representative measurements were pre-selected, and Wald sequential analysis was then applied to assess the efficiency of each measure and the extent of vocal impairment of each subject. It was found that measurement of the fundamental frequency variations applied to two selected tasks was the best method for separating healthy from PD subjects. On the basis of objective acoustic measures, statistical decision-making theory, and validation from practicing speech therapists, it has been demonstrated that 78% of early untreated PD subjects exhibit some form of vocal impairment. The speech defects thus uncovered differ individually in various characteristics including phonation, articulation, and prosody.
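Wald's sequential analysis accumulates evidence one measurement at a time and stops as soon as the log-likelihood ratio crosses a decision threshold. The sketch below shows the general procedure under a Gaussian model with known means and a shared standard deviation. The abstract does not say how the test was configured in this study, so the model, the error rates and every numeric value here are assumptions chosen for illustration.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test for a Gaussian mean.

    H0: mean = mu0 (e.g. healthy), H1: mean = mu1 (e.g. impaired).
    Returns (decision, number of samples used).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    llr = 0.0
    for count, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
        if llr >= upper:
            return "H1", count
        if llr <= lower:
            return "H0", count
    return "undecided", len(samples)


# HYPOTHETICAL normalized measurements for two subjects.
impaired_like = [1.2, 0.9, 1.1, 1.4, 1.0, 1.3, 0.8, 1.5]
healthy_like = [-0.2, 0.1, -0.4, 0.0, -0.3, -0.1, 0.2, -0.5]

print(sprt_gaussian(impaired_like, mu0=0.0, mu1=1.0, sigma=1.0))
print(sprt_gaussian(healthy_like, mu0=0.0, mu1=1.0, sigma=1.0))
```

The appeal of the sequential form is that clear-cut cases are decided after only a few measurements, while ambiguous ones remain undecided until more evidence arrives.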
From "The Speech Teacher" to "Communication Education": Some Reflections.

ERIC Educational Resources Information Center

Brown, Kenneth L.

2002-01-01

Describes personal reflections on "Communication Education" between 1976 and 1978 under the author's editorship. Discusses reflections under three headings: events and experiences shaping the editor's views of the field of communication education; the transition from "The Speech Teacher" to "Communication Education,"…
Collaboration between Teachers and Speech and Language Therapists: Services for Primary School Children with Speech, Language and Communication Needs

ERIC Educational Resources Information Center

Glover, Anna; McCormack, Jane; Smith-Tamaray, Michelle

2015-01-01

Speech, language and communication needs (SLCN) are prevalent among primary school-aged children. Collaboration between speech and language therapists (SLTs) and teachers is beneficial for supporting children's communication skills. The aim of this study was to investigate the needs of both professional groups and their preferences for service…
Mobile communication jacket for people with severe speech impairment.

PubMed

Lampe, Renée; Blumenstein, Tobias; Turova, Varvara; Alves-Pinto, Ana

2018-04-01

Cerebral palsy is a movement disorder caused by damage to motor control areas of the developing brain during early childhood. Motor disorders can also affect the ability to produce clear speech and to communicate. The aim of this study was to develop and to test a prototype of an assistive tool with an embedded mobile communication device to support patients with severe speech impairments. A prototype was developed by equipping a cycling jacket with a display, a small keyboard, an LED and an alarm system, all controlled by a microcontroller. Functionality of the prototype was tested in six participants (aged 7-20 years) with cerebral palsy and global developmental disorder and three healthy persons. A patient questionnaire consisting of seven items was used as an evaluation tool. A working prototype of the communication jacket was developed and tested. The questionnaire elicited positive responses from participants. Improvements to correct the weaknesses revealed were proposed. Enhancements like voice output of pre-selected phrases and an enlarged display were implemented. Integration in a jacket makes the system mobile and continuously available to the user. The communication jacket may be of great benefit to patients with motor and speech impairments. Implications for Rehabilitation: The communication jacket developed can be easily used by people with movement and speech impairment. All technical components are integrated in a garment and do not have to be held with the hands or transported separately. The system is adaptable to individual use. Both expected and unexpected events can be dealt with, which contributes to the quality of life and self-fulfilment.
Speech, communication and use of augmentative communication in young people with cerebral palsy: the SH&PE population study.

PubMed

Cockerill, H; Elbourne, D; Allen, E; Scrutton, D; Will, E; McNee, A; Fairhurst, C; Baird, G

2014-03-01

Communication is frequently impaired in young people (YP) with bilateral cerebral palsy (CP). Important factors include motoric speech problems (dysarthria) and intellectual disability. Augmentative and Alternative Communication (AAC) techniques are often employed. The aim was to describe the speech problems in bilateral CP, factors associated with speech problems, current AAC provision and use, and to explore the views of both the parent/carer and young person about communication. A total population of children with bilateral CP (n = 346) from four consecutive years of births (1989-1992 inclusive) with onset of CP before 15 months were reassessed at age 16-18 years. Motor skills and speech were directly assessed and both parent/carer and the young person asked about communication and satisfaction with it. Sixty had died, eight had other conditions, 243 consented and speech was assessed in 224 of whom 141 (63%) had impaired speech. Fifty-two (23% of total YP) were mainly intelligible to unfamiliar people, 22 (10%) were mostly unintelligible to unfamiliar people, 67 (30%) were mostly or wholly unintelligible even to familiar adults. However, 89% of parent/carers said that they could communicate 1:1 with their young person. Of the 128 YP who could independently complete the questions, 107 (83.6%) were happy with their communication, nine (7%) neither happy nor unhappy and 12 (9.4%) unhappy. A total of 72 of 224 (32%) were provided with one or more types of AAC but in a significant number (75% of 52 recorded) AAC was not used at home, only in school. Factors associated with speech impairment were severity of physical impairment, as measured by Gross Motor Function Scale level and manipulation in the best hand, intellectual disability and current epilepsy. In a population representative group of YP, aged 16-18 years, with bilateral CP, 63% had impaired speech of varying severity, most had been provided with AAC but few used it at home for communication.
Automatic speech recognition technology development at ITT Defense Communications Division

NASA Technical Reports Server (NTRS)

White, George M.

1977-01-01

An assessment of the applications of automatic speech recognition to defense communication systems is presented. Future research efforts include investigations into the following areas: (1) dynamic programming; (2) recognition of speech degraded by noise; (3) speaker independent recognition; (4) large vocabulary recognition; (5) word spotting and continuous speech recognition; and (6) isolated word recognition.
Communication in Pipes Using Acoustic Modems that Provide Minimal Obstruction to Fluid Flow

NASA Technical Reports Server (NTRS)

Bar-Cohen, Yoseph (Inventor); Bao, Xiaoqi (Inventor); Sherrit, Stewart (Inventor); Archer, Eric D. (Inventor)

2016-01-01

A plurality of phased array acoustic communication devices are used to communicate data along a tubulation, such as a well. The phased array acoustic communication devices employ phased arrays of acoustic transducers, such as piezoelectric transducers, to direct acoustic energy in desired directions along the tubulation. The system is controlled by a computer-based controller. Information, including data and commands, is communicated using digital signaling.
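The way a phased array directs acoustic energy can be sketched with the standard delay rule for a uniform linear array, where element n fires n·d·sin(θ)/c later than element 0. This is only an illustration under stated assumptions: the record does not give the array geometry, so the uniform spacing, the sound speed, and the function name below are hypothetical.

```python
import math

def steering_delays(num_elements, spacing_m, angle_deg, sound_speed_m_s):
    """Per-element firing delays (seconds) that steer a uniform linear array.

    angle_deg is measured from broadside; 90 degrees is end-fire, i.e. along
    the array axis. Geometry and sound speed are illustrative assumptions.
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / sound_speed_m_s
    delays = [n * step for n in range(num_elements)]
    # Shift so the earliest element fires at t = 0 (handles negative angles).
    earliest = min(delays)
    return [d - earliest for d in delays]

# Hypothetical example: 4 transducers, 15 mm apart, steered along the axis
# in a medium with a sound speed of 1500 m/s.
delays = steering_delays(4, 0.015, 90.0, 1500.0)
```

Reversing the sign of the angle reverses the firing order, which sends the beam in the opposite direction along the axis.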

Suprasegmental Characteristics of Spontaneous Speech Produced in Good and Challenging Communicative Conditions by Talkers Aged 9-14 Years.

PubMed

Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle

2016-12-01

This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Recordings were made for 96 children (50 female participants, 46 male participants) engaged in a problem-solving task with a same-sex friend; recordings for 20 adults were used as reference. The task was carried out in good listening conditions (normal transmission) and in degraded transmission conditions. Articulation rate, median fundamental frequency (f0), f0 range, and relative energy in the 1- to 3-kHz range were analyzed. With increasing age, children significantly reduced their median f0 and f0 range, became faster talkers, and reduced their mid-frequency energy in spontaneous speech. Children produced similar clear speech adaptations (in degraded transmission conditions) as adults, but only children aged 11-14 years increased their f0 range, an unhelpful strategy not transmitted via the vocoder. Changes made by children were consistent with a general increase in vocal effort. Further developments in speech production take place during later childhood. Children use clear speech strategies to benefit an interlocutor facing intelligibility problems but may not be able to attune these strategies to the same degree as adults.
[Influence of human personal features on acoustic correlates of speech emotional intonation characteristics].

PubMed

Dmitrieva, E S; Gel'man, V Ia; Zaĭtseva, K A; Orlov, A M

2009-01-01

Comparative study of acoustic correlates of emotional intonation was conducted on two types of speech material: sensible speech utterances and short meaningless words. The corpus of speech signals of different emotional intonations (happy, angry, frightened, sad and neutral) was created using the actor's method of simulation of emotions. Native Russian 20-70-year-old speakers (both professional actors and non-actors) participated in the study. In the corpus, the following characteristics were analyzed: mean values and standard deviations of the power, fundamental frequency, frequencies of the first and second formants, and utterance duration. Comparison of each emotional intonation with "neutral" utterances showed the greatest deviations of the fundamental frequency and frequencies of the first formant. The direction of these deviations was independent of the semantic content of speech utterance and its duration, age, gender, and being actor or non-actor, though the personal features of the speakers affected the absolute values of these frequencies.
Perceived gender in clear and conversational speech

NASA Astrophysics Data System (ADS)

Booz, Jaime A.

Although many studies have examined acoustic and sociolinguistic differences between male and female speech, the relationship between talker speaking style and perceived gender has not yet been explored. The present study attempts to determine whether clear speech, a style adopted by talkers who perceive some barrier to effective communication, shifts perceptions of femininity for male and female talkers. Much of our understanding of gender perception in voice and speech is based on sustained vowels or single words, eliminating temporal, prosodic, and articulatory cues available in more naturalistic, connected speech. Thus, clear and conversational sentence stimuli, selected from the 41 talkers of the Ferguson Clear Speech Database (Ferguson, 2004) were presented to 17 normal-hearing listeners, aged 18 to 30. They rated the talkers' gender using a visual analog scale with "masculine" and "feminine" endpoints. This response method was chosen to account for within-category shifts of gender perception by allowing nonbinary responses. Mixed-effects regression analysis of listener responses revealed a small but significant effect of speaking style, and this effect was larger for male talkers than female talkers. Because of the high degree of talker variability observed for talker gender, acoustic analyses of these sentences were undertaken to determine the relationship between acoustic changes in clear and conversational speech and perceived femininity. Results of these analyses showed that mean fundamental frequency (fo) and fo standard deviation were significantly correlated to perceived gender for both male and female talkers, and vowel space was significantly correlated only for male talkers. Speaking rate and breathiness measures (CPPS) were not significantly related for either group. Outcomes of this study indicate that adopting a clear speaking style is correlated with increases in perceived femininity. Although the increase was small, some changes associated
Speech, Voice, and Communication.

PubMed

Johnson, Julia A

2017-01-01

Communication changes are an important feature of Parkinson's and include both motor and nonmotor features. This chapter will cover briefly the motor features affecting speech production and voice function before focusing on the nonmotor aspects. A description of the difficulties experienced by people with Parkinson's when trying to communicate effectively is presented along with some of the assessment tools and therapists' treatment options. The idea of clinical heterogeneity of PD and subtyping patients with different communication problems is explored and suggestions are made on how this may influence clinicians' treatment methods and choices so as to provide personalized therapy programmes. The importance of encouraging and supporting people to maintain social networks, employment, and leisure activities is stated as the key to achieving sustainability. Finally looking into the future, the emergence of new technologies is seen as providing further possibilities to support therapists in the goal of helping people with Parkinson's to maintain good communication skills throughout the course of the disease. © 2017 Elsevier Inc. All rights reserved.
Acoustic analysis of speech variables during depression and after improvement.

PubMed

Nilsonne, A

1987-09-01

Speech recordings were made of 16 depressed patients during depression and after clinical improvement. The recordings were analyzed using a computer program which extracts acoustic parameters from the fundamental frequency contour of the voice. The percent pause time, the standard deviation of the voice fundamental frequency distribution, the standard deviation of the rate of change of the voice fundamental frequency and the average speed of voice change were found to correlate to the clinical state of the patient. The mean fundamental frequency, the total reading time and the average rate of change of the voice fundamental frequency did not differ between the depressed and the improved group. The acoustic measures were more strongly correlated to the clinical state of the patient as measured by global depression scores than to single depressive symptoms such as retardation or agitation.
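The measures named in this record (percent pause time, standard deviation of F0, and statistics of the F0 rate of change) can be approximated from a frame-wise F0 contour. The sketch below is not the program used in the study: it assumes a fixed frame interval, treats frames with F0 equal to 0 as pauses, and uses hypothetical names and values throughout.

```python
import statistics

def f0_contour_measures(f0_hz, frame_s):
    """Summarise an F0 contour sampled every frame_s seconds.

    Assumption: a value of 0.0 marks a pause frame. At least one pair of
    consecutive voiced frames is required, otherwise pstdev raises an error.
    """
    voiced = [f for f in f0_hz if f > 0]
    percent_pause = 100.0 * (len(f0_hz) - len(voiced)) / len(f0_hz)
    # Rate of change (Hz/s), computed only between consecutive voiced frames.
    rates = [
        (b - a) / frame_s
        for a, b in zip(f0_hz, f0_hz[1:])
        if a > 0 and b > 0
    ]
    return {
        "percent_pause_time": percent_pause,
        "f0_sd_hz": statistics.pstdev(voiced),
        "f0_rate_sd_hz_per_s": statistics.pstdev(rates),
        "mean_abs_rate_hz_per_s": statistics.fmean([abs(r) for r in rates]),
    }

# Hypothetical six-frame contour at 10 ms per frame, with a two-frame pause.
measures = f0_contour_measures([100.0, 110.0, 0.0, 0.0, 120.0, 100.0], 0.01)
```

Treating every zero-F0 frame as a pause also counts unvoiced consonants as pauses; a practical pipeline would usually require a minimum pause duration before counting one.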
Effects of noise on speech recognition: Challenges for communication by service members.

PubMed

Le Prell, Colleen G; Clavier, Odile H

2017-06-01

Speech communication often takes place in noisy environments; this is an urgent issue for military personnel who must communicate in high-noise environments. The effects of noise on speech recognition vary significantly according to the sources of noise, the number and types of talkers, and the listener's hearing ability. In this review, speech communication is first described as it relates to current standards of hearing assessment for military and civilian populations. The next section categorizes types of noise (also called maskers) according to their temporal characteristics (steady or fluctuating) and perceptive effects (energetic or informational masking). Next, speech recognition difficulties experienced by listeners with hearing loss and by older listeners are summarized, and questions on the possible causes of speech-in-noise difficulty are discussed, including recent suggestions of "hidden hearing loss". The final section describes tests used by military and civilian researchers, audiologists, and hearing technicians to assess performance of an individual in recognizing speech in background noise, as well as metrics that predict performance based on a listener and background noise profile. This article provides readers with an overview of the challenges associated with speech communication in noisy backgrounds, as well as its assessment and potential impact on functional performance, and provides guidance for important new research directions relevant not only to military personnel, but also to employees who work in high noise environments. Copyright © 2016 Elsevier B.V. All rights reserved.
Speech Intelligibility of Aircrew Mask Communication Configurations in High-Noise Environments

DTIC Science & Technology

2017-09-28

ARL-TR-8168 ● Sep 2017. US Army Research Laboratory. Speech Intelligibility of Aircrew Mask Communication Configurations in High-Noise Environments, by Kimberly A Pollard and Lamar Garrett...
Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hogden, J.

The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
50 years of progress in microphone arrays for speech processing

NASA Astrophysics Data System (ADS)

Elko, Gary W.; Frisk, George V.

2004-10-01

In the early 1980s, Jim Flanagan had a dream of covering the walls of a room with microphones. He occasionally referred to this concept as acoustic wallpaper. Being a new graduate in the field of acoustics and signal processing, it was fortunate that Bell Labs was looking for someone to investigate this area of microphone arrays for telecommunication. The job interview was exciting, with all of the big names in speech signal processing and acoustics sitting in the audience, many of whom were the authors of books and articles that were seminal contributions to the fields of acoustics and signal processing. If there ever was an opportunity of a lifetime, this was it. Fortunately, some of the work had already begun, and Sessler and West had already laid the groundwork for directional electret microphones. This talk will describe some of the very early work done at Bell Labs on microphone arrays and reflect on some of the many systems, from large 400-element arrays, to small two-microphone arrays. These microphone array systems were built under Jim Flanagan's leadership in an attempt to realize his vision of seamless hands-free speech communication between people and the communication of people with machines.
Speech Perception With Combined Electric-Acoustic Stimulation: A Simulation and Model Comparison.

PubMed

Rader, Tobias; Adel, Youssef; Fastl, Hugo; Baumann, Uwe

2015-01-01

The aim of this study is to simulate speech perception with combined electric-acoustic stimulation (EAS), verify the advantage of combined stimulation in normal-hearing (NH) subjects, and then compare it with cochlear implant (CI) and EAS user results from the authors' previous study. Furthermore, an automatic speech recognition (ASR) system was built to examine the impact of low-frequency information and is proposed as an applied model to study different hypotheses of the combined-stimulation advantage. Signal-detection-theory (SDT) models were applied to assess predictions of subject performance without the need to assume any synergistic effects. Speech perception was tested using a closed-set matrix test (Oldenburg sentence test), and its speech material was processed to simulate CI and EAS hearing. A total of 43 NH subjects and a customized ASR system were tested. CI hearing was simulated by an aurally adequate signal spectrum analysis and representation, the part-tone-time-pattern, which was vocoded at 12 center frequencies according to the MED-EL DUET speech processor. Residual acoustic hearing was simulated by low-pass (LP)-filtered speech with cutoff frequencies 200 and 500 Hz for NH subjects and in the range from 100 to 500 Hz for the ASR system. Speech reception thresholds were determined in amplitude-modulated noise and in pseudocontinuous noise. Previously proposed SDT models were lastly applied to predict NH subject performance with EAS simulations. NH subjects tested with EAS simulations demonstrated the combined-stimulation advantage. Increasing the LP cutoff frequency from 200 to 500 Hz significantly improved speech reception thresholds in both noise conditions. In continuous noise, CI and EAS users showed generally better performance than NH subjects tested with simulations. In modulated noise, performance was comparable except for the EAS at cutoff frequency 500 Hz where NH subject performance was superior. The ASR system showed similar behavior
Communication without Speech: A Guide for Parents and Teachers.

ERIC Educational Resources Information Center

Bloomberg, Karen, Ed.; Johnson, Hilary, Ed.

This guide addresses issues facing the parents, teachers and caregivers of children who are unable to use normal speech as a means of communication. It focuses on people who are intellectually disabled or children who are starting to use augmentative communication. The guide includes the following topics: the nature of communication; an overview…
Time Reversal Acoustic Communication Using Filtered Multitone Modulation

PubMed Central

Sun, Lin; Chen, Baowei; Li, Haisen; Zhou, Tian; Li, Ruo

2015-01-01

The multipath spread in underwater acoustic channels is severe and, therefore, when the symbol rate of the time reversal (TR) acoustic communication using single-carrier (SC) modulation is high, the large intersymbol interference (ISI) span caused by multipath reduces the performance of the TR process and needs to be removed using the long adaptive equalizer as the post-processor. In this paper, a TR acoustic communication method using filtered multitone (FMT) modulation is proposed in order to reduce the residual ISI in the processed signal using TR. In the proposed method, FMT modulation is exploited to modulate information symbols onto separate subcarriers with high spectral containment and TR technique, as well as adaptive equalization is adopted at the receiver to suppress ISI and noise. The performance of the proposed method is assessed through simulation and real data from a trial in an experimental pool. The proposed method was compared with the TR acoustic communication using SC modulation with the same spectral efficiency. Results demonstrate that the proposed method can improve the performance of the TR process and reduce the computational complexity of adaptive equalization for post-process. PMID:26393586
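The core idea of time reversal can be shown in a few lines: filtering the received signal with the time-reversed channel estimate yields the channel autocorrelation, which concentrates multipath energy into one peak but leaves sidelobes behind as residual ISI. This is a minimal sketch of that single step only, not the FMT method proposed in the paper; the three-path channel and all names are hypothetical, and a real-valued baseband channel is assumed (a complex channel would also need conjugation).

```python
def convolve(a, b):
    """Full linear convolution of two real-valued sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def time_reversal_response(channel):
    """Effective impulse response after passive time reversal.

    Convolving the channel with its own time-reversed copy gives the channel
    autocorrelation, whose central peak collects energy from every path.
    """
    return convolve(channel, channel[::-1])

# Hypothetical three-path channel: direct path plus two delayed echoes.
h = [1.0, 0.0, 0.6, 0.0, 0.0, -0.4]
q = time_reversal_response(h)
peak = max(range(len(q)), key=lambda k: abs(q[k]))
# Largest sidelobe magnitude relative to the peak: the residual ISI.
residual = max(abs(v) for k, v in enumerate(q) if k != peak) / abs(q[peak])
```

In this toy channel the peak holds the summed path energy while the strongest sidelobe is still a sizeable fraction of it, which is why the record describes adaptive equalization as a post-processor after the TR step.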
Time Reversal Acoustic Communication Using Filtered Multitone Modulation.

PubMed

Sun, Lin; Chen, Baowei; Li, Haisen; Zhou, Tian; Li, Ruo

2015-09-17

The multipath spread in underwater acoustic channels is severe and, therefore, when the symbol rate of the time reversal (TR) acoustic communication using single-carrier (SC) modulation is high, the large intersymbol interference (ISI) span caused by multipath reduces the performance of the TR process and needs to be removed using the long adaptive equalizer as the post-processor. In this paper, a TR acoustic communication method using filtered multitone (FMT) modulation is proposed in order to reduce the residual ISI in the processed signal using TR. In the proposed method, FMT modulation is exploited to modulate information symbols onto separate subcarriers with high spectral containment and TR technique, as well as adaptive equalization is adopted at the receiver to suppress ISI and noise. The performance of the proposed method is assessed through simulation and real data from a trial in an experimental pool. The proposed method was compared with the TR acoustic communication using SC modulation with the same spectral efficiency. Results demonstrate that the proposed method can improve the performance of the TR process and reduce the computational complexity of adaptive equalization for post-process.
Designing acoustics for linguistically diverse classrooms: Effects of background noise, reverberation and talker foreign accent on speech comprehension by native and non-native English-speaking listeners

NASA Astrophysics Data System (ADS)

Peng, Zhao Ellen

The current classroom acoustics standard (ANSI S12.60-2010) recommends core learning spaces not to exceed background noise level (BNL) of 35 dBA and reverberation time (RT) of 0.6 second, based on speech intelligibility performance mainly by the native English-speaking population. Existing literature has not correlated these recommended values well with student learning outcomes. With a growing population of non-native English speakers in American classrooms, the special needs for perceiving degraded speech among non-native listeners, either due to realistic room acoustics or talker foreign accent, have not been addressed in the current standard. This research seeks to investigate the effects of BNL and RT on the comprehension of English speech from native English and native Mandarin Chinese talkers as perceived by native and non-native English listeners, and to provide acoustic design guidelines to supplement the existing standard. This dissertation presents two studies on the effects of RT and BNL on more realistic classroom learning experiences. How do native and non-native English-speaking listeners perform on speech comprehension tasks under adverse acoustic conditions, if the English speech is produced by talkers of native English (Study 1) versus native Mandarin Chinese (Study 2)? Speech comprehension materials were played back in a listening chamber to individual listeners: native and non-native English-speaking in Study 1; native English, native Mandarin Chinese, and other non-native English-speaking in Study 2. Each listener was screened for baseline English proficiency level, and completed dual tasks simultaneously involving speech comprehension and adaptive dot-tracing under 15 acoustic conditions, comprised of three BNL conditions (RC-30, 40, and 50) and five RT scenarios (0.4 to 1.2 seconds). The results show that BNL and RT negatively affect both objective performance and subjective perception of speech comprehension, more severely for non
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2002-01-01

Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.
Tongue- and Jaw-Specific Contributions to Acoustic Vowel Contrast Changes in the Diphthong /ai/ in Response to Slow, Loud, And Clear Speech

ERIC Educational Resources Information Center

Mefferd, Antje S.

2017-01-01

Purpose: This study sought to determine decoupled tongue and jaw displacement changes and their specific contributions to acoustic vowel contrast changes during slow, loud, and clear speech. Method: Twenty typical talkers repeated "see a kite again" 5 times in 4 speech conditions (typical, slow, loud, clear). Speech kinematics were…
Acoustical considerations for secondary uses of government facilities

NASA Astrophysics Data System (ADS)

Evans, Jack B.

2003-10-01

Government buildings are, by their nature, public and multi-functional. Whether in meetings, presentations, documentation processing, work instructions or dispatch, speech communications are critical. Full-time occupancy facilities may require sleep or rest areas adjacent to active spaces. Rooms designed for some other primary use may be used for public assembly, receptions or meetings. In addition, environmental noise impacts to the building or from the building should be considered, especially where adjacent to hospitals, hotels, apartments or other urban sensitive land uses. Acoustical criteria and design parameters for reverberation, background noise and sound isolation should enhance speech intelligibility and privacy. This presentation looks at unusual spaces and unexpected uses of spaces with regard to room acoustics and noise control. Examples of various spaces will be discussed, including an atrium used for reception and assembly, multi-jurisdictional (911) emergency control center, frequent or long-duration use of emergency generators, renovations of historically significant buildings, and the juxtaposition of acoustically incompatible functions. Brief case histories of acoustical requirements, constraints and design solutions will be presented, including acoustical measurements, plan illustrations and photographs. Acoustical criteria for secondary functional uses of spaces will be proposed.
Speech Perception in the Classroom.

ERIC Educational Resources Information Center

Smaldino, Joseph J.; Crandell, Carl C.

1999-01-01

This article discusses how poor room acoustics can make speech inaudible and presents a speech-perception model demonstrating the linkage between adequacy of classroom acoustics and the development of speech and language systems. It argues both aspects must be considered when evaluating barriers to listening and learning in a classroom.…
Environmental Fluctuations and Acoustic Data Communications

DTIC Science & Technology

2015-09-30

July 2011 along with subsequent analysis of the experiment data. KAM11 Experiment (2011) A shallow water acoustic communications experiment...packet and packet-to-packet variability. Algorithm Design and Experiment Data Analysis Communication receiver algorithm design for shallow water is...exhibited substantial daily oceanographic variability. Analysis of the KAM11 experiment data this past year has focused on fixed source transmissions
New Orleans Revisited and Revised: Recommendations for the Field of Speech Communication.

ERIC Educational Resources Information Center

Roever, James E.; And Others

This collection of four papers is the result of an action caucus held in association with the Speech Communication Association's 1972 convention, focusing on developments in the speech communication field since the 1968 USOE/SAA New Orleans conference (ED 028 164). In the first paper, "New Orleans Revisited but Briefly," James E. Roever summarizes…

Intelligent acoustic data fusion technique for information security analysis

NASA Astrophysics Data System (ADS)

Jiang, Ying; Tang, Yize; Lu, Wenda; Wang, Zhongfeng; Wang, Zepeng; Zhang, Luming

2017-08-01

Tone is an essential component of word formation in all tonal languages, and it plays an important role in the transmission of information in speech communication. The study of tone characteristics can therefore be applied to the security analysis of acoustic signals, for example by means of language identification. In speech processing, fundamental frequency (F0) is often viewed by speech synthesis researchers as representing tone. However, regular F0 values may lead to low naturalness in synthesized speech. Moreover, F0 and tone are not equivalent linguistically; F0 is just a representation of a tone. Therefore, the electroglottography (EGG) signal is collected for a deeper study of tone characteristics. In this paper, focusing on the Northern Kam language, which has nine tonal contours and five level tone types, we first collected EGG and speech signals from six natural male speakers of the Northern Kam language, and then obtained the clustering distributions of the tone curves. After summarizing the main characteristics of the tones of Northern Kam, we analyzed the relationship between EGG and speech signal parameters, and laid the foundation for further security analysis of acoustic signals.
Multi-carrier Communications over Time-varying Acoustic Channels

NASA Astrophysics Data System (ADS)

Aval, Yashar M.

Acoustic communication is an enabling technology for many autonomous undersea systems, such as those used for ocean monitoring, offshore oil and gas industry, aquaculture, or port security. There are three main challenges in achieving reliable high-rate underwater communication: the bandwidth of acoustic channels is extremely limited, the propagation delays are long, and the Doppler distortions are more pronounced than those found in wireless radio channels. In this dissertation we focus on assessing the fundamental limitations of acoustic communication, and designing efficient signal processing methods that can overcome these limitations. We address the fundamental question of acoustic channel capacity (achievable rate) for single-input-multi-output (SIMO) acoustic channels using a per-path Rician fading model, and focusing on two scenarios: narrowband channels where the channel statistics can be approximated as frequency-independent, and wideband channels where the nominal path loss is frequency-dependent. In each scenario, we compare several candidate power allocation techniques, and show that assigning uniform power across all frequencies for the first scenario, and assigning uniform power across a selected frequency-band for the second scenario, are the best practical choices in most cases, because the long propagation delay renders the feedback information outdated for power allocation based on the estimated channel response. We quantify our results using the channel information extracted from the 2010 Mobile Acoustic Communications Experiment (MACE'10). Next, we focus on achieving reliable high-rate communication over underwater acoustic channels. Specifically, we investigate orthogonal frequency division multiplexing (OFDM) as the state-of-the-art technique for dealing with frequency-selective multipath channels, and propose a class of methods that compensate for the time-variation of the underwater acoustic channel. These methods are based on multiple
The Contributions of Speech Communication Scholarship to the Study of Terrorism: Review and Preview.

ERIC Educational Resources Information Center

Dowling, Ralph E.

Based on the premise that existing research into terrorism shows great promise, this paper notes that, despite widespread recognition of terrorism's communicative dimensions, few studies have been done from within the discipline of speech communication. The paper defines the discipline of speech communication and rhetorical studies, reviews the…
Common cues to emotion in the dynamic facial expressions of speech and song

PubMed Central

Livingstone, Steven R.; Thompson, William F.; Wanderley, Marcelo M.; Palmer, Caroline

2015-01-01

Speech and song are universal forms of vocalization that may share aspects of emotional expression. Research has focused on parallels in acoustic features, overlooking facial cues to emotion. In three experiments, we compared moving facial expressions in speech and song. In Experiment 1, vocalists spoke and sang statements each with five emotions. Vocalists exhibited emotion-dependent movements of the eyebrows and lip corners that transcended speech–song differences. Vocalists’ jaw movements were coupled to their acoustic intensity, exhibiting differences across emotion and speech–song. Vocalists’ emotional movements extended beyond vocal sound to include large sustained expressions, suggesting a communicative function. In Experiment 2, viewers judged silent videos of vocalists’ facial expressions prior to, during, and following vocalization. Emotional intentions were identified accurately for movements during and after vocalization, suggesting that these movements support the acoustic message. Experiment 3 compared emotional identification in voice-only, face-only, and face-and-voice recordings. Emotion judgements for voice-only singing were poorly identified, yet were accurate for all other conditions, confirming that facial expressions conveyed emotion more accurately than the voice in song, yet were equivalent in speech. Collectively, these findings highlight broad commonalities in the facial cues to emotion in speech and song, yet highlight differences in perception and acoustic-motor production. PMID:25424388
Brain 'talks over' boring quotes: top-down activation of voice-selective areas while listening to monotonous direct speech quotations.

PubMed

Yao, Bo; Belin, Pascal; Scheepers, Christoph

2012-04-15

In human communication, direct speech (e.g., Mary said, "I'm hungry") is perceived as more vivid than indirect speech (e.g., Mary said that she was hungry). This vividness distinction has previously been found to underlie silent reading of quotations: Using functional magnetic resonance imaging (fMRI), we found that direct speech elicited higher brain activity in the temporal voice areas (TVA) of the auditory cortex than indirect speech, consistent with an "inner voice" experience in reading direct speech. Here we show that listening to monotonously spoken direct versus indirect speech quotations also engenders differential TVA activity. This suggests that individuals engage in top-down simulations or imagery of enriched supra-segmental acoustic representations while listening to monotonous direct speech. The findings shed new light on the acoustic nature of the "inner voice" in understanding direct speech. Copyright © 2012 Elsevier Inc. All rights reserved.
Effectiveness of the Picture Exchange Communication System (PECS) on communication and speech for children with autism spectrum disorders: a meta-analysis.

PubMed

Flippin, Michelle; Reszka, Stephanie; Watson, Linda R

2010-05-01

The Picture Exchange Communication System (PECS) is a popular communication-training program for young children with autism spectrum disorders (ASD). This meta-analysis reviews the current empirical evidence for PECS in affecting communication and speech outcomes for children with ASD. A systematic review of the literature on PECS written between 1994 and June 2009 was conducted. Quality of scientific rigor was assessed and used as an inclusion criterion in computation of effect sizes. Effect sizes were aggregated separately for single-subject and group studies for communication and speech outcomes. Eight single-subject experiments (18 participants) and 3 group studies (95 PECS participants, 65 in other intervention/control) were included. Results indicated that PECS is a promising but not yet established evidence-based intervention for facilitating communication in children with ASD ages 1-11 years. Small to moderate gains in communication were demonstrated following training. Gains in speech were small to negative. This meta-analysis synthesizes gains in communication and relative lack of gains made in speech across the PECS literature for children with ASD. Concerns about maintenance and generalization are identified. Emerging evidence of potential preintervention child characteristics is discussed. Phase IV was identified as a possibly influential program characteristic for speech outcomes.
[Prosody, speech input and language acquisition].

PubMed

Jungheim, M; Miller, S; Kühn, D; Ptok, M

2014-04-01

In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech this code differs especially with regard to prosody. For this review a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported, stating that the linguistically reduced CDS could hinder first language acquisition.
Oral motor functions, speech and communication before a definitive diagnosis of amyotrophic lateral sclerosis.

PubMed

Makkonen, Tanja; Korpijaakko-Huuhka, Anna-Maija; Ruottinen, Hanna; Puhto, Riitta; Hollo, Kirsi; Ylinen, Aarne; Palmio, Johanna

2016-01-01

The aim of this study was to explore the cranial nerve symptoms, speech disorders and communicative effectiveness of Finnish patients with diagnosed or possible amyotrophic lateral sclerosis (ALS) at their first assessment by a speech-language pathologist. The group studied consisted of 30 participants who had clinical signs of bulbar deterioration at the beginning of the study. They underwent a thorough clinical speech and communication examination. The cranial nerve symptoms and ability to communicate were compared in 14 participants with probable or definitive ALS and in 16 participants with suspected or possible ALS. The initial type of ALS was also assessed. More deterioration in soft palate function was found in participants with possible ALS than with diagnosed ALS. Likewise, a slower speech rate combined with more severe dysarthria was observed in possible ALS. In both groups, there was some deterioration in communicative effectiveness. In the possible ALS group the diagnostic delay was longer and speech therapy intervention actualized later. The participants with ALS showed multidimensional decline in communication at their first visit to the speech-language pathologist, but impairments and activity limitations were more severe in suspected or possible ALS. The majority of persons with bulbar-onset ALS in this study were in the latter diagnostic group. This suggests that they are more susceptible to delayed diagnosis and delayed speech therapy assessment. It is important to start speech therapy intervention during the diagnostic processes particularly if the person already shows bulbar symptoms. Copyright © 2016. Published by Elsevier Inc.
[Communicative and social behavior of speech disordered children].

PubMed

Eiberger, W; Hügel, H

1978-07-01

The spheres covering behaviour disorders, social behaviour and communicative behaviour of speech impaired pupils which until now have been analyzed on a more theoretical level, ought to be studied using psychometric testing procedures and an experimental observational situation in order to gain base data with which to set up a concrete catalogue of aims (learning program) based on the deficits thereby obtained. The study took place at the special school in Esslinger-Berkheim (Baden-Wurttemberg). By taking into account relevant specialized literature and the results of other studies, the following general hypotheses were advanced, namely, that the communication of speech handicapped children is troubled in respect of its content and relation, and that their social behaviour shows more egoistic than cooperative features. In order to determine social motivations and attitudes, we used Muller's "Social Motivation Test" (SMT) and Jorger's "Group test for the social attitude" (S-E-T). Due to the inconsistency between the attitudes measured by means of psychometric methods and the subsequent free and genuine behaviour, an observational situation was developed during which the pupils, either in pairs or in groups of four and using puppets, took turns in thinking up a story, discussing the plot, roles, etc. and finally putting on the play. The whole was then analyzed by means of tape recordings and film shots, the interaction of the communicating partners being analyzed and categorized in two separate assessment stages: communicative behaviour and social behaviour. The pragmatic axioms of P. Watzlawick, the communication researcher, functioned as theoretical background. Flanders's linear time diagram was used as assessment system. Communicative and social learning aims were prepared in accordance with confirming hypotheses to enable a "preliminary area" for the practical work in (special) education to be defined. In addition, a rough outline was made of the conditional
Hybrid Speaker Recognition Using Universal Acoustic Model

NASA Astrophysics Data System (ADS)

Nishimura, Jun; Kuroda, Tadahiro

We propose a novel speaker recognition approach using a speaker-independent universal acoustic model (UAM) for sensornet applications. In sensornet applications such as “Business Microscope”, interactions among knowledge workers in an organization can be visualized by sensing face-to-face communication using wearable sensor nodes. In conventional studies, speakers are detected by comparing the energy of input speech signals among the nodes. However, synchronization errors among the nodes often degrade speaker recognition performance. By focusing on properties of the speaker's acoustic channel, the UAM provides robustness against synchronization errors. The overall speaker recognition accuracy is improved by combining the UAM with the energy-based approach. For 0.1 s speech inputs and 4 subjects, a speaker recognition accuracy of 94% is achieved at synchronization errors of less than 100 ms.
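The conventional energy-based detection mentioned in this abstract can be sketched roughly as follows. This is only an illustration of the baseline idea (pick the node whose microphone captured the most energy in a shared time window), not the authors' UAM method; the function names, node labels, and sample values are hypothetical.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def detect_speaker(frames_by_node):
    """Return the node id whose frame has the highest energy.

    frames_by_node maps a (hypothetical) node id to the samples that node
    recorded for the same nominal time window. If the nodes' clocks are
    misaligned, the windows no longer cover the same speech, which is the
    synchronization problem the abstract describes.
    """
    return max(frames_by_node, key=lambda node: frame_energy(frames_by_node[node]))


# Illustrative frames: node "b" is closest to the talker, so it is louder.
frames = {
    "a": [0.1, -0.1, 0.1, -0.1],
    "b": [0.5, -0.4, 0.6, -0.5],
    "c": [0.2, -0.2, 0.1, -0.1],
}
speaker = detect_speaker(frames)
```

Comparing energies only works when every node's frame covers the same instant, which is why a channel-based cue that does not depend on alignment is attractive as a complement.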
Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing

PubMed Central

Doelling, Keith; Arnal, Luc; Ghitza, Oded; Poeppel, David

2013-01-01

A growing body of research suggests that intrinsic neuronal slow (< 10 Hz) oscillations in auditory cortex appear to track incoming speech and other spectro-temporally complex auditory signals. Within this framework, several recent studies have identified critical-band temporal envelopes as the specific acoustic feature being reflected by the phase of these oscillations. However, how this alignment between speech acoustics and neural oscillations might underpin intelligibility is unclear. Here we test the hypothesis that the ‘sharpness’ of temporal fluctuations in the critical band envelope acts as a temporal cue to speech syllabic rate, driving delta-theta rhythms to track the stimulus and facilitate intelligibility. We interpret our findings as evidence that sharp events in the stimulus cause cortical rhythms to re-align and parse the stimulus into syllable-sized chunks for further decoding. Using magnetoencephalographic recordings, we show that by removing temporal fluctuations that occur at the syllabic rate, envelope-tracking activity is reduced. By artificially reinstating these temporal fluctuations, envelope-tracking activity is regained. These changes in tracking correlate with intelligibility of the stimulus. Together, the results suggest that the sharpness of fluctuations in the stimulus, as reflected in the cochlear output, drives oscillatory activity to track and entrain to the stimulus at its syllabic rate. This process likely facilitates parsing of the stimulus into meaningful chunks appropriate for subsequent decoding, enhancing perception and intelligibility. PMID:23791839
Communication attitude and speech in 10-year-old children with cleft (lip and) palate: an ICF perspective.

PubMed

Havstam, Christina; Sandberg, Annika Dahlgren; Lohmander, Anette

2011-04-01

Many children born with cleft palate have impaired speech during their pre-school years, but usually the speech difficulties are transient and resolved by later childhood. This study investigated communication attitude with the Swedish version of the Communication Attitude Test (CAT-S) in 54 10-year-olds with cleft (lip and) palate. In addition, environmental factors were assessed via parent questionnaire. These data were compared to speech assessments by experienced listeners, who rated the children's velopharyngeal function, articulation, intelligibility, and general impression of speech at ages 5, 7, and 10 years. The children with clefts scored significantly higher on the CAT-S compared to reference data, indicating a more negative communication attitude on group level but with large individual variation. All speech variables, except velopharyngeal function at earlier ages, as well as the parent questionnaire scores, correlated significantly with the CAT-S scores. Although there was a relationship between speech and communication attitude, not all children with impaired speech developed negative communication attitudes. The assessment of communication attitude can make an important contribution to our understanding of the communicative situation for children with cleft (lip and) palate and give important indications for intervention.
Speech after Radial Forearm Free Flap Reconstruction of the Tongue: A Longitudinal Acoustic Study of Vowel and Diphthong Sounds

ERIC Educational Resources Information Center

Laaksonen, Juha-Pertti; Rieger, Jana; Happonen, Risto-Pekka; Harris, Jeffrey; Seikaly, Hadi

2010-01-01

The purpose of this study was to use acoustic analyses to describe speech outcomes over the course of 1 year after radial forearm free flap (RFFF) reconstruction of the tongue. Eighteen Canadian English-speaking females and males with reconstruction for oral cancer had speech samples recorded (pre-operative, and 1 month, 6 months, and 1 year…
Acoustical and Intelligibility Test of the Vocera(Copyright) B3000 Communication Badge

NASA Technical Reports Server (NTRS)

Archer, Ronald; Litaker, Harry; Chu, Shao-Sheng R.; Simon, Cory; Romero, Andy; Moses, Haifa

2012-01-01

To communicate with each other or ground support, crew members on board the International Space Station (ISS) currently use the Audio Terminal Units (ATU), which are located in each ISS module. However, to use the ATU, crew members must stop their current activity, travel to a panel, and speak into a wall-mounted microphone, or use either a handheld microphone or a Crew Communication Headset that is connected to a panel. These actions may unnecessarily increase task times, lower productivity, create cable management issues, and thus increase crew frustration. Therefore, the Habitability and Human Factors and Human Interface Branches at the NASA Johnson Space Center (JSC) are currently investigating a commercial-off-the-shelf (COTS) wireless communication system, Vocera(C), as a near-term solution for ISS communication. The objectives of the acoustics and intelligibility testing of this system were to answer the following questions: 1. How intelligibly can a human hear the transmitted message from a Vocera(C) badge in three different noise environments (Baseline = 20 dB, US Lab Module = 58 dB, Russian Module = 70.6 dB)? 2. How accurate is the Vocera(C) badge at recognizing voice commands in three different noise environments? 3. What body location (chest, upper arm, or shoulder) is optimal for speech intelligibility and voice recognition accuracy of the Vocera(C) badge on a human in three different noise environments?
Assessment of voice, speech and communication changes associated with cervical spinal cord injury.

PubMed

Johansson, Kerstin; Seiger, Åke; Forsén, Malin; Holmgren Nilsson, Jeanette; Hartelius, Lena; Schalling, Ellika

2018-02-24

Respiratory muscle impairment following cervical spinal cord injury (CSCI) may lead to reduced voice function, although the individual variation is large. Voice problems in this population may not always receive attention since individuals with CSCI face other, more acute and life-threatening issues that need/receive attention. Currently there is no consensus on the tasks suitable to identify the specific voice impairments and functional voice changes experienced by individuals with CSCI. To examine which voice/speech tasks identify the specific voice and communication changes associated with CSCI, habitual and maximum speech performance of a group with CSCI was compared with that of a healthy control group (CG), and the findings were related to respiratory function and to self-reported voice problems. Respiratory, aerodynamic, acoustic and self-reported voice data from 19 individuals (nine women and 10 men, aged 23-59 years, heights = 153-192 cm) with CSCI (levels C3-C7) were compared with data from a CG consisting of 19 carefully matched non-injured people (nine women and 10 men, aged 19-59 years, heights = 152-187 cm). Despite considerable variability of performance, highly significant differences between the group with CSCI and the CG were found in maximum phonation time, maximum duration of breath phrases, maximum sound pressure level and maximum voice area in voice-range profiles (all p = .000). Subglottal pressure was lower and phonatory stability was reduced in some of the individuals with CSCI, but differences between the groups were not statistically significant. Six of 19 had voice handicap index (VHI) scores above 20 (the cut-off for voice disorder). Individuals with a vital capacity below 50% of the expected for an equivalent reference individual performed significantly worse than participants with more normal vital capacity. Completeness and level of injury seemed to impact vocal function in some individuals. A combination of maximum performance
Neural source dynamics of brain responses to continuous stimuli: Speech processing from acoustics to comprehension.

PubMed

Brodbeck, Christian; Presacco, Alessandro; Simon, Jonathan Z

2018-05-15

Human experience often involves continuous sensory information that unfolds over time. This is true in particular for speech comprehension, where continuous acoustic signals are processed over seconds or even minutes. We show that brain responses to such continuous stimuli can be investigated in detail, for magnetoencephalography (MEG) data, by combining linear kernel estimation with minimum norm source localization. Previous research has shown that the requirement to average data over many trials can be overcome by modeling the brain response as a linear convolution of the stimulus and a kernel, or response function, and estimating a kernel that predicts the response from the stimulus. However, such analysis has been typically restricted to sensor space. Here we demonstrate that this analysis can also be performed in neural source space. We first computed distributed minimum norm current source estimates for continuous MEG recordings, and then computed response functions for the current estimate at each source element, using the boosting algorithm with cross-validation. Permutation tests can then assess the significance of individual predictor variables, as well as features of the corresponding spatio-temporal response functions. We demonstrate the viability of this technique by computing spatio-temporal response functions for speech stimuli, using predictor variables reflecting acoustic, lexical and semantic processing. Results indicate that processes related to comprehension of continuous speech can be differentiated anatomically as well as temporally: acoustic information engaged auditory cortex at short latencies, followed by responses over the central sulcus and inferior frontal gyrus, possibly related to somatosensory/motor cortex involvement in speech perception; lexical frequency was associated with a left-lateralized response in auditory cortex and subsequent bilateral frontal activity; and semantic composition was associated with bilateral temporal and
Cognitive Spare Capacity and Speech Communication: A Narrative Overview

PubMed Central

2014-01-01

Background noise can make speech communication tiring and cognitively taxing, especially for individuals with hearing impairment. It is now well established that better working memory capacity is associated with better ability to understand speech under adverse conditions as well as better ability to benefit from the advanced signal processing in modern hearing aids. Recent work has shown that although such processing cannot overcome hearing handicap, it can increase cognitive spare capacity, that is, the ability to engage in higher level processing of speech. This paper surveys recent work on cognitive spare capacity and suggests new avenues of investigation. PMID:24971355
A screening approach for classroom acoustics using web-based listening tests and subjective ratings.

PubMed

Persson Waye, Kerstin; Magnusson, Lennart; Fredriksson, Sofie; Croy, Ilona

2015-01-01

Perception of speech is crucial in school where speech is the main mode of communication. The aim of the study was to evaluate whether a web-based approach including listening tests and questionnaires could be used as a screening tool for poor classroom acoustics. The prime focus was the relation between pupils' comprehension of speech, the classroom acoustics and their description of the acoustic qualities of the classroom. In total, 1106 pupils aged 13-19, from 59 classes and 38 schools in Sweden participated in a listening study using Hagerman's sentences administered via Internet. Four listening conditions were applied: high and low background noise level and positions close and far away from the loudspeaker. The pupils described the acoustic quality of the classroom and teachers provided information on the physical features of the classroom using questionnaires. In 69% of the classes, at least three pupils described the sound environment as adverse and in 88% of the classes one or more pupils reported often having difficulties concentrating due to noise. The pupils' comprehension of speech was strongly influenced by the background noise level (p<0.001) and distance to the loudspeakers (p<0.001). Of the physical classroom features, presence of suspended acoustic panels (p<0.05) and length of the classroom (p<0.01) predicted speech comprehension. Of the pupils' descriptions of acoustic qualities, clattery significantly (p<0.05) predicted speech comprehension. Clattery was furthermore associated with difficulties understanding each other, while the description noisy was associated with concentration difficulties. The majority of classrooms do not seem to have an optimal sound environment. The pupils' descriptions of acoustic qualities and listening tests can be one way of predicting sound conditions in the classroom.
Ocean Variability Effects on Underwater Acoustic Communications

DTIC Science & Technology

2007-09-30

sea surface was rougher. To recover the transmitted symbols which have been passed through the time-varying multi-path acoustic channels, a new ...B is about 6 dB higher than that during environmental case A. Due to the large aperture and deployment range of the MPL array, the channel impulse...environmental fluctuations and the performance of coherent underwater acoustic communications presents new insights into the operational effectiveness of
Noise can affect acoustic communication and subsequent spawning success in fish.

PubMed

de Jong, Karen; Amorim, M Clara P; Fonseca, Paulo J; Fox, Clive J; Heubel, Katja U

2018-06-01

There are substantial concerns that increasing levels of anthropogenic noise in the oceans may impact aquatic animals. Noise can affect animals physically, physiologically and behaviourally, but one of the most obvious effects is interference with acoustic communication. Acoustic communication often plays a crucial role in reproductive interactions and over 800 species of fish have been found to communicate acoustically. There is very little data on whether noise affects reproduction in aquatic animals, and none in relation to acoustic communication. In this study we tested the effect of continuous noise on courtship behaviour in two closely-related marine fishes: the two-spotted goby (Gobiusculus flavescens) and the painted goby (Pomatoschistus pictus) in aquarium experiments. Both species use visual and acoustic signals during courtship. In the two-spotted goby we used a repeated-measures design testing the same individuals in the noise and the control treatment, in alternating order. For the painted goby we allowed females to spawn, precluding a repeated-measures design, but permitting a test of the effect of noise on female spawning decisions. Males of both species reduced acoustic courtship, but only painted gobies also showed less visual courtship in the noise treatment compared to the control. Female painted gobies were less likely to spawn in the noise treatment. Thus, our results provide experimental evidence for negative effects of noise on acoustic communication and spawning success. Spawning is a crucial component of reproduction. Therefore, even though laboratory results should not be extrapolated directly to field populations, our results suggest that reproductive success may be sensitive to noise pollution, potentially reducing fitness. Copyright © 2017 Elsevier Ltd. All rights reserved.

Infant-Mother Acoustic-Prosodic Alignment and Developmental Risk

ERIC Educational Resources Information Center

Seidl, Amanda; Cristia, Alejandrina; Soderstrom, Melanie; Ko, Eon-Suk; Abel, Emily A.; Kellerman, Ashleigh; Schwichtenberg, A. J.

2018-01-01

Purpose: One promising early marker for autism and other communicative and language disorders is early infant speech production. Here we used daylong recordings of high- and low-risk infant-mother dyads to examine whether acoustic-prosodic alignment as well as two automated measures of infant vocalization are related to developmental risk status…
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common

PubMed Central

Weninger, Felix; Eyben, Florian; Schuller, Björn W.; Mortillaro, Marcello; Scherer, Klaus R.

2013-01-01

Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning either of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow’s pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of “the sound that something makes,” in order to evaluate the system’s auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal, and valence regression is feasible achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects. PMID:23750144
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common.

PubMed

Weninger, Felix; Eyben, Florian; Schuller, Björn W; Mortillaro, Marcello; Scherer, Klaus R

2013-01-01

Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning either of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow's pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of "the sound that something makes," in order to evaluate the system's auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal, and valence regression is feasible achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
Channel coding for underwater acoustic single-carrier CDMA communication system

NASA Astrophysics Data System (ADS)

Liu, Lanjun; Zhang, Yonglei; Zhang, Pengcheng; Zhou, Lin; Niu, Jiong

2017-01-01

CDMA is an effective multiple access protocol for underwater acoustic networks, and channel coding can effectively reduce the bit error rate (BER) of the underwater acoustic communication system. For the requirements of underwater acoustic mobile networks based on CDMA, an underwater acoustic single-carrier CDMA communication system (UWA/SCCDMA) based on the direct-sequence spread spectrum is proposed, and its channel coding scheme is studied based on convolutional, RA, Turbo and LDPC coding respectively. The implementation steps of the Viterbi algorithm of convolutional coding, BP and minimum sum algorithms of RA coding, Log-MAP and SOVA algorithms of Turbo coding, and sum-product algorithm of LDPC coding are given. A UWA/SCCDMA simulation system based on Matlab is designed. Simulation results show that the UWA/SCCDMA systems based on RA, Turbo and LDPC coding perform well: the communication BER is below 10^-6 in the underwater acoustic channel at low signal-to-noise ratios (SNR) from -12 dB to -10 dB, which is about 2 orders of magnitude lower than that of convolutional coding. The system based on Turbo coding with the Log-MAP algorithm has the best performance.
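The direct-sequence spreading that such a system builds on can be illustrated with a minimal sketch. It is not the authors' Matlab simulation: the chip codes, user bits, and function names are illustrative assumptions, the channel is noiseless, and no channel coding is applied.

```python
# Minimal direct-sequence CDMA sketch (illustrative only).
# Two users share one channel using orthogonal +/-1 chip codes.

def spread(bits, code):
    """Map bits {0,1} to symbols {-1,+1} and multiply each symbol by the chip code."""
    chips = []
    for b in bits:
        symbol = 1 if b else -1
        chips.extend(symbol * c for c in code)
    return chips

def despread(signal, code):
    """Correlate each code-length block with the chip code and decide by sign."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits

code_a = [1, 1, 1, 1, -1, -1, -1, -1]   # hypothetical length-8 Walsh code
code_b = [1, -1, 1, -1, 1, -1, 1, -1]   # orthogonal to code_a

bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]

# Superpose both users' chip streams, as they would add in the shared channel.
channel = [x + y for x, y in zip(spread(bits_a, code_a), spread(bits_b, code_b))]

recovered_a = despread(channel, code_a)
recovered_b = despread(channel, code_b)
```

Because the two codes are orthogonal, each correlator sees only its own user's symbol. In a noisy channel the correlation would be perturbed, which is where the channel codes compared in the abstract come in.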
Speech masking and cancelling and voice obscuration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzrichter, John F.

A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.
Interpersonal Learning Systems for National Speech-Communication.

ERIC Educational Resources Information Center

Heinberg, Paul

A consensus has prevailed among educators that Americans of varying ethnic, social, cultural, and linguistic backgrounds who must communicate with each other in social, academic, and occupational situations might achieve a greater degree of rapport if the dialect of the English mutually spoken and the speech mannerisms used were standardized.…
Meeting the needs of children and young people with speech, language and communication difficulties.

PubMed

Lindsay, Geoff; Dockrell, Julie; Desforges, Martin; Law, James; Peacey, Nick

2010-01-01

The UK government set up a review of provision for children and young people with the full range of speech, language and communication needs, led by a Member of Parliament, John Bercow. A research study was commissioned to provide empirical evidence to inform the Bercow Review. The study aimed to examine the efficiency and effectiveness of different arrangements for organizing and providing services for children and young people with needs associated with primary speech, language and communication difficulties. Six Local Authorities in England and associated Primary Care Trusts were selected to represent a range of locations reflecting geographic spread, urban/rural and prevalence of children with speech, language and communication difficulties. In each case study, interviews were held with the senior Local Authority manager for special educational needs and a Primary Care Trust senior manager for speech and language therapy. A further 23 head teachers or heads of specialist provision for speech, language and communication difficulties were also interviewed and policy documents were examined. A thematic analysis of the interviews produced four main themes: identification of children and young people with speech, language and communication difficulties; meeting their needs; monitoring and evaluation; and research and evaluation. There were important differences between Local Authorities and Primary Care Trusts in the collection, analysis and use of data, in particular. There were also differences between Local Authority/Primary Care Trust pairs, especially in the degree to which they collaborated in developing policy and implementing practice. This study has demonstrated a lack of consistency across Local Authorities and Primary Care Trusts. Optimizing provision to meet the needs of children and young people with speech, language and communication difficulties will require concerted action, with leadership from central government. The study was used by the Bercow Review whose
Speech Cues Contribute to Audiovisual Spatial Integration

PubMed Central

Bishop, Christopher W.; Miller, Lee M.

2011-01-01

Speech is the most important form of human communication but ambient sounds and competing talkers often degrade its acoustics. Fortunately the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech-cues interact with audiovisual spatial integration mechanisms. Here, we combine two well established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech-cues can impede integration in space. This suggests a direct but asymmetrical influence between ventral ‘what’ and dorsal ‘where’ pathways. PMID:21909378
Decoding Articulatory Features from fMRI Responses in Dorsal Speech Regions.

PubMed

Correia, Joao M; Jansma, Bernadette M B; Bonte, Milene

2015-11-11

The brain's circuitry for perceiving and producing speech may show a notable level of overlap that is crucial for normal development and behavior. The extent to which sensorimotor integration plays a role in speech perception remains highly controversial, however. Methodological constraints related to experimental designs and analysis methods have so far prevented the disentanglement of neural responses to acoustic versus articulatory speech features. Using a passive listening paradigm and multivariate decoding of single-trial fMRI responses to spoken syllables, we investigated brain-based generalization of articulatory features (place and manner of articulation, and voicing) beyond their acoustic (surface) form in adult human listeners. For example, we trained a classifier to discriminate place of articulation within stop syllables (e.g., /pa/ vs /ta/) and tested whether this training generalizes to fricatives (e.g., /fa/ vs /sa/). This novel approach revealed generalization of place and manner of articulation at multiple cortical levels within the dorsal auditory pathway, including auditory, sensorimotor, motor, and somatosensory regions, suggesting the representation of sensorimotor information. Additionally, generalization of voicing included the right anterior superior temporal sulcus associated with the perception of human voices as well as somatosensory regions bilaterally. Our findings highlight the close connection between brain systems for speech perception and production, and in particular, indicate the availability of articulatory codes during passive speech perception. Sensorimotor integration is central to verbal communication and provides a link between auditory signals of speech perception and motor programs of speech production. It remains highly controversial, however, to what extent the brain's speech perception system actively uses articulatory (motor), in addition to acoustic/phonetic, representations. In this study, we examine the role of
A high-frequency warm shallow water acoustic communications channel model and measurements.

PubMed

Chitre, Mandar

2007-11-01

Underwater acoustic communication is a core enabling technology with applications in ocean monitoring using remote sensors and autonomous underwater vehicles. One of the more challenging underwater acoustic communication channels is the medium-range very shallow warm-water channel, common in tropical coastal regions. This channel exhibits two key features, extensive time-varying multipath and high levels of non-Gaussian ambient noise due to snapping shrimp, both of which limit the performance of traditional communication techniques. A good understanding of the communications channel is key to the design of communication systems. It aids in the development of signal processing techniques as well as in the testing of the techniques via simulation. In this article, a physics-based channel model is developed for the very shallow warm-water acoustic channel at high frequencies, which are of interest to medium-range communication system developers. The model is based on ray acoustics and includes time-varying statistical effects as well as non-Gaussian ambient noise statistics observed during channel studies. The model is calibrated and its accuracy validated using measurements made at sea.
Perception of the Voicing Distinction in Speech Produced during Simultaneous Communication

ERIC Educational Resources Information Center

MacKenzie, Douglas J.; Schiavetti, Nicholas; Whitehead, Robert L.; Metz, Dale Evan

2006-01-01

This study investigated the perception of voice onset time (VOT) in speech produced during simultaneous communication (SC). Four normally hearing, experienced sign language users were recorded under SC and speech alone (SA) conditions speaking stimulus words with voiced and voiceless initial consonants embedded in a sentence. Twelve…
Acoustic communication at the water's edge: evolutionary insights from a mudskipper.

PubMed

Polgar, Gianluca; Malavasi, Stefano; Cipolato, Giacomo; Georgalas, Vyron; Clack, Jennifer A; Torricelli, Patrizia

2011-01-01

Coupled behavioural observations and acoustical recordings of aggressive dyadic contests showed that the mudskipper Periophthalmodon septemradiatus communicates acoustically while out of water. An analysis of intraspecific variability showed that specific acoustic components may act as tags for individual recognition, further supporting the sounds' communicative value. A correlative analysis amongst acoustical properties and video-acoustical recordings in slow-motion supported first hypotheses on the emission mechanism. Acoustic transmission through the wet exposed substrate was also discussed. These observations were used to support an "exaptation hypothesis", i.e. the maintenance of key adaptations during the first stages of water-to-land vertebrate eco-evolutionary transitions (based on eco-evolutionary and palaeontological considerations), through a comparative bioacoustic analysis of aquatic and semiterrestrial gobiid taxa. In fact, a remarkable similarity was found between mudskipper vocalisations and those emitted by gobioids and other soniferous benthonic fishes.
A Speech Communication Program in Malaysia: Case Study in the Conundrums of Teaching Abroad.

ERIC Educational Resources Information Center

Dick, Robert C.; Robinson, Brenda M.

1998-01-01

Reports that speech communication courses were taught in Malaysia as part of a cooperative educational program between Indiana University and the Malaysian government. Examines unique elements of the culture of the Malaysian students that affect their speech communication; suggests issues to be addressed in the "Malaysianized" program to…
Communicative performance of adolescents with severe speech impairment: influence of context.

PubMed

Dalton, B M; Bedrosian, J L

1989-08-01

The communicative performance of 4 preoperational-level adolescents, using limited speech, gestures, and communication board techniques, was examined in a two-part investigation. In Part 1, each subject participated in an academic interaction with a teacher in a therapy room. Data were transcribed and coded for communication mode, function, and role. Two subjects were found to predominantly use the speech mode, while the remaining 2 predominantly used board and one other mode. The majority of productions consisted of responses to requests, and the initiator role was infrequently occupied. These findings were similar to those reported in previous investigations conducted in classroom settings. In Part 2, another examination of the communicative performance of these subjects was conducted in spontaneous interactions involving speaking and nonspeaking peers in a therapy room. Using the same data analysis procedures, gesture and speech modes predominated for 3 of the subjects in the nonspeaking peer interactions. The remaining subject exhibited minimal interaction. No consistent pattern of mode usage was exhibited across the speaking peer interactions. In the nonspeaking peer interactions, requests predominated. In contrast, a variety of communication functions was exhibited in the speaking peer interactions. Both the initiator and the maintainer roles were occupied in the majority of interactions. Pertinent variables and clinical implications are discussed.
Speech reception with different bilateral directional processing schemes: Influence of binaural hearing, audiometric asymmetry, and acoustic scenario.

PubMed

Neher, Tobias; Wagener, Kirsten C; Latzel, Matthias

2017-09-01

Hearing aid (HA) users can differ markedly in their benefit from directional processing (or beamforming) algorithms. The current study therefore investigated candidacy for different bilateral directional processing schemes. Groups of elderly listeners with symmetric (N = 20) or asymmetric (N = 19) hearing thresholds for frequencies below 2 kHz, a large spread in the binaural intelligibility level difference (BILD), and no difference in age, overall degree of hearing loss, or performance on a measure of selective attention took part. Aided speech reception was measured using virtual acoustics together with a simulation of a linked pair of completely occluding behind-the-ear HAs. Five processing schemes and three acoustic scenarios were used. The processing schemes differed in the tradeoff between signal-to-noise ratio (SNR) improvement and binaural cue preservation. The acoustic scenarios consisted of a frontal target talker presented against two speech maskers from ±60° azimuth or spatially diffuse cafeteria noise. For both groups, a significant interaction between BILD, processing scheme, and acoustic scenario was found. This interaction implied that, in situations with lateral speech maskers, HA users with BILDs larger than about 2 dB profited more from preserved low-frequency binaural cues than from greater SNR improvement, whereas for smaller BILDs the opposite was true. Audiometric asymmetry reduced the influence of binaural hearing. In spatially diffuse noise, the maximal SNR improvement was generally beneficial. N0Sπ detection performance at 500 Hz predicted the benefit from low-frequency binaural cues. Together, these findings provide a basis for adapting bilateral directional processing to individual and situational influences. Further research is needed to investigate their generalizability to more realistic HA conditions (e.g., with low-frequency vent-transmitted sound). Copyright © 2017 Elsevier B.V. All rights reserved.
Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels

PubMed Central

Caballero-Morales, Santiago-Omar

2013-01-01

An approach for the recognition of emotions in speech is presented. The target language is Mexican Spanish, and for this purpose a speech database was created. The approach consists in the phoneme acoustic modelling of emotion-specific vowels. For this, a standard phoneme-based Automatic Speech Recognition (ASR) system was built with Hidden Markov Models (HMMs), where different phoneme HMMs were built for the consonants and emotion-specific vowels associated with four emotional states (anger, happiness, neutral, sadness). Then, estimation of the emotional state from a spoken sentence is performed by counting the number of emotion-specific vowels found in the ASR's output for the sentence. With this approach, accuracy of 87–100% was achieved for the recognition of emotional state of Mexican Spanish speech. PMID:23935410
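The final decision step described here, counting emotion-specific vowels in the recognizer's output, might look like the sketch below. The `vowel_emotion` label format, the function name, and the majority-vote rule are assumptions made for illustration; the abstract does not specify how phoneme labels are written or how ties are resolved.

```python
from collections import Counter

EMOTIONS = ("anger", "happiness", "neutral", "sadness")

def estimate_emotion(phonemes):
    """Return the emotion whose tagged vowels occur most often in the sequence.

    Assumes (hypothetically) that emotion-specific vowels are labelled
    'vowel_emotion', e.g. 'a_anger', and that consonants carry no tag.
    Returns None when no tagged vowel is present.
    """
    counts = Counter()
    for p in phonemes:
        _vowel, sep, tag = p.partition("_")
        if sep and tag in EMOTIONS:
            counts[tag] += 1
    if not counts:
        return None
    # On a tie, the emotion seen first in the sequence wins (arbitrary choice).
    return counts.most_common(1)[0][0]

# Hypothetical ASR output for one sentence.
asr_output = ["k", "a_anger", "s", "a_anger", "m", "e_neutral", "s", "a_anger"]
label = estimate_emotion(asr_output)
```

Here three of the four tagged vowels carry the anger tag, so the sentence is labelled as anger.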
Singing whales generate high levels of particle motion: implications for acoustic communication and hearing?

PubMed

Mooney, T Aran; Kaplan, Maxwell B; Lammers, Marc O

2016-11-01

Acoustic signals are fundamental to animal communication, and cetaceans are often considered bioacoustic specialists. Nearly all studies of their acoustic communication focus on sound pressure measurements, overlooking the particle motion components of their communication signals. Here we characterized the levels of acoustic particle velocity (and pressure) of song produced by humpback whales. We demonstrate that whales generate acoustic fields that include significant particle velocity components that are detectable over relatively long distances sufficient to play a role in acoustic communication. We show that these signals attenuate predictably in a manner similar to pressure and that direct particle velocity measurements can provide bearings to singing whales. Whales could potentially use such information to determine the distance of signalling animals. Additionally, the vibratory nature of particle velocity may stimulate bone conduction, a hearing modality found in other low-frequency specialized mammals, offering a parsimonious mechanism of acoustic energy transduction into the massive ossicles of whale ears. With substantial concerns regarding the effects of increasing anthropogenic ocean noise and major uncertainties surrounding mysticete hearing, these results highlight both an unexplored pathway that may be available for whale acoustic communication and the need to better understand the biological role of acoustic particle motion. © 2016 The Author(s).
Singing whales generate high levels of particle motion: implications for acoustic communication and hearing?

PubMed Central

Kaplan, Maxwell B.; Lammers, Marc O.

2016-01-01

Acoustic signals are fundamental to animal communication, and cetaceans are often considered bioacoustic specialists. Nearly all studies of their acoustic communication focus on sound pressure measurements, overlooking the particle motion components of their communication signals. Here we characterized the levels of acoustic particle velocity (and pressure) of song produced by humpback whales. We demonstrate that whales generate acoustic fields that include significant particle velocity components that are detectable over relatively long distances sufficient to play a role in acoustic communication. We show that these signals attenuate predictably in a manner similar to pressure and that direct particle velocity measurements can provide bearings to singing whales. Whales could potentially use such information to determine the distance of signalling animals. Additionally, the vibratory nature of particle velocity may stimulate bone conduction, a hearing modality found in other low-frequency specialized mammals, offering a parsimonious mechanism of acoustic energy transduction into the massive ossicles of whale ears. With substantial concerns regarding the effects of increasing anthropogenic ocean noise and major uncertainties surrounding mysticete hearing, these results highlight both an unexplored pathway that may be available for whale acoustic communication and the need to better understand the biological role of acoustic particle motion. PMID:27807249
Captain's Log...The Speech Communication Oral Journal.

ERIC Educational Resources Information Center

Strong, William F.

1983-01-01

The logic and the benefits of requiring college students in basic speech communication classes to tape-record oral journals are set forth along with a detailed description of the assignment. Instructions to the students explain the mechanics of the assignment as follows: (1) obtain and properly label a quality cassette tape; (2) make seven…
Acoustic Communication at the Water's Edge: Evolutionary Insights from a Mudskipper

PubMed Central

Polgar, Gianluca; Malavasi, Stefano; Cipolato, Giacomo; Georgalas, Vyron; Clack, Jennifer A.; Torricelli, Patrizia

2011-01-01

Coupled behavioural observations and acoustical recordings of aggressive dyadic contests showed that the mudskipper Periophthalmodon septemradiatus communicates acoustically while out of water. An analysis of intraspecific variability showed that specific acoustic components may act as tags for individual recognition, further supporting the sounds' communicative value. A correlative analysis amongst acoustical properties and video-acoustical recordings in slow-motion supported first hypotheses on the emission mechanism. Acoustic transmission through the wet exposed substrate was also discussed. These observations were used to support an “exaptation hypothesis”, i.e. the maintenance of key adaptations during the first stages of water-to-land vertebrate eco-evolutionary transitions (based on eco-evolutionary and palaeontological considerations), through a comparative bioacoustic analysis of aquatic and semiterrestrial gobiid taxa. In fact, a remarkable similarity was found between mudskipper vocalisations and those emitted by gobioids and other soniferous benthonic fishes. PMID:21738663

A Secure Communication Suite for Underwater Acoustic Sensor Networks

PubMed Central

Dini, Gianluca; Duca, Angelica Lo

2012-01-01

In this paper we describe a security suite for Underwater Acoustic Sensor Networks comprising both fixed and mobile nodes. The security suite is composed of a secure routing protocol and a set of cryptographic primitives aimed at protecting the confidentiality and the integrity of underwater communication while taking into account the unique characteristics and constraints of the acoustic channel. By means of experiments and simulations based on real data, we show that the suite is suitable for an underwater networking environment as it introduces limited, and sometimes negligible, communication and power consumption overhead. PMID:23202204
Multichannel spatial auditory display for speech communications

NASA Technical Reports Server (NTRS)

Begault, D. R.; Erbe, T.; Wenzel, E. M. (Principal Investigator)

1994-01-01

A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
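Spatializing a speech channel by FIR filtering with head-related transfer functions amounts to convolving the mono input with one impulse response per ear. The sketch below shows that operation in plain Python; the three-tap impulse responses are toy values chosen for readability, not measured HRTFs, and nothing here reflects the Motorola 56001 firmware.

```python
def fir_filter(signal, taps):
    """Direct-form FIR filter: y[n] = sum_k taps[k] * x[n-k], same length as input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def spatialize(mono, hrir_left, hrir_right):
    """Filter one mono channel with a left-ear and a right-ear impulse response."""
    return fir_filter(mono, hrir_left), fir_filter(mono, hrir_right)

# Toy impulse responses for a source on the listener's right:
# the left ear receives the sound two samples later and at half amplitude.
hrir_right = [1.0, 0.0, 0.0]
hrir_left = [0.0, 0.0, 0.5]

mono = [1.0, 0.0, 0.0, 0.0]   # unit impulse as test input
left, right = spatialize(mono, hrir_left, hrir_right)
```

Feeding a unit impulse returns the impulse responses themselves, which makes the interaural delay and level difference easy to inspect.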
Multichannel spatial auditory display for speech communications.

PubMed

Begault, D R; Erbe, T

1994-10-01

A spatial auditory display for multiple speech communications was developed at NASA/Ames Research Center. Input is spatialized by the use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four-letter call signs used by launch personnel at NASA against diotic speech babble. Spatial positions at 30 degrees azimuth increments were evaluated. The results from eight subjects showed a maximum intelligibility improvement of about 6-7 dB when the signal was spatialized to 60 or 90 degrees azimuth positions.
Speech perception and production in severe environments

NASA Astrophysics Data System (ADS)

Pisoni, David B.

1990-09-01

The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes were examined in speech production under these adverse conditions through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.
Combined Electric and Acoustic Stimulation With Hearing Preservation: Effect of Cochlear Implant Low-Frequency Cutoff on Speech Understanding and Perceived Listening Difficulty.

PubMed

Gifford, René H; Davis, Timothy J; Sunderhaus, Linsey W; Menapace, Christine; Buck, Barbara; Crosson, Jillian; O'Neill, Lori; Beiter, Anne; Segel, Phil

The primary objective of this study was to assess the effect of electric and acoustic overlap for speech understanding in typical listening conditions using semidiffuse noise. This study used a within-subjects, repeated measures design including 11 experienced adult implant recipients (13 ears) with functional residual hearing in the implanted and nonimplanted ear. The aided acoustic bandwidth was fixed and the low-frequency cutoff for the cochlear implant (CI) was varied systematically. Assessments were completed in the R-SPACE sound-simulation system which includes a semidiffuse restaurant noise originating from eight loudspeakers placed circumferentially about the subject's head. AzBio sentences were presented at 67 dBA with signal to noise ratio varying between +10 and 0 dB determined individually to yield approximately 50 to 60% correct for the CI-alone condition with full CI bandwidth. Listening conditions for all subjects included CI alone, bimodal (CI + contralateral hearing aid), and bilateral-aided electric and acoustic stimulation (EAS; CI + bilateral hearing aid). Low-frequency cutoffs both below and above the original "clinical software recommendation" frequency were tested for all patients, in all conditions. Subjects estimated listening difficulty for all conditions using listener ratings based on a visual analog scale. Three primary findings were that (1) there was statistically significant benefit of preserved acoustic hearing in the implanted ear for most overlap conditions, (2) the default clinical software recommendation rarely yielded the highest level of speech recognition (1 of 13 ears), and (3) greater EAS overlap than that provided by the clinical recommendation yielded significant improvements in speech understanding. For standard-electrode CI recipients with preserved hearing, spectral overlap of acoustic and electric stimuli yielded significantly better speech understanding and less listening effort in a laboratory-based, restaurant
Auditory-Perceptual and Acoustic Methods in Measuring Dysphonia Severity of Korean Speech.

PubMed

Maryn, Youri; Kim, Hyung-Tae; Kim, Jaeock

2016-09-01

The purpose of this study was to explore the criterion-related concurrent validity of two standardized auditory-perceptual rating protocols and the Acoustic Voice Quality Index (AVQI) for measuring dysphonia severity in Korean speech. Sixty native Korean subjects with various voice disorders were asked to sustain the vowel [a:] and to read aloud the Korean text "Walk." A 3-second midvowel portion of the sustained vowel and two sentences (with 25 syllables) were edited, concatenated, and analyzed according to methods described elsewhere. From 56 participants, both continuous speech and sustained vowel recordings had sufficiently high signal-to-noise ratios (35.5 dB and 37 dB on average, respectively) and were therefore subjected to further dysphonia severity analysis with (1) "G" or Grade from the GRBAS protocol, (2) "OS" or Overall Severity from the Consensus Auditory-Perceptual Evaluation of Voice protocol, and (3) AVQI. First, high correlations were found between G and OS (rS = 0.955 for sustained vowels; rS = 0.965 for continuous speech). Second, the AVQI showed a strong correlation with G (rS = 0.911) as well as OS (rP = 0.924). These findings are in agreement with similar studies dealing with continuous speech in other languages. The present study highlights the criterion-related concurrent validity of these methods in Korean speech. Furthermore, it supports the cross-linguistic robustness of the AVQI as a valid and objective marker of overall dysphonia severity. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds

PubMed Central

Agnew, Z.K.; McGettigan, C.; Scott, S.K.

2012-01-01

Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557
Hello World, It's Me: Bringing the Basic Speech Communication Course into the Digital Age

ERIC Educational Resources Information Center

Kirkwood, Jessica; Gutgold, Nichola D.; Manley, Destiny

2011-01-01

During the past decade, instructors of speech communication have been adapting the introductory speech course to keep up with the television age. Learning units in speech textbooks now teach how to speak well on television, as well as how to interpret speeches in the media. This article argues that the computer age invites adaptation of the…
The Human Voice in Speech and Singing

NASA Astrophysics Data System (ADS)

Lindblom, Björn; Sundberg, Johan

This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three-stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this airstream to an intermittent or pulsating airstream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the airstream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter and how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm
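As a rough illustration of how the form of the vocal tract sets its resonance frequencies, the sketch below uses a common textbook idealization that is not taken from the chapter itself: a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c / (4L). The tube length, the speed of sound, and the function name are assumed values for the example.

```python
def tube_formants(length_m, speed_of_sound=350.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    F_n = (2n - 1) * c / (4 * L) for n = 1, 2, ...
    This is an idealization; real vocal tracts are not uniform tubes.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]

# Assumed values: 17.5 cm tract length, 350 m/s speed of sound in warm moist air.
formants = tube_formants(0.175)
```

For these assumed values the first three resonances come out near 500, 1500 and 2500 Hz, close to the formant pattern of a neutral vowel. Changing the tube's shape, as the articulators do, shifts these peaks.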
The Effect of Noise on Relationships Between Speech Intelligibility and Self-Reported Communication Measures in Tracheoesophageal Speakers.

PubMed

Eadie, Tanya L; Otero, Devon Sawin; Bolt, Susan; Kapsner-Smith, Mara; Sullivan, Jessica R

2016-08-01

The purpose of this study was to examine how sentence intelligibility relates to self-reported communication in tracheoesophageal speakers when speech intelligibility is measured in quiet and noise. Twenty-four tracheoesophageal speakers who were at least 1 year postlaryngectomy provided audio recordings of 5 sentences from the Sentence Intelligibility Test. Speakers also completed self-reported measures of communication: the Voice Handicap Index-10 and the Communicative Participation Item Bank short form. Speech recordings were presented to 2 groups of inexperienced listeners who heard sentences in quiet or noise. Listeners transcribed the sentences to yield speech intelligibility scores. Very weak relationships were found between intelligibility in quiet and measures of voice handicap and communicative participation. Slightly stronger, but still weak and nonsignificant, relationships were observed between measures of intelligibility in noise and both self-reported measures. However, 12 speakers who were more than 65% intelligible in noise showed strong and statistically significant relationships with both self-reported measures (R² = .76-.79). Speech intelligibility in quiet is a weak predictor of self-reported communication measures in tracheoesophageal speakers. Speech intelligibility in noise may be a better metric of self-reported communicative function for speakers who demonstrate higher speech intelligibility in noise.
Scientific bases of human-machine communication by voice.

PubMed Central

Schafer, R W

1995-01-01

The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines. PMID:7479802
Cross-Cultural Communication through Course Linkage: Utilizing Experiential Learning in Speech 110 (Introduction to Speech/Communication) & ESL 009 (Oral Skills).

ERIC Educational Resources Information Center

Mackler, Tobi; Savard, Theresa

Taking advantage of the opportunity to heighten cultural awareness and create an intercultural exchange, this paper presents two articles that provide a summary of the rationale, methodology, and assignments used to teach the linked courses of an introductory speech communication course and an English-as-a-Second-Language Oral Skills course. The…
Learning Resources for the Secondary Speech Communication Classroom.

ERIC Educational Resources Information Center

Wolvin, Andrew D.

1974-01-01

New print and nonprint resources for secondary level classroom use are available in the field of speech communication, which has become process oriented with continual interaction between speaker and listener. Of five specific books, three provide valuable resource material for teachers, focusing on practical teaching suggestions and the necessity…
Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

PubMed

Larm, Petra; Hongisto, Valtteri

2006-02-01

During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
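The abstract above notes that these indices all rest on an apparent speech-to-noise ratio per frequency band, combined with a weighting. A minimal sketch of that idea follows, assuming the clipping range of -15 to +15 dB found in common descriptions of the STI. The function names and the equal band weights are hypothetical placeholders, not the values of any standard or of the study.

```python
def band_transmission_index(snr_db, lo=-15.0, hi=15.0):
    """Map an apparent speech-to-noise ratio (dB) onto a 0..1 index.

    The ratio is clipped to [lo, hi] and rescaled linearly.
    The default range is an assumption taken from common STI descriptions.
    """
    clipped = max(lo, min(hi, snr_db))
    return (clipped - lo) / (hi - lo)


def weighted_index(band_snrs_db, weights):
    """Combine per-band indices into one value using a weighted mean."""
    if len(band_snrs_db) != len(weights):
        raise ValueError("one weight is required per frequency band")
    total = sum(weights)
    weighted = sum(
        w * band_transmission_index(s) for s, w in zip(band_snrs_db, weights)
    )
    return weighted / total


# Hypothetical apparent SNRs in four bands, with equal (placeholder) weights.
example_snrs = [15.0, 0.0, -15.0, 30.0]
example_weights = [1.0, 1.0, 1.0, 1.0]
example_index = weighted_index(example_snrs, example_weights)
```

In this sketch a band at or above +15 dB contributes fully and a band at or below -15 dB contributes nothing, so very high ratios in one band cannot compensate for poor ratios in another.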
Acoustic changes in the speech of children with cerebral palsy following an intensive program of dysarthria therapy.

PubMed

Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

2018-01-01

The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. This study aimed to determine whether change in speech intelligibility following intervention is associated with change in acoustic measures of voice. We recorded 16 young people with cerebral palsy and dysarthria (nine girls; mean age 14 years, SD = 2; nine spastic type, two dyskinetic, four mixed; one Worster-Drought) producing speech in two conditions (single words, connected speech) twice before and twice after therapy focusing on respiration, phonation and rate. In both single-word and connected speech we measured vocal intensity (root mean square, RMS), period-to-period variability (Shimmer APQ, Jitter RAP and PPQ) and harmonics-to-noise ratio (HNR). In connected speech we also measured mean fundamental frequency, utterance duration in seconds and speech and articulation rate (syllables/s with and without pauses respectively). All acoustic measures were made using Praat. Intelligibility was calculated in previous research. In single words statistically significant but very small reductions were observed in period-to-period variability following therapy: Shimmer APQ -0.15 (95% CI = -0.21 to -0.09); Jitter RAP -0.08 (95% CI = -0.14 to -0.01); Jitter PPQ -0.08 (95% CI = -0.15 to -0.01). No changes in period-to-period perturbation across phrases in connected speech were detected. However, changes in connected speech were observed in phrase length, rate and intensity. Following therapy, mean utterance duration increased by 1.11 s (95% CI = 0.37-1.86) when measured with pauses and by 1.13 s (95% CI = 0.40-1.85) when measured without pauses. Articulation rate increased by 0.07 syllables/s (95% CI = 0.02-0.13); speech rate increased by 0.06 syllables/s (95% CI = < 0.01-0.12); and intensity increased by 0.03 Pascals (95% CI = 0.02-0.04). There was a gradual reduction in mean fundamental frequency across all time points (-11.85 Hz, 95
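The period-to-period variability measures named in this abstract can be illustrated with a short sketch. It assumes the commonly cited definitions of local jitter and of the relative average perturbation (RAP), applied to a list of glottal period durations in seconds. The study used Praat, whose implementation applies further rules (for example limits on admissible periods) that are not reproduced here; the function names and the period values are hypothetical.

```python
def jitter_local(periods):
    """Mean absolute difference between consecutive periods,
    divided by the mean period."""
    if len(periods) < 2:
        raise ValueError("at least two periods are required")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_period = sum(periods) / len(periods)
    return (sum(diffs) / len(diffs)) / mean_period


def jitter_rap(periods):
    """Relative average perturbation: mean absolute difference between
    each period and the average of it and its two neighbours,
    divided by the mean period."""
    if len(periods) < 3:
        raise ValueError("at least three periods are required")
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, len(periods) - 1)
    ]
    mean_period = sum(periods) / len(periods)
    return (sum(deviations) / len(deviations)) / mean_period


# Hypothetical period durations (seconds) alternating around 10 ms.
example_periods = [0.010, 0.011, 0.010, 0.011, 0.010]
example_local = jitter_local(example_periods)
example_rap = jitter_rap(example_periods)
```

Both values are ratios, so they are often reported as percentages. A perfectly regular voice (all periods equal) gives zero for both.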
Understanding speech when wearing communication headsets and hearing protectors with subband processing.

PubMed

Brammer, Anthony J; Yu, Gongqiang; Bernstein, Eric R; Cherniack, Martin G; Peterson, Donald R; Tufts, Jennifer B

2014-08-01

An adaptive, delayless, subband feed-forward control structure is employed to improve the speech signal-to-noise ratio (SNR) in the communication channel of a circumaural headset/hearing protector (HPD) from 90 Hz to 11.3 kHz, and to provide active noise control (ANC) from 50 to 800 Hz to complement the passive attenuation of the HPD. The task involves optimizing the speech SNR for each communication channel subband, subject to limiting the maximum sound level at the ear, maintaining a speech SNR preferred by users, and reducing large inter-band gain differences to improve speech quality. The performance of a proof-of-concept device has been evaluated in a pseudo-diffuse sound field when worn by human subjects under conditions of environmental noise and speech that do not pose a risk to hearing, and by simulation for other conditions. For the environmental noises employed in this study, subband speech SNR control combined with subband ANC produced greater improvement in word scores than subband ANC alone, and improved the consistency of word scores across subjects. The simulation employed a subject-specific linear model, and predicted that word scores are maintained in excess of 90% for sound levels outside the HPD of up to ∼115 dBA.
Gender and vocal production mode discrimination using the high frequencies for speech and singing

PubMed Central

Monson, Brian B.; Lotto, Andrew J.; Story, Brad H.

2014-01-01

Humans routinely produce acoustical energy at frequencies above 6 kHz during vocalization, but this frequency range is often not represented in communication devices and speech perception research. Recent advancements toward high-definition (HD) voice and extended bandwidth hearing aids have increased the interest in the high frequencies. The potential perceptual information provided by high-frequency energy (HFE) is not well characterized. We found that humans can accomplish tasks of gender discrimination and vocal production mode discrimination (speech vs. singing) when presented with acoustic stimuli containing only HFE at both amplified and normal levels. Performance in these tasks was robust in the presence of low-frequency masking noise. No substantial learning effect was observed. Listeners also were able to identify the sung and spoken text (excerpts from “The Star-Spangled Banner”) with very few exposures. These results add to the increasing evidence that the high frequencies provide at least redundant information about the vocal signal, suggesting that its representation in communication devices (e.g., cell phones, hearing aids, and cochlear implants) and speech/voice synthesizers could improve these devices and benefit normal-hearing and hearing-impaired listeners. PMID:25400613
Speech coding at 4800 bps for mobile satellite communications

NASA Technical Reports Server (NTRS)

Gersho, Allen; Chan, Wai-Yip; Davidson, Grant; Chen, Juin-Hwey; Yong, Mei

1988-01-01

A speech compression project has recently been completed to develop a speech coding algorithm suitable for operation in a mobile satellite environment aimed at providing telephone quality natural speech at 4.8 kbps. The work has resulted in two alternative techniques which achieve reasonably good communications quality at 4.8 kbps while tolerating vehicle noise and rather severe channel impairments. The algorithms are embodied in a compact self-contained prototype consisting of two AT&T 32-bit floating-point DSP32 digital signal processors (DSP). A Motorola 68HC11 microcomputer chip serves as the board controller and interface handler. Built on a wire-wrapped card, the prototype has a circuit footprint of only 200 sq cm and consumes about 9 watts of power.
Subcortical Contributions to Motor Speech: Phylogenetic, Developmental, Clinical.

PubMed

Ziegler, W; Ackermann, H

2017-08-01

Vocal learning is an exclusively human trait among primates. However, songbirds demonstrate behavioral features resembling human speech learning. Two circuits have a preeminent role in this human behavior; namely, the corticostriatal and the cerebrocerebellar motor loops. While the striatal contribution can be traced back to the avian anterior forebrain pathway (AFP), the sensorimotor adaptation functions of the cerebellum appear to be human specific in acoustic communication. This review contributes to an ongoing discussion on how birdsong translates into human speech. While earlier approaches were focused on higher linguistic functions, we place the motor aspects of speaking at center stage. Genetic data are brought together with clinical and developmental evidence to outline the role of cerebrocerebellar and corticostriatal interactions in human speech. Copyright © 2017 Elsevier Ltd. All rights reserved.
Hidden Markov models in automatic speech recognition

NASA Astrophysics Data System (ADS)

Wrzoskowicz, Adam

1993-11-01

This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
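The abstract does not state which decoding procedure the described system uses. As a generic illustration of recognition with an HMM, the following sketch implements Viterbi decoding, a standard way to find the most likely state sequence for a series of observations. The two-state model, its labels and all probabilities are hypothetical toy values, not taken from the article.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state path and its log probability."""
    # Scores for the first observation.
    scores = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = scores[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[last]


def to_log(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Hypothetical toy model: two sound classes, two coarse energy observations.
states = ["vowel", "consonant"]
log_start = to_log({"vowel": 0.5, "consonant": 0.5})
log_trans = to_log({
    "vowel": {"vowel": 0.7, "consonant": 0.3},
    "consonant": {"vowel": 0.3, "consonant": 0.7},
})
log_emit = to_log({
    "vowel": {"high": 0.9, "low": 0.1},
    "consonant": {"high": 0.2, "low": 0.8},
})

best_path, best_log_prob = viterbi(
    ["high", "high", "low"], states, log_start, log_trans, log_emit
)
```

Working in log probabilities avoids numerical underflow on long observation sequences, which matters for real speech signals with hundreds of frames.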

Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech

PubMed Central

Nakajima, Yoshitaka; Matsuda, Mizuki; Ueda, Kazuo; Remijn, Gerard B.

2018-01-01

Temporal resolution needed for Japanese speech communication was measured. A new experimental paradigm that can reflect the spectro-temporal resolution necessary for healthy listeners to perceive speech is introduced. As a first step, we report listeners' intelligibility scores of Japanese speech with a systematically degraded temporal resolution, so-called “mosaic speech”: speech mosaicized in the coordinates of time and frequency. The results of two experiments show that mosaic speech cut into short static segments was almost perfectly intelligible with a temporal resolution of 40 ms or finer. Intelligibility dropped for a temporal resolution of 80 ms, but was still around 50%-correct level. The data are in line with previous results showing that speech signals separated into short temporal segments of <100 ms can be remarkably robust in terms of linguistic-content perception against drastic manipulations in each segment, such as partial signal omission or temporal reversal. The human perceptual system thus can extract meaning from unexpectedly rough temporal information in speech. The process resembles that of the visual system stringing together static movie frames of ~40 ms into vivid motion. PMID:29740295
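One way to picture the temporal degradation described above is to replace each run of frames in a band-power representation with its mean, which yields short static segments. The sketch below does this under the assumption that the input is a list of per-band power sequences; the function name, the input layout and the example values are hypothetical, and the study's actual signal processing is not reproduced.

```python
def mosaicize(power, frames_per_block):
    """Average band power within consecutive blocks of frames.

    power: list of bands, each a list of power values per time frame.
    frames_per_block: number of frames merged into one static segment.
    A shorter final block is averaged over the frames it contains.
    """
    if frames_per_block < 1:
        raise ValueError("frames_per_block must be at least 1")
    result = []
    for band in power:
        row = []
        for start in range(0, len(band), frames_per_block):
            block = band[start:start + frames_per_block]
            mean = sum(block) / len(block)
            row.extend([mean] * len(block))
        result.append(row)
    return result


# Hypothetical single-band example with five frames.
example_mosaic = mosaicize([[1.0, 3.0, 5.0, 7.0, 9.0]], 2)
```

With a frame step of, say, 10 ms (an assumed value), a block of 4 frames would correspond to the 40 ms resolution mentioned in the abstract.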
Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

PubMed

Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

2018-05-01

Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
Speech adjustments for room acoustics and their effects on vocal effort

PubMed Central

Bottalico, Pasquale

2016-01-01

Objectives: The aims of the present study are: (1) to analyze the effects of the acoustical environment and the voice style on time dose (Dt_p) and fundamental frequency (mean fo and standard deviation std_fo), while taking into account the effect of short-term vocal fatigue; (2) to predict the self-reported vocal effort from the voice acoustical parameters. Methods: Ten male and ten female subjects were recorded while reading a text in normal and loud styles, in three rooms (anechoic, semi-reverberant and reverberant), with and without acrylic glass panels 0.5 m from the mouth, which increased external auditory feedback. Subjects quantified how much effort was required to speak in each condition on a visual analogue scale after each task. Results: (Aim 1) In the loud style, Dt_p, fo and std_fo increased. The Dt_p was higher in the reverberant room compared to the other two rooms. Both genders tended to increase fo in less reverberant environments, while a more monotonous speech was produced in rooms with greater reverberation. All three voice parameters increased with short-term vocal fatigue. (Aim 2) A model relating vocal effort to acoustic vocal parameters is proposed. The SPL (Sound Pressure Level) contributed to 66% of the variance explained by the model, followed by the fundamental frequency (30%) and the modulation in amplitude (4%). Conclusions: The results provide insight into how voice acoustical parameters can predict vocal effort. In particular, vocal effort increased when SPL and fo increased and when the amplitude voice modulation (std_ΔSPL) decreased. PMID:28029555
Age-Related Changes in Objective and Subjective Speech Perception in Complex Listening Environments

ERIC Educational Resources Information Center

Helfer, Karen S.; Merchant, Gabrielle R.; Wasiuk, Peter A.

2017-01-01

Purpose: A frequent complaint by older adults is difficulty communicating in challenging acoustic environments. The purpose of this work was to review and summarize information about how speech perception in complex listening situations changes across the adult age range. Method: This article provides a review of age-related changes in speech…
Speech pathologists' current practice with cognitive-communication assessment during post-traumatic amnesia: a survey.

PubMed

Steel, Joanne; Ferguson, Alison; Spencer, Elizabeth; Togher, Leanne

2013-01-01

To investigate speech pathologists' current practice with adults who are in post-traumatic amnesia (PTA). Speech pathologists with experience of adults in PTA were invited to take part in an online survey through Australian professional email/internet-based interest groups. Forty-five speech pathologists responded to the online survey. The majority of respondents (78%) reported using informal, observational assessment methods commencing at initial contact with people in PTA or when patients' level of alertness allowed and initiating formal assessment on emergence from PTA. Seven respondents (19%) reported undertaking no assessment during PTA. Clinicians described using a range of techniques to monitor cognitive-communication during PTA, including static, dynamic, functional and impairment-based methods. The study confirmed that speech pathologists have a key role in the multidisciplinary team caring for the person in PTA, especially with family education and facilitating interactions with the rehabilitation team and family. Decision-making around timing and means of assessment of cognitive-communication during PTA appeared primarily reliant on speech pathologists' professional experience and the culture of their workplace. The findings support the need for further research into the nature of cognitive-communication disorder and resolution over this period.
A voice-input voice-output communication aid for people with severe speech impairment.

PubMed

Hawley, Mark S; Cunningham, Stuart P; Green, Phil D; Enderby, Pam; Palmer, Rebecca; Sehgal, Siddharth; O'Neill, Peter

2013-01-01

A new form of augmentative and alternative communication (AAC) device for people with severe speech impairment, the voice-input voice-output communication aid (VIVOCA), is described. The VIVOCA recognizes the disordered speech of the user and builds messages, which are converted into synthetic speech. System development was carried out employing user-centered design and development methods, which identified and refined key requirements for the device. A novel methodology for building small vocabulary, speaker-dependent automatic speech recognizers with reduced amounts of training data was applied. Experiments showed that this method is successful in generating good recognition performance (mean accuracy 96%) on highly disordered speech, even when recognition perplexity is increased. The selected message-building technique traded off various factors including speed of message construction and range of available message outputs. The VIVOCA was evaluated in a field trial by individuals with moderate to severe dysarthria, which confirmed that they can make use of the device to produce intelligible speech output from disordered speech input. The trial highlighted some issues which limit the performance and usability of the device when applied in real usage situations, with mean recognition accuracy of 67% in these circumstances. These limitations will be addressed in future work.
Speech Acts across Cultures: Challenges to Communication in a Second Language. Studies on Language Acquisition, 11.

ERIC Educational Resources Information Center

Gass, Susan M., Ed.; Neu, Joyce, Ed.

Articles on speech acts and intercultural communication include: "Investigating the Production of Speech Act Sets" (Andrew Cohen); "Non-Native Refusals: A Methodological Perspective" (Noel Houck, Susan M. Gass); "Natural Speech Act Data versus Written Questionnaire Data: How Data Collection Method Affects Speech Act…
Coupled Research in Ocean Acoustics and Signal Processing for the Next Generation of Underwater Acoustic Communication Systems

DTIC Science & Technology

2016-08-05

JPAnalytics LLC CC: DCMA Boston DTIC Director, NRL Progress Report #8 Coupled Research in Ocean Acoustics and Signal Processing for the Next...Generation of Underwater Acoustic Communication Systems Principal Investigator’s Name: Dr. James Preisig Period Covered By Report: 1/20/2016 to 4/19/2016...Technical work this period has spanned two areas. The first of these is VHF Acoustics. During this time period, the Principal Investigator worked with Dr
Key considerations in designing a speech brain-computer interface.

PubMed

Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Chabardès, Stéphan; Yvert, Blaise

2016-11-01

Restoring communication in case of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned to allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not exist yet and their design should consider three key choices that need to be made: the choice of appropriate brain regions to record neural activity from, the choice of an appropriate recording technique, and the choice of a neural decoding scheme in association with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic or articulatory) envisioned to be decoded at the core of a speech BCI paradigm. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior

PubMed Central

2018-01-01

acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners’ abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication. PMID:28938250
Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior.

PubMed

Peelle, Jonathan E

acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners' abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication.
High-frequency neural activity predicts word parsing in ambiguous speech streams

PubMed Central

Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

2016-01-01

During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. PMID:27605528
Cochlear Implant Microphone Location Affects Speech Recognition in Diffuse Noise

PubMed Central

Kolberg, Elizabeth R.; Sheffield, Sterling W.; Davis, Timothy J.; Sunderhaus, Linsey W.; Gifford, René H.

2015-01-01

Background: Despite improvements in cochlear implants (CIs), CI recipients continue to experience significant communicative difficulty in background noise. Many potential solutions have been proposed to help increase signal-to-noise ratio in noisy environments, including signal processing and external accessories. To date, however, the effect of microphone location on speech recognition in noise has focused primarily on hearing aid users. Purpose: The purpose of this study was to (1) measure physical output for the T-Mic as compared with the integrated behind-the-ear (BTE) processor mic for various source azimuths, and (2) to investigate the effect of CI processor mic location for speech recognition in semi-diffuse noise with speech originating from various source azimuths as encountered in everyday communicative environments. Research Design: A repeated-measures, within-participant design was used to compare performance across listening conditions. Study Sample: A total of 11 adults with Advanced Bionics CIs were recruited for this study. Data Collection and Analysis: Physical acoustic output was measured on a Knowles Experimental Mannequin for Acoustic Research (KEMAR) for the T-Mic and BTE mic, with broadband noise presented at 0 and 90° (directed toward the implant processor). In addition to physical acoustic measurements, we also assessed recognition of sentences constructed by researchers at Texas Instruments, the Massachusetts Institute of Technology, and the Stanford Research Institute (TIMIT sentences) at 60 dBA for speech source azimuths of 0, 90, and 270°. Sentences were presented in a semi-diffuse restaurant noise originating from the R-SPACE 8-loudspeaker array. Signal-to-noise ratio was determined individually to achieve approximately 50% correct in the unilateral implanted listening condition with speech at 0°. Performance was compared across the T-Mic, 50/50, and the integrated BTE processor mic. Results: The integrated BTE mic provided approximately 5
Cochlear implant microphone location affects speech recognition in diffuse noise.

PubMed

Kolberg, Elizabeth R; Sheffield, Sterling W; Davis, Timothy J; Sunderhaus, Linsey W; Gifford, René H

2015-01-01

Despite improvements in cochlear implants (CIs), CI recipients continue to experience significant communicative difficulty in background noise. Many potential solutions have been proposed to help increase signal-to-noise ratio in noisy environments, including signal processing and external accessories. To date, however, the effect of microphone location on speech recognition in noise has focused primarily on hearing aid users. The purpose of this study was to (1) measure physical output for the T-Mic as compared with the integrated behind-the-ear (BTE) processor mic for various source azimuths, and (2) to investigate the effect of CI processor mic location for speech recognition in semi-diffuse noise with speech originating from various source azimuths as encountered in everyday communicative environments. A repeated-measures, within-participant design was used to compare performance across listening conditions. A total of 11 adults with Advanced Bionics CIs were recruited for this study. Physical acoustic output was measured on a Knowles Experimental Mannequin for Acoustic Research (KEMAR) for the T-Mic and BTE mic, with broadband noise presented at 0 and 90° (directed toward the implant processor). In addition to physical acoustic measurements, we also assessed recognition of sentences constructed by researchers at Texas Instruments, the Massachusetts Institute of Technology, and the Stanford Research Institute (TIMIT sentences) at 60 dBA for speech source azimuths of 0, 90, and 270°. Sentences were presented in a semi-diffuse restaurant noise originating from the R-SPACE 8-loudspeaker array. Signal-to-noise ratio was determined individually to achieve approximately 50% correct in the unilateral implanted listening condition with speech at 0°. Performance was compared across the T-Mic, 50/50, and the integrated BTE processor mic. The integrated BTE mic provided approximately 5 dB attenuation from 1500-4500 Hz for signals presented at 0° as compared with 90°
Using others' words: conversational use of reported speech by individuals with aphasia and their communication partners.

PubMed

Hengst, Julie A; Frame, Simone R; Neuman-Stritzel, Tiffany; Gannaway, Rachel

2005-02-01

Reported speech, wherein one quotes or paraphrases the speech of another, has been studied extensively as a set of linguistic and discourse practices. Researchers agree that reported speech is pervasive, found across languages, and used in diverse contexts. However, to date, there have been no studies of the use of reported speech among individuals with aphasia. Grounded in an interactional sociolinguistic perspective, the study presented here documents and analyzes the use of reported speech by 7 adults with mild to moderately severe aphasia and their routine communication partners. Each of the 7 pairs was videotaped in 4 everyday activities at home or around the community, yielding over 27 hr of conversational interaction for analysis. A coding scheme was developed that identified 5 types of explicitly marked reported speech: direct, indirect, projected, indexed, and undecided. Analysis of the data documented reported speech as a common discourse practice used successfully by the individuals with aphasia and their communication partners. All participants produced reported speech at least once, and across all observations the target pairs produced 400 reported speech episodes (RSEs), 149 by individuals with aphasia and 251 by their communication partners. For all participants, direct and indirect forms were the most prevalent (70% of RSEs). Situated discourse analysis of specific episodes of reported speech used by 3 of the pairs provides detailed portraits of the diverse interactional, referential, social, and discourse functions of reported speech and explores ways that the pairs used reported speech to successfully frame talk despite their ongoing management of aphasia.
Designing the Speech Communication Classroom: A Viable Alternative.

ERIC Educational Resources Information Center

Springhorn, Ron G.

This paper presents a structure for the speech communication classroom, based on a philosophically existential approach to education. The following suggestions are offered to those considering such an approach. There should be movable furniture, enabling students to move about and to turn toward one another so that they can be physically in…
Telephone Communication for the Deaf: Speech Indicator Manual.

ERIC Educational Resources Information Center

Jones, Ray L.

The instructional manual is designed to accompany the Speech Indicator, a small, portable, economical ($15) device for deaf persons for telephone communication (available from Leadership Training Program in the Area of the Deaf, San Fernando State College). The device indicates when the other party speaks, not what he says. A topic outline and…
Musical melody and speech intonation: singing a different tune.

PubMed

Zatorre, Robert J; Baum, Shari R

2012-01-01

Music and speech are often cited as characteristically human forms of communication. Both share the features of hierarchical structure, complex sound systems, and sensorimotor sequencing demands, and both are used to convey and influence emotions, among other functions [1]. Both music and speech also prominently use acoustical frequency modulations, perceived as variations in pitch, as part of their communicative repertoire. Given these similarities, and the fact that pitch perception and production involve the same peripheral transduction system (cochlea) and the same production mechanism (vocal tract), it might be natural to assume that pitch processing in speech and music would also depend on the same underlying cognitive and neural mechanisms. In this essay we argue that the processing of pitch information differs significantly for speech and music; specifically, we suggest that there are two pitch-related processing systems, one for more coarse-grained, approximate analysis and one for more fine-grained accurate representation, and that the latter is unique to music. More broadly, this dissociation offers clues about the interface between sensory and motor systems, and highlights the idea that multiple processing streams are a ubiquitous feature of neuro-cognitive architectures.
A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals with Parkinson's Disease

ERIC Educational Resources Information Center

Kim, Yunjung; Choi, Yaelin

2017-01-01

Purpose: The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method: A total of 48 speakers from the 4 speaker groups (AE speakers with…
Experimental studies of applications of time-reversal acoustics to noncoherent underwater communications.

PubMed

Heinemann, M; Larraza, A; Smith, K B

2003-06-01

The most difficult problem in shallow underwater acoustic communications is considered to be the time-varying multipath propagation because it impacts negatively on data rates. At high data rates the intersymbol interference requires adaptive algorithms on the receiver side that lead to computationally intensive and complex signal processing. A novel technique called time-reversal acoustics (TRA) can environmentally adapt the acoustic propagation effects of a complex medium in order to focus energy at a particular target range and depth. Using TRA, the multipath structure is reduced because all the propagation paths add coherently at the intended target location. This property of time-reversal acoustics suggests a potential application in the field of noncoherent acoustic communications. This work presents results of a tank scale experiment using an algorithm for rapid transmission of binary data in a complex underwater environment with the TRA approach. A simple 15-symbol code provides an example of the simplicity and feasibility of the approach. Covert coding due to the inherent scrambling induced by the environment at points other than the intended receiver is also investigated. The experiments described suggest a high potential in data rate for the time-reversal approach in underwater acoustic communications while keeping the computational complexity low.
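The focusing property described in this abstract can be illustrated with a toy discrete multipath channel. This is a minimal sketch, not the authors' tank experiment: the tap values in `h` are hypothetical, and the function names are introduced here for illustration. Sending the time-reversed impulse response back through the same channel yields the channel's autocorrelation, whose central peak is where all propagation paths add coherently.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def time_reversal_focus(h):
    """Send the time-reversed channel response back through the same channel."""
    return convolve(h[::-1], h)


# Hypothetical multipath channel: a direct path plus three echoes.
h = [1.0, 0.0, 0.6, 0.0, 0.0, -0.4, 0.3]

focused = time_reversal_focus(h)

# The strongest sample is the coherent sum of all path energies.
peak_index = max(range(len(focused)), key=lambda k: abs(focused[k]))
peak = focused[peak_index]

# Largest residual multipath contribution away from the focus.
sidelobe = max(abs(v) for k, v in enumerate(focused) if k != peak_index)
```

In this toy case the peak equals the sum of squared tap amplitudes, while the residual sidelobes are cross-products of different taps and stay smaller, which is the sense in which the multipath structure is reduced at the focus.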

Contributions of local speech encoding and functional connectivity to audio-visual speech perception

PubMed Central

Giordano, Bruno L; Ince, Robin A A; Gross, Joachim; Schyns, Philippe G; Panzeri, Stefano; Kayser, Christoph

2017-01-01

Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR, strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments. DOI: http://dx.doi.org/10.7554/eLife.24763.001 PMID:28590903
Infant-Mother Acoustic-Prosodic Alignment and Developmental Risk.

PubMed

Seidl, Amanda; Cristia, Alejandrina; Soderstrom, Melanie; Ko, Eon-Suk; Abel, Emily A; Kellerman, Ashleigh; Schwichtenberg, A J

2018-06-19

One promising early marker for autism and other communicative and language disorders is early infant speech production. Here we used daylong recordings of high- and low-risk infant-mother dyads to examine whether acoustic-prosodic alignment as well as two automated measures of infant vocalization are related to developmental risk status indexed via familial risk and developmental progress at 36 months of age. Automated analyses of the acoustics of daylong real-world interactions were used to examine whether pitch characteristics of one vocalization by the mother or the child predicted those of the vocalization response by the other speaker and whether other features of infants' speech in daylong recordings were associated with developmental risk status or outcomes. Low-risk and high-risk dyads did not differ in the level of acoustic-prosodic alignment, which was overall not significant. Further analyses revealed that acoustic-prosodic alignment did not predict infants' later developmental progress, which was, however, associated with two automated measures of infant vocalizations (daily vocalizations and conversational turns). Although further research is needed, these findings suggest that automated measures of vocalizations drawn from daylong recordings are a possible early identification tool for later developmental progress/concerns. https://osf.io/cdn3v/.
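One simple way to ask whether the pitch of one vocalization predicts the pitch of the response is a correlation across vocalization-response pairs. The sketch below is illustrative only: the abstract does not state which statistical model the authors used, Pearson correlation is an assumption made here, and the pitch values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical mean pitch (Hz) of each maternal vocalization and of the
# infant vocalization that followed it.
mother_f0 = [210.0, 250.0, 230.0, 270.0, 220.0, 240.0]
infant_f0 = [380.0, 360.0, 400.0, 370.0, 390.0, 365.0]

alignment = pearson_r(mother_f0, infant_f0)
```

A value of `alignment` near zero would correspond to the absence of alignment reported above; a value near +1 would mean the responder closely tracks the pitch of the preceding vocalization.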
The aprosody of schizophrenia: Computationally derived acoustic phonetic underpinnings of monotone speech.

PubMed

Compton, Michael T; Lunden, Anya; Cleary, Sean D; Pauselli, Luca; Alolayan, Yazeed; Halpern, Brooke; Broussard, Beth; Crisafio, Anthony; Capulong, Leslie; Balducci, Pierfrancesco Maria; Bernardini, Francesco; Covington, Michael A

2018-02-12

Acoustic phonetic methods are useful in examining some symptoms of schizophrenia; we used such methods to understand the underpinnings of aprosody. We hypothesized that, compared to controls and patients without clinically rated aprosody, patients with aprosody would exhibit reduced variability in: pitch (F0), jaw/mouth opening and tongue height (formant F1), tongue front/back position and/or lip rounding (formant F2), and intensity/loudness. Audiorecorded speech was obtained from 98 patients (including 25 with clinically rated aprosody and 29 without) and 102 unaffected controls using five tasks: one describing a drawing, two based on spontaneous speech elicited through a question (Tasks 2 and 3), and two based on reading prose excerpts (Tasks 4 and 5). We compared groups on variation in pitch (F0), formant F1 and F2, and intensity/loudness. Regarding pitch variation, patients with aprosody differed significantly from controls in Task 5 in both unadjusted tests and those adjusted for sociodemographics. For the standard deviation (SD) of F1, no significant differences were found in adjusted tests. Regarding SD of F2, patients with aprosody had lower values than controls in Tasks 3, 4, and 5. For variation in intensity/loudness, patients with aprosody had lower values than patients without aprosody and controls across the five tasks. Findings could represent a step toward developing new methods for measuring and tracking the severity of this specific negative symptom using acoustic phonetic parameters; such work is relevant to other psychiatric and neurological disorders. Copyright © 2018 Elsevier B.V. All rights reserved.
Multi-channel spatial auditory display for speech communications

NASA Astrophysics Data System (ADS)

Begault, Durand; Erbe, Tom

1993-10-01

A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
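The abstract mentions an adaptive staircase for finding intelligibility levels but gives no details of the rule. As a rough sketch only, the code below assumes a simple 1-up/1-down rule with a fixed 2 dB step and estimates the threshold as the mean level at the reversals; the rule, step size, starting level, and the idealized listener are all hypothetical.

```python
def run_staircase(respond, start_db=0.0, step_db=2.0, n_reversals=6):
    """1-up/1-down staircase; returns the mean level at the reversals.

    `respond(level_db)` returns True when the call sign was identified.
    """
    level = start_db
    last_direction = 0
    reversals = []
    while len(reversals) < n_reversals:
        # Make the task harder after a correct response, easier after an error.
        direction = -1 if respond(level) else 1
        if last_direction and direction != last_direction:
            reversals.append(level)
        last_direction = direction
        level += direction * step_db
    return sum(reversals) / len(reversals)


def ideal_listener(level_db, threshold_db=-9.0):
    """Hypothetical listener who is correct whenever the level is at or above threshold."""
    return level_db >= threshold_db


estimate_db = run_staircase(ideal_listener)
```

Running the same procedure once per azimuth and comparing the resulting thresholds against the non-spatialized condition is one way an improvement in dB, such as the one reported above, could be expressed.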
Multi-channel spatial auditory display for speech communications

NASA Technical Reports Server (NTRS)

Begault, Durand; Erbe, Tom

1993-01-01

A spatial auditory display for multiple speech communications was developed at NASA-Ames Research Center. Input is spatialized by use of simplified head-related transfer functions, adapted for FIR filtering on Motorola 56001 digital signal processors. Hardware and firmware design implementations are overviewed for the initial prototype developed for NASA-Kennedy Space Center. An adaptive staircase method was used to determine intelligibility levels of four letter call signs used by launch personnel at NASA, against diotic speech babble. Spatial positions at 30 deg azimuth increments were evaluated. The results from eight subjects showed a maximal intelligibility improvement of about 6 to 7 dB when the signal was spatialized to 60 deg or 90 deg azimuth positions.
Transient Auditory Storage of Acoustic Details Is Associated with Release of Speech from Informational Masking in Reverberant Conditions

ERIC Educational Resources Information Center

Huang, Ying; Huang, Qiang; Chen, Xun; Wu, Xihong; Li, Liang

2009-01-01

Perceptual integration of the sound directly emanating from the source with reflections needs both temporal storage and correlation computation of acoustic details. We examined whether the temporal storage is frequency dependent and associated with speech unmasking. In Experiment 1, a break in correlation (BIC) between interaurally correlated…
MURI: Impact of Oceanographic Variability on Acoustic Communications

DTIC Science & Technology

2012-09-30

ACSSC.2010.5757934 (2010). [published] [50] K. Tu, T.M. Duman, J.G. Proakis, and M. Stojanovic, “Cooperative MIMO-OFDM communications: Receiver...considered across bands of frequencies in the range 1-50 kHz. Multiple source and receiver cases (MIMO) will be of particular interest. Validating...Parabolic Equation (PE) acoustic models. Communication receiver design has included processors for orthogonal frequency division multiplexing (OFDM
Classroom Acoustics. IssueTrak: A CEFPI Brief on Educational Facility Issues.

ERIC Educational Resources Information Center

Erdreich, John

This report examines the problem of acoustic inadequacy in the classroom, how it affects students and teachers, and possible solutions. It explains how to predict classroom adequacy for communication by assessing the level of speech in competition with other noise, and the level of that competing noise itself in terms of reverberation that allows…
Room Acoustics

NASA Astrophysics Data System (ADS)

Kuttruff, Heinrich; Mommertz, Eckard

The traditional task of room acoustics is to create or formulate conditions which ensure the best possible propagation of sound in a room from a sound source to a listener. Thus, objects of room acoustics are in particular assembly halls of all kinds, such as auditoria and lecture halls, conference rooms, theaters, concert halls or churches. Already at this point, it has to be pointed out that these conditions essentially depend on the question if speech or music should be transmitted; in the first case, the criterion for transmission quality is good speech intelligibility, in the other case, however, the success of room-acoustical efforts depends on other factors that cannot be quantified that easily, not least it also depends on the hearing habits of the listeners. In any case, absolutely "good acoustics" of a room do not exist.
High-frequency neural activity predicts word parsing in ambiguous speech streams.

PubMed

Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

2016-12-01

During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.
Award 1 Title: Acoustic Communications 2011 Experiment: Deployment Support and Post Experiment Data Handling and Analysis. Award 2 Title: Exploiting Structured Dependencies in the Design of Adaptive Algorithms for Underwater Communication. Award 3 Title: Coupled Research in Ocean Acoustics and Signal Processing for the Next Generation of Underwater Acoustic Communication Systems

DTIC Science & Technology

2015-09-30

Wireless Networks (WUWNet’14), Rome, Italy, Nov. 12-14, 2014. J. Preisig, “Underwater Acoustic Communications: Enabling the Next Generation at the...on Wireless Communication. M. Pajovic, J. Preisig, “Performance Analytics and Optimal Design of Multichannel Equalizers for Underwater Acoustic Communications”, to appear in IEEE Journal of Oceanic Engineering. ...Exploiting Structured Dependencies in the Design of Adaptive Algorithms for Underwater Communication Award #3
Speech Adjustments for Room Acoustics and Their Effects on Vocal Effort.

PubMed

Bottalico, Pasquale

2017-05-01

The aims of the present study are (1) to analyze the effects of the acoustical environment and the voice style on time dose (Dt_p) and fundamental frequency (mean f0 and standard deviation std_f0) while taking into account the effect of short-term vocal fatigue and (2) to predict the self-reported vocal effort from the voice acoustical parameters. Ten male and ten female subjects were recorded while reading a text in normal and loud styles, in three rooms (anechoic, semi-reverberant, and reverberant), with and without acrylic glass panels 0.5 m from the mouth, which increased external auditory feedback. Subjects quantified how much effort was required to speak in each condition on a visual analogue scale after each task. (Aim 1) In the loud style, Dt_p, f0, and std_f0 increased. The Dt_p was higher in the reverberant room compared to the other two rooms. Both genders tended to increase f0 in less reverberant environments, whereas more monotonous speech was produced in rooms with greater reverberation. All three voice parameters increased with short-term vocal fatigue. (Aim 2) A model relating vocal effort to acoustic vocal parameters is proposed. The sound pressure level contributed 66% of the variance explained by the model, followed by f0 (30%) and the modulation in amplitude (4%). The results provide insight into how voice acoustical parameters can predict vocal effort. In particular, vocal effort increased when SPL and f0 increased and when the amplitude voice modulation decreased. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
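The three voice parameters analyzed in this study can be computed from a frame-wise fundamental-frequency track. The sketch below makes several assumptions not stated in the abstract: time dose is taken as the percentage of frames that are voiced, unvoiced frames are marked with 0, the population standard deviation is used, and the track values are hypothetical.

```python
from statistics import mean, pstdev


def voice_parameters(f0_track_hz):
    """Return (time dose in %, mean f0 in Hz, standard deviation of f0 in Hz).

    Frames with f0 == 0 are treated as unvoiced (an assumption of this sketch).
    """
    voiced = [f for f in f0_track_hz if f > 0]
    time_dose_pct = 100.0 * len(voiced) / len(f0_track_hz)
    return time_dose_pct, mean(voiced), pstdev(voiced)


# Hypothetical f0 track, one value per analysis frame.
track = [0.0, 118.0, 120.0, 122.0, 0.0, 0.0, 120.0, 0.0]

time_dose_pct, mean_f0, std_f0 = voice_parameters(track)
```

A lower `std_f0` for the same reading passage corresponds to the more monotonous speech the abstract reports for rooms with greater reverberation.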
The Figures of Speech, "Ethos," and Aristotle: Notes toward a Rhetoric of Business Communication.

ERIC Educational Resources Information Center

Kallendorf, Craig; Kallendorf, Carol

1985-01-01

Demonstrates that business writers rely far more heavily than expected on classical figures of speech. Uses Aristotle's "Rhetoric" to show that figures of speech offer a powerful tool for the persuasive function of modern business communication. (PD)
Predicting couple therapy outcomes based on speech acoustic features

PubMed Central

Nasir, Md; Baucom, Brian Robert; Narayanan, Shrikanth

2017-01-01

Automated assessment and prediction of marital outcome in couples therapy is a challenging task but promises to be a potentially useful tool for clinical psychologists. Computational approaches for inferring therapy outcomes using observable behavioral information obtained from conversations between spouses offer objective means for understanding relationship dynamics. In this work, we explore whether the acoustics of the spoken interactions of clinically distressed spouses provide information towards assessment of therapy outcomes. The therapy outcome prediction task in this work includes detecting whether there was a relationship improvement or not (posed as a binary classification) as well as discerning varying levels of improvement or decline in the relationship status (posed as a multiclass recognition task). We use each interlocutor’s acoustic speech signal characteristics such as vocal intonation and intensity, both independently and in relation to one another, as cues for predicting the therapy outcome. We also compare prediction performance with one obtained via standardized behavioral codes characterizing the relationship dynamics provided by human experts as features for automated classification. Our experiments, using data from a longitudinal clinical study of couples in distressed relations, showed that predictions of relationship outcomes obtained directly from vocal acoustics are comparable or superior to those obtained using human-rated behavioral codes as prediction features. In addition, combining direct signal-derived features with manually coded behavioral features improved the prediction performance in most cases, indicating the complementarity of relevant information captured by humans and machine algorithms. Additionally, considering the vocal properties of the interlocutors in relation to one another, rather than in isolation, showed to be important for improving the automatic prediction. This finding supports the notion that behavioral
[Perception of emotional intonation of noisy speech signal with different acoustic parameters by adults of different age and gender].

PubMed

Dmitrieva, E S; Gel'man, V Ia

2011-01-01

The listener-distinctive features of recognition of different emotional intonations (positive, negative and neutral) of male and female speakers in the presence or absence of background noise were studied in 49 adults aged 20-79 years. In all the listeners noise produced the most pronounced decrease in recognition accuracy for positive emotional intonation ("joy") as compared to other intonations, whereas it did not influence the recognition accuracy of "anger" in 65-79-year-old listeners. The higher emotion recognition rates of a noisy signal were observed for speech emotional intonations expressed by female speakers. Acoustic characteristics of noisy and clear speech signals underlying perception of speech emotional prosody were found for adult listeners of different age and gender.
Acoustic diagnosis of pulmonary hypertension: automated speech-recognition-inspired classification algorithm outperforms physicians

NASA Astrophysics Data System (ADS)

Kaddoura, Tarek; Vadlamudi, Karunakar; Kumar, Shine; Bobhate, Prashant; Guo, Long; Jain, Shreepal; Elgendi, Mohamed; Coe, James Y.; Kim, Daniel; Taylor, Dylan; Tymchak, Wayne; Schuurmans, Dale; Zemp, Roger J.; Adatia, Ian

2016-09-01

We hypothesized that an automated speech-recognition-inspired classification algorithm could differentiate between the heart sounds in subjects with and without pulmonary hypertension (PH) and outperform physicians. Heart sounds, electrocardiograms, and mean pulmonary artery pressures (mPAp) were recorded simultaneously. Heart sound recordings were digitized to train and test speech-recognition-inspired classification algorithms. We used mel-frequency cepstral coefficients to extract features from the heart sounds. Gaussian-mixture models classified the features as PH (mPAp ≥ 25 mmHg) or normal (mPAp < 25 mmHg). Physicians blinded to patient data listened to the same heart sound recordings and attempted a diagnosis. We studied 164 subjects: 86 with mPAp ≥ 25 mmHg (mPAp 41 ± 12 mmHg) and 78 with mPAp < 25 mmHg (mPAp 17 ± 5 mmHg) (p < 0.005). The correct diagnostic rate of the automated speech-recognition-inspired algorithm was 74% compared to 56% by physicians (p = 0.005). The false positive rate was 34% for the algorithm versus 50% for physicians (p = 0.04). The false negative rate was 23% for the algorithm versus 68% for physicians (p = 0.0002). We developed an automated speech-recognition-inspired classification algorithm for the acoustic diagnosis of PH that outperforms physicians and could be used to screen for PH and encourage earlier specialist referral.
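Mel-frequency cepstral coefficients start from a filterbank whose center frequencies are equally spaced on the mel scale. The sketch below shows only that first step, using the widely used formula mel = 2595 log10(1 + f/700); the abstract does not give the authors' MFCC settings, so the number of filters and the 20-1000 Hz band chosen here are hypothetical.

```python
from math import log10


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10(1 + f/700) form)."""
    return 2595.0 * log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_centers(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_filters bands equally spaced in mels."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]


# Hypothetical analysis band for heart sounds.
centers = mel_centers(n_filters=10, f_min_hz=20.0, f_max_hz=1000.0)
```

Because the mel scale compresses high frequencies, the spacing between successive centers grows in Hz, giving finer resolution at the low frequencies where heart sounds carry most of their energy.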
Analysis of Acoustic Features in Speakers with Cognitive Disorders and Speech Impairments

NASA Astrophysics Data System (ADS)

Saz, Oscar; Simón, Javier; Rodríguez, W. Ricardo; Lleida, Eduardo; Vaquero, Carlos

2009-12-01

This work presents the results of an analysis of the acoustic features (formants and the three suprasegmental features: tone, intensity and duration) of vowel production in a group of 14 young speakers with different kinds of speech impairments due to physical and cognitive disorders. A corpus of unimpaired children's speech is used to determine the reference values for these features in speakers without any kind of speech impairment within the same domain as the impaired speakers, that is, 57 isolated words. The signal processing to extract the formant and pitch values is based on a Linear Prediction Coefficients (LPC) analysis of the segments considered as vowels in a Hidden Markov Model (HMM) based Viterbi forced alignment. Intensity and duration are also based on the outcome of the automated segmentation. The main conclusion of the work is that the intelligibility of vowel production is lowered in impaired speakers even when the vowel is perceived as correct by human labelers. The decrease in intelligibility is due to a 30% increase in confusability in the formant map, a 50% reduction in the discriminative power of energy between stressed and unstressed vowels, and a 50% increase in the standard deviation of vowel length. On the other hand, impaired speakers keep good control of tone in the production of stressed and unstressed vowels.
Sexual Hearing: The influence of sex hormones on acoustic communication in frogs

PubMed Central

Arch, Victoria S.; Narins, Peter M.

2009-01-01

The majority of anuran amphibians (frogs and toads) use acoustic communication to mediate sexual behavior and reproduction. Generally, females find and select their mates using acoustic cues provided by males in the form of conspicuous advertisement calls. In these species, vocal signal production and reception are intimately tied to successful reproduction. Research with anurans has demonstrated that acoustic communication is modulated by reproductive hormones, including gonadal steroids and peptide neuromodulators. Most of these studies have focused on the ways in which hormonal systems influence vocal signal production; however, here we will concentrate on a growing body of literature that examines hormonal modulation of call reception. This literature suggests that reproductive hormones contribute to the coordination of reproductive behaviors between signaler and receiver by modulating sensitivity and spectral filtering of the anuran auditory system. It has become evident that the hormonal systems that influence reproductive behaviors are highly conserved among vertebrate taxa, thus studying the endocrine and neuromodulatory bases of acoustic communication in frogs and toads can lead to insights of broader applicability to hormonal modulation of vertebrate sensory physiology and behavior. PMID:19272318
The Human Voice in Speech and Singing

NASA Astrophysics Data System (ADS)

Lindblom, Björn; Sundberg, Johan

This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three-stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this airstream to an intermittent or pulsating airstream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the airstream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter and how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7. The impressive control
A novel probabilistic framework for event-based speech recognition

NASA Astrophysics Data System (ADS)

Juneja, Amit; Espy-Wilson, Carol

2003-10-01

One of the reasons for the unsatisfactory performance of state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR is proposed, based on the idea of representing speech sounds by bundles of binary-valued articulatory phonetic features. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features (syllabic, sonorant and continuant) and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.

Effects of programming threshold and maplaw settings on acoustic thresholds and speech discrimination with the MED-EL COMBI 40+ cochlear implant.

PubMed

Boyd, Paul J

2006-12-01

The principal task in the programming of a cochlear implant (CI) speech processor is the setting of the electrical dynamic range (output) for each electrode, to ensure that a comfortable loudness percept is obtained for a range of input levels. This typically involves separate psychophysical measurement of electrical threshold (θe) and upper tolerance levels using short current bursts generated by the fitting software. Anecdotal clinical experience and some experimental studies suggest that the measurement of θe is relatively unimportant and that the setting of upper tolerance limits is more critical for processor programming. The present study aims to test this hypothesis and examines in detail how acoustic thresholds and speech recognition are affected by setting of the lower limit of the output ("Programming threshold" or "PT") to understand better the influence of this parameter and how it interacts with certain other programming parameters. Test programs (maps) were generated with PT set to artificially high and low values and tested on users of the MED-EL COMBI 40+ CI system. Acoustic thresholds and speech recognition scores (sentence tests) were measured for each of the test maps. Acoustic thresholds were also measured using maps with a range of output compression functions ("maplaws"). In addition, subjective reports were recorded regarding the presence of "background threshold stimulation" which is occasionally reported by CI users if PT is set to relatively high values when using the CIS strategy. Manipulation of PT was found to have very little effect. Setting PT to minimum produced a mean 5 dB (S.D. = 6.25) increase in acoustic thresholds, relative to thresholds with PT set normally, and had no statistically significant effect on speech recognition scores on a sentence test. On the other hand, maplaw setting was found to have a significant effect on acoustic thresholds (raised as maplaw is made more linear), which provides some theoretical
Selected Speeches on Obscenity by Federal Communications Commission Chairman Dean Burch, 1969-74.

ERIC Educational Resources Information Center

Hartenberger, Karen Schmidt

This study is a descriptive/historical account focusing on the obscenity issue and the selected manuscript speeches of Dean Burch while he served as chairman of the Federal Communications Commission (FCC) from October 1969 to March 1974. Research centers on the speaker and the specific manuscript speeches, considering the timeliness and…
Lexico-semantic and acoustic-phonetic processes in the perception of noise-vocoded speech: implications for cochlear implantation

PubMed Central

McGettigan, Carolyn; Rosen, Stuart; Scott, Sophie K.

2014-01-01

Noise-vocoding is a transformation which, when applied to speech, severely reduces spectral resolution and eliminates periodicity, yielding a stimulus that sounds “like a harsh whisper” (Scott et al., 2000, p. 2401). This process simulates a cochlear implant, where the activity of many thousand hair cells in the inner ear is replaced by direct stimulation of the auditory nerve by a small number of tonotopically-arranged electrodes. Although a cochlear implant offers a powerful means of restoring some degree of hearing to profoundly deaf individuals, the outcomes for spoken communication are highly variable (Moore and Shannon, 2009). Some variability may arise from differences in peripheral representation (e.g., the degree of residual nerve survival) but some may reflect differences in higher-order linguistic processing. To investigate this possibility, we used noise-vocoding to explore speech recognition and perceptual learning in normal-hearing listeners tested across several levels of the linguistic hierarchy: segments (consonants and vowels), single words, and sentences. Listeners improved significantly on all tasks across two test sessions. In the first session, individual differences analyses revealed two independently varying sources of variability: one lexico-semantic in nature and implicating the recognition of words and sentences, and the other an acoustic-phonetic factor associated with words and segments. However, consequent to learning, by the second session there was a more uniform covariance pattern concerning all stimulus types. A further analysis of phonetic feature recognition allowed greater insight into learning-related changes in perception and showed that, surprisingly, participants did not make full use of cues that were preserved in the stimuli (e.g., vowel duration). We discuss these findings in relation to cochlear implantation, and suggest auditory training strategies to maximize speech recognition performance in the absence of
Telephone speech comprehension in children with multichannel cochlear implants.

PubMed

Aronson, L; Estienne, P; Arauz, S L; Pallante, S A

1997-11-01

Telephone speech comprehension is being evaluated in six prelingually deaf children implanted with the Nucleus 22 prosthesis fitted with the Speak strategy. All of them have had at least 1.5 years of experience with their implant. When the tests began, they had already had at least 2 months' experience with the same map in their speech processor. The children were trained in the use of the telephone as part of the rehabilitation program. None of them used it regularly but as a game that they found very entertaining. A special battery, the Bate-fon (batería para teléfono = telephone battery), was designed for training and evaluation purposes. It includes the five Spanish vowels in isolation, diphthongs, onomatopoetic animal voices, two-syllable, and three-syllable words. The tests were administered 1.5-2 years after the switch-on of their speech processor. Standard acoustic telephone coupling was used. The speech material was presented to the child on colored cards. Stimuli were presented twice. Children were informed when the response was incorrect. Averaged results indicated that the percentages of correct responses for all the speech material increase in the second presentation. All children have shown some degree of telephone communication abilities. As a result of the training, some of the children are using the telephone to communicate with their families.
Speech perception with combined electric-acoustic stimulation and bilateral cochlear implants in a multisource noise field.

PubMed

Rader, Tobias; Fastl, Hugo; Baumann, Uwe

2013-01-01

The aim of the study was to measure and compare speech perception in users of electric-acoustic stimulation (EAS) supported by a hearing aid in the unimplanted ear and in bilateral cochlear implant (CI) users under different noise and sound field conditions. Gap listening was assessed by comparing performance in unmodulated and modulated Comité Consultatif International Téléphonique et Télégraphique (CCITT) noise conditions, and binaural interaction was investigated by comparing single source and multisource sound fields. Speech perception in noise was measured using a closed-set sentence test (Oldenburg Sentence Test, OLSA) in a multisource noise field (MSNF) consisting of a four-loudspeaker array with independent noise sources and a single source in frontal position (S0N0). Speech simulating noise (Fastl-noise), CCITT-noise (continuous), and OLSA-noise (pseudo continuous) served as noise sources with different temporal patterns. Speech tests were performed in two groups of subjects who were using either EAS (n = 12) or bilateral CIs (n = 10). All subjects in the EAS group were fitted with a high-power hearing aid in the opposite ear (bimodal EAS). The average group score on monosyllable in quiet was 68.8% (EAS) and 80.5% (bilateral CI). A group of 22 listeners with normal hearing served as controls to compare and evaluate potential gap listening effects in implanted patients. Average speech reception thresholds in the EAS group were significantly lower than those for the bilateral CI group in all test conditions (CCITT 6.1 dB, p = 0.001; Fastl-noise 5.4 dB, p < 0.01; Oldenburg-(OL)-noise 1.6 dB, p < 0.05). Bilateral CI and EAS user groups showed a significant improvement of 4.3 dB (p = 0.004) and 5.4 dB (p = 0.002) between S0N0 and MSNF sound field conditions respectively, which signifies advantages caused by bilateral interaction in both groups. Performance in the control group showed a significant gap listening effect with a difference of 6.5 dB between
Speech-Language Pathologists' Opinions on Communication Disorders and Violence

ERIC Educational Resources Information Center

Sanger, Dixie; Moore-Brown, Barbara J.; Montgomery, Judy; Hellerich, Susan

2004-01-01

Purpose: This study investigated the opinions of speech-language pathologists (SLPs) regarding their role, education, and training in serving students with communication disorders who have been involved in violence. Method: A survey consisting of 26 items was given to 598 SLPs from eight states representing geographic regions of the United…
Examining the relationship between speech intensity and self-rated communicative effectiveness in individuals with Parkinson's disease and hypophonia.

PubMed

Dykstra, Allyson D; Adams, Scott G; Jog, Mandar

2015-01-01

To examine the relationship between speech intensity and self-ratings of communicative effectiveness in speakers with Parkinson's disease (PD) and hypophonia. An additional purpose was to evaluate if self-ratings of communicative effectiveness made by participants with PD differed from ratings made by primary communication partners. Thirty participants with PD and 15 healthy older adults completed the Communication Effectiveness Survey. Thirty primary communication partners rated the communicative effectiveness of his/her partner with PD. Speech intensity was calculated for participants with PD and control participants based on conversational utterances. Results revealed significant differences between groups in conversational speech intensity (p=.001). Participants with PD self-rated communicative effectiveness significantly lower than control participants (p=.000). Correlational analyses revealed a small but non-significant relationship between speech intensity and communicative effectiveness for participants with PD (r=0.298, p=.110) and control participants (r=0.327, p=.234). Self-ratings of communicative effectiveness made by participants with PD were not significantly different from ratings made by primary communication partners (p=.20). Obtaining information on communicative effectiveness may help to broaden outcome measurement and may aid in the provision of educational strategies. Findings also suggest that communicative effectiveness may be a separate and distinct construct that cannot necessarily be predicted from the severity of hypophonia. Copyright © 2015 Elsevier Inc. All rights reserved.
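The reported correlations can be sanity-checked with a few lines of arithmetic. The sketch below converts a Pearson r into a t statistic and a two-tailed p-value. It assumes that each correlation was computed over the full group (n = 30 for PD, n = 15 for controls), which the abstract does not state explicitly; the function names are illustrative and not taken from the study.

```python
import math


def t_statistic(r, n):
    """t statistic for a Pearson correlation r observed on n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0          # probability mass between 0 and |t|
    return 1.0 - 2.0 * area


# Assumed group sizes: 30 participants with PD, 15 controls.
t_pd = t_statistic(0.298, 30)
p_pd = two_tailed_p(t_pd, df=28)
t_ctrl = t_statistic(0.327, 15)
p_ctrl = two_tailed_p(t_ctrl, df=13)
print(round(t_pd, 3), round(p_pd, 3), round(t_ctrl, 3), round(p_ctrl, 3))
```

Under those assumed group sizes the computed p-values should land close to the reported p=.110 and p=.234, which suggests the correlations were indeed run on the full groups.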
Peripheral facial palsy: Speech, communication and oral motor function.

PubMed

Movérare, T; Lohmander, A; Hultcrantz, M; Sjögreen, L

2017-02-01

The aim of the present study was to examine the effect of acquired unilateral peripheral facial palsy on speech, communication and oral functions and to study the relationship between the degree of facial palsy and articulation, saliva control, eating ability and lip force. In this descriptive study, 27 patients (15 men and 12 women, mean age 48 years) with unilateral peripheral facial palsy were included if they were graded under 70 on the Sunnybrook Facial Grading System. The assessment was carried out in connection with customary visits to the ENT Clinic and comprised lip force, articulation and intelligibility, together with perceived ability to communicate and ability to eat and control saliva conducted through self-response questionnaires. The patients with unilateral facial palsy had significantly lower lip force, poorer articulation and ability to eat and control saliva compared with reference data in healthy populations. The degree of facial palsy correlated significantly with lip force but not with articulation, intelligibility, perceived communication ability or reported ability to eat and control saliva. Acquired peripheral facial palsy may affect communication and the ability to eat and control saliva. Physicians should be aware that there is no direct correlation between the degree of facial palsy and the possible effect on communication, eating ability and saliva control. Physicians are therefore recommended to ask specific questions relating to problems with these functions during customary medical visits and offer possible intervention by a speech-language pathologist or a physiotherapist. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Advancing Underwater Acoustic Communication for Autonomous Distributed Networks via Sparse Channel Sensing, Coding, and Navigation Support

DTIC Science & Technology

2014-09-30

underwater acoustic communication technologies for autonomous distributed underwater networks, through innovative signal processing, coding, and...coding: 3) OFDM modulated dynamic coded cooperation in underwater acoustic channels; 3 Localization, Networking, and Testbed: 4) On-demand
A multimodal spectral approach to characterize rhythm in natural speech.

PubMed

Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

2016-01-01

Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.
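Magnitude-squared coherence, the quantity behind the EMG-acoustic analysis described above, can be sketched in plain Python. This is a minimal Welch-style estimate on synthetic data, not the authors' pipeline: the two signals, the shared 5 Hz rhythm, the phase lag, the sampling rate and the segment length are all assumptions chosen for illustration.

```python
import cmath
import math
import random


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def half_dft(x):
    """Direct DFT for bins 0..n/2 (slow, but dependency-free)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def coherence(x, y, seg_len):
    """Magnitude-squared coherence from non-overlapping Hann-windowed segments."""
    w = hann(seg_len)
    n_bins = seg_len // 2 + 1
    sxx = [0.0] * n_bins
    syy = [0.0] * n_bins
    sxy = [0j] * n_bins
    for start in range(0, len(x) - seg_len + 1, seg_len):
        fx = half_dft([w[i] * x[start + i] for i in range(seg_len)])
        fy = half_dft([w[i] * y[start + i] for i in range(seg_len)])
        for k in range(n_bins):
            sxx[k] += abs(fx[k]) ** 2            # auto-spectrum of x
            syy[k] += abs(fy[k]) ** 2            # auto-spectrum of y
            sxy[k] += fx[k] * fy[k].conjugate()  # cross-spectrum
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) for k in range(n_bins)]


# Synthetic stand-ins: a shared 5 Hz rhythm plus independent noise in each channel.
rng = random.Random(0)
fs, n, seg_len = 100, 2000, 100   # Hz, samples, samples per segment (1 Hz bins)
emg_like = [math.sin(2 * math.pi * 5 * t / fs) + rng.gauss(0, 0.5) for t in range(n)]
audio_like = [math.sin(2 * math.pi * 5 * t / fs - 0.8) + rng.gauss(0, 0.5) for t in range(n)]

coh = coherence(emg_like, audio_like, seg_len)
peak_bin = max(range(len(coh)), key=coh.__getitem__)
print("peak coherence near", peak_bin * fs / seg_len, "Hz")
```

Because coherence is normalized by both auto-spectra, it stays high at the shared rhythm even when the two channels differ in amplitude or have a constant phase lag, which is one reason it can be more informative than inspecting either spectrum alone.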
Effects of Additional Low-Pass-Filtered Speech on Listening Effort for Noise-Band-Vocoded Speech in Quiet and in Noise.

PubMed

Pals, Carina; Sarampalis, Anastasios; van Dijk, Mart; Başkent, Deniz

2018-05-11

Residual acoustic hearing in electric-acoustic stimulation (EAS) can benefit cochlear implant (CI) users in increased sound quality, speech intelligibility, and improved tolerance to noise. The goal of this study was to investigate whether the low-pass-filtered acoustic speech in simulated EAS can provide the additional benefit of reducing listening effort for the spectrotemporally degraded signal of noise-band-vocoded speech. Listening effort was investigated using a dual-task paradigm as a behavioral measure, and the NASA Task Load indeX as a subjective self-report measure. The primary task of the dual-task paradigm was identification of sentences presented in three experiments at three fixed intelligibility levels: at near-ceiling, 50%, and 79% intelligibility, achieved by manipulating the presence and level of speech-shaped noise in the background. Listening effort for the primary intelligibility task was reflected in the performance on the secondary, visual response time task. Experimental speech processing conditions included monaural or binaural vocoder, with added low-pass-filtered speech (to simulate EAS) or without (to simulate CI). In Experiment 1, in quiet with intelligibility near-ceiling, additional low-pass-filtered speech reduced listening effort compared with binaural vocoder, in line with our expectations, although not compared with monaural vocoder. In Experiments 2 and 3, for speech in noise, added low-pass-filtered speech allowed the desired intelligibility levels to be reached at less favorable speech-to-noise ratios, as expected. It is interesting that this came without the cost of increased listening effort usually associated with poor speech-to-noise ratios; at 50% intelligibility, even a reduction in listening effort on top of the increased tolerance to noise was observed. The NASA Task Load indeX did not capture these differences. The dual-task results provide partial evidence for a potential decrease in listening effort as a result of
Child speech, language and communication need re-examined in a public health context: a new direction for the speech and language therapy profession.

PubMed

Law, James; Reilly, Sheena; Snow, Pamela C

2013-01-01

Historically speech and language therapy services for children have been framed within a rehabilitative framework with explicit assumptions made about providing therapy to individuals. While this is clearly important in many cases, we argue that this model needs revisiting for a number of reasons. First, our understanding of the nature of disability, and therefore communication disabilities, has changed over the past century. Second, there is an increasing understanding of the impact that the social gradient has on early communication difficulties. Finally, how these factors interact with one another and have an impact across the life course remains poorly understood. To describe the public health paradigm and explore its implications for speech and language therapy with children. We test the application of public health methodologies to speech and language therapy services by looking at four dimensions of service delivery: (1) the uptake of services and whether those children who need services receive them; (2) the development of universal prevention services in relation to social disadvantage; (3) the risk of over-interpreting co-morbidity from clinical samples; and (4) the overlap between communicative competence and mental health. It is concluded that there is a strong case for speech and language therapy services to be reconceptualized to respond to the needs of the whole population and according to socially determined needs, focusing on primary prevention. This is not to disregard individual need, but to highlight the needs of the population as a whole. Although the socio-political context is different between countries, we maintain that this is relevant wherever speech and language therapists have a responsibility for covering whole populations. Finally, we recommend that speech and language therapy services be conceptualized within the framework laid down in The Ottawa Charter for Health Promotion. © 2013 Royal College of Speech and Language
ON THE NATURE OF SPEECH SCIENCE.

ERIC Educational Resources Information Center

PETERSON, GORDON E.

IN THIS ARTICLE THE NATURE OF THE DISCIPLINE OF SPEECH SCIENCE IS CONSIDERED AND THE VARIOUS BASIC AND APPLIED AREAS OF THE DISCIPLINE ARE DISCUSSED. THE BASIC AREAS ENCOMPASS THE VARIOUS PROCESSES OF THE PHYSIOLOGY OF SPEECH PRODUCTION, THE ACOUSTICAL CHARACTERISTICS OF SPEECH, INCLUDING THE SPEECH WAVE TYPES AND THE INFORMATION-BEARING ACOUSTIC…
Speech versus non-speech as irrelevant sound: controlling acoustic variation.

PubMed

Little, Jason S; Martin, Frances Heritage; Thomson, Richard H S

2010-09-01

Functional differences between speech and non-speech within the irrelevant sound effect were investigated using repeated and changing formats of irrelevant sounds in the form of intelligible words and unintelligible signal correlated noise (SCN) versions of the words. Event-related potentials were recorded from 25 females aged between 18 and 25 while they completed a serial order recall task in the presence of irrelevant sound or silence. As expected and in line with the changing-state hypothesis both words and SCN produced robust changing-state effects. However, words produced a greater changing-state effect than SCN indicating that the spectral detail inherent within speech accounts for the greater irrelevant sound effect and changing-state effect typically observed with speech. ERP data in the form of N1 amplitude was modulated within some irrelevant sound conditions suggesting that attentional aspects are involved in the elicitation of the irrelevant sound effect. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Auditory and Acoustic Research & Development at Air Force Research Laboratory (AFRL)

DTIC Science & Technology

2010-09-01

aircraft noise measurement and modeling, speech communication in noise, and national and international standards for over 60 years. This article...substantial technical document and a complete review is beyond the scope of this article. The purpose of this section is to give some examples of...acoustics facilities and instrumentation. The multi-disciplinary researchers included experts in audiology, biomedical engineering, human factors
Adapting to the Job Market: Graduate Programs in Speech Communication.

ERIC Educational Resources Information Center

Berg, David M.

The percentage of speech communication doctoral graduates employed full time and the percentage working in academic institutions have declined considerably since 1968. The glut of humanities doctorates appears to present three courses of action: increase undergraduate enrollments, decrease graduate enrollments, or increase nonacademic employment…
Optimal Deployment of Sensor Nodes Based on Performance Surface of Underwater Acoustic Communication

PubMed Central

Choi, Jee Woong

2017-01-01

The underwater acoustic sensor network (UWASN) is a system that exchanges data between numerous sensor nodes deployed in the sea. The UWASN uses an underwater acoustic communication technique to exchange data. Therefore, it is important to design a robust system that will function even in severely fluctuating underwater communication conditions, along with variations in the ocean environment. In this paper, a new algorithm to find the optimal deployment positions of underwater sensor nodes is proposed. The algorithm uses the communication performance surface, which is a map showing the underwater acoustic communication performance of a targeted area. A virtual force-particle swarm optimization algorithm is then used as an optimization technique to find the optimal deployment positions of the sensor nodes, using the performance surface information to estimate the communication radii of the sensor nodes in each generation. The algorithm is evaluated by comparing simulation results between two different seasons (summer and winter) for an area located off the eastern coast of Korea as the selected targeted area. PMID:29053569
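The optimization step can be illustrated with a plain particle swarm search over a performance surface. This is a deliberately simplified sketch and not the paper's virtual force-particle swarm algorithm: it places a single point rather than a set of nodes, omits the virtual-force term, and uses a made-up surface (`performance`) with one peak at (3, -1). All names and parameter values are assumptions.

```python
import random


def performance(x, y):
    """Hypothetical communication-performance surface; higher is better, peak at (3, -1)."""
    return -((x - 3.0) ** 2 + (y + 1.0) ** 2)


def pso(objective, bounds, n_particles=20, n_iter=100, seed=0, w=0.7, c1=1.5, c2=1.5):
    """Maximize objective(x, y) inside a square search area with basic PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi), rng.uniform(lo, hi)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(*p) for p in pos]        # personal best scores
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search area.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(*pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


best_xy, best_score = pso(performance, bounds=(-10.0, 10.0))
print("best position:", best_xy, "score:", best_score)
```

In a deployment setting the toy `performance` function would be replaced by a lookup into the measured or simulated performance surface, and each particle would encode the coordinates of all nodes rather than a single point.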
Objective speech quality evaluation of real-time speech coders

NASA Astrophysics Data System (ADS)

Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

1984-02-01

This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape-recordings of the input speech and the corresponding output speech of a real-time speech coder.
Electrophysiological and Kinematic Correlates of Communicative Intent in the Planning and Production of Pointing Gestures and Speech.

PubMed

Peeters, David; Chu, Mingyuan; Holler, Judith; Hagoort, Peter; Özyürek, Aslı

2015-12-01

In everyday human communication, we often express our communicative intentions by manually pointing out referents in the material world around us to an addressee, often in tight synchronization with referential speech. This study investigated whether and how the kinematic form of index finger pointing gestures is shaped by the gesturer's communicative intentions and how this is modulated by the presence of concurrently produced speech. Furthermore, we explored the neural mechanisms underpinning the planning of communicative pointing gestures and speech. Two experiments were carried out in which participants pointed at referents for an addressee while the informativeness of their gestures and speech was varied. Kinematic and electrophysiological data were recorded online. It was found that participants prolonged the duration of the stroke and poststroke hold phase of their gesture to be more communicative, in particular when the gesture was carrying the main informational burden in their multimodal utterance. Frontal and P300 effects in the ERPs suggested the importance of intentional and modality-independent attentional mechanisms during the planning phase of informative pointing gestures. These findings contribute to a better understanding of the complex interplay between action, attention, intention, and language in the production of pointing gestures, a communicative act core to human interaction.
The Effect of Uni- and Bilateral Thalamic Deep Brain Stimulation on Speech in Patients With Essential Tremor: Acoustics and Intelligibility.

PubMed

Becker, Johannes; Barbe, Michael T; Hartinger, Mariam; Dembek, Till A; Pochmann, Jil; Wirths, Jochen; Allert, Niels; Mücke, Doris; Hermes, Anne; Meister, Ingo G; Visser-Vandewalle, Veerle; Grice, Martine; Timmermann, Lars

2017-04-01

Deep brain stimulation (DBS) of the ventral intermediate nucleus (VIM) is performed to suppress medically-resistant essential tremor (ET). However, stimulation-induced dysarthria (SID) is a common side effect, limiting the extent to which tremor can be suppressed. To date, the exact pathogenesis of SID in VIM-DBS treated ET patients is unknown. We investigate the effect of inactivated, uni- and bilateral VIM-DBS on speech production in patients with ET. We employ acoustic measures, tempo, and intelligibility ratings and patients' self-estimated speech to quantify SID, with a focus on comparing bilateral to unilateral stimulation effects and the effect of electrode position on speech. Sixteen German ET patients participated in this study. Each patient was acoustically recorded with DBS-off, unilateral-right-hemispheric-DBS-on, unilateral-left-hemispheric-DBS-on, and bilateral-DBS-on during an oral diadochokinesis (DDK) task and a read German standard text. To capture the extent of speech impairment, we measured syllable duration and intensity ratio during the DDK task. Naïve listeners rated speech tempo and speech intelligibility of the read text on a 5-point-scale. Patients had to rate their "ability to speak". We found an effect of bilateral compared to unilateral and inactivated stimulation on syllable durations and intensity ratio, as well as on external intelligibility ratings and patients' VAS scores. Additionally, VAS scores are associated with more laterally located active contacts. For speech ratings, we found an effect of syllable duration such that tempo and intelligibility were rated worse for speakers exhibiting greater syllable durations. Our data confirm that SID is more pronounced under bilateral compared to unilateral stimulation. Laterally located electrodes are associated with more severe SID according to patients' self-ratings. We can confirm the relation between diadochokinetic rate and SID in that listeners' tempo and intelligibility ratings can be

The Auditory-Brainstem Response to Continuous, Non-repetitive Speech Is Modulated by the Speech Envelope and Reflects Speech Processing

PubMed Central

Reichenbach, Chagit S.; Braiman, Chananel; Schiff, Nicholas D.; Hudspeth, A. J.; Reichenbach, Tobias

2016-01-01

The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnosis of brainstem function. PMID:27303286
Finding the music of speech: Musical knowledge influences pitch processing in speech.

PubMed

Vanden Bosch der Nederlanden, Christina M; Hannon, Erin E; Snyder, Joel S

2015-10-01

Few studies comparing music and language processing have adequately controlled for low-level acoustical differences, making it unclear whether differences in music and language processing arise from domain-specific knowledge, acoustic characteristics, or both. We controlled acoustic characteristics by using the speech-to-song illusion, which often results in a perceptual transformation to song after several repetitions of an utterance. Participants performed a same-different pitch discrimination task for the initial repetition (heard as speech) and the final repetition (heard as song). Better detection was observed for pitch changes that violated rather than conformed to Western musical scale structure, but only when utterances transformed to song, indicating that music-specific pitch representations were activated and influenced perception. This shows that music-specific processes can be activated when an utterance is heard as song, suggesting that the high-level status of a stimulus as either language or music can be behaviorally dissociated from low-level acoustic factors. Copyright © 2015 Elsevier B.V. All rights reserved.
Communication as a human right: Citizenship, politics and the role of the speech-language pathologist.

PubMed

Murphy, Declan; Lyons, Rena; Carroll, Clare; Caulfield, Mari; De Paor, Gráinne

2018-02-01

According to Article 19 of the Universal Declaration on Human Rights "Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers." The purpose of this paper is to elucidate communication as a human right in the life of a young man called Declan who has Down syndrome. This commentary paper is co-written by Declan, his sister who is a speech-language pathologist (SLP) with an advocacy role, his SLP, and academics. Declan discusses, in his own words, what makes communication hard, what helps communication, his experiences of speech-language pathology, and what he knows about human rights. He also discusses his passion for politics, his right to be an active citizen and participate in the political process. This paper also focuses on the role of speech-language pathology in supporting and partnering with people with communication disabilities to have their voices heard and exercise their human rights.
The "listener" in the modeling of speech prosody

NASA Astrophysics Data System (ADS)

Kohler, Klaus J.

2004-05-01

Autosegmental-metrical modeling of speech prosody is principally speaker-oriented. The production of pitch patterns, in systematic lab speech experiments as well as in spontaneous speech corpora, is analyzed in f0 tracings, from which sequences of H(igh) and L(ow) are abstracted. The perceptual relevance of these pitch categories in the transmission from speakers to listeners is largely not conceptualized; thus their modeling in speech communication lacks an essential component. In the metalinguistic task of labeling speech data with the annotation system ToBI, the "listener" plays a subordinate role as well: H and L, being suggestive of signal values, are allocated with reference to f0 curves and little or no concern for perceptual classification by the trained labeler. The seriousness of this theoretical gap in the modeling of speech prosody is demonstrated by experimental data concerning f0-peak alignment. A number of papers in JASA have dealt with this topic from the point of synchronizing f0 with the vocal tract time course in acoustic output. However, perceptual experiments within the Kiel intonation model show that "early," "medial" and "late" peak alignments need to be defined perceptually and that in doing so microprosodic variation has to be filtered out from the surface signal.
Animal models of speech and vocal communication deficits associated with psychiatric disorders

PubMed Central

Konopka, Genevieve; Roberts, Todd F.

2015-01-01

Disruptions in speech, language and vocal communication are hallmarks of several neuropsychiatric disorders, most notably autism spectrum disorders. Historically, the use of animal models to dissect molecular pathways and connect them to behavioral endophenotypes in cognitive disorders has proven to be an effective approach for developing and testing disease-relevant therapeutics. The unique aspects of human language when compared to vocal behaviors in other animals make such an approach potentially more challenging. However, the study of vocal learning in species with analogous brain circuits to humans may provide entry points for understanding this human-specific phenotype and diseases. Here, we review animal models of vocal learning and vocal communication, and specifically link phenotypes of psychiatric disorders to relevant model systems. Evolutionary constraints in the organization of neural circuits and synaptic plasticity result in similarities in the brain mechanisms for vocal learning and vocal communication. Comparative approaches and careful consideration of the behavioral limitations among different animal models can provide critical avenues for dissecting the molecular pathways underlying cognitive disorders that disrupt speech, language and vocal communication. PMID:26232298
Speech and Communication Disorders

MedlinePlus

... to being completely unable to speak or understand speech. Causes include Hearing disorders and deafness Voice problems, ... or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism ...
Discrimination of speech stimuli based on neuronal response phase patterns depends on acoustics but not comprehension.

PubMed

Howard, Mary F; Poeppel, David

2010-11-01

Speech stimuli give rise to neural activity in the listener that can be observed as waveforms using magnetoencephalography. Although waveforms vary greatly from trial to trial due to activity unrelated to the stimulus, it has been demonstrated that spoken sentences can be discriminated based on theta-band (3-7 Hz) phase patterns in single-trial response waveforms. Furthermore, manipulations of the speech signal envelope and fine structure that reduced intelligibility were found to produce correlated reductions in discrimination performance, suggesting a relationship between theta-band phase patterns and speech comprehension. This study investigates the nature of this relationship, hypothesizing that theta-band phase patterns primarily reflect cortical processing of low-frequency (<40 Hz) modulations present in the acoustic signal and required for intelligibility, rather than processing exclusively related to comprehension (e.g., lexical, syntactic, semantic). Using stimuli that are quite similar to normal spoken sentences in terms of low-frequency modulation characteristics but are unintelligible (i.e., their time-inverted counterparts), we find that discrimination performance based on theta-band phase patterns is equal for both types of stimuli. Consistent with earlier findings, we also observe that whereas theta-band phase patterns differ across stimuli, power patterns do not. We use a simulation model of the single-trial response to spoken sentence stimuli to demonstrate that phase-locked responses to low-frequency modulations of the acoustic signal can account not only for the phase but also for the power results. The simulation offers insight into the interpretation of the empirical results with respect to phase-resetting and power-enhancement models of the evoked response.
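The idea of comparing phase patterns across single trials can be illustrated with a much simpler quantity than the one used in the study: the phase of one theta-band frequency per trial, and the coherence of that phase across trials. The sketch below is only an assumption-laden toy (synthetic cosine "trials", a single 5 Hz bin, and the hypothetical helpers `bin_phase`, `phase_coherence`, and `make_trial`); it is not the authors' discrimination procedure.

```python
import cmath
import math

def bin_phase(samples, sample_rate, freq):
    """Phase (radians) of the `freq` component, from a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / sample_rate)
              for n, x in enumerate(samples))
    return cmath.phase(acc)

def phase_coherence(phases):
    """Length of the mean unit phasor: 1.0 for identical phases, 0.0 for evenly scattered ones."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))

def make_trial(phase, sample_rate=100, freq=5.0, seconds=1.0):
    """Synthetic stand-in for one response waveform (assumption: a pure 5 Hz cosine)."""
    count = int(sample_rate * seconds)
    return [math.cos(2 * math.pi * freq * n / sample_rate + phase) for n in range(count)]

# Trials that share a phase (as if evoked by the same sentence) ...
locked = [bin_phase(make_trial(0.7), 100, 5.0) for _ in range(4)]
# ... versus trials whose phases are spread around the circle (different sentences).
scattered = [bin_phase(make_trial(k * math.pi / 2), 100, 5.0) for k in range(4)]

print(phase_coherence(locked), phase_coherence(scattered))
```

High coherence within a stimulus and low coherence across stimuli is the kind of contrast a phase-based classifier can exploit; a real analysis would band-pass filter the MEG signal and track phase over time rather than use one DFT bin.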
Temporal modulations in speech and music.

PubMed

Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

2017-10-01

Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and its neural processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
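A modulation spectrum of this kind can be approximated in a few lines: take an intensity envelope of the sound and measure how strongly it fluctuates at each slow modulation frequency. The sketch below is a simplified illustration, assuming a rectified-signal envelope and a synthetic tone modulated at 5 Hz; the function name `modulation_spectrum` and all parameters are hypothetical, and the published analysis may use a different envelope extraction.

```python
import cmath
import math

def modulation_spectrum(signal, sample_rate, mod_freqs):
    """Magnitude of the rectified-signal envelope at each modulation frequency (Hz)."""
    envelope = [abs(x) for x in signal]
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]  # remove DC so it does not dominate
    spectrum = {}
    for f in mod_freqs:
        acc = sum(e * cmath.exp(-2j * math.pi * f * n / sample_rate)
                  for n, e in enumerate(centered))
        spectrum[f] = abs(acc) / len(centered)
    return spectrum

# Toy input: a 200 Hz carrier whose intensity fluctuates at 5 Hz, 2 s at 1 kHz.
sample_rate = 1000
signal = [(1 + 0.8 * math.cos(2 * math.pi * 5 * n / sample_rate))
          * math.sin(2 * math.pi * 200 * n / sample_rate)
          for n in range(2000)]

spectrum = modulation_spectrum(signal, sample_rate, range(1, 33))
peak = max(spectrum, key=spectrum.get)
print("dominant modulation frequency (Hz):", peak)
```

For real recordings the envelope would normally be computed per frequency band and averaged over long stretches of audio before looking for a dominant modulation rate.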
[Acoustic and aerodynamic characteristics of the oesophageal voice].

PubMed

Vázquez de la Iglesia, F; Fernández González, S

2005-12-01

The aim of the study is to characterize the physiology and pathophysiology of esophageal voice using objective aerodynamic and acoustic parameters, both quantitative and qualitative. The subjects were 33 laryngectomized patients (all male) who underwent an aerodynamic, acoustic, and perceptual protocol. There is a statistical association between the qualitative acoustic and aerodynamic parameters (phonation flow chart type, sound spectrum, perceptual analysis) and the quantitative parameters (neoglottic pressure, phonation flow, phonation time, fundamental frequency, maximum sound intensity level, speech rate). Nevertheless, such observations do not always translate into practical resources for clinical practice. We consider that the findings may, pragmatically, add new resources for more effective vocal rehabilitation of these patients. The method applied gives a good understanding of the physiology of esophageal voice and also supports rehabilitation aimed at improving oral communication skills in the laryngectomee population.
Effects of reverberation time on the cognitive load in speech communication: theoretical considerations.

PubMed

Kjellberg, A

2004-01-01

The paper presents a theoretical analysis of possible effects of reverberation time on the cognitive load in speech communication. Speech comprehension requires not only phonological processing of the spoken words. Simultaneously, this information must be further processed and stored. All this processing takes place in the working memory, which has a limited processing capacity. The more resources that are allocated to word identification, the fewer resources are therefore left for the further processing and storing of the information. Reverberation conditions that allow the identification of almost all words may therefore still interfere with speech comprehension and memory storing. These problems are likely to be especially serious in situations where speech has to be followed continuously for a long time. An unfavourable reverberation time (RT) then could contribute to the development of cognitive fatigue, which means that working memory resources are gradually reduced. RT may also affect the cognitive load in two other ways: RT may change the distracting effects of a sound and a person's mood. Both effects could influence the cognitive load of a listener. It is argued that we need studies of RT effects in realistic long-lasting listening situations to better understand the effect of RT on speech communication. Furthermore, the effect of RT on distraction and mood need to be better understood.
Prediction and constraint in audiovisual speech perception.

PubMed

Peelle, Jonathan E; Sommers, Mitchell S

2015-07-01

During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration
Asynchronous sampling of speech with some vocoder experimental results

NASA Technical Reports Server (NTRS)

Babcock, M. L.

1972-01-01

The method of asynchronously sampling speech is based upon the derivatives of the acoustical speech signal. The following results are apparent from experiments to date: (1) It is possible to represent speech by a string of pulses of uniform amplitude, where the only information contained in the string is the spacing of the pulses in time; (2) the string of pulses may be produced in a simple analog manner; (3) the first derivative of the original speech waveform is the most important for the encoding process; (4) the resulting pulse train can be utilized to control an acoustical signal production system to regenerate the intelligence of the original speech.
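One way to picture a pulse train whose only information is pulse timing is to emit a uniform pulse each time the first derivative of the waveform changes sign. The sketch below assumes exactly that rule, applied to a sampled toy tone; the rule, the function name `derivative_pulse_times`, and the test signal are illustrative assumptions, not the analog circuit described in the report.

```python
import math

def derivative_pulse_times(samples, sample_rate):
    """Times (s) at which the first difference of `samples` changes sign."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    times = []
    for i in range(1, len(diffs)):
        if diffs[i - 1] * diffs[i] < 0:  # slope flips sign: local extremum at sample i
            times.append(i / sample_rate)
    return times

# Toy "speech": a 100 Hz tone sampled at 8 kHz for 50 ms.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(400)]

pulses = derivative_pulse_times(signal, sample_rate)
intervals = [t2 - t1 for t1, t2 in zip(pulses, pulses[1:])]
print(len(pulses), "pulses; first intervals (s):", intervals[:3])
```

For a pure tone the pulse spacing is half the period, so the spacing alone encodes the frequency; for speech the spacing would vary with the local waveform shape.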
Diversity-based acoustic communication with a glider in deep water.

PubMed

Song, H C; Howe, Bruce M; Brown, Michael G; Andrew, Rex K

2014-03-01

The primary use of underwater gliders is to collect oceanographic data within the water column and periodically relay the data at the surface via a satellite connection. In summer 2006, a Seaglider equipped with an acoustic recording system received transmissions from a broadband acoustic source centered at 75 Hz deployed on the bottom off Kauai, Hawaii, while moving away from the source at ranges up to ∼200 km in deep water and diving up to 1000-m depth. The transmitted signal was an m-sequence that can be treated as a binary-phase shift-keying communication signal. In this letter multiple receptions are exploited (i.e., diversity combining) to demonstrate the feasibility of using the glider as a mobile communication gateway.
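The two ingredients named in this abstract, an m-sequence treated as a binary phase-shift-keying signal and diversity combining of multiple receptions, can be sketched in a toy form. The example below assumes a 4-bit linear feedback shift register (period 15), additive Gaussian noise with no channel distortion, and simple averaging of receptions; these choices and the helper names are illustrative assumptions, not the parameters of the experiment.

```python
import random

def m_sequence(taps, nbits):
    """One period (2**nbits - 1 bits) of a Fibonacci LFSR with the given feedback taps."""
    state = [1] * nbits
    out = []
    for _ in range(2 ** nbits - 1):
        out.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return out

def bpsk(bits):
    """Map bit 0 to +1.0 and bit 1 to -1.0."""
    return [1.0 - 2.0 * b for b in bits]

def combine_and_decide(receptions):
    """Average equal-length receptions symbol by symbol, then hard-decide the bits."""
    count = len(receptions[0])
    avg = [sum(r[i] for r in receptions) / len(receptions) for i in range(count)]
    return [0 if v >= 0 else 1 for v in avg]

bits = m_sequence(taps=(4, 3), nbits=4)
symbols = bpsk(bits)

# Assumed noise model: independent unit-variance Gaussian noise on each reception.
rng = random.Random(0)
receptions = [[s + rng.gauss(0.0, 1.0) for s in symbols] for _ in range(16)]

single_errors = sum(b != d for b, d in zip(bits, combine_and_decide(receptions[:1])))
combined_errors = sum(b != d for b, d in zip(bits, combine_and_decide(receptions)))
print("bit errors, 1 reception:", single_errors, "| 16 combined:", combined_errors)
```

Averaging N receptions reduces the noise variance by a factor of N in this idealized setting; a real deep-water link would also need synchronization, Doppler compensation, and channel equalization before combining.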
Proximate and ultimate aspects of acoustic and multimodal communication in butterflyfishes

NASA Astrophysics Data System (ADS)

Boyle, Kelly S.

Communication in social animals is shaped by natural selection on both sender and receiver. Diurnal butterflyfishes use a combination of visual cues like bright color patterns and motor pattern driven displays, acoustic communication, and olfactory cues that may advertise territorial behavior, facilitate recognition of individuals, and provide cues for courtship. This dissertation examines proximate and multimodal communication in several butterflyfishes, with an emphasis on acoustic communication which has recently garnered attention within the Chaetodontidae. Sound production in the genus Forcipiger involves a novel mechanism with synchronous contractions of opposing head muscles at the onset of sound emission and rapid cranial rotation that lags behind sound emission. Acoustic signals in F. flavissimus provide an accurate indicator of body size, and to a lesser extent cranial rotation velocity and acceleration. The closely related Hemitaurichthys polylepis produces rapid pulse trains of similar duration and spectral content to F. flavissimus, but with a dramatically different mechanism which involves contractions of hypaxial musculature at the anterior end of the swim bladder that occur with synchronous muscle action potentials. Both H. polylepis sonic and hypaxial trunk muscle fibers have triads at the z-line, but sonic fibers have smaller cross-sectional areas, more developed sarcoplasmic reticula, longer sarcomere lengths, and wider t-tubules. Sonic motor neurons are located along a long motor column entirely within the spinal cord and are composed of large and small types. Forcipiger flavissimus and F. longirostris are site attached and territorial, with F. flavissimus engaged in harem polygyny and F. longirostris in social monogamy. Both produce similar pulse sounds to conspecifics during territoriality that vary little with respect to communicative context. Chaetodon multicinctus can discriminate between mates and non-mate intruders, but require combined
Energy scavenging system by acoustic wave and integrated wireless communication

NASA Astrophysics Data System (ADS)

Kim, Albert

The purpose of the project was to develop an energy-scavenging device for other bio-implantable devices. Researchers have studied energy-scavenging methods because of the limitations of traditional power sources, especially for bio-implantable devices. In this research, a piezoelectric power generator activated by acoustic waves, such as music, was developed. In addition to the power generator, wireless communication was integrated with the device to monitor power generation. A lead zirconate titanate (PZT) bimorph cantilever with a proof mass at the free end was studied for converting acoustic waves to power. Music or another acoustic wave was played through a speaker to vibrate the piezoelectric power generator. An LC circuit was integrated with the piezoelectric material for wireless monitoring of power generation. The wireless monitoring link can also serve as wireless power transmission, meaning that the signal received over the wireless link can be used to power other devices. A device measuring 74 by 7 by 7 cm could generate and transmit 100 mVp over a distance of 70 mm, with an electrical resonant frequency of 420.2 kHz.
Effectiveness of the Picture Exchange Communication System (PECS) on Communication and Speech for Children with Autism Spectrum Disorders: A Meta-Analysis

ERIC Educational Resources Information Center

Flippin, Michelle; Reszka, Stephanie; Watson, Linda R.

2010-01-01

Purpose: The Picture Exchange Communication System (PECS) is a popular communication-training program for young children with autism spectrum disorders (ASD). This meta-analysis reviews the current empirical evidence for PECS in affecting communication and speech outcomes for children with ASD. Method: A systematic review of the literature on PECS…
Logopenic and Nonfluent Variants of Primary Progressive Aphasia Are Differentiated by Acoustic Measures of Speech Production

PubMed Central

Ballard, Kirrie J.; Savage, Sharon; Leyton, Cristian E.; Vogel, Adam P.; Hornberger, Michael; Hodges, John R.

2014-01-01

Differentiation of logopenic (lvPPA) and nonfluent/agrammatic (nfvPPA) variants of Primary Progressive Aphasia is important yet remains challenging since it hinges on expert-based evaluation of speech and language production. In this study acoustic measures of speech in conjunction with voxel-based morphometry were used to determine the success of the measures as an adjunct to diagnosis and to explore the neural basis of apraxia of speech in nfvPPA. Forty-one patients (21 lvPPA, 20 nfvPPA) were recruited from a consecutive sample with suspected frontotemporal dementia. Patients were diagnosed using the current gold-standard of expert perceptual judgment, based on presence/absence of particular speech features during speaking tasks. Seventeen healthy age-matched adults served as controls. MRI scans were available for 11 control and 37 PPA cases; 23 of the PPA cases underwent amyloid ligand PET imaging. Measures, corresponding to perceptual features of apraxia of speech, were periods of silence during reading and relative vowel duration and intensity in polysyllable word repetition. Discriminant function analyses revealed that a measure of relative vowel duration differentiated nfvPPA cases from both control and lvPPA cases (r² = 0.47) with 88% agreement with expert judgment of presence of apraxia of speech in nfvPPA cases. VBM analysis showed that relative vowel duration covaried with grey matter intensity in areas critical for speech motor planning and programming: precentral gyrus, supplementary motor area and inferior frontal gyrus bilaterally, only affected in the nfvPPA group. This bilateral involvement of frontal speech networks in nfvPPA potentially affects access to compensatory mechanisms involving right hemisphere homologues. Measures of silences during reading also discriminated the PPA and control groups, but did not increase predictive accuracy. Findings suggest that a measure of relative vowel duration from a polysyllable word repetition task
Extensions to the Speech Disorders Classification System (SDCS)

PubMed Central

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

This report describes three extensions to a classification system for pediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three subtypes of motor speech disorders. Part II describes the Madison Speech Assessment Protocol (MSAP), an approximately two-hour battery of 25 measures that includes 15 speech tests and tasks. Part III describes the Competence, Precision, and Stability Analytics (CPSA) framework, a current set of approximately 90 perceptual- and acoustic-based indices of speech, prosody, and voice used to quantify and classify subtypes of Speech Sound Disorders (SSD). A companion paper, Shriberg, Fourakis, et al. (2010) provides reliability estimates for the perceptual and acoustic data reduction methods used in the SDCS. The agreement estimates in the companion paper support the reliability of SDCS methods and illustrate the complementary roles of perceptual and acoustic methods in diagnostic analyses of SSD of unknown origin. Examples of research using the extensions to the SDCS described in the present report include diagnostic findings for a sample of youth with motor speech disorders associated with galactosemia (Shriberg, Potter, & Strand, 2010) and a test of the hypothesis of apraxia of speech in a group of children with autism spectrum disorders (Shriberg, Paul, Black, & van Santen, 2010). All SDCS methods and reference databases running in the PEPPER (Programs to Examine Phonetic and Phonologic Evaluation Records; [Shriberg, Allen, McSweeny, & Wilson, 2001]) environment will be disseminated without cost when complete. PMID:20831378
Research on the optoacoustic communication system for speech transmission by variable laser-pulse repetition rates

NASA Astrophysics Data System (ADS)

Jiang, Hongyan; Qiu, Hongbing; He, Ning; Liao, Xin

2018-06-01

For optoacoustic communication from in-air platforms to submerged apparatus, a method based on speech recognition and variable laser-pulse repetition rates is proposed, which realizes character encoding and transmission of speech. First, the theory and spectral characteristics of laser-generated underwater sound are analyzed; next, character conversion and encoding of speech, as well as the code pattern used for laser modulation, are studied; finally, experiments are carried out to verify the system design. Results show that the optoacoustic system, in which laser modulation is controlled by speech-to-character baseband codes, improves flexibility in the receiving location of underwater targets as well as real-time performance in information transmission. In the above-water transmitter, a pulsed laser is driven by the speech signal to radiate at several repetition rates randomly selected in the range of 1 to 50 Hz; in the underwater receiver, the laser pulse repetition rate and the data are then recovered from the preamble and information codes of the corresponding laser-generated sound. When the laser pulse energy is appropriate, real-time transmission of speaker-independent speech can be realized in this way, which addresses the problem of limited underwater bandwidth and provides a technical approach for air-sea communication.
The Experimental Social Scientific Model in Speech Communication Research: Influences and Consequences.

ERIC Educational Resources Information Center

Ferris, Sharmila Pixy

A substantial number of published articles in speech communication research today is experimental/social scientific in nature. It is only in the past decade that scholars have begun to put the history of communication under the lens. Early advocates of the adoption of the method of social scientific inquiry were J. A. Winans, J. M. O'Neill, and C.…

The Status of Ethics Scholarship in Speech Communication Journals from 1915 to 1985.

ERIC Educational Resources Information Center

Arnett, Ronald C.

To examine the theoretical status of ethics scholarship and to explore the historical and present directions of ethics in human communication research, this paper reviews more than 100 articles drawn from the speech communication literature. Following a brief introduction that sets forth the criteria for article selection, the paper discusses…
Business Communication Students Learn to Hear a Bad Speech Habit

ERIC Educational Resources Information Center

Bell, Reginald L.; Liang-Bell, Lei Paula; Deselle, Bettye

2006-01-01

Students were trained to perceive filled pauses (FP) as a bad speech habit. In a series of classroom sensitivity training activities, followed by students being rewarded to observe twenty minutes of live television from the public media, no differences between male and female Business Communication students were revealed. The practice of teaching…
Single Carrier with Frequency Domain Equalization for Synthetic Aperture Underwater Acoustic Communications

PubMed Central

He, Chengbing; Xi, Rui; Wang, Han; Jing, Lianyou; Shi, Wentao; Zhang, Qunfei

2017-01-01

Phase-coherent underwater acoustic (UWA) communication systems typically employ multiple hydrophones in the receiver to achieve spatial diversity gain. However, small underwater platforms can only carry a single transducer, which cannot provide spatial diversity gain. In this paper, we propose single-carrier with frequency domain equalization (SC-FDE) for phase-coherent synthetic aperture acoustic communications in which a virtual array is generated by the relative motion between the transmitter and the receiver. This paper presents synthetic aperture acoustic communication results using SC-FDE through data collected during a lake experiment in January 2016. The performance of two receiver algorithms is analyzed and compared, including the frequency domain equalizer (FDE) and the hybrid time frequency domain equalizer (HTFDE). The distances between the transmitter and the receiver in the experiment were about 5 km. The bit error rate (BER) and output signal-to-noise ratio (SNR) performances with different receiver elements and transmission numbers are presented. After combining multiple transmissions, error-free reception using a convolution code with a data rate of 8 kbps was demonstrated. PMID:28684683
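Frequency domain equalization with several transmissions of the same block can be sketched as a per-frequency MMSE combiner: transform each reception, weight each bin by the conjugate channel response, sum over transmissions, and transform back. The example below is a minimal sketch under strong assumptions (channels known exactly, no noise added, cyclic prefix already removed so the channel acts as a circular convolution, a naive DFT for clarity); the function names and channel taps are hypothetical and this is not the FDE or HTFDE receiver evaluated in the paper.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def circular_channel(symbols, taps):
    """Circular convolution: what the receiver sees after discarding a cyclic prefix."""
    n = len(symbols)
    return [sum(taps[l] * symbols[(t - l) % n] for l in range(len(taps)))
            for t in range(n)]

def mmse_fde(received_blocks, channels, noise_var):
    """Jointly equalize several receptions of one block, bin by bin."""
    n = len(received_blocks[0])
    Y = [dft(r) for r in received_blocks]
    H = [dft(list(h) + [0.0] * (n - len(h))) for h in channels]
    X = []
    for k in range(n):
        num = sum(H[m][k].conjugate() * Y[m][k] for m in range(len(Y)))
        den = sum(abs(H[m][k]) ** 2 for m in range(len(Y))) + noise_var
        X.append(num / den)
    return dft(X, inverse=True)

# One BPSK block sent twice over two assumed multipath channels.
symbols = [1, -1, -1, 1, 1, 1, -1, 1]
channels = [(1.0, 0.5), (0.8, -0.3, 0.2)]
received = [circular_channel(symbols, h) for h in channels]

estimate = mmse_fde(received, channels, noise_var=1e-3)
decisions = [1 if v.real >= 0 else -1 for v in estimate]
print(decisions)
```

Summing the squared channel magnitudes across transmissions in the denominator is what gives the diversity benefit: a spectral null in one transmission is covered by the others.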
SPEECH--MAN'S NATURAL COMMUNICATION.

ERIC Educational Resources Information Center

DUDLEY, HOMER; AND OTHERS

SESSION 63 OF THE 1967 INSTITUTE OF ELECTRICAL AND ELECTRONIC ENGINEERS INTERNATIONAL CONVENTION BROUGHT TOGETHER SEVEN DISTINGUISHED MEN WORKING IN FIELDS RELEVANT TO LANGUAGE. THEIR TOPICS INCLUDED ORIGIN AND EVOLUTION OF SPEECH AND LANGUAGE, LANGUAGE AND CULTURE, MAN'S PHYSIOLOGICAL MECHANISMS FOR SPEECH, LINGUISTICS, AND TECHNOLOGY AND…
Fluids and Combustion Facility Acoustic Emissions Controlled by Aggressive Low-Noise Design Process

NASA Technical Reports Server (NTRS)

Cooper, Beth A.; Young, Judith A.

2004-01-01

The Fluids and Combustion Facility (FCF) is a dual-rack microgravity research facility that is being developed by Northrop Grumman Information Technology (NGIT) for the International Space Station (ISS) at the NASA Glenn Research Center. As an on-orbit test bed, FCF will host a succession of experiments in fluid and combustion physics. The Fluids Integrated Rack (FIR) and the Combustion Integrated Rack (CIR) must meet ISS acoustic emission requirements (ref. 1), which support speech communication and hearing-loss-prevention goals for ISS crew. To meet these requirements, the NGIT acoustics team implemented an aggressive low-noise design effort that incorporated frequent acoustic emission testing for all internal noise sources, larger-scale systems, and fully integrated racks (ref. 2). Glenn's Acoustical Testing Laboratory (ref. 3) provided acoustical testing services (see the following photograph) as well as specialized acoustical engineering support as part of the low-noise design process (ref. 4).
Analysis of False Starts in Spontaneous Speech.

ERIC Educational Resources Information Center

O'Shaughnessy, Douglas

A primary difference between spontaneous speech and read speech concerns the use of false starts, where a speaker interrupts the flow of speech to restart his or her utterance. A study examined the acoustic aspects of such restarts in a widely-used speech database, examining approximately 1000 utterances, about 10% of which contained a restart.…
"The Communication Needs and Rights of Mankind", Group 1 Report of the Futuristic Priorities Division of the Speech Communication Association. "Future Communication Technologies; Hardware and Software"; Group 2 Report.

ERIC Educational Resources Information Center

Dance, Frank E. X.; And Others

This paper reports on the Futuristic Priorities Division members' recommendations and priorities concerning the impact of the future on communication and on the speech communication discipline. The recommendations and priorities are listed for two subgroups: The Communication Needs and Rights of Mankind; and Future Communication Technologies:…
Dimension-based statistical learning affects both speech perception and production

PubMed Central

Lehet, Matthew; Holt, Lori L.

2016-01-01

Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more “perceptual weight” and more effectively signal category membership to native listeners. Yet, perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners’ perceptual weights in response to speech that deviates from the norms also affects listeners’ own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, then shifted to an “artificial accent” that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 x VOT correlation, F0 was a less robust cue to voicing in listeners’ own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted. PMID:27666146
Identification of four class emotion from Indonesian spoken language using acoustic and lexical features

NASA Astrophysics Data System (ADS)

Kasyidi, Fatan; Puji Lestari, Dessi

2018-03-01

One of the important aspects of human-to-human communication is understanding the emotion of each party. Recently, interaction between humans and computers has continued to develop, especially affective interaction, in which emotion recognition is an important component. This paper presents our extended work on emotion recognition in spoken Indonesian to identify four main classes of emotion (Happy, Sad, Angry, and Contentment) using a combination of acoustic/prosodic features and lexical features. We construct an emotion speech corpus from Indonesian television talk shows, where the situations are as close as possible to natural ones. After constructing the emotion speech corpus, the acoustic/prosodic and lexical features are extracted to train the emotion model. We employ machine learning algorithms such as Support Vector Machine (SVM), Naive Bayes, and Random Forest to find the best model. The results on the test data show that the best model, an SVM with an RBF kernel, achieves an F-measure of 0.447 using only the acoustic/prosodic features and an F-measure of 0.488 using both acoustic/prosodic and lexical features to recognize the four emotion classes.
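For readers unfamiliar with the reported metric, the F-measure for a four-class problem can be computed per class from precision and recall and then averaged. The sketch below assumes macro averaging (the abstract does not say which averaging was used) and uses made-up labels purely for illustration.

```python
def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption here)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)

# Made-up labels for eight utterances, two per emotion class.
y_true = ["Happy", "Happy", "Sad", "Sad", "Angry", "Angry", "Contentment", "Contentment"]
y_pred = ["Happy", "Sad", "Sad", "Sad", "Angry", "Angry", "Contentment", "Happy"]

score = macro_f_measure(y_true, y_pred)
print(round(score, 3))
```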
Spectral identification of sperm whales from Littoral Acoustic Demonstration Center passive acoustic recordings

NASA Astrophysics Data System (ADS)

Sidorovskaia, Natalia A.; Richard, Blake; Ioup, George E.; Ioup, Juliette W.

2005-09-01

The Littoral Acoustic Demonstration Center (LADC) made a series of passive broadband acoustic recordings in the Gulf of Mexico and Ligurian Sea to study noise and marine mammal phonations. The collected data contain a large amount of various types of sperm whale phonations, such as isolated clicks and communication codas. It was previously reported that the spectrograms of the extracted clicks and codas contain well-defined null patterns that seem to be unique for individuals. The null pattern is formed due to individual features of the sound production organs of an animal. These observations motivated the present studies of adapting human speech identification techniques for deep-diving marine mammal phonations. A three-state trained hidden Markov model (HMM) was used with the phonation spectra of sperm whales. The HMM algorithm gave 75% accuracy in identifying individuals when it was initially tested on the acoustic data set correlated with visual observations of sperm whales. A comparison of the identification accuracy based on null-pattern similarity analysis and the HMM algorithm is presented. The results can establish the foundation for developing an acoustic identification database for sperm whales and possibly other deep-diving marine mammals that would be difficult to observe visually. [Research supported by ONR.]
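The idea of identifying an individual with per-individual HMMs can be sketched as follows. This is not the authors' implementation: it assumes the phonation spectra have already been quantized into discrete symbols, which the abstract does not describe, and it assumes one trained model per animal. The names `forward_log_likelihood` and `identify` are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: return log P(obs | model).

    obs   -- list of integer symbol indices (assumed quantized spectra)
    start -- initial state probabilities, length n
    trans -- n x n state transition matrix, trans[i][j] = P(j | i)
    emit  -- n x m emission matrix, emit[i][k] = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def identify(obs, models):
    """Return the name of the model with the highest log-likelihood.

    models -- dict mapping a name to a (start, trans, emit) tuple
    """
    return max(
        models, key=lambda name: forward_log_likelihood(obs, *models[name])
    )
```

Each candidate animal gets its own three-state model, and a new click or coda is attributed to whichever model explains it best. Training the models (for example with Baum-Welch) is outside the scope of this sketch.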
Deep Brain Stimulation of the Subthalamic Nucleus Parameter Optimization for Vowel Acoustics and Speech Intelligibility in Parkinson's Disease

ERIC Educational Resources Information Center

Knowles, Thea; Adams, Scott; Abeyesekera, Anita; Mancinelli, Cynthia; Gilmore, Greydon; Jog, Mandar

2018-01-01

Purpose: The settings of 3 electrical stimulation parameters were adjusted in 12 speakers with Parkinson's disease (PD) with deep brain stimulation of the subthalamic nucleus (STN-DBS) to examine their effects on vowel acoustics and speech intelligibility. Method: Participants were tested under permutations of low, mid, and high STN-DBS frequency,…
Cortical Tracking of Global and Local Variations of Speech Rhythm during Connected Natural Speech Perception.

PubMed

Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

2018-06-19

During natural speech perception, listeners must track the global speaking rate, that is, the overall rate of incoming linguistic information, as well as transient, local speaking rate variations occurring within the global speaking rate. Here, we address the hypothesis that this tracking mechanism is achieved through coupling of cortical signals to the amplitude envelope of the perceived acoustic speech signals. Cortical signals were recorded with magnetoencephalography (MEG) while participants perceived spontaneously produced speech stimuli at three global speaking rates (slow, normal/habitual, and fast). As is inherent to spontaneously produced speech, these stimuli also featured local variations in speaking rate. The coupling between cortical and acoustic speech signals was evaluated using audio-MEG coherence. Modulations in audio-MEG coherence spatially differentiated between tracking of global speaking rate, highlighting the temporal cortex bilaterally and the right parietal cortex, and sensitivity to local speaking rate variations, emphasizing the left parietal cortex. Cortical tuning to the temporal structure of natural connected speech thus seems to require the joint contribution of both auditory and parietal regions. These findings suggest that cortical tuning to speech rhythm operates on two functionally distinct levels: one encoding the global rhythmic structure of speech and the other associated with online, rapidly evolving temporal predictions. Thus, it may be proposed that speech perception is shaped by evolutionary tuning, a preference for certain speaking rates, and predictive tuning, associated with cortical tracking of the constantly changing rate of linguistic information in a speech stream.
Integrating Music Therapy Services and Speech-Language Therapy Services for Children with Severe Communication Impairments: A Co-Treatment Model

ERIC Educational Resources Information Center

Geist, Kamile; McCarthy, John; Rodgers-Smith, Amy; Porter, Jessica

2008-01-01

Documenting how music therapy can be integrated with speech-language therapy services for children with communication delay is not evident in the literature. In this article, a collaborative model with procedures, experiences, and communication outcomes of integrating music therapy with the existing speech-language services is given. Using…
Study on Gender-Related Speech Communication in Classical Chinese Poetry

ERIC Educational Resources Information Center

Tian, Xinhe; Qin, Dandan

2016-01-01

Gender, which is formed as men and women grow up within the constraints of their social context, is closely tied to differences in how men and women use language. Analyzing gender-related speech communication against the background of ancient Chinese culture therefore offers a new approach to studies of gender and difference.
Primate vocal communication: a useful tool for understanding human speech and language evolution?

PubMed

Fedurek, Pawel; Slocombe, Katie E

2011-04-01

Language is a uniquely human trait, and questions of how and why it evolved have been intriguing scientists for years. Nonhuman primates (primates) are our closest living relatives, and their behavior can be used to estimate the capacities of our extinct ancestors. As humans and many primate species rely on vocalizations as their primary mode of communication, the vocal behavior of primates has been an obvious target for studies investigating the evolutionary roots of human speech and language. By studying the similarities and differences between human and primate vocalizations, comparative research has the potential to clarify the evolutionary processes that shaped human speech and language. This review examines some of the seminal and recent studies that contribute to our knowledge regarding the link between primate calls and human language and speech. We focus on three main aspects of primate vocal behavior: functional reference, call combinations, and vocal learning. Studies in these areas indicate that despite important differences, primate vocal communication exhibits some key features characterizing human language. They also indicate, however, that some critical aspects of speech, such as vocal plasticity, are not shared with our primate cousins. We conclude that comparative research on primate vocal behavior is a very promising tool for deepening our understanding of the evolution of human speech and language, but much is still to be done as many aspects of monkey and ape vocalizations remain largely unexplored.
The Effectiveness of Clear Speech as a Masker

ERIC Educational Resources Information Center

Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

2010-01-01

Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…
Neurophysiological Influence of Musical Training on Speech Perception

PubMed Central

Shahin, Antoine J.

2011-01-01

Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remain a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL. PMID:21716639
Neurophysiological influence of musical training on speech perception.

PubMed

Shahin, Antoine J

2011-01-01

Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remain a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL.
Female voice communications in high level aircraft cockpit noises--part II: vocoder and automatic speech recognition systems.

PubMed

Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M

1998-11-01

The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high level aircraft cockpit noise conditions experienced in military aviation. In Part I, (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83) female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of the male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in a Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises were understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.
Lexical effects on speech production and intelligibility in Parkinson's disease

NASA Astrophysics Data System (ADS)

Chiu, Yi-Fang

Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher level structural variables of word usage frequency and the number of similar sounding words (i.e. neighborhood density) on lower level articulation or on listeners' perception of dysarthric speech. The purpose of the study is to examine the interaction of lexical properties and speech articulation as measured acoustically in speakers with PD and healthy controls (HC) and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences and the transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect is diphthong dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong is small. The data from listeners' perception of the speech by PD and HC show that listeners identified high frequency words with greater accuracy suggesting the use of lexical knowledge during the

Evidence-Based Occupational Hearing Screening I: Modeling the Effects of Real-World Noise Environments on the Likelihood of Effective Speech Communication.

PubMed

Soli, Sigfrid D; Giguère, Christian; Laroche, Chantal; Vaillancourt, Véronique; Dreschler, Wouter A; Rhebergen, Koenraad S; Harkins, Kevin; Ruckstuhl, Mark; Ramulu, Pradeep; Meyers, Lawrence S

The objectives of this study were to (1) identify essential hearing-critical job tasks for public safety and law enforcement personnel; (2) determine the locations and real-world noise environments where these tasks are performed; (3) characterize each noise environment in terms of its impact on the likelihood of effective speech communication, considering the effects of different levels of vocal effort, communication distances, and repetition; and (4) use this characterization to define an objective normative reference for evaluating the ability of individuals to perform essential hearing-critical job tasks in noisy real-world environments. Data from five occupational hearing studies performed over a 17-year period for various public safety agencies were analyzed. In each study, job task analyses by job content experts identified essential hearing-critical tasks and the real-world noise environments where these tasks are performed. These environments were visited, and calibrated recordings of each noise environment were made. The extended speech intelligibility index (ESII) was calculated for each 4-sec interval in each recording. These data, together with the estimated ESII value required for effective speech communication by individuals with normal hearing, allowed the likelihood of effective speech communication in each noise environment for different levels of vocal effort and communication distances to be determined. These likelihoods provide an objective norm-referenced and standardized means of characterizing the predicted impact of real-world noise on the ability to perform essential hearing-critical tasks. A total of 16 noise environments for law enforcement personnel and eight noise environments for corrections personnel were analyzed. Effective speech communication was essential to hearing-critical tasks performed in these environments. Average noise levels ranged from approximately 70 to 87 dBA in law enforcement environments and 64 to 80 dBA in
Phrase-level speech simulation with an airway modulation model of speech production

PubMed Central

Story, Brad H.

2012-01-01

Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulations of words and phrases are demonstrated. PMID:23503742
The development of co-speech gesture in the communication of children with autism spectrum disorders.

PubMed

Sowden, Hannah; Clegg, Judy; Perkins, Michael

2013-12-01

Co-speech gestures have a close semantic relationship to speech in adult conversation. In typically developing children co-speech gestures which give additional information to speech facilitate the emergence of multi-word speech. A difficulty with integrating audio-visual information is known to exist for individuals with Autism Spectrum Disorder (ASD), which may affect development of the speech-gesture system. A longitudinal observational study was conducted with four children with ASD, aged 2;4 to 3;5 years. Participants were video-recorded for 20 min every 2 weeks during their attendance on an intervention programme. Recording continued for up to 8 months, thus affording a rich analysis of gestural practices from pre-verbal to multi-word speech across the group. All participants combined gesture with either speech or vocalisations. Co-speech gestures providing additional information to speech were observed to be either absent or rare. Findings suggest that children with ASD do not make use of the facilitating communicative effects of gesture in the same way as typically developing children.
[Vocal effectiveness in speech and singing: acoustical, physiological and perceptive aspects. applications in speech therapy].

PubMed

Pillot, C; Vaissière, J

2006-01-01

What is vocal effectiveness in lyrical singing in comparison to speech? Our study tries to answer this question, using vocal efficiency and spectral vocal effectiveness. Vocal efficiency was measured for a trained and an untrained subject. According to these invasive measures, it appears that the trained singer uses her larynx less efficiently. Efficiency of the larynx in terms of energy then appears to be secondary to the desired voice quality. The acoustic measures of spectral vocal effectiveness of vowels and sentences, spoken and sung by 23 singers, reveal two complementary markers: the "singing power ratio" and the difference in amplitude between the singing formant and the spectral minimum that follows it. Magnetic resonance imaging and simulations of [a], [i] and [o] spoken and sung show laryngeal lowering and the role of the piriform sinuses as the physiological foundations of spectral vocal effectiveness, perceptually related to carrying power. These scientific aspects allow applications in voice therapy, such as physiological and perceptual foundations allowing patients to recuperate voice carrying power with or without background noise.
Articulatory-to-Acoustic Relations in Response to Speaking Rate and Loudness Manipulations

ERIC Educational Resources Information Center

Mefferd, Antje S.; Green, Jordan R.

2010-01-01

Purpose: In this investigation, the authors determined the strength of association between tongue kinematic and speech acoustics changes in response to speaking rate and loudness manipulations. Performance changes in the kinematic and acoustic domains were measured using two aspects of speech production presumably affecting speech clarity:…
Performance of a low data rate speech codec for land-mobile satellite communications

NASA Technical Reports Server (NTRS)

Gersho, Allen; Jedrey, Thomas C.

1990-01-01

In an effort to foster the development of new technologies for the emerging land mobile satellite communications services, JPL funded two development contracts in 1984: one to the Univ. of Calif., Santa Barbara and the other to the Georgia Inst. of Technology, to develop algorithms and real time hardware for near toll quality speech compression at 4800 bits per second. Both universities have developed and delivered speech codecs to JPL, and the UCSB codec was extensively tested by JPL in a variety of experimental setups. The basic UCSB speech codec algorithms and the test results of the various experiments performed with this codec are presented.
A survey of acoustic conditions in semi-open plan classrooms in the United Kingdom.

PubMed

Greenland, Emma E; Shield, Bridget M

2011-09-01

This paper reports the results of a large scale, detailed acoustic survey of 42 open plan classrooms of varying design in the UK each of which contained between 2 and 14 teaching areas or classbases. The objective survey procedure, which was designed specifically for use in open plan classrooms, is described. The acoustic measurements relating to speech intelligibility within a classbase, including ambient noise level, intrusive noise level, speech to noise ratio, speech transmission index, and reverberation time, are presented. The effects on speech intelligibility of critical physical design variables, such as the number of classbases within an open plan unit and the selection of acoustic finishes for control of reverberation, are examined. This analysis enables limitations of open plan classrooms to be discussed and acoustic design guidelines to be developed to ensure good listening conditions. The types of teaching activity to provide adequate acoustic conditions, plus the speech intelligibility requirements of younger children, are also discussed. © 2011 Acoustical Society of America
Culturally/Linguistically Different Children: Report Writing Guidelines for Speech-Language Pathologists [and] Summary of Project Communicate.

ERIC Educational Resources Information Center

Schloff, Rose-Laurie; Martinez, Silvia

Guidelines for writing assessments of the English language skills of minority, bilingual, preschool and elementary school children are presented for monolingual speech-language pathologists. In addition, a project (Project Communicate) providing direct client services and training of speech-language pathologists is briefly described. With regard…
Formant trajectory characteristics in speakers with dysarthria and homogeneous speech intelligibility scores: Further data

NASA Astrophysics Data System (ADS)

Kim, Yunjung; Weismer, Gary; Kent, Ray D.

2005-09-01

In previous work [J. Acoust. Soc. Am. 117, 2605 (2005)], we reported on formant trajectory characteristics of a relatively large number of speakers with dysarthria and near-normal speech intelligibility. The purpose of that analysis was to begin a documentation of the variability, within relatively homogeneous speech-severity groups, of acoustic measures commonly used to predict across-speaker variation in speech intelligibility. In that study we found that even with near-normal speech intelligibility (90%-100%), many speakers had reduced formant slopes for some words and distributional characteristics of acoustic measures that were different than values obtained from normal speakers. In the current report we extend those findings to a group of speakers with dysarthria with somewhat poorer speech intelligibility than the original group. Results are discussed in terms of the utility of certain acoustic measures as indices of speech intelligibility, and as explanatory data for theories of dysarthria. [Work supported by NIH Award R01 DC00319.]
Impact of Aberrant Acoustic Properties on the Perception of Sound Quality in Electrolarynx Speech

ERIC Educational Resources Information Center

Meltzner, Geoffrey S.; Hillman, Robert E.

2005-01-01

A large percentage of patients who have undergone laryngectomy to treat advanced laryngeal cancer rely on an electrolarynx (EL) to communicate verbally. Although serviceable, EL speech is plagued by shortcomings in both sound quality and intelligibility. This study sought to better quantify the relative contributions of previously identified…
Speech-to-Speech Relay Service

MedlinePlus

... are specifically trained in understanding a variety of speech disorders, which enables them to repeat what the caller says in a manner that makes the caller’s words clear and understandable to the ... people with speech disabilities cannot communicate by telephone because the parties ...
[Communication and noise. Speech intelligibility of airplane pilots with and without active noise compensation].

PubMed

Matschke, R G

1994-08-01

Noise exposure measurements were performed with pilots of the German Federal Navy during flight situations. The ambient noise levels during regular flight were maintained at levels above a 90 dB A-weighted level. This noise intensity requires wearing ear protection to avoid sound-induced hearing loss. To be able to understand radio communication (ATC) in spite of a noisy environment, headphone volume must be raised above the noise of the engines. The use of ear plugs in addition to the headsets and flight helmets is only of limited value because personal ear protection affects the intelligibility of ATC. Whereas speech intelligibility of pilots with normal hearing is affected to only a smaller degree, pilots with pre-existing high-frequency hearing losses show substantial impairments of speech intelligibility that vary in proportion to the hearing deficit present. Communication abilities can be reduced drastically, which in turn can affect air traffic security. The development of active noise compensation devices (ANC) that make use of the "anti-noise" principle may be a solution to this dilemma. To evaluate the effectiveness of an ANC-system and its influence on speech intelligibility, speech audiometry was performed with a German standardized test during simulated flight conditions with helicopter pilots. Results demonstrate the helpful effect on speech understanding especially for pilots with noise-induced hearing losses. This may help to avoid pre-retirement professional disability.
Speech Music Discrimination Using Class-Specific Features

DTIC Science & Technology

2004-08-01

Speech Music Discrimination Using Class-Specific Features Thomas Beierholm...between speech and music. Feature extraction is class-specific and can therefore be tailored to each class meaning that segment size, model orders...interest. Some of the applications of audio signal classification are speech/music classification [1], acoustical environmental classification [2][3
Research on Localization Algorithms Based on Acoustic Communication for Underwater Sensor Networks

PubMed Central

Fan, Liying; Wu, Shan; Yan, Xueting

2017-01-01

Water, a significant and highly valuable part of the earth, has made Underwater Sensor Networks (UWSNs) a hot research topic. Various applications can be realized based on UWSNs. Our paper concentrates on localization algorithms based on acoustic communication for UWSNs and provides an in-depth survey of them. We first introduce acoustic communication, network architecture, and routing techniques in UWSNs. The localization algorithms are then classified along five aspects, namely computation algorithm, spatial coverage, range measurement, the state of the nodes, and communication between nodes, a classification that differs from those of other survey papers. Moreover, we collect many pioneering papers and make a comprehensive comparison. In addition, some challenges and open issues are raised. PMID:29301369
Tutorial on architectural acoustics

NASA Astrophysics Data System (ADS)

Shaw, Neil; Talaske, Rick; Bistafa, Sylvio

2002-11-01

This tutorial is intended to provide an overview of current knowledge and practice in architectural acoustics. Topics covered will include basic concepts and history, acoustics of small rooms (small rooms for speech such as classrooms and meeting rooms, music studios, small critical listening spaces such as home theatres) and the acoustics of large rooms (larger assembly halls, auditoria, and performance halls).
Neural Oscillations Carry Speech Rhythm through to Comprehension

PubMed Central

Peelle, Jonathan E.; Davis, Matthew H.

2012-01-01

A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners’ processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging – particularly electroencephalography (EEG) and magnetoencephalography (MEG) – point to phase locking by ongoing cortical oscillations to low-frequency information (~4–8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain. PMID:22973251
Classifying acoustic signals into phoneme categories: average and dyslexic readers make use of complex dynamical patterns and multifractal scaling properties of the speech signal

PubMed Central

2015-01-01

Several competing aetiologies of developmental dyslexia suggest that the problems with acquiring literacy skills are causally entailed by low-level auditory and/or speech perception processes. The purpose of this study is to evaluate the diverging claims about the specific deficient perceptual processes under conditions of strong inference. Theoretically relevant acoustic features were extracted from a set of artificial speech stimuli that lie on a /bAk/-/dAk/ continuum. The features were tested on their ability to enable a simple classifier (Quadratic Discriminant Analysis) to reproduce the observed classification performance of average and dyslexic readers in a speech perception experiment. The ‘classical’ features examined were based on component process accounts of developmental dyslexia such as the supposed deficit in Envelope Rise Time detection and the deficit in the detection of rapid changes in the distribution of energy in the frequency spectrum (formant transitions). Studies examining these temporal processing deficit hypotheses do not employ measures that quantify the temporal dynamics of stimuli. It is shown that measures based on quantification of the dynamics of complex, interaction-dominant systems (Recurrence Quantification Analysis and the multifractal spectrum) enable QDA to classify the stimuli almost identically as observed in dyslexic and average reading participants. It seems unlikely that participants used any of the features that are traditionally associated with accounts of (impaired) speech perception. The nature of the variables quantifying the temporal dynamics of the speech stimuli imply that the classification of speech stimuli cannot be regarded as a linear aggregate of component processes that each parse the acoustic signal independent of one another, as is assumed by the ‘classical’ aetiologies of developmental dyslexia. It is suggested that the results imply that the differences in speech perception performance between
Report of the Research Priorities Division of the Speech Communication Association.

ERIC Educational Resources Information Center

Bitzer, Lloyd F.; And Others

A wide variety of topics are discussed in relation to research needs and classified in relation to problem areas, decision-making areas, and recommendations. Areas under discussion include an examination of the decision-making structure of the Speech Communication Association, criteria by which decisions can be evaluated, conceptualizing the…
An international perspective: supporting adolescents with speech, language, and communication needs in the United Kingdom.

PubMed

Joffe, Victoria

2015-02-01

This article provides an overview of the education system in the United Kingdom, with a particular focus on the secondary school context and supporting older children and young people with speech, language, and communication needs (SLCNs). Despite the pervasive nature of speech, language, and communication difficulties and their long-term impact on academic performance, mental health, and well-being, evidence suggests that there is limited support to older children and young people with SLCNs in the United Kingdom, relative to what is available in the early years. Focus in secondary schools is predominantly on literacy, with little attention to supporting oral language. The article provides a synopsis of the working practices of pediatric speech and language therapists working with adolescents in the United Kingdom and the type and level of speech and language therapy support provided for older children and young people with SLCNs in secondary and further education. Implications for the nature and type of specialist support to adolescents and adults with SLCNs are discussed. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Human emotions track changes in the acoustic environment.

PubMed

Ma, Weiyi; Thompson, William Forde

2015-11-24

Emotional responses to biologically significant events are essential for human survival. Do human emotions lawfully track changes in the acoustic environment? Here we report that changes in acoustic attributes that are well known to interact with human emotions in speech and music also trigger systematic emotional responses when they occur in environmental sounds, including sounds of human actions, animal calls, machinery, or natural phenomena, such as wind and rain. Three changes in acoustic attributes known to signal emotional states in speech and music were imposed upon 24 environmental sounds. Evaluations of stimuli indicated that human emotions track such changes in environmental sounds just as they do for speech and music. Such changes not only influenced evaluations of the sounds themselves, they also affected the way accompanying facial expressions were interpreted emotionally. The findings illustrate that human emotions are highly attuned to changes in the acoustic environment, and reignite a discussion of Charles Darwin's hypothesis that speech and music originated from a common emotional signal system based on the imitation and modification of environmental sounds.

Human emotions track changes in the acoustic environment

PubMed Central

Ma, Weiyi; Thompson, William Forde

2015-01-01

Emotional responses to biologically significant events are essential for human survival. Do human emotions lawfully track changes in the acoustic environment? Here we report that changes in acoustic attributes that are well known to interact with human emotions in speech and music also trigger systematic emotional responses when they occur in environmental sounds, including sounds of human actions, animal calls, machinery, or natural phenomena, such as wind and rain. Three changes in acoustic attributes known to signal emotional states in speech and music were imposed upon 24 environmental sounds. Evaluations of stimuli indicated that human emotions track such changes in environmental sounds just as they do for speech and music. Such changes not only influenced evaluations of the sounds themselves, they also affected the way accompanying facial expressions were interpreted emotionally. The findings illustrate that human emotions are highly attuned to changes in the acoustic environment, and reignite a discussion of Charles Darwin’s hypothesis that speech and music originated from a common emotional signal system based on the imitation and modification of environmental sounds. PMID:26553987
Dimension-Based Statistical Learning Affects Both Speech Perception and Production.

PubMed

Lehet, Matthew; Holt, Lori L

2017-04-01

Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more "perceptual weight" and more effectively signal category membership to native listeners. Yet perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners' perceptual weights in response to speech that deviates from the norms also affect listeners' own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, and then shifted to an "artificial accent" that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 × VOT correlation, F0 was a less robust cue to voicing in listeners' own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted. Copyright © 2016 Cognitive Science Society, Inc.
Imaging for understanding speech communication: Advances and challenges

NASA Astrophysics Data System (ADS)

Narayanan, Shrikanth

2005-04-01

Research in speech communication has relied on a variety of instrumentation methods to illuminate details of speech production and perception. One longstanding challenge has been the ability to examine real-time changes in the shaping of the vocal tract, a goal that has been furthered by imaging techniques such as ultrasound, movement tracking, and magnetic resonance imaging. The spatial and temporal resolution afforded by these techniques, however, has limited the scope of the investigations that could be carried out. In this talk, we focus on some recent advances in magnetic resonance imaging that allow us to perform near real-time investigations on the dynamics of vocal tract shaping during speech. Examples include Demolin et al. (2000) (4-5 images/second, ultra-fast turbo spin echo) and Mady et al. (2001, 2002) (8 images/second, T1 fast gradient echo). A recent study by Narayanan et al. (2004) that used a spiral readout scheme to accelerate image acquisition has allowed for image reconstruction rates of 24 images/second. While these developments offer exciting prospects, a number of challenges lie ahead, including: (1) improving image acquisition protocols, hardware for enhancing signal-to-noise ratio, and optimizing spatial sampling; (2) acquiring quality synchronized audio; and (3) analyzing and modeling image data including cross-modality registration. [Work supported by NIH and NSF.]
Auditory perception bias in speech imitation

PubMed Central

Postma-Nilsenová, Marie; Postma, Eric

2013-01-01

In an experimental study, we explored the role of auditory perception bias in vocal pitch imitation. Psychoacoustic tasks involving a missing fundamental indicate that some listeners are attuned to the relationship between all the higher harmonics present in the signal, which supports their perception of the fundamental frequency (the primary acoustic correlate of pitch). Other listeners focus on the lowest harmonic constituents of the complex sound signal which may hamper the perception of the fundamental. These two listener types are referred to as fundamental and spectral listeners, respectively. We hypothesized that the individual differences in speakers' capacity to imitate F0 found in earlier studies, may at least partly be due to the capacity to extract information about F0 from the speech signal. Participants' auditory perception bias was determined with a standard missing fundamental perceptual test. Subsequently, speech data were collected in a shadowing task with two conditions, one with a full speech signal and one with high-pass filtered speech above 300 Hz. The results showed that perception bias toward fundamental frequency was related to the degree of F0 imitation. The effect was stronger in the condition with high-pass filtered speech. The experimental outcomes suggest advantages for fundamental listeners in communicative situations where F0 imitation is used as a behavioral cue. Future research needs to determine to what extent auditory perception bias may be related to other individual properties known to improve imitation, such as phonetic talent. PMID:24204361
Rating, ranking, and understanding acoustical quality in university classrooms

NASA Astrophysics Data System (ADS)

Hodgson, Murray

2002-08-01

Nonoptimal classroom acoustical conditions directly affect speech perception and, thus, learning by students. Moreover, they may lead to voice problems for the instructor, who is forced to raise his/her voice when lecturing to compensate for poor acoustical conditions. The project applied previously developed simplified methods to predict speech intelligibility in occupied classrooms from measurements in unoccupied and occupied university classrooms. The methods were used to predict the speech intelligibility at various positions in 279 University of British Columbia (UBC) classrooms, when 70% occupied, and for four instructor voice levels. Classrooms were classified and rank ordered by acoustical quality, as determined by the room-average speech intelligibility. This information was used by UBC to prioritize classrooms for renovation. Here, the statistical results are reported to illustrate the range of acoustical qualities found at a typical university. Moreover, the variations of quality with relevant classroom acoustical parameters were studied to better understand the results. In particular, the factors leading to the best and worst conditions were studied. It was found that 81% of the 279 classrooms have "good," "very good," or "excellent" acoustical quality with a "typical" (average-male) instructor. However, 50 (18%) of the classrooms had "fair" or "poor" quality, and two had "bad" quality, due to high ventilation-noise levels. Most rooms were "very good" or "excellent" at the front, and "good" or "very good" at the back. Speech quality varied strongly with the instructor voice level. In the worst case considered, with a quiet female instructor, most of the classrooms were "bad" or "poor." Quality also varies with occupancy, with decreased occupancy resulting in decreased quality. The research showed that a new classroom acoustical design and renovation should focus on limiting background noise. They should promote high instructor speech levels at the back
Speech Disfluency-dependent Amygdala Activity in Adults Who Stutter: Neuroimaging of Interpersonal Communication in MRI Scanner Environment.

PubMed

Toyomura, Akira; Fujii, Tetsunoshin; Yokosawa, Koichi; Kuriki, Shinya

2018-03-15

Affective states, such as anticipatory anxiety, critically influence speech communication behavior in adults who stutter. However, there is currently little evidence regarding the involvement of the limbic system in speech disfluency during interpersonal communication. We designed this neuroimaging study and experimental procedure to sample neural activity during interpersonal communication between human participants, and to investigate the relationship between the amygdala activity and speech disfluency. Participants were required to engage in live communication with a stranger of the opposite sex in the MRI scanner environment. In the gaze condition, the stranger gazed at the participant without speaking, while in the live conversation condition, the stranger asked questions that the participant was required to answer. The stranger continued to gaze silently at the participant while the participant answered. Adults who stutter reported significantly higher discomfort than fluent controls during the experiment. Activity in the right amygdala, a key anatomical region in the limbic system involved in emotion, was significantly correlated with stuttering occurrences in adults who stutter. Right amygdala activity from pooled data of all participants also showed a significant correlation with discomfort level during the experiment. Activity in the prefrontal cortex, which forms emotion regulation neural circuitry with the amygdala, was lower in adults who stutter than in fluent controls. This is the first study to demonstrate that amygdala activity during interpersonal communication is involved in disfluent speech in adults who stutter. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Phase-Locked Responses to Speech in Human Auditory Cortex are Enhanced During Comprehension

PubMed Central

Peelle, Jonathan E.; Gross, Joachim; Davis, Matthew H.

2013-01-01

A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners’ ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction. PMID:22610394
Phase-locked responses to speech in human auditory cortex are enhanced during comprehension.

PubMed

Peelle, Jonathan E; Gross, Joachim; Davis, Matthew H

2013-06-01

A growing body of evidence shows that ongoing oscillations in auditory cortex modulate their phase to match the rhythm of temporally regular acoustic stimuli, increasing sensitivity to relevant environmental cues and improving detection accuracy. In the current study, we test the hypothesis that nonsensory information provided by linguistic content enhances phase-locked responses to intelligible speech in the human brain. Sixteen adults listened to meaningful sentences while we recorded neural activity using magnetoencephalography. Stimuli were processed using a noise-vocoding technique to vary intelligibility while keeping the temporal acoustic envelope consistent. We show that the acoustic envelopes of sentences contain most power between 4 and 7 Hz and that it is in this frequency band that phase locking between neural activity and envelopes is strongest. Bilateral oscillatory neural activity phase-locked to unintelligible speech, but this cerebro-acoustic phase locking was enhanced when speech was intelligible. This enhanced phase locking was left lateralized and localized to left temporal cortex. Together, our results demonstrate that entrainment to connected speech does not only depend on acoustic characteristics, but is also affected by listeners' ability to extract linguistic information. This suggests a biological framework for speech comprehension in which acoustic and linguistic cues reciprocally aid in stimulus prediction.
Sensitivity to Structure in the Speech Signal by Children with Speech Sound Disorder and Reading Disability

PubMed Central

Johnson, Erin Phinney; Pennington, Bruce F.; Lowenstein, Joanna H.; Nittrouer, Susan

2011-01-01

Purpose Children with speech sound disorder (SSD) and reading disability (RD) have poor phonological awareness, a problem believed to arise largely from deficits in processing the sensory information in speech, specifically individual acoustic cues. However, such cues are details of acoustic structure. Recent theories suggest that listeners also need to be able to integrate those details to perceive linguistically relevant form. This study examined abilities of children with SSD, RD, and SSD+RD not only to process acoustic cues but also to recover linguistically relevant form from the speech signal. Method Ten- to 11-year-olds with SSD (n = 17), RD (n = 16), SSD+RD (n = 17), and Controls (n = 16) were tested to examine their sensitivity to (1) voice onset times (VOT); (2) spectral structure in fricative-vowel syllables; and (3) vocoded sentences. Results Children in all groups performed similarly with VOT stimuli, but children with disorders showed delays on other tasks, although the specifics of their performance varied. Conclusion Children with poor phonemic awareness not only lack sensitivity to acoustic details, but are also less able to recover linguistically relevant forms. This is contrary to one of the main current theories of the relation between spoken and written language development. PMID:21329941
Out-of-synchrony speech entrainment in developmental dyslexia.

PubMed

Molinaro, Nicola; Lizarazu, Mikel; Lallier, Marie; Bourguignon, Mathieu; Carreiras, Manuel

2016-08-01

Developmental dyslexia is a reading disorder often characterized by reduced awareness of speech units. Whether the neural source of this phonological disorder in dyslexic readers results from the malfunctioning of the primary auditory system or damaged feedback communication between higher-order phonological regions (i.e., left inferior frontal regions) and the auditory cortex is still under dispute. Here we recorded magnetoencephalographic (MEG) signals from 20 dyslexic readers and 20 age-matched controls while they were listening to ∼10-s-long spoken sentences. Compared to controls, dyslexic readers had (1) an impaired neural entrainment to speech in the delta band (0.5-1 Hz); (2) a reduced delta synchronization in both the right auditory cortex and the left inferior frontal gyrus; and (3) an impaired feedforward functional coupling between neural oscillations in the right auditory cortex and the left inferior frontal regions. This shows that during speech listening, individuals with developmental dyslexia present reduced neural synchrony to low-frequency speech oscillations in primary auditory regions that hinders higher-order speech processing steps. The present findings, thus, strengthen proposals assuming that improper low-frequency acoustic entrainment affects speech sampling. This low speech-brain synchronization has the strong potential to cause severe consequences for both phonological and reading skills. Interestingly, the reduced speech-brain synchronization in dyslexic readers compared to normal readers (and its higher-order consequences across the speech processing network) appears preserved through the development from childhood to adulthood. Thus, the evaluation of speech-brain synchronization could possibly serve as a diagnostic tool for early detection of children at risk of dyslexia. Hum Brain Mapp 37:2767-2783, 2016. © 2016 Wiley Periodicals, Inc.
The development of pointing perception in infancy: effects of communicative signals on covert shifts of attention.

PubMed

Daum, Moritz M; Ulber, Julia; Gredebäck, Gustaf

2013-10-01

The present study aims to investigate the interplay of verbal and nonverbal communication with respect to infants' perception of pointing gestures. Infants were presented with still images of pointing hands (cue) in combination with an acoustic stimulus. The communicative content of this acoustic stimulus was varied from being human and communicative to artificial. Saccadic reaction times (SRTs) from the cue to a peripheral target were measured as an indicator of the modulation of covert attention. A significant cueing effect (facilitated SRTs for congruent compared with incongruent trials) was only present in a condition with additional communicative and referential speech. In addition, the size of the cueing effect increased the more human and communicative the acoustic stimulus was. This indicates a beneficial effect of verbal communication on the perception of nonverbal communicative pointing gestures, emphasizing the important role of verbal communication in facilitating social understanding across domains. These findings additionally suggest that human and communicative (ostensive) signals are not qualitatively different from other less social signals but just quantitatively the most attention grabbing among a number of other signals.
Using the picture exchange communication system (PECS) with children with autism: assessment of PECS acquisition, speech, social-communicative behavior, and problem behavior.

PubMed

Charlop-Christy, Marjorie H; Carpenter, Michael; Le, Loc; LeBlanc, Linda A; Kellet, Kristen

2002-01-01

The picture exchange communication system (PECS) is an augmentative communication system frequently used with children with autism (Bondy & Frost, 1994; Siegel, 2000; Yamall, 2000). Despite its common clinical use, no well-controlled empirical investigations have been conducted to test the effectiveness of PECS. Using a multiple baseline design, the present study examined the acquisition of PECS with 3 children with autism. In addition, the study examined the effects of PECS training on the emergence of speech in play and academic settings. Ancillary measures of social-communicative behaviors and problem behaviors were recorded. Results indicated that all 3 children met the learning criterion for PECS and showed concomitant increases in verbal speech. Ancillary gains were associated with increases in social-communicative behaviors and decreases in problem behaviors. The results are discussed in terms of the provision of empirical support for PECS as well as the concomitant positive side effects of its use.
Resonant acoustic transducer and driver system for a well drilling string communication system

DOEpatents

Chanson, Gary J.; Nicolson, Alexander M.

1981-01-01

The acoustic data communication system includes an acoustic transmitter and receiver wherein low frequency acoustic waves, propagating in relatively loss free manner in well drilling string piping, are efficiently coupled to the drill string and propagate at levels competitive with the levels of noise generated by drilling machinery also present in the drill string. The transmitting transducer incorporates a mass-spring piezoelectric transmitter and amplifier combination that permits self-oscillating resonant operation in the desired low frequency range.
Call recognition and individual identification of fish vocalizations based on automatic speech recognition: An example with the Lusitanian toadfish.

PubMed

Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C

2015-12-01

The study of acoustic communication in animals often requires not only the recognition of species-specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented, inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover, this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed to recognize other sound types.
Top-down Processes in Simulated Electric-Acoustic Hearing: The Effect of Linguistic Context on Bimodal Benefit for Temporally Interrupted Speech

PubMed Central

Oh, Soo Hee; Donaldson, Gail S.; Kong, Ying-Yee

2016-01-01

Objectives Previous studies have documented the benefits of bimodal hearing as compared with a CI alone, but most have focused on the importance of bottom-up, low-frequency cues. The purpose of the present study was to evaluate the role of top-down processing in bimodal hearing by measuring the effect of sentence context on bimodal benefit for temporally interrupted sentences. It was hypothesized that low-frequency acoustic cues would facilitate the use of contextual information in the interrupted sentences, resulting in greater bimodal benefit for the higher context (CUNY) sentences than for the lower context (IEEE) sentences. Design Young normal-hearing listeners were tested in simulated bimodal listening conditions in which noise band vocoded sentences were presented to one ear with or without low-pass (LP) filtered speech or LP harmonic complexes (LPHCs) presented to the contralateral ear. Speech recognition scores were measured in three listening conditions: vocoder-alone, vocoder combined with LP speech, and vocoder combined with LPHCs. Temporally interrupted versions of the CUNY and IEEE sentences were used to assess listeners’ ability to fill in missing segments of speech by using top-down linguistic processing. Sentences were square-wave gated at a rate of 5 Hz with a 50 percent duty cycle. Three vocoder channel conditions were tested for each type of sentence (8, 12, and 16 channels for CUNY; 12, 16, and 32 channels for IEEE) and bimodal benefit was compared for similar amounts of spectral degradation (matched-channel comparisons) and similar ranges of baseline performance. Two gain measures, percentage-point gain and normalized gain, were examined. Results Significant effects of context on bimodal benefit were observed when LP speech was presented to the residual-hearing ear. For the matched-channel comparisons, CUNY sentences showed significantly higher normalized gains than IEEE sentences for both the 12-channel (20 points higher) and 16-channel (18
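The interruption described above (square-wave gating at 5 Hz with a 50 percent duty cycle) can be pictured as alternating 100 ms "on" and 100 ms "off" segments. The sketch below is a minimal illustration under stated assumptions, not the study's processing chain: the function name is hypothetical, the gate is assumed to start in the "on" phase, and the edges are abrupt rather than ramped.

```python
def square_wave_gate(samples, sample_rate_hz, gate_rate_hz=5.0, duty_cycle=0.5):
    """Silence the 'off' part of every gating period (assumes the gate starts 'on')."""
    period = sample_rate_hz / gate_rate_hz   # samples in one on/off cycle
    on_samples = period * duty_cycle         # samples kept in each cycle
    return [s if (i % period) < on_samples else 0.0
            for i, s in enumerate(samples)]


# One second of a constant hypothetical signal sampled at 1 kHz:
# each 200-sample cycle keeps its first 100 samples.
gated = square_wave_gate([1.0] * 1000, sample_rate_hz=1000)
print(sum(gated))
```

In practice, short onset and offset ramps are often applied to avoid audible clicks; whether the study did so is not stated in the record.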
Acoustic analysis in Mudejar-Gothic churches: Experimental results

NASA Astrophysics Data System (ADS)

Galindo, Miguel; Zamarreño, Teófilo; Girón, Sara

2005-05-01

This paper describes the preliminary results of research work in acoustics, conducted in a set of 12 Mudejar-Gothic churches in the city of Seville in the south of Spain. Despite common architectural style, the churches feature individual characteristics and have volumes ranging from 3947 to 10 708 m³. Acoustic parameters were measured in unoccupied churches according to the ISO-3382 standard. An extensive experimental study was carried out using impulse response analysis through a maximum length sequence measurement system in each church. It covered aspects such as reverberation (reverberation times, early decay times), distribution of sound levels (sound strength), early to late sound energy parameters derived from the impulse responses (center time, clarity for speech, clarity, definition, lateral energy fraction), and speech intelligibility (rapid speech transmission index), which all take both spectral and spatial distribution into account. Background noise was also measured to obtain the NR indices. The study describes the acoustic field inside each temple and establishes a discussion for each one of the acoustic descriptors mentioned by using the theoretical models available and the principles of architectural acoustics. Analysis of the quality of the spaces for music and speech is carried out according to the most widespread criteria for auditoria.
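Several of the early-to-late energy parameters listed above are simple ratios of squared impulse-response energy. The sketch below is a hedged illustration of the usual definitions (clarity with 50 ms and 80 ms limits, definition D50, and center time); it is not the authors' measurement software. The function names and the synthetic exponential decay with a 2 s reverberation time are assumptions, and the impulse response is assumed to start at the arrival of the direct sound.

```python
import math


def early_late_energy(impulse_response, sample_rate_hz, split_ms):
    """Split the squared impulse response into early and late energy at split_ms."""
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    return early, late


def clarity_db(impulse_response, sample_rate_hz, split_ms):
    """Early-to-late energy ratio in dB (C50 for speech, C80 for music)."""
    early, late = early_late_energy(impulse_response, sample_rate_hz, split_ms)
    return 10.0 * math.log10(early / late)


def definition_d50(impulse_response, sample_rate_hz):
    """Fraction of the total energy arriving in the first 50 ms."""
    early, late = early_late_energy(impulse_response, sample_rate_hz, 50.0)
    return early / (early + late)


def center_time_s(impulse_response, sample_rate_hz):
    """Energy-weighted mean arrival time, in seconds."""
    energy = [h * h for h in impulse_response]
    return sum((n / sample_rate_hz) * e for n, e in enumerate(energy)) / sum(energy)


# Hypothetical room: ideal exponential decay whose energy drops 60 dB in 2 s.
fs = 1000
rt_s = 2.0
ir = [math.exp(-6.91 * (n / fs) / rt_s) for n in range(4 * fs)]
c50 = clarity_db(ir, fs, 50.0)
c80 = clarity_db(ir, fs, 80.0)
d50 = definition_d50(ir, fs)
ts = center_time_s(ir, fs)
print(round(c50, 2), round(c80, 2), round(d50, 3), round(ts, 3))
```

Measured impulse responses would normally be octave-band filtered before these ratios are computed, which is how the parameters come to reflect spectral as well as spatial distribution.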
Acoustic analysis in Mudejar-Gothic churches: experimental results.

PubMed

Galindo, Miguel; Zamarreño, Teófilo; Girón, Sara

2005-05-01

This paper describes the preliminary results of research work in acoustics, conducted in a set of 12 Mudejar-Gothic churches in the city of Seville in the south of Spain. Despite common architectural style, the churches feature individual characteristics and have volumes ranging from 3947 to 10 708 m³. Acoustic parameters were measured in unoccupied churches according to the ISO-3382 standard. An extensive experimental study was carried out using impulse response analysis through a maximum length sequence measurement system in each church. It covered aspects such as reverberation (reverberation times, early decay times), distribution of sound levels (sound strength), early to late sound energy parameters derived from the impulse responses (center time, clarity for speech, clarity, definition, lateral energy fraction), and speech intelligibility (rapid speech transmission index), which all take both spectral and spatial distribution into account. Background noise was also measured to obtain the NR indices. The study describes the acoustic field inside each temple and establishes a discussion for each one of the acoustic descriptors mentioned by using the theoretical models available and the principles of architectural acoustics. Analysis of the quality of the spaces for music and speech is carried out according to the most widespread criteria for auditoria.
The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech.

PubMed

Crosse, Michael J; Lalor, Edmund C

2014-04-01

Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.
Examination of time-reversal acoustics in shallow water and applications to noncoherent underwater communications.

PubMed

Smith, Kevin B; Abrantes, Antonio A M; Larraza, Andres

2003-06-01

The shallow water acoustic communication channel is characterized by strong signal degradation caused by multipath propagation and high spatial and temporal variability of the channel conditions. At the receiver, multipath propagation causes intersymbol interference and is considered the most important of the channel distortions. This paper examines the application of time-reversal acoustic (TRA) arrays, i.e., phase-conjugated arrays (PCAs), that generate a spatio-temporal focus of acoustic energy at the receiver location, eliminating distortions introduced by channel propagation. This technique is self-adaptive and automatically compensates for environmental effects and array imperfections without the need to explicitly characterize the environment. An attempt is made to characterize the influences of a PCA design on its focusing properties with particular attention given to applications in noncoherent underwater acoustic communication systems. Due to the PCA spatial diversity focusing properties, PC arrays may have an important role in an acoustic local area network. Each array is able to simultaneously transmit different messages that will focus only at the destination receiver node.
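The focusing property mentioned above can be sketched in a few lines: if a signal that has passed through a multipath channel is time-reversed and sent back through the same channel, the result is the channel's autocorrelation, which peaks sharply at a single instant. The three-path `channel` below is invented for illustration, and the sketch uses a single element and ignores noise, channel variability and the spatial aspect of a real array.

```python
def convolve(a, b):
    """Plain discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Hypothetical multipath impulse response: a direct arrival plus two delayed echoes.
channel = [1.0, 0.0, 0.0, 0.6, 0.0, -0.4]

probe = [1.0]                             # probe pulse sent from the intended receiver
received = convolve(probe, channel)       # what the array element records
retransmit = received[::-1]               # time-reversed (phase-conjugated) signal
focused = convolve(retransmit, channel)   # propagation back through the same channel

# The energy of all paths adds coherently at one sample: the focus.
peak_index = max(range(len(focused)), key=lambda i: abs(focused[i]))
print(peak_index, focused[peak_index])
```

The peak value equals the sum of the squared path gains, while the remaining samples are the smaller sidelobes of the autocorrelation, which is one way to see why the intersymbol interference is reduced at the focus.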
Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

NASA Astrophysics Data System (ADS)

Middag, Catherine; Martens, Jean-Pierre; Van Nuffelen, Gwen; De Bodt, Marc

2009-12-01

It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.
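The error measure quoted at the end of this abstract, the root mean squared error between perceived and computed intelligibility, can be computed as in the sketch below. The four score pairs are invented for illustration and are not data from the paper.

```python
import math

def rmse(perceived, computed):
    """Root mean squared error between two equally long lists of scores."""
    if len(perceived) != len(computed):
        raise ValueError("score lists must have the same length")
    squared = [(p - c) ** 2 for p, c in zip(perceived, computed)]
    return math.sqrt(sum(squared) / len(squared))

# Invented intelligibility scores on a 0 to 100 scale.
perceived = [90.0, 72.0, 55.0, 38.0]
computed = [84.0, 80.0, 49.0, 46.0]

print(round(rmse(perceived, computed), 2))  # about 7.07
```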

Speech endpoint detection with non-language speech sounds for generic speech processing applications

NASA Astrophysics Data System (ADS)

McClain, Matthew; Romanowski, Brian

2009-05-01

Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
Real-Time Communication Support for Underwater Acoustic Sensor Networks †.

PubMed

Santos, Rodrigo; Orozco, Javier; Micheletto, Matias; Ochoa, Sergio F; Meseguer, Roc; Millan, Pere; Molina, And Carlos

2017-07-14

Underwater sensor networks represent an important and promising field of research due to the large diversity of underwater ubiquitous applications that can be supported by these networks, e.g., systems that deliver tsunami and oil spill warnings, or monitor submarine ecosystems. Most of these monitoring and warning systems require real-time communication in wide area networks that have a low density of nodes. The underwater communication medium involved in these networks is very harsh and imposes strong restrictions on the communication process. In this scenario, the real-time transmission of information is done mainly using acoustic signals, since the network nodes are not physically close. The features of the communication scenario and the requirements of the communication process represent major challenges for designers of both communication protocols and monitoring and warning systems. The lack of models to represent these networks is the main stumbling block for the proliferation of underwater ubiquitous systems. This paper presents a real-time communication model for underwater acoustic sensor networks (UW-ASN) that are designed to cover wide areas with a low density of nodes, using any-to-any communication. This model is analytic, considers two solution approaches for scheduling the real-time messages, and provides a time-constraint analysis for the network performance. Using this model, the designers of protocols and underwater ubiquitous systems can quickly prototype and evaluate their solutions in an evolving way, in order to determine the best solution to the problem being addressed. The suitability of the proposal is illustrated with a case study that shows the performance of a UW-ASN under several initial conditions. This is the first analytic model for representing real-time communication in this type of network, and therefore, it opens the door for the development of underwater ubiquitous systems for several application scenarios.
Real-Time Communication Support for Underwater Acoustic Sensor Networks †

PubMed Central

Santos, Rodrigo; Orozco, Javier; Micheletto, Matias

2017-01-01

Underwater sensor networks represent an important and promising field of research due to the large diversity of underwater ubiquitous applications that can be supported by these networks, e.g., systems that deliver tsunami and oil spill warnings, or monitor submarine ecosystems. Most of these monitoring and warning systems require real-time communication in wide area networks that have a low density of nodes. The underwater communication medium involved in these networks is very harsh and imposes strong restrictions on the communication process. In this scenario, the real-time transmission of information is done mainly using acoustic signals, since the network nodes are not physically close. The features of the communication scenario and the requirements of the communication process represent major challenges for designers of both communication protocols and monitoring and warning systems. The lack of models to represent these networks is the main stumbling block for the proliferation of underwater ubiquitous systems. This paper presents a real-time communication model for underwater acoustic sensor networks (UW-ASN) that are designed to cover wide areas with a low density of nodes, using any-to-any communication. This model is analytic, considers two solution approaches for scheduling the real-time messages, and provides a time-constraint analysis for the network performance. Using this model, the designers of protocols and underwater ubiquitous systems can quickly prototype and evaluate their solutions in an evolving way, in order to determine the best solution to the problem being addressed. The suitability of the proposal is illustrated with a case study that shows the performance of a UW-ASN under several initial conditions. This is the first analytic model for representing real-time communication in this type of network, and therefore, it opens the door for the development of underwater ubiquitous systems for several application scenarios. PMID:28708093
Speech interference and transmission on residential balconies with road traffic noise.

PubMed

Naish, Daniel A; Tan, Andy C C; Nur Demirbilek, F

2013-01-01

Balcony acoustic treatments can mitigate the effects of community road traffic noise. To investigate further, a theoretical study into the effects of balcony acoustic treatment combinations on speech interference and transmission is conducted for various street geometries. Nine different balcony types are investigated using a combined specular and diffuse reflection computer model. Diffusion in the model is calculated using the radiosity technique. The balcony types include a standard balcony with or without a ceiling and with various combinations of parapet, ceiling absorption and ceiling shield. A total of 70 balcony and street geometrical configurations are analyzed with each balcony type, resulting in 630 scenarios. In each scenario the reverberation time, speech interference level (SIL) and speech transmission index (STI) are calculated. These indicators are compared to determine trends based on the effects of propagation path, inclusion of opposite buildings and difference with a reference position outside the balcony. The results demonstrate trends in SIL and STI with different balcony types. It is found that an acoustically treated balcony reduces speech interference. A parapet provides the largest improvement, followed by absorption on the ceiling. The largest reductions in speech interference arise when a combination of balcony acoustic treatments is applied.
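As a rough illustration of one of the indicators named above, the speech interference level is commonly defined as the arithmetic mean of the noise levels in the octave bands centred on 500, 1000, 2000 and 4000 Hz. The sketch below assumes that four-band definition, which may differ from the variant used in the study, and the band levels are invented.

```python
def speech_interference_level(band_levels_db):
    """Arithmetic mean of octave-band levels (dB) at 500, 1000, 2000 and 4000 Hz.

    Assumes the common four-band definition of SIL; other variants exist.
    """
    bands = (500, 1000, 2000, 4000)
    return sum(band_levels_db[b] for b in bands) / len(bands)

# Invented traffic-noise octave-band levels (dB) at a balcony position.
untreated = {500: 62.0, 1000: 60.0, 2000: 56.0, 4000: 50.0}
treated = {500: 58.0, 1000: 55.0, 2000: 50.0, 4000: 45.0}

print(speech_interference_level(untreated))  # 57.0
print(speech_interference_level(treated))    # 52.0
```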
TongueToSpeech (TTS): Wearable wireless assistive device for augmented speech.

PubMed

Marjanovic, Nicholas; Piccinini, Giacomo; Kerr, Kevin; Esmailbeigi, Hananeh

2017-07-01

Speech is an important aspect of human communication; individuals with speech impairment are unable to communicate vocally in real time. Our team has developed the TongueToSpeech (TTS) device with the goal of augmenting speech communication for the vocally impaired. The proposed device is a wearable wireless assistive device that incorporates a capacitive touch keyboard interface embedded inside a discreet retainer. This device connects to a computer, tablet or smartphone via a Bluetooth connection. The developed TTS application converts text typed by the tongue into audible speech. Our studies have concluded that an 8-contact point configuration between the tongue and the TTS device would yield the best user precision and speed performance. On average, typing a phrase with the TTS device inside the oral cavity takes 2.5 times longer than typing the same phrase with the pointer finger on a T9 (Text on 9 keys) keyboard configuration. In conclusion, we have developed a discreet noninvasive wearable device that allows vocally impaired individuals to communicate in real time.
Talker variability in audio-visual speech perception

PubMed Central

Heald, Shannon L. M.; Nusbaum, Howard C.

2014-01-01

A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker’s face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred. PMID:25076919
Talker variability in audio-visual speech perception.

PubMed

Heald, Shannon L M; Nusbaum, Howard C

2014-01-01

A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred.
Classroom Acoustics: Understanding Barriers to Learning.

ERIC Educational Resources Information Center

Crandell, Carl C., Ed.; Smaldino, Joseph J., Ed.

2001-01-01

This booklet explores classroom acoustics and their importance on the learning potential of children with hearing loss and related disabilities. The booklet also reviews research on classroom acoustics and the need for the development of classroom acoustics standards. Chapters examine: 1) a speech-perception model demonstrating the linkage between…
A Robust Approach For Acoustic Noise Suppression In Speech Using ANFIS

NASA Astrophysics Data System (ADS)

Martinek, Radek; Kelnar, Michal; Vanus, Jan; Bilik, Petr; Zidek, Jan

2015-11-01

This article deals with the implementation of a combination of fuzzy-system and artificial-intelligence techniques in the application area of non-linear noise and interference suppression. The structure used is called an Adaptive Neuro Fuzzy Inference System (ANFIS). This system finds practical use mainly in audio telephone (mobile) communication in noisy environments (transport, production halls, sports matches, etc.). Experimental methods based on the two-input adaptive noise cancellation concept are outlined. Within the experiments carried out, the authors created, based on the ANFIS structure, a comprehensive system for adaptive suppression of unwanted background interference that occurs in audio communication and degrades the audio signal. The system designed has been tested on real voice signals. The article presents an investigation and comparison of three distinct approaches to noise cancellation in speech: LMS (least mean squares) and RLS (recursive least squares) adaptive filtering, and ANFIS. A careful review of the literature indicated the importance of non-linear adaptive algorithms over linear ones in noise cancellation. It was concluded that the ANFIS approach had the best overall performance, as it efficiently cancelled noise even in highly noise-degraded speech. Results were drawn from the experiments: subjective tests were used to analyse the comparative performance of the approaches, while objective tests were used to validate them. The algorithms were implemented in Matlab to justify the claims and determine their relative performance.
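The two-input adaptive noise cancellation concept referred to above can be sketched with the simplest of the three compared methods, an LMS filter: a primary input carries speech plus filtered noise, a reference input carries the noise alone, and the filter learns to predict and subtract the noise. This is a minimal sketch on synthetic signals, not the authors' Matlab implementation; the sine-wave "speech", the noise path `[0.8, -0.3]`, the tap count and the step size `mu` are all assumptions.

```python
import math
import random

def lms_cancel(primary, reference, taps=4, mu=0.01):
    """Two-input LMS noise canceller: returns the error signal and the final weights."""
    w = [0.0] * taps
    out = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_estimate = sum(wk * xk for wk, xk in zip(w, x))
        e = primary[n] - noise_estimate          # error = cleaned output sample
        w = [wk + 2 * mu * e * xk for wk, xk in zip(w, x)]
        out.append(e)
    return out, w

rng = random.Random(1)
N = 5000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(N)]  # stand-in for speech
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]                      # reference input

# Hypothetical acoustic path from the noise source to the primary microphone.
primary = [speech[n] + 0.8 * noise[n] - 0.3 * (noise[n - 1] if n else 0.0)
           for n in range(N)]

cleaned, weights = lms_cancel(primary, noise)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Compare the last 1000 samples, after the filter has converged.
before = mse(primary[-1000:], speech[-1000:])
after = mse(cleaned[-1000:], speech[-1000:])
print(f"noise power before: {before:.3f}, after: {after:.3f}")
```

The learned weights approach the assumed noise path, which is why the residual noise drops. A linear filter of this kind cannot model a non-linear noise path, which is the case the abstract argues ANFIS handles better.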
Cross-modal interactions during perception of audiovisual speech and nonspeech signals: an fMRI study.

PubMed

Hertrich, Ingo; Dietrich, Susanne; Ackermann, Hermann

2011-01-01

During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream--prior to its fusion with auditory phonological features [Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. Time course of early audiovisual interactions during speech and non-speech central-auditory processing: An MEG study. Journal of Cognitive Neuroscience, 21, 259-274, 2009]. Using functional magnetic resonance imaging, the present follow-up study aims to further elucidate the topographic distribution of visual-phonological operations and audiovisual (AV) interactions during speech perception. Ambiguous acoustic syllables--disambiguated to /pa/ or /ta/ by the visual channel (speaking face)--served as test materials, concomitant with various control conditions (nonspeech AV signals, visual-only and acoustic-only speech, and nonspeech stimuli). (i) Visual speech yielded an AV-subadditive activation of primary auditory cortex and the anterior superior temporal gyrus (STG), whereas the posterior STG responded both to speech and nonspeech motion. (ii) The inferior frontal and the fusiform gyrus of the right hemisphere showed a strong phonetic/phonological impact (differential effects of visual /pa/ vs. /ta/) upon hemodynamic activation during presentation of speaking faces. Taken together with the previous MEG data, these results point at a dual-pathway model of visual speech information processing: On the one hand, access to the auditory system via the anterior supratemporal "what" path may give rise to direct activation of "auditory objects." On the other hand, visual speech information seems to be represented in a right-hemisphere visual working memory, providing a potential basis for later interactions with auditory information such as the McGurk effect.
Auditory-tactile echo-reverberating stuttering speech corrector

NASA Astrophysics Data System (ADS)

Kuniszyk-Jozkowiak, Wieslawa; Adamczyk, Bogdan

1997-02-01

The work presents the construction of a device that transforms speech sounds into acoustical and tactile signals of echo and reverberation. Research has been done on the influence of echo and reverberation, transmitted as acoustic and tactile stimuli, on speech fluency. Introducing the echo or reverberation into the auditory feedback circuit results in a reduction of stuttering. Somewhat smaller, but still significant, corrective effects are observed when using the tactile channel for transmitting the signals. The combined use of the auditory and tactile channels increases their corrective influence on the stutterers' speech. The results of the experiment justify the use of the tactile channel in the stutterers' therapy.
An integrated optical/acoustic communication system for seafloor observatories: A field test of high data rate communications at CORK 857D

NASA Astrophysics Data System (ADS)

Tivey, M.; Farr, N.; Ware, J.; Pontbriand, C.

2010-12-01

We report the successful deployment and testing of an underwater optical communication system that provides high data rate communications over a range of 100 meters from a deep sea borehole observatory located in the northeast Pacific. Optical underwater communications offers many advantages over acoustic or underwater wet mateable connections (UWMC). UWMCs require periodic visits from a submersible or ROV to plug in and download data. Typically, these vehicles cannot perform any other tasks during these download periods - their time on station is limited, restricting the amount of data that can be downloaded. Eliminating the need for UWMCs requires the use of remote communication techniques such as acoustic or optical communications. Optical communications is capable of high data rates up to 10 megabits per second (Mbps) compared to acoustic data rates of 57 Kbps. We have developed an integrated optical/acoustic telemetry system (OTS) that uses an acoustic command system to control a high bandwidth, low latency optical communication system. In July 2010, we used the deep submersible ALVIN to install the Optical Telemetry System (OTS) at CORK 857D. The CORK is instrumented with a thermistor string and pressure sensors that record downhole formation pressures and temperatures within oceanic basement that is pressure sealed from the overlying water column. The seafloor OTS was plugged into the CORK’s existing UWMC to provide an optical and acoustic communication interface and additional data storage and battery power for the CORK to sample at a 1 Hz data rate, an increase over the normal 15 sec data sample rate. Using a CTD-mounted OTS lowered by wire from a surface ship, we established an optical communication link at 100 meters range at rates of 1, 5 and 10 Mbps with no bit errors. Tests were also done to establish the optical range of various data rates and the optical power of the system. After a week, we repeated the CTD-OTS experiment and downloaded 20 Mbytes
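The practical meaning of the data rates quoted in this record can be seen with a short calculation of ideal transfer times for a 20 Mbyte download. The sketch assumes 1 Mbyte = 10^6 bytes, 1 Kbps = 10^3 bits per second, and no protocol overhead or retransmissions, none of which is stated in the abstract.

```python
def transfer_seconds(n_bytes, bits_per_second):
    """Ideal transfer time, ignoring protocol overhead and retransmissions."""
    return n_bytes * 8 / bits_per_second

payload = 20_000_000  # 20 Mbytes, assuming decimal megabytes

# Optical link rates reported in the record, in bits per second.
for rate in (1_000_000, 5_000_000, 10_000_000):
    print(f"optical {rate // 1_000_000} Mbps: {transfer_seconds(payload, rate):.0f} s")

# Acoustic rate quoted for comparison.
acoustic = transfer_seconds(payload, 57_000)
print(f"acoustic 57 Kbps: {acoustic / 60:.1f} min")
```

Under these assumptions the 10 Mbps optical link moves the payload in 16 seconds, while the acoustic link would need roughly three quarters of an hour.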
Auditory-Perceptual Learning Improves Speech Motor Adaptation in Children

PubMed Central

Shiller, Douglas M.; Rochon, Marie-Lyne

2015-01-01

Auditory feedback plays an important role in children’s speech development by providing the child with information about speech outcomes that is used to learn and fine-tune speech motor plans. The use of auditory feedback in speech motor learning has been extensively studied in adults by examining oral motor responses to manipulations of auditory feedback during speech production. Children are also capable of adapting speech motor patterns to perceived changes in auditory feedback; however, it is not known whether their capacity for motor learning is limited by immature auditory-perceptual abilities. Here, the link between speech perceptual ability and the capacity for motor learning was explored in two groups of 5–7-year-old children who underwent a period of auditory perceptual training followed by tests of speech motor adaptation to altered auditory feedback. One group received perceptual training on a speech acoustic property relevant to the motor task while a control group received perceptual training on an irrelevant speech contrast. Learned perceptual improvements led to an enhancement in speech motor adaptation (proportional to the perceptual change) only for the experimental group. The results indicate that children’s ability to perceive relevant speech acoustic properties has a direct influence on their capacity for sensory-based speech motor adaptation. PMID:24842067
Noise and communication: a three-year update.

PubMed

Brammer, Anthony J; Laroche, Chantal

2012-01-01

Noise is omnipresent and impacts us all in many aspects of daily living. Noise can interfere with communication not only in industrial workplaces, but also in other work settings (e.g. open-plan offices, construction, and mining) and within buildings (e.g. residences, arenas, and schools). The interference of noise with communication can have significant social consequences, especially for persons with hearing loss, and may compromise safety (e.g. failure to perceive auditory warning signals), influence worker productivity and learning in children, affect health (e.g. vocal pathology, noise-induced hearing loss), compromise speech privacy, and impact social participation by the elderly. For workers, attempts have been made to: 1) Better define the auditory performance needed to function effectively and to directly measure these abilities when assessing Auditory Fitness for Duty, 2) design hearing protection devices that can improve speech understanding while offering adequate protection against loud noises, and 3) improve speech privacy in open-plan offices. As the elderly are particularly vulnerable to the effects of noise, an understanding of the interplay between auditory, cognitive, and social factors and its effect on speech communication and social participation is also critical. Classroom acoustics and speech intelligibility in children have also gained renewed interest because of the importance of effective speech comprehension in noise on learning. Finally, substantial work has been made in developing models aimed at better predicting speech intelligibility. Despite progress in various fields, the design of alarm signals continues to lag behind advancements in knowledge. This summary of the last three years' research highlights some of the most recent issues for the workplace, for older adults, and for children, as well as the effectiveness of warning sounds and models for predicting speech intelligibility. Suggestions for future work are also discussed.
Children with Speech, Language and Communication Needs: Their Perceptions of Their Quality of Life

ERIC Educational Resources Information Center

Markham, Chris; van Laar, Darren; Gibbard, Deborah; Dean, Taraneh

2009-01-01

Background: This study is part of a programme of research aiming to develop a quantitative measure of quality of life for children with communication needs. It builds on the preliminary findings of Markham and Dean (2006), which described some of the perceptions that parents and carers of children with speech, language and communication needs had…
Visual and acoustic communication in non-human animals: a comparison.

PubMed

Rosenthal, G G; Ryan, M J

2000-09-01

The visual and auditory systems are two major sensory modalities employed in communication. Although communication in these two sensory modalities can serve analogous functions and evolve in response to similar selection forces, the two systems also operate under different constraints imposed by the environment and the degree to which these sensory modalities are recruited for non-communication functions. Also, the research traditions in each tend to differ, with studies of mechanisms of acoustic communication tending to take a more reductionist tack often concentrating on single signal parameters, and studies of visual communication tending to be more concerned with multivariate signal arrays in natural environments and higher level processing of such signals. Each research tradition would benefit by being more expansive in its approach.
Time-frequency feature representation using multi-resolution texture analysis and acoustic activity detector for real-life speech emotion recognition.

PubMed

Wang, Kun-Ching

2015-01-14

The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). The purpose of this paper is to present a novel feature extraction method based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the multi-resolution texture properties of the emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can give a clearer discrimination between emotions than uniform-resolution texture analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied within the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features can also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification for real-life emotional recognition in speech.
Time-Frequency Feature Representation Using Multi-Resolution Texture Analysis and Acoustic Activity Detector for Real-Life Speech Emotion Recognition

PubMed Central

Wang, Kun-Ching

2015-01-01

The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). The purpose of this paper is to present a novel feature extraction method based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the multi-resolution texture properties of the emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can give a clearer discrimination between emotions than uniform-resolution texture analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied within the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features can also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification for real-life emotional recognition in speech. PMID:25594590
Using the picture exchange communication system (PECS) with children with autism: assessment of PECS acquisition, speech, social-communicative behavior, and problem behavior.

PubMed Central

Charlop-Christy, Marjorie H; Carpenter, Michael; Le, Loc; LeBlanc, Linda A; Kellet, Kristen

2002-01-01

The picture exchange communication system (PECS) is an augmentative communication system frequently used with children with autism (Bondy & Frost, 1994; Siegel, 2000; Yamall, 2000). Despite its common clinical use, no well-controlled empirical investigations have been conducted to test the effectiveness of PECS. Using a multiple baseline design, the present study examined the acquisition of PECS with 3 children with autism. In addition, the study examined the effects of PECS training on the emergence of speech in play and academic settings. Ancillary measures of social-communicative behaviors and problem behaviors were recorded. Results indicated that all 3 children met the learning criterion for PECS and showed concomitant increases in verbal speech. Ancillary gains were associated with increases in social-communicative behaviors and decreases in problem behaviors. The results are discussed in terms of the provision of empirical support for PECS as well as the concomitant positive side effects of its use. PMID:12365736
Why would Musical Training Benefit the Neural Encoding of Speech? The OPERA Hypothesis.

PubMed

Patel, Aniruddh D

2011-01-01

Mounting evidence suggests that musical training benefits the neural encoding of speech. This paper offers a hypothesis specifying why such benefits occur. The "OPERA" hypothesis proposes that such benefits are driven by adaptive plasticity in speech-processing networks, and that this plasticity occurs when five conditions are met. These are: (1) Overlap: there is anatomical overlap in the brain networks that process an acoustic feature used in both music and speech (e.g., waveform periodicity, amplitude envelope), (2) Precision: music places higher demands on these shared networks than does speech, in terms of the precision of processing, (3) Emotion: the musical activities that engage this network elicit strong positive emotion, (4) Repetition: the musical activities that engage this network are frequently repeated, and (5) Attention: the musical activities that engage this network are associated with focused attention. According to the OPERA hypothesis, when these conditions are met neural plasticity drives the networks in question to function with higher precision than needed for ordinary speech communication. Yet since speech shares these networks with music, speech processing benefits. The OPERA hypothesis is used to account for the observed superior subcortical encoding of speech in musically trained individuals, and to suggest mechanisms by which musical training might improve linguistic reading abilities.

A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech

PubMed Central

Tóth, László; Hoffmann, Ildikó; Gosztolya, Gábor; Vincze, Veronika; Szatlóczki, Gréta; Bánréti, Zoltán; Pákáski, Magdolna; Kálmán, János

2018-01-01

Background: Even today the reliable diagnosis of the prodromal stages of Alzheimer’s disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production while performing a memory task. In the future, this can form the basis of an Internet-based interactive screening software for the recognition of MCI. Methods: Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. Spontaneous speech was provoked by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and by answering one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech signals, first manually (using the Praat software), and then automatically, with an automatic speech recognition (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. Results: The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process, that is, using the ASR-based features in combination with machine learning, was able to separate the two classes with an F1-score of 78
A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech.

PubMed

Toth, Laszlo; Hoffmann, Ildiko; Gosztolya, Gabor; Vincze, Veronika; Szatloczki, Greta; Banreti, Zoltan; Pakaski, Magdolna; Kalman, Janos

2018-01-01

Even today the reliable diagnosis of the prodromal stages of Alzheimer's disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production while performing a memory task. In the future, this can form the basis of an Internet-based interactive screening software for the recognition of MCI. Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. Spontaneous speech was provoked by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and by answering one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech signals, first manually (using the Praat software), and then automatically, with an automatic speech recognition (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process - that is, using the ASR-based features in combination with machine learning - was able to separate the two classes with an F1-score of 78.8%. The temporal analysis of spontaneous speech
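The temporal parameters named in this record can be illustrated with a small sketch. The abstract gives no formulas, so the definitions below are assumptions for illustration only: hesitation ratio as pause time over total time, speech tempo as phones per second including pauses, and articulation rate as phones per second excluding pauses. The `Segment` type and its labels are hypothetical, and the F1 helper is the standard harmonic mean of precision and recall.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str       # hypothetical labels: "speech", "silent_pause", "filled_pause"
    start: float     # segment start, in seconds
    end: float       # segment end, in seconds
    phones: int = 0  # number of phones, only meaningful for speech segments


def temporal_features(segments):
    """Compute assumed temporal parameters from a time-aligned segmentation."""
    total = sum(s.end - s.start for s in segments)
    pause = sum(s.end - s.start for s in segments if s.label != "speech")
    phones = sum(s.phones for s in segments)
    return {
        "hesitation_ratio": pause / total,             # share of time spent pausing
        "speech_tempo": phones / total,                # phones per second, pauses included
        "articulation_rate": phones / (total - pause), # phones per second, pauses excluded
        "n_pauses": sum(1 for s in segments if s.label != "speech"),
    }


def f1_score(tp, fp, fn):
    """Standard F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up segmentation, not data from the study.
example = [
    Segment("speech", 0.0, 2.0, phones=20),
    Segment("silent_pause", 2.0, 3.0),
    Segment("speech", 3.0, 5.0, phones=20),
]
features = temporal_features(example)
```

In practice the segmentation would come from manual annotation or an ASR alignment, and the resulting feature vector per speaker would be passed to a classifier.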
Speech Intelligibility and Marital Communication in Amyotrophic Lateral Sclerosis: An Exploratory Study

ERIC Educational Resources Information Center

Joubert, Karin; Bornman, Juan; Alant, Erna

2011-01-01

Amyotrophic lateral sclerosis (ALS), a rapidly progressive neuromuscular disease, has a devastating impact not only on individuals diagnosed with ALS but also their spouses. Speech intelligibility, often compromised as a result of dysarthria, affects the couple's ability to maintain effective, intimate communication. The purpose of this…
A Comparative Analysis of Fluent and Cerebral Palsied Speech.

NASA Astrophysics Data System (ADS)

van Doorn, Janis Lee

Several features of the acoustic waveforms of fluent and cerebral palsied speech were compared, using six fluent and seven cerebral palsied subjects, with a major emphasis being placed on an investigation of the trajectories of the first three formants (vocal tract resonances). To provide an overall picture which included other acoustic features, fundamental frequency, intensity, speech timing (speech rate and syllable duration), and prevocalization (vocalization prior to initial stop consonants found in cerebral palsied speech) were also investigated. Measurements were made using repetitions of a test sentence which was chosen because it required large excursions of the speech articulators (lips, tongue and jaw), so that differences in the formant trajectories for the fluent and cerebral palsied speakers would be emphasized. The acoustic features were all extracted from the digitized speech waveform (10 kHz sampling rate): the fundamental frequency contours were derived manually, the intensity contours were measured using the signal covariance, speech rate and syllable durations were measured manually, as were the prevocalization durations, while the formant trajectories were derived from short time spectra which were calculated for each 10 ms of speech using linear prediction analysis. Differences which were found in the acoustic features can be summarized as follows. For cerebral palsied speakers, the fundamental frequency contours generally showed inappropriate exaggerated fluctuations, as did some of the intensity contours; the mean fundamental frequencies were either higher or the same as for the fluent subjects; speech rates were reduced, and syllable durations were longer; prevocalization was consistently present at the beginning of the test sentence; formant trajectories were found to have overall reduced frequency ranges, and to contain anomalous transitional features, but it is noteworthy that for any one cerebral palsied subject, the inappropriate
First Annual Progress Report on Transmission of Information by Acoustic Communication along Metal Pathways in Nuclear Facilities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heifetz, A.; Bakhtiari, S.; Huang, X.

The objective of this project is to develop and demonstrate methods for transmission of information in nuclear facilities by acoustic means along existing in-place metal piping infrastructure. Pipes are omnipresent in a nuclear facility, and penetrate enclosures and partitions, such as the containment building wall. In the envisioned acoustic communication (AC) system, packets of information will be transmitted as guided acoustic waves along pipes. Performance of AC hardware and network protocols for efficient and secure communications under development in this project will be eventually evaluated in a representative nuclear power plant environment. Research efforts in the first year of this project have been focused on identification of appropriate transducers, and evaluation of their performance for information transmission along nuclear-grade metallic pipes. COMSOL computer simulations were performed to study acoustic wave generation, propagation, and attenuation on pipes. An experimental benchtop system was used to evaluate signal attenuation and spectral dispersion using piezo-electric transducers (PZTs) and electromagnetic acoustic transducers (EMATs). Communication protocols under evaluation consisted of on-off keying (OOK) signal modulation, in particular amplitude shift keying (ASK) and phase shift keying (PSK). Tradeoffs between signal power and communication data rate were considered for ASK and PSK coding schemes.
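The keying schemes mentioned in this record can be sketched in a few lines. This is a minimal illustration, not the project's protocol: it assumes an ideal, noise-free channel, and the carrier frequency, sample rate, and bit length are hypothetical values chosen so that each bit spans a whole number of carrier cycles. OOK switches the carrier on and off, while binary PSK flips its phase by 180 degrees.

```python
import math


def modulate(bits, scheme, carrier_hz=1000.0, fs=16000.0, samples_per_bit=64):
    """Map bits onto a sinusoidal carrier using OOK or binary PSK."""
    out = []
    for i, bit in enumerate(bits):
        for k in range(samples_per_bit):
            n = i * samples_per_bit + k
            carrier = math.sin(2 * math.pi * carrier_hz * n / fs)
            if scheme == "ook":
                out.append(carrier if bit else 0.0)       # carrier on / carrier off
            elif scheme == "bpsk":
                out.append(carrier if bit else -carrier)  # 0 or 180 degree phase
            else:
                raise ValueError(f"unknown scheme: {scheme}")
    return out


def demodulate(signal, scheme, carrier_hz=1000.0, fs=16000.0, samples_per_bit=64):
    """Coherent detection: correlate each bit interval with the reference carrier."""
    bits = []
    for i in range(len(signal) // samples_per_bit):
        corr = 0.0
        for k in range(samples_per_bit):
            n = i * samples_per_bit + k
            corr += signal[n] * math.sin(2 * math.pi * carrier_hz * n / fs)
        if scheme == "ook":
            # A full "on" bit correlates to samples_per_bit / 2; threshold at half of that.
            bits.append(1 if corr > samples_per_bit / 4 else 0)
        else:
            bits.append(1 if corr > 0 else 0)
    return bits


message = [1, 0, 1, 1, 0]
recovered_ook = demodulate(modulate(message, "ook"), "ook")
recovered_bpsk = demodulate(modulate(message, "bpsk"), "bpsk")
```

For the same peak amplitude, the two BPSK symbols are twice as far apart as the two OOK symbols, which is one way to see the power versus data-rate tradeoff the record refers to.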
Children with Autistic Spectrum Disorders and Speech-Generating Devices: Communication in Different Activities at Home

ERIC Educational Resources Information Center

Thunberg, Gunilla; Ahlsen, Elisabeth; Sandberg, Annika Dahlgren

2007-01-01

The communication of four children with autistic spectrum disorder was investigated when they were supplied with a speech-generating device (SGD) in three different activities in their home environment: mealtime, story reading and "sharing experiences of the preschool day". An activity based communication analysis, in which collective and…
[Nature of speech disorders in Parkinson disease].

PubMed

Pawlukowska, W; Honczarenko, K; Gołąb-Janowska, M

2013-01-01

The aim of the study was to discuss the physiology and pathology of speech and to review the literature on speech disorders in Parkinson disease. The most effective methods for diagnosing speech disorders in Parkinson disease are also highlighted. Articulatory, respiratory, acoustic and pragmatic factors contributing to the exacerbation of the speech disorders are then discussed. Finally, the study covers the most important types of speech treatment available (pharmacological and behavioral) and highlights the significance of Lee Silverman Voice Treatment.
Subcortical processing of speech regularities underlies reading and music aptitude in children.

PubMed

Strait, Dana L; Hornickel, Jane; Kraus, Nina

2011-10-17

Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input. Definition of common biological underpinnings
Subcortical processing of speech regularities underlies reading and music aptitude in children

PubMed Central

2011-01-01

Background: Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods: We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results: Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions: These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input
Speech intelligibility with helicopter noise: tests of three helmet-mounted communication systems.

PubMed

Ribera, John E; Mozo, Ben T; Murphy, Barbara A

2004-02-01

Military aviator helmet communications systems are designed to enhance speech intelligibility (SI) in background noise and reduce exposure to harmful levels of noise. Some aviators, over the course of their aviation career, develop noise-induced hearing loss that may affect their ability to perform required tasks. New technology can improve SI in noise for aviators with normal hearing as well as those with hearing loss. SI in noise scores were obtained from 40 rotary-wing aviators (20 with normal hearing and 20 with hearing-loss waivers). Three communications systems were evaluated: a standard SPH-4B, an SPH-4B aviator helmet modified with communications earplug (CEP), and an SPH-4B modified with active noise reduction (ANR). Subjects' SI was better in noise with newer technologies than with the standard issue aviator helmet. A significant number of aviators on waivers for hearing loss performed within the range of their normal hearing counterparts when wearing the newer technology. The rank order of perceived speech clarity was 1) CEP, 2) ANR, and 3) unmodified SPH-4B. To ensure optimum SI in noise for rotary-wing aviators, consideration should be given to retrofitting existing aviator helmets with new technology, and incorporating such advances in communication systems of the future. Review of standards for determining fitness to fly is needed.
Single- and multi-channel underwater acoustic communication channel capacity: a computational study.

PubMed

Hayward, Thomas J; Yang, T C

2007-09-01

Acoustic communication channel capacity determines the maximum data rate that can be supported by an acoustic channel for a given source power and source/receiver configuration. In this paper, broadband acoustic propagation modeling is applied to estimate the channel capacity for a time-invariant shallow-water waveguide for a single source-receiver pair and for vertical source and receiver arrays. Without bandwidth constraints, estimated single-input, single-output (SISO) capacities approach 10 megabits/s at 1 km range, but beyond 2 km range they decay at a rate consistent with previous estimates by Peloquin and Leinhos (unpublished, 1997), which were based on a sonar equation calculation. Channel capacities subject to source bandwidth constraints are approximately 30-90% lower than for the unconstrained case, and exhibit a significant wind speed dependence. Channel capacity is investigated for single-input, multi-output (SIMO) and multi-input, multi-output (MIMO) systems, both for finite arrays and in the limit of a dense array spanning the entire water column. The limiting values of the SIMO and MIMO channel capacities for the modeled environment are found to be about four times higher and up to 200-400 times higher, respectively, than for the SISO case. Implications for underwater acoustic communication systems are discussed.
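For orientation, the capacity of a single-input, single-output channel with Gaussian noise is commonly written as the integral of log2(1 + SNR(f)) over the signal band (the Shannon formula). The sketch below evaluates that integral numerically for a tabulated SNR curve. It is an assumption-laden illustration, not the propagation modeling used in the paper; the function name and the sample values are hypothetical.

```python
import math


def channel_capacity_bps(freqs_hz, snr_linear):
    """Trapezoidal estimate of C = integral of log2(1 + SNR(f)) df, in bits per second.

    freqs_hz   -- increasing frequency samples across the band, in Hz
    snr_linear -- signal-to-noise ratio at each frequency, as a linear ratio (not dB)
    """
    capacity = 0.0
    for i in range(len(freqs_hz) - 1):
        df = freqs_hz[i + 1] - freqs_hz[i]
        left = math.log2(1.0 + snr_linear[i])
        right = math.log2(1.0 + snr_linear[i + 1])
        capacity += 0.5 * (left + right) * df
    return capacity


# Flat SNR of 3 (about 4.8 dB) across a 1 kHz band: 1000 * log2(4) = 2000 bit/s.
flat_capacity = channel_capacity_bps([0.0, 500.0, 1000.0], [3.0, 3.0, 3.0])
```

A bandwidth constraint on the source corresponds to narrowing the integration range, which is why constrained capacities fall below the unconstrained estimate.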
Electrocorticographic representations of segmental features in continuous speech

PubMed Central

Lotte, Fabien; Brumberg, Jonathan S.; Brunner, Peter; Gunduz, Aysegul; Ritaccio, Anthony L.; Guan, Cuntai; Schalk, Gerwin

2015-01-01

Acoustic speech output results from coordinated articulation of dozens of muscles, bones and cartilages of the vocal mechanism. While we commonly take the fluency and speed of our speech productions for granted, the neural mechanisms facilitating the requisite muscular control are not completely understood. Previous neuroimaging and electrophysiology studies of speech sensorimotor control have typically concentrated on speech sounds (i.e., phonemes, syllables and words) in isolation; sentence-length investigations have largely been used to inform coincident linguistic processing. In this study, we examined the neural representations of segmental features (place and manner of articulation, and voicing status) in the context of fluent, continuous speech production. We used recordings from the cortical surface [electrocorticography (ECoG)] to simultaneously evaluate the spatial topography and temporal dynamics of the neural correlates of speech articulation that may mediate the generation of hypothesized gestural or articulatory scores. We found that the representation of place of articulation involved broad networks of brain regions during all phases of speech production: preparation, execution and monitoring. In contrast, manner of articulation and voicing status were dominated by auditory cortical responses after speech had been initiated. These results provide a new insight into the articulatory and auditory processes underlying speech production in terms of their motor requirements and acoustic correlates. PMID:25759647
Identifying the Challenges and Opportunities to Meet the Needs of Children with Speech, Language and Communication Difficulties

ERIC Educational Resources Information Center

Dockrell, Julie E.; Howell, Peter

2015-01-01

The views of experienced educational practitioners were examined with respect to the terminology used to describe children with speech, language and communication needs (SLCN), associated problems and the impact of speech and language difficulties in the classroom. Results showed that education staff continue to experience challenges with the…
Criterion Referenced Measurement in Speech-Communication Classrooms: Panacea for Mediocrity. Research Report.

ERIC Educational Resources Information Center

Buley, Jerry L.

The philosophical underpinnings of the typical testing practices of speech communication teachers in regard to norm-referenced measurement contain several assumptions which teachers may find untenable on closer inspection. Some of the consequences of these assumptions are a waste of human potential, inefficient use of instructional expertise,…
Teaching Speech Communication in a Black College: Does Technology Make a Difference?

ERIC Educational Resources Information Center

Nwadike, Fellina O.; Ekeanyanwu, Nnamdi T.

2011-01-01

Teaching a speech communication course in typical HBCUs (historically black colleges and universities) comes with many issues, because the application of technology in some minority institutions differs. The levels of acceptability as well as affordability are also core issues that affect application. Using technology in the classroom means many…
Dog-directed speech: why do we use it and do dogs pay attention to it?

PubMed Central

Ben-Aderet, Tobey; Gallego-Abenza, Mario

2017-01-01

Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. PMID:28077769
Dog-directed speech: why do we use it and do dogs pay attention to it?

PubMed

Ben-Aderet, Tobey; Gallego-Abenza, Mario; Reby, David; Mathevon, Nicolas

2017-01-11

Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. © 2017 The Author(s).
Voice Modulations in German Ironic Speech

ERIC Educational Resources Information Center

Scharrer, Lisa; Christmann, Ursula; Knoll, Monja

2011-01-01

Previous research has shown that in different languages ironic speech is acoustically modulated compared to literal speech, and these modulations are assumed to aid the listener in the comprehension process by acting as cues that mark utterances as ironic. The present study was conducted to identify paraverbal features of German "ironic…
Effect of body position on vocal tract acoustics: Acoustic pharyngometry and vowel formants.

PubMed

Vorperian, Houri K; Kurtzweil, Sara L; Fourakis, Marios; Kent, Ray D; Tillman, Katelyn K; Austin, Diane

2015-08-01

The anatomic basis and articulatory features of speech production are often studied with imaging studies that are typically acquired in the supine body position. It is important to determine if changes in body orientation to the gravitational field alter vocal tract dimensions and speech acoustics. The purpose of this study was to assess the effect of body position (upright versus supine) on (1) oral and pharyngeal measurements derived from acoustic pharyngometry and (2) acoustic measurements of fundamental frequency (F0) and the first four formant frequencies (F1-F4) for the quadrilateral point vowels. Data were obtained for 27 male and female participants, aged 17 to 35 yrs. Acoustic pharyngometry showed a statistically significant effect of body position on volumetric measurements, with smaller values in the supine than upright position, but no changes in length measurements. Acoustic analyses of vowels showed significantly larger values in the supine than upright position for the variables of F0, F3, and the Euclidean distance from the centroid to each corner vowel in the F1-F2-F3 space. Changes in body position affected measurements of vocal tract volume but not length. Body position also affected the aforementioned acoustic variables, but the main vowel formants were preserved.
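The centroid-distance measure in this record can be illustrated as follows. This is a sketch under assumptions: the centroid is taken as the arithmetic mean of the corner vowels in F1-F2-F3 space, and the formant values are made-up placeholders rather than data from the study.

```python
import math


def centroid_distances(formants):
    """Return each vowel's Euclidean distance to the centroid of all vowels.

    formants -- dict mapping a vowel label to a tuple of formant values in Hz,
                e.g. (F1, F2, F3); all tuples must have the same length.
    """
    points = list(formants.values())
    dims = len(points[0])
    centroid = tuple(sum(p[d] for p in points) / len(points) for d in range(dims))
    return {vowel: math.dist(point, centroid) for vowel, point in formants.items()}


# Hypothetical corner-vowel formants (Hz), for illustration only.
upright = {
    "i": (300.0, 2300.0, 3000.0),
    "ae": (700.0, 1800.0, 2500.0),
    "a": (750.0, 1100.0, 2500.0),
    "u": (320.0, 900.0, 2300.0),
}
distances = centroid_distances(upright)
```

Running the same function on measurements from both body positions would allow a per-vowel comparison of how far each corner vowel sits from the center of the vowel space.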
Studies in automatic speech recognition and its application in aerospace

NASA Astrophysics Data System (ADS)

Taylor, Michael Robinson

Human communication is characterized in terms of the spectral and temporal dimensions of speech waveforms. Electronic speech recognition strategies based on Dynamic Time Warping and Markov Model algorithms are described and typical digit recognition error rates are tabulated. The application of Direct Voice Input (DVI) as an interface between man and machine is explored within the context of civil and military aerospace programmes. Sources of physical and emotional stress affecting speech production within military high performance aircraft are identified. Experimental results are reported which quantify fundamental frequency and coarse temporal dimensions of male speech as a function of the vibration, linear acceleration and noise levels typical of aerospace environments; preliminary indications of acoustic phonetic variability reported by other researchers are summarized. Connected whole-word pattern recognition error rates are presented for digits spoken under controlled Gz sinusoidal whole-body vibration. Correlations are made between significant increases in recognition error rate and resonance of the abdomen-thorax and head subsystems of the body. The phenomenon of vibrato style speech produced under low frequency whole-body Gz vibration is also examined. Interactive DVI system architectures and avionic data bus integration concepts are outlined together with design procedures for the efficient development of pilot-vehicle command and control protocols.
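Dynamic Time Warping, one of the two recognition strategies named in this record, aligns two sequences that differ in speaking rate before comparing them. The sketch below is a textbook formulation, not the system described in the report: it uses scalar features with an absolute-difference cost for brevity (real recognizers compare spectral feature vectors per frame), and the templates and labels are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW: minimum cumulative cost of aligning sequence a with sequence b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stay on b[j-1], advance in a
                cost[i][j - 1],      # stay on a[i-1], advance in b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Whole-word template matching: return the label of the nearest template."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Hypothetical one-dimensional "feature tracks" for two digit templates.
templates = {"one": [1.0, 3.0, 1.0], "two": [5.0, 5.0, 2.0]}
slow_one = [1.0, 1.0, 3.0, 3.0, 1.0]  # the "one" pattern spoken more slowly
best = recognize(slow_one, templates)
```

Because the warping path may repeat frames, the slowed utterance still matches its template with zero cost, which is the property that makes DTW tolerant of timing variation.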

An Acoustic Study of the Relationships among Neurologic Disease, Dysarthria Type, and Severity of Dysarthria

ERIC Educational Resources Information Center

Kim, Yunjung; Kent, Raymond D.; Weismer, Gary

2011-01-01

Purpose: This study examined acoustic predictors of speech intelligibility in speakers with several types of dysarthria secondary to different diseases and conducted classification analysis solely by acoustic measures according to 3 variables (disease, speech severity, and dysarthria type). Method: Speech recordings from 107 speakers with…
Freedom of Speech Newsletter, September, 1975.

ERIC Educational Resources Information Center

Allen, Winfred G., Jr., Ed.

The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…
The Mechanism of Speech Processing in Congenital Amusia: Evidence from Mandarin Speakers

PubMed Central

Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

2012-01-01

Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results. PMID:22347374
The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

PubMed

Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

2012-01-01

Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.
Lip Movement Exaggerations During Infant-Directed Speech

PubMed Central

Green, Jordan R.; Nip, Ignatius S. B.; Wilson, Erin M.; Mefferd, Antje S.; Yunusova, Yana

2011-01-01

Purpose: Although a growing body of literature has identified the positive effects of visual speech on speech and language learning, oral movements of infant-directed speech (IDS) have rarely been studied. This investigation used 3-dimensional motion capture technology to describe how mothers modify their lip movements when talking to their infants. Method: Lip movements were recorded from 25 mothers as they spoke to their infants and other adults. Lip shapes were analyzed for differences across speaking conditions. The maximum fundamental frequency, duration, acoustic intensity, and first and second formant frequency of each vowel also were measured. Results: Lip movements were significantly larger during IDS than during adult-directed speech, although the exaggerations were vowel specific. All of the vowels produced during IDS were characterized by an elevated vocal pitch and a slowed speaking rate when compared with vowels produced during adult-directed speech. Conclusion: The pattern of lip-shape exaggerations did not provide support for the hypothesis that mothers produce exemplar visual models of vowels during IDS. Future work is required to determine whether the observed increases in vertical lip aperture engender visual and acoustic enhancements that facilitate the early learning of speech. PMID:20699342
Virtual acoustics displays

NASA Astrophysics Data System (ADS)

Wenzel, Elizabeth M.; Fisher, Scott S.; Stone, Philip K.; Foster, Scott H.

1991-03-01

The real time acoustic display capabilities are described which were developed for the Virtual Environment Workstation (VIEW) Project at NASA-Ames. The acoustic display is capable of generating localized acoustic cues in real time over headphones. An auditory symbology, a related collection of representational auditory 'objects' or 'icons', can be designed using ACE (Auditory Cue Editor), which links both discrete and continuously varying acoustic parameters with information or events in the display. During a given display scenario, the symbology can be dynamically coordinated in real time with 3-D visual objects, speech, and gestural displays. The types of displays feasible with the system range from simple warnings and alarms to the acoustic representation of multidimensional data or events.
Virtual acoustics displays

NASA Technical Reports Server (NTRS)

Wenzel, Elizabeth M.; Fisher, Scott S.; Stone, Philip K.; Foster, Scott H.

1991-01-01

The real time acoustic display capabilities are described which were developed for the Virtual Environment Workstation (VIEW) Project at NASA-Ames. The acoustic display is capable of generating localized acoustic cues in real time over headphones. An auditory symbology, a related collection of representational auditory 'objects' or 'icons', can be designed using ACE (Auditory Cue Editor), which links both discrete and continuously varying acoustic parameters with information or events in the display. During a given display scenario, the symbology can be dynamically coordinated in real time with 3-D visual objects, speech, and gestural displays. The types of displays feasible with the system range from simple warnings and alarms to the acoustic representation of multidimensional data or events.
Using listening difficulty ratings of conditions for speech communication in rooms

NASA Astrophysics Data System (ADS)

Sato, Hiroshi; Bradley, John S.; Morimoto, Masayuki

2005-03-01

The use of listening difficulty ratings of speech communication in rooms is explored because, in common situations, word recognition scores do not discriminate well among conditions that are near to acceptable. In particular, the benefits of early reflections of speech sounds on listening difficulty were investigated and compared to the known benefits to word intelligibility scores. Listening tests were used to assess word intelligibility and perceived listening difficulty of speech in simulated sound fields. The experiments were conducted in three types of sound fields with constant levels of ambient noise: only direct sound, direct sound with early reflections, and direct sound with early reflections and reverberation. The results demonstrate that (1) listening difficulty can better discriminate among these conditions than can word recognition scores; (2) added early reflections increase the effective signal-to-noise ratio equivalent to the added energy in the conditions without reverberation; (3) the benefit of early reflections on difficulty scores is greater than expected from the simple increase in early arriving speech energy with reverberation; (4) word intelligibility tests are most appropriate for conditions with signal-to-noise (S/N) ratios less than 0 dBA, and where S/N is between 0 and 15 dBA, listening difficulty is a more appropriate evaluation tool.
Speech research: A report on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications

NASA Astrophysics Data System (ADS)

Liberman, A. M.

1980-06-01

This report (1 April - 30 June) is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: The perceptual equivalence of two acoustic cues for a speech contrast is specific to phonetic perception; Duplex perception of acoustic patterns as speech and nonspeech; Evidence for phonetic processing of cues to place of articulation: Perceived manner affects perceived place; Some articulatory correlates of perceptual isochrony; Effects of utterance continuity on phonetic judgments; Laryngeal adjustments in stuttering: A glottographic observation using a modified reaction paradigm; Missing -ing in reading: Letter detection errors on word endings; Speaking rate, syllable stress, and vowel identity; Sonority and syllabicity: Acoustic correlates of perception; Influence of vocalic context on perception of the (S)-(s) distinction.
Profiling Early Socio-Communicative Development in Five Young Girls with the Preserved Speech Variant of Rett Syndrome

ERIC Educational Resources Information Center

Marschik, Peter B.; Kaufmann, Walter E.; Einspieler, Christa; Bartl-Pokorny, Katrin D.; Wolin, Thomas; Pini, Giorgio; Budimirovic, Dejan B.; Zappella, Michele; Sigafoos, Jeff

2012-01-01

Rett syndrome (RTT) is a developmental disorder characterized by regression of purposeful hand skills and spoken language, although some affected children retain some ability to speak. We assessed the communicative abilities of five young girls, who were later diagnosed with the preserved speech variant of RTT, during the pre-regression period…
"… Trial and error …": Speech-language pathologists' perspectives of working with Indigenous Australian adults with acquired communication disorders.

PubMed

Cochrane, Frances Clare; Brown, Louise; Siyambalapitiya, Samantha; Plant, Christopher

2016-10-01

This study explored speech-language pathologists' (SLPs) perspectives about factors that influence clinical management of Aboriginal and Torres Strait Islander adults with acquired communication disorders (e.g. aphasia, motor speech disorders). Using a qualitative phenomenological approach, seven SLPs working in North Queensland, Australia with experience working with this population participated in semi-structured in-depth interviews. Qualitative content analysis was used to identify categories and overarching themes within the data. Four categories, in relation to barriers and facilitators, were identified from participants' responses: (1) The Practice Context; (2) Working Together; (3) Client Factors; and (4) Speech-Language Pathologist Factors. Three overarching themes were also found to influence effective speech pathology services: (1) Aboriginal and Torres Strait Islander Cultural Practices; (2) Information and Communication; and (3) Time. This study identified many complex and inter-related factors which influenced SLPs' effective clinical management of this caseload. The findings suggest that SLPs should employ a flexible, holistic and collaborative approach in order to facilitate effective clinical management with Aboriginal and Torres Strait Islander people with acquired communication disorders.
Development of a test battery for evaluating speech perception in complex listening environments.

PubMed

Brungart, Douglas S; Sheffield, Benjamin M; Kubli, Lina R

2014-08-01

In the real world, spoken communication occurs in complex environments that involve audiovisual speech cues, spatially separated sound sources, reverberant listening spaces, and other complicating factors that influence speech understanding. However, most clinical tools for assessing speech perception are based on simplified listening environments that do not reflect the complexities of real-world listening. In this study, speech materials from the QuickSIN speech-in-noise test by Killion, Niquette, Gudmundsen, Revit, and Banerjee [J. Acoust. Soc. Am. 116, 2395-2405 (2004)] were modified to simulate eight listening conditions spanning the range of auditory environments listeners encounter in everyday life. The standard QuickSIN test method was used to estimate 50% speech reception thresholds (SRT50) in each condition. A method of adjustment procedure was also used to obtain subjective estimates of the lowest signal-to-noise ratio (SNR) where the listeners were able to understand 100% of the speech (SRT100) and the highest SNR where they could detect the speech but could not understand any of the words (SRT0). The results show that the modified materials maintained most of the efficiency of the QuickSIN test procedure while capturing performance differences across listening conditions comparable to those reported in previous studies that have examined the effects of audiovisual cues, binaural cues, room reverberation, and time compression on the intelligibility of speech.
A study on nonlinear characteristics of speech sound with reference to some languages of North East region

NASA Astrophysics Data System (ADS)

Dutta, Rashmi

INTRODUCTION: Speech science is, in fact, a sub-discipline of Nonlinear Dynamical Systems [2, 104]. There are two different types of Dynamical System. A Continuous Dynamical System may be defined, for the continuous-time case, by the equation dx/dt = F(x), where x is a vector of length d defining a point in a d-dimensional space, F is some function (linear or nonlinear) operating on x, and dx/dt is the time derivative of x. This system is deterministic, in that it is possible to completely specify its evolution, or flow of trajectories, in the d-dimensional space, given the initial starting conditions. A Discrete Dynamical System can be defined as a map (by the process of iteration): X_{n+1} = G(X_n), where X_n is again a d-length vector at time step n, and G is an operator function. Given an initial state, X_0, it is possible to calculate the value of X_n for any n > 0. Speech has evolved as a primary form of communication between humans, i.e., speech and hearing are humans' most used means of communication [104, 114]. Analysis of human speech has been a goal of research during the last few decades [105, 108]. With the rapid development of information technology (IT), human-machine communication using natural speech has received wide attention from both academic and business communities. One highly quantitative approach to characterizing the communications potential of speech is in terms of information theory ideas as introduced by Shannon [C.E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, Vol. 27, pp. 623-656, October 1948]. According to information theory, speech can be represented in terms of its message content, or information. An alternative way of characterizing speech is in terms of the signal carrying the message information, i.e., the acoustic waveform. Although information theoretic ideas have played a major role in sophisticated communications systems, it is the speech representation based on the waveform, or some
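As an illustration of the discrete dynamical system described in the record above, the following is a minimal sketch of iterating a map from an initial state. The function names (`iterate_map`, `logistic_map`) and the choice of the logistic map as the operator G are assumptions made purely for illustration; the abstract does not name a specific map.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def iterate_map(g: Callable[[Sequence[float]], Vector],
                x0: Sequence[float],
                n_steps: int) -> List[Vector]:
    """Return [X_0, X_1, ..., X_n] for the map X_{n+1} = G(X_n)."""
    trajectory: List[Vector] = [list(x0)]
    for _ in range(n_steps):
        # Each new state depends only on the previous one (deterministic).
        trajectory.append(list(g(trajectory[-1])))
    return trajectory


def logistic_map(x: Sequence[float], r: float = 4.0) -> Vector:
    """Hypothetical example of G: the logistic map applied per component."""
    return [r * xi * (1.0 - xi) for xi in x]


if __name__ == "__main__":
    # A d = 1 example; any d-length vector works the same way.
    for n, state in enumerate(iterate_map(logistic_map, [0.2], 5)):
        print(n, state)
```

Because the map is deterministic, the whole trajectory is fixed once the initial state is chosen, which is the property the abstract emphasizes.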
Longitudinal development of communication in children with cerebral palsy between 24 and 53 months: Predicting speech outcomes.

PubMed

Hustad, Katherine C; Allison, Kristen M; Sakash, Ashley; McFadd, Emily; Broman, Aimee Teo; Rathouz, Paul J

2017-08-01

To determine whether communication at 2 years predicted communication at 4 years in children with cerebral palsy (CP); and whether the age a child first produces words imitatively predicts change in speech production. 30 children (15 males) with CP participated and were seen 5 times at 6-month intervals between 24 and 53 months (mean age at time 1 = 26.9 months (SD 1.9)). Variables were communication classification at 24 and 53 months, age that children were first able to produce words imitatively, single-word intelligibility, and longest utterance produced. Communication at 24 months was highly predictive of abilities at 53 months. Speaking earlier led to faster gains in intelligibility and length of utterance and better outcomes at 53 months than speaking later. Inability to speak at 24 months indicates greater speech and language difficulty at 53 months and a strong need for early communication intervention.
Free Speech Yearbook: 1972.

ERIC Educational Resources Information Center

Tedford, Thomas L., Ed.

This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…
Toward an Understanding of Successful Career Placement by Undergraduate Speech Communication Departments.

ERIC Educational Resources Information Center

Cahn, Dudley D.

Noting that placement of graduating speech communication students is an important measure of the success of career programs, and that faculty and department heads who are presently developing, recommending, or supervising career programs may be interested in useful career attitudes and placement activities, a study was conducted to determine what…
Classroom acoustics and intervention strategies to enhance the learning environment

NASA Astrophysics Data System (ADS)

Savage, Christal

The classroom environment can be an acoustically difficult atmosphere for students to learn effectively, sometimes due in part to poor acoustical properties. Noise and reverberation have a substantial influence on room acoustics and subsequently intelligibility of speech. The American Speech-Language-Hearing Association (ASHA, 1995) developed minimal standards for noise and reverberation in a classroom for the purpose of providing an adequate listening environment. A lack of adherence to these standards may have undesirable consequences, which may lead to poor academic performance. The purpose of this capstone project is to develop a protocol to measure the acoustical properties of reverberation time and noise levels in elementary classrooms and present the educators with strategies to improve the learning environment. Noise level and reverberation will be measured and recorded in seven, unoccupied third grade classrooms in Lincoln Parish in North Louisiana. The recordings will occur at six specific distances in the classroom to simulate teacher and student positions. The recordings will be compared to the American Speech-Language-Hearing Association standards for noise and reverberation. If discrepancies are observed, the primary investigator will serve as an auditory consultant for the school and educators to recommend remediation and intervention strategies to improve these acoustical properties. The hypothesis of the study is that the classroom acoustical properties of noise and reverberation will exceed the American Speech-Language-Hearing Association standards; therefore, the auditory consultant will provide strategies to improve those acoustical properties.
The influence of selective attention to auditory and visual speech on the integration of audiovisual speech information.

PubMed

Buchan, Julie N; Munhall, Kevin G

2011-01-01

Conflicting visual speech information can influence the perception of acoustic speech, causing an illusory percept of a sound not present in the actual acoustic speech (the McGurk effect). We examined whether participants can voluntarily selectively attend to either the auditory or visual modality by instructing participants to pay attention to the information in one modality and to ignore competing information from the other modality. We also examined how performance under these instructions was affected by weakening the influence of the visual information by manipulating the temporal offset between the audio and video channels (experiment 1), and the spatial frequency information present in the video (experiment 2). Gaze behaviour was also monitored to examine whether attentional instructions influenced the gathering of visual information. While task instructions did have an influence on the observed integration of auditory and visual speech information, participants were unable to completely ignore conflicting information, particularly information from the visual stream. Manipulating temporal offset had a more pronounced interaction with task instructions than manipulating the amount of visual information. Participants' gaze behaviour suggests that the attended modality influences the gathering of visual information in audiovisual speech perception.
Differences in early speech patterns between Parkinson variant of multiple system atrophy and Parkinson's disease.

PubMed

Huh, Young Eun; Park, Jongkyu; Suh, Mee Kyung; Lee, Sang Eun; Kim, Jumin; Jeong, Yuri; Kim, Hee-Tae; Cho, Jin Whan

2015-08-01

In Parkinson variant of multiple system atrophy (MSA-P), patterns of early speech impairment and their distinguishing features from Parkinson's disease (PD) require further exploration. Here, we compared speech data among patients with early-stage MSA-P, PD, and healthy subjects using quantitative acoustic and perceptual analyses. Variables were analyzed for men and women in view of gender-specific features of speech. Acoustic analysis revealed that male patients with MSA-P exhibited more profound speech abnormalities than those with PD, regarding increased voice pitch, prolonged pause time, and reduced speech rate. This might be due to widespread pathology of MSA-P in nigrostriatal or extra-striatal structures related to speech production. Although several perceptual measures were mildly impaired in MSA-P and PD patients, none of these parameters showed a significant difference between patient groups. Detailed speech analysis using acoustic measures may help distinguish between MSA-P and PD early in the disease process. Copyright © 2015 Elsevier Inc. All rights reserved.
Speech as a breakthrough signaling resource in the cognitive evolution of biological complex adaptive systems.

PubMed

Mattei, Tobias A

2014-12-01

In self-adapting dynamical systems, a significant improvement in the signaling flow among agents constitutes one of the most powerful triggering events for the emergence of new complex behaviors. Ackermann and colleagues' comprehensive phylogenetic analysis of the brain structures involved in acoustic communication provides further evidence of the essential role which speech, as a breakthrough signaling resource, has played in the evolutionary development of human cognition viewed from the standpoint of complex adaptive system analysis.

Start/End Delays of Voiced and Unvoiced Speech Signals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herrnstein, A

Recent experiments using low-power EM-radar-like sensors (e.g., GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place. Similarly, the end time of a voiced speech segment can be measured. Secondly, it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications, it is useful to know typical duration times of these unvoiced speech segments. A corpus, assembled earlier, of spoken "TIMIT" words, phrases, and sentences, recorded using simultaneously measured acoustic and EM-sensor glottal signals from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech, using the acoustic signal, and the onset (or end) of voiced speech, using the EM sensor signal, the average duration times for unvoiced segments preceding onset of vocalization were found to be 300 ms, and for following segments, 500 ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal as the onset-time marker for the voiced speech segment and end marker for the unvoiced segment. Then, by subtracting 300 ms from the onset time mark of voicing, the unvoiced speech segment start time is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven to be useful for work in our laboratory, a great deal of additional work remains to validate such data for use with general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
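The windowing rule described in the record above (an unvoiced segment starting 300 ms before the voiced onset, and one ending 500 ms after the voiced end) can be sketched as follows. This is only an illustrative reading of the abstract: the function name `unvoiced_windows`, the use of integer milliseconds, and the clamping of the start time at zero are assumptions, not part of the reported procedure.

```python
from typing import Tuple

# Average unvoiced durations reported in the abstract (milliseconds).
PRE_UNVOICED_MS = 300   # unvoiced speech preceding the voiced onset
POST_UNVOICED_MS = 500  # unvoiced speech following the voiced end

Window = Tuple[int, int]


def unvoiced_windows(voiced_onset_ms: int,
                     voiced_end_ms: int) -> Tuple[Window, Window]:
    """Estimate (start, end) of the unvoiced segments around a voiced segment.

    The voiced onset and end are assumed to come from the EM-sensed
    glottal signal; the unvoiced bounds are derived from them.
    """
    if voiced_end_ms < voiced_onset_ms:
        raise ValueError("voiced segment must end after it starts")
    # Assumption: clamp at 0 so a window never starts before the recording.
    pre = (max(0, voiced_onset_ms - PRE_UNVOICED_MS), voiced_onset_ms)
    post = (voiced_end_ms, voiced_end_ms + POST_UNVOICED_MS)
    return pre, post


if __name__ == "__main__":
    print(unvoiced_windows(1000, 2000))
```

Working in integer milliseconds avoids floating-point rounding when subtracting the fixed offsets; a real pipeline would more likely convert these bounds to sample indices at the recording's sampling rate.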
Acoustic resonance at the dawn of life: musical fundamentals of the psychoanalytic relationship.

PubMed

Pickering, Judith

2015-11-01

This paper uses a case vignette to show how musical elements of speech are a crucial source of information regarding the patient's emotional states and associated memory systems that are activated at a given moment in the analytic field. There are specific psychoacoustic markers associated with different memory systems which indicate whether a patient is immersed in a state of creative intersubjective relatedness related to autobiographical memory, or has been triggered into a traumatic memory system. When a patient feels immersed in an atmosphere of intersubjective mutuality, dialogue features a rhythmical and tuneful form of speech featuring improvized reciprocal imitation, theme and variation. When the patient is catapulted into a traumatic memory system, speech becomes monotone and disjointed. Awareness of such acoustic features of the traumatic memory system helps to alert the analyst that such a shift has taken place informing appropriate responses and interventions. Communicative musicality (Malloch & Trevarthen 2009) originates in the earliest non-verbal vocal communication between infant and care-giver, states of primary intersubjectivity. Such musicality continues to be the primary vehicle for transmitting emotional meaning and for integrating right and left hemispheres. This enables communication that expresses emotional significance, personal value as well as conceptual reasoning. © 2015, The Society of Analytical Psychology.
Articulatory Mediation of Speech Perception: A Causal Analysis of Multi-Modal Imaging Data

ERIC Educational Resources Information Center

Gow, David W., Jr.; Segawa, Jennifer A.

2009-01-01

The inherent confound between the organization of articulation and the acoustic-phonetic structure of the speech signal makes it exceptionally difficult to evaluate the competing claims of motor and acoustic-phonetic accounts of how listeners recognize coarticulated speech. Here we use Granger causation analysis of high spatiotemporal resolution…
Associations between tongue movement pattern consistency and formant movement pattern consistency in response to speech behavioral modifications

PubMed Central

Mefferd, Antje S.

2016-01-01

The degree of speech movement pattern consistency can provide information about speech motor control. Although tongue motor control is particularly important because of the tongue's primary contribution to the speech acoustic signal, capturing tongue movements during speech remains difficult and costly. This study sought to determine if formant movements could be used to estimate tongue movement pattern consistency indirectly. Two age groups (seven young adults and seven older adults) and six speech conditions (typical, slow, loud, clear, fast, bite block speech) were selected to elicit an age- and task-dependent performance range in tongue movement pattern consistency. Kinematic and acoustic spatiotemporal indexes (STI) were calculated based on sentence-length tongue movement and formant movement signals, respectively. Kinematic and acoustic STI values showed strong associations across talkers and moderate to strong associations for each talker across speech tasks; although, in cases where task-related tongue motor performance changes were relatively small, the acoustic STI values were poorly associated with kinematic STI values. These findings suggest that, depending on the sensitivity needs, formant movement pattern consistency could be used in lieu of direct kinematic analysis to indirectly examine speech motor control. PMID:27908069
Communicating Epistemic Stance: How Speech and Gesture Patterns Reflect Epistemicity and Evidentiality

ERIC Educational Resources Information Center

Roseano, Paolo; González, Montserrat; Borràs-Comes, Joan; Prieto, Pilar

2016-01-01

This study investigates how epistemic stance is encoded and perceived in face-to-face communication when language is regarded as comprised by speech and gesture. Two studies were conducted with this goal in mind. The first study consisted of a production task in which participants performed opinion reports. Results showed that speakers communicate…
Does communication partner training improve the conversation skills of speech-language pathology students when interacting with people with aphasia?

PubMed

Finch, Emma; Cameron, Ashley; Fleming, Jennifer; Lethlean, Jennifer; Hudson, Kyla; McPhail, Steven

2017-07-01

Aphasia is a common consequence of stroke. Despite receiving specialised training in communication, speech-language pathology students may lack confidence when communicating with People with Aphasia (PWA). This paper reports data from secondary outcome measures from a randomised controlled trial. The aim of the current study was to examine the effects of communication partner training on the communication skills of speech-language pathology students during conversations with PWA. Thirty-eight speech-language pathology students were randomly allocated to trained and untrained groups. The first group received a lecture about communication strategies for communicating with PWA then participated in a conversation with PWA (Trained group), while the second group of students participated in a conversation with the PWA without receiving the lecture (Untrained group). The conversations between the groups were analysed according to the Measure of skill in Supported Conversation (MSC) scales, Measure of Participation in Conversation (MPC) scales, types of strategies used in conversation, and the occurrence and repair of conversation breakdowns. The trained group received significantly higher MSC Revealing Competence scores, used significantly more props, and introduced significantly more new ideas into the conversation than the untrained group. The trained group also used more gesture and writing to facilitate the conversation, however, the difference was not significant. There was no significant difference between the groups according to MSC Acknowledging Competence scores, MPC Interaction or Transaction scores, or in the number of interruptions, minor or major conversation breakdowns, or in the success of strategies initiated to repair the conversation breakdowns. Speech-language pathology students may benefit from participation in communication partner training programs. Copyright © 2017 Elsevier Inc. All rights reserved.
Automatic detection of obstructive sleep apnea using speech signals.

PubMed

Goldshtein, Evgenia; Tarasiuk, Ariel; Zigel, Yaniv

2011-05-01

Obstructive sleep apnea (OSA) is a common disorder associated with anatomical abnormalities of the upper airways that affects 5% of the population. Acoustic parameters may be influenced by the vocal tract structure and soft tissue properties. We hypothesize that speech signal properties of OSA patients will be different than those of control subjects not having OSA. Using speech signal processing techniques, we explored acoustic speech features of 93 subjects who were recorded using a text-dependent speech protocol and a digital audio recorder immediately prior to polysomnography study. Following analysis of the study, subjects were divided into OSA (n=67) and non-OSA (n=26) groups. A Gaussian mixture model-based system was developed to model and classify between the groups; discriminative features such as vocal tract length and linear prediction coefficients were selected using feature selection technique. Specificity and sensitivity of 83% and 79% were achieved for the male OSA and 86% and 84% for the female OSA patients, respectively. We conclude that acoustic features from speech signals during wakefulness can detect OSA patients with good specificity and sensitivity. Such a system can be used as a basis for future development of a tool for OSA screening. © 2011 IEEE
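For readers unfamiliar with the two figures of merit quoted in the record above, the following minimal sketch shows how sensitivity and specificity are computed from a classifier's confusion counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true OSA cases detected.
    specificity = TN / (TN + FP): fraction of non-OSA subjects correctly cleared.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative subject")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```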
Comparative Efficacy of the Picture Exchange Communication System (PECS) versus a Speech-Generating Device: Effects on Requesting Skills

ERIC Educational Resources Information Center

Boesch, Miriam C.; Wendt, Oliver; Subramanian, Anu; Hsu, Ning

2013-01-01

An experimental, single-subject research study investigated the comparative efficacy of the Picture Exchange Communication System (PECS) versus a speech-generating device (SGD) in developing requesting skills for three elementary-age children with severe autism and little to no functional speech. Results demonstrated increases in requesting…
DIAGNOSIS AND APPRAISAL OF COMMUNICATION DISORDERS. PRENTICE-HALL FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

ERIC Educational Resources Information Center

DARLEY, FREDERIC L.

THIS TEXT GIVES THE STUDENT AN OUTLINE OF THE BASIC PRINCIPLES OF SCIENTIFIC METHODOLOGY WHICH UNDERLIE EVALUATIVE WORK IN SPEECH DISORDERS. RATIONALE AND ASSESSMENT TECHNIQUES ARE GIVEN FOR EXAMINATION OF THE BASIC COMMUNICATION PROCESSES OF SYMBOLIZATION, RESPIRATION, PHONATION, ARTICULATION-RESONANCE, PROSODY, ASSOCIATED SENSORY AND PERCEPTUAL…
Age-Related Changes in Preschoolers' Ability to Communicate Using Iconic Gestures in the Absence of Speech

ERIC Educational Resources Information Center

Vasc, Dermina; Miclea, Mircea

2018-01-01

Iconic gestures illustrate complex meanings and clarify and enrich the speech they accompany. Little is known, however, about how children use iconic gestures in the absence of speech. In this study, we used a cross-sectional design to investigate how 3-, 4- and 5-year-old children (N = 51) communicate using pantomime iconic gestures. Children…
Contributions of speech science to the technology of man-machine voice interactions

NASA Technical Reports Server (NTRS)

Lea, Wayne A.

1977-01-01

Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.
Speech Communication and Communication Processes: Abstracts of Doctoral Dissertations Published in "Dissertation Abstracts International," April and May 1978 (Vol. 38 Nos. 10 and 11).

ERIC Educational Resources Information Center

ERIC Clearinghouse on Reading and Communication Skills, Urbana, IL.

This collection of abstracts is part of a continuing series providing information on recent doctoral dissertations. The 25 titles deal with a variety of topics, including the following: the nature of creativity in advertising communication; speech communication difficulties of international professors; rhetorical arguments regarding the…
The effects of reverberant self- and overlap-masking on speech recognition in cochlear implant listeners.

PubMed

Desmond, Jill M; Collins, Leslie M; Throckmorton, Chandra S

2014-06-01

Many cochlear implant (CI) listeners experience decreased speech recognition in reverberant environments [Kokkinakis et al., J. Acoust. Soc. Am. 129(5), 3221-3232 (2011)], which may be caused by a combination of self- and overlap-masking [Bolt and MacDonald, J. Acoust. Soc. Am. 21(6), 577-580 (1949)]. Determining the extent to which these effects decrease speech recognition for CI listeners may influence reverberation mitigation algorithms. This study compared speech recognition with ideal self-masking mitigation, with ideal overlap-masking mitigation, and with no mitigation. Under these conditions, mitigating either self- or overlap-masking resulted in significant improvements in speech recognition for both normal hearing subjects utilizing an acoustic model and for CI listeners using their own devices.
Speaker verification system using acoustic data and non-acoustic data

DOEpatents

Gable, Todd J [Walnut Creek, CA]; Ng, Lawrence C [Danville, CA]; Holzrichter, John F [Berkeley, CA]; Burnett, Greg C [Livermore, CA]

2006-03-21

A method and system for speech characterization. One embodiment includes a method for speaker verification which includes collecting data from a speaker, wherein the data comprises acoustic data and non-acoustic data. The data is used to generate a template that includes a first set of "template" parameters. The method further includes receiving a real-time identity claim from a claimant, and using acoustic data and non-acoustic data from the identity claim to generate a second set of parameters. The method further includes comparing the first set of parameters to the second set of parameters to determine whether the claimant is the speaker. The first set of parameters and the second set of parameters include at least one purely non-acoustic parameter, including a non-acoustic glottal shape parameter derived from averaging multiple glottal cycle waveforms.
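The template-versus-claim comparison described above can be sketched roughly as follows. This is an illustrative assumption, not the patented method: the abstract does not say how the parameter sets are compared, so the Euclidean distance, the threshold value, and the function names `average_cycles` and `verify` are all hypothetical.

```python
import math


def average_cycles(cycles):
    """Average equal-length glottal cycle waveforms sample by sample."""
    if not cycles:
        raise ValueError("need at least one cycle")
    n = len(cycles[0])
    if any(len(c) != n for c in cycles):
        raise ValueError("all cycles must have the same length")
    return [sum(c[i] for c in cycles) / len(cycles) for i in range(n)]


def verify(template, claim, threshold):
    """Compare two parameter sets; accept if they are close enough.

    Assumption: closeness is Euclidean distance against a fixed threshold.
    Returns (accepted, distance).
    """
    if len(template) != len(claim):
        raise ValueError("parameter sets must have the same length")
    distance = math.sqrt(sum((t - c) ** 2 for t, c in zip(template, claim)))
    return distance <= threshold, distance


# Hypothetical waveforms: enrollment cycles form the template,
# cycles from the identity claim form the second parameter set.
template = average_cycles([[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]])
claim = average_cycles([[0.0, 0.9, 0.1], [0.0, 0.9, 0.1]])
accepted, distance = verify(template, claim, threshold=0.05)
print(accepted)
```

In a real system the threshold would be tuned on enrollment data to balance false acceptances against false rejections, and the parameter sets would combine acoustic and non-acoustic measurements as the abstract describes.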
Advancing Underwater Acoustic Communication for Autonomous Distributed Networks via Sparse Channel Sensing, Coding, and Navigation Support

DTIC Science & Technology

2011-09-30

channel interference mitigation for underwater acoustic MIMO-OFDM. 3) Turbo equalization for OFDM modulated physical layer network coding. 4) Blind CFO ... Underwater Acoustic MIMO-OFDM. MIMO-OFDM has been actively studied for high data rate communications over the bandwidth-limited underwater acoustic ... with the cochannel interference (CCI) due to parallel transmissions in MIMO-OFDM. Our proposed receiver has the following components: 1
ARAMCO Education: Teaching Speech Communication to a Sub-Culture in Saudi Arabia.

ERIC Educational Resources Information Center

Dick, Robert C.

Based on experiences gained by an educator from Indiana University who taught a speech communication course in Saudi Arabia, this paper details the adaptations the educator had to make in order to teach Arabian American Oil Company (ARAMCO) employees and their spouses in the politically difficult period of 1981-82. Following a brief background…
Adopting public health approaches to communication disability: challenges for the education of speech-language pathologists.

PubMed

Wylie, Karen; McAllister, Lindy; Davidson, Bronwyn; Marshall, Julie; Law, James

2014-01-01

Public health approaches to communication disability challenge the profession of speech-language pathology (SLP) to reconsider both frames of reference for practice and models of education. This paper reviews the impetus for public health approaches to communication disability and considers how public health is, and could be, incorporated into SLP education, both now and in the future. The paper describes tensions between clinical services, which have become increasingly specialized, and public health approaches that offer a broader view of communication disability and communication disability prevention. It presents a discussion of these tensions and asserts that public health approaches to communication are themselves a specialist field, requiring specific knowledge and skills. The authors suggest the use of the term 'communication disability public health' to refer to this type of work and offer a preliminary definition in order to advance discussion. Examples from three countries are provided of how some SLP degree programmes are integrating public health into the SLP curriculum. Alternative models of training for communication disability public health that may be relevant in the future in different contexts and countries are presented, prompting the SLP profession to consider whether communication disability public health is a field of practice for speech-language pathologists or whether it has broader workforce implications. The paper concludes with some suggestions for the future which may advance thinking, research and practice in communication disability public health. © 2015 S. Karger AG, Basel.
Management of swallowing and communication difficulties in Down syndrome: A survey of speech-language pathologists.

PubMed

Meyer, Carly; Theodoros, Deborah; Hickson, Louise

2017-02-01

To explore speech pathology services for people with Down syndrome across the lifespan. Speech-language pathologists (SLPs) working in Australia were invited to complete an online survey, which enquired about the speech pathology services they had provided to client/s with Down syndrome in the past 12 months. The data were analysed using descriptive statistics. A total of 390 SLPs completed the survey; 62% reported seeing a client with Down syndrome in the past 12 months. Most commonly, SLPs provided assessment and individual intervention for communication with varying levels of family involvement. The areas of dysphagia and/or communication addressed by SLPs, or in need of more services, differed according to the age of the person with Down syndrome. SLPs reported a number of reasons why services were restricted. There is a need to re-assess the way that SLPs currently provide services to people with Down syndrome. More research is needed to develop and evaluate treatment approaches that can be used to better address the needs of this population.
27. LAUNCH CONTROL CAPSULE. ACOUSTICAL ENCLOSURE. COMMUNICATIONS CONSOLE AT LEFT; ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

27. LAUNCH CONTROL CAPSULE. ACOUSTICAL ENCLOSURE. COMMUNICATIONS CONSOLE AT LEFT; LAUNCH CONTROL CONSOLE AT RIGHT. PADLOCKED PANEL AT TOP CENTER CONTAINS MISSILE LAUNCH KEYS. SHOCK ISOLATOR AT FAR LEFT. VIEW TO EAST. - Minuteman III ICBM Launch Control Facility November-1, 1.5 miles North of New Raymer & State Highway 14, New Raymer, Weld County, CO
Revisiting Neil Armstrong's Moon-Landing Quote: Implications for Speech Perception, Function Word Reduction, and Acoustic Ambiguity

PubMed Central

Baese-Berk, Melissa M.; Dilley, Laura C.; Schmidt, Stephanie; Morrill, Tuuli H.; Pitt, Mark A.

2016-01-01

Neil Armstrong insisted that his quote upon landing on the moon was misheard, and that he had said "one small step for a man," instead of "one small step for man." What he said is unclear in part because function words like "a" can be reduced and spectrally indistinguishable from the preceding context. Therefore, their presence can be ambiguous, and they may disappear perceptually depending on the rate of surrounding speech. Two experiments are presented examining production and perception of reduced tokens of "for" and "for a" in spontaneous speech. Experiment 1 investigates the distributions of several acoustic features of "for" and "for a". The results suggest that the distributions of "for" and "for a" overlap substantially, both in terms of temporal and spectral characteristics. Experiment 2 examines perception of these same tokens when the context speaking rate differs. The perceptibility of the function word "a" varies as a function of this context speaking rate. These results demonstrate that substantial ambiguity exists in the original quote from Armstrong, and that this ambiguity may be understood through context speaking rate. PMID:27603209
