Auditory Speech Perception Tests in Relation to the Coding Strategy in Cochlear Implant.
Bazon, Aline Cristine; Mantello, Erika Barioni; Gonçales, Alina Sanches; Isaac, Myriam de Lima; Hyppolito, Miguel Angelo; Reis, Ana Cláudia Mirândola Barbosa
2016-07-01
The objective of the evaluation of auditory perception of cochlear implant users is to determine how the acoustic signal is processed, leading to the recognition and understanding of sound. This study aimed to investigate differences in the process of auditory speech perception in individuals with postlingual hearing loss wearing a cochlear implant, using two different speech coding strategies, and to analyze speech perception and handicap perception in relation to the strategy used. This is a prospective cross-sectional cohort study of a descriptive character. We selected ten cochlear implant users, who were characterized by hearing threshold, by speech perception tests, and by the Hearing Handicap Inventory for Adults. There was no significant difference when comparing the variables subject age, age at acquisition of hearing loss, etiology, time of hearing deprivation, time of cochlear implant use and mean hearing threshold with the cochlear implant with the shift in speech coding strategy. There was no relationship between lack of handicap perception and improvement in speech perception with either of the speech coding strategies used. There was no significant difference between the strategies evaluated and no relation was observed between them and the variables studied.
Everyday listening questionnaire: correlation between subjective hearing and objective performance.
Brendel, Martina; Frohne-Buechner, Carolin; Lesinski-Schiedat, Anke; Lenarz, Thomas; Buechner, Andreas
2014-01-01
Clinical experience has demonstrated that speech understanding by cochlear implant (CI) recipients has improved over recent years with the development of new technology. The Everyday Listening Questionnaire 2 (ELQ 2) was designed to collect information regarding the challenges faced by CI recipients in everyday listening. The aim of this study was to compare self-assessment of CI users using ELQ 2 with objective speech recognition measures and to compare results between users of older and newer coding strategies. During their regular clinical review appointments a group of representative adult CI recipients implanted with the Advanced Bionics implant system were asked to complete the questionnaire. The first 100 patients who agreed to participate in this survey were recruited independent of processor generation and speech coding strategy. Correlations between subjectively scored hearing performance in everyday listening situations and objectively measured speech perception abilities were examined relative to the speech coding strategies used. When subjects were grouped by strategy there were significant differences between users of older 'standard' strategies and users of the newer, currently available strategies (HiRes and HiRes 120), especially in the categories of telephone use and music perception. Significant correlations were found between certain subjective ratings and the objective speech perception data in noise. There is a good correlation between subjective and objective data. Users of more recent speech coding strategies tend to have fewer problems in difficult hearing situations.
Multipath search coding of stationary signals with applications to speech
NASA Astrophysics Data System (ADS)
Fehn, H. G.; Noll, P.
1982-04-01
This paper deals with the application of multipath search coding (MSC) concepts to the coding of stationary memoryless and correlated sources, and of speech signals, at a rate of one bit per sample. Use is made of three MSC classes: (1) codebook coding, or vector quantization, (2) tree coding, and (3) trellis coding. This paper explains the performance of these coders and compares it both with that of conventional coders and with rate-distortion bounds. The potential of MSC coding strategies is demonstrated by illustrations. The paper also reports on results of MSC coding of speech, where the strategies of both adaptive quantization and adaptive prediction were included in the coder design.
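As an illustration of the first MSC class named above (codebook coding, or vector quantization), the following is a minimal sketch, not the coder from the paper. It assumes a random Gaussian codebook, a block length of 4 samples, and a squared-error distortion; the names `make_codebook`, `encode`, and `decode` are hypothetical. At one bit per sample, a block of N samples is represented by one of 2^N codewords, chosen by exhaustive search.

```python
import random


def make_codebook(dim, bits_per_sample, seed=0):
    """Random Gaussian codebook with 2**(dim * bits_per_sample) codewords.

    The random codebook is an illustrative assumption, not a trained design.
    """
    rng = random.Random(seed)
    size = 2 ** (dim * bits_per_sample)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(size)]


def sq_error(a, b):
    """Squared-error distortion between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(block, codebook):
    """Exhaustive search: index of the codeword closest to the block."""
    return min(range(len(codebook)), key=lambda i: sq_error(block, codebook[i]))


def decode(index, codebook):
    """The decoder is a plain table lookup."""
    return codebook[index]


DIM = 4  # samples per block, so each index costs 4 bits (1 bit per sample)
codebook = make_codebook(DIM, bits_per_sample=1)

# Memoryless Gaussian test source (a stand-in for the sources in the paper).
rng = random.Random(1)
source = [rng.gauss(0.0, 1.0) for _ in range(DIM * 200)]
blocks = [source[i:i + DIM] for i in range(0, len(source), DIM)]

indices = [encode(b, codebook) for b in blocks]
decoded = [decode(i, codebook) for i in indices]
mse = sum(sq_error(b, d) for b, d in zip(blocks, decoded)) / len(source)
print("mean squared error per sample:", mse)
```

Tree and trellis coders keep the same minimum-distortion search but structure the codebook so that longer blocks can be searched without enumerating every codeword.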
Landwehr, Markus; Fürstenberg, Dirk; Walger, Martin; von Wedel, Hasso; Meister, Hartmut
2014-01-01
Advances in speech coding strategies and electrode array designs for cochlear implants (CIs) predominantly aim at improving speech perception. Current efforts are also directed at transmitting appropriate cues of the fundamental frequency (F0) to the auditory nerve with respect to speech quality, prosody, and music perception. The aim of this study was to examine the effects of various electrode configurations and coding strategies on speech intonation identification, speaker gender identification, and music quality rating. In six MED-EL CI users electrodes were selectively deactivated in order to simulate different insertion depths and inter-electrode distances when using the high definition continuous interleaved sampling (HDCIS) and fine structure processing (FSP) speech coding strategies. Identification of intonation and speaker gender was determined and music quality rating was assessed. For intonation identification HDCIS was robust against the different electrode configurations, whereas fine structure processing showed significantly worse results when a short electrode depth was simulated. In contrast, speaker gender recognition was not affected by electrode configuration or speech coding strategy. Music quality rating was sensitive to electrode configuration. In conclusion, the three experiments revealed different outcomes, even though they all addressed the reception of F0 cues. Rapid changes in F0, as seen with intonation, were the most sensitive to electrode configurations and coding strategies. In contrast, electrode configurations and coding strategies did not show large effects when F0 information was available over a longer time period, as seen with speaker gender. Music quality relies on additional spectral cues other than F0, and was poorest when a shallow insertion was simulated.
Lorens, Artur; Zgoda, Małgorzata; Obrycka, Anita; Skarżynski, Henryk
2010-12-01
Presently, there are only a few studies examining the benefits of fine structure information in coding strategies. Against this background, this study aims to assess the objective and subjective performance of children experienced with the C40+ cochlear implant using the CIS+ coding strategy who were upgraded to the OPUS 2 processor using FSP and HDCIS. In this prospective study, 60 children with more than 3.5 years of experience with the C40+ cochlear implant were upgraded to the OPUS 2 processor and fit and tested with HDCIS (Interval I). After 3 months of experience with HDCIS, they were fit with the FSP coding strategy (Interval II) and tested with all strategies (FSP, HDCIS, CIS+). After an additional 3-4 months, they were assessed on all three strategies and asked to choose their take-home strategy (Interval III). The children were tested using the Adaptive Auditory Speech Test, which measures speech reception threshold (SRT) in quiet and noise, at each test interval. The children were also asked to rate on a Visual Analogue Scale their satisfaction and coding strategy preference when listening to speech and a pop song. However, since not all tests could be performed at one single visit, some children were not able to complete all tests at all intervals. At the study endpoint, speech in quiet showed a significant difference in SRT of 1.0 dB between FSP and HDCIS, with FSP performing better. FSP proved a better strategy compared with CIS+, showing SRT results lower by 5.2 dB. Speech in noise tests showed FSP to be significantly better than CIS+ by 0.7 dB, and HDCIS to be significantly better than CIS+ by 0.8 dB. Both satisfaction and coding strategy preference ratings also revealed that the FSP and HDCIS strategies were better than the CIS+ strategy when listening to speech and music. FSP was better than HDCIS when listening to speech.
This study demonstrates that long-term pediatric users of the COMBI 40+ are able to upgrade to a newer processor and coding strategy without compromising their listening performance, and can even improve their performance with FSP after a short time of experience. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Athaudage, Chandranath R. N.; Bradley, Alan B.; Lech, Margaret
2003-12-01
A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech and its application to low-rate speech coding in storage and broadcasting is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, the event localization was performed based on a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance on the optimality of the event locations. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that an improved TD model accuracy can be achieved. A methodology of incorporating the optimized TD algorithm within the standard MELP speech coder for the efficient compression of speech spectral information is also presented. The performance evaluation results revealed that the proposed speech coding scheme achieves 50%-60% compression of speech spectral information with negligible degradation in the decoded speech quality.
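To make the idea of optimizing event locations by dynamic programming concrete, here is a simplified sketch. It is not the authors' TD algorithm: it assumes a one-dimensional parameter track and a piecewise-constant model, and it picks the segment boundaries that minimize total squared error. The function names are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Squared error of approximating x[i:j] by its mean, from prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def optimal_boundaries(x, k):
    """Split x into k contiguous segments with minimum total squared error.

    Returns (segment start indices, total cost). Runs in O(k * n**2) time.
    """
    n = len(x)
    prefix, prefix_sq = [0.0], [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    # best[m][j]: minimum cost of covering x[:j] with exactly m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    back[m][j] = i

    # Trace the stored decisions back from the end of the track.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return starts, best[k][n]


track = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 1.0, 1.0]
starts, cost = optimal_boundaries(track, 3)
print(starts, cost)
```

Unlike a local stability criterion, the exhaustive recursion guarantees that no other placement of the same number of boundaries gives a lower error under the chosen cost.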
Speech perception of young children using Nucleus 22-channel or CLARION cochlear implants.
Young, N M; Grohne, K M; Carrasco, V N; Brown, C
1999-04-01
This study compares the auditory perceptual skill development of 23 congenitally deaf children who received the Nucleus 22-channel cochlear implant with the SPEAK speech coding strategy, and 20 children who received the CLARION Multi-Strategy Cochlear Implant with the Continuous Interleaved Sampler (CIS) speech coding strategy. All were under 5 years old at implantation. Preimplantation, there were no significant differences between the groups in age, length of hearing aid use, or communication mode. Auditory skills were assessed at 6 months and 12 months after implantation. Postimplantation, the mean scores on all speech perception tests were higher for the Clarion group. These differences were statistically significant for the pattern perception and monosyllable subtests of the Early Speech Perception battery at 6 months, and for the Glendonald Auditory Screening Procedure at 12 months. Multiple regression analysis revealed that device type accounted for the greatest variance in performance after 12 months of implant use. We conclude that children using the CIS strategy implemented in the Clarion implant may develop better auditory perceptual skills during the first year postimplantation than children using the SPEAK strategy with the Nucleus device.
Coding strategies for cochlear implants under adverse environments
NASA Astrophysics Data System (ADS)
Tahmina, Qudsia
Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, there remain limitations on speech perception under adverse environments such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone-processed speech, we propose to examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high-frequency information, and therefore this study provides support for the design of algorithms to extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing-impaired listeners. Reverberated sound consists of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energy from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3 s and 1.0 s) indicated significant improvement when stimuli were processed with the SS strategy.
The proposed strategy operates with little to no prior information on the signal and the room characteristics and therefore can potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on the development of algorithms to regenerate harmonics of voiced segments in the presence of noise.
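The spectral-subtraction idea for late reverberation can be sketched as follows. This is a generic illustration and not the strategy evaluated in the thesis: it assumes the late-reverberant power in each frequency bin can be approximated by a scaled copy of the power observed a few frames earlier, and the parameters `delay_frames`, `scale`, and `floor` are hypothetical values chosen only for the example. The input is a power spectrogram (a list of frames, each a list of per-bin powers).

```python
def suppress_late_reverb(power_spec, delay_frames=3, scale=0.5, floor=0.1):
    """Attenuate each bin by an estimate of late-reverberant power.

    power_spec: list of frames; each frame is a list of non-negative powers.
    The late-reverb estimate for frame t is scale * power at frame t - delay.
    The gain is floored so that no bin is driven all the way to zero.
    """
    cleaned = []
    for t, frame in enumerate(power_spec):
        if t < delay_frames:
            # No history yet: pass the frame through unchanged.
            cleaned.append(list(frame))
            continue
        past = power_spec[t - delay_frames]
        out = []
        for power, earlier in zip(frame, past):
            estimate = scale * earlier
            gain = max(1.0 - estimate / power, floor) if power > 0.0 else floor
            out.append(gain * power)
        cleaned.append(out)
    return cleaned


# One bin with an onset followed by an exponentially decaying tail.
spec = [[1.0], [0.5], [0.25], [0.125]]
result = suppress_late_reverb(spec, delay_frames=1, scale=0.4)
print(result)
```

Because the estimate is built only from the reverberant signal's own past frames, a scheme of this kind needs no measured room impulse response, which is consistent with the low prior-information requirement described above.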
Neben, Nicole; Lenarz, Thomas; Schuessler, Mark; Harpel, Theo; Buechner, Andreas
2013-05-01
Results for speech recognition in noise tests when using a new research coding strategy designed to introduce the virtual channel effect provided no advantage over MP3000™. Although statistically significantly smaller just noticeable differences (JNDs) were obtained, the findings for pitch ranking proved to have little clinical impact. The aim of this study was to explore whether modifications to MP3000 by including sequential virtual channel stimulation would lead to further improvements in hearing, particularly for speech recognition in background noise and in competing-talker conditions, and to compare results for pitch perception and melody recognition, as well as to informally collect subjective impressions on strategy preference. Nine experienced cochlear implant subjects were recruited for the prospective study. Two variants of the experimental strategy were compared to MP3000. The study design was a single-blinded ABCCBA cross-over trial paradigm with 3 weeks of take-home experience for each user condition. Comparing results of pitch ranking, a significantly reduced JND was identified. No significant effect of coding strategy on speech understanding in noise or competing-talker materials was found. Melody recognition skills were the same under all user conditions.
NASA Technical Reports Server (NTRS)
Mcaulay, Robert J.; Quatieri, Thomas F.
1988-01-01
It has been shown that an analysis/synthesis system based on a sinusoidal representation of speech leads to synthetic speech that is essentially perceptually indistinguishable from the original. Strategies for coding the amplitudes, frequencies and phases of the sine waves have been developed that have led to a multirate coder operating at rates from 2400 to 9600 bps. The encoded speech is highly intelligible at all rates with a uniformly improving quality as the data rate is increased. A real-time fixed-point implementation has been developed using two ADSP2100 DSP chips. The methods used for coding and quantizing the sine-wave parameters for operation at the various frame rates are described.
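The sinusoidal representation described above can be illustrated with a minimal synthesis sketch. It is a simplification and not the coder from the report: it holds each sine wave's amplitude, frequency and phase constant over a frame, and the parameter values are hypothetical.

```python
import math


def synthesize_frame(partials, num_samples, sample_rate):
    """Sum of cosines for one frame.

    partials: list of (amplitude, frequency_hz, phase_rad) triples, i.e. the
    three parameter sets a sinusoidal coder has to quantize and transmit.
    """
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        frame.append(
            sum(a * math.cos(2.0 * math.pi * f * t + ph) for a, f, ph in partials)
        )
    return frame


# Hypothetical voiced frame: three harmonics of a 200 Hz fundamental,
# 20 ms long at an 8 kHz sampling rate.
partials = [(1.0, 200.0, 0.0), (0.5, 400.0, 0.3), (0.25, 600.0, 1.1)]
frame = synthesize_frame(partials, num_samples=160, sample_rate=8000)
print(len(frame), frame[0])
```

A practical system would also match sine-wave tracks between frames and interpolate their parameters so that frame boundaries do not produce discontinuities; that step is omitted here.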
A software tool for analyzing multichannel cochlear implant signals.
Lai, Wai Kong; Bögli, Hans; Dillier, Norbert
2003-10-01
A useful and convenient means to analyze the radio frequency (RF) signals being sent by a speech processor to a cochlear implant would be to actually capture and display them with appropriate software. This is particularly useful for development or diagnostic purposes. sCILab (Swiss Cochlear Implant Laboratory) is such a PC-based software tool intended for the Nucleus family of Multichannel Cochlear Implants. Its graphical user interface provides a convenient and intuitive means for visualizing and analyzing the signals encoding speech information. Both numerical and graphic displays are available for detailed examination of the captured CI signals, as well as an acoustic simulation of these CI signals. sCILab has been used in the design and verification of new speech coding strategies, and has also been applied as an analytical tool in studies of how different parameter settings of existing speech coding strategies affect speech perception. As a diagnostic tool, it is also useful for troubleshooting problems with the external equipment of the cochlear implant systems.
“Down the Language Rabbit Hole with Alice”: A Case Study of a Deaf Girl with a Cochlear Implant
Andrews, Jean F.; Dionne, Vickie
2011-01-01
Alice, a deaf girl who was implanted after three years of age, was exposed to four weeks of storybook sessions conducted in American Sign Language (ASL) and speech (English). Two research questions were addressed: (1) how did she use her sign bimodal/bilingualism, codeswitching, and codemixing during reading activities, and (2) what sign bilingual codeswitching and codemixing strategies did she use while attending to stories delivered under two treatments: ASL only and speech only? Retelling scores were collected to determine the type and frequency of her codeswitching/codemixing strategies between both languages after Alice was read a story in ASL and in spoken English. Qualitative descriptive methods were utilized. Teacher, clinician and student transcripts of the reading and retelling sessions were recorded. Results showed Alice frequently used codeswitching and codemixing strategies while retelling the stories under both treatments. Alice's speech production increased in her retellings of the stories under both the ASL storyreading and the spoken English-only reading of the story. The ASL storyreading did not decrease Alice's retelling scores in spoken English. Professionals are encouraged to consider the benefits of early sign bimodal/bilingualism to enhance the overall speech, language and reading proficiency of deaf children with cochlear implants. PMID:22135677
Results using the OPAL strategy in Mandarin speaking cochlear implant recipients.
Vandali, Andrew E; Dawson, Pam W; Arora, Komal
2017-01-01
The aim was to evaluate the effectiveness of an experimental pitch-coding strategy for improving recognition of Mandarin lexical tone in cochlear implant (CI) recipients. Adult CI recipients were tested on recognition of Mandarin tones in quiet and in speech-shaped noise at a signal-to-noise ratio of +10 dB; Mandarin sentence speech-reception threshold (SRT) in speech-shaped noise; and pitch discrimination of synthetic complex-harmonic tones in quiet. Two versions of the experimental strategy were examined: (OPAL) linear (1:1) mapping of fundamental frequency (F0) to the coded modulation rate; and (OPAL+) transposed mapping of high F0s to a lower coded rate. Outcomes were compared to results using the clinical ACE™ strategy. The participants were five Mandarin-speaking users of Nucleus® cochlear implants. A small but significant benefit in recognition of lexical tones was observed using OPAL compared to ACE in noise, but not in quiet, and not for OPAL+ compared to ACE or OPAL in quiet or noise. Sentence SRTs were significantly better using OPAL+, and comparable using OPAL, relative to those using ACE. No differences in pitch discrimination thresholds were observed across strategies. OPAL can provide benefits to Mandarin lexical tone recognition in moderately noisy conditions and preserve perception of Mandarin sentences in challenging noise conditions.
Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Gorospe, José M; Ruiz, Santiago Santa Cruz; Benito, Fernando; Wilson, Blake S
2017-05-01
We have recently proposed a binaural cochlear implant (CI) sound processing strategy inspired by the contralateral medial olivocochlear reflex (the MOC strategy) and shown that it improves intelligibility in steady-state noise (Lopez-Poveda et al., 2016, Ear Hear 37:e138-e148). The aim here was to evaluate possible speech-reception benefits of the MOC strategy for speech maskers, a more natural type of interferer. Speech reception thresholds (SRTs) were measured in six bilateral and two single-sided deaf CI users with the MOC strategy and with a standard (STD) strategy. SRTs were measured in unilateral and bilateral listening conditions, and for target and masker stimuli located at azimuthal angles of (0°, 0°), (-15°, +15°), and (-90°, +90°). Mean SRTs were 2-5 dB better with the MOC than with the STD strategy for spatially separated target and masker sources. For bilateral CI users, the MOC strategy (1) facilitated the intelligibility of speech in competition with spatially separated speech maskers in both unilateral and bilateral listening conditions; and (2) led to an overall improvement in spatial release from masking in the two listening conditions. Insofar as speech is a more natural type of interferer than steady-state noise, the present results suggest that the MOC strategy holds potential for promising outcomes for CI users. Copyright © 2017. Published by Elsevier B.V.
Equality marker in the language of Bali
NASA Astrophysics Data System (ADS)
Wajdi, Majid; Subiyanto, Paulus
2018-01-01
The language of Bali can be counted among the most elaborate languages of the world because of its speech levels, low and high, which the language of Java also has. The low and high speech levels of the language of Bali are language codes that can be used to show and express the social relationship between or among its speakers. This paper focuses on describing, analyzing, and interpreting the use of the low code of the language of Bali in daily communication in the speech community of Pegayaman, Bali. Observational and documentation methods were applied to provide the data for the research, using recording and field-note techniques. Recordings of spoken language and material from a Balinese novel were transcribed into written form to ease the process of analysis. Symmetric use of the low code expresses social equality between or among the participants involved in the communication. It also implies social intimacy between or among the speakers of the language of Bali. Regular and patterned use of the low code of the language of Bali is not merely a communication strategy but a kind of communication agreement or communication contract between the participants. By using the low code during their social and communication activities, the participants share and express their social equality and intimacy.
Design and Evaluation of a Cochlear Implant Strategy Based on a “Phantom” Channel
Nogueira, Waldo; Litvak, Leonid M.; Saoji, Aniket A.; Büchner, Andreas
2015-01-01
Unbalanced bipolar stimulation, delivered using charge-balanced pulses, was used to produce “Phantom stimulation”, stimulation beyond the most apical contact of a cochlear implant’s electrode array. The Phantom channel was allocated audio frequencies below 300 Hz in a speech coding strategy, conveying energy some two octaves lower than the clinical strategy and hence delivering the fundamental frequency of speech and of many musical tones. A group of 12 Advanced Bionics cochlear implant recipients took part in a chronic study investigating the fitting of the Phantom strategy and speech and music perception when using Phantom. The evaluation of speech in noise was performed immediately after fitting Phantom for the first time (Session 1) and after one month of take-home experience (Session 2). A repeated-measures analysis of variance (ANOVA) with within-subject factors strategy (Clinical, Phantom) and interaction time (Session 1, Session 2) revealed a significant effect for the interaction time and strategy. Phantom obtained a significant improvement in speech intelligibility after one month of use. Furthermore, a trend towards better performance with Phantom (48%) than with F120 (37%) after 1 month of use failed to reach significance after type 1 error correction. Questionnaire results show a preference for Phantom when listening to music, likely driven by an improved balance between high and low frequencies. PMID:25806818
Coutinho, Eduardo; Schuller, Björn
2017-01-01
Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine Learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra-domain experiments (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech intra-domain models achieved the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain.
Eustaquio-Martín, Almudena; Stohl, Joshua S.; Wolford, Robert D.; Schatzer, Reinhold; Wilson, Blake S.
2016-01-01
Objectives: In natural hearing, cochlear mechanical compression is dynamically adjusted via the efferent medial olivocochlear reflex (MOCR). These adjustments probably help understanding speech in noisy environments and are not available to the users of current cochlear implants (CIs). The aims of the present study are to: (1) present a binaural CI sound processing strategy inspired by the control of cochlear compression provided by the contralateral MOCR in natural hearing; and (2) assess the benefits of the new strategy for understanding speech presented in competition with steady noise with a speech-like spectrum in various spatial configurations of the speech and noise sources. Design: Pairs of CI sound processors (one per ear) were constructed to mimic or not mimic the effects of the contralateral MOCR on compression. For the nonmimicking condition (standard strategy or STD), the two processors in a pair functioned similarly to standard clinical processors (i.e., with fixed back-end compression and independently of each other). When configured to mimic the effects of the MOCR (MOC strategy), the two processors communicated with each other and the amount of back-end compression in a given frequency channel of each processor in the pair decreased/increased dynamically (so that output levels dropped/increased) with increases/decreases in the output energy from the corresponding frequency channel in the contralateral processor. Speech reception thresholds in speech-shaped noise were measured for 3 bilateral CI users and 2 single-sided deaf unilateral CI users. Thresholds were compared for the STD and MOC strategies in unilateral and bilateral listening conditions and for three spatial configurations of the speech and noise sources in simulated free-field conditions: speech and noise sources colocated in front of the listener, speech on the left ear with noise in front of the listener, and speech on the left ear with noise on the right ear. 
In both bilateral and unilateral listening, the electrical stimulus delivered to the test ear(s) was always calculated as if the listeners were wearing bilateral processors. Results: In both unilateral and bilateral listening conditions, mean speech reception thresholds were comparable with the two strategies for colocated speech and noise sources, but were at least 2 dB lower (better) with the MOC than with the STD strategy for spatially separated speech and noise sources. In unilateral listening conditions, mean thresholds improved with increasing the spatial separation between the speech and noise sources regardless of the strategy but the improvement was significantly greater with the MOC strategy. In bilateral listening conditions, thresholds improved significantly with increasing the speech-noise spatial separation only with the MOC strategy. Conclusions: The MOC strategy (1) significantly improved the intelligibility of speech presented in competition with a spatially separated noise source, both in unilateral and bilateral listening conditions; (2) produced significant spatial release from masking in bilateral listening conditions, something that did not occur with fixed compression; and (3) enhanced spatial release from masking in unilateral listening conditions. The MOC strategy as implemented here, or a modified version of it, may be usefully applied in CIs and in hearing aids. PMID:26862711
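The contralateral control of back-end compression described in the Design section can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a logarithmic compression map of the form log(1 + c·x) / log(1 + c), a linear rule that lowers the parameter c as contralateral output energy rises, and squared output as the energy measure. The functions and the values `c_max` and `c_min` are hypothetical.

```python
import math


def compress(x, c):
    """Logarithmic map from an envelope x in [0, 1] to an output in [0, 1].

    Larger c means stronger compression (higher output for low-level input).
    """
    return math.log(1.0 + c * x) / math.log(1.0 + c)


def moc_compression_param(contra_energy, c_max=1000.0, c_min=30.0):
    """Assumed control rule: more contralateral energy gives a smaller c."""
    e = min(max(contra_energy, 0.0), 1.0)
    return c_max - (c_max - c_min) * e


def moc_pair_step(env_left, env_right, prev_out_left, prev_out_right):
    """Process one frame for a linked pair of processors.

    Each channel's compression is set by the previous output of the same
    frequency channel in the opposite processor.
    """
    c_left = [moc_compression_param(o * o) for o in prev_out_right]
    c_right = [moc_compression_param(o * o) for o in prev_out_left]
    out_left = [compress(x, c) for x, c in zip(env_left, c_left)]
    out_right = [compress(x, c) for x, c in zip(env_right, c_right)]
    return out_left, out_right


# Same input envelope in both ears, but the right processor was previously
# producing a high output in this channel and the left one was silent.
left, right = moc_pair_step([0.1], [0.1], prev_out_left=[0.0], prev_out_right=[1.0])
print(left, right)
```

In this example the left output drops because the right side was active, which is the direction of the effect stated above: output levels in a channel fall as the contralateral output energy rises. A fixed-compression (STD) pair corresponds to using the same c in both processors regardless of the other side.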
Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena; Stohl, Joshua S; Wolford, Robert D; Schatzer, Reinhold; Wilson, Blake S
2016-01-01
In natural hearing, cochlear mechanical compression is dynamically adjusted via the efferent medial olivocochlear reflex (MOCR). These adjustments probably help understanding speech in noisy environments and are not available to the users of current cochlear implants (CIs). The aims of the present study are to: (1) present a binaural CI sound processing strategy inspired by the control of cochlear compression provided by the contralateral MOCR in natural hearing; and (2) assess the benefits of the new strategy for understanding speech presented in competition with steady noise with a speech-like spectrum in various spatial configurations of the speech and noise sources. Pairs of CI sound processors (one per ear) were constructed to mimic or not mimic the effects of the contralateral MOCR on compression. For the nonmimicking condition (standard strategy or STD), the two processors in a pair functioned similarly to standard clinical processors (i.e., with fixed back-end compression and independently of each other). When configured to mimic the effects of the MOCR (MOC strategy), the two processors communicated with each other and the amount of back-end compression in a given frequency channel of each processor in the pair decreased/increased dynamically (so that output levels dropped/increased) with increases/decreases in the output energy from the corresponding frequency channel in the contralateral processor. Speech reception thresholds in speech-shaped noise were measured for 3 bilateral CI users and 2 single-sided deaf unilateral CI users. Thresholds were compared for the STD and MOC strategies in unilateral and bilateral listening conditions and for three spatial configurations of the speech and noise sources in simulated free-field conditions: speech and noise sources colocated in front of the listener, speech on the left ear with noise in front of the listener, and speech on the left ear with noise on the right ear. 
In both bilateral and unilateral listening, the electrical stimulus delivered to the test ear(s) was always calculated as if the listeners were wearing bilateral processors. In both unilateral and bilateral listening conditions, mean speech reception thresholds were comparable with the two strategies for colocated speech and noise sources, but were at least 2 dB lower (better) with the MOC than with the STD strategy for spatially separated speech and noise sources. In unilateral listening conditions, mean thresholds improved with increasing the spatial separation between the speech and noise sources regardless of the strategy but the improvement was significantly greater with the MOC strategy. In bilateral listening conditions, thresholds improved significantly with increasing the speech-noise spatial separation only with the MOC strategy. The MOC strategy (1) significantly improved the intelligibility of speech presented in competition with a spatially separated noise source, both in unilateral and bilateral listening conditions; (2) produced significant spatial release from masking in bilateral listening conditions, something that did not occur with fixed compression; and (3) enhanced spatial release from masking in unilateral listening conditions. The MOC strategy as implemented here, or a modified version of it, may be usefully applied in CIs and in hearing aids.
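The contralateral control described in this abstract can be illustrated with a minimal sketch for one frequency channel. It is not the authors' implementation. The logarithmic map, the linear control law `moc_c`, the bounds `c_max` and `c_min`, and the use of the previous frame's squared output as the "output energy" are all assumptions made for illustration; time constants and channel filtering are omitted.

```python
import math

def compress(x, c):
    """Logarithmic back-end map for an envelope x in [0, 1].
    Larger c means more compression and a higher output for the same input."""
    return math.log(1.0 + c * x) / math.log(1.0 + c)

def moc_c(contra_energy, c_max=1000.0, c_min=30.0):
    """Hypothetical control law: the compression parameter falls linearly
    from c_max to c_min as contralateral output energy rises from 0 to 1."""
    e = min(max(contra_energy, 0.0), 1.0)
    return c_max - (c_max - c_min) * e

def moc_step(env_left, env_right, prev_left_out, prev_right_out):
    """One time frame of a linked processor pair (MOC-like condition).
    Each ear's compression is set by the other ear's previous output energy."""
    c_left = moc_c(prev_right_out ** 2)
    c_right = moc_c(prev_left_out ** 2)
    return compress(env_left, c_left), compress(env_right, c_right)

def std_step(env_left, env_right, c=1000.0):
    """One time frame of an independent pair with fixed compression (STD-like)."""
    return compress(env_left, c), compress(env_right, c)
```

Under these assumptions, a strong output in the right-ear channel lowers the left-ear output for the same input envelope relative to `std_step`, which is the direction of the effect the abstract describes.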
The development of the Nucleus Freedom Cochlear implant system.
Patrick, James F; Busby, Peter A; Gibson, Peter J
2006-12-01
Cochlear Limited (Cochlear) released the fourth-generation cochlear implant system, Nucleus Freedom, in 2005. Freedom is based on 25 years of experience in cochlear implant research and development and incorporates advances in medicine, implantable materials, electronic technology, and sound coding. This article presents the development of Cochlear's implant systems, with an overview of the first 3 generations, and details of the Freedom system: the CI24RE receiver-stimulator, the Contour Advance electrode, the modular Freedom processor, the available speech coding strategies, the input processing options of Smart Sound to improve the signal before coding as electrical signals, and the programming software. Preliminary results from multicenter studies with the Freedom system are reported, demonstrating better levels of performance compared with the previous systems. The final section presents the most recent implant reliability data, with the early findings at 18 months showing improved reliability of the Freedom implant compared with the earlier Nucleus 3 System. Also reported are some of the findings of Cochlear's collaborative research programs to improve recipient outcomes. Included are studies showing the benefits from bilateral implants, electroacoustic stimulation using an ipsilateral and/or contralateral hearing aid, advanced speech coding, and streamlined speech processor programming.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, J.F.; Ng, L.C.
1998-03-17
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, John F.; Ng, Lawrence C.
1998-01-01
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
Masking of errors in transmission of VAPC-coded speech
NASA Technical Reports Server (NTRS)
Cox, Neil B.; Froese, Edwin L.
1990-01-01
A subjective evaluation is provided of the bit error sensitivity of the message elements of a Vector Adaptive Predictive (VAPC) speech coder, along with an indication of the amenability of these elements to a popular error masking strategy (cross frame hold over). As expected, a wide range of bit error sensitivity was observed. The most sensitive message components were the short term spectral information and the most significant bits of the pitch and gain indices. The cross frame hold over strategy was found to be useful for pitch and gain information, but it was not beneficial for the spectral information unless severe corruption had occurred.
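A cross frame hold over can be sketched as follows: when a frame is flagged as corrupted, selected fields are replaced by the values from the most recent good frame. The frame layout (a dictionary with `pitch`, `gain`, and `spectrum` keys) and the externally supplied corruption flags are hypothetical; the abstract does not describe how errors are detected. The default field list follows the abstract's finding that the strategy helped for pitch and gain but not for spectral information.

```python
def hold_over(frames, corrupted, fields=("pitch", "gain")):
    """Return repaired copies of frames.
    For each frame flagged as corrupted, the listed fields are taken from
    the last uncorrupted frame; other fields are passed through unchanged."""
    repaired_frames = []
    last_good = None
    for frame, bad in zip(frames, corrupted):
        repaired = dict(frame)  # copy so the input is not modified
        if bad:
            if last_good is not None:
                for name in fields:
                    repaired[name] = last_good[name]
        else:
            last_good = frame
        repaired_frames.append(repaired)
    return repaired_frames
```

A corrupted frame with no earlier good frame is passed through unchanged, since there is nothing to hold over.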
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzrichter, J.F.; Ng, L.C.
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Speech Rhythms and Multiplexed Oscillatory Sensory Coding in the Human Brain
Gross, Joachim; Hoogenboom, Nienke; Thut, Gregor; Schyns, Philippe; Panzeri, Stefano; Belin, Pascal; Garrod, Simon
2013-01-01
Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations. PMID:24391472
ERIC Educational Resources Information Center
Hickok, Gregory
2012-01-01
Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the…
Speech processing using conditional observable maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, John; Nix, David
A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ravishankar, C., Hughes Network Systems, Germantown, MD
Speech is the predominant means of communication between human beings and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and is often used interchangeably with speech coding is voice coding.
This term is more generic in the sense that the coding techniques are equally applicable to any voice signal whether or not it carries any intelligible information, as the term speech implies. Other terms that are commonly used are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or equivalently the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice shall be used interchangeably.
Ortwein, Heiderose; Benz, Alexander; Carl, Petra; Huwendiek, Sören; Pander, Tanja; Kiessling, Claudia
2017-02-01
To investigate whether the Verona Coding Definitions of Emotional Sequences to code health providers' responses (VR-CoDES-P) can be used for assessment of medical students' responses to patients' cues and concerns provided in written case vignettes. Student responses in direct speech to patient cues and concerns were analysed in 21 different case scenarios using VR-CoDES-P. A total of 977 student responses were available for coding, and 857 responses were codable with the VR-CoDES-P. In 74.6% of responses, the students used either a "reducing space" statement only or a "providing space" statement immediately followed by a "reducing space" statement. Overall, the most frequent response was explicit information advice (ERIa) followed by content exploring (EPCEx) and content acknowledgement (EPCAc). VR-CoDES-P were applicable to written responses of medical students when they were phrased in direct speech. The application of VR-CoDES-P is reliable and feasible when using the differentiation of "providing" and "reducing space" responses. Communication strategies described by students in non-direct speech were difficult to code and produced many missing values. VR-CoDES-P are useful for analysis of medical students' written responses when focusing on emotional issues. Students need precise instructions for their response in the given test format. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Vector Adaptive/Predictive Encoding Of Speech
NASA Technical Reports Server (NTRS)
Chen, Juin-Hwey; Gersho, Allen
1989-01-01
Vector adaptive/predictive technique for digital encoding of speech signals yields decoded speech of very good quality after transmission at coding rate of 9.6 kb/s and of reasonably good quality at 4.8 kb/s. Requires 3 to 4 million multiplications and additions per second. Combines advantages of adaptive/predictive coding and of code-excited linear prediction; the latter yields speech of high quality but requires 600 million multiplications and additions per second at encoding rate of 4.8 kb/s. Vector adaptive/predictive coding technique bridges gaps in performance and complexity between adaptive/predictive coding and code-excited linear prediction.
Hateful Help--A Practical Look at the Issue of Hate Speech.
ERIC Educational Resources Information Center
Shelton, Michael W.
Many college and university administrators have responded to the recent increase in hateful incidents on campus by putting hate speech codes into place. The establishment of speech codes has sparked a heated debate over the impact that such codes have upon free speech and First Amendment values. Some commentators have suggested that viewing hate…
NASA Astrophysics Data System (ADS)
Studdert-Kennedy, M.; Obrien, N.
1983-05-01
This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: The influence of subcategorical mismatches on lexical access; The Serbo-Croatian orthography constrains the reader to a phonologically analytic strategy; Grammatical priming effects between pronouns and inflected verb forms; Misreadings by beginning readers of Serbo-Croatian; Bi-alphabetism and word recognition; Orthographic and phonemic coding for word identification: Evidence for Hebrew; Stress and vowel duration effects on syllable recognition; Phonetic and auditory trading relations between acoustic cues in speech perception: Further results; Linguistic coding by deaf children in relation to beginning reading success; Determinants of spelling ability in deaf and hearing adults: Access to linguistic structures; A dynamical basis for action systems; On the space-time structure of human interlimb coordination; Some acoustic and physiological observations on diphthongs; Relationship between pitch control and vowel articulation; Laryngeal vibrations: A comparison between high-speed filming and glottographic techniques; Compensatory articulation in hearing impaired speakers: A cinefluorographic study; and Review (Pierre Delattre: Studies in comparative phonetics.)
Impact of dynamic rate coding aspects of mobile phone networks on forensic voice comparison.
Alzqhoul, Esam A S; Nair, Balamurali B T; Guillemin, Bernard J
2015-09-01
Previous studies have shown that landline and mobile phone networks are different in their ways of handling the speech signal, and therefore in their impact on it. But the same is also true of the different networks within the mobile phone arena. There are two major mobile phone technologies currently in use today, namely the global system for mobile communications (GSM) and code division multiple access (CDMA) and these are fundamentally different in their design. For example, the quality of the coded speech in the GSM network is a function of channel quality, whereas in the CDMA network it is determined by channel capacity (i.e., the number of users sharing a cell site). This paper examines the impact on the speech signal of a key feature of these networks, namely dynamic rate coding, and its subsequent impact on the task of likelihood-ratio-based forensic voice comparison (FVC). Surprisingly, both FVC accuracy and precision are found to be better for both GSM- and CDMA-coded speech than for uncoded. Intuitively one expects FVC accuracy to increase with increasing coded speech quality. This trend is shown to occur for the CDMA network, but, surprisingly, not for the GSM network. Further, in respect to comparisons between these two networks, FVC accuracy for CDMA-coded speech is shown to be slightly better than for GSM-coded speech, particularly when the coded-speech quality is high, but in terms of FVC precision the two networks are shown to be very similar. Copyright © 2015 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.
A Comparison of LBG and ADPCM Speech Compression Techniques
NASA Astrophysics Data System (ADS)
Bachu, Rajesh G.; Patel, Jignasa; Barkana, Buket D.
Speech compression is the technology of converting human speech into an efficiently encoded representation that can later be decoded to produce a close approximation of the original signal. In all speech there is a degree of predictability, and speech coding techniques exploit this to reduce bit rates yet still maintain a suitable level of quality. This paper is a study and implementation of the Linde-Buzo-Gray (LBG) and Adaptive Differential Pulse Code Modulation (ADPCM) algorithms to compress speech signals. We implemented the methods using MATLAB 7.0. The methods we used in this study gave good results and performance in compressing the speech, and listening tests showed that efficient and high quality coding is achieved.
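For orientation, the LBG codebook design named in this abstract can be sketched in a few lines. This is a generic textbook-style version in Python, not the paper's MATLAB code. The additive split offset `eps`, the fixed iteration count, and the restriction to power-of-two codebook sizes are assumptions chosen to keep the sketch short.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def lbg(vectors, size, eps=0.01, iters=20):
    """Train a codebook of `size` codewords (size is assumed to be a power of two).
    Start from the global centroid, split every codeword into two perturbed
    copies, then refine with nearest-neighbour assignment and centroid updates."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [
                centroid(cell) if cell else cw
                for cell, cw in zip(cells, codebook)
            ]
    return codebook
```

Compression then amounts to transmitting `nearest(v, codebook)` for each input vector instead of the vector itself; the decoder looks the index up in the same codebook.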
Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses
ERIC Educational Resources Information Center
Foundation for Individual Rights in Education (NJ1), 2012
2012-01-01
The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be…
Arndt, Susan; Aschendorff, Antje; Laszig, Roland; Wesarg, Thomas
2016-01-01
The ability to detect a target signal masked by noise is improved in normal-hearing listeners when interaural phase differences (IPDs) between the ear signals exist either in the masker or in the signal. To improve binaural hearing in bilaterally implanted cochlear implant (BiCI) users, a coding strategy providing the best possible access to IPD is highly desirable. In this study, we compared two coding strategies in BiCI users provided with CI systems from MED-EL (Innsbruck, Austria). The CI systems were bilaterally programmed either with the fine structure processing strategy FS4 or with the constant rate strategy high definition continuous interleaved sampling (HDCIS). Familiarization periods between 6 and 12 weeks were considered. The effect of IPD was measured in two types of experiments: (a) IPD detection thresholds with tonal signals addressing mainly one apical interaural electrode pair and (b) with speech in noise in terms of binaural speech intelligibility level differences (BILD) addressing multiple electrodes bilaterally. The results in (a) showed improved IPD detection thresholds with FS4 compared with HDCIS in four out of the seven BiCI users. In contrast, 12 BiCI users in (b) showed similar BILD with FS4 (0.6 ± 1.9 dB) and HDCIS (0.5 ± 2.0 dB). However, no correlation between results in (a) and (b) both obtained with FS4 was found. In conclusion, the degree of IPD sensitivity determined on an apical interaural electrode pair was not an indicator for BILD based on bilateral multielectrode stimulation. PMID:27659487
Zirn, Stefan; Arndt, Susan; Aschendorff, Antje; Laszig, Roland; Wesarg, Thomas
2016-09-22
The ability to detect a target signal masked by noise is improved in normal-hearing listeners when interaural phase differences (IPDs) between the ear signals exist either in the masker or in the signal. To improve binaural hearing in bilaterally implanted cochlear implant (BiCI) users, a coding strategy providing the best possible access to IPD is highly desirable. In this study, we compared two coding strategies in BiCI users provided with CI systems from MED-EL (Innsbruck, Austria). The CI systems were bilaterally programmed either with the fine structure processing strategy FS4 or with the constant rate strategy high definition continuous interleaved sampling (HDCIS). Familiarization periods between 6 and 12 weeks were considered. The effect of IPD was measured in two types of experiments: (a) IPD detection thresholds with tonal signals addressing mainly one apical interaural electrode pair and (b) with speech in noise in terms of binaural speech intelligibility level differences (BILD) addressing multiple electrodes bilaterally. The results in (a) showed improved IPD detection thresholds with FS4 compared with HDCIS in four out of the seven BiCI users. In contrast, 12 BiCI users in (b) showed similar BILD with FS4 (0.6 ± 1.9 dB) and HDCIS (0.5 ± 2.0 dB). However, no correlation between results in (a) and (b) both obtained with FS4 was found. In conclusion, the degree of IPD sensitivity determined on an apical interaural electrode pair was not an indicator for BILD based on bilateral multielectrode stimulation. © The Author(s) 2016.
Pulse Vector-Excitation Speech Encoder
NASA Technical Reports Server (NTRS)
Davidson, Grant; Gersho, Allen
1989-01-01
Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.
Neural Coding of Formant-Exaggerated Speech in the Infant Brain
ERIC Educational Resources Information Center
Zhang, Yang; Koerner, Tess; Miller, Sharon; Grice-Patil, Zach; Svec, Adam; Akbari, David; Tusler, Liz; Carney, Edward
2011-01-01
Speech scientists have long proposed that formant exaggeration in infant-directed speech plays an important role in language acquisition. This event-related potential (ERP) study investigated neural coding of formant-exaggerated speech in 6-12-month-old infants. Two synthetic /i/ vowels were presented in alternating blocks to test the effects of…
Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.
Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth
2017-08-09
Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. 
Using magnetoencephalography, we demonstrate anatomically distinct cortical representations of modulated noise in normal-hearing and hearing-impaired listeners. This work provides the first link among hearing thresholds, the amplitude of cortical representations of modulated sounds, and the ability to understand speech in modulated background noise. In light of previous work, we propose that magnified cortical representations of modulated sounds disrupt the separation of speech from modulated background noise in auditory cortex. Copyright © 2017 Millman et al.
Auditory-neurophysiological responses to speech during early childhood: Effects of background noise
White-Schwoch, Travis; Davies, Evan C.; Thompson, Elaine C.; Carr, Kali Woodruff; Nicol, Trent; Bradlow, Ann R.; Kraus, Nina
2015-01-01
Early childhood is a critical period of auditory learning, during which children are constantly mapping sounds to meaning. But learning rarely occurs under ideal listening conditions—children are forced to listen against a relentless din. This background noise degrades the neural coding of these critical sounds, in turn interfering with auditory learning. Despite the importance of robust and reliable auditory processing during early childhood, little is known about the neurophysiology underlying speech processing in children so young. To better understand the physiological constraints these adverse listening scenarios impose on speech sound coding during early childhood, auditory-neurophysiological responses were elicited to a consonant-vowel syllable in quiet and background noise in a cohort of typically-developing preschoolers (ages 3–5 yr). Overall, responses were degraded in noise: they were smaller, less stable across trials, slower, and there was poorer coding of spectral content and the temporal envelope. These effects were exacerbated in response to the consonant transition relative to the vowel, suggesting that the neural coding of spectrotemporally-dynamic speech features is more tenuous in noise than the coding of static features—even in children this young. Neural coding of speech temporal fine structure, however, was more resilient to the addition of background noise than coding of temporal envelope information. Taken together, these results demonstrate that noise places a neurophysiological constraint on speech processing during early childhood by causing a breakdown in neural processing of speech acoustics. These results may explain why some listeners have inordinate difficulties understanding speech in noise. 
Speech-elicited auditory-neurophysiological responses offer objective insight into listening skills during early childhood by reflecting the integrity of neural coding in quiet and noise; this paper documents typical response properties in this age group. These normative metrics may be useful clinically to evaluate auditory processing difficulties during early childhood. PMID:26113025
Ultra-narrow bandwidth voice coding
Holzrichter, John F [Berkeley, CA; Ng, Lawrence C [Danville, CA
2007-01-09
A system of removing excess information from a human speech signal and coding the remaining signal information, transmitting the coded signal, and reconstructing the coded signal. The system uses one or more EM wave sensors and one or more acoustic microphones to determine at least one characteristic of the human speech signal.
Language Recognition via Sparse Coding
2016-09-08
a posteriori (MAP) adaptation scheme that further optimizes the discriminative quality of sparse-coded speech features. We empirically validate the...significantly improve the discriminative quality of sparse-coded speech features. In Section 4, we evaluate the proposed approaches against an i-vector
NASA Technical Reports Server (NTRS)
Sandor, Aniko; Moses, Haifa
2016-01-01
Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.
Müller, Joachim
2005-01-01
Over the past two decades, the fascinating possibilities of cochlear implants for congenitally deaf or deafened children and adults developed tremendously and created a rapidly developing interdisciplinary research field. The main advancements of cochlear implantation in the past decade are marked by significant improvement of hearing and speech understanding in CI users. These improvements are attributed to the enhancement of speech coding strategies. The implantation of more (and increasingly younger) children as well as the possibilities of the restoration of binaural hearing abilities with cochlear implants reflect the high standards reached by this development. Despite this progress, modern cochlear implants do not yet enable normal speech understanding, not even for the best patients. In particular, speech understanding in noise remains problematic [1]. Until the mid-1990s research concentrated on unilateral implantation. Remarkable and effective improvements have been made with bilateral implantation since 1996. Nowadays an increasing number of patients enjoy these benefits. PMID:22073052
Müller, Joachim
2005-01-01
Over the past two decades, the fascinating possibilities of cochlear implants for congenitally deaf or deafened children and adults developed tremendously and created a rapidly developing interdisciplinary research field. The main advancements of cochlear implantation in the past decade are marked by significant improvement of hearing and speech understanding in CI users. These improvements are attributed to the enhancement of speech coding strategies. The implantation of more (and increasingly younger) children as well as the possibilities of the restoration of binaural hearing abilities with cochlear implants reflect the high standards reached by this development. Despite this progress, modern cochlear implants do not yet enable normal speech understanding, not even for the best patients. In particular, speech understanding in noise remains problematic [1]. Until the mid-1990s research concentrated on unilateral implantation. Remarkable and effective improvements have been made with bilateral implantation since 1996. Nowadays an increasing number of patients enjoy these benefits.
Improved Speech Coding Based on Open-Loop Parameter Estimation
NASA Technical Reports Server (NTRS)
Juang, Jer-Nan; Chen, Ya-Chin; Longman, Richard W.
2000-01-01
A nonlinear optimization algorithm for linear predictive speech coding was developed earlier that not only optimizes the linear model coefficients for the open loop predictor, but does the optimization including the effects of quantization of the transmitted residual. It also simultaneously optimizes the quantization levels used for each speech segment. In this paper, we present an improved method for initialization of this nonlinear algorithm, and demonstrate substantial improvements in performance. In addition, the new procedure produces monotonically improving speech quality with increasing numbers of bits used in the transmitted error residual. Examples of speech encoding and decoding are given for 8 speech segments, and signal-to-noise ratios as high as 47 dB are produced. As in typical linear predictive coding, the optimization is done on the open loop speech analysis model. Here we demonstrate that minimizing the error of the closed loop speech reconstruction, instead of the simpler open loop optimization, is likely to produce negligible improvement in speech quality. The examples suggest that the algorithm here is close to giving the best performance obtainable from a linear model, for the chosen order with the chosen number of bits for the codebook.
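The open-loop pipeline this abstract builds on can be illustrated with a minimal sketch. It assumes the classical autocorrelation method with the Levinson-Durbin recursion and a plain uniform quantizer for the residual; it is not the nonlinear optimization the authors describe, and every function name, the test signal, and the bit depths are hypothetical choices made for illustration.

```python
import math

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a sample list (signal assumed zero outside)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations; returns (coefficients, prediction error energy).

    Convention: x_hat[n] = sum_k a[k] * x[n-1-k] with a zero-indexed list a.
    """
    a = [0.0] * (order + 1)            # a[0] is unused inside the recursion
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

def lpc(x, order):
    return levinson_durbin(autocorr(x, order), order)

def residual(x, a):
    """Open-loop prediction error e[n] = x[n] - sum_k a[k] * x[n-1-k]."""
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]

def synthesize(e, a):
    """Decoder: drive the all-pole synthesis filter with the (quantized) residual."""
    y = []
    for n in range(len(e)):
        pred = sum(a[k] * y[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        y.append(pred + e[n])
    return y

def quantize(e, bits):
    """Uniform quantizer with 2**bits cells spanning [-peak, +peak]."""
    peak = max(abs(v) for v in e) or 1.0
    levels = 2 ** bits
    step = 2.0 * peak / levels
    out = []
    for v in e:
        idx = math.floor(v / step)
        idx = max(-levels // 2, min(levels // 2 - 1, idx))   # clamp to valid cells
        out.append((idx + 0.5) * step)                        # cell midpoint
    return out

def snr_db(x, y):
    noise = sum((p - q) ** 2 for p, q in zip(x, y))
    if noise == 0:
        return float("inf")
    return 10 * math.log10(sum(p * p for p in x) / noise)

# Hypothetical test segment: two sinusoids standing in for a voiced frame.
x = [math.sin(0.3 * n) + 0.5 * math.sin(0.71 * n) for n in range(200)]
a, _ = lpc(x, order=4)
e = residual(x, a)
for bits in (3, 5, 8):
    print(bits, "bits ->", round(snr_db(x, synthesize(quantize(e, bits), a)), 1), "dB")
```

Because the quantizer here is applied to the residual after the coefficients are fixed, quantization error is shaped by the synthesis filter at the decoder; that coupling is the effect the abstract's optimization is said to account for.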
Civility on Campus: Harassment Codes vs. Free Speech. ASHE Annual Meeting Paper.
ERIC Educational Resources Information Center
Nordin, Virginia Davis
In response to the resurgence of racial incidents and increased "gay-bashing" on higher education campuses in recent years, campus authorities have instituted harassment codes, thereby giving rise to a conflict with free speech. Similar conflicts and challenges to free speech have arisen recently in a municipal context such as a St. Paul…
ERP evidence for the recognition of emotional prosody through simulated cochlear implant strategies.
Agrawal, Deepashri; Timm, Lydia; Viola, Filipa Campos; Debener, Stefan; Büchner, Andreas; Dengler, Reinhard; Wittfoth, Matthias
2012-09-20
Emotionally salient information in spoken language can be provided by variations in speech melody (prosody) or by emotional semantics. Emotional prosody is essential to convey feelings through speech. In sensorineural hearing loss, impaired speech perception can be improved by cochlear implants (CIs). The aim of this study was to investigate the performance of normal-hearing (NH) participants on the perception of emotional prosody with vocoded stimuli. Semantically neutral sentences with emotional (happy, angry and neutral) prosody were used. Sentences were manipulated to simulate two CI speech-coding strategies: the Advanced Combination Encoder (ACE) and the newly developed Psychoacoustic Advanced Combination Encoder (PACE). Twenty NH adults were asked to recognize emotional prosody from ACE and PACE simulations. Performance was assessed using behavioral tests and event-related potentials (ERPs). Behavioral data revealed superior performance with original stimuli compared to the simulations. For simulations, better recognition for happy and angry prosody was observed compared to the neutral. Irrespective of simulated or unsimulated stimulus type, a significantly larger P200 event-related potential was observed for happy prosody after sentence onset than for the other two emotions. Further, the amplitude of P200 was significantly more positive for PACE strategy use compared to the ACE strategy. Results suggested the P200 peak as an indicator of active differentiation and recognition of emotional prosody. Larger P200 peak amplitude for happy prosody indicated the importance of fundamental frequency (F0) cues in prosody processing. The advantage of PACE over ACE highlighted a privileged role of the psychoacoustic masking model in improving prosody perception. Taken together, the study emphasizes the importance of vocoded simulation to better understand the prosodic cues which CI users may be utilizing.
Hansen, J H; Nandkumar, S
1995-01-01
The formulation of reliable signal processing algorithms for speech coding and synthesis requires the selection of a prior criterion of performance. Though coding efficiency (bits/second) or computational requirements can be used, a final performance measure must always include speech quality. In this paper, three objective speech quality measures are considered with respect to quality assessment for American English, noisy American English, and noise-free versions of seven languages. The purpose is to determine whether objective quality measures can be used to quantify changes in quality for a given voice coding method, with a known subjective performance level, as background noise or language conditions are changed. The speech coding algorithm chosen is regular-pulse excitation with long-term prediction (RPE-LTP), which has been chosen as the standard voice compression algorithm for the European Digital Mobile Radio system. Three areas are considered for objective quality assessment: (i) vocoder performance for American English in a noise-free environment, (ii) speech quality variation for three additive background noise sources, and (iii) noise-free performance for seven languages, which include English, Japanese, Finnish, German, Hindi, Spanish, and French. It is suggested that although existing objective quality measures will never replace subjective testing, they can be a useful means of assessing changes in performance, identifying areas for improvement in algorithm design, and augmenting subjective quality tests for voice coding/compression algorithms in noise-free, noisy, and/or non-English applications.
Different Timescales for the Neural Coding of Consonant and Vowel Sounds
Perez, Claudia A.; Engineer, Crystal T.; Jakkamsetti, Vikram; Carraway, Ryan S.; Perry, Matthew S.
2013-01-01
Psychophysical, clinical, and imaging evidence suggests that consonant and vowel sounds have distinct neural representations. This study tests the hypothesis that consonant and vowel sounds are represented on different timescales within the same population of neurons by comparing behavioral discrimination with neural discrimination based on activity recorded in rat inferior colliculus and primary auditory cortex. Performance on 9 vowel discrimination tasks was highly correlated with neural discrimination based on spike count and was not correlated when spike timing was preserved. In contrast, performance on 11 consonant discrimination tasks was highly correlated with neural discrimination when spike timing was preserved and not when spike timing was eliminated. These results suggest that in the early stages of auditory processing, spike count encodes vowel sounds and spike timing encodes consonant sounds. These distinct coding strategies likely contribute to the robust nature of speech sound representations and may help explain some aspects of developmental and acquired speech processing disorders. PMID:22426334
Coelho, Ana Cristina; Brasolotto, Alcione Ghedini; Bevilacqua, Maria Cecília
2015-06-01
To compare some perceptual and acoustic characteristics of the voices of children who use the advanced combination encoder (ACE) or fine structure processing (FSP) speech coding strategies, and to investigate whether these characteristics differ from those of children with normal hearing. Acoustic analysis of the sustained vowel /a/ was performed using the multi-dimensional voice program (MDVP). Analyses of sequential and spontaneous speech were performed using the real time pitch. Perceptual analyses of these samples were performed using visual analog scales of pre-selected parameters. Seventy-six children from three years to five years and 11 months of age participated. Twenty-eight were users of ACE, 23 were users of FSP, and 25 were children with normal hearing. Although both groups with CI presented with some deviated vocal features, the users of ACE presented with voice quality more like that of children with normal hearing than the users of FSP. Sound processing of ACE appeared to provide better conditions for auditory monitoring of the voice, and consequently, for better control of voice production. However, these findings need to be further investigated due to the lack of comparative studies published to understand exactly which attributes of sound processing are responsible for differences in performance.
Vaerenberg, Bart; Péan, Vincent; Lesbros, Guillaume; De Ceulaer, Geert; Schauwers, Karen; Daemers, Kristin; Gnansia, Dan; Govaerts, Paul J
2013-06-01
To assess the auditory performance of Digisonic® cochlear implant users with electric stimulation (ES) and electro-acoustic stimulation (EAS) with special attention to the processing of low-frequency temporal fine structure. Six patients implanted with a Digisonic® SP implant and showing low-frequency residual hearing were fitted with the Zebra® speech processor providing both electric and acoustic stimulation. Assessment consisted of monosyllabic speech identification tests in quiet and in noise at different presentation levels, and a pitch discrimination task using harmonic and disharmonic intonating complex sounds (Vaerenberg et al., 2011). These tests investigate place and time coding through pitch discrimination. All tasks were performed with ES only and with EAS. Speech results in noise showed significant improvement with EAS when compared to ES. Whereas EAS did not yield better results in the harmonic intonation test, the improvements in the disharmonic intonation test were remarkable, suggesting better coding of pitch cues requiring phase locking. These results suggest that patients with residual hearing in the low-frequency range still have good phase-locking capacities, allowing them to process fine temporal information. ES relies mainly on place coding but provides poor low-frequency temporal coding, whereas EAS also provides temporal coding in the low-frequency range. Patients with residual phase-locking capacities can make use of these cues.
Neural evidence for predictive coding in auditory cortex during speech production.
Okada, Kayoko; Matchin, William; Hickok, Gregory
2018-02-01
Recent models of speech production suggest that motor commands generate forward predictions of the auditory consequences of those commands, that these forward predictions can be used to monitor and correct speech output, and that this system is hierarchically organized (Hickok, Houde, & Rong, Neuron, 69(3), 407-422, 2011; Pickering & Garrod, Behavioral and Brain Sciences, 36(4), 329-347, 2013). Recent psycholinguistic research has shown that internally generated speech (i.e., imagined speech) produces different types of errors than does overt speech (Oppenheim & Dell, Cognition, 106(1), 528-537, 2008; Oppenheim & Dell, Memory & Cognition, 38(8), 1147-1160, 2010). These studies suggest that articulated speech might involve predictive coding at more levels than imagined speech. The current fMRI experiment investigates neural evidence of predictive coding in speech production. Twenty-four participants from UC Irvine were recruited for the study. Participants were scanned while they were visually presented with a sequence of words that they reproduced in sync with a visual metronome. On each trial, they were cued to either silently articulate the sequence or to imagine the sequence without overt articulation. As expected, silent articulation and imagined speech both engaged a left hemisphere network previously implicated in speech production. A contrast of silent articulation with imagined speech revealed greater activation for articulated speech in inferior frontal cortex, premotor cortex and the insula in the left hemisphere, consistent with greater articulatory load. Although both conditions were silent, this contrast also produced significantly greater activation in auditory cortex in dorsal superior temporal gyrus in both hemispheres. We suggest that these activations reflect forward predictions arising from additional levels of the perceptual/motor hierarchy that are involved in monitoring the intended speech output.
Bilingual Voicing: A Study of Code-Switching in the Reported Speech of Finnish Immigrants in Estonia
ERIC Educational Resources Information Center
Frick, Maria; Riionheimo, Helka
2013-01-01
Through a conversation analytic investigation of Finnish-Estonian bilingual (direct) reported speech (i.e., voicing) by Finns who live in Estonia, this study shows how code-switching is used as a double contextualization device. The code-switched voicings are shaped by the on-going interactional situation, serving its needs by opening up a context…
Look at the Gato! Code-Switching in Speech to Toddlers
ERIC Educational Resources Information Center
Bail, Amelie; Morini, Giovanna; Newman, Rochelle S.
2015-01-01
We examined code-switching (CS) in the speech of twenty-four bilingual caregivers when speaking with their 18- to 24-month-old children. All parents code-switched at least once in a short play session, and some code-switched quite often (over 1/3 of utterances). This CS included both inter-sentential and intra-sentential switches, suggesting that at least…
4800 B/S speech compression techniques for mobile satellite systems
NASA Technical Reports Server (NTRS)
Townes, S. A.; Barnwell, T. P., III; Rose, R. C.; Gersho, A.; Davidson, G.
1986-01-01
This paper will discuss three 4800 bps digital speech compression techniques currently being investigated for application in the mobile satellite service. These three techniques, vector adaptive predictive coding, vector excitation coding, and the self excited vocoder, are the most promising among a number of techniques being developed to possibly provide near-toll-quality speech compression while still keeping the bit-rate low enough for a power and bandwidth limited satellite service.
Abrams, Daniel A; Nicol, Trent; White-Schwoch, Travis; Zecker, Steven; Kraus, Nina
2017-05-01
Speech perception relies on a listener's ability to simultaneously resolve multiple temporal features in the speech signal. Little is known regarding neural mechanisms that enable the simultaneous coding of concurrent temporal features in speech. Here we show that two categories of temporal features in speech, the low-frequency speech envelope and periodicity cues, are processed by distinct neural mechanisms within the same population of cortical neurons. We measured population activity in primary auditory cortex of anesthetized guinea pig in response to three variants of a naturally produced sentence. Results show that the envelope of population responses closely tracks the speech envelope, and this cortical activity more closely reflects wider bandwidths of the speech envelope compared to narrow bands. Additionally, neuronal populations represent the fundamental frequency of speech robustly with phase-locked responses. Importantly, these two temporal features of speech are simultaneously observed within neuronal ensembles in auditory cortex in response to clear, conversational, and compressed speech exemplars. Results show that auditory cortical neurons are adept at simultaneously resolving multiple temporal features in extended speech sentences using discrete coding mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
Yoo, Sejin; Chung, Jun-Young; Jeon, Hyeon-Ae; Lee, Kyoung-Min; Kim, Young-Bo; Cho, Zang-Hee
2012-07-01
Speech production is inextricably linked to speech perception, yet they are usually investigated in isolation. In this study, we employed a verbal-repetition task to identify the neural substrates of speech processing with two ends active simultaneously using functional MRI. Subjects verbally repeated auditory stimuli containing an ambiguous vowel sound that could be perceived as either a word or a pseudoword depending on the interpretation of the vowel. We found verbal repetition commonly activated the audition-articulation interface bilaterally at Sylvian fissures and superior temporal sulci. Contrasting word-versus-pseudoword trials revealed neural activities unique to word repetition in the left posterior middle temporal areas and activities unique to pseudoword repetition in the left inferior frontal gyrus. These findings imply that the tasks are carried out using different speech codes: an articulation-based code of pseudowords and an acoustic-phonetic code of words. It also supports the dual-stream model and imitative learning of vocabulary. Copyright © 2012 Elsevier Inc. All rights reserved.
Goehring, Tobias; Bolner, Federico; Monaghan, Jessica J M; van Dijk, Bas; Zarowski, Andrzej; Bleeck, Stefan
2017-02-01
Speech understanding in noisy environments is still one of the major challenges for cochlear implant (CI) users in everyday life. We evaluated a speech enhancement algorithm based on neural networks (NNSE) for improving speech intelligibility in noise for CI users. The algorithm decomposes the noisy speech signal into time-frequency units, extracts a set of auditory-inspired features and feeds them to the neural network to produce an estimation of which frequency channels contain more perceptually important information (higher signal-to-noise ratio, SNR). This estimate is used to attenuate noise-dominated and retain speech-dominated CI channels for electrical stimulation, as in traditional n-of-m CI coding strategies. The proposed algorithm was evaluated by measuring the speech-in-noise performance of 14 CI users using three types of background noise. Two NNSE algorithms were compared: a speaker-dependent algorithm, that was trained on the target speaker used for testing, and a speaker-independent algorithm, that was trained on different speakers. Significant improvements in the intelligibility of speech in stationary and fluctuating noises were found relative to the unprocessed condition for the speaker-dependent algorithm in all noise types and for the speaker-independent algorithm in 2 out of 3 noise types. The NNSE algorithms used noise-specific neural networks that generalized to novel segments of the same noise type and worked over a range of SNRs. The proposed algorithm has the potential to improve the intelligibility of speech in noise for CI users while meeting the requirements of low computational complexity and processing delay for application in CI devices. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
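The channel-selection step mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming per-channel envelope values for a single stimulation frame and per-channel SNR estimates that would, in the study, come from the neural network; here they are hand-entered hypothetical numbers, and the function name and parameters are invented for the example. When no SNR estimate is supplied, the sketch falls back to ranking by envelope amplitude, which is how traditional n-of-m strategies are usually described.

```python
def select_n_of_m(envelopes, n, snr_estimates_db=None):
    """Keep n of the m channels for one frame and zero the rest.

    envelopes        : per-channel envelope amplitudes (length m)
    n                : number of channels to stimulate in this frame
    snr_estimates_db : optional per-channel SNR estimates; if given, channels
                       are ranked by estimated SNR instead of by amplitude
    """
    m = len(envelopes)
    if not 0 <= n <= m:
        raise ValueError("n must be between 0 and the number of channels")
    scores = envelopes if snr_estimates_db is None else snr_estimates_db
    if len(scores) != m:
        raise ValueError("one score per channel is required")
    ranked = sorted(range(m), key=lambda ch: scores[ch], reverse=True)
    keep = set(ranked[:n])
    return [env if ch in keep else 0.0 for ch, env in enumerate(envelopes)]

# Hypothetical frame with four channels: channel 0 is loud but noise-dominated.
frame = [0.9, 0.2, 0.5, 0.7]
snr_db = [-5.0, 10.0, 3.0, -1.0]

print(select_n_of_m(frame, 2))            # amplitude ranking keeps channels 0 and 3
print(select_n_of_m(frame, 2, snr_db))    # SNR ranking keeps channels 1 and 2
```

The contrast between the two calls shows why an SNR-driven selection can differ from amplitude-driven selection when the loudest channels carry mostly noise.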
Signers and co-speech gesturers adopt similar strategies for portraying viewpoint in narratives.
Quinto-Pozos, David; Parrill, Fey
2015-01-01
Gestural viewpoint research suggests that several dimensions determine which perspective a narrator takes, including properties of the event described. Events can evoke gestures from the point of view of a character (CVPT), an observer (OVPT), or both perspectives. CVPT and OVPT gestures have been compared to constructed action (CA) and classifiers (CL) in signed languages. We ask how CA and CL, as represented in ASL productions, compare to previous results for CVPT and OVPT from English-speaking co-speech gesturers. Ten ASL signers described cartoon stimuli from Parrill (2010). Events shown by Parrill to elicit a particular gestural strategy (CVPT, OVPT, both) were coded for signers' instances of CA and CL. CA was divided into three categories: CA-torso, CA-affect, and CA-handling. Signers used CA-handling the most when gesturers used CVPT exclusively. Additionally, signers used CL the most when gesturers used OVPT exclusively and CL the least when gesturers used CVPT exclusively. Copyright © 2014 Cognitive Science Society, Inc.
Harris, Margaret; Moreno, Constanza
2006-01-01
Nine children with severe-profound prelingual hearing loss and single-word reading scores not more than 10 months behind chronological age (Good Readers) were matched with 9 children whose reading lag was at least 15 months (Poor Readers). Good Readers had significantly higher spelling and reading comprehension scores. They produced significantly more phonetic errors (indicating the use of phonological coding) and more often correctly represented the number of syllables in spelling than Poor Readers. They also scored more highly on orthographic awareness and were better at speech reading. Speech intelligibility was the same in the two groups. Cluster analysis revealed that only three Good Readers showed strong evidence of phonetic coding in spelling although seven had good representation of syllables; only four had high orthographic awareness scores. However, all 9 children were good speech readers, suggesting that a phonological code derived through speech reading may underpin reading success for deaf children.
More About Vector Adaptive/Predictive Coding Of Speech
NASA Technical Reports Server (NTRS)
Jedrey, Thomas C.; Gersho, Allen
1992-01-01
Report presents additional information about digital speech-encoding and -decoding system described in "Vector Adaptive/Predictive Encoding of Speech" (NPO-17230). Summarizes development of vector adaptive/predictive coding (VAPC) system and describes basic functions of algorithm. Describes refinements introduced enabling receiver to cope with errors. VAPC algorithm implemented in integrated-circuit coding/decoding processors (codecs). VAPC and other codecs tested under variety of operating conditions. Tests designed to reveal effects of various background quiet and noisy environments and of poor telephone equipment. VAPC found competitive with and, in some respects, superior to other 4.8-kb/s codecs and other codecs of similar complexity.
Speaking of Race, Speaking of Sex: Hate Speech, Civil Rights, and Civil Liberties.
ERIC Educational Resources Information Center
Gates, Henry Louis, Jr.; And Others
The essays of this collection explore the restriction of speech and the hate speech codes that attempt to restrict bigoted or offensive speech and punish those who engage in it. These essays generally argue that speech restrictions are dangerous and counterproductive, but they acknowledge that it is very difficult to distinguish between…
ERIC Educational Resources Information Center
Raine, Adrian; And Others
1991-01-01
Children with speech disorders had lower short-term memory capacity and smaller word length effect than control children. Children with speech disorders also had reduced speech-motor activity during rehearsal. Results suggest that speech rate may be a causal determinant of verbal short-term memory capacity. (BC)
Speech processing using maximum likelihood continuity mapping
Hogden, John E.
2000-01-01
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
Speech processing using maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.E.
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
NASA Astrophysics Data System (ADS)
Jiang, Hongyan; Qiu, Hongbing; He, Ning; Liao, Xin
2018-06-01
For optoacoustic communication from in-air platforms to submerged apparatus, a method based on speech recognition and variable laser-pulse repetition rates is proposed, which realizes character encoding and transmission for speech. First, the theory and spectral characteristics of laser-generated underwater sound are analyzed; next, character conversion and encoding for speech, as well as the code pattern for laser modulation, are studied; finally, experiments are carried out to verify the system design. Results show that the optoacoustic system, where laser modulation is controlled by speech-to-character baseband codes, helps improve flexibility in receiving location for underwater targets as well as real-time performance in information transmission. In the overwater transmitter, a pulse laser is controlled to radiate by speech signals with several repetition rates randomly selected in the range of one to fifty Hz, and then in the underwater receiver the laser pulse repetition rate and data can be acquired from the preamble and information codes of the corresponding laser-generated sound. When the energy of the laser pulse is appropriate, real-time transmission for speaker-independent speech can be realized in that way, which addresses the problem of limited underwater bandwidth and provides a technical approach for air-sea communication.
Hate Speech and the First Amendment.
ERIC Educational Resources Information Center
Rainey, Susan J.; Kinsler, Waren S.; Kannarr, Tina L.; Reaves, Asa E.
This document is comprised of California state statutes, federal legislation, and court litigation pertaining to hate speech and the First Amendment. The document provides an overview of California education code sections relating to the regulation of speech; basic principles of the First Amendment; government efforts to regulate hate speech,…
The Cheerleaders' Mock Execution
ERIC Educational Resources Information Center
Trujillo-Jenks, Laura
2011-01-01
The fervor of student speech is demonstrated through different mediums and venues in public schools. In this case, a new principal encounters the mores of a community that believes in free speech, specifically student free speech. When a pep rally becomes a venue for hate speech, terroristic threats, and profanity, the student code of conduct…
Automating annotation of information-giving for analysis of clinical conversation.
Mayfield, Elijah; Laws, M Barton; Wilson, Ira B; Penstein Rosé, Carolyn
2014-02-01
Coding of clinical communication for fine-grained features such as speech acts has produced a substantial literature. However, annotation by humans is laborious and expensive, limiting application of these methods. We aimed to show that through machine learning, computers could code certain categories of speech acts with sufficient reliability to make useful distinctions among clinical encounters. The data were transcripts of 415 routine outpatient visits of HIV patients which had previously been coded for speech acts using the Generalized Medical Interaction Analysis System (GMIAS); 50 had also been coded for larger scale features using the Comprehensive Analysis of the Structure of Encounters System (CASES). We aggregated selected speech acts into information-giving and requesting, then trained the machine to automatically annotate using logistic regression classification. We evaluated reliability by per-speech act accuracy. We used multiple regression to predict patient reports of communication quality from post-visit surveys using the patient and provider information-giving to information-requesting ratio (briefly, information-giving ratio) and patient gender. Automated coding produces moderate reliability with human coding (accuracy 71.2%, κ=0.57), with high correlation between machine and human prediction of the information-giving ratio (r=0.96). The regression significantly predicted four of five patient-reported measures of communication quality (r=0.263-0.344). The information-giving ratio is a useful and intuitive measure for predicting patient perception of provider-patient communication quality. These predictions can be made with automated annotation, which is a practical option for studying large collections of clinical encounters with objectivity, consistency, and low cost, providing greater opportunity for training and reflection for care providers.
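The agreement statistics reported in this abstract (accuracy and Cohen's κ) and the information-giving ratio can be computed as in the following minimal sketch. The label names `"give"` and `"request"`, the toy label sequences, and the function names are assumptions made for illustration; the study's actual GMIAS categories and aggregation rules are not reproduced here.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labeled independently at their own base rates.
    p_expected = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    if p_expected == 1.0:
        return 1.0          # both annotators used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)

def information_giving_ratio(labels, giving="give", requesting="request"):
    """Ratio of information-giving to information-requesting speech acts."""
    counts = Counter(labels)
    if counts[requesting] == 0:
        raise ValueError("ratio is undefined without any requesting acts")
    return counts[giving] / counts[requesting]

# Hypothetical per-utterance labels for one short encounter.
human = ["give", "give", "request", "request"]
machine = ["give", "give", "request", "give"]

accuracy = sum(h == m for h, m in zip(human, machine)) / len(human)
print("accuracy:", accuracy)
print("kappa:", cohens_kappa(human, machine))
print("ratio (human):", information_giving_ratio(human))
print("ratio (machine):", information_giving_ratio(machine))
```

Kappa is lower than raw accuracy whenever some agreement would be expected by chance, which is consistent with the abstract reporting 71.2% accuracy alongside κ=0.57.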
Noise suppression methods for robust speech processing
NASA Astrophysics Data System (ADS)
Boll, S. F.; Ravindra, H.; Randall, G.; Armantrout, R.; Power, R.
1980-05-01
Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments during this reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable, secure compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include the understanding of how environmental noise degrades narrow-band, coded speech, development of appropriate real-time noise suppression algorithms, and development of speech parameter identification methods that consider signal contamination as a fundamental element in the estimation process. This report describes the current research and results in the areas of noise suppression using dual-input adaptive noise cancellation based on short-time Fourier transform algorithms, articulation rate change techniques, and a description of an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded, helicopter speech by 10.6 points.
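The spectral subtraction idea named in this abstract can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions: magnitude-domain subtraction with a spectral floor, the noisy phase reused unchanged, a naive DFT in place of an FFT, and a noise magnitude estimate taken from a separate noise-only frame. The function names, the `floor` parameter, and the synthetic signals are hypothetical and are not taken from the report.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.05):
    """Subtract an estimated noise magnitude from each bin of one frame.

    frame     : real-valued samples of the noisy frame
    noise_mag : per-bin noise magnitude estimate (same length as frame)
    floor     : fraction of the noisy magnitude kept as a lower bound, so that
                bins are never driven to a negative magnitude
    """
    cleaned = []
    for y, n_mag in zip(dft(frame), noise_mag):
        mag = abs(y)
        new_mag = max(mag - n_mag, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(y)))   # keep the noisy phase
    return [v.real for v in idft(cleaned)]

# Hypothetical 32-sample frame: a tone plus a deterministic stand-in for noise.
N = 32
tone = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
noise = [0.3 * math.sin(2 * math.pi * 11 * t / N + 0.4) for t in range(N)]
noisy = [s + v for s, v in zip(tone, noise)]

noise_estimate = [abs(v) for v in dft(noise)]   # in practice, averaged over non-speech frames
enhanced = spectral_subtract(noisy, noise_estimate)

def error_energy(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

print("error energy before:", error_energy(noisy, tone))
print("error energy after: ", error_energy(enhanced, tone))
```

A real-time system would apply this per overlapping windowed frame and update the noise estimate during pauses; the single-frame form above only shows the subtraction rule itself.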
Effects of synthetic speech output in the learning of graphic symbols of varied iconicity.
Koul, Rajinder; Schlosser, Ralf
To examine the effects of additional auditory feedback from synthetic speech on the learning of high translucent symbols versus low translucent symbols. Two adults with little or no functional speech and severe intellectual disabilities served as participants. A single-subject ABACA/ACABA design was used to study the relative effects of two treatments: symbol training in the presence and absence of synthetic speech output. The results clearly indicated that the two treatments, rather than extraneous variables, were responsible for gains in symbol learning. Both participants learned either more low translucent symbols or reached their maximum learning of low translucent symbols in the speech output condition. The results of this preliminary study replicate and extend the iconicity hypothesis to a new set of learning conditions involving speech output, and suggest that feedback from speech output may assist adults with profound intellectual disabilities in coding, particularly for those symbols whose association with their referent cannot be coded via their visual resemblance to the referent.
School Dress Codes v. The First Amendment: Ganging up on Student Attire.
ERIC Educational Resources Information Center
Jahn, Karon L.
Do school dress codes written with the specific purpose of limiting individual dress preferences, including dress associated with gangs, infringe on speech freedoms granted by the First Amendment of the U.S. Constitution? Although the Supreme Court has extended its protection of political speech to nonverbal acts of communication, it has…
Rate and rhythm control strategies for apraxia of speech in nonfluent primary progressive aphasia.
Beber, Bárbara Costa; Berbert, Monalise Costa Batista; Grawer, Ruth Siqueira; Cardoso, Maria Cristina de Almeida Freitas
2018-01-01
The nonfluent/agrammatic variant of primary progressive aphasia is characterized by apraxia of speech and agrammatism. Apraxia of speech limits patients' communication due to slow speaking rate, sound substitutions, articulatory groping, false starts and restarts, segmentation of syllables, and increased difficulty with increasing utterance length. Speech and language therapy is known to benefit individuals with apraxia of speech due to stroke, but little is known about its effects in primary progressive aphasia. This is a case report of a 72-year-old, illiterate housewife, who was diagnosed with nonfluent primary progressive aphasia and received speech and language therapy for apraxia of speech. Rate and rhythm control strategies for apraxia of speech were trained to improve initiation of speech. We discuss the importance of these strategies to alleviate apraxia of speech in this condition and the future perspectives in the area.
ERIC Educational Resources Information Center
Podgor, Ellen S.
1976-01-01
The concept of symbolic speech emanates from the 1967 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in different strata from verbal communication. (LBH)
Improving Speech Perception in Noise with Current Focusing in Cochlear Implant Users
Srinivasan, Arthi G.; Padilla, Monica; Shannon, Robert V.; Landsberger, David M.
2013-01-01
Cochlear implant (CI) users typically have excellent speech recognition in quiet but struggle with understanding speech in noise. It is thought that broad current spread from stimulating electrodes causes adjacent electrodes to activate overlapping populations of neurons which results in interactions across adjacent channels. Current focusing has been studied as a way to reduce spread of excitation, and therefore, reduce channel interactions. In particular, partial tripolar stimulation has been shown to reduce spread of excitation relative to monopolar stimulation. However, the crucial question is whether this benefit translates to improvements in speech perception. In this study, we compared speech perception in noise with experimental monopolar and partial tripolar speech processing strategies. The two strategies were matched in terms of number of active electrodes, microphone, filterbanks, stimulation rate and loudness (although both strategies used a lower stimulation rate than typical clinical strategies). The results of this study showed a significant improvement in speech perception in noise with partial tripolar stimulation. All subjects benefited from the current focused speech processing strategy. There was a mean improvement in speech recognition threshold of 2.7 dB in a digits in noise task and a mean improvement of 3 dB in a sentences in noise task with partial tripolar stimulation relative to monopolar stimulation. Although the experimental monopolar strategy was worse than the clinical, presumably due to different microphones, frequency allocations and stimulation rates, the experimental partial-tripolar strategy, which had the same changes, showed no acute deficit relative to the clinical. PMID:23467170
Matsushima, J; Kumagai, M; Harada, C; Takahashi, K; Inuyama, Y; Ifukube, T
1992-09-01
Our previous reports showed that second formant information, using a speech coding method, could be transmitted through an electrode on the promontory. However, second formant information can also be transmitted by tactile stimulation. Therefore, to find out whether electrical stimulation of the auditory nerve would be superior to tactile stimulation for our speech coding method, the time resolutions of the two modes of stimulation were compared. The results showed that the time resolution of electrical promontory stimulation was three times better than the time resolution of tactile stimulation of the finger. This indicates that electrical stimulation of the auditory nerve is much better for our speech coding method than tactile stimulation of the finger.
Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses
ERIC Educational Resources Information Center
Foundation for Individual Rights in Education (NJ1), 2011
2011-01-01
Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…
Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses
ERIC Educational Resources Information Center
Foundation for Individual Rights in Education (NJ1), 2009
2009-01-01
Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…
Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses
ERIC Educational Resources Information Center
Foundation for Individual Rights in Education (NJ1), 2010
2010-01-01
Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…
Buechner, Andreas; Beynon, Andy; Szyfter, Witold; Niemczyk, Kazimierz; Hoppe, Ulrich; Hey, Matthias; Brokx, Jan; Eyles, Julie; Van de Heyning, Paul; Paludetti, Gaetano; Zarowski, Andrzej; Quaranta, Nicola; Wesarg, Thomas; Festen, Joost; Olze, Heidi; Dhooge, Ingeborg; Müller-Deile, Joachim; Ramos, Angel; Roman, Stephane; Piron, Jean-Pierre; Cuda, Domenico; Burdo, Sandro; Grolman, Wilko; Vaillard, Samantha Roux; Huarte, Alicia; Frachet, Bruno; Morera, Constantine; Garcia-Ibáñez, Luis; Abels, Daniel; Walger, Martin; Müller-Mazotta, Jochen; Leone, Carlo Antonio; Meyer, Bernard; Dillier, Norbert; Steffens, Thomas; Gentine, André; Mazzoli, Manuela; Rypkema, Gerben; Killian, Matthijs; Smoorenburg, Guido
2011-11-01
Efficacy of the SPEAK and ACE coding strategies was compared with that of a new strategy, MP3000™, by 37 European implant centers including 221 subjects. The SPEAK and ACE strategies are based on selection of 8-10 spectral components with the highest levels, while MP3000 is based on the selection of only 4-6 components, with the highest levels relative to an estimate of the spread of masking. The pulse rate per component was fixed. No significant difference was found for the speech scores and for coding preference between the SPEAK/ACE and MP3000 strategies. Battery life was 24% longer for the MP3000 strategy. With MP3000 the best results were found for a selection of six components. In addition, the best results were found for a masking function with a low-frequency slope of 50 dB/Bark and a high-frequency slope of 37 dB/Bark (50/37) as compared to the other combinations examined of 40/30 and 20/15 dB/Bark. The best results found for the steepest slopes do not seem to agree with current estimates of the spread of masking in electrical stimulation. Future research might reveal if performance with respect to SPEAK/ACE can be enhanced by increasing the number of channels in MP3000 beyond 4-6 and it should shed more light on the optimum steepness of the slopes of the masking functions applied in MP3000.
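The two selection rules contrasted in the abstract above (highest levels versus highest levels relative to a masking estimate) can be sketched as follows. This is a toy illustration, not the SPEAK, ACE, or MP3000 implementation: the function names, the greedy loop, the 0 dB starting threshold, and the use of per-band Bark positions are all assumptions; only the 50/37 dB/Bark slope values come from the abstract.

```python
def select_highest(levels_db, n):
    """Return the indices of the n bands with the highest levels, in band order."""
    ranked = sorted(range(len(levels_db)), key=lambda i: levels_db[i], reverse=True)
    return sorted(ranked[:n])


def select_relative_to_masking(levels_db, bark, n, lf_slope=50.0, hf_slope=37.0):
    """Greedily pick n bands whose level most exceeds a running masking estimate.

    `bark` gives an assumed Bark-scale position for each band. After each pick,
    the masking estimate around the chosen band falls off by `lf_slope` dB/Bark
    toward lower frequencies and `hf_slope` dB/Bark toward higher frequencies.
    """
    mask = [0.0] * len(levels_db)  # assumed 0 dB starting threshold
    chosen = []
    for _ in range(n):
        candidates = [i for i in range(len(levels_db)) if i not in chosen]
        best = max(candidates, key=lambda i: levels_db[i] - mask[i])
        chosen.append(best)
        for i in range(len(levels_db)):
            distance = bark[i] - bark[best]
            slope = lf_slope if distance < 0 else hf_slope
            spread = levels_db[best] - slope * abs(distance)
            mask[i] = max(mask[i], spread)
    return sorted(chosen)


# Usage with made-up band levels (dB) spaced one Bark apart.
levels = [60.0, 58.0, 30.0, 55.0, 20.0, 50.0]
positions = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
plain = select_highest(levels, 3)
masked = select_relative_to_masking(levels, positions, 3)
```

In this made-up example the masking-relative rule skips the band adjacent to the strongest one in favour of a more distant band, which is the kind of behaviour the spread-of-masking estimate is meant to produce.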
Design of a robust baseband LPC coder for speech transmission over 9.6 kbit/s noisy channels
NASA Astrophysics Data System (ADS)
Viswanathan, V. R.; Russell, W. H.; Higgins, A. L.
1982-04-01
This paper describes the design of a baseband Linear Predictive Coder (LPC) which transmits speech over 9.6 kbit/sec synchronous channels with random bit errors of up to 1%. Presented are the results of our investigation of a number of aspects of the baseband LPC coder with the goal of maximizing the quality of the transmitted speech. Important among these aspects are: bandwidth of the baseband, coding of the baseband residual, high-frequency regeneration, and error protection of important transmission parameters. The paper discusses these and other issues, presents the results of speech-quality tests conducted during the various stages of optimization, and describes the details of the optimized speech coder. This optimized speech coding algorithm has been implemented as a real-time full-duplex system on an array processor. Informal listening tests of the real-time coder have shown that the coder produces good speech quality in the absence of channel bit errors and introduces only a slight degradation in quality for channel bit error rates of up to 1%.
ERIC Educational Resources Information Center
Riley, Gresham
1993-01-01
It is argued that the arguments currently advanced for limiting speech on college campuses are also arguments that will compromise academic freedom and that a distinction needs to be made between the right of free speech and the wisdom of exercising the right on any given occasion. (MSE)
Transitioning from analog to digital audio recording in childhood speech sound disorders.
Shriberg, Lawrence D; McSweeny, Jane L; Anderson, Bruce E; Campbell, Thomas F; Chial, Michael R; Green, Jordan R; Hauner, Katherina K; Moore, Christopher A; Rusiewicz, Heather L; Wilson, David L
2005-06-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants' speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise.
Transitioning from analog to digital audio recording in childhood speech sound disorders
Shriberg, Lawrence D.; McSweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.
2014-01-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants’ speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise. PMID:16019779
[Prosody, speech input and language acquisition].
Jungheim, M; Miller, S; Kühn, D; Ptok, M
2014-04-01
In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech this code differs especially with regard to prosody. For this review a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported stating that the linguistically reduced CDS could hinder first language acquisition.
Ruble, Lisa; Birdwhistell, Jessie; Toland, Michael D; McGrew, John H
2011-01-01
The significant increase in the numbers of students with autism combined with the need for better trained teachers (National Research Council, 2001) call for research on the effectiveness of alternative methods, such as consultation, that have the potential to improve service delivery. Data from 2 randomized controlled single-blind trials indicate that an autism-specific consultation planning framework known as the collaborative model for promoting competence and success (COMPASS) is effective in increasing child Individual Education Programs (IEP) outcomes (Ruble, Dalrymple, & McGrew, 2010; Ruble, McGrew, & Toland, 2011). In this study, we describe the verbal interactions, defined as speech acts and speech act exchanges that take place during COMPASS consultation, and examine the associations between speech exchanges and child outcomes. We applied the Psychosocial Processes Coding Scheme (Leaper, 1991) to code speech acts. Speech act exchanges were overwhelmingly affiliative, failed to show statistically significant relationships with child IEP outcomes and teacher adherence, but did correlate positively with IEP quality.
RUBLE, LISA; BIRDWHISTELL, JESSIE; TOLAND, MICHAEL D.; MCGREW, JOHN H.
2011-01-01
The significant increase in the numbers of students with autism combined with the need for better trained teachers (National Research Council, 2001) call for research on the effectiveness of alternative methods, such as consultation, that have the potential to improve service delivery. Data from 2 randomized controlled single-blind trials indicate that an autism-specific consultation planning framework known as the collaborative model for promoting competence and success (COMPASS) is effective in increasing child Individual Education Programs (IEP) outcomes (Ruble, Dalrymple, & McGrew, 2010; Ruble, McGrew, & Toland, 2011). In this study, we describe the verbal interactions, defined as speech acts and speech act exchanges that take place during COMPASS consultation, and examine the associations between speech exchanges and child outcomes. We applied the Psychosocial Processes Coding Scheme (Leaper, 1991) to code speech acts. Speech act exchanges were overwhelmingly affiliative, failed to show statistically significant relationships with child IEP outcomes and teacher adherence, but did correlate positively with IEP quality. PMID:22639523
Improving speech perception in noise with current focusing in cochlear implant users.
Srinivasan, Arthi G; Padilla, Monica; Shannon, Robert V; Landsberger, David M
2013-05-01
Cochlear implant (CI) users typically have excellent speech recognition in quiet but struggle with understanding speech in noise. It is thought that broad current spread from stimulating electrodes causes adjacent electrodes to activate overlapping populations of neurons which results in interactions across adjacent channels. Current focusing has been studied as a way to reduce spread of excitation, and therefore, reduce channel interactions. In particular, partial tripolar stimulation has been shown to reduce spread of excitation relative to monopolar stimulation. However, the crucial question is whether this benefit translates to improvements in speech perception. In this study, we compared speech perception in noise with experimental monopolar and partial tripolar speech processing strategies. The two strategies were matched in terms of number of active electrodes, microphone, filterbanks, stimulation rate and loudness (although both strategies used a lower stimulation rate than typical clinical strategies). The results of this study showed a significant improvement in speech perception in noise with partial tripolar stimulation. All subjects benefited from the current focused speech processing strategy. There was a mean improvement in speech recognition threshold of 2.7 dB in a digits in noise task and a mean improvement of 3 dB in a sentences in noise task with partial tripolar stimulation relative to monopolar stimulation. Although the experimental monopolar strategy was worse than the clinical, presumably due to different microphones, frequency allocations and stimulation rates, the experimental partial-tripolar strategy, which had the same changes, showed no acute deficit relative to the clinical.
Speech coding at 4800 bps for mobile satellite communications
NASA Technical Reports Server (NTRS)
Gersho, Allen; Chan, Wai-Yip; Davidson, Grant; Chen, Juin-Hwey; Yong, Mei
1988-01-01
A speech compression project has recently been completed to develop a speech coding algorithm suitable for operation in a mobile satellite environment aimed at providing telephone quality natural speech at 4.8 kbps. The work has resulted in two alternative techniques which achieve reasonably good communications quality at 4.8 kbps while tolerating vehicle noise and rather severe channel impairments. The algorithms are embodied in a compact self-contained prototype consisting of two AT&T 32-bit floating-point DSP32 digital signal processors (DSP). A Motorola 68HC11 microcomputer chip serves as the board controller and interface handler. On a wirewrapped card, the prototype's circuit footprint amounts to only 200 sq cm, and it consumes about 9 watts of power.
Davis, Matthew H.
2016-01-01
Successful perception depends on combining sensory input with prior knowledge. However, the underlying mechanism by which these two sources of information are combined is unknown. In speech perception, as in other domains, two functionally distinct coding schemes have been proposed for how expectations influence representation of sensory evidence. Traditional models suggest that expected features of the speech input are enhanced or sharpened via interactive activation (Sharpened Signals). Conversely, Predictive Coding suggests that expected features are suppressed so that unexpected features of the speech input (Prediction Errors) are processed further. The present work is aimed at distinguishing between these two accounts of how prior knowledge influences speech perception. By combining behavioural, univariate, and multivariate fMRI measures of how sensory detail and prior expectations influence speech perception with computational modelling, we provide evidence in favour of Prediction Error computations. Increased sensory detail and informative expectations have additive behavioural and univariate neural effects because they both improve the accuracy of word report and reduce the BOLD signal in lateral temporal lobe regions. However, sensory detail and informative expectations have interacting effects on speech representations shown by multivariate fMRI in the posterior superior temporal sulcus. When prior knowledge was absent, increased sensory detail enhanced the amount of speech information measured in superior temporal multivoxel patterns, but with informative expectations, increased sensory detail reduced the amount of measured information. Computational simulations of Sharpened Signals and Prediction Errors during speech perception could both explain these behavioural and univariate fMRI observations. However, the multivariate fMRI observations were uniquely simulated by a Prediction Error and not a Sharpened Signal model. 
The interaction between prior expectation and sensory detail provides evidence for a Predictive Coding account of speech perception. Our work establishes methods that can be used to distinguish representations of Prediction Error and Sharpened Signals in other perceptual domains. PMID:27846209
Status Report on Speech Research, July 1994-December 1995.
ERIC Educational Resources Information Center
Fowler, Carol A., Ed.
This publication (one of a series) contains 19 articles which report the status and progress of studies on the nature of speech, instruments for its investigation, and practical applications. Articles are: "Speech Perception Deficits in Poor Readers: Auditory Processing or Phonological Coding?" (Maria Mody and others); "Auditory…
ERIC Educational Resources Information Center
Pratt, Michael W.; And Others
1992-01-01
Investigated relations between certain family context variables and the conversational behavior of 36 parents who were playing with their 3 year olds. Transcripts were coded for types of conversational functions and structure of parent speech. Marital satisfaction was associated with aspects of parent speech. (LB)
Wirtzfeld, Michael R; Ibrahim, Rasha A; Bruce, Ian C
2017-10-01
Perceptual studies of speech intelligibility have shown that slow variations of acoustic envelope (ENV) in a small set of frequency bands provide adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
Johari, Karim; Behroozmand, Roozbeh
2017-05-01
The predictive coding model suggests that neural processing of sensory information is facilitated for temporally-predictable stimuli. This study investigated how temporal processing of visually-presented sensory cues modulates movement reaction time and neural activities in speech and hand motor systems. Event-related potentials (ERPs) were recorded in 13 subjects while they were visually-cued to prepare to produce a steady vocalization of a vowel sound or press a button in a randomized order, and to initiate the cued movement following the onset of a go signal on the screen. The experiment was conducted in two counterbalanced blocks in which the time interval between visual cue and go signal was temporally-predictable (fixed delay at 1000 ms) or unpredictable (variable between 1000 and 2000 ms). Results of the behavioral response analysis indicated that movement reaction time was significantly decreased for temporally-predictable stimuli in both speech and hand modalities. We identified premotor ERP activities with a left-lateralized parietal distribution for hand and a frontocentral distribution for speech that were significantly suppressed in response to temporally-predictable compared with unpredictable stimuli. The premotor ERPs were elicited approximately -100 ms before movement and were significantly correlated with speech and hand motor reaction times only in response to temporally-predictable stimuli. These findings suggest that the motor system establishes a predictive code to facilitate movement in response to temporally-predictable sensory stimuli. Our data suggest that the premotor ERP activities are robust neurophysiological biomarkers of such predictive coding mechanisms. These findings provide novel insights into the temporal processing mechanisms of speech and hand motor systems.
Speech input system for meat inspection and pathological coding used thereby
NASA Astrophysics Data System (ADS)
Abe, Shozo
Meat inspection is one of the exclusive and important jobs of veterinarians, though it is not well known in general. As the inspection should be conducted skillfully during a series of continuous operations in a slaughterhouse, development of automatic inspecting systems has been required for a long time. We employed a hands-free speech input system to record the inspection data because inspectors have to use both hands to handle the internal organs of cattle and check their health condition with the naked eye. The data collected by the inspectors are transferred to a speech recognizer and then stored as controllable data for each animal inspected. Control of the terms to be input, such as pathological conditions, and their coding are also important in this speech input system, and practical examples are shown.
Schuller, Björn
2017-01-01
Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra- (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for Speech intra-domain models achieve the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain. PMID:28658285
Dilley, Laura C; Wieland, Elizabeth A; Gamache, Jessica L; McAuley, J Devin; Redford, Melissa A
2013-02-01
As children mature, changes in voice spectral characteristics co-vary with changes in speech, language, and behavior. In this study, spectral characteristics were manipulated to alter the perceived ages of talkers' voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Speech was modified by lowering formants and fundamental frequency, for 5-year-old children's utterances, or raising them, for adult caregivers' utterances. Next, participants differing in awareness of the manipulation (Experiment 1A) or amount of speech-language training (Experiment 1B) made judgments of prosodic, segmental, and talker attributes. Experiment 2 investigated the effects of spectral modification on intelligibility. Finally, in Experiment 3, trained analysts used formal prosody coding to assess prosodic characteristics of spectrally modified and unmodified speech. Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work.
NASA Technical Reports Server (NTRS)
Kondoz, A. M.; Evans, B. G.
1993-01-01
In the last decade, low bit rate speech coding research has received much attention resulting in newly developed, good quality, speech coders operating at as low as 4.8 Kb/s. Although speech quality at around 8 Kb/s is acceptable for a wide variety of applications, at 4.8 Kb/s more improvements in quality are necessary to make it acceptable to the majority of applications and users. In addition to the required low bit rate with acceptable speech quality, other facilities such as integrated digital echo cancellation and voice activity detection are now becoming necessary to provide a cost effective and compact solution. In this paper we describe a CELP speech coder with integrated echo canceller and a voice activity detector all of which have been implemented on a single DSP32C with 32 KBytes of SRAM. The quality of CELP coded speech has been improved significantly by a new codebook implementation which also simplifies the encoder/decoder complexity making room for the integration of a 64-tap echo canceller together with a voice activity detector.
Research in speech communication.
Flanagan, J
1995-10-24
Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.
Fingerspelled and Printed Words Are Recoded into a Speech-based Code in Short-term Memory.
Sehyr, Zed Sevcikova; Petrich, Jennifer; Emmorey, Karen
2017-01-01
We conducted three immediate serial recall experiments that manipulated type of stimulus presentation (printed or fingerspelled words) and word similarity (speech-based or manual). Matched deaf American Sign Language signers and hearing non-signers participated (mean reading age = 14-15 years). Speech-based similarity effects were found for both stimulus types indicating that deaf signers recoded both printed and fingerspelled words into a speech-based phonological code. A manual similarity effect was not observed for printed words indicating that print was not recoded into fingerspelling (FS). A manual similarity effect was observed for fingerspelled words when similarity was based on joint angles rather than on handshape compactness. However, a follow-up experiment suggested that the manual similarity effect was due to perceptual confusion at encoding. Overall, these findings suggest that FS is strongly linked to English phonology for deaf adult signers who are relatively skilled readers. This link between fingerspelled words and English phonology allows for the use of a more efficient speech-based code for retaining fingerspelled words in short-term memory and may strengthen the representation of English vocabulary. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Vector Sum Excited Linear Prediction (VSELP) speech coding at 4.8 kbps
NASA Technical Reports Server (NTRS)
Gerson, Ira A.; Jasiuk, Mark A.
1990-01-01
Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback to CELP type coders is their large computational requirements. The Vector Sum Excited Linear Prediction (VSELP) speech coder utilizes a codebook with a structure which allows for a very efficient search procedure. Other advantages of the VSELP codebook structure are discussed and a detailed description of a 4.8 kbps VSELP coder is given. This coder is an improved version of the VSELP algorithm, which finished first in the NSA's evaluation of the 4.8 kbps speech coders. The coder uses a subsample resolution single tap long term predictor, a single VSELP excitation codebook, a novel gain quantizer which is robust to channel errors, and a new adaptive pre/postfilter arrangement.
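The structured codebook can be illustrated with a small sketch. In a vector-sum codebook each of the 2^M codevectors is a sum of M basis vectors with signs of +1 or -1 chosen by the bits of the codeword, so the correlation and energy terms needed by the search can be formed from M correlations and an M x M Gram matrix instead of 2^M explicit vectors. The helper names below are hypothetical, and the search is simplified: it works on unfiltered basis vectors and omits the Gray-code ordering and weighting filter a production coder would use.

```python
def signs(index, m_bits):
    """Sign pattern for a codeword: +1 where the bit is set, -1 otherwise."""
    return [1 if (index >> m) & 1 else -1 for m in range(m_bits)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def codevector(index, basis):
    """Explicit codevector: signed sum of the basis vectors."""
    th = signs(index, len(basis))
    n = len(basis[0])
    return [sum(t * v[k] for t, v in zip(th, basis)) for k in range(n)]

def search(target, basis):
    """Return the codeword maximizing correlation^2 / energy with the target."""
    m_bits = len(basis)
    r = [dot(target, v) for v in basis]                 # M correlations
    g = [[dot(a, b) for b in basis] for a in basis]     # M x M Gram matrix
    best, best_score = 0, -1.0
    for i in range(1 << m_bits):
        th = signs(i, m_bits)
        c = sum(t * rm for t, rm in zip(th, r))         # correlation with target
        e = sum(th[a] * th[b] * g[a][b]
                for a in range(m_bits) for b in range(m_bits))  # codevector energy
        score = c * c / e if e > 0 else 0.0
        if score > best_score:
            best, best_score = i, score
    return best
```

A codeword and its bitwise complement give negated codevectors and therefore the same score, so only half of the codewords strictly need to be examined; the sign is absorbed by the gain.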
Modelling the Architecture of Phonetic Plans: Evidence from Apraxia of Speech
ERIC Educational Resources Information Center
Ziegler, Wolfram
2009-01-01
In theories of spoken language production, the gestural code prescribing the movements of the speech organs is usually viewed as a linear string of holistic, encapsulated, hard-wired, phonetic plans, e.g., of the size of phonemes or syllables. Interactions between phonetic units on the surface of overt speech are commonly attributed to either the…
Do North Carolina Students Have Freedom of Speech? A Review of Campus Speech Codes
ERIC Educational Resources Information Center
Robinson, Jenna Ashley
2010-01-01
America's colleges and universities are supposed to be strongholds of classically liberal ideals, including the protection of individual rights and openness to debate and inquiry. Too often, this is not the case. Across the country, universities deny students and faculty their fundamental rights to freedom of speech and expression. The report…
Deep electrode insertion and sound coding in cochlear implants.
Hochmair, Ingeborg; Hochmair, Erwin; Nopp, Peter; Waller, Melissa; Jolly, Claude
2015-04-01
Present-day cochlear implants demonstrate remarkable speech understanding performance despite the use of non-optimized coding strategies concerning the transmission of tonal information. Most systems rely on place pitch information despite possibly large deviations from correct tonotopic placement of stimulation sites. Low frequency information is limited as well because of the constant pulse rate stimulation generally used and, being even more restrictive, of the limited insertion depth of the electrodes. This results in a compromised perception of music and tonal languages. Newly available flexible long straight electrodes permit deep insertion reaching the apical region with little or no insertion trauma. This article discusses the potential benefits of deep insertion which are obtained using pitch-locked temporal stimulation patterns. Besides the access to low frequency information, further advantages of deeply inserted long electrodes are the possibility to better approximate the correct tonotopic location of contacts, the coverage of a wider range of cochlear locations, and the somewhat reduced channel interaction due to the wider contact separation for a given number of channels. A newly developed set of strategies has been shown to improve speech understanding in noise and to enhance sound quality by providing a more "natural" impression, which especially becomes obvious when listening to music. The benefits of deep insertion should not, however, be compromised by structural damage during insertion. The small cross section and the high flexibility of the new electrodes can help to ensure less traumatic insertions as demonstrated by patients' hearing preservation rate. This article is part of a Special Issue entitled
Techniques for the Enhancement of Linear Predictive Speech Coding in Adverse Conditions
NASA Astrophysics Data System (ADS)
Wrench, Alan A.
Available from UMI in association with The British Library. Requires signed TDF. The Linear Prediction model was first applied to speech two and a half decades ago. Since then it has been the subject of intense research and continues to be one of the principal tools in the analysis of speech. Its mathematical tractability makes it a suitable subject for study and its proven success in practical applications makes the study worthwhile. The model is known to be unsuited to speech corrupted by background noise. This has led many researchers to investigate ways of enhancing the speech signal prior to Linear Predictive analysis. In this thesis this body of work is extended. The chosen application is low bit-rate (2.4 kbits/sec) speech coding. For this task the performance of the Linear Prediction algorithm is crucial because there is insufficient bandwidth to encode the error between the modelled speech and the original input. A review of the fundamentals of Linear Prediction and an independent assessment of the relative performance of methods of Linear Prediction modelling are presented. A new method is proposed which is fast and facilitates stability checking; however, its stability is shown to be unacceptably poorer than existing methods. A novel supposition governing the positioning of the analysis frame relative to a voiced speech signal is proposed and supported by observation. The problem of coding noisy speech is examined. Four frequency domain speech processing techniques are developed and tested. These are: (i) Combined Order Linear Prediction Spectral Estimation; (ii) Frequency Scaling According to an Aural Model; (iii) Amplitude Weighting Based on Perceived Loudness; (iv) Power Spectrum Squaring. These methods are compared with the Recursive Linearised Maximum a Posteriori method. Following on from work done in the frequency domain, a time domain implementation of spectrum squaring is developed.
In addition, a new method of power spectrum estimation is developed based on the Minimum Variance approach. This new algorithm is shown to be closely related to Linear Prediction but produces slightly broader spectral peaks. Spectrum squaring is applied to both the new algorithm and standard Linear Prediction and their relative performance is assessed. (Abstract shortened by UMI.).
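For readers unfamiliar with Linear Prediction, the standard autocorrelation method can be sketched with the Levinson-Durbin recursion. This is the textbook algorithm, not the new method proposed in the thesis; the function names are hypothetical. The reflection coefficients it produces give a simple stability check, since the synthesis filter is stable when every coefficient has magnitude below one.

```python
def autocorrelation(x, order):
    """Biased autocorrelation r[0..order] of a frame of samples."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns (a, err, ks): a[0] == 1 and the predictor is
    x[n] ~ -sum(a[k] * x[n - k]); err is the residual energy;
    ks are the reflection coefficients.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
        ks.append(k)
    return a, err, ks

def is_stable(ks):
    """The all-pole synthesis filter is stable if all |k| < 1."""
    return all(abs(k) < 1.0 for k in ks)
```

For the autocorrelation sequence 1, 0.5, 0.25 of a first-order process, the recursion yields a first coefficient of -0.5 and a second coefficient of zero, as expected.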
A Survey of Practices and Strategies for Marketing Communication Majors.
ERIC Educational Resources Information Center
Gray, Philip A.; Wilson, Gerald L.
Fifty college speech departments responded to a survey intended to discover some of the common practices and strategies for marketing undergraduate speech communication majors. The results indicated that the most frequent name for the departments responding was "Communication" rather than "Speech Communication," completely the opposite of what was…
ERIC Educational Resources Information Center
Studdert-Kennedy, Michael, Ed.; O'Brien, Nancy, Ed.
Prepared as part of a regular series on the status and progress of studies on the nature of speech, instrumentation for its evaluation, and practical applications for speech research, this compilation contains 14 reports. Topics covered in the reports include the following: (1) phonetic coding and order memory in relation to reading proficiency,…
Dilley, Laura C.; Wieland, Elizabeth A.; Gamache, Jessica L.; McAuley, J. Devin; Redford, Melissa A.
2013-01-01
Purpose: As children mature, changes in voice spectral characteristics covary with changes in speech, language, and behavior. Spectral characteristics were manipulated to alter the perceived ages of talkers’ voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Method: Speech was modified by lowering formants and fundamental frequency, for 5-year-old children’s utterances, or raising them, for adult caregivers’ utterances. Next, participants differing in awareness of the manipulation (Exp. 1a) or amount of speech-language training (Exp. 1b) made judgments of prosodic, segmental, and talker attributes. Exp. 2 investigated the effects of spectral modification on intelligibility. Finally, in Exp. 3 trained analysts used formal prosody coding to assess prosodic characteristics of spectrally-modified and unmodified speech. Results: Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Conclusions: Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work. PMID:23275414
The Role of Corticostriatal Systems in Speech Category Learning
Yi, Han-Gyol; Maddox, W. Todd; Mumford, Jeanette A.; Chandrasekaran, Bharath
2016-01-01
One of the most difficult category learning problems for humans is learning nonnative speech categories. While feedback-based category training can enhance speech learning, the mechanisms underlying these benefits are unclear. In this functional magnetic resonance imaging study, we investigated neural and computational mechanisms underlying feedback-dependent speech category learning in adults. Positive feedback activated a large corticostriatal network including the dorsolateral prefrontal cortex, inferior parietal lobule, middle temporal gyrus, caudate, putamen, and the ventral striatum. Successful learning was contingent upon the activity of domain-general category learning systems: the fast-learning reflective system, involving the dorsolateral prefrontal cortex that develops and tests explicit rules based on the feedback content, and the slow-learning reflexive system, involving the putamen in which the stimuli are implicitly associated with category responses based on the reward value in feedback. Computational modeling of response strategies revealed significant use of reflective strategies early in training and greater use of reflexive strategies later in training. Reflexive strategy use was associated with increased activation in the putamen. Our results demonstrate a critical role for the reflexive corticostriatal learning system as a function of response strategy and proficiency during speech category learning. Keywords: category learning, fMRI, corticostriatal systems, speech, putamen PMID:25331600
Tona, Risa; Naito, Yasushi; Moroto, Saburo; Yamamoto, Rinko; Fujiwara, Keizo; Yamazaki, Hiroshi; Shinohara, Shogo; Kikuchi, Masahiro
2015-12-01
To investigate the McGurk effect in profoundly deafened Japanese children with cochlear implants (CI) and in normal-hearing children. This was done to identify how children with profound deafness using CI established audiovisual integration during the speech acquisition period. Twenty-four prelingually deafened children with CI and 12 age-matched normal-hearing children participated in this study. Responses to audiovisual stimuli were compared between deafened and normal-hearing controls. Additionally, responses of the children with CI younger than 6 years of age were compared with those of the children with CI at least 6 years of age at the time of the test. Responses to stimuli combining auditory labials and visual non-labials were significantly different between deafened children with CI and normal-hearing controls (p<0.05). Additionally, the McGurk effect tended to be more induced in deafened children older than 6 years of age than in their younger counterparts. The McGurk effect was more significantly induced in prelingually deafened Japanese children with CI than in normal-hearing, age-matched Japanese children. Despite having good speech-perception skills and auditory input through their CI, from early childhood, deafened children may use more visual information in speech perception than normal-hearing children. As children using CI need to communicate based on insufficient speech signals coded by CI, additional activities of higher-order brain function may be necessary to compensate for the incomplete auditory input. This study provided information on the influence of deafness on the development of audiovisual integration related to speech, which could contribute to our further understanding of the strategies used in spoken language communication by prelingually deafened children. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Research in speech communication.
Flanagan, J
1995-01-01
Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806
Signal Prediction With Input Identification
NASA Technical Reports Server (NTRS)
Juang, Jer-Nan; Chen, Ya-Chin
1999-01-01
A novel coding technique is presented for signal prediction with applications including speech coding, system identification, and estimation of input excitation. The approach is based on the blind equalization method for speech signal processing in conjunction with the geometric subspace projection theory to formulate the basic prediction equation. The speech-coding problem is often divided into two parts, a linear prediction model and excitation input. The parameter coefficients of the linear predictor and the input excitation are solved simultaneously and recursively by a conventional recursive least-squares algorithm. The excitation input is computed by coding all possible outcomes into a binary codebook. The coefficients of the linear predictor and excitation, and the index of the codebook can then be used to represent the signal. In addition, a variable-frame concept is proposed to block the same excitation signal in sequence in order to reduce the storage size and increase the transmission rate. The results of this work can be easily extended to the problem of disturbance identification. The basic principles are outlined in this report and differences from other existing methods are discussed. Simulations are included to demonstrate the proposed method.
Application of a VLSI vector quantization processor to real-time speech coding
NASA Technical Reports Server (NTRS)
Davidson, G.; Gersho, A.
1986-01-01
Attention is given to a working vector quantization processor for speech coding that is based on a first-generation VLSI chip which efficiently performs the pattern-matching operation needed for the codebook search process (CPS). Using this chip, the CPS architecture has been successfully incorporated into a compact, single-board Vector PCM implementation operating at 7-18 kbits/sec. A real time Adaptive Vector Predictive Coder system using the CPS has also been implemented.
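The pattern-matching operation at the heart of a codebook search is a nearest-neighbour lookup, which can be sketched as follows. This is a plain full-search illustration with hypothetical function names; it does not model the VLSI architecture, and the sample rate, vector dimension, and codebook size in the rate formula are free parameters rather than values from the paper.

```python
import math

def nearest_codeword(vec, codebook):
    """Index of the codebook entry with the smallest squared Euclidean distance."""
    best_i, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(vec, c))
        if d < best_d:
            best_i, best_d = i, d
    return best_i

def vq_encode(samples, codebook):
    """Split the signal into blocks of the codebook dimension and index each block."""
    k = len(codebook[0])
    return [nearest_codeword(samples[i:i + k], codebook)
            for i in range(0, len(samples) - k + 1, k)]

def vq_decode(indices, codebook):
    """Reconstruct the signal by concatenating the selected codevectors."""
    out = []
    for i in indices:
        out.extend(codebook[i])
    return out

def bit_rate(sample_rate, dim, codebook_size):
    """Bits per second when one index is sent per block of `dim` samples."""
    return sample_rate / dim * math.log2(codebook_size)
```

The full search costs one distance computation per codebook entry per block, which is the operation a dedicated pattern-matching chip would accelerate.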
A variable rate speech compressor for mobile applications
NASA Technical Reports Server (NTRS)
Yeldener, S.; Kondoz, A. M.; Evans, B. G.
1990-01-01
One of the most promising speech coders at bit rates of 9.6 to 4.8 kbits/s is CELP. Code Excited Linear Prediction (CELP) has been dominating the 9.6 to 4.8 kbits/s region during the past 3 to 4 years. Its setback, however, is its expensive implementation. As an alternative to CELP, the Base-Band CELP (CELP-BB) was developed, which produced good quality speech comparable to CELP and a single chip implementable complexity as reported previously. Its robustness was also improved to tolerate errors up to 1.0 pct. and maintain intelligibility up to 5.0 pct. and more. Although CELP-BB produces good quality speech at around 4.8 kbits/s, it has a fundamental problem when updating the pitch filter memory. A sub-optimal solution is proposed for this problem. Below 4.8 kbits/s, however, CELP-BB suffers from noticeable quantization noise as a result of the large vector dimensions used. Efficient representation of speech below 4.8 kbits/s is reported by introducing Sinusoidal Transform Coding (STC) to represent the LPC excitation, which is called Sine Wave Excited LPC (SWELP). In this case, natural sounding good quality synthetic speech is obtained at around 2.4 kbits/s.
Effects of irrelevant sounds on phonological coding in reading comprehension and short-term memory.
Boyle, R; Coltheart, V
1996-05-01
The effects of irrelevant sounds on reading comprehension and short-term memory were studied in two experiments. In Experiment 1, adults judged the acceptability of written sentences during irrelevant speech, accompanied and unaccompanied singing, instrumental music, and in silence. Sentences varied in syntactic complexity: Simple sentences contained a right-branching relative clause (The applause pleased the woman that gave the speech) and syntactically complex sentences included a centre-embedded relative clause (The hay that the farmer stored fed the hungry animals). Unacceptable sentences either sounded acceptable (The dog chased the cat that eight up all his food) or did not (The man praised the child that sight up his spinach). Decision accuracy was impaired by syntactic complexity but not by irrelevant sounds. Phonological coding was indicated by increased errors on unacceptable sentences that sounded correct. These error rates were unaffected by irrelevant sounds. Experiment 2 examined effects of irrelevant sounds on ordered recall of phonologically similar and dissimilar word lists. Phonological similarity impaired recall. Irrelevant speech reduced recall but did not interact with phonological similarity. The results of these experiments question assumptions about the relationship between speech input and phonological coding in reading and the short-term store.
Rhetorical and Linguistic Analysis of Bush's Second Inaugural Speech
ERIC Educational Resources Information Center
Sameer, Imad Hayif
2017-01-01
This study attempts to analyze Bush's second inaugural speech. It aims at investigating the use of linguistic strategies in it. It applies two models, Aristotle's and Atkinson's (1984), to draw attention to linguistic strategies. The analysis shows that Bush's second inaugural speech is successful…
One Speaker, Two Languages. Cross-Disciplinary Perspectives on Code-Switching.
ERIC Educational Resources Information Center
Milroy, Lesley, Ed.; Muysken, Pieter, Ed.
Fifteen articles review code-switching in the four major areas: policy implications in specific institutional and community settings; perspectives of social theory of code-switching as a form of speech behavior in particular social contexts; the grammatical analysis of code-switching, including factors that constrain switching even within a…
Real-time speech encoding based on Code-Excited Linear Prediction (CELP)
NASA Technical Reports Server (NTRS)
Leblanc, Wilfrid P.; Mahmoud, S. A.
1988-01-01
This paper reports on the work proceeding with regard to the development of a real-time voice codec for the terrestrial and satellite mobile radio environments. The codec is based on a complexity reduced version of code-excited linear prediction (CELP). The codebook search complexity was reduced to only 0.5 million floating point operations per second (MFLOPS) while maintaining excellent speech quality. Novel methods to quantize the residual and the long and short term model filters are presented.
ERIC Educational Resources Information Center
Sovilla, J. Buttet, Ed.; de Weck, G., Ed.
1998-01-01
These articles on scaffolding in language and speech pathology/therapy are included in this issue: "Strategies d'etayage avec des enfants disphasiques: sont-elles specifiques?" ("Scaffolding Strategies for Dysphasic Children: Are They Specific?") (Genevieve de Weck); "Comparaison des strategies discursives d'etayage dans un conte et un recit…
Zipf's Law in Short-Time Timbral Codings of Speech, Music, and Environmental Sound Signals
Haro, Martín; Serrà, Joan; Herrera, Perfecto; Corral, Álvaro
2012-01-01
Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis on the intrinsic characteristics of most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, these database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources. PMID:22479497
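A rank-frequency analysis of this kind can be sketched in a few lines. The code below counts occurrences of discrete code-words, sorts the counts into rank order, and estimates a Zipf exponent from a straight-line fit in log-log coordinates. The least-squares fit is an assumption made for brevity; the abstract does not state which fitting procedure the authors used, and more careful estimators exist for heavy-tailed data. Function names are hypothetical.

```python
from collections import Counter
import math

def rank_frequency(code_words):
    """Frequencies of the distinct code-words, sorted from most to least frequent."""
    return sorted(Counter(code_words).values(), reverse=True)

def zipf_exponent(freqs):
    """Estimate alpha in f(rank) ~ rank**(-alpha) by least squares on log-log data.

    Assumption: a plain regression, used here only as an illustration.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

An exponent close to one, as reported in the abstract, means that the frequency of a code-word is roughly inversely proportional to its rank.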
ERIC Educational Resources Information Center
Pattamadilok, Chotiga; Nelis, Aubéline; Kolinsky, Régine
2014-01-01
Studies on proficient readers showed that speech processing is affected by knowledge of the orthographic code. Yet, the automaticity of the orthographic influence depends on task demand. Here, we addressed this automaticity issue in normal and dyslexic adult readers by comparing the orthographic effects obtained in two speech processing tasks that…
ERIC Educational Resources Information Center
Dodd, Barbara; McIntosh, Beth; Erdener, Dogu; Burnham, Denis
2008-01-01
An example of the auditory-visual illusion in speech perception, first described by McGurk and MacDonald, is the perception of [ta] when listeners hear [pa] in synchrony with the lip movements for [ka]. One account of the illusion is that lip-read and heard speech are combined in an articulatory code since people who mispronounce words respond…
ERIC Educational Resources Information Center
Beatty, Michael J.
1988-01-01
Examines the choice-making processes of students engaged in the selection of speech introduction strategies. Finds that the frequency of students making decision-making errors was a positive function of public speaking apprehension. (MS)
Strategies for Treating Compensatory Articulation in Patients with Cleft Palate
Del Carmen Pamplona, Maria; Ysunza, Antonio; Morales, Santiago
2014-01-01
Patients with cleft palate frequently show compensatory articulation (CA). CA requires a prolonged period of speech intervention. Some scaffolding strategies can be useful for correcting placement and manner of articulation in these cases. The purpose of this paper was to study whether the use of specific strategies of speech pathology can be more effective if applied according to the level of severity of CA. Ninety patients with CA were studied in two groups. One group was treated using strategies specific for their level of severity of articulation, whereas in the other group all strategies were used indiscriminately. The degree of severity of CA was compared at the end of the speech intervention. After the speech therapy intervention, the group of patients in which the strategies were used selectively showed a significantly greater decrease in the severity of CA than the patients in whom all the strategies were used indiscriminately. An assessment of the severity of CA can be useful for selecting the strategies that are likely to be more effective for correcting the compensatory errors. PMID:24711749
Alice's adventures in um-derland: Psycholinguistic sources of variation in disfluency production
Fraundorf, Scott H.; Watson, Duane G.
2013-01-01
This study tests the hypothesis that three common types of disfluency (fillers, silent pauses, and repeated words) reflect variance in what strategies are available to the production system for responding to difficulty in language production. Participants' speech in a storytelling paradigm was coded for the three disfluency types. Repeats occurred most often when difficult material was already being produced and could be repeated, but fillers and silent pauses occurred most when difficult material was still being planned. Fillers were associated only with conceptual difficulties, consistent with the proposal that they reflect a communicative signal whereas silent pauses and repeats were also related to lexical and phonological difficulties. These differences are discussed in terms of different strategies available to the language production system. PMID:25339788
Effect of technological advances on cochlear implant performance in adults.
Lenarz, Minoo; Joseph, Gert; Sönmez, Hasibe; Büchner, Andreas; Lenarz, Thomas
2011-12-01
To evaluate the effect of technological advances in the past 20 years on the hearing performance of a large cohort of adult cochlear implant (CI) patients. Individual, retrospective, cohort study. According to technological developments in electrode design and speech-processing strategies, we defined five virtual intervals on the time scale between 1984 and 2008. A cohort of 1,005 postlingually deafened adults was selected for this study, and their hearing performance with a CI was evaluated retrospectively according to these five technological intervals. The test battery was composed of four standard German speech tests: Freiburger monosyllabic test, speech tracking test, Hochmair-Schulz-Moser (HSM) sentence test in quiet, and HSM sentence test in 10 dB noise. The direct comparison of the speech perception in postlingually deafened adults, who were implanted during different technological periods, reveals an obvious improvement in the speech perception in patients who benefited from the recent electrode designs and speech-processing strategies. The major influence of technological advances on CI performance seems to be on speech perception in noise. Better speech perception in noisy surroundings is strong proof for demonstrating the success rate of new electrode designs and speech-processing strategies. Standard (internationally comparable) speech tests in noise should become an obligatory part of the postoperative test battery for adult CI patients. Copyright © 2011 The American Laryngological, Rhinological, and Otological Society, Inc.
Use of listening strategies for the speech of individuals with dysarthria and cerebral palsy.
Hustad, Katherine C; Dardis, Caitlin M; Kramper, Amy J
2011-03-01
This study examined listeners' endorsement of cognitive, linguistic, segmental, and suprasegmental strategies employed when listening to speakers with dysarthria. The study also examined whether strategy endorsement differed between listeners who earned the highest and lowest intelligibility scores. Speakers were eight individuals with dysarthria and cerebral palsy. Listeners were 80 individuals who transcribed speech stimuli and rated their use of each of 24 listening strategies on a 4-point scale. Results showed that cognitive and linguistic strategies were most highly endorsed. Use of listening strategies did not differ between listeners with the highest and lowest intelligibility scores. Results suggest that there may be a core of strategies common to listeners of speakers with dysarthria that may be supplemented by additional strategies, based on characteristics of the speaker and speech signal.
The design of an adaptive predictive coder using a single-chip digital signal processor
NASA Astrophysics Data System (ADS)
Randolph, M. A.
1985-01-01
A speech coding processor architecture design study has been performed in which the Texas Instruments TMS32010 was selected from among three commercially available digital signal processing integrated circuits and evaluated in an implementation study of real-time Adaptive Predictive Coding (APC). The TMS32010 was compared with the AT&T Bell Laboratories DSP I and the Nippon Electric Co. µPD7720 and was found to be the most suitable for a single-chip implementation of APC. A preliminary system design based on the TMS32010 has been carried out, and several of the hardware and software design issues are discussed. Particular attention was paid to the design of an external memory controller which permits rapid sequential access of external RAM. As a result, it has been determined that a compact hardware implementation of the APC algorithm based on the TMS32010 is feasible. Originator-supplied keywords include: vocoders, speech compression, adaptive predictive coding, digital signal processing microcomputers, speech processor architectures, and special purpose processor.
A novel speech-processing strategy incorporating tonal information for cochlear implants.
Lan, N; Nie, K B; Gao, S K; Zeng, F G
2004-05-01
Good performance in cochlear implant users depends in large part on the ability of a speech processor to effectively decompose speech signals into multiple channels of narrow-band electrical pulses for stimulation of the auditory nerve. Speech processors that extract only envelopes of the narrow-band signals (e.g., the continuous interleaved sampling (CIS) processor) may not provide sufficient information to encode the tonal cues in languages such as Chinese. To improve the performance in cochlear implant users who speak tonal languages, we proposed and developed a novel speech-processing strategy, which extracted both the envelopes of the narrow-band signals and the fundamental frequency (F0) of the speech signal, and used them to modulate both the amplitude and the frequency of the electrical pulses delivered to stimulation electrodes. We developed an algorithm to extract the fundamental frequency and identified the general patterns of pitch variations of four typical tones in Chinese speech. The effectiveness of the extraction algorithm was verified with an artificial neural network that recognized the tonal patterns from the extracted F0 information. We then compared the novel strategy with the envelope-extraction CIS strategy in human subjects with normal hearing. The novel strategy produced significant improvement in perception of Chinese tones, phrases, and sentences. This novel processor with dynamic modulation of both frequency and amplitude is encouraging for the design of a cochlear implant device for sensorineurally deaf patients who speak tonal languages.
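The envelope extraction that the abstract attributes to CIS-style processors can be illustrated with a minimal sketch. It assumes the input is already one narrow-band channel and uses full-wave rectification followed by a one-pole low-pass smoother; the function name `extract_envelope` and the smoothing constant `alpha` are hypothetical choices for illustration, not details taken from the paper.

```python
import math
from typing import List


def extract_envelope(band_signal: List[float], alpha: float) -> List[float]:
    """Full-wave rectify one band and smooth it with a one-pole low-pass filter."""
    envelope = []
    state = 0.0
    for sample in band_signal:
        rectified = abs(sample)              # full-wave rectification
        state += alpha * (rectified - state)  # one-pole low-pass smoothing
        envelope.append(state)
    return envelope


# Illustrative input: a 1 kHz tone sampled at 16 kHz (assumed values).
fs = 16000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(2000)]
env = extract_envelope(tone, alpha=0.01)
print(round(env[-1], 2))
```

For a steady tone the smoothed value settles near the mean of the rectified waveform. The F0-driven frequency modulation described in the abstract is not covered by this sketch.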
Advanced Persuasive Speaking, English, Speech: 5114.112.
ERIC Educational Resources Information Center
Dade County Public Schools, Miami, FL.
Developed as a high school quinmester unit on persuasive speaking, this guide provides the teacher with teaching strategies for a course which analyzes speeches from "Vital Speeches of the Day," political speeches, TV commercials, and other types of speeches. Practical use of persuasive methods for school, community, county, state, and…
Deriving Word Order in Code-Switching: Feature Inheritance and Light Verbs
ERIC Educational Resources Information Center
Shim, Ji Young
2013-01-01
This dissertation investigates code-switching (CS), the concurrent use of more than one language in conversation, commonly observed in bilingual speech. Assuming that code-switching is subject to universal principles, just like monolingual grammar, the dissertation provides a principled account of code-switching, with particular emphasis on OV~VO…
Woodruff Carr, Kali; Fitzroy, Ahren B; Tierney, Adam; White-Schwoch, Travis; Kraus, Nina
2017-01-01
Speech communication involves integration and coordination of sensory perception and motor production, requiring precise temporal coupling. Beat synchronization, the coordination of movement with a pacing sound, can be used as an index of this sensorimotor timing. We assessed adolescents' synchronization and capacity to correct asynchronies when given online visual feedback. Variability of synchronization while receiving feedback predicted phonological memory and reading sub-skills, as well as maturation of cortical auditory processing; less variable synchronization during the presence of feedback tracked with maturation of cortical processing of sound onsets and resting gamma activity. We suggest the ability to incorporate feedback during synchronization is an index of intentional, multimodal timing-based integration in the maturing adolescent brain. Precision of temporal coding across modalities is important for speech processing and literacy skills that rely on dynamic interactions with sound. Synchronization employing feedback may prove useful as a remedial strategy for individuals who struggle with timing-based language learning impairments.
ERIC Educational Resources Information Center
Holden, Laura K.; Vandali, Andrew E.; Skinner, Margaret W.; Fourakis, Marios S.; Holden, Timothy A.
2005-01-01
One of the difficulties faced by cochlear implant (CI) recipients is perception of low-intensity speech cues. A. E. Vandali (2001) has developed the transient emphasis spectral maxima (TESM) strategy to amplify short-duration, low-level sounds. The aim of the present study was to determine whether speech scores would be significantly higher with…
Group motivational interviewing for adolescents: Change talk and alcohol and marijuana outcomes
D’Amico, Elizabeth J.; Houck, Jon M.; Hunter, Sarah B.; Miles, Jeremy N.V.; Osilla, Karen Chan; Ewing, Brett A.
2014-01-01
Objective: Little is known about what may distinguish effective and ineffective group interventions. Group motivational interviewing (MI) is a promising intervention for adolescent alcohol and other drug (AOD) use; however, the mechanisms of change for group MI are unknown. One potential mechanism is change talk, which is client speech arguing for change. The present study describes the group process in adolescent group MI and effects of group-level change talk on individual alcohol and marijuana outcomes. Method: We analyzed 129 group session audio recordings from a randomized clinical trial of adolescent group MI. Sequential coding was performed using the Motivational Interviewing Skill Code (MISC) and the CASAA Application for Coding Treatment Interactions (CACTI) software application. Outcomes included past-month intentions, frequency, and consequences of alcohol and marijuana use, motivation to change, and positive expectancies. Results: Sequential analysis indicated that facilitator open-ended questions and reflections of change talk (CT) increased group CT. Group CT was then followed by more CT. Multilevel models accounting for rolling group enrollment revealed group CT was associated with decreased alcohol intentions, alcohol use and heavy drinking three months later; group sustain talk was associated with decreased motivation to change, increased intentions to use marijuana, and increased positive alcohol and marijuana expectancies. Conclusions: Facilitator speech and peer responses each had effects on change and sustain talk in the group setting, which was then associated with individual changes. Selective reflection of CT in adolescent group MI is suggested as a strategy to manage group dynamics and increase behavioral change.
PMID:25365779
JND measurements of the speech formants parameters and its implication in the LPC pole quantization
NASA Astrophysics Data System (ADS)
Orgad, Yaakov
1988-08-01
The inherent sensitivity of auditory perception is explicitly used with the objective of designing an efficient speech encoder. Speech can be modelled by a filter representing the vocal tract shape that is driven by an excitation signal representing glottal air flow. This work concentrates on the filter encoding problem, assuming that excitation signal encoding is optimal. Linear predictive coding (LPC) techniques were used to model a short speech segment by an all-pole filter; each pole was directly related to the speech formants. Measurements were made of the auditory just noticeable difference (JND) corresponding to the natural speech formants, with the LPC filter poles as the best candidates to represent the speech spectral envelope. The JND is the maximum precision required in speech quantization; it was defined on the basis of the shift of one pole parameter of a single frame of a speech segment necessary to induce subjective perception of the distortion with a probability of 0.75. The average JND in LPC filter poles in natural speech was found to increase with increasing pole bandwidth and, to a lesser extent, frequency. The JND measurements showed a large spread of the residuals around the average values, indicating that inter-formant coupling and, perhaps, other, not yet fully understood, factors were not taken into account at this stage of the research. A future treatment should consider these factors. The average JNDs obtained in this work were used to design pole quantization tables for speech coding and provided a better bit rate than the standard reflection coefficient quantizer; a 30-bit-per-frame pole quantizer yielded a speech quality similar to that obtained with a standard 41-bit-per-frame reflection coefficient quantizer. Owing to the complexity of the numerical root extraction system, the practical implementation of the pole quantization approach remains to be proved.
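Because the abstract describes pole JNDs in terms of frequency and bandwidth, the following minimal sketch shows the standard mapping from a z-plane pole of an all-pole filter to a formant frequency and a 3 dB bandwidth. The function name `pole_to_formant`, the sampling rate, and the example pole are assumptions for illustration; the abstract does not state which pole parameterization the author quantized.

```python
import cmath
import math
from typing import Tuple


def pole_to_formant(pole: complex, fs: float) -> Tuple[float, float]:
    """Map a z-plane pole inside the unit circle to (frequency in Hz, bandwidth in Hz)."""
    # The pole angle sets the resonance frequency.
    freq = abs(cmath.phase(pole)) * fs / (2 * math.pi)
    # The pole radius sets the bandwidth; a radius closer to 1 gives a narrower resonance.
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth


# Illustrative pole: radius 0.95, angle pi/4, at an assumed 8 kHz sampling rate.
example_pole = cmath.rect(0.95, math.pi / 4)
freq, bandwidth = pole_to_formant(example_pole, fs=8000.0)
print(round(freq, 1), round(bandwidth, 1))
```

A quantization table of the kind the abstract mentions would then assign step sizes in these two coordinates; how the measured JNDs set those steps is not specified here.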
Tuning time-frequency methods for the detection of metered HF speech
NASA Astrophysics Data System (ADS)
Nelson, Douglas J.; Smith, Lawrence H.
2002-12-01
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally in speech, if the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign and other codes are frequently metered. In tactical communications and air traffic control, location, heading and identification codes may be metered. Moreover metering may be expected to survive even in HF communications, which are corrupted by noise, interference and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe Time-Frequency methods which have been adapted successfully to the problem of mitigation of HF signal conditions and detection of metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, in addressing the problem of HF single sideband (SSB) communications, the problems of carrier mistuning, interfering signals, such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and effectively invert the fast automatic gain function.
Language choice in bimodal bilingual development.
Lillo-Martin, Diane; de Quadros, Ronice M; Chen Pichler, Deborah; Fieldsteel, Zoe
2014-01-01
Bilingual children develop sensitivity to the language used by their interlocutors at an early age, reflected in differential use of each language by the child depending on their interlocutor. Factors such as discourse context and relative language dominance in the community may mediate the degree of language differentiation in preschool age children. Bimodal bilingual children, acquiring both a sign language and a spoken language, have an even more complex situation. Their Deaf parents vary considerably in access to the spoken language. Furthermore, in addition to code-mixing and code-switching, they use code-blending (expressions in both speech and sign simultaneously), an option uniquely available to bimodal bilinguals. Code-blending is analogous to code-switching sociolinguistically, but is also a way to communicate without suppressing one language. For adult bimodal bilinguals, complete suppression of the non-selected language is cognitively demanding. We expect that bimodal bilingual children also find suppression difficult, and use blending rather than suppression in some contexts. We also expect relative community language dominance to be a factor in children's language choices. This study analyzes longitudinal spontaneous production data from four bimodal bilingual children and their Deaf and hearing interlocutors. Even at the earliest observations, the children produced more signed utterances with Deaf interlocutors and more speech with hearing interlocutors. However, while three of the four children produced >75% speech alone in speech target sessions, they produced <25% sign alone in sign target sessions. All four produced bimodal utterances in both, but more frequently in the sign sessions, potentially because they find suppression of the dominant language more difficult. Our results indicate that these children are sensitive to the language used by their interlocutors, while showing considerable influence from the dominant community language.
PMID:25368591
Ethnography of Communication: Cultural Codes and Norms.
ERIC Educational Resources Information Center
Carbaugh, Donal
The primary tasks of the ethnographic researcher are to discover, describe, and comparatively analyze different speech communities' ways of speaking. Two general abstractions occurring in ethnographic analyses are normative and cultural. Communicative norms are formulated in analyzing and explaining the "patterned use of speech."…
NASA Astrophysics Data System (ADS)
Viswanathan, V. R.; Makhoul, J.; Schwartz, R. M.; Huggins, A. W. F.
1982-04-01
The variable frame rate (VFR) transmission methodology developed, implemented, and tested in the years 1973-1978 for efficiently transmitting linear predictive coding (LPC) vocoder parameters extracted from the input speech at a fixed frame rate is reviewed. With the VFR method, parameters are transmitted only when their values have changed sufficiently over the interval since their preceding transmission. Two distinct approaches to automatic implementation of the VFR method are discussed. The first bases the transmission decisions on comparisons between the parameter values of the present frame and the last transmitted frame. The second, which is based on a functional perceptual model of speech, compares the parameter values of all the frames that lie in the interval between the present frame and the last transmitted frame against a linear model of parameter variation over that interval. Also considered is the application of VFR transmission to the design of narrow-band LPC speech coders with average bit rates of 2000-2400 bits/s.
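The first VFR approach described above (compare the present frame with the last transmitted frame) can be sketched as follows. The Euclidean distance measure, the threshold value, and the name `select_frames` are assumptions chosen for illustration; the abstract does not specify the distance measure the authors used.

```python
from typing import List, Sequence


def select_frames(frames: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames to transmit under a simple VFR rule."""
    if not frames:
        return []
    sent = [0]          # the first frame is always transmitted
    last = frames[0]    # parameters of the last transmitted frame
    for i in range(1, len(frames)):
        # Euclidean distance between the present frame and the last transmitted frame.
        dist = sum((a - b) ** 2 for a, b in zip(frames[i], last)) ** 0.5
        if dist > threshold:
            sent.append(i)
            last = frames[i]
    return sent


# Illustrative two-parameter frames (hypothetical values).
frames = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [2.0, 1.0]]
print(select_frames(frames, threshold=0.5))  # -> [0, 2, 4]
```

Frames 1 and 3 are skipped because they differ only slightly from the frame sent just before them. The second approach in the abstract, which tests intermediate frames against a linear model of parameter variation, would need a different decision rule.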
Communication Supports for People with Motor Speech Disorders
ERIC Educational Resources Information Center
Hanson, Elizabeth K.; Fager, Susan K.
2017-01-01
Communication supports for people with motor speech disorders can include strategies and technologies to supplement natural speech efforts, resolve communication breakdowns, and replace natural speech when necessary to enhance participation in all communicative contexts. This article emphasizes communication supports that can enhance…
Ackermann, Hermann; Mathiak, Klaus; Riecker, Axel
2007-01-01
A classical tenet of clinical neurology proposes that cerebellar disorders may give rise to speech motor disorders (ataxic dysarthria), but spare perceptual and cognitive aspects of verbal communication. During the past two decades, however, a variety of higher-order deficits of speech production, e.g., more or less exclusive agrammatism, amnesic or transcortical motor aphasia, have been noted in patients with vascular cerebellar lesions, and transient mutism following resection of posterior fossa tumors in children may develop into similar constellations. Perfusion studies provided evidence for cerebello-cerebral diaschisis as a possible pathomechanism in these instances. Tight functional connectivity between the language-dominant frontal lobe and the contralateral cerebellar hemisphere represents a prerequisite of such long-distance effects. Recent functional imaging data point at a contribution of the right cerebellar hemisphere, concomitant with language-dominant dorsolateral and medial frontal areas, to the temporal organization of a prearticulatory verbal code ('inner speech'), in terms of the sequencing of syllable strings at a speaker's habitual speech rate. Besides motor control, this network also appears to be engaged in executive functions, e.g., subvocal rehearsal mechanisms of verbal working memory, and seems to be recruited during distinct speech perception tasks. Taken together, thus, a prearticulatory verbal code bound to reciprocal right cerebellar/left frontal interactions might represent a common platform for a variety of cerebellar engagements in cognitive functions. The distinct computational operation provided by cerebellar structures within this framework appears to be the concatenation of syllable strings into coarticulated sequences.
Co-occurrence statistics as a language-dependent cue for speech segmentation.
Saksida, Amanda; Langus, Alan; Nespor, Marina
2017-05-01
To what extent can language acquisition be explained in terms of different associative learning mechanisms? It has been hypothesized that distributional regularities in spoken languages are strong enough to elicit statistical learning about dependencies among speech units. Distributional regularities could be a useful cue for word learning even without rich language-specific knowledge. However, it is not clear how strong and reliable the distributional cues are that humans might use to segment speech. We investigate cross-linguistic viability of different statistical learning strategies by analyzing child-directed speech corpora from nine languages and by modeling possible statistics-based speech segmentations. We show that languages vary as to which statistical segmentation strategies are most successful. The variability of the results can be partially explained by systematic differences between languages, such as rhythmical differences. The results confirm previous findings that different statistical learning strategies are successful in different languages and suggest that infants may have to primarily rely on non-statistical cues when they begin their process of speech segmentation.
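One statistic commonly used in statistics-based segmentation models is the forward transitional probability between adjacent syllables, with a word boundary posited wherever it falls below a threshold. The sketch below assumes that statistic and an absolute threshold; the toy syllable stream, the threshold, and the function names are hypothetical and are not taken from the study's corpora or models.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transitional_probabilities(syllables: List[str]) -> Dict[Tuple[str, str], float]:
    """Forward transitional probability P(next | current) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables: List[str], threshold: float) -> List[str]:
    """Insert a boundary wherever the transitional probability drops below threshold."""
    if not syllables:
        return []
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Toy stream built from two made-up words, "baby" and "dogy".
stream = "ba by do gy ba by ba by do gy do gy ba by".split()
print(segment(stream, threshold=0.9))
```

In this toy stream the within-word transitions have probability 1.0 and the between-word transitions at most 2/3, so the threshold recovers the words. The abstract reports that on real corpora the success of such strategies varies by language.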
Speech transport for packet telephony and voice over IP
NASA Astrophysics Data System (ADS)
Baker, Maurice R.
1999-11-01
Recent advances in packet switching, internetworking, and digital signal processing technologies have converged to allow realizable practical implementations of packet telephony systems. This paper provides a tutorial on transmission engineering for packet telephony covering the topics of speech coding/decoding, speech packetization, packet data network transport, and impairments which may negatively impact end-to-end system quality. Particular emphasis is placed upon Voice over Internet Protocol given the current popularity and ubiquity of IP transport.
The Role of Corticostriatal Systems in Speech Category Learning.
Yi, Han-Gyol; Maddox, W Todd; Mumford, Jeanette A; Chandrasekaran, Bharath
2016-04-01
One of the most difficult category learning problems for humans is learning nonnative speech categories. While feedback-based category training can enhance speech learning, the mechanisms underlying these benefits are unclear. In this functional magnetic resonance imaging study, we investigated neural and computational mechanisms underlying feedback-dependent speech category learning in adults. Positive feedback activated a large corticostriatal network including the dorsolateral prefrontal cortex, inferior parietal lobule, middle temporal gyrus, caudate, putamen, and the ventral striatum. Successful learning was contingent upon the activity of domain-general category learning systems: the fast-learning reflective system, involving the dorsolateral prefrontal cortex that develops and tests explicit rules based on the feedback content, and the slow-learning reflexive system, involving the putamen in which the stimuli are implicitly associated with category responses based on the reward value in feedback. Computational modeling of response strategies revealed significant use of reflective strategies early in training and greater use of reflexive strategies later in training. Reflexive strategy use was associated with increased activation in the putamen. Our results demonstrate a critical role for the reflexive corticostriatal learning system as a function of response strategy and proficiency during speech category learning.
Strahl, Stefan; Mertins, Alfred
2008-07-18
Evidence that neurosensory systems use sparse signal representations, as well as improved performance of signal processing algorithms using sparse signal models, has raised interest in sparse signal coding in recent years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978-982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently in terms of sparseness than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, which suggests that signal processing applications should derive the parameters individually for each signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken when directly transferring findings of optimality from technical to biological systems.
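The greedy idea behind matching pursuit can be illustrated with a minimal sketch of the plain algorithm. This is not the accelerated variant the paper describes, and the tiny dictionary below is a hypothetical stand-in for a gammatone dictionary; atoms are assumed to have unit norm.

```python
from typing import List, Tuple


def matching_pursuit(signal: List[float], atoms: List[List[float]],
                     n_iter: int) -> Tuple[List[Tuple[int, float]], List[float]]:
    """Greedily pick the atom most correlated with the residual and subtract it."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        # Inner product of the residual with every unit-norm atom.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        coef = scores[best]
        picks.append((best, coef))
        # Remove the selected atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
    return picks, residual


# Hypothetical dictionary: three axis-aligned atoms plus one diagonal atom.
s = 2 ** -0.5
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
picks, residual = matching_pursuit([3.0, 3.0, 0.5], atoms, n_iter=2)
print([index for index, _ in picks])  # -> [3, 2]
```

Two atoms suffice here because the diagonal atom captures both of the first two components at once, which is the sense in which a well-matched dictionary gives a sparser code.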
Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo
2014-01-01
Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed with three aims: early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society's Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children's speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation.
Teaching Speech Organization and Outlining Using a Color-Coded Approach.
ERIC Educational Resources Information Center
Hearn, Ralene
The organization/outlining unit in the basic Public Speaking course can be made more interesting by using a color-coded instructional method that captivates students, facilitates understanding, and provides the opportunity for interesting reinforcement activities. The two part lesson includes a mini-lecture with a color-coded outline and a two…
Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model
Rallapalli, Varsha H.
2016-01-01
Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio (SNRENV) from a modulation filter bank provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to: (a) reduce S + N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRENV has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding to speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRENV. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating feasibility of neural SNRENV computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRENV in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
A speech processing study using an acoustic model of a multiple-channel cochlear implant
NASA Astrophysics Data System (ADS)
Xu, Ying
1998-10-01
A cochlear implant is an electronic device designed to provide sound information for adults and children who have bilateral profound hearing loss. The task of representing speech signals as electrical stimuli is central to the design and performance of cochlear implants. Studies have shown that the current speech-processing strategies provide significant benefits to cochlear implant users. However, the evaluation and development of speech-processing strategies have been complicated by hardware limitations and large variability in user performance. To alleviate these problems, an acoustic model of a cochlear implant with the SPEAK strategy is implemented in this study, in which a set of acoustic stimuli whose psychophysical characteristics are as close as possible to those produced by a cochlear implant are presented to normal-hearing subjects. To test the effectiveness and feasibility of this acoustic model, a psychophysical experiment was conducted to match the performance of a normal-hearing listener using model-processed signals to that of a cochlear implant user. Good agreement was found between an implanted patient and an age-matched normal-hearing subject in a dynamic signal discrimination experiment, indicating that this acoustic model is a reasonably good approximation of a cochlear implant with the SPEAK strategy. The acoustic model was then used to examine the potential of the SPEAK strategy in terms of its temporal and frequency encoding of speech. It was hypothesized that better temporal and frequency encoding of speech can be accomplished by higher stimulation rates and a larger number of activated channels. Vowel and consonant recognition tests were conducted on normal-hearing subjects using speech tokens processed by the acoustic model, with different combinations of stimulation rate and number of activated channels.
The results showed that vowel recognition was best at 600 pps and 8 activated channels, but further increases in stimulation rate and channel numbers were not beneficial. Manipulations of stimulation rate and number of activated channels did not appreciably affect consonant recognition. These results suggest that overall speech performance may improve by appropriately increasing stimulation rate and number of activated channels. Future revision of this acoustic model is necessary to provide more accurate amplitude representation of speech.
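The "number of activated channels" manipulated in this study can be pictured with a minimal sketch of spectral-maxima selection, in which only the channels with the largest envelope amplitudes are stimulated in each frame. The function name `select_maxima`, the six-channel frame, and its values are hypothetical; the abstract does not give the model's filter bank or selection details.

```python
from typing import List


def select_maxima(envelopes: List[float], n_active: int) -> List[int]:
    """Return the indices of the n_active largest channel envelopes, in channel order."""
    ranked = sorted(range(len(envelopes)), key=lambda i: envelopes[i], reverse=True)
    return sorted(ranked[:n_active])


# One illustrative analysis frame with six channel envelopes (made-up values).
frame = [0.1, 0.9, 0.3, 0.7, 0.05, 0.6]
print(select_maxima(frame, n_active=3))  # -> [1, 3, 5]
```

Raising `n_active` passes more of the spectrum through in each frame, which is one of the two parameters the study varied alongside stimulation rate.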
Will Microfilm and Computers Replace Clippings?
ERIC Educational Resources Information Center
Oppendahl, Alison; And Others
Four speeches are presented, each of which deals with the use of computers to organize and retrieve news stories. The first speech relates in detail the step-by-step process devised by the "Free Press" in Detroit to analyze, categorize, code, film, process, and retrieve news stories through the use of the electronic film retrieval…
Comparisons of Young Children's Private Speech Profiles: Analogical Versus Nonanalogical Reasoners.
ERIC Educational Resources Information Center
Manning, Brenda H.; White, C. Stephen
The primary intention of this study was to compare private speech profiles of young children classified as analogical reasoners (AR) with young children classified as nonanalogical reasoners (NAR). The secondary purpose was to investigate Berk's (1986) research methodology and categorical scheme for the collection and coding of private speech…
Transitioning from Analog to Digital Audio Recording in Childhood Speech Sound Disorders
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Mcsweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.
2005-01-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing…
Cultivating American- and Japanese-Style Relatedness through Mother-Child Conversation
ERIC Educational Resources Information Center
Crane, Lauren Shapiro; Fernald, Anne
2017-01-01
This study investigated whether European American and Japanese mothers' speech to preschoolers contained exchange- and alignment-oriented structures that reflect and possibly support culture-specific models of self-other relatedness. In each country 12 mothers were observed in free play with their 3-year-olds. Maternal speech was coded for…
Freedom of Speech Wins in Wisconsin
ERIC Educational Resources Information Center
Downs, Donald Alexander
2006-01-01
One might derive, from the eradication of a particularly heinous speech code, some encouragement that all is not lost in the culture wars. A core of dedicated scholars, working from within, made it obvious, to all but the most radical left, that imposing social justice by restricting thought and expression was a recipe for tyranny. Donald…
Preliminary Analysis of Automatic Speech Recognition and Synthesis Technology.
1983-05-01
...INDUSTRIAL/MILITARY SPEECH SYNTHESIS PRODUCTS...sequence The SC-01 Speech Synthesizer contains 64 different phonemes which are accessed by a 6-bit code. ...the proper sequential combinations of those...connected speech input with widely differing emotional states, diverse accents, and substantial nonperiodic background noise input. As noted previously
The Segmentation Problem in the Study of Impromptu Speech.
ERIC Educational Resources Information Center
Loman, Bengt
A fundamental problem in the study of spontaneous speech is how to segment it for analysis. The segments should be relevant for the study of linguistic structures, speech planning, speech production, or communication strategies. Operational rules for segmentation should consider a wide variety of criteria and be hierarchically ordered. This is…
NASA Astrophysics Data System (ADS)
The present conference on the development status of communications systems in the context of electronic warfare gives attention to topics in spread spectrum code acquisition, digital speech technology, fiber-optics communications, free space optical communications, the networking of HF systems, and applications and evaluation methods for digital speech. Also treated are issues in local area network system design, coding techniques and applications, technology applications for HF systems, receiver technologies, software development status, channel simulation/prediction methods, C3 networking spread spectrum networks, the improvement of communication efficiency and reliability through technical control methods, mobile radio systems, and adaptive antenna arrays. Finally, communications system cost analyses, spread spectrum performance, voice and image coding, switched networks, and microwave GaAs ICs are considered.
Freedom of Speech Newsletter, Volume 4, Number 1, October 1977.
ERIC Educational Resources Information Center
Kelley, Michael P., Ed.
This newsletter features an essay, "Anticipatory Democracy and Citizen Involvement: Strategies for Communication Education in the Future," which discusses strategies for improving citizen involvement and examines ways in which educators can prepare students for constructive citizen involvement. Notes on Speech Communication Association meetings…
IEP goals for school-age children with speech sound disorders.
Farquharson, Kelly; Tambyraja, Sherine R; Justice, Laura M; Redle, Erin E
2014-01-01
The purpose of the current study was to describe the current state of practice for writing Individualized Education Program (IEP) goals for children with speech sound disorders (SSDs). IEP goals for 146 children receiving services for SSDs within public school systems across two states were coded for their dominant theoretical framework and overall quality. A dichotomous scheme was used for theoretical framework coding: cognitive-linguistic or sensory-motor. Goal quality was determined by examining 7 specific indicators outlined by an empirically tested rating tool. In total, 147 long-term and 490 short-term goals were coded. The results revealed no dominant theoretical framework for long-term goals, whereas short-term goals largely reflected a sensory-motor framework. In terms of quality, the majority of speech production goals were functional and generalizable in nature, but were not able to be easily targeted during common daily tasks or by other members of the IEP team. Short-term goals were consistently rated higher in quality domains when compared to long-term goals. The current state of practice for writing IEP goals for children with SSDs indicates that theoretical framework may be eclectic in nature and likely written to support the individual needs of children with speech sound disorders. Further investigation is warranted to determine the relations between goal quality and child outcomes. (1) Identify two predominant theoretical frameworks and discuss how they apply to IEP goal writing. (2) Discuss quality indicators as they relate to IEP goals for children with speech sound disorders. (3) Discuss the relationship between long-term goals level of quality and related theoretical frameworks. (4) Identify the areas in which business-as-usual IEP goals exhibit strong quality.
McCormack, Jane; McLeod, Sharynne; McAllister, Lindy; Harrison, Linda J
2010-10-01
The purpose of this article was to understand the experience of speech impairment (speech sound disorders) in everyday life as described by children with speech impairment and their communication partners. Interviews were undertaken with 13 preschool children with speech impairment (mild to severe) and 21 significant others (family members and teachers). A phenomenological analysis of the interview transcripts revealed 2 global themes regarding the experience of living with speech impairment for these children and their families. The first theme encompassed the problems experienced by participants, namely (a) the child's inability to "speak properly," (b) the communication partner's failure to "listen properly," and (c) frustration caused by the speaking and listening problems. The second theme described the solutions participants used to overcome the problems. Solutions included (a) strategies to improve the child's speech accuracy (e.g., home practice, speech-language pathology) and (b) strategies to improve the listener's understanding (e.g., using gestures, repetition). Both short- and long-term solutions were identified. Successful communication is dependent on the skills of speakers and listeners. Intervention with children who experience speech impairment needs to reflect this reciprocity by supporting both the speaker and the listener and by addressing the frustration they experience.
Speech effort measurement and stuttering: investigating the chorus reading effect.
Ingham, Roger J; Warner, Allison; Byrd, Anne; Cotton, John
2006-06-01
The purpose of this study was to investigate chorus reading's (CR's) effect on speech effort during oral reading by adult stuttering speakers and control participants. The effect of a speech effort measurement highlighting strategy was also investigated. Twelve persistent stuttering (PS) adults and 12 normally fluent control participants completed 1-min base rate readings (BR-nonchorus) and CRs within a BR/CR/BR/CR/BR experimental design. Participants self-rated speech effort using a 9-point scale after each reading trial. Stuttering frequency, speech rate, and speech naturalness measures were also obtained. Instructions highlighting speech effort ratings during BR and CR phases were introduced after the first CR. CR improved speech effort ratings for the PS group, but the control group showed a reverse trend. Both groups' effort ratings were not significantly different during CR phases but were significantly poorer than the control group's effort ratings during BR phases. The highlighting strategy did not significantly change effort ratings. The findings show that CR will produce not only stutter-free and natural sounding speech but also reliable reductions in speech effort. However, these reductions do not reach effort levels equivalent to those achieved by normally fluent speakers, thereby conditioning its use as a gold standard of achievable normal fluency by PS speakers.
Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo
2014-01-01
Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. The tool aimed to help physicians achieve three main goals: early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society’s Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children’s speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation. PMID:24627648
Evaluation of inner-outer space distinction and verbal hallucinations in schizophrenia.
Stephane, Massoud; Kuskowski, Michael; McClannahan, Kate; Surerus, Christa; Nelson, Katie
2010-09-01
Verbal hallucinations could result from attributing one's own inner speech to another. Inner speech is usually experienced in inner space, whereas hallucinations are often experienced in outer space. To clarify this paradox, we investigated schizophrenia patients' ability to distinguish between speech experienced in inner space, and speech experienced in outer space. 32 schizophrenia patients and 26 matched healthy controls underwent a two-stage experiment. First, they read sentences aloud or silently. Afterwards, they were required to distinguish between the sentences read aloud (experienced in outer space), the sentences read silently (experienced in inner space), and new sentences not previously read (no space coding). The sentences were in the first, second, or third person in equal proportions. Linear mixed models were used to investigate the effects of group, sentence location, pronoun, and hallucinations status. Schizophrenia patients were similar to controls in recognition capacity of sentences without space coding. They exhibited both inner-outer and outer-inner space confusion (they confused silently read sentences for sentences read aloud, and vice versa). Patients who experienced hallucinations inside their head were more likely to have outer-inner space bias. For speech generated by one's own brain, schizophrenia patients have bidirectional failure of inner-outer space distinction (inner-outer and outer-inner space biases); this might explain why hallucinations (abnormal inner speech) could be experienced in outer space. Furthermore, the direction of inner-outer space indistinction could determine the spatial location of the experienced hallucinations (inside or outside the head).
From In-Session Behaviors to Drinking Outcomes: A Causal Chain for Motivational Interviewing
ERIC Educational Resources Information Center
Moyers, Theresa B.; Martin, Tim; Houck, Jon M.; Christopher, Paulette J.; Tonigan, J. Scott
2009-01-01
Client speech in favor of change within motivational interviewing sessions has been linked to treatment outcomes, but a causal chain has not yet been demonstrated. Using a sequential behavioral coding system for client speech, the authors found that, at both the session and utterance levels, specific therapist behaviors predict client change talk.…
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Paul, Rhea; McSweeny, Jane L.; Klin, Ami; Cohen, Donald J.; Volkmar, Fred R.
2001-01-01
This study compared the speech and prosody-voice profiles for 30 male speakers with either high-functioning autism (HFA) or Asperger syndrome (AS), and 53 typically developing male speakers. Both HFA and AS groups had more residual articulation distortion errors and utterances coded as inappropriate for phrasing, stress, and resonance. AS speakers…
Dole, Marjorie; Hoen, Michel; Meunier, Fanny
2012-06-01
Developmental dyslexia is associated with impaired speech-in-noise perception. The goal of the present research was to further characterize this deficit in dyslexic adults. In order to specify the mechanisms and processing strategies used by adults with dyslexia during speech-in-noise perception, we explored the influence of background type, presenting single target-words against backgrounds made of cocktail party sounds, modulated speech-derived noise or stationary noise. We also evaluated the effect of three listening configurations differing in terms of the amount of spatial processing required. In a monaural condition, signal and noise were presented to the same ear while in a dichotic situation, target and concurrent sound were presented to two different ears, finally in a spatialised configuration, target and competing signals were presented as if they originated from slightly differing positions in the auditory scene. Our results confirm the presence of a speech-in-noise perception deficit in dyslexic adults, in particular when the competing signal is also speech, and when both signals are presented to the same ear, an observation potentially relating to phonological accounts of dyslexia. However, adult dyslexics demonstrated better levels of spatial release of masking than normal reading controls when the background was speech, suggesting that they are well able to rely on denoising strategies based on spatial auditory scene analysis strategies. Copyright © 2012 Elsevier Ltd. All rights reserved.
Neural mechanisms underlying auditory feedback control of speech
Reilly, Kevin J.; Guenther, Frank H.
2013-01-01
The neural substrates underlying auditory feedback control of speech were investigated using a combination of functional magnetic resonance imaging (fMRI) and computational modeling. Neural responses were measured while subjects spoke monosyllabic words under two conditions: (i) normal auditory feedback of their speech, and (ii) auditory feedback in which the first formant frequency of their speech was unexpectedly shifted in real time. Acoustic measurements showed compensation to the shift within approximately 135 ms of onset. Neuroimaging revealed increased activity in bilateral superior temporal cortex during shifted feedback, indicative of neurons coding mismatches between expected and actual auditory signals, as well as right prefrontal and Rolandic cortical activity. Structural equation modeling revealed increased influence of bilateral auditory cortical areas on right frontal areas during shifted speech, indicating that projections from auditory error cells in posterior superior temporal cortex to motor correction cells in right frontal cortex mediate auditory feedback control of speech. PMID:18035557
Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach
NASA Astrophysics Data System (ADS)
Feldbauer, Christian; Kubin, Gernot; Kleijn, W. Bastiaan
2005-12-01
Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.
Congdon, Eliza L; Novack, Miriam A; Brooks, Neon; Hemani-Lopez, Naureen; O'Keefe, Lucy; Goldin-Meadow, Susan
2017-08-01
When teachers gesture during instruction, children retain and generalize what they are taught (Goldin-Meadow, 2014). But why does gesture have such a powerful effect on learning? Previous research shows that children learn most from a math lesson when teachers present one problem-solving strategy in speech while simultaneously presenting a different, but complementary, strategy in gesture (Singer & Goldin-Meadow, 2005). One possibility is that gesture is powerful in this context because it presents information simultaneously with speech. Alternatively, gesture may be effective simply because it involves the body, in which case the timing of information presented in speech and gesture may be less important for learning. Here we find evidence for the importance of simultaneity: 3rd-grade children retain and generalize what they learn from a math lesson better when given instruction containing simultaneous speech and gesture than when given instruction containing sequential speech and gesture. Interpreting these results in the context of theories of multimodal learning, we find that gesture capitalizes on its synchrony with speech to promote learning that lasts and can be generalized.
Kumar, U A; Jayaram, M
2013-07-01
The purpose of this study was to evaluate the effect of lengthening of voice onset time and burst duration of selected speech stimuli on perception by individuals with auditory dys-synchrony. This is the second of a series of articles reporting the effect of signal enhancing strategies on speech perception by such individuals. Two experiments were conducted: (1) assessment of the 'just-noticeable difference' for voice onset time and burst duration of speech sounds; and (2) assessment of speech identification scores when speech sounds were modified by lengthening the voice onset time and the burst duration in units of one just-noticeable difference, both in isolation and in combination with each other plus transition duration modification. Lengthening of voice onset time as well as burst duration improved perception of voicing. However, the effect of voice onset time modification was greater than that of burst duration modification. Although combined lengthening of voice onset time, burst duration and transition duration resulted in improved speech perception, the improvement was less than that due to lengthening of transition duration alone. These results suggest that innovative speech processing strategies that enhance temporal cues may benefit individuals with auditory dys-synchrony.
NASA Astrophysics Data System (ADS)
Liberman, A. M.
1982-03-01
This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation and practical applications. Manuscripts cover the following topics: Speech perception and memory coding in relation to reading ability; The use of orthographic structure by deaf adults: Recognition of finger-spelled letters; Exploring the information support for speech; The stream of speech; Using the acoustic signal to make inferences about place and duration of tongue-palate contact; Patterns of human interlimb coordination emerge from the properties of nonlinear limit cycle oscillatory processes: Theory and data; Motor control: Which themes do we orchestrate? Exploring the nature of motor control in Down's syndrome; Periodicity and auditory memory: A pilot study; Reading skill and language skill: On the role of sign order and morphological structure in memory for American Sign Language sentences; Perception of nasal consonants with special reference to Catalan; and Speech production characteristics of the hearing impaired.
Smart command recognizer (SCR) - For development, test, and implementation of speech commands
NASA Technical Reports Server (NTRS)
Simpson, Carol A.; Bunnell, John W.; Krones, Robert R.
1988-01-01
The SCR, a rapid prototyping system for the development, testing, and implementation of speech commands in a flight simulator or test aircraft, is described. A single unit performs all functions needed during these three phases of system development, while the use of common software and speech command data structure files greatly reduces the preparation time for successive development phases. As a smart peripheral to a simulation or flight host computer, the SCR interprets the pilot's spoken input and passes command codes to the simulation or flight computer.
Levels of Code Switching on EFL Student's Daily Language; Study of Language Production
ERIC Educational Resources Information Center
Zainuddin
2016-01-01
This study is aimed at describing the levels of code switching in EFL students' daily conversation. The topic was chosen because code-switching phenomena are commonly found in the daily speech of the Indonesian community, such as in teenagers' talk, television serial dialogues and mass media. Therefore, qualitative data were collected by using…
Restoring speech perception with cochlear implants by spanning defective electrode contacts.
Frijns, Johan H M; Snel-Bongers, Jorien; Vellinga, Dirk; Schrage, Erik; Vanpoucke, Filiep J; Briaire, Jeroen J
2013-04-01
Even with six defective contacts, spanning can largely restore speech perception with the HiRes 120 speech processing strategy to the level supported by an intact electrode array. Moreover, the sound quality is not degraded. Previous studies have demonstrated reduced speech perception scores (SPS) with defective contacts in HiRes 120. This study investigated whether replacing defective contacts by spanning, i.e. current steering on non-adjacent contacts, is able to restore speech recognition to the level supported by an intact electrode array. Ten adult cochlear implant recipients (HiRes90K, HiFocus1J) with experience with HiRes 120 participated in this study. Three different defective electrode arrays were simulated (six separate defective contacts, three pairs or two triplets). The participants received three take-home strategies and were asked to evaluate the sound quality in five predefined listening conditions. After 3 weeks, SPS were evaluated with monosyllabic words in quiet and in speech-shaped background noise. The participants rated the sound quality equal for all take-home strategies. SPS with background noise were equal for all conditions tested. However, SPS in quiet (85% phonemes correct on average with the full array) decreased significantly with increasing spanning distance, with a 3% decrease for each spanned contact.
Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle
2016-12-01
This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Recordings were made for 96 children (50 female participants, 46 male participants) engaged in a problem-solving task with a same-sex friend; recordings for 20 adults were used as reference. The task was carried out in good listening conditions (normal transmission) and in degraded transmission conditions. Articulation rate, median fundamental frequency (f0), f0 range, and relative energy in the 1- to 3-kHz range were analyzed. With increasing age, children significantly reduced their median f0 and f0 range, became faster talkers, and reduced their mid-frequency energy in spontaneous speech. Children produced similar clear speech adaptations (in degraded transmission conditions) as adults, but only children aged 11-14 years increased their f0 range, an unhelpful strategy not transmitted via the vocoder. Changes made by children were consistent with a general increase in vocal effort. Further developments in speech production take place during later childhood. Children use clear speech strategies to benefit an interlocutor facing intelligibility problems but may not be able to attune these strategies to the same degree as adults.
How musical expertise shapes speech perception: evidence from auditory classification images.
Varnet, Léo; Wang, Tianyun; Peter, Chloe; Meunier, Fanny; Hoen, Michel
2015-09-24
It is now well established that extensive musical training percolates to higher levels of cognition, such as speech processing. However, the lack of a precise technique to investigate the specific listening strategy involved in speech comprehension has made it difficult to determine how musicians' higher performance in non-speech tasks contributes to their enhanced speech comprehension. The recently developed Auditory Classification Image approach reveals the precise time-frequency regions used by participants when performing phonemic categorizations in noise. Here we used this technique on 19 non-musicians and 19 professional musicians. We found that both groups used very similar listening strategies, but the musicians relied more heavily on the two main acoustic cues, at the first formant onset and at the onsets of the second and third formants. Additionally, they responded more consistently to stimuli. These observations provide a direct visualization of auditory plasticity resulting from extensive musical training and shed light on the level of functional transfer between auditory processing and speech perception.
ERIC Educational Resources Information Center
Blood, Gordon W.; Boyle, Michael P.; Blood, Ingrid M.; Nalesnik, Gina R.
2010-01-01
Bullying in school-age children is a global epidemic. School personnel play a critical role in eliminating this problem. The goals of this study were to examine speech-language pathologists' (SLPs) perceptions of bullying, endorsement of potential strategies for dealing with bullying, and associations among SLPs' responses and specific demographic…
ERIC Educational Resources Information Center
Yang, Tae-Kyoung
2002-01-01
Examines how apology speech act strategies frequently used in daily life are transferred in the framework of interlanguage pragmatics and sociolinguistics and how they are influenced by sociolinguistic variations such as social status, social distance, severity of offense, and formal or private relationships. (Author/VWL)
Apology Strategies Employed by Saudi EFL Teachers
ERIC Educational Resources Information Center
Alsulayyi, Marzouq Nasser
2016-01-01
This study examines the apology strategies used by 30 Saudi EFL teachers in Najran, the Kingdom of Saudi Arabia (KSA), paying special attention to variables such as social distance and power and offence severity. The study also delineates gender differences in the respondents' speech as opposed to studies that only examined speech act output by…
ERIC Educational Resources Information Center
Galvin, Kathleen M.
This paper focuses on certain approaches which an urban speech department can use as it contributes to the preparation of urban school teachers to communicate effectively with their students. The contents include: "Verbal and Nonverbal Codes," which discusses the teacher as an encoder of verbal messages and emphasizes that teachers must learn to…
Hate Speech, the First Amendment, and Professional Codes of Conduct: Where to Draw the Line?
ERIC Educational Resources Information Center
Mello, Jeffrey A.
2008-01-01
This article presents a teaching case that involves the presentation of an actual incident in which a state commission on judicial performance had to balance a judge's First Amendment rights to protected free speech against his public statements about a societal class/group that were deemed to be derogatory and inflammatory and, hence, cast…
Fifty years of progress in speech coding standards
NASA Astrophysics Data System (ADS)
Cox, Richard
2004-10-01
Over the past 50 years, speech coding has taken root worldwide. Early applications were for the military and transmission for telephone networks. The military gave equal priority to intelligibility and low bit rate. The telephone network gave priority to high quality and low delay. These illustrate three of the four areas in which requirements must be set for any speech coder application: bit rate, quality, delay, and complexity. While the military could afford relatively expensive terminal equipment for secure communications, the telephone network needed low cost for massive deployment in switches and transmission equipment worldwide. Today speech coders are at the heart of the wireless phones and telephone answering systems we use every day. In addition to the technology and technical invention that has occurred, standards make it possible for all these different systems to interoperate. The primary areas of standardization are the public switched telephone network, wireless telephony, and secure telephony for government and military applications. With the advent of IP telephony there are additional standardization efforts and challenges. In this talk the progress in all areas is reviewed as well as a reflection on Jim Flanagan's impact on this field during the past half century.
Influence of musical training on understanding voiced and whispered speech in noise.
Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J
2014-01-01
This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.
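The gated masker described in this abstract can be illustrated with a short sketch. It is only an approximation under stated assumptions: it uses white Gaussian noise and omits the spectral shaping to the long-term speech spectrum that the study applied, and the 50% duty cycle, the 16 kHz sample rate, and the function name `gated_noise` are hypothetical choices, not details from the study.

```python
import random


def gated_noise(duration_s, sample_rate, gate_hz=16.0, seed=0):
    """Return white Gaussian noise gated on and off by a square wave.

    Assumptions (not from the study): 50% duty cycle, unit-variance
    white noise, no spectral shaping.
    """
    rng = random.Random(seed)
    n_samples = int(duration_s * sample_rate)
    period = sample_rate / gate_hz  # samples per gating cycle
    samples = []
    for i in range(n_samples):
        gate_open = (i % period) < (period / 2)  # first half of each cycle
        samples.append(rng.gauss(0.0, 1.0) if gate_open else 0.0)
    return samples


if __name__ == "__main__":
    noise = gated_noise(duration_s=1.0, sample_rate=16000)
    silent = sum(1 for v in noise if v == 0.0)
    print(f"{len(noise)} samples, {silent} silent")
```

Setting `gate_hz` aside and returning ungated noise would give the continuous condition; a fuller reproduction would filter the noise to match the speech spectrum before gating.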
Lei, Huimeng; Yan, Zhangming; Sun, Xiaohong; Zhang, Yue; Wang, Jianhong; Ma, Caihong; Xu, Qunyuan; Wang, Rui; Jarvis, Erich D; Sun, Zhirong
2017-11-01
Humans and several nonhuman species share the rare ability to modify acoustic and/or syntactic features of the sounds they produce, i.e. vocal learning, which is an important neurobiological and behavioral substrate of human speech/language. This convergent trait was suggested to be associated with significant genomic convergence, best manifested at the ROBO-SLIT axon guidance pathway. Here we verified the significance of such genomic convergence and assessed its functional relevance to human speech/language using human genetic variation data. In normal human populations, we found that the affected amino acid sites were well fixed and accompanied by significantly more associated protein-coding SNPs in the same genes than in the remaining genes. Diseased individuals with speech/language disorders had significantly more low-frequency protein-coding SNPs, but these preferentially occurred outside the affected genes. Such patients' SNPs were enriched in several functional categories, including two axon guidance pathways (mediated by netrin and semaphorin) that interact with ROBO-SLITs. Four of the six patients had homozygous missense SNPs in the PRAME gene family, one of the youngest gene families in the human lineage, which possibly acts upon retinoic acid receptor signaling, similarly to FOXP2, to modulate axon guidance. Taken together, we suggest that the axon guidance pathways (e.g. ROBO-SLIT, PRAME gene family) served as common targets for human speech/language evolution and related disorders. Copyright © 2017 Elsevier Inc. All rights reserved.
Kong, Anthony Pak-Hin; Law, Sam-Po; Kwan, Connie Ching-Yin; Lai, Christy; Lam, Vivian
2014-01-01
Gestures are commonly used together with spoken language in human communication. One major limitation of gesture investigations in the existing literature lies in the fact that the coding of forms and functions of gestures has not been clearly differentiated. This paper first described a recently developed Database of Speech and GEsture (DoSaGE) based on independent annotation of gesture forms and functions among 119 neurologically unimpaired right-handed native speakers of Cantonese (divided into three age and two education levels), and presented findings of an investigation examining how gesture use was related to age and linguistic performance. Consideration of these two factors, for which normative data are currently very limited or lacking in the literature, is relevant and necessary when one evaluates gesture employment among individuals with and without language impairment. Three speech tasks, including monologue of a personally important event, sequential description, and story-telling, were used for elicitation. The EUDICO Linguistic ANnotator (ELAN) software was used to independently annotate each participant’s linguistic information of the transcript, forms of gestures used, and the function for each gesture. About one-third of the subjects did not use any co-verbal gestures. While the majority of gestures were non-content-carrying, which functioned mainly for reinforcing speech intonation or controlling speech flow, the content-carrying ones were used to enhance speech content. Furthermore, individuals who are younger or linguistically more proficient tended to use fewer gestures, suggesting that normal speakers gesture differently as a function of age and linguistic performance. PMID:25667563
A recursive linear predictive vocoder
NASA Astrophysics Data System (ADS)
Janssen, W. A.
1983-12-01
A non-real-time, 10-pole recursive autocorrelation linear predictive coding vocoder was created for use in studying effects of recursive autocorrelation on speech. The vocoder is composed of two interchangeable pitch detectors, a speech analyzer, and a speech synthesizer. The time between updates of the filter coefficients is allowed to vary from 0.125 msec to 20 msec. The best quality was found using 0.125 msec between updates. The greatest change in quality was noted when changing from 20 msec/update to 10 msec/update. Pitch period plots for the center-clipping autocorrelation pitch detector and the simplified inverse filtering technique are provided. Plots of speech into and out of the vocoder are given. Three-dimensional plots of formant versus time are shown. Effects of noise on pitch detection and formants are shown. Noise affects the voiced/unvoiced decision process, causing voiced speech to be reconstructed as unvoiced.
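As a rough illustration of the autocorrelation method of linear prediction that such a vocoder relies on, the sketch below estimates predictor coefficients with the standard Levinson-Durbin recursion. It uses a plain block autocorrelation over one frame, which is an assumption: the report's recursively updated autocorrelation estimate is not described in the abstract and is not reproduced. Function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one frame of samples (block estimate)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) with a[0] == 1.0, where the prediction is
    x_hat[n] = -sum(a[k] * x[n - k] for k in 1..order)
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err == 0.0:
            break  # silent frame: nothing left to predict
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient for stage m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= 1.0 - k_m * k_m
    return a, err
```

A 10-pole analysis like the one in the report would call `levinson_durbin(autocorrelation(frame, 10), 10)` once per coefficient update.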
The Learning of Complex Speech Act Behaviour.
ERIC Educational Resources Information Center
Olshtain, Elite; Cohen, Andrew
1990-01-01
Pre- and posttraining measurement of adult English-as-a-Second-Language learners' (N=18) apology speech act behavior found no clear-cut quantitative improvement after training, although there was an obvious qualitative approximation of native-like speech act behavior in terms of types of intensification and downgrading, choice of strategy, and…
Speech Effort Measurement and Stuttering: Investigating the Chorus Reading Effect
ERIC Educational Resources Information Center
Ingham, Roger J.; Warner, Allison; Byrd, Anne; Cotton, John
2006-01-01
Purpose: The purpose of this study was to investigate chorus reading's (CR's) effect on speech effort during oral reading by adult stuttering speakers and control participants. The effect of a speech effort measurement highlighting strategy was also investigated. Method: Twelve persistent stuttering (PS) adults and 12 normally fluent control…
Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome
Engineer, Crystal T.; Rahebi, Kimiya C.; Borland, Michael S.; Buell, Elizabeth P.; Centanni, Tracy M.; Fink, Melyssa K.; Im, Kwok W.; Wilson, Linda G.; Kilgard, Michael P.
2015-01-01
Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded slower, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial. PMID:26321676
An articulatorily constrained, maximum entropy approach to speech recognition and speech coding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.
Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples, presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.
NASA Astrophysics Data System (ADS)
Riera-Palou, Felip; den Brinker, Albertus C.
2007-12-01
This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit stream scalable coder able to operate at different points in the quality bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of a reduced performance since the coder is competitive with standardized coders (MP3, AAC, SSC).
Can bilingual two-year-olds code-switch?
Lanza, E
1992-10-01
Sociolinguists have investigated language mixing as code-switching in the speech of bilingual children three years old and older. Language mixing by bilingual two-year-olds, however, has generally been interpreted in the child language literature as a sign of the child's lack of language differentiation. The present study applies perspectives from sociolinguistics to investigate the language mixing of a bilingual two-year-old acquiring Norwegian and English simultaneously in Norway. Monthly recordings of the child's spontaneous speech in interactions with her parents were made from the age of 2;0 to 2;7. An investigation into the formal aspects of the child's mixing and the context of the mixing reveals that she does differentiate her language use in contextually sensitive ways, hence that she can code-switch. This investigation stresses the need to examine more carefully the roles of dominance and context in the language mixing of young bilingual children.
Revisiting place and temporal theories of pitch
2014-01-01
The nature of pitch and its neural coding have been studied for over a century. A popular debate has revolved around the question of whether pitch is coded via “place” cues in the cochlea, or via timing cues in the auditory nerve. In the most recent incarnation of this debate, the role of temporal fine structure has been emphasized in conveying important pitch and speech information, particularly because the lack of temporal fine structure coding in cochlear implants might explain some of the difficulties faced by cochlear implant users in perceiving music and pitch contours in speech. In addition, some studies have postulated that hearing-impaired listeners may have a specific deficit related to processing temporal fine structure. This article reviews some of the recent literature surrounding the debate, and argues that much of the recent evidence suggesting the importance of temporal fine structure processing can also be accounted for using spectral (place) or temporal-envelope cues. PMID:25364292
An Investigation of Refusal Strategies as Used by Bahdini Kurdish and Syriac Aramaic Speakers
ERIC Educational Resources Information Center
Shareef, Dilgash M.; Qyrio, Marina Isteefan; Ali, Chiman Nadheer
2018-01-01
For the purpose of achieving a successful communication, issues such as the appropriateness of speech acts and face saving become essential. Therefore, it is very important to achieve a high level of pragmatic competence in speech acts. Bearing this in mind, this study was conducted to investigate the preferred refusal strategies Kurdish and…
ERIC Educational Resources Information Center
Donai, Jeremy J.; Schwartz, Jeremy C.
2016-01-01
Clinical Question: What high-frequency amplification strategy maximizes speech recognition performance among adult hearing-impaired listeners with mild sloping to moderately severe sensorineural hearing loss? Method: Quick response review. Study Sources: EBSCO, PubMed, Google Scholar, as well as journals from the American Speech-Language-Hearing…
A 4.8 kbps code-excited linear predictive coder
NASA Technical Reports Server (NTRS)
Tremain, Thomas E.; Campbell, Joseph P., Jr.; Welch, Vanoy C.
1988-01-01
A secure voice system, STU-3, capable of providing end-to-end secure voice communications (1984) was developed. The terminal for the new system will be built around the standard LPC-10 voice processor algorithm. Although the performance of the present STU-3 processor is considered to be good, its response to nonspeech sounds such as whistles, coughs and impulse-like noises may not be completely acceptable. Speech in noisy environments also causes problems with the LPC-10 voice algorithm. In addition, there is always a demand for something better. It is hoped that LPC-10's 2.4 kbps voice performance will be complemented with a very high quality speech coder operating at a higher data rate. This new coder is one of a number of candidate algorithms being considered for an upgraded version of the STU-3 in late 1989. The problems of designing a code-excited linear predictive (CELP) coder that provides very high quality speech at a 4.8 kbps data rate and can be implemented on today's hardware are considered.
Digital Coding and the Self-Proving Message
ERIC Educational Resources Information Center
Dettering, Richard
1971-01-01
Author suggests that "digital communication", which relies on arbitrary coding elements, "like the phones of speech," overshadows the importance of the analogic symbolism people use more extensively than realized. Non-verbal messages can be more convincing than verbal and can be used to predict patterns of future behavior. (Author/PD)
Influence of signal processing strategy in auditory abilities.
Melo, Tatiana Mendes de; Bevilacqua, Maria Cecília; Costa, Orozimbo Alves; Moret, Adriane Lima Mortari
2013-01-01
The signal processing strategy is a parameter that may influence the auditory performance of cochlear implant users, and it is important to optimize this parameter to provide better speech perception, especially in difficult listening situations. The aim was to evaluate individual auditory performance using two different signal processing strategies. This was a prospective study with 11 prelingually deafened children with open-set speech recognition. A within-subjects design was used to compare performance with standard HiRes and HiRes 120 at three different time points. During test sessions, each subject's performance was evaluated by warble-tone sound-field thresholds and by speech perception evaluation in quiet and in noise. In quiet, children S1, S4, S5, S7 showed better performance with the HiRes 120 strategy and children S2, S9, S11 showed better performance with the HiRes strategy. In noise, it was also observed that some children performed better using the HiRes 120 strategy and others with HiRes. Not all children presented the same pattern of response to the different strategies used in this study, which reinforces the need to look at optimizing cochlear implant clinical programming.
Blackie, Rebecca A; Kocovski, Nancy L
2016-01-01
According to cognitive models, post-event processing (PEP) is a key factor in the maintenance of social anxiety. Given that decreasing PEP can be challenging for socially anxious individuals, it is important to identify potentially useful strategies. Although distraction may help to decrease PEP, the findings have been equivocal. The primary purpose of this study was to examine whether a brief distraction period immediately following a speech would lead to less PEP the next day. The secondary aim was to examine the effect of distraction following an initial speech on anticipatory anxiety for a second speech, via reductions in PEP. Participants (N = 77 undergraduates with elevated social anxiety; 67.53% female) delivered a speech and were randomly assigned to a distraction, rumination, or control condition. The following day, participants reported levels of PEP in relation to the first speech, as well as anxiety regarding a second, upcoming speech. As expected, those in the distraction condition reported less PEP than those in the rumination and control conditions. Additionally, distraction following the first speech was indirectly related to anticipatory anxiety for the second speech, via PEP. Distraction may represent a potentially useful strategy for reducing PEP and other maladaptive processes that may maintain social anxiety.
ERIC Educational Resources Information Center
Lechler, Suzanne; Hare, Dougal Julian
2015-01-01
A naturalistic observational single case study was carried out to investigate the form and function of private speech (PS) in a young man with Dandy-Walker variant syndrome and trisomy 22. Video recordings were observed, transcribed and coded to identify all combinations of type and form of PS. Through comparison between theories of PS and the…
1988-05-01
Figure 2. Original limited-capacity channel model (from Broadbent, 1958) ... Figure 3. Experimental...unlimited variety of human voices for digital recording sources. Synthesis by Analysis: Analysis-synthesis methods electronically model the human voice
Sensory Information Processing
1975-12-31
system noise. To see how this is avoided, note that zeroes in the blur spectrum become sharp, spike-like negative impulses when the...Synthetic Speech Quality Using Binaural Reverberation (Boll); Section 4. Noise Suppression with Linear Prediction Filtering (Peterson); Section...5. Speech Processing to Reduce Noise and Improve Intelligibility (Callahan); Section 6. Linear Predictive Coding with a Glottal...Section 7
Multiparticipant Chat Analysis: A Survey
2013-02-26
language variation (e.g., regional speech in Germany [6]; code-switching in German-speaking regions of Switzerland [84] and Indian IRC channels [77]), and...messages which may be missed in high-tempo situations [19], and automated analysis of chat messages [13]. Finally, the high number of chat messages can...Androutsopoulos, E. Ziegler, Exploring language variation on the internet: Regional speech in a chat community, in: Proceedings of the Second International
Cross-language Activation and the Phonetics of Code-switching
NASA Astrophysics Data System (ADS)
Piccinini, Page Elizabeth
It is now well established that bilinguals have both languages activated to some degree at all times. This cross-language activation has been documented in several research paradigms, including picture naming, reading, and electrophysiological studies. What is less well understood is how the degree to which a language is activated can vary in different language environments or contexts. Furthermore, past research on the effects of order of acquisition and language dominance has produced mixed results, as the two variables are often conflated. In this dissertation, I test how degree of cross-language activation can vary according to context by examining phonetic productions in code-switching speech. Both spontaneous speech and scripted speech are analyzed. Follow-up perception experiments are conducted to see if listeners are able to anticipate language switches, potentially due to the phonetic cues in the signal. Additionally, by focusing on early bilinguals who are L1 Spanish but English dominant, I am able to see what plays a greater role in cross-language activation, order of acquisition or language dominance. I find that speakers do have intermediate phonetic productions in code-switching contexts relative to monolingual contexts. Effects are larger and more consistent in English than Spanish. Similar effects are found in speech perception. Listeners are able to anticipate language switches from English to Spanish but not Spanish to English. Together these results suggest that language dominance is a more important factor than order of acquisition in cross-language activation for early bilinguals. Future models on bilingual language organization and access should take into account both context and language dominance when modeling degrees of cross-language activation.
Kiernan, Barbara; Gray, Shelley
2013-05-01
Talk It Out was developed by speech-language pathologists to teach young children, especially those with speech and language impairments, to recognize problems, use words to solve them, and verbally negotiate solutions. One of the very successful by-products is that these same strategies help children avoid harming their voice. Across a school year, Talk It Out provides teaching and practice in predictable contexts so that children become competent problem solvers. It is especially powerful when implemented as part of the tier 1 preschool curriculum. The purpose of this article is to help school-based speech-language pathologists (1) articulate the need and rationale for early implementation of conflict resolution programs, (2) develop practical skills to implement Talk It Out strategies in their programs, and (3) transfer this knowledge to classroom teachers who can use and reinforce these strategies on a daily basis with the children they serve. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Psychoacoustic cues to emotion in speech prosody and music.
Coutinho, Eduardo; Dibben, Nicola
2013-01-01
There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners' second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
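Two of the seven features named in this abstract, spectral centroid and spectral flux, have simple textbook definitions that can be sketched as follows. The exact formulations used in the study are not given in the abstract, so the definitions below (amplitude-weighted mean frequency, and Euclidean distance between successive magnitude spectra) are assumptions, as are the function names.

```python
import math


def spectral_centroid(magnitudes, freqs):
    """Amplitude-weighted mean frequency of one magnitude spectrum."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


def spectral_flux(prev_magnitudes, cur_magnitudes):
    """Euclidean distance between two successive magnitude spectra."""
    return math.sqrt(sum((c - p) ** 2
                         for p, c in zip(prev_magnitudes, cur_magnitudes)))
```

Computed frame by frame, both functions yield time series that could be set against continuous ratings of the kind collected in the study.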
Spectral analysis method and sample generation for real time visualization of speech
NASA Astrophysics Data System (ADS)
Hobohm, Klaus
A method for translating speech signals into optical models, characterized by high sound discrimination and learnability and designed to provide to deaf persons a feedback towards control of their way of speaking, is presented. Important properties of speech production and perception processes and organs involved in these mechanisms are recalled in order to define requirements for speech visualization. It is established that the spectral representation of time, frequency and amplitude resolution of hearing must be fair and continuous variations of acoustic parameters of speech signal must be depicted by a continuous variation of images. A color table was developed for dynamic illustration and sonograms were generated with five spectral analysis methods such as Fourier transformations and linear prediction coding. For evaluating sonogram quality, test persons had to recognize consonant/vocal/consonant words and an optimized analysis method was achieved with a fast Fourier transformation and a postprocessor. A hardware concept of a real time speech visualization system, based on multiprocessor technology in a personal computer, is presented.
Application of the Envelope Difference Index to Spectrally Sparse Speech
ERIC Educational Resources Information Center
Souza, Pamela; Hoover, Eric; Gallun, Frederick
2012-01-01
Purpose: Amplitude compression is a common hearing aid processing strategy that can improve speech audibility and loudness comfort but also has the potential to alter important cues carried by the speech envelope. In previous work, a measure of envelope change, the Envelope Difference Index (EDI; Fortune, Woodruff, & Preves, 1994), was moderately…
Sentence-Level Movements in Parkinson's Disease: Loud, Clear, and Slow Speech
ERIC Educational Resources Information Center
Kearney,Elaine; Giles, Renuka; Haworth, Brandon; Faloutsos, Petros; Baljko, Melanie; Yunusova, Yana
2017-01-01
Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying…
Speech Communication Anxiety: An Impediment to Academic Achievement in the University Classroom.
ERIC Educational Resources Information Center
Boohar, Richard K.; Seiler, William J.
1982-01-01
The achievement levels of college students taking a bioethics course who demonstrated high and low degrees of speech anxiety were studied. Students with high speech anxiety interacted less with instructors and did not achieve as well as other students. Strategies instructors can use to help students are suggested. (Authors/PP)
Private Speech Moderates the Effects of Effortful Control on Emotionality
ERIC Educational Resources Information Center
Day, Kimberly L.; Smith, Cynthia L.; Neal, Amy; Dunsmore, Julie C.
2018-01-01
Research Findings: In addition to being a regulatory strategy, children's private speech may enhance or interfere with their effortful control used to regulate emotion. The goal of the current study was to investigate whether children's private speech during a selective attention task moderated the relations of their effortful control to their…
Modulation, Adaptation, and Control of Orofacial Pathways in Healthy Adults
ERIC Educational Resources Information Center
Estep, Meredith E.
2009-01-01
Although the healthy adult possesses a large repertoire of coordinative strategies for oromotor behaviors, a range of nonverbal, speech-like movements can be observed during speech. The extent of overlap among sensorimotor speech and nonspeech neural correlates and the role of neuromodulatory inputs generated during oromotor behaviors are unknown.…
Business Speech, Language Arts, Business English: 5128.21.
ERIC Educational Resources Information Center
Dade County Public Schools, Miami, FL.
Developed as part of a high school quinmester unit on business speech, this guide provides the teacher with teaching strategies for a course designed to help people in the business world. The course covers the preparation and delivery of a speech and other business situations which require skill in speaking (sales techniques, committee and group…
Mock Trial: A Window to Free Speech Rights and Abilities
ERIC Educational Resources Information Center
Schwartz, Sherry
2010-01-01
This article provides some strategies to alleviate the current tensions between personal responsibility and freedom of speech rights in the public school classroom. The article advocates the necessity of making sure students understand the points and implications of the first amendment by providing a mock trial unit concerning free speech rights.…
Stathopoulos, Elaine T; Huber, Jessica E; Richardson, Kelly; Kamphaus, Jennifer; DeCicco, Devan; Darling, Meghan; Fulcher, Katrina; Sussman, Joan E
2014-01-01
The objective of the present study was to investigate whether speakers with hypophonia, secondary to Parkinson's disease (PD), would increase their vocal intensity when speaking in a noisy environment (Lombard effect). The other objective was to examine the underlying laryngeal and respiratory strategies used to increase vocal intensity. Thirty-three participants with PD were included in the study. Each participant was fitted with the SpeechVive™ device that played multi-talker babble noise into one ear during speech. Using acoustic, aerodynamic and respiratory kinematic techniques, the simultaneous laryngeal and respiratory mechanisms used to regulate vocal intensity were examined. Significant group results showed that most speakers with PD (26/33) were successful at increasing their vocal intensity when speaking in the condition of multi-talker babble noise. They were able to support their increased vocal intensity and subglottal pressure with combined strategies from both the laryngeal and respiratory mechanisms. Individual speaker analysis indicated that the particular laryngeal and respiratory interactions differed among speakers. The SpeechVive™ device elicited higher vocal intensities from patients with PD. Speakers used different combinations of laryngeal and respiratory physiologic mechanisms to increase vocal intensity, thus suggesting that the disease process does not uniformly affect the speech subsystems. Readers will be able to: (1) identify speech characteristics of people with Parkinson's disease (PD), (2) identify typical respiratory strategies for increasing sound pressure level (SPL), (3) identify typical laryngeal strategies for increasing SPL, (4) define the Lombard effect. Copyright © 2014 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Haugh, Erin Kathleen
2017-01-01
The purpose of this study was to examine the role orthographic coding might play in distinguishing between membership in groups of language-based disability types. The sample consisted of 36 second and third-grade subjects who were administered the PAL-II Receptive Coding and Word Choice Accuracy subtest as a measure of orthographic coding…
Lexical and sublexical units in speech perception.
Giroux, Ibrahima; Rey, Arnaud
2009-03-01
Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies (i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vinter, 1998) in an experiment where we compare the lexical and sublexical recognition performance of adults after hearing 2 or 10 min of an artificial spoken language. The results are consistent with Parser's predictions and the clustering approach, showing that performance on words is better than performance on part-words only after 10 min. This result suggests that word segmentation abilities are not merely due to stronger associations between sublexical units but to the emergence of stronger lexical representations during the development of speech perception processes. Copyright © 2009, Cognitive Science Society, Inc.
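The bracketing idea can be made concrete with a toy sketch that computes syllable-to-syllable transitional probabilities and inserts a boundary wherever the probability falls below a threshold. This is neither of the two models tested in the study (it is not a recurrent network and not Parser); it only illustrates how statistical regularities in a continuous stream can mark word edges. The threshold and the function names are assumptions.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return {(a, b): P(b follows a)} estimated from a syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # how often each syllable starts a pair
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def bracket(syllables, tps, threshold):
    """Split the stream wherever the transitional probability is below threshold."""
    if not syllables:
        return []
    units = [[syllables[0]]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            units.append([b])    # low-probability transition: start a new unit
        else:
            units[-1].append(b)  # high-probability transition: same unit
    return ["".join(u) for u in units]
```

In a stream built from a few trisyllabic words in varying order, within-word transitions have probability 1 while transitions across word edges are lower, so a threshold of 1.0 recovers the words.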
Human phoneme recognition depending on speech-intrinsic variability.
Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger
2010-11-01
The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).
Xiao, Bo; Imel, Zac E; Georgiou, Panayiotis G; Atkins, David C; Narayanan, Shrikanth S
2015-01-01
The technology for evaluating patient-provider interactions in psychotherapy (observational coding) has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings was evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
ERIC Educational Resources Information Center
Kolehmainen, Leena; Skaffari, Janne
2016-01-01
This article serves as an introduction to a collection of four articles on multilingual practices in speech and writing, exploring both contemporary and historical sources. It not only introduces the articles but also discusses the scope and definitions of code-switching, attitudes towards multilingual interaction and, most pertinently, the…
Neural Coding of Relational Invariance in Speech: Human Language Analogs to the Barn Owl.
ERIC Educational Resources Information Center
Sussman, Harvey M.
1989-01-01
The neuronal model shown to code sound-source azimuth in the barn owl by H. Wagner et al. in 1987 is used as the basis for a speculative brain-based human model, which can establish contrastive phonetic categories to solve the problem of perception "non-invariance." (SLD)
ERIC Educational Resources Information Center
Bauminger-Zviely, Nirit; Golan-Itshaky, Adi; Tubul-Lavy, Gila
2017-01-01
In this study, we videotaped two 10-min. free-play interactions and coded speech acts (SAs) in peer talk of 51 preschoolers (21 ASD, 30 typical), interacting with friend versus non-friend partners. Groups were matched for maternal education, IQ (verbal/nonverbal), and CA. We compared SAs by group (ASD/typical), by partner's friendship status…
Perception and Neural Coding of Harmonic Fusion in Ferrets
2004-01-01
distinct percepts that come under the rubric of pitch, because periodicity pitch underlies speakers’ voices and speech prosody, as well as musical ...spectral fusion is unclear for sounds having predominantly low-frequency spectra such as speech, music, and many animal vocalizations. In summary...84, 560–565. von Helmholtz, H. (1863). Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik. (Vieweg und Sohn
The Matrix Pencil and its Applications to Speech Processing
2007-03-01
Elementary Linear Algebra” 8th edition, pp. 278, 2000 John Wiley & Sons, Inc., New York [37] Wai C. Chu, “Speech Coding Algorithms”, New Jersey: John...Ben; Daniel, James W.; “Applied Linear Algebra”, pp. 342-345, 1988 Prentice Hall, Englewood Cliffs, NJ [35] Haykin, Simon “Applied Linear Adaptive...ABSTRACT Matrix Pencils facilitate the study of differential equations resulting from oscillating systems. Certain problems in linear ordinary
Simplified APC for Space Shuttle applications. [Adaptive Predictive Coding for speech transmission
NASA Technical Reports Server (NTRS)
Hutchins, S. E.; Batson, B. H.
1975-01-01
This paper describes an 8 kbps adaptive predictive digital speech transmission system which was designed for potential use in the Space Shuttle Program. The system was designed to provide good voice quality in the presence of both cabin noise on board the Shuttle and the anticipated bursty channel. Minimal increase in size, weight, and power over the current high data rate system was also a design objective.
Iles, Jane; Spiby, Helen; Slade, Pauline
2014-10-01
Little is known about what constitutes key components of partner support during the childbirth experience. This study modified the five minute speech sample, a measure of expressed emotion (EE), for use with new parents in the immediate postpartum. A coding framework was developed to rate the speech samples on dimensions of couple support. Associations were explored between these codes and subsequent symptoms of postnatal depression and posttraumatic stress. 372 couples were recruited in the early postpartum and individually provided short speech samples. Posttraumatic stress and postnatal depression symptoms were assessed via questionnaire measures at six and thirteen weeks. Two hundred and twelve couples completed all time-points. Key elements of supportive interactions were identified and reliably categorised. Mothers' posttraumatic stress was associated with criticisms of the partner during childbirth, general relationship criticisms and men's perception of helplessness. Postnatal depression was associated with absence of partner empathy and any positive comments regarding the partner's support. The content of new parents' descriptions of labour and childbirth, their partner during labour and birth and their relationship within the immediate postpartum may have significant implications for later psychological functioning. Interventions to enhance specific supportive elements between couples during the antenatal period merit development and evaluation.
Doctors' voices in patients' narratives: coping with emotions in storytelling.
Lucius-Hoene, Gabriele; Thiele, Ulrike; Breuning, Martina; Haug, Stephanie
2012-09-01
To understand doctors' impacts on the emotional coping of patients, their stories about encounters with doctors are used. These accounts reflect meaning-making processes and biographically contextualized experiences. We investigate how patients characterize their doctors by voicing them in their stories, thus assigning them functions in their coping process. 394 narrated scenes with reported speech of doctors were extracted from interviews with 26 patients with type 2 diabetes and 30 with chronic pain. Constructed speech acts were investigated by means of positioning and narrative analysis, and assigned into thematic categories by a bottom-up coding procedure. Patients use narratives as coping strategies when confronted with illness and their encounters with doctors by constructing them in a supportive and face-saving way. In correspondence with the variance of illness conditions, differing moral problems in dealing with doctors arise. Different evaluative stances towards the same events within interviews show that positionings are not fixed, but vary according to contexts and purposes. Our narrative approach deepens the standardized and predominantly cognitive statements of questionnaires in research on doctor-patient relations by individualized emotional and biographical aspects of patients' perspective. Doctors should be trained to become aware of their impact in patients' coping processes.
Arsenault, Jessica S; Buchsbaum, Bradley R
2016-08-01
The motor theory of speech perception has experienced a recent revival due to a number of studies implicating the motor system during speech perception. In a key study, Pulvermüller et al. (2006) showed that premotor/motor cortex differentially responds to the passive auditory perception of lip and tongue speech sounds. However, no study has yet attempted to replicate this important finding from nearly a decade ago. The objective of the current study was to replicate the principal finding of Pulvermüller et al. (2006) and generalize it to a larger set of speech tokens while applying a more powerful statistical approach using multivariate pattern analysis (MVPA). Participants performed an articulatory localizer as well as a speech perception task where they passively listened to a set of eight syllables while undergoing fMRI. Both univariate and multivariate analyses failed to find evidence for somatotopic coding in motor or premotor cortex during speech perception. Positive evidence for the null hypothesis was further confirmed by Bayesian analyses. Results consistently show that while the lip and tongue areas of the motor cortex are sensitive to movements of the articulators, they do not appear to preferentially respond to labial and alveolar speech sounds during passive speech perception.
Carroll, Jeff; Zeng, Fan-Gang
2007-01-01
Increasing the number of channels at low frequencies improves discrimination of fundamental frequency (F0) in cochlear implants [Geurts and Wouters 2004]. We conducted three experiments to test whether improved F0 discrimination can be translated into increased speech intelligibility in noise in a cochlear implant simulation. The first experiment measured F0 discrimination and speech intelligibility in quiet as a function of channel density over different frequency regions. The results from this experiment showed a tradeoff in performance between F0 discrimination and speech intelligibility with a limited number of channels. The second experiment tested whether improved F0 discrimination and optimizing this tradeoff could improve speech performance with a competing talker. However, improved F0 discrimination did not improve speech intelligibility in noise. The third experiment identified the critical number of channels needed at low frequencies to improve speech intelligibility in noise. The result showed that, while 16 channels below 500 Hz were needed to observe any improvement in speech intelligibility in noise, even 32 channels did not achieve normal performance. Theoretically, these results suggest that without accurate spectral coding, F0 discrimination and speech perception in noise are two independent processes. Practically, the present results illustrate the need to increase the number of independent channels in cochlear implants. PMID:17604581
Hadely, Kathleen A; Power, Emma; O'Halloran, Robyn
2014-03-06
Communication and swallowing disorders are a common consequence of stroke. Clinical practice guidelines (CPGs) have been created to assist health professionals to put research evidence into clinical practice and can improve stroke care outcomes. However, CPGs are often not successfully implemented in clinical practice and research is needed to explore the factors that influence speech pathologists' implementation of stroke CPGs. This study aimed to describe speech pathologists' experiences and current use of guidelines, and to identify what factors influence speech pathologists' implementation of stroke CPGs. Speech pathologists working in stroke rehabilitation who had used a stroke CPG were invited to complete a 39-item online survey. Content analysis and descriptive and inferential statistics were used to analyse the data. 320 participants from all states and territories of Australia were surveyed. Almost all speech pathologists had used a stroke CPG and had found the guideline "somewhat useful" or "very useful". Factors that speech pathologists perceived influenced CPG implementation included the: (a) guideline itself, (b) work environment, (c) aspects related to the speech pathologist themselves, (d) patient characteristics, and (e) types of implementation strategies provided. There are many different factors that can influence speech pathologists' implementation of CPGs. The factors that influenced the implementation of CPGs can be understood in terms of knowledge creation and implementation frameworks. Speech pathologists should continue to adapt the stroke CPG to their local work environment and evaluate their use. To enhance guideline implementation, they may benefit from a combination of educational meetings and resources, outreach visits, support from senior colleagues, and audit and feedback strategies.
Patient Fatigue during Aphasia Treatment: A Survey of Speech-Language Pathologists
ERIC Educational Resources Information Center
Riley, Ellyn A.
2017-01-01
The purpose of this study was to measure speech-language pathologists' (SLPs) perceptions of fatigue in clients with aphasia and identify strategies used to manage client fatigue during speech and language therapy. SLPs completed a short online survey containing a series of questions related to their perceptions of patient fatigue. Of 312…
Walking to Medjugorje: Serving Children Who Are Deaf and Hard-of-Hearing in Bosnia-Herzegovina.
ERIC Educational Resources Information Center
Miller, Kevin J.
2002-01-01
This article discusses the experiences of an American team who worked with the Bosnia Speech and Hearing Project. The team collaborated with Bosnian teachers of children with deafness and speech-language pathologists to share therapy ideas and model strategies that parents could utilize to promote speech and language development. (Author/CR)
Tjaden, Kris; Wilding, Greg
2011-01-01
The primary purpose of this study was to investigate how speakers with Parkinson's disease (PD) and Multiple Sclerosis (MS) accomplish voluntary reductions in speech rate. A group of talkers with no history of neurological disease was included for comparison. This study was motivated by the idea that knowledge of how speakers with dysarthria voluntarily accomplish a reduced speech rate would contribute toward a descriptive model of speaking rate change in dysarthria. Such a model has the potential to assist in identifying rate control strategies to receive focus in clinical treatment programs and also would advance understanding of global speech timing in dysarthria. All speakers read a passage in Habitual and Slow conditions. Speech rate, articulation rate, pause duration, and pause frequency were measured. All speaker groups adjusted articulation time as well as pause time to reduce overall speech rate. Group differences in how voluntary rate reduction was accomplished were primarily a matter of quantity or degree. Overall, a slower-than-normal rate was associated with a reduced articulation rate, shorter speech runs that included fewer syllables, and longer, more frequent pauses. Taken together, these results suggest that existing skills or strategies used by patients should be emphasized in dysarthria training programs focusing on rate reduction. Results further suggest that a model of voluntary speech rate reduction based on neurologically normal speech shows promise as being applicable for mild to moderate dysarthria. The reader will be able to: (1) describe the importance of studying voluntary adjustments in speech rate in dysarthria, (2) discuss how speakers with Parkinson's disease and Multiple Sclerosis adjust articulation time and pause time to slow speech rate. Copyright © 2011 Elsevier Inc. All rights reserved.
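The four timing measures named in the abstract above can be related to one another with a short sketch. It assumes the common conventions that speech rate is syllables divided by total reading time (pauses included) and that articulation rate is syllables divided by time excluding pauses. The abstract does not give the study's operational definitions or pause-detection criterion, and the function name and all numbers below are hypothetical.

```python
def rate_measures(n_syllables, total_s, pauses_s):
    """Summarise speech timing for one reading.

    n_syllables: syllables produced (assumed to be counted beforehand)
    total_s:     total reading time in seconds, pauses included
    pauses_s:    list of individual pause durations in seconds
    """
    pause_time = sum(pauses_s)
    return {
        # Syllables per second over the whole reading.
        "speech_rate": n_syllables / total_s,
        # Syllables per second with pause time removed.
        "articulation_rate": n_syllables / (total_s - pause_time),
        "pause_frequency": len(pauses_s),
        "mean_pause_duration": pause_time / len(pauses_s) if pauses_s else 0.0,
    }


# Hypothetical readings of the same 120-syllable passage.
habitual = rate_measures(120, 30.0, [0.5, 0.5, 1.0])
slow = rate_measures(120, 50.0, [1.0, 1.0, 1.5, 1.5, 1.0, 2.0, 2.0])
```

In this made-up example the slow reading lowers the articulation rate and also adds more and longer pauses, the same combination of adjustments the abstract reports for all speaker groups.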
ERIC Educational Resources Information Center
McDonald, David; Proctor, Penny; Gill, Wendy; Heaven, Sue; Marr, Jane; Young, Jane
2015-01-01
Intensive Speech and Language Therapy (SLT) training courses for Early Childhood Educators (ECEs) can have a positive effect on their use of interaction strategies that support children's communication skills. The impact of brief SLT training courses is not yet clearly understood. The aims of these two studies were to assess the impact of a brief…
The Effect of Speech-to-Text Technology on Learning a Writing Strategy
ERIC Educational Resources Information Center
Haug, Katrina N.; Klein, Perry D.
2018-01-01
Previous research has shown that speech-to-text (STT) software can support students in producing a given piece of writing. This is the 1st study to investigate the use of STT to teach a writing strategy. We pretested 45 Grade 5 students on argument writing and trained them to use STT. Students participated in 4 lessons on an argument writing…
Performance of concatenated Reed-Solomon trellis-coded modulation over Rician fading channels
NASA Technical Reports Server (NTRS)
Moher, Michael L.; Lodge, John H.
1990-01-01
A concatenated coding scheme for providing very reliable data over mobile-satellite channels at power levels similar to those used for vocoded speech is described. The outer code is a shorter Reed-Solomon code which provides error detection as well as error correction capabilities. The inner code is a 1-D 8-state trellis code applied independently to both the inphase and quadrature channels. To achieve the full error correction potential of this inner code, the code symbols are multiplexed with a pilot sequence which is used to provide dynamic channel estimation and coherent detection. The implementation structure of this scheme is discussed and its performance is estimated.
Bailey, Dallin J; Blomgren, Michael; DeLong, Catharine; Berggren, Kiera; Wambaugh, Julie L
2017-06-22
The purpose of this article is to quantify and describe stuttering-like disfluencies in speakers with acquired apraxia of speech (AOS), utilizing the Lidcombe Behavioural Data Language (LBDL). Additional purposes include measuring test-retest reliability and examining the effect of speech sample type on disfluency rates. Two types of speech samples were elicited from 20 persons with AOS and aphasia: repetition of mono- and multisyllabic words from a protocol for assessing AOS (Duffy, 2013), and connected speech tasks (Nicholas & Brookshire, 1993). Sampling was repeated at 1 and 4 weeks following initial sampling. Stuttering-like disfluencies were coded using the LBDL, which is a taxonomy that focuses on motoric aspects of stuttering. Disfluency rates ranged from 0% to 13.1% for the connected speech task and from 0% to 17% for the word repetition task. There was no significant effect of speech sampling time on disfluency rate in the connected speech task, but there was a significant effect of time for the word repetition task. There was no significant effect of speech sample type. Speakers demonstrated both major types of stuttering-like disfluencies as categorized by the LBDL (fixed postures and repeated movements). Connected speech samples yielded more reliable tallies over repeated measurements. Suggestions are made for modifying the LBDL for use in AOS in order to further add to systematic descriptions of motoric disfluencies in this disorder.
Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.
Mitra, Vikramjit; Nam, Hosung; Espy-Wilson, Carol Y; Saltzman, Elliot; Goldstein, Louis
2010-09-13
Many different studies have claimed that articulatory information can be used to improve the performance of automatic speech recognition systems. Unfortunately, such articulatory information is not readily available in typical speaker-listener situations. Consequently, such information has to be estimated from the acoustic signal in a process which is usually termed "speech-inversion." This study aims to propose and compare various machine learning strategies for speech inversion: Trajectory mixture density networks (TMDNs), feedforward artificial neural networks (FF-ANN), support vector regression (SVR), autoregressive artificial neural network (AR-ANN), and distal supervised learning (DSL). Further, using a database generated by the Haskins Laboratories speech production model, we test the claim that information regarding constrictions produced by the distinct organs of the vocal tract (vocal tract variables) is superior to flesh-point information (articulatory pellet trajectories) for the inversion process.
Enhanced procedural learning of speech sound categories in a genetic variant of FOXP2.
Chandrasekaran, Bharath; Yi, Han-Gyol; Blanco, Nathaniel J; McGeary, John E; Maddox, W Todd
2015-05-20
A mutation of the forkhead box protein P2 (FOXP2) gene is associated with severe deficits in human speech and language acquisition. In rodents, the humanized form of FOXP2 promotes faster switching from declarative to procedural learning strategies when the two learning systems compete. Here, we examined a polymorphism of FOXP2 (rs6980093) in humans (214 adults; 111 females) for associations with non-native speech category learning success. Neurocomputational modeling results showed that individuals with the GG genotype shifted faster to procedural learning strategies, which are optimal for the task. These findings support an adaptive role for the FOXP2 gene in modulating the function of neural learning systems that have a direct bearing on human speech category learning. Copyright © 2015 the authors 0270-6474/15/357808-05$15.00/0.
Hengst, Julie A; Frame, Simone R; Neuman-Stritzel, Tiffany; Gannaway, Rachel
2005-02-01
Reported speech, wherein one quotes or paraphrases the speech of another, has been studied extensively as a set of linguistic and discourse practices. Researchers agree that reported speech is pervasive, found across languages, and used in diverse contexts. However, to date, there have been no studies of the use of reported speech among individuals with aphasia. Grounded in an interactional sociolinguistic perspective, the study presented here documents and analyzes the use of reported speech by 7 adults with mild to moderately severe aphasia and their routine communication partners. Each of the 7 pairs was videotaped in 4 everyday activities at home or around the community, yielding over 27 hr of conversational interaction for analysis. A coding scheme was developed that identified 5 types of explicitly marked reported speech: direct, indirect, projected, indexed, and undecided. Analysis of the data documented reported speech as a common discourse practice used successfully by the individuals with aphasia and their communication partners. All participants produced reported speech at least once, and across all observations the target pairs produced 400 reported speech episodes (RSEs), 149 by individuals with aphasia and 251 by their communication partners. For all participants, direct and indirect forms were the most prevalent (70% of RSEs). Situated discourse analysis of specific episodes of reported speech used by 3 of the pairs provides detailed portraits of the diverse interactional, referential, social, and discourse functions of reported speech and explores ways that the pairs used reported speech to successfully frame talk despite their ongoing management of aphasia.
Increasing motivation changes subjective reports of listening effort and choice of coping strategy.
Picou, Erin M; Ricketts, Todd A
2014-06-01
The purpose of this project was to examine the effect of changing motivation on subjective ratings of listening effort and on the likelihood that a listener chooses either a controlling or an avoidance coping strategy. Two experiments were conducted, one with auditory-only (AO) and one with auditory-visual (AV) stimuli, both using the same speech recognition in noise materials. Four signal-to-noise ratios (SNRs) were used, two in each experiment. The two SNRs targeted 80% and 50% correct performance. Motivation was manipulated by either having participants listen carefully to the speech (low motivation), or listen carefully to the speech and then answer quiz questions about the speech (high motivation). Sixteen participants with normal hearing participated in each experiment. Eight randomly selected participants participated in both. Using AO and AV stimuli, motivation generally increased subjective ratings of listening effort and tiredness. In addition, using auditory-visual stimuli, motivation generally increased listeners' willingness to do something to improve the situation, and decreased their willingness to avoid the situation. These results suggest a listener's mental state may influence listening effort and choice of coping strategy.
An Adaptive Approach to a 2.4 kb/s LPC Speech Coding System.
1985-07-01
laryngeal cancer). Spectral estimation is at the foundation of speech analysis for all these goals and accurate AR model estimation in noise is...
Geo-Coding for the Mapping of Documents and Social Media Messages
2013-08-22
O.L. (2007). UBC-ALM: Combining KNN with SVD for WSD. Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague...and Yarowsky, D. (1992). One sense per discourse. In Proceedings of the 4th DARPA Speech and Natural Language Workshop. pp. 233-237, 1992. Retrieved...Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments. Proceedings of the Annual Meeting of the Association for Computational
Power Spectral Density Error Analysis of Spectral Subtraction Type of Speech Enhancement Methods
NASA Astrophysics Data System (ADS)
Händel, Peter
2006-12-01
A theoretical framework for analysis of speech enhancement algorithms is introduced for performance assessment of spectral subtraction type of methods. The quality of the enhanced speech is related to physical quantities of the speech and noise (such as stationarity time and spectral flatness), as well as to design variables of the noise suppressor. The derived theoretical results are compared with the outcome of subjective listening tests as well as successful design strategies, performed by independent research groups.
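For orientation, the family of methods analysed above can be sketched in its simplest power-spectral form. This is a generic illustration, not the specific estimator or error analysis from the paper. The oversubtraction factor and spectral floor are assumed here as typical design variables of a noise suppressor, and the per-bin values are hypothetical.

```python
def spectral_subtraction(noisy_psd, noise_psd, oversubtraction=1.0, floor=0.01):
    """Estimate the clean-speech PSD per frequency bin.

    noisy_psd:       power spectrum of the noisy speech frame
    noise_psd:       estimated power spectrum of the noise
    oversubtraction: scales the subtracted noise estimate (assumed parameter)
    floor:           fraction of the noisy power kept as a lower bound, so
                     the estimate never becomes negative (assumed parameter)
    """
    return [
        max(y - oversubtraction * n, floor * y)
        for y, n in zip(noisy_psd, noise_psd)
    ]


# Hypothetical four-bin frame with a flat noise estimate.
noisy = [4.0, 2.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0, 1.0]
enhanced = spectral_subtraction(noisy, noise)
```

Bins where the noise estimate meets or exceeds the noisy power are clamped to the floor rather than set to zero, a common choice for limiting residual "musical" noise.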
DEBLICOM: Deaf-Blind Communication & Control Systems: First Quarterly Progress Report.
ERIC Educational Resources Information Center
Kafafian, Haig
Reported on is the first phase of development of DEBLICOM, a code for a two-way communication system for deaf-blind individuals who may be speech-impaired. Brief sections cover the following topics: alternatives to and considerations for the development of cutaneous codes for deaf-blind people; the DEBLICOM system which provides a means of…
Cybersecurity: Current Legislation, Executive Branch Initiatives, and Options for Congress
2009-09-30
responsibilities of cybersecurity stakeholders. Privacy and civil liberties—maintaining privacy and freedom of speech protections on the Internet...securing networks before tackling the attendant issues such as freedom of speech, privacy, and civil liberty protections as they pertain to the Internet...legislation to mandate privacy and freedom of speech protections to be incorporated into a national strategy. • Assessing current congressional
Pausing Preceding and Following "Que" in the Production of Native Speakers of French
ERIC Educational Resources Information Center
Genc, Bilal; Mavasoglu, Mustafa; Bada, Erdogan
2011-01-01
Pausing strategies in read and spontaneous speech have been of significant interest to researchers, since it has been observed in the literature that read and spontaneous speech pausing patterns display some considerable differences. This, at least, is the case in the English language as produced by native speakers. As to what may be the…
ERIC Educational Resources Information Center
Dole, Marjorie; Hoen, Michel; Meunier, Fanny
2012-01-01
Developmental dyslexia is associated with impaired speech-in-noise perception. The goal of the present research was to further characterize this deficit in dyslexic adults. In order to specify the mechanisms and processing strategies used by adults with dyslexia during speech-in-noise perception, we explored the influence of background type,…
Churchill, Tyler H; Kan, Alan; Goupell, Matthew J; Litovsky, Ruth Y
2014-09-01
Most contemporary cochlear implant (CI) processing strategies discard acoustic temporal fine structure (TFS) information, and this may contribute to the observed deficits in bilateral CI listeners' ability to localize sounds when compared to normal hearing listeners. Additionally, for best speech envelope representation, most contemporary speech processing strategies use high-rate carriers (≥900 Hz) that exceed the limit for interaural pulse timing to provide useful binaural information. Many bilateral CI listeners are sensitive to interaural time differences (ITDs) in low-rate (<300 Hz) constant-amplitude pulse trains. This study explored the trade-off between superior speech temporal envelope representation with high-rate carriers and binaural pulse timing sensitivity with low-rate carriers. The effects of carrier pulse rate and pulse timing on ITD discrimination, ITD lateralization, and speech recognition in quiet were examined in eight bilateral CI listeners. Stimuli consisted of speech tokens processed at different electrical stimulation rates, and pulse timings that either preserved or did not preserve acoustic TFS cues. Results showed that CI listeners were able to use low-rate pulse timing cues derived from acoustic TFS when presented redundantly on multiple electrodes for ITD discrimination and lateralization of speech stimuli.
Teaming for Speech and Auditory Training.
ERIC Educational Resources Information Center
Nussbaum, Debra B.; Waddy-Smith, Bettie
1985-01-01
The article suggests three strategies for the audiologist and speech/communication specialist to use in assisting the preschool teacher to implement the student's individualized education program: (1) demonstration teaming, (2) dual teaming, and (3) rotation teaming. (CL)
Self-Organization: Complex Dynamical Systems in the Evolution of Speech
NASA Astrophysics Data System (ADS)
Oudeyer, Pierre-Yves
Human vocalization systems are characterized by complex structural properties. They are combinatorial, based on the systematic reuse of phonemes, and the set of repertoires in human languages is characterized by both strong statistical regularities—universals—and a great diversity. Besides, they are conventional codes culturally shared in each community of speakers. What are the origins of the forms of speech? What are the mechanisms that permitted their evolution in the course of phylogenesis and cultural evolution? How can a shared speech code be formed in a community of individuals? This chapter focuses on the way the concept of self-organization, and its interaction with natural selection, can throw light on these three questions. In particular, a computational model is presented which shows that a basic neural equipment for adaptive holistic vocal imitation, coupling directly motor and perceptual representations in the brain, can generate spontaneously shared combinatorial systems of vocalizations in a society of babbling individuals. Furthermore, we show how morphological and physiological innate constraints can interact with these self-organized mechanisms to account for both the formation of statistical regularities and diversity in vocalization systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.
The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
ERIC Educational Resources Information Center
Thistle, Jennifer J.; McNaughton, David
2015-01-01
Purpose: This study examined the effect of instruction in an active listening strategy on the communication skills of pre-service speech-language pathologists (SLPs). Method: Twenty-three pre-service SLPs in their 2nd year of graduate study received a brief strategy instruction in active listening skills. Participants were videotaped during a…
Nigam, Ravi; Schlosser, Ralf W; Lloyd, Lyle L
2006-09-01
Matrix strategies employing parts of speech arranged in systematic language matrices and milieu language teaching strategies have been successfully used to teach word combining skills to children who have cognitive disabilities and some functional speech. The present study investigated the acquisition and generalized production of two-term semantic relationships in a new population using new types of symbols. Three children with cognitive disabilities and little or no functional speech were taught to combine graphic symbols. The matrix strategy and the mand-model procedure were used concomitantly as intervention procedures. A multiple probe design across sets of action-object combinations with generalization probes of untrained combinations was used to teach the production of graphic symbol combinations. Results indicated that two of the three children learned the early syntactic-semantic rule of combining action-object symbols and demonstrated generalization to untrained action-object combinations and generalization across trainers. The results and future directions for research are discussed.
Investigation of potential cognitive tests for use with older adults in audiology clinics.
Vaughan, Nancy; Storzbach, Daniel; Furukawa, Izumi
2008-01-01
Cognitive declines in working memory and processing speed are hallmarks of aging. Deficits in speech understanding also are seen in aging individuals. A clinical test to determine whether the cognitive aging changes contribute to aging speech understanding difficulties would be helpful for determining rehabilitation strategies in audiology clinics. To identify a clinical neurocognitive test or battery of tests that could be used in audiology clinics to help explain deficits in speech recognition in some older listeners. A correlational study examining the association between certain cognitive test scores and speech recognition performance. Speeded (time-compressed) speech was used to increase the cognitive processing load. Two hundred twenty-five adults aged 50 through 75 years were participants in this study. Both batteries of tests were administered to all participants in two separate sessions. A selected battery of neurocognitive tests and a time-compressed speech recognition test battery using various rates of speech were administered. Principal component analysis was used to extract the important component factors from each set of tests, and regression models were constructed to examine the association between tests and to identify the neurocognitive test most strongly associated with speech recognition performance. A sequencing working memory test (Letter-Number Sequencing [LNS]) was most strongly associated with rapid speech understanding. The association between the LNS test results and the compressed sentence recognition scores (CSRS) was strong even when age and hearing loss were controlled. The LNS is a sequencing test that provides information about temporal processing at the cognitive level and may prove useful in diagnosis of speech understanding problems, and in the development of aural rehabilitation and training strategies.
2014-01-01
Background: Communication and swallowing disorders are a common consequence of stroke. Clinical practice guidelines (CPGs) have been created to assist health professionals to put research evidence into clinical practice and can improve stroke care outcomes. However, CPGs are often not successfully implemented in clinical practice and research is needed to explore the factors that influence speech pathologists’ implementation of stroke CPGs. This study aimed to describe speech pathologists’ experiences and current use of guidelines, and to identify what factors influence speech pathologists’ implementation of stroke CPGs. Methods: Speech pathologists working in stroke rehabilitation who had used a stroke CPG were invited to complete a 39-item online survey. Content analysis and descriptive and inferential statistics were used to analyse the data. Results: 320 participants from all states and territories of Australia were surveyed. Almost all speech pathologists had used a stroke CPG and had found the guideline “somewhat useful” or “very useful”. Factors that speech pathologists perceived influenced CPG implementation included the: (a) guideline itself, (b) work environment, (c) aspects related to the speech pathologist themselves, (d) patient characteristics, and (e) types of implementation strategies provided. Conclusions: There are many different factors that can influence speech pathologists’ implementation of CPGs. The factors that influenced the implementation of CPGs can be understood in terms of knowledge creation and implementation frameworks. Speech pathologists should continue to adapt the stroke CPG to their local work environment and evaluate their use. To enhance guideline implementation, they may benefit from a combination of educational meetings and resources, outreach visits, support from senior colleagues, and audit and feedback strategies. PMID:24602148
Azadpour, Mahan; McKay, Colette M
2014-01-01
Auditory brainstem implants (ABI) use the same processing strategy as was developed for cochlear implants (CI). However, the cochlear nucleus (CN), the stimulation site of ABIs, is anatomically and physiologically more complex than the auditory nerve and consists of neurons with differing roles in auditory processing. The aim of this study was to evaluate the hypotheses that ABI users are less able than CI users to access speech spectro-temporal information delivered by the existing strategies and that the sites stimulated by different locations of CI and ABI electrode arrays differ in encoding of temporal patterns in the stimulation. Six CI users and four ABI users of Nucleus implants with ACE processing strategy participated in this study. Closed-set perception of aCa syllables (16 consonants) and bVd words (11 vowels) was evaluated via experimental processing strategies that activated one, two, or four of the electrodes of the array in a CIS manner as well as subjects' clinical strategies. Three single-channel strategies presented the overall temporal envelope variations of the signal on a single-implant electrode located at the high-, medium-, and low-frequency regions of the array. Implantees' ability to discriminate within electrode temporal patterns of stimulation for phoneme perception and their ability to make use of spectral information presented by increased number of active electrodes were assessed in the single- and multiple-channel strategies, respectively. Overall percentages and information transmission of phonetic features were obtained for each experimental program. Phoneme perception performance of three ABI users was within the range of CI users in most of the experimental strategies and improved as the number of active electrodes increased. One ABI user performed close to chance with all the single and multiple electrode strategies. 
There was no significant difference between apical, basal, and middle CI electrodes in transmitting speech temporal information, except a trend that the voicing feature was the least transmitted by the basal electrode. A similar electrode-location pattern could be observed in most ABI subjects. Although the number of tested ABI subjects was small, their wide range of phoneme perception performance was consistent with previous reports of overall speech perception in ABI patients. The better-performing ABI users had access to speech temporal and spectral information that was comparable to that of the average CI user. The poor-performing ABI user did not have access to within-channel speech temporal information and did not benefit from an increased number of spectral channels. The within-subject variability between different ABI electrodes was less than the variability across users in transmission of speech temporal information. The difference in the performance of ABI users could be related to the location of their electrode array on the CN, the anatomy and physiology of their CN, or the damage to their auditory brainstem due to tumor or surgery.
Jiang, Chenghui; Whitehill, Tara L
2014-04-01
Speech errors associated with cleft palate are well established for English and several other Indo-European languages. Few articles describing the speech of Putonghua (standard Mandarin Chinese) speakers with cleft palate have been published in English language journals. Although methodological guidelines have been published for the perceptual speech evaluation of individuals with cleft palate, there has been no critical review of methodological issues in studies of Putonghua speakers with cleft palate. A literature search was conducted to identify relevant studies published over the past 30 years in Chinese language journals. Only studies incorporating perceptual analysis of speech were included. Thirty-seven articles which met inclusion criteria were analyzed and coded on a number of methodological variables. Reliability was established by having all variables recoded for all studies. This critical review identified many methodological issues. These design flaws make it difficult to draw reliable conclusions about characteristic speech errors in this group of speakers. Specific recommendations are made to improve the reliability and validity of future studies, as well as to facilitate cross-center comparisons.
Phonology and Vocal Behavior in Toddlers with Autism Spectrum Disorders
Schoen, Elizabeth; Paul, Rhea; Chawarska, Katyrzyna
2011-01-01
Scientific Abstract: The purpose of this study is to examine the phonological and other vocal productions of children, 18-36 months, with autism spectrum disorder (ASD) and to compare these productions to those of age-matched and language-matched controls. Speech samples were obtained from 30 toddlers with ASD, 11 age-matched toddlers and 23 language-matched toddlers during either parent-child or clinician-child play sessions. Samples were coded for a variety of speech-like and non-speech vocalization productions. Toddlers with ASD produced speech-like vocalizations similar to those of language-matched peers, but produced significantly more atypical non-speech vocalizations when compared to both control groups. Toddlers with ASD show speech-like sound production that is linked to their language level, in a manner similar to that seen in typical development. The main area of difference in vocal development in this population is in the production of atypical vocalizations. Findings suggest that toddlers with autism spectrum disorders might not tune into the language model of their environment. Failure to attend to the ambient language environment negatively impacts the ability to acquire spoken language. PMID:21308998
Alm, Magnus; Behne, Dawn
2015-01-01
Gender and age have been found to affect adults’ audio-visual (AV) speech perception. However, research on adult aging focuses on adults over 60 years, who have an increasing likelihood for cognitive and sensory decline, which may confound positive effects of age-related AV-experience and its interaction with gender. Observed age and gender differences in AV speech perception may also depend on measurement sensitivity and AV task difficulty. Consequently both AV benefit and visual influence were used to measure visual contribution for gender-balanced groups of young (20–30 years) and middle-aged adults (50–60 years) with task difficulty varied using AV syllables from different talkers in alternative auditory backgrounds. Females had better speech-reading performance than males. Whereas no gender differences in AV benefit or visual influence were observed for young adults, visually influenced responses were significantly greater for middle-aged females than middle-aged males. That speech-reading performance did not influence AV benefit may be explained by visual speech extraction and AV integration constituting independent abilities. Contrastingly, the gender difference in visually influenced responses in middle adulthood may reflect an experience-related shift in females’ general AV perceptual strategy. Although young females’ speech-reading proficiency may not readily contribute to greater visual influence, between young and middle adulthood recurrent confirmation of the contribution of visual cues induced by speech-reading proficiency may gradually shift females’ AV perceptual strategy toward more visually dominated responses. PMID:26236274
Directive Speech Act of Imamu in Katoba Discourse of Muna Ethnic
NASA Astrophysics Data System (ADS)
Ardianto, Ardianto; Hadirman, Hardiman
2018-05-01
One of the traditions of the Muna ethnic group is the katoba ritual, a tradition that values local knowledge and has been maintained for generations until today. Katoba is a ritual of becoming an Islamic person, of repentance, and of forming the character of a child (male or female) who is about to enter adulthood (6-11 years), and it is conducted using directive speech. In the katoba ritual, the child undergoing katoba is introduced to the teachings of the Islamic religion, customs, manners towards his parents and brother, and behaviour towards others, all of which are expected to be practised in daily life. This study aims to describe and explain the directive speech acts of the imamu in the katoba discourse of the Muna ethnic group. This research uses a qualitative approach. Data are collected from a natural setting, namely katoba speech discourses. The data consist of two types, namely: (a) speech data, and (b) field note data. Data are analyzed using an interactive model with four stages: (1) data collection, (2) data reduction, (3) data display, and (4) conclusion and verification. The results show, firstly, that the forms of directive speech acts include declarative and imperative forms; secondly, that the functions of directive speech acts include teaching, explaining, suggesting, and expecting; and thirdly, that the strategies of directive speech acts include both direct and indirect strategies. The results of this study could be applied in the development of character learning materials at schools. They could also form part of the local content (mulok) taught at school.
ERIC Educational Resources Information Center
Katongo, Emily Mwamba; Ndhlovu, Daniel
2015-01-01
This study sought to establish the role of music in speech intelligibility of learners with Post Lingual Hearing Impairment (PLHI) and strategies teachers used to enhance speech intelligibility in learners with PLHI in selected special units for the deaf in Lusaka district. The study used a descriptive research design. Qualitative and quantitative…
ERIC Educational Resources Information Center
Masouleh, Fatemeh Abdollahizadeh; Arjmandi, Masoumeh; Vahdany, Fereydoon
2014-01-01
This study deals with the application of pragmatics research to EFL teaching. The significance of the study lies in the need for language learners to use speech acts such as requests, which involve a series of strategies. Although definitions of different speech acts have been established since the 1960s, recently there has been a shift towards…
Gesture and speech during shared book reading with preschoolers with specific language impairment.
Lavelli, Manuela; Barachetti, Chiara; Florit, Elena
2015-11-01
This study examined (a) the relationship between gesture and speech produced by children with specific language impairment (SLI) and typically developing (TD) children, and their mothers, during shared book-reading, and (b) the potential effectiveness of gestures accompanying maternal speech on the conversational responsiveness of children. Fifteen preschoolers with expressive SLI were compared with fifteen age-matched and fifteen language-matched TD children. Child and maternal utterances were coded for modality, gesture type, gesture-speech informational relationship, and communicative function. Relative to TD peers, children with SLI used more bimodal utterances and gestures adding unique information to co-occurring speech. Some differences were mirrored in maternal communication. Sequential analysis revealed that only in the SLI group maternal reading accompanied by gestures was significantly followed by child's initiatives, and when maternal non-informative repairs were accompanied by gestures, they were more likely to elicit adequate answers from children. These findings support the 'gesture advantage' hypothesis in children with SLI, and have implications for educational and clinical practice.
The minor third communicates sadness in speech, mirroring its use in music.
Curtis, Meagan E; Bharucha, Jamshed J
2010-06-01
There is a long history of attempts to explain why music is perceived as expressing emotion. The relationship between pitches serves as an important cue for conveying emotion in music. The musical interval referred to as the minor third is generally thought to convey sadness. We reveal that the minor third also occurs in the pitch contour of speech conveying sadness. Bisyllabic speech samples conveying four emotions were recorded by 9 actresses. Acoustic analyses revealed that the relationship between the 2 salient pitches of the sad speech samples tended to approximate a minor third. Participants rated the speech samples for perceived emotion, and the use of numerous acoustic parameters as cues for emotional identification was modeled using regression analysis. The minor third was the most reliable cue for identifying sadness. Additional participants rated musical intervals for emotion, and their ratings verified the historical association between the musical minor third and sadness. These findings support the theory that human vocal expressions and music share an acoustic code for communicating sadness.
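The interval measurement behind this finding can be illustrated with a short calculation. The sketch below is not taken from the study: it assumes equal temperament, in which a minor third spans 3 semitones, and the function names, the tolerance, and the example pitches are all hypothetical.

```python
import math


def interval_in_semitones(f_low_hz, f_high_hz):
    """Size of the interval between two pitches, in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def is_near_minor_third(f_a_hz, f_b_hz, tolerance=0.5):
    """True if the two pitches lie within `tolerance` semitones of a minor third.

    The tolerance is an arbitrary illustrative choice, not a value from the study.
    """
    low, high = sorted((f_a_hz, f_b_hz))
    return abs(interval_in_semitones(low, high) - 3.0) <= tolerance


# Hypothetical pitches for the two salient syllables of a spoken sample.
upper_hz, lower_hz = 220.0, 185.0
print(round(interval_in_semitones(lower_hz, upper_hz), 2))
print(is_near_minor_third(upper_hz, lower_hz))
```

A descending interval is handled by sorting the two pitches first, so only the size of the interval matters here, not its direction.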
Enhancing speech recognition using improved particle swarm optimization based hidden Markov model.
Selvaraj, Lokesh; Ganesan, Balakrishnan
2014-01-01
Enhancing speech recognition is the primary intention of this work. In this paper a novel speech recognition method based on vector quantization and improved particle swarm optimization (IPSO) is suggested. The suggested methodology contains four stages, namely, (i) denoising, (ii) feature mining, (iii) vector quantization, and (iv) IPSO based hidden Markov model (HMM) technique (IP-HMM). At first, the speech signals are denoised using a median filter. Next, characteristics such as peak, pitch spectrum, Mel frequency cepstral coefficients (MFCC), mean, standard deviation, and minimum and maximum of the signal are extracted from the denoised signal. Following that, to accomplish the training process, the extracted characteristics are given to genetic algorithm based codebook generation in vector quantization. The initial populations are created by selecting random code vectors from the training set for the codebooks for the genetic algorithm process, and IP-HMM helps in doing the recognition. The novelty of the method lies in one of the genetic operations, crossover. The proposed speech recognition technique offers 97.14% accuracy.
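The first stage named in this abstract, median-filter denoising, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the window length, the shrinking window at the signal edges, and the function name are assumptions, since the abstract gives no such details.

```python
from statistics import median


def median_filter(signal, window=3):
    """Replace each sample by the median of the samples in a centred window.

    Near the edges the window simply shrinks (an assumed edge policy).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = []
    for i in range(len(signal)):
        start = max(0, i - half)
        stop = min(len(signal), i + half + 1)
        filtered.append(median(signal[start:stop]))
    return filtered


# Hypothetical samples: an isolated spike is removed, a steady ramp is kept.
print(median_filter([0, 0, 9, 0, 0]))
print(median_filter([1, 2, 3, 4, 5]))
```

A median filter suppresses impulsive outliers without averaging them into neighbouring samples, which is the usual reason for choosing it over a moving average.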
Cochlear implant – state of the art
Lenarz, Thomas
2018-01-01
Cochlear implants are the treatment of choice for auditory rehabilitation of patients with sensory deafness. They restore the missing function of inner hair cells by transforming the acoustic signal into electrical stimuli for activation of auditory nerve fibers. Due to the very fast technology development, cochlear implants provide open-set speech understanding in the majority of patients including the use of the telephone. Children can achieve a near to normal speech and language development provided their deafness is detected early after onset and implantation is performed quickly thereafter. The diagnostic procedure as well as the surgical technique have been standardized and can be adapted to the individual anatomical and physiological needs both in children and adults. Special cases such as cochlear obliteration might require special measures and re-implantation, which can be done in most cases in a straightforward way. Technology upgrades count for better performance. Future developments will focus on better electrode-nerve interfaces by improving electrode technology. An increased number of electrical contacts as well as the biological treatment with regeneration of the dendrites growing onto the electrode will increase the number of electrical channels. This will give room for improved speech coding strategies in order to create the bionic ear, i.e. to restore the process of natural hearing by means of technology. The robot-assisted surgery will allow for high precision surgery and reliable hearing preservation. Biological therapies will support the bionic ear. Methods are bio-hybrid electrodes, which are coded by stem cells transplanted into the inner ear to enhance auto-production of neurotrophins. Local drug delivery will focus on suppression of trauma reaction and local regeneration. Gene therapy by nanoparticles will hopefully lead to the preservation of residual hearing in patients being affected by genetic hearing loss.
Overall the cochlear implant is a very powerful tool to rehabilitate patients with sensory deafness. More than 1 million candidates in Germany today could benefit from this high technology auditory implant. Only 50,000 are implanted so far. In the future, the procedure can be done under local anesthesia, will be minimally invasive and straightforward. Hearing preservation will be routine. PMID:29503669
Cooke, Martin; Lu, Youyi
2010-10-01
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.
NASA Technical Reports Server (NTRS)
Gray, Robert M.
1989-01-01
During the past ten years Vector Quantization (VQ) has developed from a theoretical possibility promised by Shannon's source coding theorems into a powerful and competitive technique for speech and image coding and compression at medium to low bit rates. In this survey, the basic ideas behind the design of vector quantizers are sketched and some comments made on the state-of-the-art and current research efforts.
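The codebook design idea mentioned in this survey abstract can be sketched with the generalized Lloyd iteration, a standard design technique chosen here for illustration; the abstract itself does not name a specific algorithm. The function names, the caller-supplied initial codebook, and the toy training vectors are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Encoder: index of the code vector closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def lloyd_codebook(training, initial_codebook, iterations=10):
    """Alternate nearest-neighbour partitioning and centroid updates."""
    codebook = [list(c) for c in initial_codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for vector in training:
            cells[nearest_index(vector, codebook)].append(vector)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous code vector
                dims = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell)
                               for d in range(dims)]
    return codebook


# Hypothetical 2-D training vectors forming two well-separated groups.
training = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
codebook = lloyd_codebook(training, initial_codebook=[(0, 0), (11, 11)])
print(codebook)
print(nearest_index((9, 9), codebook))
```

Each input vector is then transmitted as the index of its nearest code vector, so a codebook of N entries costs log2(N) bits per vector. The iteration only reaches a local optimum, which is why the initial codebook is passed in explicitly rather than drawn at random in this sketch.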
Persistent Use of Mixed Code: An Exploration of Its Functions in Hong Kong Schools
ERIC Educational Resources Information Center
Low, Winnie W. M.; Lu, Dan
2006-01-01
Codemixing of Cantonese Chinese and English is a common speech behaviour used by bilingual people in Hong Kong. Though codemixing is repeatedly criticised as a cause of the decline of students' language competence, there is little hard evidence to indicate its detrimental effects. This study examines the use of mixed code in the context of the…
Speech coding at low to medium bit rates
NASA Astrophysics Data System (ADS)
Leblanc, Wilfred Paul
1992-09-01
Improved search techniques coupled with improved codebook design methodologies are proposed to improve the performance of conventional code-excited linear predictive coders for speech. Improved methods for quantizing the short term filter are developed by employing a tree search algorithm and joint codebook design to multistage vector quantization. Joint codebook design procedures are developed to design locally optimal multistage codebooks. Weighting during centroid computation is introduced to improve the outlier performance of the multistage vector quantizer. Multistage vector quantization is shown to be both robust against input characteristics and in the presence of channel errors. Spectral distortions of about 1 dB are obtained at rates of 22-28 bits/frame. Structured codebook design procedures for excitation in code-excited linear predictive coders are compared to general codebook design procedures. Little is lost using significant structure in the excitation codebooks while greatly reducing the search complexity. Sparse multistage configurations are proposed for reducing computational complexity and memory size. Improved search procedures are applied to code-excited linear prediction which attempt joint optimization of the short term filter, the adaptive codebook, and the excitation. Improvements in signal to noise ratio of 1-2 dB are realized in practice.
Near-toll quality digital speech transmission in the mobile satellite service
NASA Technical Reports Server (NTRS)
Townes, S. A.; Divsalar, D.
1986-01-01
This paper discusses system considerations for near-toll quality digital speech transmission in a 5 kHz mobile satellite system channel. Tradeoffs are shown for power performance versus delay for a 4800 bps speech compression system in conjunction with a 16 state rate 2/3 trellis coded 8PSK modulation system. The suggested system has an additional 150 ms of delay beyond the propagation delay and requires an E(b)/N(0) of about 7 dB for a Ricean channel assumption with line-of-sight to diffuse component ratio of 10 assuming ideal synchronization. An additional loss of 2 to 3 dB is expected for synchronization in fading environment.
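The rate figures quoted in this abstract imply a simple budget that can be checked with a few lines of arithmetic. The sketch below assumes only what the abstract states (4800 bps, rate 2/3 coding, 8PSK, a 5 kHz channel); the function name is hypothetical, and pulse shaping, guard bands, and framing overhead are deliberately ignored.

```python
import math


def channel_symbol_rate(bit_rate_bps, code_rate_num, code_rate_den, constellation_size):
    """Symbols per second needed to carry `bit_rate_bps` information bits."""
    coded_bits_per_symbol = math.log2(constellation_size)  # 3 for 8PSK
    info_bits_per_symbol = coded_bits_per_symbol * code_rate_num / code_rate_den
    return bit_rate_bps / info_bits_per_symbol


symbol_rate = channel_symbol_rate(4800, 2, 3, 8)
print(symbol_rate)    # symbols per second
print(4800 / 5000)    # information bits per second per hertz of channel
```

With 3 coded bits per 8PSK symbol and a rate 2/3 code, each symbol carries 2 information bits, so 4800 bps needs 2400 symbols per second. The coding redundancy is absorbed by the larger constellation rather than by a higher symbol rate, which is what lets the signal fit a 5 kHz channel.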
Speech perception in individuals with auditory dys-synchrony.
Kumar, U A; Jayaram, M
2011-03-01
This study aimed to evaluate the effect of lengthening the transition duration of selected speech segments upon the perception of those segments in individuals with auditory dys-synchrony. Thirty individuals with auditory dys-synchrony participated in the study, along with 30 age-matched normal hearing listeners. Eight consonant-vowel syllables were used as auditory stimuli. Two experiments were conducted. Experiment one measured the 'just noticeable difference' time: the smallest prolongation of the speech sound transition duration which was noticeable by the subject. In experiment two, speech sounds were modified by lengthening the transition duration by multiples of the just noticeable difference time, and subjects' speech identification scores for the modified speech sounds were assessed. Subjects with auditory dys-synchrony demonstrated poor processing of temporal auditory information. Lengthening of speech sound transition duration improved these subjects' perception of both the placement and voicing features of the speech syllables used. These results suggest that innovative speech processing strategies which enhance temporal cues may benefit individuals with auditory dys-synchrony.
Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.
Borrie, Stephanie A
2015-03-01
This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal-the AV advantage-has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.
Cameron, Ashley; McPhail, Steven; Hudson, Kyla; Fleming, Jennifer; Lethlean, Jennifer; Tan, Ngang Ju; Finch, Emma
2018-06-01
The aim of the study was to describe and compare the confidence and knowledge of health professionals (HPs) with and without specialized speech-language training for communicating with people with aphasia (PWA) in a metropolitan hospital setting. Ninety HPs from multidisciplinary teams completed a customized survey to identify their demographic information, knowledge of aphasia, current use of supported conversation strategies and overall communication confidence when interacting with PWA using a 100 mm visual analogue scale (VAS) to rate open-ended questions. Conventional descriptive statistics were used to examine the demographic information. Descriptive statistics and the Mann-Whitney U test were used to analyse VAS confidence rating data. The responses to the open-ended survey questions were grouped into four previously identified key categories. The HPs consisted of 22 (24.4%) participants who were speech-language pathologists and 68 (75.6%) participants from other disciplines (non-speech-language pathology HPs, non-SLP HPs). The non-SLP HPs reported significantly lower confidence levels (U = 159.0, p < 0.001, two-tailed) and identified fewer strategies for communicating effectively with PWA than the trained speech-language pathologists. The non-SLP HPs identified a median of two strategies [interquartile range (IQR) 1-3] in contrast to the speech-language pathologists who identified a median of eight strategies (IQR 7-12). These findings suggest that HPs, particularly those without specialized communication education, are likely to benefit from formal training to enhance their confidence, skills and ability to successfully communicate with PWA in their work environment. This may in turn increase the involvement of PWA in their health care decisions.
Implications for Rehabilitation: Interventions to remediate health professionals' (particularly non-speech-language pathology health professionals') lower levels of confidence and ability to communicate with PWA may ultimately help ensure equal access for PWA, promote informed collaborative decision-making, and foster patient-centred care within the health care setting.
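The Mann-Whitney U statistic used in this study to compare confidence ratings can be computed from rank sums, as sketched below. This is an illustration only: the function name and the ratings are hypothetical, no study data are reproduced, and the p-value calculation is omitted.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the smaller of the two U statistics, using average ranks for ties."""
    pooled = sorted([(value, 0) for value in sample_a] +
                    [(value, 1) for value in sample_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        average_rank = (i + 1 + j) / 2.0
        members_of_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += average_rank * members_of_a
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical VAS confidence ratings in millimetres (not data from the study).
trained = [82, 90, 77, 95]
untrained = [40, 55, 61, 48, 70]
print(mann_whitney_u(trained, untrained))
```

A value of U near zero means the two groups barely overlap in rank, while a value near half the product of the group sizes means they are thoroughly interleaved.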
Using Flanagan's phase vocoder to improve cochlear implant performance
NASA Astrophysics Data System (ADS)
Zeng, Fan-Gang
2004-10-01
The cochlear implant has restored partial hearing to more than 100
"Look What I Did!": Student Conferences with Text-to-Speech Software
ERIC Educational Resources Information Center
Young, Chase; Stover, Katie
2014-01-01
The authors describe a strategy that empowers students to edit and revise their own writing. Students input their writing into text-to-speech software that rereads the text aloud. While listening, students make necessary revisions and edits.
ERIC Educational Resources Information Center
Richard, Gail J.; Hoge, Debra Reichert
Designed for practicing speech-language pathologists, this book discusses different syndrome disabilities, pertinent speech-language characteristics, and goals and strategies to begin intervention efforts at a preschool level. Chapters address: (1) Angelman syndrome; (2) Asperger syndrome; (3) Down syndrome; (4) fetal alcohol syndrome; (5) fetal…
Optimal speech motor control and token-to-token variability: a Bayesian modeling approach.
Patri, Jean-François; Diard, Julien; Perrier, Pascal
2015-12-01
The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which enables producing similar acoustical properties with different sets of control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the observed experimental intra-speaker token-to-token variability. The present paper proposes an alternative approach by formulating feedforward optimal control in a probabilistic Bayesian modeling framework. This is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first. From them the Bayesian model is constructed in a progressive way. Performance of the Bayesian model is evaluated based on computer simulations and compared to the optimal control model. This approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way.
De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A
2016-01-01
As children learn they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. 12 children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task were coded, on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while the balance leans more toward speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children are older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. 
A higher score on math or past science tasks is related to less asymmetry between gestures and speech. The picture that emerges from our analyses suggests that the relation between gestures, speech and cognition is more complex than previously thought. We suggest that temporal differences and asymmetry in influence between gestures and speech arise from simultaneous coordination of synergies.
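Cross recurrence quantification analysis, as used in the study above, starts from a matrix that marks every pair of time points at which the two series are in the same state. The sketch below is a simplified categorical version, assuming that "same state" means equal skill levels; the function names and the toy series are hypothetical and do not reproduce the authors' coding or their CRQA measures.

```python
def cross_recurrence(x, y):
    """Matrix with 1 where x[i] and y[j] are in the same state, else 0."""
    return [[1 if xi == yj else 0 for yj in y] for xi in x]


def recurrence_rate(matrix):
    """Proportion of recurrent points in the whole matrix."""
    cells = sum(len(row) for row in matrix)
    return sum(map(sum, matrix)) / cells


def diagonal_profile(matrix, max_lag):
    """Recurrence rate on each diagonal j - i = lag, for lag in -max_lag..max_lag.

    With x as gestures and y as speech, recurrence at a positive lag means
    the same level shows up in speech later than in gestures.
    """
    n, m = len(matrix), len(matrix[0])
    profile = {}
    for lag in range(-max_lag, max_lag + 1):
        cells = [matrix[i][i + lag] for i in range(n) if 0 <= i + lag < m]
        profile[lag] = sum(cells) / len(cells) if cells else 0.0
    return profile


# Hypothetical skill-level series in which gestures lead speech by one step.
gestures = [1, 2, 2, 3, 3, 4]
speech = [1, 1, 2, 2, 3, 3]
matrix = cross_recurrence(gestures, speech)
profile = diagonal_profile(matrix, max_lag=1)
print(recurrence_rate(matrix), profile)
```

Comparing the profile at positive and negative lags is one simple way to express the leading and following asymmetry that the abstract describes.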
29 CFR 1401.21 - Information policy.
Code of Federal Regulations, 2011 CFR
2011-07-01
... excluded by subsection 552(b) of title 5, United States Code, matters covered by the Privacy Act, or other... routine public distribution, e.g., pamphlets, speeches, and educational or training materials, will be...
Cullington, Helen E; Zeng, Fan-Gang
2011-02-01
Despite excellent performance in speech recognition in quiet, most cochlear implant users have great difficulty with speech recognition in noise, music perception, identifying tone of voice, and discriminating different talkers. This may be partly due to the pitch coding in cochlear implant speech processing. Most current speech processing strategies use only the envelope information; the temporal fine structure is discarded. One way to improve electric pitch perception is to use residual acoustic hearing via a hearing aid on the nonimplanted ear (bimodal hearing). This study aimed to test the hypothesis that bimodal users would perform better than bilateral cochlear implant users on tasks requiring good pitch perception. Four pitch-related tasks were used. 1. Hearing in Noise Test (HINT) sentences spoken by a male talker with a competing female, male, or child talker. 2. Montreal Battery of Evaluation of Amusia. This is a music test with six subtests examining pitch, rhythm and timing perception, and musical memory. 3. Aprosodia Battery. This has five subtests evaluating aspects of affective prosody and recognition of sarcasm. 4. Talker identification using vowels spoken by 10 different talkers (three men, three women, two boys, and two girls). Bilateral cochlear implant users were chosen as the comparison group. Thirteen bimodal and 13 bilateral adult cochlear implant users were recruited; all had good speech perception in quiet. There were no significant differences between the mean scores of the bimodal and bilateral groups on any of the tests, although the bimodal group did perform better than the bilateral group on almost all tests. Performance on the different pitch-related tasks was not correlated, meaning that if a subject performed one task well they would not necessarily perform well on another. The correlation between the bimodal users' hearing threshold levels in the aided ear and their performance on these tasks was weak. 
Although the bimodal cochlear implant group performed better than the bilateral group on most parts of the four pitch-related tests, the differences were not statistically significant. The lack of correlation between test results shows that the tasks used are not simply providing a measure of pitch ability. Even if the bimodal users have better pitch perception, the real-world tasks used are reflecting more diverse skills than pitch. This research adds to the existing speech perception, language, and localization studies that show no significant difference between bimodal and bilateral cochlear implant users.
Key considerations in designing a speech brain-computer interface.
Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Chabardès, Stéphan; Yvert, Blaise
2016-11-01
Restoring communication in case of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned to allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not exist yet and their design should consider three key choices that need to be made: the choice of appropriate brain regions to record neural activity from, the choice of an appropriate recording technique, and the choice of a neural decoding scheme in association with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic or articulatory) envisioned to be decoded at the core of a speech BCI paradigm. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Schwartz, Jean-Luc; Savariaux, Christophe
2014-01-01
An increasing number of neuroscience papers capitalize on the assumption published in this journal that visual speech would be typically 150 ms ahead of auditory speech. It happens that the estimation of audiovisual asynchrony in the reference paper is valid only in very specific cases, for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call “preparatory gestures”. However, when syllables are chained in sequences, as they are typically in most parts of a natural speech utterance, asynchrony should be defined in a different way. This is what we call “comodulatory gestures” providing auditory and visual events more or less in synchrony. We provide audiovisual data on sequences of plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise, varying between 20 ms audio lead and 70 ms audio lag. We show how more complex speech material should result in a range typically varying between 40 ms audio lead and 200 ms audio lag, and we discuss how this natural coordination is reflected in the so-called temporal integration window for audiovisual speech perception. Finally we present a toy model of auditory and audiovisual predictive coding, showing that visual lead is actually not necessary for visual prediction. PMID:25079216
Improving speech perception in noise for children with cochlear implants.
Gifford, René H; Olund, Amy P; Dejong, Melissa
2011-10-01
Current cochlear implant recipients are achieving increasingly higher levels of speech recognition; however, the presence of background noise continues to significantly degrade speech understanding for even the best performers. Newer generation Nucleus cochlear implant sound processors can be programmed with SmartSound strategies that have been shown to improve speech understanding in noise for adult cochlear implant recipients. The applicability of these strategies for use in children, however, is not fully understood nor widely accepted. To assess speech perception for pediatric cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE™) array in order to determine whether Nucleus sound processor SmartSound strategies yield improved sentence recognition in noise for children who learn language through the implant. Single subject, repeated measures design. Twenty-two experimental subjects with cochlear implants (mean age 11.1 yr) and 25 control subjects with normal hearing (mean age 9.6 yr) participated in this prospective study. Speech reception thresholds (SRT) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the experimental subjects' everyday program incorporating Adaptive Dynamic Range Optimization (ADRO) as well as with the addition of Autosensitivity control (ASC). Adaptive SRTs with the Hearing In Noise Test (HINT) sentences were obtained for all 22 experimental subjects, and performance (in percent correct) was assessed in a fixed +6 dB SNR (signal-to-noise ratio) for a six-subject subset. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the SmartSound setting on the SRT in noise.
The primary findings mirrored those reported previously with adult cochlear implant recipients in that the addition of ASC to ADRO significantly improved speech recognition in noise for pediatric cochlear implant recipients. The mean degree of improvement in the SRT with the addition of ASC to ADRO was 3.5 dB for a mean SRT of 10.9 dB SNR. Thus, despite the fact that these children have acquired auditory/oral speech and language through the use of their cochlear implant(s) equipped with ADRO, the addition of ASC significantly improved their ability to recognize speech in high levels of diffuse background noise. The mean SRT for the control subjects with normal hearing was 0.0 dB SNR. Given that the mean SRT for the experimental group was 10.9 dB SNR, despite the improvements in performance observed with the addition of ASC, cochlear implants still do not completely overcome the speech perception deficit encountered in noisy environments accompanying the diagnosis of severe-to-profound hearing loss. SmartSound strategies currently available in latest generation Nucleus cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise for pediatric cochlear implant recipients. Despite the reluctance of pediatric audiologists to utilize SmartSound settings for regular use, the results of the current study support the addition of ASC to ADRO for everyday listening environments to improve speech perception in a child's typical everyday program. American Academy of Audiology.
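An adaptive speech reception threshold of the kind reported above is usually obtained by moving the signal-to-noise ratio up or down after each sentence. The sketch below shows a generic one-down/one-up staircase under stated assumptions: the step size, starting level, number of trials, and the idealised deterministic listener are all hypothetical, and the abstract does not specify the exact adaptive rule used in the study.

```python
def adaptive_srt(listener, start_snr_db=10.0, step_db=2.0, n_trials=20, n_discard=4):
    """Run a simple 1-down/1-up staircase.

    listener(snr_db) returns True when the sentence is repeated correctly.
    Returns (srt_estimate_db, presented_levels_db).
    """
    snr = start_snr_db
    levels = []
    for _ in range(n_trials):
        levels.append(snr)
        if listener(snr):
            snr -= step_db  # correct response: make the next trial harder
        else:
            snr += step_db  # incorrect response: make the next trial easier
    kept = levels[n_discard:]  # drop the initial approach to threshold
    return sum(kept) / len(kept), levels


def make_ideal_listener(true_srt_db):
    """Hypothetical listener who is correct whenever the SNR is at or above threshold."""
    return lambda snr_db: snr_db >= true_srt_db


srt, track = adaptive_srt(make_ideal_listener(3.0))
print(srt, track[:6])
```

A real listener responds probabilistically, so the track would wander around the threshold rather than alternate cleanly, but the averaging step is the same.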
Speech and Swallowing in Parkinson’s Disease
Tjaden, Kris
2009-01-01
Dysarthria and dysphagia occur frequently in Parkinson’s disease (PD). Reduced speech intelligibility is a significant functional limitation of dysarthria, and in the case of PD is likely related to articulatory and phonatory impairment. Prosodically-based treatments show the most promise for addressing these deficits as well as for maximizing speech intelligibility. Communication-oriented strategies also may help to enhance mutual understanding between a speaker and listener. Dysphagia in PD can result in serious health issues, including aspiration pneumonia, malnutrition, and dehydration. Early identification of swallowing abnormalities is critical so as to minimize the impact of dysphagia on health status and quality of life. Feeding modifications, compensatory strategies, and therapeutic swallowing techniques all have a role in the management of dysphagia in PD. PMID:19946386
Johari, Karim; Behroozmand, Roozbeh
2017-08-01
Skilled movement is mediated by motor commands executed with extremely fine temporal precision. The question of how the brain incorporates temporal information to perform motor actions has remained unanswered. This study investigated the effect of stimulus temporal predictability on response timing of speech and hand movement. Subjects performed a randomized vowel vocalization or button press task in two counterbalanced blocks in response to temporally-predictable and unpredictable visual cues. Results indicated that speech and hand reaction time was decreased for predictable compared with unpredictable stimuli. This finding suggests that a temporal predictive code is established to capture temporal dynamics of sensory cues in order to produce faster movements in responses to predictable stimuli. In addition, results revealed a main effect of modality, indicating faster hand movement compared with speech. We suggest that this effect is accounted for by the inherent complexity of speech production compared with hand movement. Lastly, we found that movement inhibition was faster than initiation for both hand and speech, suggesting that movement initiation requires a longer processing time to coordinate activities across multiple regions in the brain. These findings provide new insights into the mechanisms of temporal information processing during initiation and inhibition of speech and hand movement. Copyright © 2017 Elsevier B.V. All rights reserved.
New Perspectives on Assessing Amplification Effects
Souza, Pamela E.; Tremblay, Kelly L.
2006-01-01
Clinicians have long been aware of the range of performance variability with hearing aids. Despite improvements in technology, there remain many instances of well-selected and appropriately fitted hearing aids whereby the user reports minimal improvement in speech understanding. This review presents a multistage framework for understanding how a hearing aid affects performance. Six stages are considered: (1) acoustic content of the signal, (2) modification of the signal by the hearing aid, (3) interaction between sound at the output of the hearing aid and the listener's ear, (4) integrity of the auditory system, (5) coding of available acoustic cues by the listener's auditory system, and (6) correct identification of the speech sound. Within this framework, this review describes methodology and research on 2 new assessment techniques: acoustic analysis of speech measured at the output of the hearing aid and auditory evoked potentials recorded while the listener wears hearing aids. Acoustic analysis topics include the relationship between conventional probe microphone tests and probe microphone measurements using speech, appropriate procedures for such tests, and assessment of signal-processing effects on speech acoustics and recognition. Auditory evoked potential topics include an overview of physiologic measures of speech processing and the effect of hearing loss and hearing aids on cortical auditory evoked potential measurements in response to speech. Finally, the clinical utility of these procedures is discussed. PMID:16959734
Musicians change their tune: how hearing loss alters the neural code.
Parbery-Clark, Alexandra; Anderson, Samira; Kraus, Nina
2013-08-01
Individuals with sensorineural hearing loss have difficulty understanding speech, especially in background noise. This deficit remains even when audibility is restored through amplification, suggesting that mechanisms beyond a reduction in peripheral sensitivity contribute to the perceptual difficulties associated with hearing loss. Given that normal-hearing musicians have enhanced auditory perceptual skills, including speech-in-noise perception, coupled with heightened subcortical responses to speech, we aimed to determine whether similar advantages could be observed in middle-aged adults with hearing loss. Results indicate that musicians with hearing loss, despite self-perceptions of average performance for understanding speech in noise, have a greater ability to hear in noise relative to nonmusicians. This is accompanied by more robust subcortical encoding of sound (e.g., stimulus-to-response correlations and response consistency) as well as more resilient neural responses to speech in the presence of background noise (e.g., neural timing). Musicians with hearing loss also demonstrate unique neural signatures of spectral encoding relative to nonmusicians: enhanced neural encoding of the speech-sound's fundamental frequency but not of its upper harmonics. This stands in contrast to previous outcomes in normal-hearing musicians, who have enhanced encoding of the harmonics but not the fundamental frequency. Taken together, our data suggest that although hearing loss modifies a musician's spectral encoding of speech, the musician advantage for perceiving speech in noise persists in a hearing-impaired population by adaptively strengthening underlying neural mechanisms for speech-in-noise perception. Copyright © 2013 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Richard, Gail J.; Hoge, Debra Reichert
Designed for practicing speech-language pathologists, this book discusses different lesser-known syndrome disabilities, pertinent speech-language characteristics, and goals and strategies to begin intervention efforts at a preschool level. Chapters address: (1) Apert syndrome; (2) Beckwith-Wiedemann syndrome; (3) CHARGE syndrome; (4) Cri-du-Chat…
Single and Multiple Microphone Noise Reduction Strategies in Cochlear Implants
Azimi, Behnam; Hu, Yi; Friedland, David R.
2012-01-01
To restore hearing sensation, cochlear implants deliver electrical pulses to the auditory nerve by relying on sophisticated signal processing algorithms that convert acoustic inputs to electrical stimuli. Although individuals fitted with cochlear implants perform well in quiet, in the presence of background noise, the speech intelligibility of cochlear implant listeners is more susceptible to background noise than that of normal hearing listeners. Traditionally, to increase performance in noise, single-microphone noise reduction strategies have been used. More recently, a number of approaches have suggested that speech intelligibility in noise can be improved further by making use of two or more microphones, instead. Processing strategies based on multiple microphones can better exploit the spatial diversity of speech and noise because such strategies rely mostly on spatial information about the relative position of competing sound sources. In this article, we identify and elucidate the most significant theoretical aspects that underpin single- and multi-microphone noise reduction strategies for cochlear implants. More analytically, we focus on strategies of both types that have been shown to be promising for use in current-generation implant devices. We present data from past and more recent studies, and furthermore we outline the direction that future research in the area of noise reduction for cochlear implants could follow. PMID:22923425
Intensive treatment of speech disorders in robin sequence: a case report.
Pinto, Maria Daniela Borro; Pegoraro-Krook, Maria Inês; Andrade, Laura Katarine Félix de; Correa, Ana Paula Carvalho; Rosa-Lugo, Linda Iris; Dutka, Jeniffer de Cássia Rillo
2017-10-23
To describe the speech of a patient with Pierre Robin Sequence (PRS) and severe speech disorders before and after participating in an Intensive Speech Therapy Program (ISTP). The ISTP consisted of two daily sessions of therapy over a 36-week period, resulting in a total of 360 therapy sessions. The sessions included the phases of establishment, generalization, and maintenance. A combination of strategies, such as modified contrast therapy and speech sound perception training, were used to elicit adequate place of articulation. The ISTP addressed correction of place of production of oral consonants and maximization of movement of the pharyngeal walls with a speech bulb reduction program. Therapy targets were addressed at the phonetic level with a gradual increase in the complexity of the productions hierarchically (e.g., syllables, words, phrases, conversation) while simultaneously addressing the velopharyngeal hypodynamism with speech bulb reductions. Re-evaluation after the ISTP revealed normal speech resonance and articulation with the speech bulb. Nasoendoscopic assessment indicated consistent velopharyngeal closure for all oral sounds with the speech bulb in place. Intensive speech therapy, combined with the use of the speech bulb, yielded positive outcomes in the rehabilitation of a clinical case with severe speech disorders associated with velopharyngeal dysfunction in Pierre Robin Sequence.
Neural coding of sound envelope in reverberant environments.
Slama, Michaël C C; Delgutte, Bertrand
2015-03-11
Speech reception depends critically on temporal modulations in the amplitude envelope of the speech signal. Reverberation encountered in everyday environments can substantially attenuate these modulations. To assess the effect of reverberation on the neural coding of amplitude envelope, we recorded from single units in the inferior colliculus (IC) of unanesthetized rabbit using sinusoidally amplitude modulated (AM) broadband noise stimuli presented in simulated anechoic and reverberant environments. Although reverberation degraded both rate and temporal coding of AM in IC neurons, in most neurons, the degradation in temporal coding was smaller than the AM attenuation in the stimulus. This compensation could largely be accounted for by the compressive shape of the modulation input-output function (MIOF), which describes the nonlinear transformation of modulation depth from acoustic stimuli into neural responses. Additionally, in a subset of neurons, the temporal coding of AM was better for reverberant stimuli than for anechoic stimuli having the same modulation depth at the ear. Using hybrid anechoic stimuli that selectively possess certain properties of reverberant sounds, we show that this reverberant advantage is not caused by envelope distortion, static interaural decorrelation, or spectral coloration. Overall, our results suggest that the auditory system may possess dual mechanisms that make the coding of amplitude envelope relatively robust in reverberation: one general mechanism operating for all stimuli with small modulation depths, and another mechanism dependent on very specific properties of reverberant stimuli, possibly the periodic fluctuations in interaural correlation at the modulation frequency. Copyright © 2015 the authors.
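The attenuation of envelope modulations by reverberation, and the way a compressive input-output function can partly offset it, can be illustrated numerically. The sketch below is not the authors' model: it assumes an ideal exponentially decaying room response, for which the textbook modulation transfer function is 1 / sqrt(1 + (2*pi*f_m*T60 / 13.8)^2), and it uses a hypothetical power-law MIOF with an arbitrary exponent of 0.5.

```python
import math


def reverberant_modulation_depth(depth, mod_freq_hz, rt60_s):
    """Modulation depth after ideal exponential reverberation (assumed room model)."""
    x = 2.0 * math.pi * mod_freq_hz * rt60_s / 13.8
    return depth / math.sqrt(1.0 + x * x)


def compressive_miof(depth, exponent=0.5):
    """Hypothetical compressive mapping from stimulus depth to neural modulation."""
    return depth ** exponent


anechoic_depth = 1.0
reverb_depth = reverberant_modulation_depth(anechoic_depth, mod_freq_hz=16.0, rt60_s=0.5)

# Fraction of modulation that survives in the stimulus versus in the modelled response.
stimulus_ratio = reverb_depth / anechoic_depth
neural_ratio = compressive_miof(reverb_depth) / compressive_miof(anechoic_depth)
print(stimulus_ratio, neural_ratio)
```

Because the assumed mapping is compressive, the modelled neural modulation falls by a smaller factor than the stimulus modulation, which is the qualitative effect the abstract attributes to the MIOF.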
Language development at 18 months is related to multimodal communicative strategies at 12 months.
Igualada, Alfonso; Bosch, Laura; Prieto, Pilar
2015-05-01
The present study investigated the degree to which infants' use of simultaneous gesture-speech combinations during controlled social interactions predicts later language development. Nineteen infants participated in a declarative pointing task involving three different social conditions: two experimental conditions (a) available, when the adult was visually attending to the infant but did not attend to the object of reference jointly with the child, and (b) unavailable, when the adult was visually attending to neither the infant nor the object; and (c) a baseline condition, when the adult jointly engaged with the infant's object of reference. At 12 months of age measures related to infants' speech-only productions, pointing-only gestures, and simultaneous pointing-speech combinations were obtained in each of the three social conditions. Each child's lexical and grammatical output was assessed at 18 months of age through parental report. Results revealed a significant interaction between social condition and type of communicative production. Specifically, only simultaneous pointing-speech combinations increased in frequency during the available condition compared to baseline, while no differences were found for speech-only and pointing-only productions. Moreover, simultaneous pointing-speech combinations in the available condition at 12 months positively correlated with lexical and grammatical development at 18 months of age. The ability to selectively use this multimodal communicative strategy to engage the adult in joint attention by drawing his attention toward an unseen event or object reveals 12-month-olds' clear understanding of referential cues that are relevant for language development. This strategy to successfully initiate and maintain joint attention is related to language development as it increases learning opportunities from social interactions. Copyright © 2015 Elsevier Inc. All rights reserved.
Development of a good-quality speech coder for transmission over noisy channels at 2.4 kb/s
NASA Astrophysics Data System (ADS)
Viswanathan, V. R.; Berouti, M.; Higgins, A.; Russell, W.
1982-03-01
This report describes the development, study, and experimental results of a 2.4 kb/s speech coder called harmonic deviations (HDV) vocoder, which transmits good-quality speech over noisy channels with bit-error rates of up to 1%. The HDV coder is based on the linear predictive coding (LPC) vocoder, and it transmits additional information over and above the data transmitted by the LPC vocoder, in the form of deviations between the speech spectrum and the LPC all-pole model spectrum at a selected set of frequencies. At the receiver, the spectral deviations are used to generate the excitation signal for the all-pole synthesis filter. The report describes and compares several methods for extracting the spectral deviations from the speech signal and for encoding them. To limit the bit-rate of the HDV coder to 2.4 kb/s the report discusses several methods including orthogonal transformation and minimum-mean-square-error scalar quantization of log area ratios, two-stage vector-scalar quantization, and variable frame rate transmission. The report also presents the results of speech-quality optimization of the HDV coder at 2.4 kb/s.
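The LPC all-pole model and the log area ratios mentioned in the report above can be sketched with the standard Levinson-Durbin recursion. This is a generic illustration, not the HDV coder: the function names are hypothetical, the predictor uses the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the sign convention for log area ratios varies between texts, so the one used here is an assumption.

```python
import math


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, reflections, error) where a[0] == 1.0, reflections holds the
    reflection coefficients k_1..k_p, and error is the residual energy.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    reflections = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        reflections.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k
    return a, reflections, error


def log_area_ratios(reflections):
    """Map reflection coefficients (|k| < 1) to log area ratios (assumed sign convention)."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in reflections]


# Autocorrelation of an idealised first-order process x[n] = 0.9 x[n-1] + e[n].
r = [1.0, 0.9, 0.81]
a, ks, err = levinson_durbin(r, order=2)
lars = log_area_ratios(ks)
print(a, ks, err, lars)
```

Log area ratios are commonly quantized instead of the predictor coefficients because a bounded error in each ratio keeps the reflection coefficients inside (-1, 1), so the synthesis filter stays stable.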
Zhong, Ziwei; Henry, Kenneth S.; Heinz, Michael G.
2014-01-01
People with sensorineural hearing loss often have substantial difficulty understanding speech under challenging listening conditions. Behavioral studies suggest that reduced sensitivity to the temporal structure of sound may be responsible, but underlying neurophysiological pathologies are incompletely understood. Here, we investigate the effects of noise-induced hearing loss on coding of envelope (ENV) structure in the central auditory system of anesthetized chinchillas. ENV coding was evaluated noninvasively using auditory evoked potentials recorded from the scalp surface in response to sinusoidally amplitude modulated tones with carrier frequencies of 1, 2, 4, and 8 kHz and a modulation frequency of 140 Hz. Stimuli were presented in quiet and in three levels of white background noise. The latency of scalp-recorded ENV responses was consistent with generation in the auditory midbrain. Hearing loss amplified neural coding of ENV at carrier frequencies of 2 kHz and above. This result may reflect enhanced ENV coding from the periphery and/or an increase in the gain of central auditory neurons. In contrast to expectations, hearing loss was not associated with a stronger adverse effect of increasing masker intensity on ENV coding. The exaggerated neural representation of ENV information shown here at the level of the auditory midbrain helps to explain previous findings of enhanced sensitivity to amplitude modulation in people with hearing loss under some conditions. Furthermore, amplified ENV coding may potentially contribute to speech perception problems in people with cochlear hearing loss by acting as a distraction from more salient acoustic cues, particularly in fluctuating backgrounds. PMID:24315815
Xiao, Bo; Huang, Chewei; Imel, Zac E; Atkins, David C; Georgiou, Panayiotis; Narayanan, Shrikanth S
2016-04-01
Scaling up psychotherapy services such as for addiction counseling is a critical societal need. One challenge is ensuring quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy, a key therapy quality index, from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality insurance and therapist training.
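The idea of scoring a session from utterance-level language-model evidence can be illustrated with a very small example. The sketch below is not the authors' system: it swaps in unigram models with add-one smoothing (so it is not a strict maximum-likelihood estimate), the function names are hypothetical, and the training utterances are invented purely for illustration.

```python
import math
from collections import Counter


def train_unigram(utterances):
    """Count words over a list of utterances; returns (counts, total_tokens)."""
    counts = Counter(word for u in utterances for word in u.lower().split())
    return counts, sum(counts.values())


def log_likelihood(utterance, model, vocab_size):
    """Add-one smoothed unigram log-likelihood of one utterance."""
    counts, total = model
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size))
        for w in utterance.lower().split()
    )


def session_score(utterances, high_model, low_model, vocab_size):
    """Mean per-utterance log-likelihood ratio; positive favours 'high empathy'."""
    ratios = [
        log_likelihood(u, high_model, vocab_size) - log_likelihood(u, low_model, vocab_size)
        for u in utterances
    ]
    return sum(ratios) / len(ratios)


# Invented therapist utterances (illustrative only, not study data).
high_examples = ["it sounds like that was hard", "you feel worried about this"]
low_examples = ["you need to stop", "you should do this now"]

high_model = train_unigram(high_examples)
low_model = train_unigram(low_examples)
vocab_size = len(set(high_model[0]) | set(low_model[0]))

print(session_score(["it sounds like you feel worried"], high_model, low_model, vocab_size))
print(session_score(["you need to stop now"], high_model, low_model, vocab_size))
```

Averaging the utterance-level ratios gives one number per session, which could then be thresholded for a high versus low decision or correlated with expert codes.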
Xiao, Bo; Huang, Chewei; Imel, Zac E.; Atkins, David C.; Georgiou, Panayiotis; Narayanan, Shrikanth S.
2016-01-01
Scaling up psychotherapy services such as addiction counseling is a critical societal need. One challenge is ensuring quality of therapy, due to the heavy cost of manual observational assessment. This work proposes a speech technology-based system to automate the assessment of therapist empathy, a key therapy quality index, from audio recordings of the psychotherapy interactions. We designed a speech processing system that includes voice activity detection and diarization modules, and an automatic speech recognizer plus a speaker role matching module to extract the therapist's language cues. We employed Maximum Entropy models, Maximum Likelihood language models, and a Lattice Rescoring method to characterize high vs. low empathic language. We estimated therapy-session level empathy codes using utterance level evidence obtained from these models. Our experiments showed that the fully automated system achieved a correlation of 0.643 between expert annotated empathy codes and machine-derived estimations, and an accuracy of 81% in classifying high vs. low empathy, in comparison to a 0.721 correlation and 86% accuracy in the oracle setting using manual transcripts. The results show that the system provides useful information that can contribute to automatic quality assurance and therapist training. PMID:28286867
The Struggle with Hate Speech. Teaching Strategy.
ERIC Educational Resources Information Center
Bloom, Jennifer
1995-01-01
Discusses the issue of hate-motivated violence and special laws aimed at deterrence. Presents a secondary school lesson to help students define hate speech and understand constitutional issues related to the topic. Includes three student handouts, student learning objectives, instructional procedures, and a discussion guide. (CFR)
Literacy as Commodity: Redistributing the Goods.
ERIC Educational Resources Information Center
Elsasser, Nan; Irvine, Patricia
1992-01-01
A rationale is presented for educational change and the strategies to achieve it. The model of speech communities of Dell Hymes is used to show how language differences are connected to social and economic disparities. Efforts to create new speech communities to overcome inequalities are discussed. (SLD)
The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients
Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine
2016-01-01
Background. Cochlear implants (CIs) have been shown to improve children’s speech recognition over traditional amplification when severe to profound sensorineural hearing loss is present. Despite improvements, understanding speech at low-level intensities or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies are Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer’s default pre-processing strategy for pediatric everyday programs combines ASC+ADRO®. Purpose. The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol utilizing increased threshold (T) levels to ensure access to very low-level sounds. Research Design. This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Study Sample. Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention. Four programs, with the participants’ everyday map, were loaded into the processor with different pre-processing strategies applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis. Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to compare each pre-processing strategy across group data. Critical differences were utilized to determine significant score differences between each pre-processing strategy for individual participants. Results. For CNC words presented at 50 dB SPL, the group data revealed significantly better scores using ASC+ADRO® compared to all other pre-processing conditions, while ASC resulted in poorer scores compared to ADRO® and ASC+ADRO®. Group data for HINT sentences presented in 70 dB SPL of R-Space noise revealed significantly improved scores using ASC and ASC+ADRO® compared to no pre-processing, with ASC+ADRO® scores being better than scores with ADRO® alone. Group data for CNC words presented at 70 dB SPL and adaptive HINT sentences presented in 60 dB SPL of R-Space noise showed no significant difference among conditions. Individual data showed that the pre-processing strategy yielding the best scores varied across measures and participants. Conclusions. Group data reveal an advantage with ASC+ADRO® for speech perception presented at lower levels and in higher levels of background noise. Individual data revealed that the optimal pre-processing strategy varied among participants, indicating that a variety of pre-processing strategies should be explored for each CI user considering his or her performance in challenging listening environments. PMID:26905529
Putter-Katz, Hanna; Adi-Bensaid, Limor; Feldman, Irit; Hildesheimer, Minka
2008-01-01
Twenty children with central auditory processing disorders [(C)APD] were subjected to a structured intervention program of listening skills in quiet and in noise. Their performance was compared to that of a control group of 10 children with (C)APD with no special treatment. Pretests were conducted in quiet and in degraded listening conditions (speech noise and competing speech). The (C)APD management approach was integrative and included top-down and bottom-up strategies. It focused on environmental modifications, remediation techniques, and compensatory strategies. Training was conducted with monosyllabic and polysyllabic words, sentences and phrases in quiet and in noise. Comparisons of pre- and post-management measures indicated an increase in speech recognition performance in background noise and competing speech for the treatment group. This improvement was exhibited for both ears. A significant difference between ears was found following intervention, with the left ear showing improvement in both the short and the long versions of competing sentence tests and the right ear performing better in the long competing sentences only. No changes were documented for the control group. These findings add to a growing body of literature suggesting that interactive auditory training can improve listening skills.
Electrocardiographic anxiety profiles improve speech anxiety.
Kim, Pyoung Won; Kim, Seung Ae; Jung, Keun-Hwa
2012-12-01
The present study set out to determine the effect of electrocardiographic (ECG) feedback on performance in speech anxiety. Forty-six high school students participated in a speech performance educational program. They were randomly divided into two groups, an experimental group with ECG feedback (N = 21) and a control group (N = 25). Feedback was given with video recording in the control group, whereas the experimental group received additional ECG feedback. Speech performance was evaluated by the Korean Broadcasting System (KBS) speech ability test, which assesses 10 different speaking categories. ECG was recorded during rest and speech, together with a video recording of the speech performance. Changes in R-R intervals were used to reflect anxiety profiles. Three trials were performed over the 3-week program. Results showed that the subjects with ECG feedback had a significant improvement in speech performance and anxiety states compared to those in the control group. These findings suggest that visualization of the anxiety profile through ECG feedback can be a better cognitive therapeutic strategy in speech anxiety.
Strategies for distant speech recognition in reverberant environments
NASA Astrophysics Data System (ADS)
Delcroix, Marc; Yoshioka, Takuya; Ogawa, Atsunori; Kubo, Yotaro; Fujimoto, Masakiyo; Ito, Nobutaka; Kinoshita, Keisuke; Espi, Miquel; Araki, Shoko; Hori, Takaaki; Nakatani, Tomohiro
2015-12-01
Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.
Space station interior noise analysis program
NASA Technical Reports Server (NTRS)
Stusnick, E.; Burn, M.
1987-01-01
Documentation is provided for a microcomputer program which was developed to evaluate the effect of the vibroacoustic environment on speech communication inside a space station. The program, entitled Space Station Interior Noise Analysis Program (SSINAP), combines a Statistical Energy Analysis (SEA) prediction of sound and vibration levels within the space station with a speech intelligibility model based on the Modulation Transfer Function and the Speech Transmission Index (MTF/STI). The SEA model provides an effective analysis tool for predicting the acoustic environment based on proposed space station design. The MTF/STI model provides a method for evaluating speech communication in the relatively reverberant and potentially noisy environments that are likely to occur in space stations. The combination of these two models provides a powerful analysis tool for optimizing the acoustic design of space stations from the point of view of speech communications. The mathematical algorithms that SSINAP uses to implement the SEA and MTF/STI models are presented. An appendix provides an explanation of the operation of the program along with details of the program structure and code.
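The MTF/STI idea mentioned in the abstract above can be sketched as follows. This is not the SSINAP code, which the abstract does not reproduce; it is a simplified single-band version of the textbook formulation, assuming an exponential reverberant decay (reverberation time T) and stationary noise (signal-to-noise ratio in dB). The function names and the unweighted average over modulation frequencies are choices made for this example; the full index also weights several octave bands.

```python
import math

# The 14 one-third-octave modulation frequencies conventionally used for the STI.
MOD_FREQS_HZ = (0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5)


def mtf(mod_hz, rt60_s, snr_db):
    """Modulation reduction factor m(F) for reverberation plus steady noise."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_hz * rt60_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def sti_single_band(rt60_s, snr_db):
    """Simplified STI: one band, unweighted mean over modulation frequencies."""
    indices = []
    for f in MOD_FREQS_HZ:
        # Keep m strictly inside (0, 1) so the logarithm below is defined.
        m = min(max(mtf(f, rt60_s, snr_db), 1e-6), 1.0 - 1e-6)
        # Apparent signal-to-noise ratio implied by the modulation loss.
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))
        # Clip to +/-15 dB and map linearly onto a 0..1 transmission index.
        apparent_snr = min(max(apparent_snr, -15.0), 15.0)
        indices.append((apparent_snr + 15.0) / 30.0)
    return sum(indices) / len(indices)
```

Under these assumptions, a dry and quiet space gives an index near 1, a 0 dB signal-to-noise ratio with no reverberation gives 0.5, and a longer reverberation time lowers the index at a fixed noise level.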
Using speech for mode selection in control of multifunctional myoelectric prostheses.
Fang, Peng; Wei, Zheng; Geng, Yanjuan; Yao, Fuan; Li, Guanglin
2013-01-01
Electromyogram (EMG) signals recorded from residual limb muscles are considered suitable control information for motorized prostheses. However, in cases of high-level amputation, the residual muscles are usually limited and may not provide enough EMG information for flexible control of myoelectric prostheses with multiple degrees of freedom. Here, we proposed a control strategy in which speech signals were used as additional information and combined with the EMG signals to realize more flexible control of multifunctional prostheses. In place of the traditional "sequential mode-switching (joint-switching)", the speech signals were used to select a mode (joint) of the prosthetic arm, and the EMG signals were then applied to determine a motion class involved in the selected joint and to execute the motion. Preliminary results from three able-bodied subjects and one transhumeral amputee demonstrated that the proposed strategy could achieve a high mode-selection rate and enhance operation efficiency, suggesting that the strategy may improve the control performance of commercial myoelectric prostheses.
Role of the Speech-Language Pathologist (SLP) in the Head and Neck Cancer Team.
Hansen, Kelly; Chenoweth, Marybeth; Thompson, Heather; Strouss, Alexandra
2018-01-01
While treatments for head and neck cancer are aimed at curing patients of disease, they can have significant short- and long-term negative impacts on speech and swallowing functions. Research demonstrates that early and frequent involvement of Speech-Language Pathologists (SLPs) is beneficial to these functions and overall quality of life for head and neck cancer patients. Strategies and tools to optimize communication and safe swallowing are presented in this chapter.
High-frame-rate full-vocal-tract 3D dynamic speech imaging.
Fu, Maojing; Barlaz, Marissa S; Holtrop, Joseph L; Perry, Jamie L; Kuehn, David P; Shosted, Ryan K; Liang, Zhi-Pei; Sutton, Bradley P
2017-04-01
To achieve high temporal frame rate, high spatial resolution and full-vocal-tract coverage for three-dimensional dynamic speech MRI by using low-rank modeling and sparse sampling. Three-dimensional dynamic speech MRI is enabled by integrating a novel data acquisition strategy and an image reconstruction method with the partial separability model: (a) a self-navigated sparse sampling strategy that accelerates data acquisition by collecting high-nominal-frame-rate cone navigators and imaging data within a single repetition time, and (b) a reconstruction method that recovers high-quality speech dynamics from sparse (k,t)-space data by enforcing joint low-rank and spatiotemporal total variation constraints. The proposed method has been evaluated through in vivo experiments. A nominal temporal frame rate of 166 frames per second (defined based on a repetition time of 5.99 ms) was achieved for an imaging volume covering the entire vocal tract with a spatial resolution of 2.2 × 2.2 × 5.0 mm³. Practical utility of the proposed method was demonstrated via both validation experiments and a phonetics investigation. Three-dimensional dynamic speech imaging is possible with full-vocal-tract coverage, high spatial resolution and high nominal frame rate to provide dynamic speech data useful for phonetic studies. Magn Reson Med 77:1619-1629, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Abdeltawwab, Mohamed M; Khater, Ahmed; El-Anwar, Mohammad W
2016-01-01
The combination of acoustic and electric stimulation as a way to enhance speech recognition performance in cochlear implant (CI) users has generated considerable interest in recent years. The purpose of this study was to evaluate the bimodal advantage of the FS4 speech processing strategy in combination with hearing aids (HA) as a means to improve low-frequency resolution in CI patients. Nineteen postlingual CI adults were selected to participate in this study. All patients wore implants on one side and HA on the contralateral side with residual hearing. Monosyllabic word recognition, speech in noise, and emotion and talker identification were assessed using CI with fine structure processing/FS4 and high-definition continuous interleaved sampling strategies, HA alone, and a combination of CI and HA. The bimodal stimulation showed improvement in speech performance and emotion identification for the question/statement/order tasks, which was statistically significant compared to patients with CI alone, but there were no statistically significant differences in intragender talker discrimination and emotion identification for the happy/angry/neutral tasks. The poorest performance was obtained with HA only, and the difference from the other modalities was statistically significant. The bimodal stimulation showed enhanced speech performance in CI patients, and it mitigates the limitations of electric or acoustic stimulation alone. © 2016 S. Karger AG, Basel.
Filtering, Coding, and Compression with Malvar Wavelets
1993-12-01
speech coding techniques being investigated by the military (38). Imagery: Space imagery often requires adaptive restoration to deblur out-of-focus...and blurred image, find an estimate of the ideal image using a priori information about the blur, noise, and the ideal image" (12). The research for...recording can be described as the original signal convolved with impulses, which appear as echoes in the seismic event. The term deconvolution indicates
Current Controversies in Diagnosis and Management of Cleft Palate and Velopharyngeal Insufficiency
Ysunza, Pablo Antonio; Repetto, Gabriela M.; Pamplona, Maria Carmen; Calderon, Juan F.; Shaheen, Kenneth; Chaiyasate, Konkgrit; Rontal, Matthew
2015-01-01
Background. One of the most controversial topics concerning cleft palate is the diagnosis and treatment of velopharyngeal insufficiency (VPI). Objective. This paper reviews current genetic aspects of cleft palate, imaging diagnosis of VPI, the planning of operations for restoring velopharyngeal function during speech, and strategies for speech pathology treatment of articulation disorders in patients with cleft palate. Materials and Methods. An updated review of the scientific literature concerning genetic aspects of cleft palate was carried out. Current strategies for assessing and treating articulation disorders associated with cleft palate were analyzed. Imaging procedures for assessing velopharyngeal closure during speech were reviewed, including a recent method for performing intraoperative videonasopharyngoscopy. Results. Conclusions from the analysis of genetic aspects of syndromic and nonsyndromic cleft palate and their use in its diagnosis and management are presented. Strategies for classifying and treating articulation disorders in patients with cleft palate are presented. Preliminary results of the use of multiplanar videofluoroscopy as an outpatient procedure and intraoperative endoscopy for the planning of operations which aimed to correct VPI are presented. Conclusion. This paper presents current aspects of the diagnosis and management of patients with cleft palate and VPI including 3 main aspects: genetics and genomics, speech pathology and imaging diagnosis, and surgical management. PMID:26273595
Classroom acoustics and intervention strategies to enhance the learning environment
NASA Astrophysics Data System (ADS)
Savage, Christal
The classroom environment can be an acoustically difficult atmosphere for students to learn effectively, sometimes due in part to poor acoustical properties. Noise and reverberation have a substantial influence on room acoustics and subsequently intelligibility of speech. The American Speech-Language-Hearing Association (ASHA, 1995) developed minimal standards for noise and reverberation in a classroom for the purpose of providing an adequate listening environment. A lack of adherence to these standards may have undesirable consequences, which may lead to poor academic performance. The purpose of this capstone project is to develop a protocol to measure the acoustical properties of reverberation time and noise levels in elementary classrooms and present the educators with strategies to improve the learning environment. Noise level and reverberation will be measured and recorded in seven, unoccupied third grade classrooms in Lincoln Parish in North Louisiana. The recordings will occur at six specific distances in the classroom to simulate teacher and student positions. The recordings will be compared to the American Speech-Language-Hearing Association standards for noise and reverberation. If discrepancies are observed, the primary investigator will serve as an auditory consultant for the school and educators to recommend remediation and intervention strategies to improve these acoustical properties. The hypothesis of the study is that the classroom acoustical properties of noise and reverberation will exceed the American Speech-Language-Hearing Association standards; therefore, the auditory consultant will provide strategies to improve those acoustical properties.
NASA Technical Reports Server (NTRS)
Birch, J. N.; Getzin, N.
1971-01-01
Analog and digital voice coding techniques for application to an L-band satellite-based air traffic control (ATC) system for over-ocean deployment are examined. In addition to performance, the techniques are compared on the basis of cost, size, weight, power consumption, availability, reliability, and multiplexing features. Candidate systems are chosen on the bases of minimum required RF bandwidth and received carrier-to-noise density ratios. A detailed survey of automated and nonautomated intelligibility testing methods and devices is presented and comparisons are given. Subjective evaluation of speech systems by preference tests is considered. Conclusions and recommendations are developed regarding the selection of the voice system. Likewise, conclusions and recommendations are developed for the appropriate use of intelligibility tests, speech quality measurements, and preference tests within the framework of the proposed ATC system.
Reading your own lips: common-coding theory and visual speech perception.
Tye-Murray, Nancy; Spehar, Brent P; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S
2013-02-01
Common-coding theory posits that (1) perceiving an action activates the same representations of motor plans that are activated by actually performing that action, and (2) because of individual differences in the ways that actions are performed, observing recordings of one's own previous behavior activates motor plans to an even greater degree than does observing someone else's behavior. We hypothesized that if observing oneself activates motor plans to a greater degree than does observing others, and if these activated plans contribute to perception, then people should be able to lipread silent video clips of their own previous utterances more accurately than they can lipread video clips of other talkers. As predicted, two groups of participants were able to lipread video clips of themselves, recorded more than two weeks earlier, significantly more accurately than video clips of others. These results suggest that visual input activates speech motor activity that links to word representations in the mental lexicon.
Do perceived context pictures automatically activate their phonological code?
Jescheniak, Jörg D; Oppermann, Frank; Hantsch, Ansgar; Wagner, Valentin; Mädebach, Andreas; Schriefers, Herbert
2009-01-01
Morsella and Miozzo (Morsella, E., & Miozzo, M. (2002). Evidence for a cascade model of lexical access in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 555-563) have reported that the to-be-ignored context pictures become phonologically activated when participants name a target picture, and took this finding as support for cascaded models of lexical retrieval in speech production. In a replication and extension of their experiment in German, we failed to obtain priming effects from context pictures phonologically related to a to-be-named target picture. By contrast, corresponding context words (i.e., the names of the respective pictures) and the same context pictures, when used in an identity condition, did reliably facilitate the naming process. This pattern calls into question the generality of the claim advanced by Morsella and Miozzo that perceptual processing of pictures in the context of a naming task automatically leads to the activation of corresponding lexical-phonological codes.
Small intragenic deletion in FOXP2 associated with childhood apraxia of speech and dysarthria.
Turner, Samantha J; Hildebrand, Michael S; Block, Susan; Damiano, John; Fahey, Michael; Reilly, Sheena; Bahlo, Melanie; Scheffer, Ingrid E; Morgan, Angela T
2013-09-01
Relatively little is known about the neurobiological basis of speech disorders although genetic determinants are increasingly recognized. The first gene for primary speech disorder was FOXP2, identified in a large, informative family with verbal and oral dyspraxia. Subsequently, many de novo and familial cases with a severe speech disorder associated with FOXP2 mutations have been reported. These mutations include sequencing alterations, translocations, uniparental disomy, and genomic copy number variants. We studied eight probands with speech disorder and their families. Family members were phenotyped using a comprehensive assessment of speech, oral motor function, language, literacy skills, and cognition. Coding regions of FOXP2 were screened to identify novel variants. Segregation of the variant was determined in the probands' families. Variants were identified in two probands. One child with severe motor speech disorder had a small de novo intragenic FOXP2 deletion. His phenotype included features of childhood apraxia of speech and dysarthria, oral motor dyspraxia, receptive and expressive language disorder, and literacy difficulties. The other variant was found in a family in two of three family members with stuttering, and also in the mother with oral motor impairment. This variant was considered a benign polymorphism as it was predicted to be non-pathogenic with in silico tools and found in database controls. This is the first report of a small intragenic deletion of FOXP2 that is likely to be the cause of severe motor speech disorder associated with language and literacy problems. Copyright © 2013 Wiley Periodicals, Inc.
Action planning and predictive coding when speaking
Wang, Jun; Mathalon, Daniel H.; Roach, Brian J.; Reilly, James; Keedy, Sarah; Sweeney, John A.; Ford, Judith M.
2014-01-01
Across the animal kingdom, sensations resulting from an animal's own actions are processed differently from sensations resulting from external sources, with self-generated sensations being suppressed. A forward model has been proposed to explain this process across sensorimotor domains. During vocalization, reduced processing of one's own speech is believed to result from a comparison of speech sounds to corollary discharges of intended speech production generated from efference copies of commands to speak. Until now, anatomical and functional evidence validating this model in humans has been indirect. Using EEG with anatomical MRI to facilitate source localization, we demonstrate that inferior frontal gyrus activity during the 300ms before speaking was associated with suppressed processing of speech sounds in auditory cortex around 100ms after speech onset (N1). These findings indicate that an efference copy from speech areas in prefrontal cortex is transmitted to auditory cortex, where it is used to suppress processing of anticipated speech sounds. About 100ms after N1, a subsequent auditory cortical component (P2) was not suppressed during talking. The combined N1 and P2 effects suggest that although sensory processing is suppressed as reflected in N1, perceptual gaps are filled as reflected in the lack of P2 suppression, explaining the discrepancy between sensory suppression and preserved sensory experiences. These findings, coupled with the coherence between relevant brain regions before and during speech, provide new mechanistic understanding of the complex interactions between action planning and sensory processing that provide for differentiated tagging and monitoring of one's own speech, processes disrupted in neuropsychiatric disorders. PMID:24423729
Cracking the Language Code: Neural Mechanisms Underlying Speech Parsing
McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella
2013-01-01
Word segmentation, detecting word boundaries in continuous speech, is a critical aspect of language learning. Previous research in infants and adults demonstrated that a stream of speech can be readily segmented based solely on the statistical and speech cues afforded by the input. Using functional magnetic resonance imaging (fMRI), the neural substrate of word segmentation was examined on-line as participants listened to three streams of concatenated syllables, containing either statistical regularities alone, statistical regularities and speech cues, or no cues. Despite the participants’ inability to explicitly detect differences between the speech streams, neural activity differed significantly across conditions, with left-lateralized signal increases in temporal cortices observed only when participants listened to streams containing statistical regularities, particularly the stream containing speech cues. In a second fMRI study, designed to verify that word segmentation had implicitly taken place, participants listened to trisyllabic combinations that occurred with different frequencies in the streams of speech they just heard (“words,” 45 times; “partwords,” 15 times; “nonwords,” once). Reliably greater activity in left inferior and middle frontal gyri was observed when comparing words with partwords and, to a lesser extent, when comparing partwords with nonwords. Activity in these regions, taken to index the implicit detection of word boundaries, was positively correlated with participants’ rapid auditory processing skills. These findings provide a neural signature of on-line word segmentation in the mature brain and an initial model with which to study developmental changes in the neural architecture involved in processing speech cues during language learning. PMID:16855090
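The abstract above does not say how the statistical regularities were defined. One common formalization in the word segmentation literature, assumed here, is the transitional probability between adjacent syllables, TP(x→y) = count(xy) / count(x). The sketch below computes it for a small stream of concatenated syllables; the three trisyllabic "words" and their ordering are hypothetical and are not the study's stimuli.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Map each adjacent syllable pair (x, y) to count(xy) / count(x)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical trisyllabic words, concatenated without pauses.
WORDS = {
    "A": ("pa", "bi", "ku"),
    "B": ("ti", "bu", "do"),
    "C": ("go", "la", "tu"),
}
ORDER = "ABCACBABCACB"  # illustrative word order, not from the study
stream = [syllable for word in ORDER for syllable in WORDS[word]]

tp = transitional_probabilities(stream)
```

In this toy stream, transitions inside a word (such as "pa" to "bi") have probability 1.0, while transitions across a word boundary (such as "ku" to "ti") have probability 0.5; that dip is the cue a statistical learner could use to place boundaries.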
Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age
Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik
2015-01-01
Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker’s age. Here, we report two experiments on age estimation by “naïve” listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers’ natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60–65 years) speakers in comparison with younger (20–25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40–45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed. PMID:26236259
Testing and Beyond: Strategies and Tools for Evaluating and Assessing Infants and Toddlers
ERIC Educational Resources Information Center
Crais, Elizabeth R.
2011-01-01
Purpose: This article is a condensation of the recent American Speech-Language-Hearing Association (ASHA) document entitled "Roles and Responsibilities of Speech-Language Pathologists in Early Intervention: Guidelines" (ASHA, 2008). The article presents information on recommended and evidence-based practices related to the screening, evaluation,…
The Pardoning of Richard Nixon: A Failure in Motivational Strategy.
ERIC Educational Resources Information Center
Klumpp, James F.; Lukehart, Jeffrey K.
1978-01-01
Discusses the failure of Gerald Ford's speech pardoning Richard Nixon and the relationship between moral and legal themes. The speech failed because Ford did not effectively combine these themes into a single compatible perspective from which his pardon became the appropriate response to the situation. (JMF)
Business Speaks: A Study of the Themes in Speeches by America's Corporate Leaders.
ERIC Educational Resources Information Center
Myers, Robert J.; Kessler, Martha Stout
1980-01-01
Identifies the issues concerning the American business community as reflected in speeches by corporate leaders on government regulation, energy, capital investment, inflation, the public image of business, and international business. Strategies for dealing with these issues include increased social responsibility, influencing government policy,…
Cerebral Palsy and Communication--What Parents Can Do.
ERIC Educational Resources Information Center
Golbin, Arlene, Ed.
Intended for parents of cerebral palsied children, the manual discusses special communication problems that often accompany the condition, and describes various strategies for helping such children communicate. A chapter on positioning for speech diagrams 14 different positions to help facilitate better functioning in many areas, including speech.…
1986-03-01
attributed to insufficient power in the experimental design: Two of the studies that failed to find evidence of sign-based coding when printed words...perception of [p]; so may a lesser amount of silence, insufficient to cue a [p] percept in itself, followed by transitions characteristic of [p] release...posterior pharyngeal wall has become visible through the nasal passage; the Velotrace is inserted using a procedure similar to that used for nasal
Vowel Space Characteristics of Speech Directed to Children With and Without Hearing Loss
Wieland, Elizabeth A.; Burnham, Evamarie B.; Kondaurova, Maria; Bergeson, Tonya R.
2015-01-01
Purpose: This study examined vowel characteristics in adult-directed (AD) and infant-directed (ID) speech to children with hearing impairment who received cochlear implants or hearing aids compared with speech to children with normal hearing. Method: Mothers' AD and ID speech to children with cochlear implants (Study 1, n = 20) or hearing aids (Study 2, n = 11) was compared with mothers' speech to controls matched on age and hearing experience. The first and second formants of vowels /i/, /ɑ/, and /u/ were measured, and vowel space area and dispersion were calculated. Results: In both studies, vowel space was modified in ID compared with AD speech to children with and without hearing loss. Study 1 showed larger vowel space area and dispersion in ID compared with AD speech regardless of infant hearing status. The pattern of effects of ID and AD speech on vowel space characteristics in Study 2 was similar to that in Study 1, but depended partly on children's hearing status. Conclusion: Given previously demonstrated associations between expanded vowel space in ID compared with AD speech and enhanced speech perception skills, this research supports a focus on vowel pronunciation in developing intervention strategies for improving speech-language skills in children with hearing impairment. PMID:25658071
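The abstract above reports vowel space area and dispersion computed from the first and second formants of /i/, /ɑ/, and /u/. The following is a minimal sketch of one common way such measures are computed, assuming the area is that of the triangle spanned by the three corner vowels in the F1-F2 plane and dispersion is the mean distance of the vowels from their centroid. The function names and all formant values are hypothetical; the abstract does not state the exact formulas used.

```python
import math


def vowel_space_area(formants):
    """Area (Hz^2) of the /i/-/a/-/u/ triangle in the F1-F2 plane (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = (formants[v] for v in ("i", "a", "u"))
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def vowel_dispersion(formants):
    """Mean Euclidean distance (Hz) of the vowels from their centroid."""
    points = list(formants.values())
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


# Hypothetical mean (F1, F2) values in Hz, for illustration only.
# The key "a" stands in for the vowel /ɑ/.
ad_speech = {"i": (310, 2500), "a": (800, 1300), "u": (350, 1000)}
id_speech = {"i": (290, 2800), "a": (900, 1350), "u": (330, 900)}

if __name__ == "__main__":
    for label, data in (("AD", ad_speech), ("ID", id_speech)):
        print(label, vowel_space_area(data), round(vowel_dispersion(data), 1))
```

With these made-up numbers the ID triangle is larger than the AD triangle, which is the direction of the effect the abstract describes for Study 1.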
Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S
2012-03-14
Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.
Spectro-temporal cues enhance modulation sensitivity in cochlear implant users
Zheng, Yi; Escabí, Monty; Litovsky, Ruth Y.
2018-01-01
Although speech understanding is highly variable among cochlear implant (CI) subjects, the remarkably high speech recognition performance of many CI users is unexpected and not well understood. Numerous factors, including neural health and degradation of the spectral information in the speech signal of CIs, likely contribute to speech understanding. We studied the ability to use spectro-temporal modulations, which may be critical for speech understanding and discrimination, and hypothesized that CI users adopt a different perceptual strategy than normal-hearing (NH) individuals, whereby they rely more heavily on joint spectro-temporal cues to enhance detection of auditory cues. Modulation detection sensitivity was studied in CI users and NH subjects using broadband “ripple” stimuli that were modulated spectrally, temporally, or jointly, i.e., spectro-temporally. The spectro-temporal modulation transfer functions of CI users and NH subjects were decomposed into spectral and temporal dimensions and compared to those subjects’ spectral-only and temporal-only modulation transfer functions. In CI users, the joint spectro-temporal sensitivity was better than that predicted by spectral-only and temporal-only sensitivity, indicating a heightened spectro-temporal sensitivity. Such an enhancement through the combined integration of spectral and temporal cues was not observed in NH subjects. The unique use of spectro-temporal cues by CI patients can yield benefits for use of cues that are important for speech understanding. This finding has implications for developing sound processing strategies that may rely on joint spectro-temporal modulations to improve speech comprehension of CI users, and the findings of this study may be valuable for developing clinical assessment tools to optimize CI processor performance. PMID:28601530
Donaldson, Gail S; Dawson, Patricia K; Borden, Lamar Z
2011-01-01
Previous studies have confirmed that current steering can increase the number of discriminable pitches available to many cochlear implant (CI) users; however, the ability to perceive additional pitches has not been linked to improved speech perception. The primary goals of this study were to determine (1) whether adult CI users can achieve higher levels of spectral cue transmission with a speech processing strategy that implements current steering (Fidelity120) than with a predecessor strategy (HiRes) and, if so, (2) whether the magnitude of improvement can be predicted from individual differences in place-pitch sensitivity. A secondary goal was to determine whether Fidelity120 supports higher levels of speech recognition in noise than HiRes. A within-subjects repeated measures design evaluated speech perception performance with Fidelity120 relative to HiRes in 10 adult CI users. Subjects used the novel strategy (either HiRes or Fidelity120) for 8 wks during the main study; a subset of five subjects used Fidelity120 for three additional months after the main study. Speech perception was assessed for the spectral cues related to vowel F1 frequency, vowel F2 frequency, and consonant place of articulation; overall transmitted information for vowels and consonants; and sentence recognition in noise. Place-pitch sensitivity was measured for electrode pairs in the apical, middle, and basal regions of the implanted array using a psychophysical pitch-ranking task. With one exception, there was no effect of strategy (HiRes versus Fidelity120) on the speech measures tested, either during the main study (N = 10) or after extended use of Fidelity120 (N = 5). The exception was a small but significant advantage for HiRes over Fidelity120 for consonant perception during the main study. 
Examination of individual subjects' data revealed that 3 of 10 subjects demonstrated improved perception of one or more spectral cues with Fidelity120 relative to HiRes after 8 wks or longer experience with Fidelity120. Another three subjects exhibited initial decrements in spectral cue perception with Fidelity120 at the 8-wk time point; however, evidence from one subject suggested that such decrements may resolve with additional experience. Place-pitch thresholds were inversely related to improvements in vowel F2 frequency perception with Fidelity120 relative to HiRes. However, no relationship was observed between place-pitch thresholds and the other spectral measures (vowel F1 frequency or consonant place of articulation). Findings suggest that Fidelity120 supports small improvements in the perception of spectral speech cues in some Advanced Bionics CI users; however, many users show no clear benefit. Benefits are more likely to occur for vowel spectral cues (related to F1 and F2 frequency) than for consonant spectral cues (related to place of articulation). There was an inconsistent relationship between place-pitch sensitivity and improvements in spectral cue perception with Fidelity120 relative to HiRes. This may partly reflect the small number of sites at which place-pitch thresholds were measured. Contrary to some previous reports, there was no clear evidence that Fidelity120 supports improved sentence recognition in noise.
Gifford, René H; Revit, Lawrence J
2010-01-01
Although cochlear implant patients are achieving increasingly higher levels of performance, speech perception in noise continues to be problematic. The newest generations of implant speech processors are equipped with preprocessing and/or external accessories that are purported to improve listening in noise. Most speech perception measures in the clinical setting, however, do not provide a close approximation to real-world listening environments. To assess speech perception for adult cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE) array in order to determine whether commercially available preprocessing strategies and/or external accessories yield improved sentence recognition in noise. Single-subject, repeated-measures design with two groups of participants: Advanced Bionics and Cochlear Corporation recipients. Thirty-four subjects, ranging in age from 18 to 90 yr (mean 54.5 yr), participated in this prospective study. Fourteen subjects were Advanced Bionics recipients, and 20 subjects were Cochlear Corporation recipients. Speech reception thresholds (SRTs) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the subjects' preferred listening programs as well as with the addition of either Beam preprocessing (Cochlear Corporation) or the T-Mic accessory option (Advanced Bionics). In Experiment 1, adaptive SRTs with the Hearing in Noise Test sentences were obtained for all 34 subjects. For Cochlear Corporation recipients, SRTs were obtained with their preferred everyday listening program as well as with the addition of Focus preprocessing. For Advanced Bionics recipients, SRTs were obtained with the integrated behind-the-ear (BTE) mic as well as with the T-Mic. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the preprocessing strategy or external accessory in reducing the SRT in noise. 
In addition, a standard t-test was run to evaluate effectiveness across manufacturer for improving the SRT in noise. In Experiment 2, 16 of the 20 Cochlear Corporation subjects were reassessed obtaining an SRT in noise using the manufacturer-suggested "Everyday," "Noise," and "Focus" preprocessing strategies. A repeated-measures ANOVA was employed to assess the effects of preprocessing. The primary findings were (i) both Noise and Focus preprocessing strategies (Cochlear Corporation) significantly improved the SRT in noise as compared to Everyday preprocessing, (ii) the T-Mic accessory option (Advanced Bionics) significantly improved the SRT as compared to the BTE mic, and (iii) Focus preprocessing and the T-Mic resulted in similar degrees of improvement that were not found to be significantly different from one another. Options available in current cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise with both Cochlear Corporation and Advanced Bionics systems. For Cochlear Corporation recipients, Focus preprocessing yields the best speech-recognition performance in a complex listening environment; however, it is recommended that Noise preprocessing be used as the new default for everyday listening environments to avoid the need for switching programs throughout the day. For Advanced Bionics recipients, the T-Mic offers significantly improved performance in noise and is recommended for everyday use in all listening environments. American Academy of Audiology.
1990-07-01
changes either in the MFA or in Soviet foreign and defense policy. This situation began to change in May 1986, when Gorbachev gave an unusual speech to the...MFA in which he demanded better performance from Soviet diplomats. Although it was later reported that Gorbachev’s speech contained strong criticism...July 1988 with a sweeping critique of Soviet strategy and military policy since World War II. Subsequent speeches and articles in MFA-controlled
Application of advanced speech technology in manned penetration bombers
NASA Astrophysics Data System (ADS)
North, R.; Lea, W.
1982-03-01
This report documents research on the potential use of speech technology in a manned penetration bomber aircraft (B-52/G and H). The objectives of the project were to analyze the pilot/copilot crewstation tasks over a three-hour-and-forty-minute mission and determine the tasks that would benefit the most from conversion to speech recognition/generation, determine the technological feasibility of each of the identified tasks, and prioritize these tasks based on these criteria. Secondary objectives of the program were to enunciate research strategies in the application of speech technologies in airborne environments, and develop guidelines for briefing user commands on the potential of using speech technologies in the cockpit. The results of this study indicated that for the B-52 crewmember, speech recognition would be most beneficial for retrieving chart and procedural data that are contained in the flight manuals. The technological feasibility assessment indicated that the checklist and procedural retrieval tasks would be highly feasible for a speech recognition system.
Role of N-Methyl-D-Aspartate Receptors in Action-Based Predictive Coding Deficits in Schizophrenia.
Kort, Naomi S; Ford, Judith M; Roach, Brian J; Gunduz-Bruce, Handan; Krystal, John H; Jaeger, Judith; Reinhart, Robert M G; Mathalon, Daniel H
2017-03-15
Recent theoretical models of schizophrenia posit that dysfunction of the neural mechanisms subserving predictive coding contributes to symptoms and cognitive deficits, and this dysfunction is further posited to result from N-methyl-D-aspartate glutamate receptor (NMDAR) hypofunction. Previously, by examining auditory cortical responses to self-generated speech sounds, we demonstrated that predictive coding during vocalization is disrupted in schizophrenia. To test the hypothesized contribution of NMDAR hypofunction to this disruption, we examined the effects of the NMDAR antagonist, ketamine, on predictive coding during vocalization in healthy volunteers and compared them with the effects of schizophrenia. In two separate studies, the N1 component of the event-related potential elicited by speech sounds during vocalization (talk) and passive playback (listen) were compared to assess the degree of N1 suppression during vocalization, a putative measure of auditory predictive coding. In the crossover study, 31 healthy volunteers completed two randomly ordered test days, a saline day and a ketamine day. Event-related potentials during the talk/listen task were obtained before infusion and during infusion on both days, and N1 amplitudes were compared across days. In the case-control study, N1 amplitudes from 34 schizophrenia patients and 33 healthy control volunteers were compared. N1 suppression to self-produced vocalizations was significantly and similarly diminished by ketamine (Cohen's d = 1.14) and schizophrenia (Cohen's d = .85). Disruption of NMDARs causes dysfunction in predictive coding during vocalization in a manner similar to the dysfunction observed in schizophrenia patients, consistent with the theorized contribution of NMDAR hypofunction to predictive coding deficits in schizophrenia. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
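The abstract above quantifies N1 suppression during vocalization and reports group differences as Cohen's d. The following is a minimal sketch of that arithmetic, assuming suppression is taken as the talk amplitude minus the listen amplitude (so a positive value means a smaller, less negative N1 while talking) and that d uses the pooled standard deviation of two independent groups. Both conventions and all numbers are illustrative assumptions; the abstract does not say how its effect sizes were computed, and a crossover design may call for a within-subject variant.

```python
import math
import statistics


def n1_suppression(talk_uv, listen_uv):
    """Per-subject N1 suppression in microvolts (assumed convention: talk minus listen)."""
    return [t - l for t, l in zip(talk_uv, listen_uv)]


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical N1 amplitudes in microvolts, for illustration only.
controls = n1_suppression(talk_uv=[-2.0, -3.0, -2.5, -3.5], listen_uv=[-5.0, -5.5, -4.5, -6.0])
patients = n1_suppression(talk_uv=[-4.0, -4.5, -3.5, -5.0], listen_uv=[-5.0, -5.0, -4.5, -5.5])

if __name__ == "__main__":
    print("controls:", controls)
    print("patients:", patients)
    print("d:", cohens_d(controls, patients))
```

A larger positive d here would mean stronger suppression in the first group than in the second.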
Exarchakis, Georgios; Lücke, Jörg
2017-11-01
Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.
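The abstract above describes sparse coding in which each latent takes one of a finite set of values with a learned prior probability. The following is a minimal sketch of such a generative model, assuming a linear dictionary with Gaussian observation noise. It computes the exact posterior by enumerating every latent state, which is only feasible for a toy problem; the paper's truncated expectation-maximization approach approximates this by restricting attention to a subset of states and is not implemented here. All names and numbers (`VALUES`, `PRIOR`, `W`, `SIGMA`) are hypothetical.

```python
import itertools
import math
import random

VALUES = (0.0, 1.0, 2.0)        # finite set of values each latent may take (assumed)
PRIOR = (0.80, 0.15, 0.05)      # prior probability of each value; sparse since 0 dominates
W = ((1.0, 0.0, 0.5),           # 2 x 3 dictionary: observed dims x latents (hypothetical)
     (0.0, 1.0, 0.5))
SIGMA = 0.1                     # standard deviation of the Gaussian observation noise
H = 3                           # number of latents


def reconstruct(s):
    """Noise-free observation W @ s for a latent state s."""
    return tuple(sum(w * v for w, v in zip(row, s)) for row in W)


def generate(rng):
    """Draw one (latent state, observation) pair from the generative model."""
    s = tuple(rng.choices(VALUES, weights=PRIOR, k=H))
    y = tuple(m + rng.gauss(0.0, SIGMA) for m in reconstruct(s))
    return s, y


def log_joint(s, y):
    """log p(s) + log p(y | s), up to a constant that does not depend on s."""
    log_prior = sum(math.log(PRIOR[VALUES.index(v)]) for v in s)
    sq_err = sum((yd - md) ** 2 for yd, md in zip(y, reconstruct(s)))
    return log_prior - sq_err / (2.0 * SIGMA ** 2)


def posterior(y):
    """Exact posterior over all len(VALUES)**H latent states, by enumeration."""
    states = list(itertools.product(VALUES, repeat=H))
    logs = [log_joint(s, y) for s in states]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]   # subtract max for numerical stability
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


if __name__ == "__main__":
    s_true, y = generate(random.Random(0))
    post = posterior(y)
    print("true state:", s_true, "MAP state:", max(post, key=post.get))
```

In an EM loop, the prior probabilities would be re-estimated from posterior-weighted counts of each value, which is what lets the prior shape be learned without assuming a functional form.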
DOE Office of Scientific and Technical Information (OSTI.GOV)
King, S
The highlights of the many public programs are described and summaries of plenary session speeches are included. Names, addresses, and solar interest codes of conference registrants are included. Eleven technical papers or summaries are included. A separate citation was prepared for each one. (MHR)
Development of a coding form for approach control/pilot voice communications.
DOT National Transportation Integrated Search
1995-05-01
The Aviation Topics Speech Acts Taxonomy (ATSAT) is a tool for categorizing pilot/controller communications according to their purpose and for classifying communication errors. Air traffic controller communications that deviate from FAA Air Traffic C...
Mapping the Speech Code: Cortical Responses Linking the Perception and Production of Vowels
Schuerman, William L.; Meyer, Antje S.; McQueen, James M.
2017-01-01
The acoustic realization of speech is constrained by the physical mechanisms by which it is produced. Yet for speech perception, the degree to which listeners utilize experience derived from speech production has long been debated. In the present study, we examined how sensorimotor adaptation during production may affect perception, and how this relationship may be reflected in early vs. late electrophysiological responses. Participants first performed a baseline speech production task, followed by a vowel categorization task during which EEG responses were recorded. In a subsequent speech production task, half the participants received shifted auditory feedback, leading most to alter their articulations. This was followed by a second, post-training vowel categorization task. We compared changes in vowel production to both behavioral and electrophysiological changes in vowel perception. No differences in phonetic categorization were observed between groups receiving altered or unaltered feedback. However, exploratory analyses revealed correlations between vocal motor behavior and phonetic categorization. EEG analyses revealed correlations between vocal motor behavior and cortical responses in both early and late time windows. These results suggest that participants' recent production behavior influenced subsequent vowel perception. We suggest that the change in perception can be best characterized as a mapping of acoustics onto articulation. PMID:28439232
Altieri, Nicholas; Pisoni, David B.; Townsend, James T.
2012-01-01
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield’s feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration. PMID:21968081
Altieri, Nicholas; Pisoni, David B; Townsend, James T
2011-01-01
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration.
Buchan, Julie N; Paré, Martin; Munhall, Kevin G
2008-11-25
During face-to-face conversation the face provides auditory and visual linguistic information, and also conveys information about the identity of the speaker. This study investigated behavioral strategies involved in gathering visual information while watching talking faces. The effects of varying talker identity and varying the intelligibility of speech (by adding acoustic noise) on gaze behavior were measured with an eyetracker. Varying the intelligibility of the speech by adding noise had a noticeable effect on the location and duration of fixations. When noise was present subjects adopted a vantage point that was more centralized on the face by reducing the frequency of the fixations on the eyes and mouth and lengthening the duration of their gaze fixations on the nose and mouth. Varying talker identity resulted in a more modest change in gaze behavior that was modulated by the intelligibility of the speech. Although subjects generally used similar strategies to extract visual information in both talker variability conditions, when noise was absent there were more fixations on the mouth when viewing a different talker every trial as opposed to the same talker every trial. These findings provide a useful baseline for studies examining gaze behavior during audiovisual speech perception and perception of dynamic faces.
Fine-coarse semantic processing in schizophrenia: a reversed pattern of hemispheric dominance.
Zeev-Wolf, Maor; Goldstein, Abraham; Levkovitz, Yechiel; Faust, Miriam
2014-04-01
Left lateralization for language processing is a feature of neurotypical brains. In individuals with schizophrenia, lack of left lateralization is associated with the language impairments manifested in this population. Beeman's fine-coarse semantic coding model asserts left hemisphere specialization in fine (i.e., conventionalized) semantic coding and right hemisphere specialization in coarse (i.e., non-conventionalized) semantic coding. Applying this model to schizophrenia would suggest that language impairments in this population are a result of greater reliance on coarse semantic coding. We investigated this hypothesis and examined whether a reversed pattern of hemispheric involvement in fine-coarse semantic coding along the time course of activation could be detected in individuals with schizophrenia. Seventeen individuals with schizophrenia and 30 neurotypical participants were presented with two-word expressions of four types: literal, conventional metaphoric, unrelated (exemplars of fine semantic coding) and novel metaphoric (an exemplar of coarse semantic coding). Expressions were separated by either a short (250 ms) or long (750 ms) delay. Findings indicate that whereas during novel metaphor processing, controls displayed a left hemisphere advantage at 250 ms delay and right hemisphere advantage at 750 ms, individuals with schizophrenia displayed the opposite. For conventional metaphoric and unrelated expressions, controls showed left hemisphere advantage across times, while individuals with schizophrenia showed a right hemisphere advantage. Furthermore, whereas individuals with schizophrenia were less accurate than controls at judging literal, conventional metaphoric and unrelated expressions, they were more accurate when judging novel metaphors. Results suggest that individuals with schizophrenia display a reversed pattern of lateralization for semantic coding which causes them to rely more heavily on coarse semantic coding.
Thus, for individuals with schizophrenia, speech situations are always non-conventional, compelling them to constantly seek meanings and prejudicing them toward novel or atypical speech acts. This, in turn, may disadvantage them in conventionalized communication and result in language impairment. Copyright © 2014 Elsevier Ltd. All rights reserved.
Spatial Frequency Requirements and Gaze Strategy in Visual-Only and Audiovisual Speech Perception
ERIC Educational Resources Information Center
Wilson, Amanda H.; Alsius, Agnès; Paré, Martin; Munhall, Kevin G.
2016-01-01
Purpose: The aim of this article is to examine the effects of visual image degradation on performance and gaze behavior in audiovisual and visual-only speech perception tasks. Method: We presented vowel-consonant-vowel utterances visually filtered at a range of frequencies in visual-only, audiovisual congruent, and audiovisual incongruent…
ERIC Educational Resources Information Center
Pillai, Patrick
2000-01-01
Children with chronic ear infections experience a lag time in understanding speech, which inhibits classroom participation and the ability to make friends, and ultimately reduces self-esteem. Difficulty in hearing affects speech and vocabulary development, reading and writing proficiency, and academic performance, and could lead to placement in…
Supporting Children with Speech and Language Difficulties. Supporting Children Series
ERIC Educational Resources Information Center
David Fulton Publishers, 2004
2004-01-01
Off-the-shelf support containing all the vital information practitioners need to know about Speech and Language Difficulties, this book includes: (1) Strategies for developing attention control; (2) Guidance on how to improve language and listening skills; and (3) Ideas for teaching phonological awareness. Following a foreword and an introduction,…
Once More, With Feeling: Reagan and "The Speech" in 1980.
ERIC Educational Resources Information Center
Henry, David
Ronald Reagan's rise from political neophyte to Republican candidate for governor of California in 1966 was characterized by a public relations strategy, which was bolstered by "The Speech," a 30-minute anti-big government, defense-of-freedom message. He presented this message appropriately to each audience to identify himself with…
ERIC Educational Resources Information Center
Perfitt, Ruth
2013-01-01
This article investigates the impact of transitions upon pupils aged 11-14 with speech, language and communication needs, including specific language impairment and autism. The aim is to identify stress factors, examine whether these affect any subgroups in particular and suggest practical strategies to support pupils through transitions. Stress…
The analysis of verbal interaction sequences in dyadic clinical communication: a review of methods.
Connor, Martin; Fletcher, Ian; Salmon, Peter
2009-05-01
To identify methods available for sequential analysis of dyadic verbal clinical communication and to review their methodological and conceptual differences. Critical review, based on literature describing sequential analyses of clinical and other relevant social interaction. Dominant approaches are based on analysis of communication according to its precise position in the series of utterances that constitute event-coded dialogue. For practical reasons, methods focus on very short-term processes, typically the influence of one party's speech on what the other says next. Studies of longer-term influences are rare. Some analyses have statistical limitations, particularly in disregarding heterogeneity between consultations, patients or practitioners. Additional techniques, including ones that can use information about timing and duration of speech from interval coding, are becoming available. There is a danger that constraints of commonly used methods shape research questions and divert researchers from potentially important communication processes, including ones that operate over a longer term than one or two speech turns. Given that no one method can model the complexity of clinical communication, multiple methods, both quantitative and qualitative, are necessary. Broadening the range of methods will allow the current emphasis on exploratory studies to be balanced by tests of hypotheses about clinically important communication processes.
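The review above notes that dominant methods look at the influence of one party's utterance on what the other says next in event-coded dialogue. The following is a minimal sketch of that kind of lag-1 sequential analysis, estimating the conditional probability of each code given the immediately preceding code. The code labels and the example sequence are hypothetical, and the function is an illustration rather than any specific published method.

```python
from collections import Counter, defaultdict


def lag1_transitions(codes):
    """Estimate P(next code | previous code) from one event-coded sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(codes, codes[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for prev, following in counts.items()
    }


# Hypothetical event codes: "P:" marks a patient turn, "D:" a clinician turn.
consultation = [
    "P:cue", "D:explore",
    "P:cue", "D:block",
    "P:cue", "D:explore",
    "P:info", "D:explore",
]

if __name__ == "__main__":
    for prev, following in lag1_transitions(consultation).items():
        print(prev, following)
```

Pooling sequences from many consultations into one table of this kind is exactly where the heterogeneity problem mentioned in the abstract arises, so in practice transitions would usually be estimated per consultation or modeled with consultation-level effects.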
Incorporating Speech Recognition into a Natural User Interface
NASA Technical Reports Server (NTRS)
Chapa, Nicholas
2017-01-01
The Augmented/Virtual Reality (AVR) Lab has been working to study the applicability of recent virtual and augmented reality hardware and software to KSC operations. This includes the Oculus Rift, HTC Vive, Microsoft HoloLens, and Unity game engine. My project in this lab is to integrate voice recognition and voice commands into an easy-to-modify system that can be added to an existing portion of a Natural User Interface (NUI). A NUI is an intuitive and simple-to-use interface incorporating visual, touch, and speech recognition. The inclusion of speech recognition capability will allow users to perform actions or make inquiries using only their voice. The simplicity of needing only to speak to control an on-screen object or enact some digital action means that any user can quickly become accustomed to using this system. Multiple programs were tested for use in a speech command and recognition system. Sphinx4 translates speech to text using a Hidden Markov Model (HMM) based Language Model, an Acoustic Model, and a word Dictionary running on Java. PocketSphinx had similar functionality to Sphinx4 but instead ran on C. However, neither of these programs was ideal, as building a Java or C wrapper slowed performance. The most suitable speech recognition system tested was the Unity Engine Grammar Recognizer. A Context Free Grammar (CFG) structure is written in an XML file to specify the structure of phrases and words that will be recognized by Unity Grammar Recognizer. Using Speech Recognition Grammar Specification (SRGS) 1.0 makes modifying the recognized combinations of words and phrases very simple and quick to do. With SRGS 1.0, semantic information can also be added to the XML file, which allows for even more control over how spoken words and phrases are interpreted by Unity. Additionally, using a CFG with SRGS 1.0 produces Finite State Machine (FSM) functionality, limiting the potential for incorrectly heard words or phrases.
The purpose of my project was to investigate options for a Speech Recognition System. To that end I attempted to integrate Sphinx4 into a user interface. Sphinx4 had great accuracy and is the only free program able to perform offline speech dictation. However, it had a limited dictionary of words that could be recognized, single-syllable words were almost impossible for it to hear, and since it ran on Java it could not be integrated into the Unity-based NUI. PocketSphinx ran much faster than Sphinx4, which would have made it ideal as a plugin to the Unity NUI; unfortunately, creating a C# wrapper for the C code made the program unusable with Unity because the wrapper slowed code execution and class files became unreachable. Unity Grammar Recognizer is the ideal speech recognition interface; it is flexible in recognizing multiple variations of the same command. It is also the most accurate program in recognizing speech because it uses an XML grammar to specify speech structure instead of relying solely on a Dictionary and Language model. The Unity Grammar Recognizer will be used with the NUI for these reasons, and because it is written in C#, which further simplifies the incorporation.
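The idea that a grammar restricts recognition to a finite set of allowed phrases can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Unity C# API or SRGS XML; the toy grammar, the rule names, and the `match_command` helper are all hypothetical and serve only to show how slot-based alternatives expand into a closed phrase set that rejects everything else.

```python
from itertools import product

# Hypothetical toy grammar (not from the report): each rule is a sequence of
# slots, and each slot lists the alternative words allowed at that position.
GRAMMAR = {
    "move": [["move", "shift"], ["the"], ["cube", "sphere"], ["left", "right"]],
    "query": [["what", "which"], ["object"], ["is"], ["selected"]],
}


def expand(rule):
    """Enumerate every phrase a rule accepts (one word chosen per slot)."""
    return {" ".join(words) for words in product(*rule)}


# Map each accepted phrase to the name of the rule that produced it.
PHRASES = {
    phrase: name for name, rule in GRAMMAR.items() for phrase in expand(rule)
}


def match_command(utterance):
    """Return the rule name for an utterance, or None if it is out of grammar."""
    normalized = " ".join(utterance.lower().split())
    return PHRASES.get(normalized)


print(match_command("Move the cube left"))
print(match_command("open the door"))
```

Because anything outside the enumerated phrases is rejected, a misheard word cannot silently trigger an unrelated action, which is the benefit the abstract attributes to the grammar-based approach.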
Whitmore, Ani S; Romski, Mary Ann; Sevcik, Rose A
2014-09-01
This exploratory study examined the potential secondary outcome of an early augmented language intervention that incorporates speech-generating devices (SGD) on motor skill use for children with developmental delays. The data presented are from a longitudinal study by Romski and colleagues. Toddlers in the augmented language interventions were either required (Augmented Communication-Output; AC-O) or not required (Augmented Communication-Input; AC-I) to use the SGD to produce an augmented word. Three standardized assessments and five event-based coding schemes measured the participants' language abilities and motor skills. Toddlers in the AC-O intervention used more developmentally appropriate motor movements and became more accurate when using the SGD to communicate than toddlers in the AC-I intervention. AAC strategies, interventionist/parent support, motor learning opportunities, and physical feedback may all contribute to this secondary benefit of AAC interventions that use devices.
Perceived gender in clear and conversational speech
NASA Astrophysics Data System (ADS)
Booz, Jaime A.
Although many studies have examined acoustic and sociolinguistic differences between male and female speech, the relationship between talker speaking style and perceived gender has not yet been explored. The present study attempts to determine whether clear speech, a style adopted by talkers who perceive some barrier to effective communication, shifts perceptions of femininity for male and female talkers. Much of our understanding of gender perception in voice and speech is based on sustained vowels or single words, eliminating temporal, prosodic, and articulatory cues available in more naturalistic, connected speech. Thus, clear and conversational sentence stimuli, selected from the 41 talkers of the Ferguson Clear Speech Database (Ferguson, 2004) were presented to 17 normal-hearing listeners, aged 18 to 30. They rated the talkers' gender using a visual analog scale with "masculine" and "feminine" endpoints. This response method was chosen to account for within-category shifts of gender perception by allowing nonbinary responses. Mixed-effects regression analysis of listener responses revealed a small but significant effect of speaking style, and this effect was larger for male talkers than female talkers. Because of the high degree of talker variability observed for talker gender, acoustic analyses of these sentences were undertaken to determine the relationship between acoustic changes in clear and conversational speech and perceived femininity. Results of these analyses showed that mean fundamental frequency (fo) and fo standard deviation were significantly correlated to perceived gender for both male and female talkers, and vowel space was significantly correlated only for male talkers. Speaking rate and breathiness measures (CPPS) were not significantly related for either group. Outcomes of this study indicate that adopting a clear speaking style is correlated with increases in perceived femininity.
Although the increase was small, some changes associated with making adjustments to improve speech clarity have a larger impact on perceived femininity than others. Using a clear speech strategy alone may not be sufficient for a male speaker to be perceived as female, but could be used as one of many tools to help speakers achieve more "feminine" speech, in conjunction with more specific strategies targeting the acoustic parameters outlined in this study.
Paper-Based Textbooks with Audio Support for Print-Disabled Students.
Fujiyoshi, Akio; Ohsawa, Akiko; Takaira, Takuya; Tani, Yoshiaki; Fujiyoshi, Mamoru; Ota, Yuko
2015-01-01
Utilizing invisible 2-dimensional codes and digital audio players with a 2-dimensional code scanner, we developed paper-based textbooks with audio support for students with print disabilities, called "multimodal textbooks." Multimodal textbooks can be read with the combination of the two modes: "reading printed text" and "listening to the speech of the text from a digital audio player with a 2-dimensional code scanner." Since multimodal textbooks look the same as regular textbooks and the price of a digital audio player is reasonable (about 30 euro), we think multimodal textbooks are suitable for students with print disabilities in ordinary classrooms.
Nittrouer, Susan; Lowenstein, Joanna H
2007-02-01
It has been reported that children and adults weight differently the various acoustic properties of the speech signal that support phonetic decisions. This finding is generally attributed to the fact that the amount of weight assigned to various acoustic properties by adults varies across languages, and that children have not yet discovered the mature weighting strategies of their own native languages. But an alternative explanation exists: Perhaps children's auditory sensitivities for some acoustic properties of speech are poorer than those of adults, and children cannot categorize stimuli based on properties to which they are not keenly sensitive. The purpose of the current study was to test that hypothesis. Edited-natural, synthetic-formant, and sine wave stimuli were all used, and all were modeled after words with voiced and voiceless final stops. Adults and children (5 and 7 years of age) listened to pairs of stimuli in 5 conditions: 2 involving a temporal property (1 with speech and 1 with nonspeech stimuli) and 3 involving a spectral property (1 with speech and 2 with nonspeech stimuli). An AX discrimination task was used in which a standard stimulus (A) was compared with all other stimuli (X) equal numbers of times (method of constant stimuli). Adults and children had similar difference thresholds (i.e., 50% point on the discrimination function) for 2 of the 3 sets of nonspeech stimuli (1 temporal and 1 spectral), but children's thresholds were greater for both sets of speech stimuli. Results are interpreted as evidence that children's auditory sensitivities are adequate to support weighting strategies similar to those of adults, and so observed differences between children and adults in speech perception cannot be explained by differences in auditory perception. Furthermore, it is concluded that listeners bring expectations to the listening task about the nature of the signals they are hearing based on their experiences with those signals.
Tsai, Ching-Shu; Chen, Vincent Chin-Hung; Yang, Yao-Hsu; Hung, Tai-Hsin; Lu, Mong-Liang; Huang, Kuo-You; Gossop, Michael
2017-01-01
Manifestations of Mycoplasma pneumoniae infection can range from self-limiting upper respiratory symptoms to various neurological complications, including speech and language impairment. However, an association between Mycoplasma pneumoniae infection and speech and language impairment has not been sufficiently explored. In this study, we aim to investigate the association between Mycoplasma pneumoniae infection and subsequent speech and language impairment in a nationwide population-based sample using Taiwan's National Health Insurance Research Database. We identified 5,406 children with Mycoplasma pneumoniae infection (International Classification of Disease, Revision 9, Clinical Modification code 4830) and compared them with 21,624 age-, sex-, urban- and income-matched controls on subsequent speech and language impairment. The mean follow-up interval for all subjects was 6.44 years (standard deviation = 2.42 years); the mean latency period between the initial Mycoplasma pneumoniae infection and presence of speech and language impairment was 1.96 years (standard deviation = 1.64 years). The results showed that Mycoplasma pneumoniae infection was significantly associated with greater incidence of speech and language impairment [hazard ratio (HR) = 1.49, 95% CI: 1.23-1.80]. In addition, the hazard ratio of subsequent speech and language impairment was significantly increased in the groups younger than 6 years old, whereas no significant difference was found in the groups aged 6 years and over (HR = 1.43, 95% CI: 1.09-1.88 for age 0-3 years group; HR = 1.67, 95% CI: 1.25-2.23 for age 4-5 years group; HR = 1.14, 95% CI: 0.54-2.39 for age 6-7 years group; and HR = 0.83, 95% CI: 0.23-2.92 for age 8-18 years group). In conclusion, Mycoplasma pneumoniae infection is temporally associated with incident speech and language impairment.
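The reported hazard ratios and confidence intervals can be read more easily with a short worked calculation. The sketch below assumes the intervals are Wald-type intervals, symmetric on the log scale with a critical value of 1.96; the abstract does not state how they were computed, so this is an illustrative assumption, and the helper names are hypothetical. Under that assumption, an interval that excludes 1 corresponds to an absolute z statistic above 1.96.

```python
import math

# Hazard ratios and 95% CIs as reported in the abstract: (HR, lower, upper).
REPORTED = {
    "all ages": (1.49, 1.23, 1.80),
    "0-3 years": (1.43, 1.09, 1.88),
    "4-5 years": (1.67, 1.25, 2.23),
    "6-7 years": (1.14, 0.54, 2.39),
    "8-18 years": (0.83, 0.23, 2.92),
}


def log_se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR), assuming a CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def z_statistic(hr, lower, upper):
    """Approximate Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / log_se_from_ci(lower, upper)


for group, (hr, lower, upper) in REPORTED.items():
    z = z_statistic(hr, lower, upper)
    excludes_one = lower > 1 or upper < 1
    print(f"{group}: HR={hr}, z={z:.2f}, CI excludes 1: {excludes_one}")
```

The same assumption implies that the geometric mean of the interval bounds should land close to the point estimate, which is a quick consistency check on transcribed values.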
Law, J; Campbell, C; Roulstone, S; Adams, C; Boyle, J
2008-01-01
Receptive language impairment (RLI) is one of the most significant indicators of negative sequelae for children with speech and language disorders. Despite this, relatively little is known about the most effective treatments for these children in the primary school period. To explore the relationship between the reported practice of speech and language practitioners and the underlying rationales for the therapy that they provide. A phenomenological approach was adopted, drawing on the experiences of speech and language practitioners. Practitioners completed a questionnaire relating to their practice for a single child with receptive language impairment within the 5-11 age range, providing details and rationales for three recent therapy activities. The responses of 56 participants were coded. All the children described experienced marked receptive language impairments, in the main associated with expressive language difficulties and/or social communication problems. The relative homogeneity of the presenting symptoms in terms of test performance was not reflected in the highly differentiated descriptions of intervention. One of the key determinants of how therapists described their practice was the child's age. As the child develops, the therapists appeared to shift from a 'skills acquisition' orientation to a 'meta-cognitive' orientation; that is, they move away from teaching specific linguistic behaviours towards teaching children strategies for thinking and using their language. A third of rationales refer to explicit theories, but only half of these refer to the work of specific authors. Many of these were theories of practice rather than theories of deficit, and among those that do cite specific theories, no fewer than 29 different authors were cited, many of whom might best be described as translators of existing theories rather than generators of novel theories.
While theories of the deficit dominate the literature they appear to play a relatively small part in the eclectic practice of speech and language therapists. Theories of therapy may develop relatively independent of theories of deficit. While this may not present a problem for the practitioner, whose principal focus is remediation, it may present a problem for the researcher developing intervention efficacy studies, where the theory of the deficit will need to be well-defined in order to describe both the subgroup of children under investigation and the parameters of the deficit to be targeted in intervention.
NASA Astrophysics Data System (ADS)
Lynch, John T.
1987-02-01
The present technique for coping with fading and burst noise on HF channels used in digital voice communications transmits digital voice only during high S/N time intervals, and speeds up the speech when necessary to avoid conversation-hindering delays. On the basis of informal listening tests, four test conditions were selected in order to characterize those conditions of speech interruption which would render it comprehensible or incomprehensible. One of the test conditions, 2 secs on and 1/2-sec off, yielded test scores comparable to the reference continuous speech case and is a reasonable match to the temporal variations of a disturbed ionosphere.
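The "2 secs on and 1/2-sec off" condition can be turned into a small worked calculation. The sketch below assumes that speech produced over a full on/off cycle must be sent entirely within the on interval, so that no delay accumulates on average; that reading, and the function names, are assumptions for illustration rather than details given in the abstract.

```python
def duty_cycle(on_s, off_s):
    """Fraction of each cycle during which the channel carries speech."""
    return on_s / (on_s + off_s)


def required_speedup(on_s, off_s):
    """Playback-rate factor needed to fit one full cycle of speech into the
    on interval (assumption: no delay may accumulate on average)."""
    return (on_s + off_s) / on_s


on_s, off_s = 2.0, 0.5  # the test condition quoted in the abstract
print(f"duty cycle: {duty_cycle(on_s, off_s):.0%}")
print(f"required speed-up: {required_speedup(on_s, off_s):.2f}x")
```

Under this assumption the channel is usable 80% of the time and the speech would need to be played roughly 1.25 times faster than real time.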
Functional Characterization of the Human Speech Articulation Network.
Basilakos, Alexandra; Smith, Kimberly G; Fillmore, Paul; Fridriksson, Julius; Fedorenko, Evelina
2018-05-01
A number of brain regions have been implicated in articulation, but their precise computations remain debated. Using functional magnetic resonance imaging, we examine the degree of functional specificity of articulation-responsive brain regions to constrain hypotheses about their contributions to speech production. We find that articulation-responsive regions (1) are sensitive to articulatory complexity, but (2) are largely nonoverlapping with nearby domain-general regions that support diverse goal-directed behaviors. Furthermore, premotor articulation regions show selectivity for speech production over some related tasks (respiration control), but not others (nonspeech oral-motor [NSO] movements). This overlap between speech and nonspeech movements concords with electrocorticographic evidence that these regions encode articulators and their states, and with patient evidence whereby articulatory deficits are often accompanied by oral-motor deficits. In contrast, the superior temporal regions show strong selectivity for articulation relative to nonspeech movements, suggesting that these regions play a specific role in speech planning/production. Finally, articulation-responsive portions of posterior inferior frontal gyrus show some selectivity for articulation, in line with the hypothesis that this region prepares an articulatory code that is passed to the premotor cortex. Taken together, these results inform the architecture of the human articulation system.
Yorkston, Kathryn; Baylor, Carolyn; Britton, Deanna
2017-06-22
In this project, we explore the experiences of people who report speech changes associated with Parkinson's disease as they describe taking part in everyday communication situations and report impressions related to speech treatment. Twenty-four community-dwelling adults with Parkinson's disease took part in face-to-face, semistructured interviews. Qualitative research methods were used to code and develop themes related to the interviews. Two major themes emerged. The first, called "speaking," included several subthemes: thinking about speaking, weighing value versus effort, feelings associated with speaking, the environmental context of speaking, and the impact of Parkinson's disease on speaking. The second theme involved "treatment experiences" and included subthemes: choosing not to have treatment, the clinician, drills and exercise, and suggestions for change. From the perspective of participants with Parkinson's disease, speaking is an activity requiring both physical and cognitive effort that takes place in a social context. Although many report positive experiences with speech treatment, some reported dissatisfaction with speech drills and exercises and a lack of focus on the social aspects of communication. Suggestions for improvement include increased focus on the cognitive demands of speaking and on the psychosocial aspects of communication.
Baylor, Carolyn; Britton, Deanna
2017-01-01
Purpose: In this project, we explore the experiences of people who report speech changes associated with Parkinson's disease as they describe taking part in everyday communication situations and report impressions related to speech treatment. Method: Twenty-four community-dwelling adults with Parkinson's disease took part in face-to-face, semistructured interviews. Qualitative research methods were used to code and develop themes related to the interviews. Results: Two major themes emerged. The first, called “speaking,” included several subthemes: thinking about speaking, weighing value versus effort, feelings associated with speaking, the environmental context of speaking, and the impact of Parkinson's disease on speaking. The second theme involved “treatment experiences” and included subthemes: choosing not to have treatment, the clinician, drills and exercise, and suggestions for change. Conclusions: From the perspective of participants with Parkinson's disease, speaking is an activity requiring both physical and cognitive effort that takes place in a social context. Although many report positive experiences with speech treatment, some reported dissatisfaction with speech drills and exercises and a lack of focus on the social aspects of communication. Suggestions for improvement include increased focus on the cognitive demands of speaking and on the psychosocial aspects of communication. PMID:28654939
Miao, Melissa; Power, Emma; O'Halloran, Robyn
2015-01-01
Although clinical practice guidelines can facilitate evidence-based practice and improve the health outcomes of stroke patients, they continue to be underutilised. There is limited research into the reasons for this, especially in speech pathology. This study provides the first in-depth, qualitative examination of the barriers and facilitators that speech pathologists perceive and experience when implementing guidelines. A maximum variation sample of eight speech pathologists participated in a semi-structured interview concerning the implementation of the National Stroke Foundation's Clinical Guidelines for Stroke Management 2010. Interviews were transcribed, thematically analysed and member checked before overall themes were identified. Three main themes and ten subthemes were identified. The first main theme, making implementation explicit, reflected the necessity of accessing and understanding guideline recommendations, and focussing specifically on implementation in context. In the second theme, demand versus ability to change, the size of changes required was compared with available resources and collaboration. The final theme, Speech pathologist motivation to implement guidelines, demonstrated the influence of individual perception of the guidelines and personal commitment to improved practice. Factors affecting implementation are complex, and are not exclusively barriers or facilitators. Some potential implementation strategies are suggested. Further research is recommended. In most Western nations, stroke remains the single greatest cause of disability, including communication and swallowing disabilities. Although adherence to stroke clinical practice guidelines improves stroke patient outcomes, guidelines continue to be underutilised, and the reasons for this are not well understood. 
This is the first in-depth qualitative study identifying the complex barriers and facilitators to guideline implementation as experienced by speech pathologists in stroke care. Suggested implementation strategies include local monitoring of guideline implementation (e.g. team meetings, audits), increasing collaboration on implementation projects (e.g. managerial involvement, networking), and seeking speech pathologist input into guideline development.
Spectro-temporal cues enhance modulation sensitivity in cochlear implant users.
Zheng, Yi; Escabí, Monty; Litovsky, Ruth Y
2017-08-01
Although speech understanding is highly variable among cochlear implant (CI) subjects, the remarkably high speech recognition performance of many CI users is unexpected and not well understood. Numerous factors, including neural health and degradation of the spectral information in the speech signal of CIs, likely contribute to speech understanding. We studied the ability to use spectro-temporal modulations, which may be critical for speech understanding and discrimination, and hypothesize that CI users adopt a different perceptual strategy than normal-hearing (NH) individuals, whereby they rely more heavily on joint spectro-temporal cues to enhance detection of auditory cues. Modulation detection sensitivity was studied in CI users and NH subjects using broadband "ripple" stimuli that were modulated spectrally, temporally, or jointly, i.e., spectro-temporally. The spectro-temporal modulation transfer functions of CI users and NH subjects were decomposed into spectral and temporal dimensions and compared to those subjects' spectral-only and temporal-only modulation transfer functions. In CI users, the joint spectro-temporal sensitivity was better than that predicted by spectral-only and temporal-only sensitivity, indicating a heightened spectro-temporal sensitivity. Such an enhancement through the combined integration of spectral and temporal cues was not observed in NH subjects. The unique use of spectro-temporal cues by CI patients can yield benefits for use of cues that are important for speech understanding. This finding has implications for developing sound processing strategies that may rely on joint spectro-temporal modulations to improve speech comprehension of CI users, and the findings of this study may be valuable for developing clinical assessment tools to optimize CI processor performance.
Lopez-Poveda, Enrique A; Eustaquio-Martín, Almudena
2018-04-01
It has been recently shown that cochlear implant users could enjoy better speech reception in noise and enhanced spatial unmasking with binaural audio processing inspired by the inhibitory effects of the contralateral medial olivocochlear (MOC) reflex on compression [Lopez-Poveda, Eustaquio-Martin, Stohl, Wolford, Schatzer, and Wilson (2016). Ear Hear. 37, e138-e148]. The perceptual evidence supporting those benefits, however, is limited to a few target-interferer spatial configurations and to a particular implementation of contralateral MOC inhibition. Here, the short-term objective intelligibility index is used to (1) objectively demonstrate potential benefits over many more spatial configurations, and (2) investigate if the predicted benefits may be enhanced by using more realistic MOC implementations. Results corroborate the advantages and drawbacks of MOC processing indicated by the previously published perceptual tests. The results also suggest that the benefits may be enhanced and the drawbacks overcome by using longer time constants for the activation and deactivation of inhibition and, to a lesser extent, by using a comparatively greater inhibition in the lower than in the higher frequency channels. Compared to using two functionally independent processors, the better MOC processor improved the signal-to-noise ratio in the two ears between 1 and 6 decibels by enhancing head-shadow effects, and was advantageous for all tested target-interferer spatial configurations.
A Dynamically Focusing Cochlear Implant Strategy Can Improve Vowel Identification in Noise.
Arenberg, Julie G; Parkinson, Wendy S; Litvak, Leonid; Chen, Chen; Kreft, Heather A; Oxenham, Andrew J
2018-03-09
The standard, monopolar (MP) electrode configuration used in commercially available cochlear implants (CI) creates a broad electrical field, which can lead to unwanted channel interactions. Use of more focused configurations, such as tripolar and phased array, has led to mixed results for improving speech understanding. The purpose of the present study was to assess the efficacy of a physiologically inspired configuration called dynamic focusing, using focused tripolar stimulation at low levels and less focused stimulation at high levels. Dynamic focusing may better mimic cochlear excitation patterns in normal acoustic hearing, while reducing the current levels necessary to achieve sufficient loudness at high levels. Twenty postlingually deafened adult CI users participated in the study. Speech perception was assessed in quiet and in a four-talker babble background noise. Speech stimuli were closed-set spondees in noise, and medial vowels at 50 and 60 dB SPL in quiet and in noise. The signal to noise ratio was adjusted individually such that performance was between 40 and 60% correct with the MP strategy. Subjects were fitted with three experimental strategies matched for pulse duration, pulse rate, filter settings, and loudness on a channel-by-channel basis. The strategies included 14 channels programmed in MP, fixed partial tripolar (σ = 0.8), and dynamic partial tripolar (σ at 0.8 at threshold and 0.5 at the most comfortable level). Fifteen minutes of listening experience was provided with each strategy before testing. Sound quality ratings were also obtained. Speech perception performance for vowel identification in quiet at 50 and 60 dB SPL and for spondees in noise was similar for the three tested strategies. However, performance on vowel identification in noise was significantly better for listeners using the dynamic focusing strategy. Sound quality ratings were similar for the three strategies. 
Some subjects obtained more benefit than others, with some individual differences explained by the relation between loudness growth and the rate of change from focused to broader stimulation. These initial results suggest that further exploration of dynamic focusing is warranted. Specifically, optimizing such strategies on an individual basis may lead to improvements in speech perception for more adult listeners and improve how CIs are tailored. Some listeners may also need a longer period of time to acclimate to a new program.
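The focusing coefficient σ used in the strategies above can be made concrete with a short sketch. It follows the common description of partial tripolar stimulation, in which a fraction σ of the active electrode's current returns through the two flanking electrodes (half each) and the remainder returns through an extracochlear ground, so σ = 0 is monopolar and σ = 1 is full tripolar. The linear change of σ between threshold and most comfortable level is an assumption made here for illustration; the abstract gives only the two endpoint values (0.8 and 0.5), and the function names are hypothetical.

```python
def partial_tripolar_currents(center_current, sigma):
    """Split the return current for one partial tripolar channel.

    Returns the current on the active electrode, on each flanking electrode,
    and on the extracochlear ground (negative values are return currents).
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie between 0 and 1")
    return {
        "center": center_current,
        "flank_each": -sigma * center_current / 2.0,
        "extracochlear": -(1.0 - sigma) * center_current,
    }


def dynamic_sigma(level_fraction, sigma_threshold=0.8, sigma_mcl=0.5):
    """Focusing coefficient for a level between threshold (0.0) and most
    comfortable level (1.0). Linear interpolation is an assumption."""
    level_fraction = min(max(level_fraction, 0.0), 1.0)
    return sigma_threshold + (sigma_mcl - sigma_threshold) * level_fraction


for fraction in (0.0, 0.5, 1.0):
    sigma = dynamic_sigma(fraction)
    print(fraction, round(sigma, 3), partial_tripolar_currents(100.0, sigma))
```

In this sketch the currents always sum to zero, and lowering σ at higher levels shifts more of the return current to the extracochlear ground, which corresponds to the "less focused stimulation at high levels" described in the abstract.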
New developments in the management of speech and language disorders.
Harding, Celia; Gourlay, Sara
2008-05-01
Speech and language disorders, which include swallowing difficulties, are usually managed by speech and language therapists. Such a diverse, complex and challenging clinical group of symptoms requires practitioners with detailed knowledge and understanding of research within those areas, as well as the ability to implement appropriate therapy strategies within many environments. These environments range from neonatal units, acute paediatric wards and health centres through to nurseries, schools and children's homes. This paper summarises the key issues that are fundamental to our understanding of this client group.
An Intrinsically Digital Amplification Scheme for Hearing Aids
NASA Astrophysics Data System (ADS)
Blamey, Peter J.; Macfarlane, David S.; Steele, Brenton R.
2005-12-01
Results for linear and wide-dynamic range compression were compared with a new 64-channel digital amplification strategy in three separate studies. The new strategy addresses the requirements of the hearing aid user with efficient computations on an open-platform digital signal processor (DSP). The new amplification strategy is not modeled on prior analog strategies like compression and linear amplification, but uses statistical analysis of the signal to optimize the output dynamic range in each frequency band independently. Using the open-platform DSP processor also provided the opportunity for blind trial comparisons of the different processing schemes in BTE and ITE devices of a high commercial standard. The speech perception scores and questionnaire results show that it is possible to provide improved audibility for sound in many narrow frequency bands while simultaneously improving comfort, speech intelligibility in noise, and sound quality.
Donaldson, Gail S.; Dawson, Patricia K.; Borden, Lamar Z.
2010-01-01
Objectives: Previous studies have confirmed that current steering can increase the number of discriminable pitches available to many CI users; however, the ability to perceive additional pitches has not been linked to improved speech perception. The primary goals of this study were to determine (1) whether adult CI users can achieve higher levels of spectral-cue transmission with a speech processing strategy that implements current steering (Fidelity120) than with a predecessor strategy (HiRes) and, if so, (2) whether the magnitude of improvement can be predicted from individual differences in place-pitch sensitivity. A secondary goal was to determine whether Fidelity120 supports higher levels of speech recognition in noise than HiRes. Design: A within-subjects repeated measures design evaluated speech perception performance with Fidelity120 relative to HiRes in 10 adult CI users. Subjects used the novel strategy (either HiRes or Fidelity120) for 8 weeks during the main study; a subset of five subjects used Fidelity120 for 3 additional months following the main study. Speech perception was assessed for the spectral cues related to vowel F1 frequency (Vow F1), vowel F2 frequency (Vow F2) and consonant place of articulation (Con PLC); overall transmitted information for vowels (Vow STIM) and consonants (Con STIM); and sentence recognition in noise. Place-pitch sensitivity was measured for electrode pairs in the apical, middle and basal regions of the implanted array using a psychophysical pitch-ranking task. Results: With one exception, there was no effect of strategy (HiRes vs. Fidelity120) on the speech measures tested, either during the main study (n=10) or after extended use of Fidelity120 (n=5). The exception was a small but significant advantage for HiRes over Fidelity120 for the Con STIM measure during the main study.
Examination of individual subjects' data revealed that 3 of 10 subjects demonstrated improved perception of one or more spectral cues with Fidelity120 relative to HiRes after 8 weeks or longer experience with Fidelity120. Another 3 subjects exhibited initial decrements in spectral cue perception with Fidelity120 at the 8 week time point; however, evidence from one subject suggested that such decrements may resolve with additional experience. Place-pitch thresholds were inversely related to improvements in Vow F2 perception with Fidelity120 relative to HiRes. However, no relationship was observed between place-pitch thresholds and the other spectral measures (Vow F1 or Con PLC). Conclusions: Findings suggest that Fidelity120 supports small improvements in the perception of spectral speech cues in some Advanced Bionics CI users; however, many users show no clear benefit. Benefits are more likely to occur for vowel spectral cues (related to F1 and F2 frequency) than for consonant spectral cues (related to place of articulation). There was an inconsistent relationship between place-pitch sensitivity and improvements in spectral cue perception with Fidelity120 relative to HiRes. This may partly reflect the small number of sites at which place-pitch thresholds were measured. Contrary to some previous reports, there was no clear evidence that Fidelity120 supports improved sentence recognition in noise. PMID:21084987
Gifford, René H.; Revit, Lawrence J.
2014-01-01
Background Although cochlear implant patients are achieving increasingly higher levels of performance, speech perception in noise continues to be problematic. The newest generations of implant speech processors are equipped with preprocessing and/or external accessories that are purported to improve listening in noise. Most speech perception measures in the clinical setting, however, do not provide a close approximation to real-world listening environments. Purpose To assess speech perception for adult cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE™) array in order to determine whether commercially available preprocessing strategies and/or external accessories yield improved sentence recognition in noise. Research Design Single-subject, repeated-measures design with two groups of participants: Advanced Bionics and Cochlear Corporation recipients. Study Sample Thirty-four subjects, ranging in age from 18 to 90 yr (mean 54.5 yr), participated in this prospective study. Fourteen subjects were Advanced Bionics recipients, and 20 subjects were Cochlear Corporation recipients. Intervention Speech reception thresholds (SRTs) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the subjects’ preferred listening programs as well as with the addition of either Beam™ preprocessing (Cochlear Corporation) or the T-Mic® accessory option (Advanced Bionics). Data Collection and Analysis In Experiment 1, adaptive SRTs with the Hearing in Noise Test sentences were obtained for all 34 subjects. For Cochlear Corporation recipients, SRTs were obtained with their preferred everyday listening program as well as with the addition of Focus preprocessing. For Advanced Bionics recipients, SRTs were obtained with the integrated behind-the-ear (BTE) mic as well as with the T-Mic. 
Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the preprocessing strategy or external accessory in reducing the SRT in noise. In addition, a standard t-test was run to evaluate effectiveness across manufacturer for improving the SRT in noise. In Experiment 2, 16 of the 20 Cochlear Corporation subjects were reassessed obtaining an SRT in noise using the manufacturer-suggested “Everyday,” “Noise,” and “Focus” preprocessing strategies. A repeated-measures ANOVA was employed to assess the effects of preprocessing. Results The primary findings were (i) both Noise and Focus preprocessing strategies (Cochlear Corporation) significantly improved the SRT in noise as compared to Everyday preprocessing, (ii) the T-Mic accessory option (Advanced Bionics) significantly improved the SRT as compared to the BTE mic, and (iii) Focus preprocessing and the T-Mic resulted in similar degrees of improvement that were not found to be significantly different from one another. Conclusion Options available in current cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise with both Cochlear Corporation and Advanced Bionics systems. For Cochlear Corporation recipients, Focus preprocessing yields the best speech-recognition performance in a complex listening environment; however, it is recommended that Noise preprocessing be used as the new default for everyday listening environments to avoid the need for switching programs throughout the day. For Advanced Bionics recipients, the T-Mic offers significantly improved performance in noise and is recommended for everyday use in all listening environments. PMID:20807480
Finch, Emma; Cameron, Ashley; Fleming, Jennifer; Lethlean, Jennifer; Hudson, Kyla; McPhail, Steven
2017-07-01
Aphasia is a common consequence of stroke. Despite receiving specialised training in communication, speech-language pathology students may lack confidence when communicating with People with Aphasia (PWA). This paper reports data from secondary outcome measures from a randomised controlled trial. The aim of the current study was to examine the effects of communication partner training on the communication skills of speech-language pathology students during conversations with PWA. Thirty-eight speech-language pathology students were randomly allocated to trained and untrained groups. The first group received a lecture about communication strategies for communicating with PWA then participated in a conversation with PWA (Trained group), while the second group of students participated in a conversation with the PWA without receiving the lecture (Untrained group). The conversations between the groups were analysed according to the Measure of skill in Supported Conversation (MSC) scales, Measure of Participation in Conversation (MPC) scales, types of strategies used in conversation, and the occurrence and repair of conversation breakdowns. The trained group received significantly higher MSC Revealing Competence scores, used significantly more props, and introduced significantly more new ideas into the conversation than the untrained group. The trained group also used more gesture and writing to facilitate the conversation; however, the difference was not significant. There was no significant difference between the groups according to MSC Acknowledging Competence scores, MPC Interaction or Transaction scores, or in the number of interruptions, minor or major conversation breakdowns, or in the success of strategies initiated to repair the conversation breakdowns. Speech-language pathology students may benefit from participation in communication partner training programs. Copyright © 2017 Elsevier Inc. All rights reserved.
Children's Acoustic and Linguistic Adaptations to Peers with Hearing Impairment
ERIC Educational Resources Information Center
Granlund, Sonia; Hazan, Valerie; Mahon, Merle
2018-01-01
Purpose: This study aims to examine the clear speaking strategies used by older children when interacting with a peer with hearing loss, focusing on both acoustic and linguistic adaptations in speech. Method: The Grid task, a problem-solving task developed to elicit spontaneous interactive speech, was used to obtain a range of global acoustic and…
Enhancing Listener Strategies Using a Payoff Matrix in Speech-on-speech Masking Experiments
2015-09-03
five of the eighteen listeners would have achieved the 1200 point goal, had it been in place for experiment 1. Seven out of the ten listeners in...listeners would revert back to reporting more masker words at negative TMRs in the absence of specific instructions and incentives. The listeners could have
A Controlled Clinical Trial for Stuttering in Persons Aged 9 to 14 Years.
ERIC Educational Resources Information Center
Craig, Ashley; And Others
1996-01-01
This paper presents results of a controlled trial of 3 child stuttering treatment strategies in 97 subjects. All 3 treatments (electromyography feedback, intensive smooth speech, and home-based smooth speech) were very successful in the long term for 70% of the group, with electromyography and home-based treatment appearing to be especially…
Planning of Hiatus-Breaking Inserted /ɹ/ in the Speech of Australian English-Speaking Children
ERIC Educational Resources Information Center
Yuen, Ivan; Cox, Felicity; Demuth, Katherine
2017-01-01
Purpose: Non-rhotic varieties of English often use /ɹ/ insertion as a connected speech process to separate heterosyllabic V1.V2 hiatus contexts. However, there has been little research on children's development of this strategy. This study investigated whether children use /ɹ/ insertion and, if so, whether hiatus-breaking /ɹ/ can be considered…
ERIC Educational Resources Information Center
Cox, Amy Swartz; Clark, Denise M.; Skoning, Stacey N.; Wegner, Theresa M.; Muwana, Florence C.
2015-01-01
This study examined the effects of using sensory, augmentative, and alternative communication (AAC), and supportive communication strategies on the rate and type of communication used by three students with severe speech and motor impairments (SSMI). Using a multiple baseline across behaviour design with sensory and AAC intervention phases,…
ERIC Educational Resources Information Center
Higgins, Maureen B.; And Others
1996-01-01
A study of four children with deafness who had cochlear implants investigated the use of negative intraoral air pressure in articulation, from both the physiological and phonological perspectives. The study showed that the children used speech-production strategies that were different from hearing children and that deviant speech behaviors should…
Relationship Among Signal Fidelity, Hearing Loss, and Working Memory for Digital Noise Suppression.
Arehart, Kathryn; Souza, Pamela; Kates, James; Lunner, Thomas; Pedersen, Michael Syskind
2015-01-01
This study considered speech modified by additive babble combined with noise-suppression processing. The purpose was to determine the relative importance of the signal modifications, individual peripheral hearing loss, and individual cognitive capacity on speech intelligibility and speech quality. The participant group consisted of 31 individuals with moderate high-frequency hearing loss ranging in age from 51 to 89 years (mean = 69.6 years). Speech intelligibility and speech quality were measured using low-context sentences presented in babble at several signal-to-noise ratios. Speech stimuli were processed with a binary mask noise-suppression strategy with systematic manipulations of two parameters (error rate and attenuation values). The cumulative effects of signal modification produced by babble and signal processing were quantified using an envelope-distortion metric. Working memory capacity was assessed with a reading span test. Analysis of variance was used to determine the effects of signal processing parameters on perceptual scores. Hierarchical linear modeling was used to determine the role of degree of hearing loss and working memory capacity in individual listener response to the processed noisy speech. The model also considered improvements in envelope fidelity caused by the binary mask and the degradations to envelope caused by error and noise. The participants showed significant benefits in terms of intelligibility scores and quality ratings for noisy speech processed by the ideal binary mask noise-suppression strategy. This benefit was observed across a range of signal-to-noise ratios and persisted when up to a 30% error rate was introduced into the processing. Average intelligibility scores and average quality ratings were well predicted by an objective metric of envelope fidelity. 
Degree of hearing loss and working memory capacity were significant factors in explaining individual listener's intelligibility scores for binary mask processing applied to speech in babble. Degree of hearing loss and working memory capacity did not predict listeners' quality ratings. The results indicate that envelope fidelity is a primary factor in determining the combined effects of noise and binary mask processing for intelligibility and quality of speech presented in babble noise. Degree of hearing loss and working memory capacity are significant factors in explaining variability in listeners' speech intelligibility scores but not in quality ratings.
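The binary mask processing described in the abstract above can be illustrated with a minimal sketch. The abstract names the two manipulated parameters (error rate and attenuation) but not how they were implemented, so everything below is an assumption: the function names, the 0 dB local criterion, the choice to model errors by flipping a fixed fraction of randomly selected time-frequency units, and the choice to apply attenuation as an amplitude gain floor.

```python
import numpy as np


def ideal_binary_mask(speech_power, noise_power, criterion_db=0.0):
    """Return 1 where the local SNR exceeds criterion_db, else 0.

    Inputs are power values per time-frequency unit (hypothetical layout).
    """
    eps = 1e-12  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (local_snr_db > criterion_db).astype(int)


def flip_mask_units(mask, error_rate, rng):
    """Flip a fixed fraction of randomly chosen units to mimic mask errors."""
    flat = mask.ravel().copy()
    n_flip = int(round(error_rate * flat.size))
    idx = rng.choice(flat.size, size=n_flip, replace=False)
    flat[idx] = 1 - flat[idx]
    return flat.reshape(mask.shape)


def mask_gains(mask, attenuation_db):
    """Amplitude gain per unit: 1 where retained, attenuated elsewhere."""
    floor = 10.0 ** (-attenuation_db / 20.0)
    return np.where(mask == 1, 1.0, floor)


# Toy 2x2 "spectrogram": speech dominates on the diagonal only.
speech = np.array([[4.0, 1.0], [1.0, 4.0]])
noise = np.ones((2, 2))
mask = ideal_binary_mask(speech, noise)
noisy_mask = flip_mask_units(mask, error_rate=0.5, rng=np.random.default_rng(0))
gains = mask_gains(mask, attenuation_db=20.0)
```

In this sketch the gains would be multiplied into the magnitude spectrogram of the noisy mixture before resynthesis; a finite attenuation leaves rejected units audible at a reduced level instead of removing them.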
Monstrey, Jolijn; Deeks, John M.; Macherey, Olivier
2014-01-01
Objective To evaluate a speech-processing strategy in which the lowest frequency channel is conveyed using an asymmetric pulse shape and “phantom stimulation”, where current is injected into one intra-cochlear electrode and where the return current is shared between an intra-cochlear and an extra-cochlear electrode. This strategy is expected to provide more selective excitation of the cochlear apex, compared to a standard strategy where the lowest-frequency channel is conveyed by symmetric pulses in monopolar mode. In both strategies all other channels were conveyed by monopolar stimulation. Design Within-subjects comparison between the two strategies. Four experiments: (1) discrimination between the strategies, controlling for loudness differences, (2) consonant identification, (3) recognition of lowpass-filtered sentences in quiet, (4) sentence recognition in the presence of a competing speaker. Study sample Eight users of the Advanced Bionics CII/Hi-Res 90k cochlear implant. Results Listeners could easily discriminate between the two strategies but no consistent differences in performance were observed. Conclusions The proposed method does not improve speech perception, at least in the short term. PMID:25358027
Carlyon, Robert P; Monstrey, Jolijn; Deeks, John M; Macherey, Olivier
2014-12-01
To evaluate a speech-processing strategy in which the lowest frequency channel is conveyed using an asymmetric pulse shape and "phantom stimulation", where current is injected into one intra-cochlear electrode and where the return current is shared between an intra-cochlear and an extra-cochlear electrode. This strategy is expected to provide more selective excitation of the cochlear apex, compared to a standard strategy where the lowest-frequency channel is conveyed by symmetric pulses in monopolar mode. In both strategies all other channels were conveyed by monopolar stimulation. Within-subjects comparison between the two strategies. Four experiments: (1) discrimination between the strategies, controlling for loudness differences, (2) consonant identification, (3) recognition of lowpass-filtered sentences in quiet, (4) sentence recognition in the presence of a competing speaker. Eight users of the Advanced Bionics CII/Hi-Res 90k cochlear implant. Listeners could easily discriminate between the two strategies but no consistent differences in performance were observed. The proposed method does not improve speech perception, at least in the short term.
Civier, Oren; Tasko, Stephen M.; Guenther, Frank H.
2010-01-01
This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which, if they grow large enough, can cause the motor system to “reset” and repeat the current syllable. This hypothesis is investigated using computer simulations of a “neurally impaired” version of the DIVA model, a neural network model of speech acquisition and production. The model’s outputs are compared to published acoustic data from PWS’ fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject’s speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. PMID:20831971
NASA Astrophysics Data System (ADS)
Hargus Ferguson, Sarah; Kewley-Port, Diane
2002-05-01
Several studies have shown that when a talker is instructed to speak as though talking to a hearing-impaired person, the resulting "clear" speech is significantly more intelligible than typical conversational speech. Recent work in this lab suggests that talkers vary in how much their intelligibility improves when they are instructed to speak clearly. The few studies examining acoustic characteristics of clear and conversational speech suggest that these differing clear speech effects result from different acoustic strategies on the part of individual talkers. However, only two studies to date have directly examined differences among talkers producing clear versus conversational speech, and neither included acoustic analysis. In this project, clear and conversational speech was recorded from 41 male and female talkers aged 18-45 years. A listening experiment demonstrated that for normal-hearing listeners in noise, vowel intelligibility varied widely among the 41 talkers for both speaking styles, as did the magnitude of the speaking style effect. Acoustic analyses using stimuli from a subgroup of talkers shown to have a range of speaking style effects will be used to assess specific acoustic correlates of vowel intelligibility in clear and conversational speech. [Work supported by NIHDCD-02229.]
Perception of temporally modified speech in auditory neuropathy.
Hassan, Dalia Mohamed
2011-01-01
Disrupted auditory nerve activity in auditory neuropathy (AN) significantly impairs the sequential processing of auditory information, resulting in poor speech perception. This study investigated the ability of AN subjects to perceive temporally modified consonant-vowel (CV) pairs and shed light on their phonological awareness skills. Four Arabic CV pairs were selected: /ki/-/gi/, /to/-/do/, /si/-/sti/ and /so/-/zo/. The formant transitions in consonants and the pauses between CV pairs were prolonged. Rhyming, segmentation and blending skills were tested using words at a natural rate of speech and with prolongation of the speech stream. Fourteen adult AN subjects were compared to a matched group of cochlear-impaired patients in their perception of acoustically processed speech. The AN group distinguished the CV pairs at a low speech rate, in particular with modification of the consonant duration. Phonological awareness skills deteriorated in adult AN subjects but improved with prolongation of the speech inter-syllabic time interval. A rehabilitation program for AN should consider temporal modification of speech, training for auditory temporal processing and the use of devices with innovative signal processing schemes. Verbal modifications as well as visual imaging appear to be promising compensatory strategies for remediating the affected phonological processing skills.
Language familiarity modulates relative attention to the eyes and mouth of a talker.
Barenholtz, Elan; Mavica, Lauren; Lewkowicz, David J
2016-02-01
We investigated whether the audiovisual speech cues available in a talker's mouth elicit greater attention when adults have to process speech in an unfamiliar language vs. a familiar language. Participants performed a speech-encoding task while watching and listening to videos of a talker in a familiar language (English) or an unfamiliar language (Spanish or Icelandic). Attention to the mouth increased in monolingual subjects in response to an unfamiliar language condition but did not in bilingual subjects when the task required speech processing. In the absence of an explicit speech-processing task, subjects attended equally to the eyes and mouth in response to both familiar and unfamiliar languages. Overall, these results demonstrate that language familiarity modulates selective attention to the redundant audiovisual speech cues in a talker's mouth in adults. When our findings are considered together with similar findings from infants, they suggest that this attentional strategy emerges very early in life. Copyright © 2015 Elsevier B.V. All rights reserved.
Current Policies and New Directions for Speech-Language Pathology Assistants.
Paul-Brown, Diane; Goldberg, Lynette R
2001-01-01
This article provides an overview of current American Speech-Language-Hearing Association (ASHA) policies for the appropriate use and supervision of speech-language pathology assistants with an emphasis on the need to preserve the role of fully qualified speech-language pathologists in the service delivery system. Seven challenging issues surrounding the appropriate use of speech-language pathology assistants are considered. These include registering assistants and approving training programs; membership in ASHA; discrepancies between state requirements and ASHA policies; preparation for serving diverse multicultural, bilingual, and international populations; supervision considerations; funding and reimbursement for assistants; and perspectives on career-ladder/bachelor-level personnel. The formation of a National Leadership Council is proposed to develop a coordinated strategic plan for addressing these controversial and potentially divisive issues related to speech-language pathology assistants. This council would implement strategies for future development in the areas of professional education pertaining to assistant-level supervision, instruction of assistants, communication networks, policy development, research, and the dissemination/promotion of information regarding assistants.
[Verbal and gestural communication in interpersonal interaction with Alzheimer's disease patients].
Schiaratura, Loris Tamara; Di Pastena, Angela; Askevis-Leherpeux, Françoise; Clément, Sylvain
2015-03-01
Communication can be defined as a verbal and nonverbal exchange of thoughts and emotions. While the verbal communication deficit in Alzheimer's disease is well documented, very little is known about gestural communication, especially in interpersonal situations. This study examines the production of gestures and its relations with verbal aspects of communication. Three patients suffering from moderately severe Alzheimer's disease were compared to three healthy adults. Each participant was given a series of pictures and asked to explain which one she preferred and why. The interpersonal interaction was video recorded. Analyses concerned verbal production (quantity and quality) and gestures. Gestures were either non-representational (i.e., gestures of small amplitude punctuating speech or accentuating some parts of an utterance) or representational (i.e., referring to the object of the speech). Representational gestures were coded as iconic (depicting concrete aspects), metaphoric (depicting abstract meaning) or deictic (pointing toward an object). In comparison with healthy participants, patients showed a decrease in quantity and quality of speech. Nevertheless, their production of gestures was always present. This pattern is in line with the conception that gestures and speech depend on different communicational systems and appears inconsistent with the assumption of a parallel dissolution of gesture and speech. Moreover, analyzing the articulation between verbal and gestural dimensions suggests that representational gestures may compensate for speech deficits. It underlines the importance of gestures in maintaining interpersonal communication.
Marshall, Julie; Goldbart, Juliet; Phillips, Julie
2007-01-01
Parental and speech and language therapist (SLT) explanatory models may affect engagement with speech and language therapy, but there has been a dearth of research in this area. This study investigated parents' and SLTs' views about language development, delay and intervention in pre-school children with language delay. The aims were to describe, explore and explain the thoughts, understandings, perceptions, beliefs, knowledge and feelings held by: a group of parents from East Manchester, UK, whose pre-school children had been referred with suspected language delay; and SLTs working in the same area, in relation to language development, language delay and language intervention. A total of 24 unstructured interviews were carried out: 15 with parents whose children had been referred for speech and language therapy and nine with SLTs who worked with pre-school children. The interviews were transcribed verbatim and coded using Atlas/ti. The data were analysed, subjected to respondent validation, and grounded theories and principled descriptions developed to explain and describe parents' and SLTs' beliefs and views. Parent and SLT data are presented separately. There are commonalities and differences between the parents and the SLTs. Both groups believe that language development and delay are influenced by both external and internal factors. Parents give more weight to the role of gender, imitation and personality and value television and videos, whereas the SLTs value the 'right environment' and listening skills and consider that health/disability and socio-economic factors are important. Parents see themselves as experts on their child and have varied ideas about the role of SLTs, which do not always accord with SLTs' views. The parents and SLTs differ in their views of the roles of imitation and play in intervention. Parents typically try strategies before seeing an SLT. 
These data suggest that parents' ideas vary and that, although parents and SLTs may share some views, there are some important differences. These views have implications for the provision of appropriate services. Although this is a small sample from one group in the UK, the results indicate the need to investigate the views of other groups of parents.
Benkendorf, J L; Prince, M B; Rose, M A; De Fina, A; Hamilton, H E
2001-01-01
To date, research examining adherence to genetic counseling principles has focused on specific counseling activities such as the giving or withholding of information and responding to client requests for advice. We audiotaped 43 prenatal genetic counseling sessions and used data-driven, qualitative, sociolinguistic methodologies to investigate how language choices facilitate or hinder the counseling process. Transcripts of each session were prepared for sociolinguistic analysis of the emergent discourse that included studying conversational style, speaker-listener symmetry, directness, and other interactional patterns. Analysis of our data demonstrates that: 1) indirect speech, marked by the use of hints, hedges, and other politeness strategies, facilitates rapport and mitigates the tension between a client-centered relationship and a counselor-driven agenda; 2) direct speech, or speaking literally, is an effective strategy for providing information and education; and 3) confusion exists between the use of indirect speech and the intent to provide nondirective counseling, especially when facilitating client decision-making. Indirect responses to client questions, such as those that include the phrases "some people" or "most people," helped to maintain counselor neutrality; however, this well-intended indirectness, used to preserve client autonomy, may have obstructed direct explorations of client needs. We argue that the genetic counseling process requires increased flexibility in the use of direct and indirect speech and provide new insights into how "talk" affects the work of genetic counselors.
A novel speech processing algorithm based on harmonicity cues in cochlear implant
NASA Astrophysics Data System (ADS)
Wang, Jian; Chen, Yousheng; Zhang, Zongping; Chen, Yan; Zhang, Weifeng
2017-08-01
This paper proposes a novel speech processing algorithm for cochlear implants that uses harmonicity cues to enhance tonal information in Mandarin Chinese speech recognition. The input speech was filtered by a 4-channel band-pass filter bank. The frequency ranges for the four bands were: 300-621, 621-1285, 1285-2657, and 2657-5499 Hz. In each pass band, temporal envelope and periodicity cues (TEPCs) below 400 Hz were extracted by full-wave rectification and low-pass filtering. The TEPCs modulated a sinusoidal carrier whose frequency was the fundamental frequency (F0) or the harmonic of F0 closest to the center frequency of each band. Signals from each band were combined to obtain the output speech. Mandarin tone, word, and sentence recognition in quiet listening conditions were tested for the extensively used continuous interleaved sampling (CIS) strategy and the novel F0-harmonic algorithm. Results showed that the F0-harmonic algorithm performed consistently better than the CIS strategy in Mandarin tone, word, and sentence recognition. In addition, the sentence recognition rate was higher than the word recognition rate, as a result of contextual information in the sentence. Moreover, recognition was better for tones 3 and 4 than for tones 1 and 2, due to the easily identified features of the former. In conclusion, the F0-harmonic algorithm could enhance tonal information in cochlear implant speech processing due to the use of harmonicity cues, thereby improving Mandarin tone, word, and sentence recognition. Further study will focus on testing the F0-harmonic algorithm in noisy listening conditions.
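The processing chain in the abstract above (band-pass filtering, envelope extraction, modulation of an F0-harmonic carrier, summation) can be sketched as follows. This is an illustrative sketch, not the authors' implementation. Only the band edges and the 400 Hz envelope cutoff come from the abstract; the brick-wall FFT filters, the geometric band center, the constant F0, and all function names are assumptions made for brevity.

```python
import numpy as np

# Band edges (Hz) and envelope cutoff quoted in the abstract.
BAND_EDGES_HZ = [300.0, 621.0, 1285.0, 2657.0, 5499.0]
ENVELOPE_CUTOFF_HZ = 400.0


def fft_filter(x, fs, lo, hi):
    """Brick-wall filter keeping lo <= f <= hi (stand-in for a real filter)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def nearest_harmonic(f0, center_hz):
    """Return the harmonic k * f0 (k >= 1) closest to center_hz."""
    k = max(1, int(round(center_hz / f0)))
    return k * f0


def f0_harmonic_vocoder(x, fs, f0):
    """Resynthesize x from band envelopes on F0-harmonic sine carriers.

    f0 is held constant here (an assumption); real speech would need a
    time-varying F0 track.
    """
    t = np.arange(len(x)) / fs
    out = np.zeros(len(x))
    for lo, hi in zip(BAND_EDGES_HZ[:-1], BAND_EDGES_HZ[1:]):
        band = fft_filter(x, fs, lo, hi)
        # Full-wave rectification followed by low-pass filtering.
        envelope = fft_filter(np.abs(band), fs, 0.0, ENVELOPE_CUTOFF_HZ)
        center = np.sqrt(lo * hi)  # geometric center (assumed definition)
        carrier = np.sin(2.0 * np.pi * nearest_harmonic(f0, center) * t)
        out += envelope * carrier
    return out
```

The quoted band edges are spaced by a nearly constant ratio of about 2.07, which is why a geometric center is used in this sketch. For an F0 of 200 Hz, the first band's center (about 432 Hz) would map to the second harmonic at 400 Hz.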
Tao, Duoduo; Deng, Rui; Jiang, Ye; Galvin, John J; Fu, Qian-Jie; Chen, Bing
2014-01-01
To investigate how auditory working memory relates to speech perception performance by Mandarin-speaking cochlear implant (CI) users. Auditory working memory and speech perception were measured in Mandarin-speaking CI and normal-hearing (NH) participants. Working memory capacity was measured using forward digit span and backward digit span; working memory efficiency was measured using articulation rate. Speech perception was assessed with: (a) word-in-sentence recognition in quiet, (b) word-in-sentence recognition in speech-shaped steady noise at +5 dB signal-to-noise ratio, (c) Chinese disyllable recognition in quiet, (d) Chinese lexical tone recognition in quiet. Self-reported school rank was also collected regarding performance in schoolwork. There was large inter-subject variability in auditory working memory and speech performance for CI participants. Working memory and speech performance were significantly poorer for CI than for NH participants. All three working memory measures were strongly correlated with each other for both CI and NH participants. Partial correlation analyses were performed on the CI data while controlling for demographic variables. Working memory efficiency was significantly correlated only with sentence recognition in quiet when working memory capacity was partialled out. Working memory capacity was correlated with disyllable recognition and school rank when efficiency was partialled out. There was no correlation between working memory and lexical tone recognition in the present CI participants. Mandarin-speaking CI users experience significant deficits in auditory working memory and speech performance compared with NH listeners. The present data suggest that auditory working memory may contribute to CI users' difficulties in speech understanding. 
The present pattern of results with Mandarin-speaking CI users is consistent with previous auditory working memory studies with English-speaking CI users, suggesting that the lexical importance of voice pitch cues (albeit poorly coded by the CI) did not influence the relationship between working memory and speech perception.
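The partial correlation analyses mentioned in the abstract above follow a standard recipe that can be sketched briefly: regress the controlled variables out of both measures, then correlate the residuals. The function name and the synthetic data below are hypothetical and are not taken from the study.

```python
import numpy as np


def partial_correlation(x, y, covariates):
    """Pearson correlation of x and y after regressing out the covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(covariates, dtype=float).reshape(len(x), -1)
    design = np.column_stack([np.ones(len(x)), z])  # intercept + covariates
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic illustration: two measures that share a common driver z.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
x = z + 0.5 * rng.normal(size=200)
y = z + 0.5 * rng.normal(size=200)
raw_r = float(np.corrcoef(x, y)[0, 1])
partial_r = partial_correlation(x, y, z)
```

With a single covariate, the result agrees with the closed-form expression (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), which offers a quick check on the residual-based computation.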
The Effects of Word Length on Memory for Pictures: Evidence for Speech Coding in Young Children.
ERIC Educational Resources Information Center
Hulme, Charles; And Others
1986-01-01
Three experiments demonstrate that children four to ten years old, when presented with a series recall task with pictures of common objects having short or long names, showed consistently better recall of pictures with short names. (HOD)
Decoding Articulatory Features from fMRI Responses in Dorsal Speech Regions.
Correia, Joao M; Jansma, Bernadette M B; Bonte, Milene
2015-11-11
The brain's circuitry for perceiving and producing speech may show a notable level of overlap that is crucial for normal development and behavior. The extent to which sensorimotor integration plays a role in speech perception remains highly controversial, however. Methodological constraints related to experimental designs and analysis methods have so far prevented the disentanglement of neural responses to acoustic versus articulatory speech features. Using a passive listening paradigm and multivariate decoding of single-trial fMRI responses to spoken syllables, we investigated brain-based generalization of articulatory features (place and manner of articulation, and voicing) beyond their acoustic (surface) form in adult human listeners. For example, we trained a classifier to discriminate place of articulation within stop syllables (e.g., /pa/ vs /ta/) and tested whether this training generalizes to fricatives (e.g., /fa/ vs /sa/). This novel approach revealed generalization of place and manner of articulation at multiple cortical levels within the dorsal auditory pathway, including auditory, sensorimotor, motor, and somatosensory regions, suggesting the representation of sensorimotor information. Additionally, generalization of voicing included the right anterior superior temporal sulcus associated with the perception of human voices as well as somatosensory regions bilaterally. Our findings highlight the close connection between brain systems for speech perception and production, and in particular, indicate the availability of articulatory codes during passive speech perception. Sensorimotor integration is central to verbal communication and provides a link between auditory signals of speech perception and motor programs of speech production. It remains highly controversial, however, to what extent the brain's speech perception system actively uses articulatory (motor), in addition to acoustic/phonetic, representations. 
In this study, we examine the role of articulatory representations during passive listening using carefully controlled stimuli (spoken syllables) in combination with multivariate fMRI decoding. Our approach enabled us to disentangle brain responses to acoustic and articulatory speech properties. In particular, it revealed articulatory-specific brain responses of speech at multiple cortical levels, including auditory, sensorimotor, and motor regions, suggesting the representation of sensorimotor information during passive speech perception.
ERIC Educational Resources Information Center
Frank, Jane
A study examined the use of three linguistic features imitating speech found in two groups of direct-mail marketing texts, in order to show differences in the ways U.S.-based and transnational efforts exploit readers' expectations regarding "literate" versus "oral" modes of expression. Two groups of sales letters, 25 U.S.-based…
Common Schools and Uncommon Conversations: Education, Religious Speech and Public Spaces
ERIC Educational Resources Information Center
Strike, Kenneth A.
2007-01-01
This paper discusses the role of religious speech in the public square and the common school. It argues for more openness to political theology than many liberals are willing to grant and for an educational strategy of engagement over one of avoidance. The paper argues that the exclusion of religious debate from the public square has dysfunctional…
Exploring School Life from the Lens of a Child Who Does Not Use Speech to Communicate
ERIC Educational Resources Information Center
Ajodhia-Andrews, Amanda; Berman, Rachel
2009-01-01
The "new sociology of childhood" emphasizes listening to the voices of children when conducting research about their lives. In keeping with this framework, the following case study highlights the use of inclusive strategies and the importance of the researcher's orientation in exploring the perspectives of a child who does not use speech to…
What! I Have To Give a Speech?
ERIC Educational Resources Information Center
Murphy, Thomas J.; Snyder, Kenneth
Noting that fear of public speaking is the most common fear shared by people of all types, this book offers practical, easy-to-follow strategies for confident and effective public speaking. The book discusses the following aspects of public speaking: (1) what to talk about; (2) how to research a topic; (3) how to organize a speech; (4) how to keep…
What! I Have To Give a Speech? 2nd Edition.
ERIC Educational Resources Information Center
Snyder, Kenneth; Murphy, Thomas J.
Noting that fear of public speaking is shared by people of all types, the second edition of this book offers practical, easy-to-follow strategies for confident and effective public speaking. The book discusses the following aspects of public speaking: what to talk about; how to research a topic; how to organize a speech; how to keep an audience…
ERIC Educational Resources Information Center
Charlesworth, Dacia
2010-01-01
Invention deals with the content of a speech, arrangement involves placing the content in an order that is most strategic, style focuses on selecting linguistic devices, such as metaphor, to make the message more appealing, memory assists the speaker in delivering the message correctly, and delivery ideally enables great reception of the message.…
ERIC Educational Resources Information Center
Vihman, Marilyn May
A discussion of word acquisition rates and strategies is based upon a 6-month case study of an Estonian-speaking child who gradually and systematically relaxed phonotactic constraints to allow greater complexity in word production. In addition to the cognitive tools of assimilation and accommodation as described by Piaget, the child used a further…
ERIC Educational Resources Information Center
Ganz, Jennifer B.; Simpson, Richard L.; Corbin-Newsome, Jawanda
2008-01-01
By definition children with autism spectrum disorders (ASD) experience difficulty understanding and using language. Accordingly, visual and picture-based strategies such as the Picture Exchange Communication System (PECS) show promise in ameliorating speech and language deficits. This study reports the results of a multiple baseline across…
Children's Comprehension and Use of Indirect Speech Acts: The Case of Soliciting Praise.
ERIC Educational Resources Information Center
Kovac, Ceil
Children in school cooperate in the evaluation of their products and activities by teachers and other students by calling attention to these products and activities with various language strategies. The requests that someone notice something and/or praise it are the data base for this study. The unmarked speech act for this request type is in the…
ERIC Educational Resources Information Center
Pitman, Tim
2012-01-01
This article analyses the educational visions put forward by Australian federal politicians in their maiden (first) speeches to Parliament. The theoretical approach was a Habermasian-based analysis of the communication strategies adopted by the politicians, meaning that it was not only the content of the speeches but also the delivery that was the…
NASA Astrophysics Data System (ADS)
He, Di; Lim, Boon Pang; Yang, Xuesong; Hasegawa-Johnson, Mark; Chen, Deming
2018-06-01
Most mainstream Automatic Speech Recognition (ASR) systems consider all feature frames equally important. However, acoustic landmark theory is based on a contradictory idea, that some frames are more important than others. Acoustic landmark theory exploits quantal non-linearities in the articulatory-acoustic and acoustic-perceptual relations to define landmark times at which the speech spectrum abruptly changes or reaches an extremum; frames overlapping landmarks have been demonstrated to be sufficient for speech perception. In this work, we conduct experiments on the TIMIT corpus, with both GMM and DNN based ASR systems and find that frames containing landmarks are more informative for ASR than others. We find that altering the level of emphasis on landmarks by re-weighting acoustic likelihood tends to reduce the phone error rate (PER). Furthermore, by leveraging the landmark as a heuristic, one of our hybrid DNN frame dropping strategies maintained a PER within 0.44% of optimal when scoring less than half (45.8% to be precise) of the frames. This hybrid strategy outperforms other non-heuristic-based methods and demonstrates the potential of landmarks for reducing computation.
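Illustrative note on the preceding abstract: the two ideas it describes, re-weighting per-frame acoustic log-likelihoods at landmarks and dropping some non-landmark frames, can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' system: the landmark mask is assumed to be given, and the function names, the weight value, and the "keep every k-th frame" rule are hypothetical choices.

```python
def reweight_loglik(frame_loglik, is_landmark, landmark_weight=1.5):
    """Scale the acoustic log-likelihood of landmark frames by a constant weight.

    frame_loglik: per-frame log-likelihoods (floats).
    is_landmark:  per-frame booleans, True where a frame overlaps a landmark.
    """
    return [ll * landmark_weight if lm else ll
            for ll, lm in zip(frame_loglik, is_landmark)]

def frames_to_score(is_landmark, keep_every=3):
    """Return indices of frames to score: all landmark frames plus every
    keep_every-th frame, so that the rest can be skipped to save computation."""
    return [i for i, lm in enumerate(is_landmark)
            if lm or i % keep_every == 0]

# Toy example (made-up mask): 8 frames, landmarks at frames 2 and 6.
mask = [False, False, True, False, False, False, True, False]
kept = frames_to_score(mask, keep_every=3)
fraction_scored = len(kept) / len(mask)
```

With the toy mask, half of the frames are scored; the abstract reports a comparable operating point (45.8% of frames) for one of its hybrid strategies, although the selection rule used there may differ from this sketch.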
Higgins, Paul; Searchfield, Grant; Coad, Gavin
2012-06-01
The aim of this study was to determine which level-dependent hearing aid digital signal-processing strategy (DSP) participants preferred when listening to music and/or performing a speech-in-noise task. Two receiver-in-the-ear hearing aids were compared: one using 32-channel adaptive dynamic range optimization (ADRO) and the other wide dynamic range compression (WDRC) incorporating dual fast (4 channel) and slow (15 channel) processing. The manufacturers' first-fit settings based on participants' audiograms were used in both cases. Results were obtained from 18 participants on a quick speech-in-noise (QuickSIN; Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) task and for 3 music listening conditions (classical, jazz, and rock). Participants preferred the quality of music and performed better at the QuickSIN task using the hearing aids with ADRO processing. A potential reason for the better performance of the ADRO hearing aids was less fluctuation in output with change in sound dynamics. ADRO processing has advantages for both music quality and speech recognition in noise over the multichannel WDRC processing that was used in the study. Further evaluations of which DSP aspects contribute to listener preference are required.
McCormack, Jane; Easton, Catherine; Morkel-Kingsbury, Lenni
2014-01-01
The landscape of tertiary education is changing. Developments in information and communications technology have created new ways of engaging with subject material and supporting students on their learning journeys. Therefore, it is timely to reconsider and re-imagine the education of speech-language pathology (SLP) students within this new learning space. In this paper, we outline the design of a new Master of Speech Pathology course being offered by distance education at Charles Sturt University (CSU) in Australia. We discuss the catalyst for the course and the commitments of the SLP team at CSU, then describe the curriculum design process, focusing on the pedagogical approach and the learning and teaching strategies utilised in the course delivery. We explain how the learning and teaching strategies have been selected to support students' online learning experience and enable greater interaction between students and the subject material, with students and subject experts, and among student groups. Finally, we highlight some of the challenges in designing and delivering a distance education SLP program and identify future directions for educating students in an online world.
Varnet, Léo; Knoblauch, Kenneth; Serniclaes, Willy; Meunier, Fanny; Hoen, Michel
2015-01-01
Although there is a large consensus regarding the involvement of specific acoustic cues in speech perception, the precise mechanisms underlying the transformation from continuous acoustical properties into discrete perceptual units remain undetermined. This gap in knowledge is partially due to the lack of a turnkey solution for isolating critical speech cues from natural stimuli. In this paper, we describe a psychoacoustic imaging method known as the Auditory Classification Image technique that allows experimenters to estimate the relative importance of time-frequency regions in categorizing natural speech utterances in noise. Importantly, this technique enables the testing of hypotheses on the listening strategies of participants at the group level. We exemplify this approach by identifying the acoustic cues involved in da/ga categorization with two phonetic contexts, Al- or Ar-. The application of Auditory Classification Images to our group of 16 participants revealed significant critical regions on the second and third formant onsets, as predicted by the literature, as well as an unexpected temporal cue on the first formant. Finally, through a cluster-based nonparametric test, we demonstrate that this method is sufficiently sensitive to detect fine modifications of the classification strategies between different utterances of the same phoneme.
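Illustrative note on the preceding abstract: the general idea behind a classification image can be sketched with the simplest reverse-correlation estimate, the difference between the average noise field on trials answered one way and the average on trials answered the other way. This is an assumption-laden sketch and not the estimation procedure of the paper, which the abstract does not detail; the function name, the flattened time-frequency layout, and the toy values are hypothetical.

```python
def classification_image(noise_fields, responses):
    """Mean noise field for response 1 minus mean noise field for response 0.

    noise_fields: one flat list of time-frequency noise values per trial.
    responses:    one 0/1 response per trial (e.g. 0 = "da", 1 = "ga").
    Bins with large absolute values are those where the noise most
    influenced the listener's categorization.
    """
    n_bins = len(noise_fields[0])
    sums = {0: [0.0] * n_bins, 1: [0.0] * n_bins}
    counts = {0: 0, 1: 0}
    for field, resp in zip(noise_fields, responses):
        counts[resp] += 1
        for i, value in enumerate(field):
            sums[resp][i] += value
    return [sums[1][i] / counts[1] - sums[0][i] / counts[0]
            for i in range(n_bins)]

# Toy example (made up): 4 trials, 2 time-frequency bins.
fields = [[1, 0], [3, 0], [0, 2], [0, 4]]
answers = [1, 1, 0, 0]
image = classification_image(fields, answers)
```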
Lee, Shao-Hsuan; Hsiao, Tzu-Yu; Lee, Guo-She
2015-06-01
Sustained vocalizations of vowels [a], [i], and syllable [mə] were collected in twenty normal-hearing individuals. During vocalization, five conditions of different audio-vocal feedback were introduced separately to the speakers, including no masking, wearing supra-aural headphones only, speech-noise masking, high-pass noise masking, and broad-band-noise masking. Power spectral analysis of vocal fundamental frequency (F0) was used to evaluate the modulations of F0, and linear predictive coding was used to estimate the first two formants. The results showed that while the formant frequencies were not significantly shifted, low-frequency modulations (<3 Hz) of F0 significantly increased with reduced audio-vocal feedback across speech sounds and were significantly correlated with auditory awareness of speakers' own voices. For sustained speech production, the motor speech controls on F0 may depend on a feedback mechanism while articulation should rely more on a feedforward mechanism. Power spectral analysis of F0 might be applied to evaluate audio-vocal control for various hearing and neurological disorders in the future.
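Illustrative note on the preceding abstract: the power of low-frequency (<3 Hz) modulations in an F0 contour can be sketched with a plain discrete Fourier transform. This is a minimal sketch, not the authors' pipeline: the F0 contour is assumed to be already extracted at a fixed frame rate, and the function name, frame rate, and normalization are hypothetical choices.

```python
import cmath
import math

def low_freq_f0_power(f0, frame_rate_hz, cutoff_hz=3.0):
    """Sum the squared DFT magnitudes of a mean-removed F0 contour
    over positive-frequency bins below cutoff_hz.

    f0:            F0 values in Hz, one per analysis frame.
    frame_rate_hz: number of F0 frames per second (assumed constant).
    """
    n = len(f0)
    mean = sum(f0) / n
    x = [v - mean for v in f0]  # remove the average pitch, keep the modulation
    total = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * frame_rate_hz / n
        if freq >= cutoff_hz:
            break
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        total += abs(coeff) ** 2 / n ** 2
    return total

# Toy contours (synthetic): 2 s at 50 frames/s, 120 Hz mean F0, 2 Hz-deep wobble.
rate = 50
slow = [120 + 2 * math.sin(2 * math.pi * 1.0 * t / rate) for t in range(100)]
fast = [120 + 2 * math.sin(2 * math.pi * 10.0 * t / rate) for t in range(100)]
slow_power = low_freq_f0_power(slow, rate)  # 1 Hz wobble falls below the cutoff
fast_power = low_freq_f0_power(fast, rate)  # 10 Hz wobble does not
```

A contour that drifts slowly yields a large value and one that only fluctuates quickly yields a value near zero, which mirrors the contrast the abstract draws for reduced audio-vocal feedback.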
Tavano, Alessandro; Pesarin, Anna; Murino, Vittorio; Cristani, Marco
2014-01-01
Individuals with Asperger syndrome/High Functioning Autism fail to spontaneously attribute mental states to the self and others, a life-long phenotypic characteristic known as mindblindness. We hypothesized that mindblindness would affect the dynamics of conversational interaction. Using generative models, in particular Gaussian mixture models and observed influence models, conversations were coded as interacting Markov processes, operating on novel speech/silence patterns, termed Steady Conversational Periods (SCPs). SCPs assume that whenever an agent's process changes state (e.g., from silence to speech), it causes a general transition of the entire conversational process, forcing inter-actant synchronization. SCPs fed into observed influence models, which captured the conversational dynamics of children and adolescents with Asperger syndrome/High Functioning Autism, and age-matched typically developing participants. Analyzing the parameters of the models by means of discriminative classifiers, the dialogs of patients were successfully distinguished from those of control participants. We conclude that meaning-free speech/silence sequences, reflecting inter-actant synchronization, at least partially encode typical and atypical conversational dynamics. This suggests a direct influence of theory of mind abilities onto basic speech initiative behavior.
Rowa, Karen; Paulitzki, Jeffrey R; Ierullo, Maria D; Chiang, Brenda; Antony, Martin M; McCabe, Randi E; Moscovitch, David A
2015-05-01
In the current study, 55 participants with a diagnosis of generalized social anxiety disorder (SAD), 23 participants with a diagnosis of an anxiety disorder other than SAD with no comorbid SAD, and 50 healthy controls completed a speech task as well as self-reported measures of safety behavior use. Speeches were videotaped and coded for global and specific indicators of performance by two raters who were blind to participants' diagnostic status. Results suggested that the objective performance of people with SAD was poorer than that of both control groups, who did not differ from each other. Moreover, self-reported use of safety behaviors during the speech strongly mediated the relationship between diagnostic group and observers' performance ratings. These results are consistent with contemporary cognitive-behavioral and interpersonal models of SAD and suggest that socially anxious individuals' performance skills may be undermined by the use of safety behaviors. These data provide further support for recommendations from previous studies that the elimination of safety behaviors ought to be a priority in cognitive behavioral therapy for SAD.
Spatiotemporal dynamics of auditory attention synchronize with speech
Wöstmann, Malte; Herrmann, Björn; Maess, Burkhard
2016-01-01
Attention plays a fundamental role in selectively processing stimuli in our environment despite distraction. Spatial attention induces increasing and decreasing power of neural alpha oscillations (8–12 Hz) in brain regions ipsilateral and contralateral to the locus of attention, respectively. This study tested whether the hemispheric lateralization of alpha power codes not just the spatial location but also the temporal structure of the stimulus. Participants attended to spoken digits presented to one ear and ignored tightly synchronized distracting digits presented to the other ear. In the magnetoencephalogram, spatial attention induced lateralization of alpha power in parietal, but notably also in auditory cortical regions. This alpha power lateralization was not maintained steadily but fluctuated in synchrony with the speech rate and lagged the time course of low-frequency (1–5 Hz) sensory synchronization. Higher amplitude of alpha power modulation at the speech rate was predictive of a listener’s enhanced performance of stream-specific speech comprehension. Our findings demonstrate that alpha power lateralization is modulated in tune with the sensory input and acts as a spatiotemporal filter controlling the read-out of sensory content. PMID:27001861
Impact of Noise Reduction Algorithm in Cochlear Implant Processing on Music Enjoyment.
Kohlberg, Gavriel D; Mancuso, Dean M; Griffin, Brianna M; Spitzer, Jaclyn B; Lalwani, Anil K
2016-06-01
A noise reduction algorithm (NRA) in the speech processing strategy has a positive impact on speech perception among cochlear implant (CI) listeners. We sought to evaluate the effect of NRA on music enjoyment. Prospective analysis of music enjoyment. Academic medical center. Normal-hearing (NH) adults (N = 16) and CI listeners (N = 9). Subjective rating of music excerpts. NH and CI listeners evaluated a country music piece on three enjoyment modalities: pleasantness, musicality, and naturalness. Participants listened to the original version and 20 modified, less complex versions created by including subsets of musical instruments from the original song. NH participants listened to the segments through CI simulation and CI listeners listened to the segments with their usual speech processing strategy, with and without NRA. Decreasing the number of instruments was significantly associated with an increase in pleasantness and naturalness in both NH and CI subjects (p < 0.05). However, there was no difference in music enjoyment with or without NRA for either NH listeners with CI simulation or CI listeners across all three modalities of pleasantness, musicality, and naturalness (p > 0.05); this was true for the original and the modified music segments with one to three instruments (p > 0.05). NRA does not affect music enjoyment in CI listeners or in NH individuals listening through CI simulation. This suggests that strategies to enhance speech processing will not necessarily have a positive impact on music enjoyment. However, reducing the complexity of music shows promise in enhancing music enjoyment and should be further explored.
Measuring Syntactic Complexity in Spontaneous Spoken Swedish
ERIC Educational Resources Information Center
Roll, Mikael; Frid, Johan; Horne, Merle
2007-01-01
Hesitation disfluencies after phonetically prominent stranded function words are thought to reflect the cognitive coding of complex structures. Speech fragments following the Swedish function word "att" "that" were analyzed syntactically, and divided into two groups: one with "att" in disfluent contexts, and the other with "att" in fluent…
NASA Astrophysics Data System (ADS)
Lightstone, P. C.; Davidson, W. M.
1982-04-01
The military detection assessment laboratory houses an experimental field system which assesses different alarm indicators such as fence disturbance sensors, MILES cables, and microwave Racons. A speech synthesis board which could be interfaced, by means of a computer, to an alarm logger making verbal acknowledgement of alarms possible was purchased. Different products and different types of voice synthesis were analyzed before a linear predictive code device produced by Telesensory Speech Systems of Palo Alto, California was chosen. This device is called the Speech 1000 Board and has a dedicated 8085 processor. A multiplexer card was designed and the Sp 1000 interfaced through the card into a TMS 990/100M Texas Instrument microcomputer. It was also necessary to design the software with the capability of recognizing and flagging an alarm on any 1 of 32 possible lines. The experimental field system was then packaged with a dc power supply, LED indicators, speakers, and switches, and deployed in the field performing reliably.
Effects of prior information on decoding degraded speech: an fMRI study.
Clos, Mareike; Langner, Robert; Meyer, Martin; Oechslin, Mathias S; Zilles, Karl; Eickhoff, Simon B
2014-01-01
Expectations and prior knowledge are thought to support the perceptual analysis of incoming sensory stimuli, as proposed by the predictive-coding framework. The current fMRI study investigated the effect of prior information on brain activity during the decoding of degraded speech stimuli. When prior information enabled the comprehension of the degraded sentences, the left middle temporal gyrus and the left angular gyrus were activated, highlighting a role of these areas in meaning extraction. In contrast, the activation of the left inferior frontal gyrus (area 44/45) appeared to reflect the search for meaningful information in degraded speech material that could not be decoded because of mismatches with the prior information. Our results show that degraded sentences evoke instantaneously different percepts and activation patterns depending on the type of prior information, in line with prediction-based accounts of perception.
Digitised evaluation of speech intelligibility using vowels in maxillectomy patients.
Sumita, Y I; Hattori, M; Murase, M; Elbashti, M E; Taniguchi, H
2018-03-01
Among the functional disabilities that patients face following maxillectomy, speech impairment is a major factor influencing quality of life. Proper rehabilitation of speech, which may include prosthodontic and surgical treatments and speech therapy, requires accurate evaluation of speech intelligibility (SI). A simple, less time-consuming yet accurate evaluation is desirable both for maxillectomy patients and the various clinicians providing maxillofacial treatment. This study sought to determine the utility of digital acoustic analysis of vowels for the prediction of SI in maxillectomy patients, based on a comprehensive understanding of speech production in the vocal tract of maxillectomy patients and its perception. Speech samples were collected from 33 male maxillectomy patients (mean age 57.4 years) in two conditions, without and with a maxillofacial prosthesis, and formant data for the vowels /a/,/e/,/i/,/o/, and /u/ were calculated based on linear predictive coding. The frequency range of formant 2 (F2) was determined by differences between the minimum and maximum frequency. An SI test was also conducted to reveal the relationship between SI score and F2 range. Statistical analyses were applied. F2 range and SI score were significantly different between the two conditions without and with a prosthesis (both P < .0001). F2 range was significantly correlated with SI score in both the conditions (Spearman's r = .843, P < .0001; r = .832, P < .0001, respectively). These findings indicate that calculating the F2 range from 5 vowels has clinical utility for the prediction of SI after maxillectomy.
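Illustrative note on the preceding abstract: the F2 range (maximum minus minimum F2 across the five vowels) and its rank correlation with an intelligibility score can be sketched as follows. This is a minimal sketch with hypothetical function names; the formant values and scores are made-up toy numbers, not data from the study, and the F2 values are assumed to have been measured already.

```python
import math

def f2_range(f2_by_vowel):
    """F2 range in Hz: maximum minus minimum F2 across the given vowels."""
    values = list(f2_by_vowel.values())
    return max(values) - min(values)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Toy speaker (made-up F2 values in Hz).
speaker = {"a": 1200, "e": 1900, "i": 2300, "o": 800, "u": 1000}
speaker_range = f2_range(speaker)

# Toy group (made up): F2 ranges and intelligibility scores for four speakers.
group_ranges = [600, 900, 1200, 1500]
group_scores = [35, 50, 52, 90]
rho = spearman(group_ranges, group_scores)
```

Because the rank correlation depends only on ordering, a monotonic but non-linear relation between F2 range and intelligibility still gives a coefficient of 1 in the toy group.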
Perceptual learning of degraded speech by minimizing prediction error.
Sohoglu, Ediz; Davis, Matthew H
2016-03-22
Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.
How reading differs from object naming at the neuronal level.
Price, C J; McCrory, E; Noppeney, U; Mechelli, A; Moore, C J; Biggio, N; Devlin, J T
2006-01-15
This paper uses whole brain functional neuroimaging in neurologically normal participants to explore how reading aloud differs from object naming in terms of neuronal implementation. In the first experiment, we directly compared brain activation during reading aloud and object naming. This revealed greater activation for reading in bilateral premotor, left posterior superior temporal and precuneus regions. In a second experiment, we segregated the object-naming system into object recognition and speech production areas by factorially manipulating the presence or absence of objects (pictures of objects or their meaningless scrambled counterparts) with the presence or absence of speech production (vocal vs. finger press responses). This demonstrated that the areas associated with speech production (object naming and repetitively saying "OK" to meaningless scrambled pictures) corresponded exactly to the areas where responses were higher for reading aloud than object naming in Experiment 1. Collectively the results suggest that, relative to object naming, reading increases the demands on shared speech production processes. At a cognitive level, enhanced activation for reading in speech production areas may reflect the multiple and competing phonological codes that are generated from the sublexical parts of written words. At a neuronal level, it may reflect differences in the speed with which different areas are activated and integrate with one another.
NASA Astrophysics Data System (ADS)
Liberman, A. M.
1983-09-01
This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: The association between comprehension of spoken sentences and early reading ability: The role of phonetic representation; Phonetic coding and order memory in relation to reading proficiency: A comparison of short-term memory for temporal and spatial order information; Exploring the oral and written language errors made by language disabled children; Perceiving phonetic events; Converging evidence in support of common dynamical principles for speech and movement coordination; Phase transitions and critical behavior in human bimanual coordination; Timing and coarticulation for alveolo-palatals and sequences of alveolar +J in Catalan; V-to-C coarticulation in Catalan VCV sequences: An articulatory and acoustical study; Prosody and the /S/-/c/ distinction; Intersections of tone and intonation in Thai; Simultaneous measurements of vowels produced by a hearing-impaired speaker; Extending formant transitions may not improve aphasics' perception of stop consonant place of articulation; Against a role of chirp identification in duplex perception; Further evidence for the role of relative timing in speech: A reply to Barry; Review (Phonological intervention: Concepts and procedures); and Review (Temporal variables in speech).
Temporal order processing of syllables in the left parietal lobe.
Moser, Dana; Baker, Julie M; Sanchez, Carmen E; Rorden, Chris; Fridriksson, Julius
2009-10-07
Speech processing requires the temporal parsing of syllable order. Individuals suffering from posterior left hemisphere brain injury often exhibit temporal processing deficits as well as language deficits. Although the right posterior inferior parietal lobe has been implicated in temporal order judgments (TOJs) of visual information, there is limited evidence to support the role of the left inferior parietal lobe (IPL) in processing syllable order. The purpose of this study was to examine whether the left inferior parietal lobe is recruited during temporal order judgments of speech stimuli. Functional magnetic resonance imaging data were collected on 14 normal participants while they completed the following forced-choice tasks: (1) syllable order of multisyllabic pseudowords, (2) syllable identification of single syllables, and (3) gender identification of both multisyllabic and monosyllabic speech stimuli. Results revealed increased neural recruitment in the left inferior parietal lobe when participants made judgments about syllable order compared with both syllable identification and gender identification. These findings suggest that the left inferior parietal lobe plays an important role in processing syllable order and support the hypothesized role of this region as an interface between auditory speech and the articulatory code. Furthermore, a breakdown in this interface may explain some components of the speech deficits observed after posterior damage to the left hemisphere. PMID:19812331
Governing sexual behaviour through humanitarian codes of conduct.
Matti, Stephanie
2015-10-01
Since 2001, there has been a growing consensus that sexual exploitation and abuse of intended beneficiaries by humanitarian workers is a real and widespread problem that requires governance. Codes of conduct have been promoted as a key mechanism for governing the sexual behaviour of humanitarian workers and, ultimately, preventing sexual exploitation and abuse (PSEA). This article presents a systematic study of PSEA codes of conduct adopted by humanitarian non-governmental organisations (NGOs) and how they govern the sexual behaviour of humanitarian workers. It draws on Foucault's analytics of governance and speech act theory to examine the findings of a survey of references to codes of conduct made on the websites of 100 humanitarian NGOs, and to analyse some features of the organisation-specific PSEA codes identified. © 2015 The Author(s). Disasters © Overseas Development Institute, 2015.
Berenstein, Carlo K; Mens, Lucas H M; Mulder, Jef J S; Vanpoucke, Filiep J
2008-04-01
To compare the effects of Monopole (Mono), Tripole (Tri), and "Virtual channel" (Vchan) electrode configurations on spectral resolution and speech perception in a crossover design. Nine experienced adults who received an Advanced Bionics CII/90K cochlear implant participated in a crossover design using three experimental strategies for 2 wk each. Three strategies were compared: (1) Mono; (2) Tri with current partly returning to adjacent electrodes and partly (25 or 75%) to the extracochlear reference; and (3) a monopolar "Vchan" strategy creating seven intermediate channels between two contacts. Each strategy was a variant of the standard "HiRes" processing strategy using 14 channels and 1105 pulses/sec/ channel, and a pulse duration of 32 microsec/phase. Spectral resolution was measured using broadband noise with a sinusoidally rippled spectral envelope with peaks evenly spaced on a logarithmic frequency scale. Speech perception was measured for monosyllables in quiet and in steady-state and fluctuating noises. Subjective comments on music experience and preferences in everyday use were assessed through questionnaires. Thresholds and most comfortable levels with Mono and Vchan were both significantly lower than levels with Tri. Spectral resolution was significantly higher with Tri than with Mono; spectral resolution with Vchan did not differ significantly from the other configurations. Moderate but significant correlations between word recognition and spectral resolution were found in speech in quiet and fluctuating noise. For speech in quiet, word recognition was best with Mono and worst with Vchan; Tri did not significantly differ from the other configurations. Pooled across the noise conditions, word recognition was best with Tri and worst with Vchan (Mono did not significantly differ from the other configurations). 
These differences were small and insufficient to result in a clear increase in performance across subjects if the result from the best configuration per subject was compared with the result from Mono. Across all subjects, music appreciation and satisfaction in everyday use did not clearly differ between configurations. (1) Although spectral resolution was improved with the tripolar configuration, differences in speech performance were too small in this limited group of subjects to justify clinical introduction. (2) Overall spectral resolution remained extremely poor compared with normal hearing; it remains to be seen whether further manipulations of the electrical field will be more effective.
Civility vs. Incivility in Online Social Interactions: An Evolutionary Approach.
Antoci, Angelo; Delfino, Alexia; Paglieri, Fabio; Panebianco, Fabrizio; Sabatini, Fabio
2016-01-01
Evidence is growing that forms of incivility (e.g., aggressive and disrespectful behaviors, harassment, hate speech and outrageous claims) are spreading in the population of social networking sites' (SNS) users. Online social networks such as Facebook allow users to regularly interact with known and unknown others, who can behave either politely or rudely. This leads individuals not only to learn and adopt successful strategies for using the site, but also to condition their own behavior on that of others. Using a mean field approach, we define an evolutionary game framework to analyse the dynamics of civil and uncivil ways of interaction in online social networks and their consequences for collective welfare. Agents can choose to interact with others, politely or rudely, in SNS, or to opt out from online social networks to protect themselves from incivility. We find that, when the initial share of the population of polite users reaches a critical level, civility becomes generalized if its payoff increases more than that of incivility with the spreading of politeness in online interactions. Otherwise, the spreading of self-protective behaviors to cope with online incivility can lead the economy to non-socially optimal stationary states. JEL Codes: C61, C73, D85, O33, Z13. PsycINFO Codes: 2240, 2750. PMID:27802271
Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification
NASA Astrophysics Data System (ADS)
Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato
We investigated whether audiovisual synchrony perception for speech could change after observation of the audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is re-calibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of the audiovisual synchrony perception) in terms of the speech signal after observation of the speech stimuli which had a constant audiovisual lag. The results showed that the “simultaneous” responses (i.e., proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of the audiovisual synchrony perception) to exclude the possibility that this modulation of synchrony perception was solely attributable to the response strategy using stimuli identical to those of Experiment 1. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag both in direct and indirect measurement. Our results suggest that temporal recalibration occurs not only in non-speech signals but also in monosyllabic speech at the perceptual level.
Orthography and Modality Influence Speech Production in Adults and Children.
Saletta, Meredith; Goffman, Lisa; Hogan, Tiffany P
2016-12-01
Purpose: The acquisition of literacy skills influences the perception and production of spoken language. We examined if orthography influences implicit processing in speech production in child readers and in adult readers with low and high reading proficiency. Method: Children (n = 17), adults with typical reading skills (n = 17), and adults demonstrating low reading proficiency (n = 18) repeated or read aloud nonwords varying in orthographic transparency. Analyses of implicit linguistic processing (segmental accuracy and speech movement stability) were conducted. The accuracy and articulatory stability of productions of the nonwords were assessed before and after repetition or reading. Results: Segmental accuracy results indicate that all 3 groups demonstrated greater learning when they were able to read, rather than just hear, the nonwords. Speech movement results indicate that, for adults with poor reading skills, exposure to the nonwords in a transparent spelling reduces the articulatory variability of speech production. Reading skill was correlated with speech movement stability in the groups of adults. Conclusions: In children and adults, orthography interacts with speech production; all participants integrate orthography into their lexical representations. Adults with poor reading skills do not use the same reading or speaking strategies as children with typical reading skills. PMID:27942710
Civier, Oren; Tasko, Stephen M; Guenther, Frank H
2010-09-01
This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which if they grow large enough, can cause the motor system to "reset" and repeat the current syllable. This hypothesis is investigated using computer simulations of a "neurally impaired" version of the DIVA model, a neural network model of speech acquisition and production. The model's outputs are compared to published acoustic data from PWS' fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject's speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. The reader will be able to (a) describe the contribution of auditory feedback control and feedforward control to normal and stuttered speech production, (b) summarize the neural modeling approach to speech production and its application to stuttering, and (c) explain how the DIVA model accounts for enhancements of fluency gained by slowed/prolonged speech and masking noise.
Barkmeier-Kraemer, Julie M.; Clark, Heather M.
2017-01-01
Background: Hyperkinetic dysarthria is characterized by abnormal involuntary movements affecting respiratory, phonatory, and articulatory structures impacting speech and deglutition. Speech–language pathologists (SLPs) play an important role in the evaluation and management of dysarthria and dysphagia. This review describes the standard clinical evaluation and treatment approaches by SLPs for addressing impaired speech and deglutition in specific hyperkinetic dysarthria populations. Methods: A literature review was conducted using the data sources of PubMed, Cochrane Library, and Google Scholar. Search terms included 1) hyperkinetic dysarthria, essential voice tremor, voice tremor, vocal tremor, spasmodic dysphonia, spastic dysphonia, oromandibular dystonia, Meige syndrome, orofacial, cervical dystonia, dystonia, dyskinesia, chorea, Huntington’s Disease, myoclonus; and evaluation/treatment terms: 2) Speech–Language Pathology, Speech Pathology, Evaluation, Assessment, Dysphagia, Swallowing, Treatment, Management, and diagnosis. Results: The standard SLP clinical speech and swallowing evaluation of chorea/Huntington’s disease, myoclonus, focal and segmental dystonia, and essential vocal tremor typically includes 1) case history; 2) examination of the tone, symmetry, and sensorimotor function of the speech structures during non-speech, speech and swallowing relevant activities (i.e., cranial nerve assessment); 3) evaluation of speech characteristics; and 4) patient self-report of the impact of their disorder on activities of daily living. SLP management of individuals with hyperkinetic dysarthria includes behavioral and compensatory strategies for addressing compromised speech and intelligibility. Swallowing disorders are managed based on individual symptoms and the underlying pathophysiology determined during evaluation. Discussion: SLPs play an important role in contributing to the differential diagnosis and management of impaired speech and deglutition associated with hyperkinetic disorders. PMID:28983422
Tonn, Christopher R; Grundfast, Kenneth M
2014-03-01
Otolaryngologists are asked to evaluate children whom a parent, physician, or someone else believes to be slow in developing speech. Therefore, an otolaryngologist should be familiar with milestones for normal speech development, the causes of delay in speech development, and the best ways to help assure that children develop the ability to speak in a normal way. To provide information for otolaryngologists that is helpful in the evaluation and management of children perceived to be delayed in developing speech. Data were obtained via literature searches, online databases, textbooks, and the most recent national guidelines on topics including speech delay and language delay and the underlying disorders that can cause delay in developing speech. Emphasis was placed on epidemiology, pathophysiology, most common presentation, and treatment strategies. Most of the sources referenced were published within the past 5 years. Our article is a summary of major causes of speech delay based on reliable sources as listed herein. Speech delay can be the manifestation of a spectrum of disorders affecting the language comprehension and/or speech production pathways, ranging from disorders involving global developmental limitations to motor dysfunction to hearing loss. Determining the cause of a child's delay in speech production is a time-sensitive issue because a child loses valuable opportunities in intellectual development if his or her communication defect is not addressed and ameliorated with treatment. Knowing several key items about each disorder can help otolaryngologists direct families to the correct health care provider to maximize the child's learning potential and intellectual growth curve.
Na, Wondo; Kim, Gibbeum; Kim, Gungu; Han, Woojae; Kim, Jinsook
2017-01-01
The current study aimed to evaluate hearing-related changes in terms of speech-in-noise processing, fast-rate speech processing, and working memory; and to identify which of these three factors is significantly affected by age-related hearing loss. One hundred subjects aged 65-84 years participated in the study. They were classified into four groups ranging from normal hearing to moderate-to-severe hearing loss. All the participants were tested for speech perception in quiet and noisy conditions and for speech perception with time alteration in quiet conditions. Forward- and backward-digit span tests were also conducted to measure the participants' working memory. 1) As the level of background noise increased, speech perception scores systematically decreased in all the groups. This pattern was more noticeable in the three hearing-impaired groups than in the normal hearing group. 2) As the speech rate increased, speech perception scores decreased. A significant interaction was found between speed of speech and hearing loss. In particular, 30% of compressed sentences revealed a clear differentiation between moderate hearing loss and moderate-to-severe hearing loss. 3) Although all the groups showed a longer span on the forward-digit span test than the backward-digit span test, there was no significant difference as a function of hearing loss. The degree of hearing loss strongly affects the speech recognition of babble-masked and time-compressed speech in the elderly but does not affect the working memory. We expect these results to be applied to appropriate rehabilitation strategies for hearing-impaired elderly who experience difficulty in communication.
ERIC Educational Resources Information Center
Gramberg, Anne-Kathrin; Heinze, Karin U.
1993-01-01
This article discusses the subjunctive of indirect speech and describes its important functions and meanings. An analysis of the instructional materials used in the first and second years of language study, followed by practical curriculum recommendations, demonstrates how this grammatical phenomenon can be established in an advanced…
ERIC Educational Resources Information Center
Kuriscak, Lisa
2015-01-01
This study focuses on variation within a group of learners of Spanish (N = 253) who produced requests and complaints via a written discourse completion task. It examines the effects of learner and situational variables on production--the effect of proficiency and addressee-gender on speech-act choice and the effect of perception of imposition on…
Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness
Kates, James M.; Arehart, Kathryn H.; Souza, Pamela E.
2013-01-01
Individual factors beyond the audiogram, such as age and cognitive abilities, can influence speech intelligibility and speech quality judgments. This paper develops a neural network framework for combining multiple subject factors into a single model that predicts speech intelligibility and quality for a nonlinear hearing-aid processing strategy. The nonlinear processing approach used in the paper is frequency compression, which is intended to improve the audibility of high-frequency speech sounds by shifting them to lower frequency regions where listeners with high-frequency loss have better hearing thresholds. An ensemble averaging approach is used for the neural network to avoid the problems associated with overfitting. Models are developed for two subject groups, one having nearly normal hearing and the other mild-to-moderate sloping losses. PMID:25669257
A Study of Apology Strategies Used by Iraqi EFL University Students
ERIC Educational Resources Information Center
Ugla, Raed Latif; Abidin, Mohamad Jafre Zainol
2016-01-01
This study was aimed at exploring apology strategies of English used by Iraqi EFL students, apology strategies in Iraqi Arabic and the pragmatic strategies of Iraqi EFL students in relation to the use of apology as a speech act. The data analyzed in this study were collected in Al-Yarmouk University College and University of Diyala. The study was…
Wireless communication and their mathematics
NASA Astrophysics Data System (ADS)
Komaki, Shozo
2015-05-01
Mobile phones and smartphones are becoming pervasive in society. Developing these systems draws on various kinds of mathematically grounded theoretical work, such as radio propagation theory, traffic theory, security coding, and wireless devices. In this talk, I discuss the related mathematics and its problems.
ERIC Educational Resources Information Center
Lewy, Guenter
2018-01-01
Freedom of expression is imperiled on today's college campuses. Citizens and educators alike are concerned about the number of shout-downs and disinvitations and their silencing effect on intellectual diversity. The use of speech codes, "safe spaces," new rules demanding "trigger warnings," and condemning…
ERIC Educational Resources Information Center
Gould, Jon B.
2007-01-01
Last December saw another predictable report from the Foundation for Individual Rights in Education (FIRE), a self-described watchdog group, highlighting how higher education is supposedly under siege from a politically correct plague of so-called hate-speech codes. In that report, FIRE declared that as many as 96 percent of top-ranked colleges…
ERIC Educational Resources Information Center
Stiles, William B.; And Others
1983-01-01
Coded campaign speeches recorded during the 1980 American presidential primaries and college lectures using a taxonomy of verbal response modes. Both candidates and lecturers used mostly informative modes, but candidates used relatively more disclosures (subjective information) and fewer edifications (objective information). Candidates…
ERIC Educational Resources Information Center
Gray, Mary W.
1994-01-01
Sexual harassment is abuse of power. It should be prohibited in colleges and universities, not through constraints on academic freedom such as speech codes, but through enforcement of standards of ethical professional conduct. Faculty have an ethical obligation not to engage in harassment and to hold colleagues accountable if they do so. (MSE)
English in Political Discourse of Post-Suharto Indonesia.
ERIC Educational Resources Information Center
Bernsten, Suzanne
This paper illustrates increases in the use of English in political speeches in post-Suharto Indonesia by analyzing the phonological, morphological, and syntactic assimilation of loanwords (linguistic borrowing), as well as hybridization and code switching, and phenomena such as doubling and loan translations. The paper also examines the mixed…
The Courts as Educational Policy Makers.
ERIC Educational Resources Information Center
Maready, William F.
This report discusses the expanding role of Federal judges as educational policymakers. The report discusses court decisions related to interpretations by the Federal Courts of the U.S. Constitution. The report notes that court decisions have covered the following topics: dress codes, flying of the flag, freedom of speech, unwed mothers,…
Spanish-English Speech Perception in Children and Adults: Developmental Trends
ERIC Educational Resources Information Center
Brice, Alejandro E.; Gorman, Brenda K.; Leung, Cynthia B.
2013-01-01
This study explored the developmental trends and phonetic category formation in bilingual children and adults. Participants included 30 fluent Spanish-English bilingual children, aged 8-11, and bilingual adults, aged 18-40. All completed gating tasks that incorporated code-mixed Spanish-English stimuli. There were significant differences in…
Student Disciplinary Codes -- What Makes Them Tick.
ERIC Educational Resources Information Center
Johnson, Donald V.
In this speech, the author describes how one school developed discipline guidelines with the cooperation of staff, parents, and students. Due process procedures, types of discipline, and an alternative out-of-school program for adjustment students (those who have experienced chronic or serious disciplinary problems in the school) are described.…
Acoustic properties of naturally produced clear speech at normal speaking rates
NASA Astrophysics Data System (ADS)
Krause, Jean C.; Braida, Louis D.
2004-01-01
Sentences spoken "clearly" are significantly more intelligible than those spoken "conversationally" for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
Speech and gesture in spatial language and cognition among the Yucatec Mayas.
Le Guen, Olivier
2011-07-01
In previous analyses of the influence of language on cognition, speech has been the main channel examined. In studies conducted among Yucatec Mayas, efforts to determine the preferred frame of reference in use in this community have failed to reach an agreement (Bohnemeyer & Stolz, 2006; Levinson, 2003 vs. Le Guen, 2006, 2009). This paper argues for a multimodal analysis of language that encompasses gesture as well as speech, and shows that the preferred frame of reference in Yucatec Maya is only detectable through the analysis of co-speech gesture and not through speech alone. A series of experiments compares knowledge of the semantics of spatial terms, performance on nonlinguistic tasks and gestures produced by men and women. The results show a striking gender difference in the knowledge of the semantics of spatial terms, but an equal preference for a geocentric frame of reference in nonverbal tasks. In a localization task, participants used a variety of strategies in their speech, but they all exhibited a systematic preference for a geocentric frame of reference in their gestures. Copyright © 2011 Cognitive Science Society, Inc.
Audiovisual integration in children listening to spectrally degraded speech.
Maidment, David W; Kang, Hi Jee; Stewart, Hannah J; Amitay, Sygal
2015-02-01
The study explored whether visual information improves speech identification in typically developing children with normal hearing when the auditory signal is spectrally degraded. Children (n=69) and adults (n=15) were presented with noise-vocoded sentences from the Children's Co-ordinate Response Measure (Rosen, 2011) in auditory-only or audiovisual conditions. The number of bands was adaptively varied to modulate the degradation of the auditory signal, with the number of bands required for approximately 79% correct identification calculated as the threshold. The youngest children (4- to 5-year-olds) did not benefit from accompanying visual information, in comparison to 6- to 11-year-old children and adults. Audiovisual gain also increased with age in the child sample. The current data suggest that children younger than 6 years of age do not fully utilize visual speech cues to enhance speech perception when the auditory signal is degraded. This evidence not only has implications for understanding the development of speech perception skills in children with normal hearing but may also inform the development of new treatment and intervention strategies that aim to remediate speech perception difficulties in pediatric cochlear implant users.
Predictive top-down integration of prior knowledge during speech perception.
Sohoglu, Ediz; Peelle, Jonathan E; Carlyon, Robert P; Davis, Matthew H
2012-06-20
A striking feature of human perception is that our subjective experience depends not only on sensory information from the environment but also on our prior knowledge or expectations. The precise mechanisms by which sensory information and prior knowledge are integrated remain unclear, with longstanding disagreement concerning whether integration is strictly feedforward or whether higher-level knowledge influences sensory processing through feedback connections. Here we used concurrent EEG and MEG recordings to determine how sensory information and prior knowledge are integrated in the brain during speech perception. We manipulated listeners' prior knowledge of speech content by presenting matching, mismatching, or neutral written text before a degraded (noise-vocoded) spoken word. When speech conformed to prior knowledge, subjective perceptual clarity was enhanced. This enhancement in clarity was associated with a spatiotemporal profile of brain activity uniquely consistent with a feedback process: activity in the inferior frontal gyrus was modulated by prior knowledge before activity in lower-level sensory regions of the superior temporal gyrus. In parallel, we parametrically varied the level of speech degradation, and therefore the amount of sensory detail, so that changes in neural responses attributable to sensory information and prior knowledge could be directly compared. Although sensory detail and prior knowledge both enhanced speech clarity, they had an opposite influence on the evoked response in the superior temporal gyrus. We argue that these data are best explained within the framework of predictive coding in which sensory activity is compared with top-down predictions and only unexplained activity propagated through the cortical hierarchy.
Hogrefe, Katharina; Rein, Robert; Skomroch, Harald; Lausberg, Hedda
2016-12-01
Persons with brain damage show deviant patterns of co-speech hand movement behaviour in comparison to healthy speakers. It has been claimed by several authors that gesture and speech rely on a single production mechanism that depends on the same neurological substrate while others claim that both modalities are closely related but separate production channels. Thus, findings so far are contradictory and there is a lack of studies that systematically analyse the full range of hand movements that accompany speech in the condition of brain damage. In the present study, we aimed to fill this gap by comparing hand movement behaviour in persons with unilateral brain damage to the left and the right hemisphere and a matched control group of healthy persons. For hand movement coding, we applied Module I of NEUROGES, an objective and reliable analysis system that enables analysis of the full repertoire of hand movements independent of speech, which makes it specifically suited for the examination of persons with aphasia. The main results of our study show a decreased use of communicative conceptual gestures in persons with damage to the right hemisphere and an increased use of these gestures in persons with left brain damage and aphasia. These results not only suggest that the production of gesture and speech does not rely on the same neurological substrate but also underline the important role of right hemisphere functioning for gesture production. Copyright © 2016 Elsevier Ltd. All rights reserved.
Noise-robust speech recognition through auditory feature detection and spike sequence decoding.
Schafer, Phillip B; Jin, Dezhe Z
2014-03-01
Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The neural population conveys the time-dependent structure of a sound by its sequence of spikes. We compare two methods for decoding the spike sequences--one using a hidden Markov model-based recognizer, the other using a novel template-based recognition scheme. In the latter case, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common sub-sequence. Using isolated spoken digits from the AURORA-2 database, we show that our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for robust ASR development.
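The template-based decoding step described above can be illustrated with a small sketch. It assumes that a spike sequence is represented as the ordered list of IDs of the neurons that fired, and that the similarity is the longest-common-subsequence (LCS) length divided by the longer sequence's length. Both the representation and the normalisation are assumptions made for illustration; the abstract only states that the measure is based on LCS length. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)           # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """LCS length normalised to [0, 1] (normalisation is an assumption)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def recognise(spikes, templates):
    """Return the word whose template sequence is most similar to `spikes`.

    templates: dict mapping word -> spike sequence from clean training data.
    """
    return max(templates, key=lambda word: similarity(spikes, templates[word]))


# Toy example: neuron IDs in firing order; one spike is corrupted by noise.
templates = {"one": [1, 2, 3, 4], "two": [4, 3, 2, 1]}
best = recognise([1, 2, 9, 4], templates)
```

Because a subsequence need not be contiguous, a spurious or missing spike lowers the score only slightly, which is one plausible reason such a measure tolerates noise.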
Community-based early intervention for language delay: a preliminary investigation.
Ciccone, Natalie; Hennessey, Neville; Stokes, Stephanie F
2012-01-01
A trial parent-focused early intervention (PFEI) programme for children with delayed language development is reported in which current research evidence was translated and applied within the constraints of available clinical resources. The programme, based at a primary school, was run by a speech-language pathologist with speech-language pathology students. To investigate the changes in child language development and parent and child interactions following attendance at the PFEI. Eighteen parents and their children attended six weekly group sessions in which parents were provided with strategies to maximize language learning in everyday contexts. Pre- and post-programme assessments of vocabulary size and measures of parent-child interaction were collected. Parents and children significantly increased their communicative interactions from pre- to post-treatment. Children's expressive vocabulary size and language skills increased significantly. Large effect sizes were observed. The positive outcomes of the intervention programme contribute to the evidence base of intervention strategies and forms of service delivery for children at risk of language delay. © 2012 Royal College of Speech and Language Therapists.
Emmorey, Karen; Petrich, Jennifer; Gollan, Tamar H.
2012-01-01
Bilinguals who are fluent in American Sign Language (ASL) and English often produce code-blends - simultaneously articulating a sign and a word while conversing with other ASL-English bilinguals. To investigate the cognitive mechanisms underlying code-blend processing, we compared picture-naming times (Experiment 1) and semantic categorization times (Experiment 2) for code-blends versus ASL signs and English words produced alone. In production, code-blending did not slow lexical retrieval for ASL and actually facilitated access to low-frequency signs. However, code-blending delayed speech production because bimodal bilinguals synchronized English and ASL lexical onsets. In comprehension, code-blending speeded access to both languages. Bimodal bilinguals’ ability to produce code-blends without any cost to ASL implies that the language system either has (or can develop) a mechanism for switching off competition to allow simultaneous production of close competitors. Code-blend facilitation effects during comprehension likely reflect cross-linguistic (and cross-modal) integration at the phonological and/or semantic levels. The absence of any consistent processing costs for code-blending illustrates a surprising limitation on dual-task costs and may explain why bimodal bilinguals code-blend more often than they code-switch. PMID:22773886
Tamplin, Jeanette; Brazzale, Danny J; Pretto, Jeffrey J; Ruehland, Warren R; Buttifant, Mary; Brown, Douglas J; Berlowitz, David J
2011-02-01
To explore how respiratory impairment after cervical spinal cord injury affects vocal function, and to explore muscle recruitment strategies used during vocal tasks after quadriplegia. It was hypothesized that to achieve the increased respiratory support required for singing and loud speech, people with quadriplegia use different patterns of muscle recruitment and control strategies compared with control subjects without spinal cord injury. Matched, parallel-group design. Large university-affiliated public hospital. Consenting participants with motor-complete C5-7 quadriplegia (n=6) and able-bodied age-matched controls (n=6) were assessed on physiologic and voice measures during vocal tasks. Not applicable. Standard respiratory function testing, surface electromyographic activity from accessory respiratory muscles, sound pressure levels during vocal tasks, the Voice Handicap Index, and the Perceptual Voice Profile. The group with quadriplegia had a reduced lung capacity (vital capacity, 71% vs 102% of predicted; P=.028), more perceived voice problems (Voice Handicap Index score, 22.5 vs 6.5; P=.046), and greater recruitment of accessory respiratory muscles during both loud and soft volumes (P=.028) than the able-bodied controls. The group with quadriplegia also demonstrated higher accessory muscle activation in changing from soft to loud speech (P=.028). People with quadriplegia have impaired vocal ability and use different muscle recruitment strategies during speech than the able-bodied. These findings will enable us to target specific measurements of respiratory physiology for assessing functional improvements in response to formal therapeutic singing training. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Group climate in the voice therapy of patients with Parkinson's Disease.
Diaféria, Giovana; Madazio, Glaucya; Pacheco, Claudia; Takaki, Patricia Barbarini; Behlau, Mara
2017-09-04
To verify the impact that group dynamics and coaching strategies have on PD patients' voice, speech and communication, as well as on the group climate. 16 individuals with mild to moderate dysarthria due to PD were divided into two groups: the CG (8 patients), submitted to traditional therapy with 12 regular therapy sessions plus 4 additional support sessions; and the EG (8 patients), submitted to traditional therapy with 12 regular therapy sessions plus 4 sessions with group dynamics and coaching strategies. The Living with Dysarthria questionnaire (LwD), the self-evaluation of voice, speech and communication, and the perceptual-auditory analysis of the vocal quality were assessed at 3 moments: pre-traditional therapy (pre); post-traditional therapy (post 1); and post support sessions/coaching strategies (post 2); at the post 1 and post 2 moments, the Group Climate Questionnaire (GCQ) was also applied. CG and EG showed an improvement in the LwD from pre to post 1 and post 2 moments. Voice self-evaluation was better for the EG - when pre was compared with post 2 and when post 1 was compared with post 2 - ranging from regular to very good; both groups presented improvement in the communication self-evaluation. The perceptual-auditory evaluation of the vocal quality was better for the EG in the post 1 moment. No difference was found for the GCQ; however, the EG presented lower avoidance scores in post 2. All patients showed improvement in the voice, speech and communication self-evaluation; EG showed lower avoidance scores, creating a more collaborative and propitious environment for speech therapy.
Ortmann, Magdalene; Zwitserlood, Pienie; Knief, Arne; Baare, Johanna; Brinkheetker, Stephanie; am Zehnhoff-Dinnesen, Antoinette; Dobel, Christian
2017-01-01
Cochlear implants provide individuals who are deaf with access to speech. Although substantial advancements have been made by novel technologies, there still is high variability in language development during childhood, depending on adaptation and neural plasticity. These factors have often been investigated in the auditory domain, with the mismatch negativity (MMN) as an index for sensory and phonological processing. Several studies have demonstrated that the MMN is an electrophysiological correlate for hearing improvement with cochlear implants. In this study, two groups of cochlear implant users, both with very good basic hearing abilities but with non-overlapping speech performance (very good or very poor speech performance), were matched according to device experience and age at implantation. We tested the perception of phonemes in the context of specific other phonemes from which they were very hard to discriminate (e.g., the vowels in /bu/ vs. /bo/). The most difficult pair was individually determined for each participant. Using behavioral measures, both cochlear implant groups performed worse than matched controls, and the good performers performed better than the poor performers. Cochlear implant groups and controls did not differ during time intervals typically used for the mismatch negativity, but earlier: source analyses revealed increased activity in the region of the right supramarginal gyrus (220–260 ms) in good performers. Poor performers showed increased activity in the left occipital cortex (220–290 ms), which may be an index for cross-modal perception. The time course and the neural generators differ from data from our earlier studies, in which the same phonemes were assessed in an easy-to-discriminate context. The results demonstrate that the groups used different language processing strategies, depending on the success of language development and the particular language context. Overall, our data emphasize the role of neural plasticity and use of adaptive strategies for successful language development with cochlear implants. PMID:28056017
When will a stuttering moment occur? The determining role of speech motor preparation.
Vanhoutte, Sarah; Cosyns, Marjan; van Mierlo, Pieter; Batens, Katja; Corthals, Paul; De Letter, Miet; Van Borsel, John; Santens, Patrick
2016-06-01
The present study aimed to evaluate whether increased activity related to speech motor preparation preceding fluently produced words reflects a successful compensation strategy in stuttering. For this purpose, a contingent negative variation (CNV) was evoked during a picture naming task and measured by use of electro-encephalography. A CNV is a slow, negative event-related potential known to reflect motor preparation generated by the basal ganglia-thalamo-cortical (BGTC) loop. In a previous analysis, the CNV of 25 adults with developmental stuttering (AWS) was significantly increased, especially over the right hemisphere, compared to the CNV of 35 fluent speakers (FS) when both groups were speaking fluently (Vanhoutte et al., (2015) doi: 10.1016/j.neuropsychologia.2015.05.013). To elucidate whether this increase is a compensation strategy enabling fluent speech in AWS, the present analysis evaluated the CNV of 7 AWS who stuttered during this picture naming task. The CNV preceding AWS stuttered words was statistically compared to the CNV preceding AWS fluent words and FS fluent words. Though no difference emerged between the CNV of the AWS stuttered words and the FS fluent words, a significant reduction was observed when comparing the CNV preceding AWS stuttered words to the CNV preceding AWS fluent words. The latter seems to confirm the compensation hypothesis: the increased CNV prior to AWS fluent words is a successful compensation strategy, especially when it occurs over the right hemisphere. The words are produced fluently because of increased activity during speech motor preparation. The left CNV preceding AWS stuttered words correlated negatively with stuttering frequency and severity, suggesting a link between the left BGTC network and the stuttering pathology. Overall, speech motor preparatory activity generated by the BGTC loop seems to have a determining role in stuttering. An important divergence between left and right hemisphere is hypothesized. Copyright © 2016 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Abedi, Elham
2016-01-01
The development of speech-act theory has provided the hearers with a better understanding of what speakers intend to perform in the act of communication. One type of speech act is apologizing. When an action or utterance has resulted in an offense, the offender needs to apologize. In the present study, an attempt was made to compare the apology…
On Shifting Sands: Iranian Strategy in a Changing Middle East
2013-10-01
including the “axis of evil” speech), made war with Iran seem like a growing possibility to both Iranian... speech delivered after most of the unrest had been put down, Iran’s Supreme Leader claimed: “The leaders of certain Western countries, presidents...central missions,100 developing a new training protocol that included texts such as ‘Obur az Fetneh [Overcoming Sedition], which viewed the protests
Utianski, Rene L; Caviness, John N; Liss, Julie M
2015-01-01
High-density electroencephalography was used to evaluate cortical activity during speech comprehension via a sentence verification task. Twenty-four participants assigned true or false to sentences produced with 3 noise-vocoded channel levels (1--unintelligible, 6--decipherable, 16--intelligible), during simultaneous EEG recording. Participant data were sorted into higher- (HP) and lower-performing (LP) groups. The identification of a late event-related potential for LP listeners in the intelligible condition and in all listeners when challenged with a 6-Ch signal supports the notion that this induced potential may be related to either processing degraded speech, or degraded processing of intelligible speech. Different cortical locations are identified as neural generators responsible for this activity; HP listeners are engaging motor aspects of their language system, utilizing an acoustic-phonetic based strategy to help resolve the sentence, while LP listeners do not. This study presents evidence for neurophysiological indices associated with more or less successful speech comprehension performance across listening conditions. Copyright © 2014 Elsevier Inc. All rights reserved.
[Swallowing and Voice Disorders in Cancer Patients].
Tanuma, Akira
2015-07-01
Dysphagia sometimes occurs in patients with head and neck cancer, particularly in those undergoing surgery and radiotherapy for lingual, pharyngeal, and laryngeal cancer. It also occurs in patients with esophageal cancer and brain tumor. Patients who undergo glossectomy usually show impairment of the oral phase of swallowing, whereas those with pharyngeal, laryngeal, and esophageal cancer show impairment of the pharyngeal phase of swallowing. Videofluoroscopic examination of swallowing provides important information necessary for rehabilitation of swallowing in these patients. Appropriate swallowing exercises and compensatory strategies can be decided based on the findings of the evaluation. Palatal augmentation prostheses are sometimes used for rehabilitation in patients undergoing glossectomy. Patients who undergo total laryngectomy or total pharyngolaryngoesophagectomy should receive speech therapy to enable them to use alaryngeal speech methods, including electrolarynx, esophageal speech, or speech via tracheoesophageal puncture. Regaining swallowing function and speech can improve a patient's emotional health and quality of life. Therefore, it is important to manage swallowing and voice disorders appropriately.
Attrill, Stacie; Lincoln, Michelle; McAllister, Sue
2017-06-01
Increasing the proportion of culturally and linguistically diverse (CALD) students and providing intercultural learning opportunities for all students are two strategies identified to facilitate greater access to culturally responsive speech-language pathology services. To enact these strategies, more information is needed about student diversity. This study collected descriptive information about CALD speech-language pathology students in Australia. Cultural and linguistic background information was collected through surveying 854 domestic and international speech-language pathology students from three Australian universities. Students were categorised according to defined or perceived CALD status, international student status, speaking English as an Additional Language (EAL), or speaking a Language Other than English at Home (LOTEH). Overall, 32.1% of students were either defined or perceived CALD. A total of 14.9% spoke EAL and 25.7% identified speaking a LOTEH. CALD students were more likely to speak EAL or a LOTEH than non-CALD students, were prominently from Southern and South-Eastern Asian backgrounds and spoke related languages. Many students reported direct or indirect connections with their cultural heritage and/or contributed linguistic diversity. These students may represent broader acculturative experiences in communities. The sociocultural knowledge and experience of these students may provide intercultural learning opportunities for all students and promote culturally responsive practices.
Language/culture/mind/brain. Progress at the margins between disciplines.
Kuhl, P K; Tsao, F M; Liu, H M; Zhang, Y; De Boer, B
2001-05-01
At the forefront of research on language are new data demonstrating infants' strategies in the early acquisition of language. The data show that infants perceptually "map" critical aspects of ambient language in the first year of life before they can speak. Statistical and abstract properties of speech are picked up through exposure to ambient language. Moreover, linguistic experience alters infants' perception of speech, warping perception in a way that enhances native-language speech processing. Infants' strategies are unexpected and unpredicted by historical views. At the same time, research in three additional disciplines is contributing to our understanding of language and its acquisition by children. Cultural anthropologists are demonstrating the universality of adult speech behavior when addressing infants and children across cultures, and this is creating a new view of the role adult speakers play in bringing about language in the child. Neuroscientists, using the techniques of modern brain imaging, are revealing the temporal and structural aspects of language processing by the brain and suggesting new views of the critical period for language. Computer scientists, modeling the computational aspects of children's language acquisition, are meeting success using biologically inspired neural networks. Although a consilient view cannot yet be offered, the cross-disciplinary interaction now seen among scientists pursuing one of humans' greatest achievements, language, is quite promising.
Pamplona, María Del Carmen; Ysunza, Pablo Antonio; Morales, Santiago
2017-02-01
Children with cleft palate frequently show speech disorders known as compensatory articulation. Compensatory articulation requires a prolonged period of speech intervention that should include reinforcement at home. However, frequently relatives do not know how to work with their children at home. To study whether the use of audiovisual materials especially designed for complementing speech pathology treatment in children with compensatory articulation can be effective for stimulating articulation practice at home and consequently enhancing speech normalization in children with cleft palate. Eighty-two patients with compensatory articulation were studied. Patients were randomly divided into two groups. Both groups received speech pathology treatment aimed to correct articulation placement. In addition, patients from the active group received a set of audiovisual materials to be used at home. Parents were instructed about strategies and ideas about how to use the materials with their children. Severity of compensatory articulation was compared at the onset and at the end of the speech intervention. After the speech therapy period, the group of patients using audiovisual materials at home demonstrated significantly greater improvement in articulation, as compared with the patients receiving speech pathology treatment on-site without audiovisual supporting materials. The results of this study suggest that audiovisual materials especially designed for practicing adequate articulation placement at home can be effective for reinforcing and enhancing speech pathology treatment of patients with cleft palate and compensatory articulation. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields.
Carlin, Michael A; Elhilali, Mounya
2015-12-01
One of the hallmarks of sound processing in the brain is the ability of the nervous system to adapt to changing behavioral demands and surrounding soundscapes. It can dynamically shift sensory and cognitive resources to focus on relevant sounds. Neurophysiological studies indicate that this ability is supported by adaptively retuning the shapes of cortical spectro-temporal receptive fields (STRFs) to enhance features of target sounds while suppressing those of task-irrelevant distractors. Because an important component of human communication is the ability of a listener to dynamically track speech in noisy environments, the solution obtained by auditory neurophysiology implies a useful adaptation strategy for speech activity detection (SAD). SAD is an important first step in a number of automated speech processing systems, and performance is often reduced in highly noisy environments. In this paper, we describe how task-driven adaptation is induced in an ensemble of neurophysiological STRFs, and show how speech-adapted STRFs reorient themselves to enhance spectro-temporal modulations of speech while suppressing those associated with a variety of nonspeech sounds. We then show how an adapted ensemble of STRFs can better detect speech in unseen noisy environments compared to an unadapted ensemble and a noise-robust baseline. Finally, we use a stimulus reconstruction task to demonstrate how the adapted STRF ensemble better captures the spectrotemporal modulations of attended speech in clean and noisy conditions. Our results suggest that a biologically plausible adaptation framework can be applied to speech processing systems to dynamically adapt feature representations for improving noise robustness.
Discrimination of brief speech sounds is impaired in rats with auditory cortex lesions
Porter, Benjamin A.; Rosenthal, Tara R.; Ranasinghe, Kamalini G.; Kilgard, Michael P.
2011-01-01
Auditory cortex (AC) lesions impair complex sound discrimination. However, a recent study demonstrated spared performance on an acoustic startle response test of speech discrimination following AC lesions (Floody et al., 2010). The current study reports the effects of AC lesions on two operant speech discrimination tasks. AC lesions caused a modest and quickly recovered impairment in the ability of rats to discriminate consonant-vowel-consonant speech sounds. This result seems to suggest that AC does not play a role in speech discrimination. However, the speech sounds used in both studies differed in many acoustic dimensions and an adaptive change in discrimination strategy could allow the rats to use an acoustic difference that does not require an intact AC to discriminate. Based on our earlier observation that the first 40 ms of the spatiotemporal activity patterns elicited by speech sounds best correlate with behavioral discriminations of these sounds (Engineer et al., 2008), we predicted that eliminating additional cues by truncating speech sounds to the first 40 ms would render the stimuli indistinguishable to a rat with AC lesions. Although the initial discrimination of truncated sounds took longer to learn, the final performance paralleled rats using full-length consonant-vowel-consonant sounds. After 20 days of testing, half of the rats using speech onsets received bilateral AC lesions. Lesions severely impaired speech onset discrimination for at least one-month post lesion. These results support the hypothesis that auditory cortex is required to accurately discriminate the subtle differences between similar consonant and vowel sounds. PMID:21167211
Resourcing speech-language pathologists to work with multilingual children.
McLeod, Sharynne
2014-06-01
Speech-language pathologists play important roles in supporting people to be competent communicators in the languages of their communities. However, with over 7000 languages spoken throughout the world and the majority of the global population being multilingual, there is often a mismatch between the languages spoken by children and families and their speech-language pathologists. This paper provides insights into service provision for multilingual children within an English-dominant country by viewing Australia's multilingual population as a microcosm of ethnolinguistic minorities. Recent population studies of Australian pre-school children show that their most common languages other than English are: Arabic, Cantonese, Vietnamese, Italian, Mandarin, Spanish, and Greek. Although 20.2% of services by Speech Pathology Australia members are offered in languages other than English, there is a mismatch between the language of the services and the languages of children within similar geographical communities. Australian speech-language pathologists typically use informal or English-based assessments and intervention tools with multilingual children. Thus, there is a need for accessible culturally and linguistically appropriate resources for working with multilingual children. Recent international collaborations have resulted in practical strategies to support speech-language pathologists during assessment, intervention, and collaboration with families, communities, and other professionals. The International Expert Panel on Multilingual Children's Speech was assembled to prepare a position paper to address issues faced by speech-language pathologists when working with multilingual populations. The Multilingual Children's Speech website (http://www.csu.edu.au/research/multilingual-speech) addresses one of the aims of the position paper by providing free resources and information for speech-language pathologists about more than 45 languages. These international collaborations have been framed around the World Health Organization's International Classification of Functioning, Disability and Health (ICF-CY) and have been established with the goal of supporting multilingual children to participate in society.
[The ideal body: media pedagogy].
Ribeiro, Rubia Guimarães; da Silva, Karen Schein; Kruse, Maria Henriqueta Luce
2009-03-01
We present enunciations that circulate in the media regarding the body, discussing the ways in which the speeches related to the maintenance of health and aesthetics invest in its improvement. To this end, we used the Caderno Vida, a weekly insert of Zero Hora, because we understand it as the owner of a speech of its own that has the power to subjectivate people. The analysis is part of Cultural Studies and is based on the ideas of Michel Foucault. The methodological strategy used was the speech analysis of articles about body care. The periodical questions its readers using speeches that point to beauty, health and success. The constructed categories were: what the ideal body is like, what to do to have such a body, and why we must have this body. A balanced diet, regular physical activity and plastic surgery are recommendations recurrently found in the weekly inserts.
Lebib, Riadh; Papo, David; Douiri, Abdel; de Bode, Stella; Gillon Dowens, Margaret; Baudonnière, Pierre-Marie
2004-11-30
Lipreading reliably improves speech perception during face-to-face conversation. Within the range of good dubbing, however, adults tolerate some audiovisual (AV) discrepancies, and lipreading can then give rise to confusion. We used event-related brain potentials (ERPs) to study the perceptual strategies governing the intermodal processing of dynamic and bimodal speech stimuli, either congruently dubbed or not. Electrophysiological analyses revealed that non-coherent audiovisual dubbings modulated the amplitude of an endogenous ERP component, the N300, which we compared to an 'N400-like effect' reflecting the difficulty of integrating these conflicting pieces of information. This result adds further support for the existence of a cerebral system underlying 'integrative processes' lato sensu. Further studies should take advantage of this 'N400-like effect' with AV speech stimuli to open new perspectives in the domain of psycholinguistics.
Dual diathesis-stressor model of emotional and linguistic contributions to developmental stuttering.
Walden, Tedra A; Frankel, Carl B; Buhr, Anthony P; Johnson, Kia N; Conture, Edward G; Karrass, Jan M
2012-05-01
This study assessed emotional and speech-language contributions to childhood stuttering. A dual diathesis-stressor framework guided this study, in which both linguistic requirements and skills, and emotion and its regulation, are hypothesized to contribute to stuttering. The language diathesis consists of expressive and receptive language skills. The emotion diathesis consists of proclivities to emotional reactivity and regulation of emotion, and the emotion stressor consists of experimentally manipulated emotional inductions prior to narrative speaking tasks. Preschool-age children who do and do not stutter were exposed to three emotion-producing overheard conversations: neutral, positive, and angry. Emotion and emotion-regulatory behaviors were coded while participants listened to each conversation and while telling a story after each overheard conversation. Instances of stuttering during each story were counted. Although there was no main effect of conversation type, results indicated that stuttering in preschool-age children is influenced by emotion and language diatheses, as well as coping strategies and situational emotional stressors. Findings support the dual diathesis-stressor model of stuttering.
Dual Diathesis-Stressor Model of Emotional and Linguistic Contributions to Developmental Stuttering
Frankel, Carl B.; Buhr, Anthony P.; Johnson, Kia N.; Conture, Edward G.; Karrass, Jan M.
2013-01-01
This study assessed emotional and speech-language contributions to childhood stuttering. A dual diathesis-stressor framework guided this study, in which both linguistic requirements and skills, and emotion and its regulation, are hypothesized to contribute to stuttering. The language diathesis consists of expressive and receptive language skills. The emotion diathesis consists of proclivities to emotional reactivity and regulation of emotion, and the emotion stressor consists of experimentally manipulated emotional inductions prior to narrative speaking tasks. Preschool-age children who do and do not stutter were exposed to three emotion-producing overheard conversations—neutral, positive, and angry. Emotion and emotion-regulatory behaviors were coded while participants listened to each conversation and while telling a story after each overheard conversation. Instances of stuttering during each story were counted. Although there was no main effect of conversation type, results indicated that stuttering in preschool-age children is influenced by emotion and language diatheses, as well as coping strategies and situational emotional stressors. Findings support the dual diathesis-stressor model of stuttering. PMID:22016200
Children's Acoustic and Linguistic Adaptations to Peers With Hearing Impairment.
Granlund, Sonia; Hazan, Valerie; Mahon, Merle
2018-05-17
This study aims to examine the clear speaking strategies used by older children when interacting with a peer with hearing loss, focusing on both acoustic and linguistic adaptations in speech. The Grid task, a problem-solving task developed to elicit spontaneous interactive speech, was used to obtain a range of global acoustic and linguistic measures. Eighteen 9- to 14-year-old children with normal hearing (NH) performed the task in pairs, once with a friend with NH and once with a friend with a hearing impairment (HI). In HI-directed speech, children increased their fundamental frequency range and midfrequency intensity, decreased the number of words per phrase, and expanded their vowel space area by increasing F1 and F2 range, relative to NH-directed speech. However, participants did not appear to make changes to their articulation rate, the lexical frequency of content words, or lexical diversity when talking to their friend with HI compared with their friend with NH. Older children show evidence of listener-oriented adaptations to their speech production; although their speech production systems are still developing, they are able to make speech adaptations to benefit the needs of a peer with HI, even without being given a specific instruction to do so. https://doi.org/10.23641/asha.6118817.
Gaudrain, Etienne; Carlyon, Robert P
2013-01-01
Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed.
Gaudrain, Etienne; Carlyon, Robert P.
2013-01-01
Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish target and masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed. PMID:23297922
The influence of target-masker similarity on across-ear interference in dichotic listening
NASA Astrophysics Data System (ADS)
Brungart, Douglas; Simpson, Brian
2004-05-01
In most dichotic listening tasks, the comprehension of a target speech signal presented in one ear is unaffected by the presence of irrelevant speech in the opposite ear. However, recent results have shown that contralaterally presented interfering speech signals do influence performance when a second interfering speech signal is present in the same ear as the target speech. In this experiment, we examined the influence of target-masker similarity on this effect by presenting ipsilateral and contralateral masking phrases spoken by the same talker, a different same-sex talker, or a different-sex talker than the one used to generate the target speech. The results show that contralateral target-masker similarity has the greatest influence on performance when an easily segregated different-sex masker is presented in the target ear, and the least influence when a difficult-to-segregate same-talker masker is presented in the target ear. These results indicate that across-ear interference in dichotic listening is not directly related to the difficulty of the segregation task in the target ear, and suggest that contralateral maskers are least likely to interfere with dichotic speech perception when the same general strategy could be used to segregate the target from the masking voices in the ipsilateral and contralateral ears.
Orchestrating Semiotic Resources in Explicit Strategy Instruction
ERIC Educational Resources Information Center
Shanahan, Lynn E.; Flury-Kashmanian, Caroline
2014-01-01
Research and pedagogical information provided to teachers on implementing explicit strategy instruction has primarily focused on teachers' speech, with limited attention to other modes of communication, such as gesture and artefacts. This interpretive case study investigates two teachers' use of different semiotic resources when introducing…
Speech parts as Poisson processes.
Badalamenti, A F
2001-09-01
This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors and likewise for the joint use of all words types. Any one author is significantly distinguished from any other by at least one word type and sentence length very significantly distinguishes each from all others. The variety of word type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding together with the stochastic models and the relations among them suggest that the noun may be a primitive organizer of written text.
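The entropy measure mentioned in the abstract above can be illustrated with a minimal sketch. It computes the Shannon entropy of a part-of-speech tag distribution and expresses it as a fraction of its maximum possible value. The tag sequence is invented for illustration and is not data from the paper, and taking the maximum over the categories actually observed is an assumption of this sketch.

```python
import math
from collections import Counter

def entropy_fraction(tags):
    """Return (H, H / H_max) in bits for a sequence of category labels.

    H_max is log2 of the number of distinct labels observed, which is
    an assumption of this sketch rather than the paper's definition.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    h_max = math.log2(len(counts)) if len(counts) > 1 else 0.0
    return h, (h / h_max if h_max > 0 else 0.0)

# Hypothetical tag sequence (not taken from the paper)
tags = ["noun"] * 4 + ["verb"] * 2 + ["adj"] * 1 + ["prep"] * 1
h, frac = entropy_fraction(tags)
print(f"entropy = {h:.3f} bits, fraction of maximum = {frac:.3f}")
```

With a real corpus, the same function could be applied to a growing prefix of the text to see how the fraction builds up as more words are read.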
Role of maternal gesture use in speech use by children with fragile X syndrome.
Hahn, Laura J; Zimmer, B Jean; Brady, Nancy C; Swinburne Romine, Rebecca E; Fleming, Kandace K
2014-05-01
The purpose of this study was to investigate how maternal gesture relates to speech production by children with fragile X syndrome (FXS). Participants were 27 young children with FXS (23 boys, 4 girls) and their mothers. Videotaped home observations were conducted between the ages of 25 and 37 months (toddler period) and again between the ages of 60 and 71 months (child period). The videos were later coded for types of maternal utterances and maternal gestures that preceded child speech productions. Children were also assessed with the Mullen Scales of Early Learning at both ages. Maternal gesture use in the toddler period was positively related to expressive language scores at both age periods and was related to receptive language scores in the child period. Maternal proximal pointing, in comparison to other gestures, evoked more speech responses from children during the mother-child interactions, particularly when combined with wh-questions. This study adds to the growing body of research on the importance of contextual variables, such as maternal gestures, in child language development. Parental gesture use may be an easily added ingredient to parent-focused early language intervention programs.
NASA Astrophysics Data System (ADS)
Mapp, Peter
2002-11-01
Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported which indicates that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported where errors of up to 50% were found. The implications for VA and PA system performance verification will be discussed.
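For readers unfamiliar with how an STI-style figure is derived, the sketch below shows the standard conversion from a modulation transfer value m to a transmission index: an apparent signal-to-noise ratio of 10 log10(m / (1 - m)) is clipped to the range -15 to +15 dB and mapped linearly onto 0 to 1. The unweighted mean used in `simple_sti` is a simplification assumed here; the standardized method applies octave-band weightings and corrections that are omitted, and the input values are made up.

```python
import math

def transmission_index(m):
    """Map a modulation transfer value m (0..1) to a transmission index (0..1)."""
    if m <= 0.0:
        return 0.0
    if m >= 1.0:
        return 1.0
    snr = 10.0 * math.log10(m / (1.0 - m))   # apparent SNR in dB
    snr = max(-15.0, min(15.0, snr))          # clip to +/- 15 dB
    return (snr + 15.0) / 30.0

def simple_sti(mod_values):
    """Unweighted mean of transmission indices (assumption: no band weighting)."""
    tis = [transmission_index(m) for m in mod_values]
    return sum(tis) / len(tis)

# Hypothetical modulation transfer values, e.g. degraded by reverberation
print(round(simple_sti([0.9, 0.7, 0.5, 0.3]), 3))
```

Because every step operates on the measured modulation depth alone, anything that alters the modulation transfer without a matching change in intelligibility can bias such a figure.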
Tavano, Alessandro; Pesarin, Anna; Murino, Vittorio; Cristani, Marco
2014-01-01
Individuals with Asperger syndrome/High Functioning Autism fail to spontaneously attribute mental states to the self and others, a life-long phenotypic characteristic known as mindblindness. We hypothesized that mindblindness would affect the dynamics of conversational interaction. Using generative models, in particular Gaussian mixture models and observed influence models, conversations were coded as interacting Markov processes, operating on novel speech/silence patterns, termed Steady Conversational Periods (SCPs). SCPs assume that whenever an agent's process changes state (e.g., from silence to speech), it causes a general transition of the entire conversational process, forcing inter-actant synchronization. SCPs fed into observed influence models, which captured the conversational dynamics of children and adolescents with Asperger syndrome/High Functioning Autism, and age-matched typically developing participants. Analyzing the parameters of the models by means of discriminative classifiers, the dialogs of patients were successfully distinguished from those of control participants. We conclude that meaning-free speech/silence sequences, reflecting inter-actant synchronization, at least partially encode typical and atypical conversational dynamics. This suggests a direct influence of theory of mind abilities onto basic speech initiative behavior. PMID:24489674
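To make the idea of treating speech/silence patterns as a Markov process concrete, the sketch below estimates a first-order transition matrix from a sequence of frame labels. This is a deliberately simplified stand-in: it is neither the observed influence model nor the Steady Conversational Period representation described in the abstract, and the frame labels are hypothetical.

```python
def transition_matrix(states, n_states=2):
    """Estimate first-order Markov transition probabilities from a state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as zeros
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# 0 = silence, 1 = speech; hypothetical frame labels for one speaker
frames = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
P = transition_matrix(frames)
print(P)
```

Parameters such as these, estimated per speaker or per dialog, are the kind of features that a discriminative classifier could then be trained on.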
Argument Structure, Speech Acts, and Roles in Child-Adult Dispute Episodes.
ERIC Educational Resources Information Center
Prescott, Barbara L.
A study identified discourse patterns in potential disputes, deflected disputes, incomplete, and completed disputes from a one-hour conversation involving two 3-year-old female children and one female adult. These varied dispute episodes were identified, coded, and analyzed using a pragmatic model of adult argumentation focusing on the structures,…
ERIC Educational Resources Information Center
Cox, David J.
2012-01-01
To address the developmental deficits of children with autism, several disciplines have come to the forefront within intervention programs. These are speech-pathologists, psychologists/counselors, occupational-therapists/physical-therapists, special-education consultants, behavior analysts, and physicians/medical personnel. As the field of autism…
Speech Perception Deficits in Poor Readers: Auditory Processing or Phonological Coding?
ERIC Educational Resources Information Center
Mody, Maria; And Others
1997-01-01
Forty second-graders, 20 good and 20 poor readers, completed a /ba/-/da/ temporal order judgment (TOJ) task. The groups did not differ in TOJ when /ba/ and /da/ were paired with more easily discriminated syllables. Poor readers' difficulties with /ba/-/da/ reflected perceptual confusion between phonetically similar syllables rather than difficulty…
Predicting Phonetic Transcription Agreement: Insights from Research in Infant Vocalizations
ERIC Educational Resources Information Center
Ramsdell, Heather L.; Oller, D. Kimbrough; Ethington, Corinna A.
2007-01-01
The purpose of this study is to provide new perspectives on correlates of phonetic transcription agreement. Our research focuses on phonetic transcription and coding of infant vocalizations. The findings are presumed to be broadly applicable to other difficult cases of transcription, such as found in severe disorders of speech, which similarly…
Searching for Syllabic Coding Units in Speech Perception
ERIC Educational Resources Information Center
Dumay, Nicolas; Content, Alain
2012-01-01
Two auditory priming experiments tested whether the effect of final phonological overlap relies on syllabic representations. Amount of shared phonemic information and syllabic status of the overlap between nonword primes and targets were varied orthogonally. In the related conditions, CV.CCVC items shared the last syllable (e.g., vi.klyd-p[image…
The Effects of Prohibiting Gestures on Children's Lexical Retrieval Ability
ERIC Educational Resources Information Center
Pine, Karen J.; Bird, Hannah; Kirk, Elizabeth
2007-01-01
Two alternative accounts have been proposed to explain the role of gestures in thinking and speaking. The Information Packaging Hypothesis (Kita, 2000) claims that gestures are important for the conceptual packaging of information before it is coded into a linguistic form for speech. The Lexical Retrieval Hypothesis (Rauscher, Krauss & Chen, 1996)…
HCPCS Coding: An Integral Part of Your Reimbursement Strategy.
Nusgart, Marcia
2013-12-01
The first step to a successful reimbursement strategy is to ensure that your wound care product has the most appropriate Healthcare Common Procedure Coding System (HCPCS) code (or billing) for your product. The correct HCPCS code plays an essential role in patient access to new and existing technologies. When devising a strategy to obtain a HCPCS code for its product, companies must consider a number of factors as follows: (1) Has the product gone through the Food and Drug Administration (FDA) regulatory process or does it need to do so? Will the FDA code designation impact which HCPCS code will be assigned to your product? (2) In what "site of service" do you intend to market your product? Where will your customers use the product? Which coding system (CPT ® or HCPCS) applies to your product? (3) Does a HCPCS code for a similar product already exist? Does your product fit under the existing HCPCS code? (4) Does your product need a new HCPCS code? What is the linkage, if any, between coding, payment, and coverage for the product? Researchers and companies need to start early and place the same emphasis on a reimbursement strategy as it does on a regulatory strategy. Your reimbursement strategy staff should be involved early in the process, preferably during product research and development and clinical trial discussions.
Van Hoesel, Richard; Ramsden, Richard; Odriscoll, Martin
2002-04-01
To characterize some of the benefits available from using two cochlear implants compared with just one, sound-direction identification (ID) abilities, sensitivity to interaural time delays (ITDs) and speech intelligibility in noise were measured for a bilateral multi-channel cochlear implant user. Sound-direction ID in the horizontal plane was tested with a bilateral cochlear implant user. The subject was tested both unilaterally and bilaterally using two independent behind-the-ear ESPRIT (Cochlear Ltd.) processors, as well as bilaterally using custom research processors. Pink noise bursts were presented using an 11-loudspeaker array spanning the subject's frontal 180 degrees arc in an anechoic room. After each burst, the subject was asked to identify which loudspeaker had produced the sound. No explicit training, and no feedback were given. Presentation levels were nominally at 70 dB SPL, except for a repeat experiment using the clinical devices where the presentation levels were reduced to 60 dB SPL to avoid activation of the devices' automatic gain control (AGC) circuits. Overall presentation levels were randomly varied by +/- 3 dB. For the research processor, a "low-update-rate" and a "high-update-rate" strategy were tested. Direct measurements of ITD just noticeable differences (JNDs) were made using a 3 AFC paradigm targeting 70% correct performance on the psychometric function. Stimuli included simple, low-rate electrical pulse trains as well as high-rate pulse trains modulated at 100 Hz. Speech data comparing monaural and binaural performance in noise were also collected with both low, and high update-rate strategies on the research processors. Open-set sentences were presented from directly in front of the subject and competing multi-talker babble noise was presented from the same loudspeaker, or from a loudspeaker placed 90 degrees to the left or right of the subject. 
For the sound-direction ID task, monaural performance using the clinical devices showed large mean absolute errors of 81 degrees and 73 degrees, with standard deviations (averaged across all 11 loudspeakers) of 10 degrees and 17 degrees, for left and right ears, respectively. For bilateral device use at a presentation level of 70 dB SPL, the mean error improved to about 16 degrees with an average standard deviation of 18 degrees. When the presentation level was decreased to 60 dB SPL to avoid activation of the automatic gain control (AGC) circuits in the clinical processors, the mean response error improved further to 8 degrees with a standard deviation of 13 degrees. Further tests with the custom research processors, which had a higher stimulation rate and did not include AGCs, showed comparable response errors: around 8 or 9 degrees and a standard deviation of about 11 degrees for both update rates. The best ITD JNDs measured for this subject were between 350 and 400 microseconds for simple low-rate pulse trains. Speech results showed a substantial headshadow advantage for bilateral device use when speech and noise were spatially separated, but little evidence of binaural unmasking. For spatially coincident speech and noise, listening with both ears showed similar results to listening with either side alone when loudness summation was compensated for. No significant differences were observed between binaural results for high and low update rates in any test configuration. Only for monaural listening in one test configuration did the high rate show a small significant improvement over the low rate. Results show that even if interaural time delay cues are not well coded or perceived, bilateral implants can offer important advantages, both for speech in noise and for sound-direction identification.
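The mean absolute error reported for the sound-direction task can be sketched as follows. The example assumes the 11 loudspeakers are evenly spaced across the frontal 180-degree arc (18 degrees apart), which the abstract does not state explicitly, and the target and response indices are invented for illustration.

```python
def speaker_angles(n=11, span_deg=180.0):
    """Angles of n evenly spaced loudspeakers centered on straight ahead (0 degrees)."""
    step = span_deg / (n - 1)
    return [-span_deg / 2 + i * step for i in range(n)]

def mean_abs_error(targets, responses, angles):
    """Mean absolute angular error between target and response loudspeakers."""
    errors = [abs(angles[t] - angles[r]) for t, r in zip(targets, responses)]
    return sum(errors) / len(errors)

angles = speaker_angles()
# Hypothetical trials: index of the loudspeaker that played, and the one chosen
targets = [0, 5, 10, 3]
responses = [1, 5, 8, 3]
mae = mean_abs_error(targets, responses, angles)
print(f"mean absolute error = {mae:.1f} degrees")
```

Under this spacing, a response that is always off by one loudspeaker would give an error of 18 degrees, which helps put the reported 8 to 16 degree bilateral errors in context.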
Are written and spoken recall of text equivalent?
Kellogg, Ronald T
2007-01-01
Writing is less practiced than speaking, graphemic codes are activated only in writing, and the retrieved representations of the text must be maintained in working memory longer because handwritten output is slower than speech. These extra demands on working memory could result in less effort being given to retrieval during written compared with spoken text recall. To test this hypothesis, college students read or heard Bartlett's "War of the Ghosts" and then recalled the text in writing or speech. Spoken recall produced more accurately recalled propositions and more major distortions (e.g., inferences) than written recall. The results suggest that writing reduces the retrieval effort given to reconstructing the propositions of a text.
Leblanc, Linda A; Geiger, Kaneen B; Sautter, Rachael A; Sidener, Tina M
2007-01-01
The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated in a multiple baseline design across participants. Data were collected on appropriate and inappropriate vocalizations with appropriate vocalizations coded as prompted or unprompted during baseline and treatment sessions. All participants experienced increases in appropriate speech during NLP with variable response patterns. Additionally, the two participants with substantial inappropriate vocalizations showed decreases in inappropriate speech. Implications for intervention in day programs are discussed.
Age Estimation Based on Children's Voice: A Fuzzy-Based Decision Fusion Strategy
Ting, Hua-Nong
2014-01-01
Automatic estimation of a speaker's age is a challenging research topic in the area of speech analysis. In this paper, a novel approach to estimate a speaker's age is presented. The method features a “divide and conquer” strategy wherein the speech data are divided into six groups based on the vowel classes. There are two reasons behind this strategy. First, reduction in the complicated distribution of the processing data improves the classifier's learning performance. Second, different vowel classes contain complementary information for age estimation. Mel-frequency cepstral coefficients are computed for each group and single layer feed-forward neural networks based on self-adaptive extreme learning machine are applied to the features to make a primary decision. Subsequently, fuzzy data fusion is employed to provide an overall decision by aggregating the classifier's outputs. The results are then compared with a number of state-of-the-art age estimation methods. Experiments conducted based on six age groups including children aged between 7 and 12 years revealed that fuzzy fusion of the classifier's outputs resulted in considerable improvement of up to 53.33% in age estimation accuracy. Moreover, the fuzzy fusion of decisions aggregated the complementary information of a speaker's age from various speech sources. PMID:25006595
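The decision-fusion step can be illustrated with a small sketch that combines per-vowel-group classifier scores into one overall decision. Weighted averaging is used here as a simple stand-in and is not the fuzzy fusion method of the paper; the function name, the weights, and the scores are all hypothetical.

```python
def fuse_scores(group_scores, weights=None):
    """Combine per-group class scores by weighted averaging.

    group_scores: one list of class scores per vowel-group classifier.
    weights: optional per-group confidence weights (assumed, not from the paper).
    Returns (index of winning class, fused scores).
    """
    n_groups = len(group_scores)
    n_classes = len(group_scores[0])
    if weights is None:
        weights = [1.0] * n_groups
    total_w = sum(weights)
    fused = [
        sum(w * scores[k] for w, scores in zip(weights, group_scores)) / total_w
        for k in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda k: fused[k])
    return best, fused

# Hypothetical outputs of three vowel-group classifiers over three age classes
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
label, fused = fuse_scores(scores)
print(label, [round(s, 3) for s in fused])
```

In this toy case the first classifier alone would pick class 0, while the fused decision picks class 1, which shows how complementary evidence from several groups can overturn a single classifier's vote.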
CACTI: free, open-source software for the sequential coding of behavioral interactions.
Glynn, Lisa H; Hallgren, Kevin A; Houck, Jon M; Moyers, Theresa B
2012-01-01
The sequential analysis of client and clinician speech in psychotherapy sessions can help to identify and characterize potential mechanisms of treatment and behavior change. Previous studies required coding systems that were time-consuming, expensive, and error-prone. Existing software can be expensive and inflexible, and furthermore, no single package allows for pre-parsing, sequential coding, and assignment of global ratings. We developed a free, open-source, and adaptable program to meet these needs: The CASAA Application for Coding Treatment Interactions (CACTI). Without transcripts, CACTI facilitates the real-time sequential coding of behavioral interactions using WAV-format audio files. Most elements of the interface are user-modifiable through a simple XML file, and can be further adapted using Java through the terms of the GNU Public License. Coding with this software yields interrater reliabilities comparable to previous methods, but at greatly reduced time and expense. CACTI is a flexible research tool that can simplify psychotherapy process research, and has the potential to contribute to the improvement of treatment content and delivery.
Examining the relationship between comprehension and production processes in code-switched language
Guzzardo Tamargo, Rosa E.; Valdés Kroff, Jorge R.; Dussias, Paola E.
2016-01-01
We employ code-switching (the alternation of two languages in bilingual communication) to test the hypothesis, derived from experience-based models of processing (e.g., Boland, Tanenhaus, Carlson, & Garnsey, 1989; Gennari & MacDonald, 2009), that bilinguals are sensitive to the combinatorial distributional patterns derived from production and that they use this information to guide processing during the comprehension of code-switched sentences. An analysis of spontaneous bilingual speech confirmed the existence of production asymmetries involving two auxiliary + participle phrases in Spanish–English code-switches. A subsequent eye-tracking study with two groups of bilingual code-switchers examined the consequences of the differences in distributional patterns found in the corpus study for comprehension. Participants’ comprehension costs mirrored the production patterns found in the corpus study. Findings are discussed in terms of the constraints that may be responsible for the distributional patterns in code-switching production and are situated within recent proposals of the links between production and comprehension. PMID:28670049
Examining the relationship between comprehension and production processes in code-switched language.
Guzzardo Tamargo, Rosa E; Valdés Kroff, Jorge R; Dussias, Paola E
2016-08-01
We employ code-switching (the alternation of two languages in bilingual communication) to test the hypothesis, derived from experience-based models of processing (e.g., Boland, Tanenhaus, Carlson, & Garnsey, 1989; Gennari & MacDonald, 2009), that bilinguals are sensitive to the combinatorial distributional patterns derived from production and that they use this information to guide processing during the comprehension of code-switched sentences. An analysis of spontaneous bilingual speech confirmed the existence of production asymmetries involving two auxiliary + participle phrases in Spanish-English code-switches. A subsequent eye-tracking study with two groups of bilingual code-switchers examined the consequences of the differences in distributional patterns found in the corpus study for comprehension. Participants' comprehension costs mirrored the production patterns found in the corpus study. Findings are discussed in terms of the constraints that may be responsible for the distributional patterns in code-switching production and are situated within recent proposals of the links between production and comprehension.
Experience with code-switching modulates the use of grammatical gender during sentence processing
Valdés Kroff, Jorge R.; Dussias, Paola E.; Gerfen, Chip; Perrotti, Lauren; Bajo, M. Teresa
2016-01-01
Using code-switching as a tool to illustrate how language experience modulates comprehension, the visual world paradigm was employed to examine the extent to which gender-marked Spanish determiners facilitate upcoming target nouns in a group of Spanish-English bilingual code-switchers. The first experiment tested target Spanish nouns embedded in a carrier phrase (Experiment 1b) and included a control Spanish monolingual group (Experiment 1a). The second set of experiments included critical trials in which participants heard code-switches from Spanish determiners into English nouns (e.g., la house) either in a fixed carrier phrase (Experiment 2a) or in variable and complex sentences (Experiment 2b). Across the experiments, bilinguals revealed an asymmetric gender effect in processing, showing facilitation only for feminine target items. These results reflect the asymmetric use of gender in the production of code-switched speech. The extension of the asymmetric effect into Spanish (Experiment 1b) underscores the permeability between language modes in bilingual code-switchers. PMID:28663771
How do auditory cortex neurons represent communication sounds?
Gaucher, Quentin; Huetz, Chloé; Gourévitch, Boris; Laudanski, Jonathan; Occelli, Florian; Edeline, Jean-Marc
2013-11-01
A major goal in auditory neuroscience is to characterize how communication sounds are represented at the cortical level. The present review aims at investigating the role of auditory cortex in the processing of speech, bird songs and other vocalizations, which all are spectrally and temporally highly structured sounds. Whereas earlier studies have simply looked for neurons exhibiting higher firing rates to particular conspecific vocalizations over their modified, artificially synthesized versions, more recent studies determined the coding capacity of temporal spike patterns, which are prominent in primary and non-primary areas (and also in non-auditory cortical areas). In several cases, this information seems to be correlated with the behavioral performance of human or animal subjects, suggesting that spike-timing based coding strategies might set the foundations of our perceptive abilities. Also, it is now clear that the responses of auditory cortex neurons are highly nonlinear and that their responses to natural stimuli cannot be predicted from their responses to artificial stimuli such as moving ripples and broadband noises. Since auditory cortex neurons cannot follow rapid fluctuations of the vocalizations envelope, they only respond at specific time points during communication sounds, which can serve as temporal markers for integrating the temporal and spectral processing taking place at subcortical relays. Thus, the temporal sparse code of auditory cortex neurons can be considered as a first step for generating high level representations of communication sounds independent of the acoustic characteristic of these sounds. This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".
ANN modeling of DNA sequences: new strategies using DNA shape code.
Parbhane, R V; Tambe, S S; Kulkarni, B D
2000-09-01
Two new encoding strategies, namely, wedge and twist codes, which are based on the DNA helical parameters, are introduced to represent DNA sequences in artificial neural network (ANN)-based modeling of biological systems. The performance of the new coding strategies has been evaluated by conducting three case studies involving mapping (modeling) and classification applications of ANNs. The proposed coding schemes have been compared rigorously and shown to outperform the existing coding strategies especially in situations wherein limited data are available for building the ANN models.
Wilhelms, Susanne B; Huss, Fredrik R; Granath, Göran; Sjöberg, Folke
2010-06-01
Objective: To compare three International Classification of Diseases code abstraction strategies that have previously been reported to mirror severe sepsis by examining retrospective Swedish national data from 1987 to 2005 inclusive. Design: Retrospective cohort study. Setting: Swedish hospital discharge database. Patients: All hospital admissions during the period 1987 to 2005 were extracted and these patients were screened for severe sepsis using the three International Classification of Diseases code abstraction strategies, which were adapted for the Swedish version of the International Classification of Diseases. Two code abstraction strategies included both International Classification of Diseases, Ninth Revision and International Classification of Diseases, Tenth Revision codes, whereas one included International Classification of Diseases, Tenth Revision codes alone. Interventions: None. Measurements and main results: The three International Classification of Diseases code abstraction strategies identified 37,990, 27,655, and 12,512 patients, respectively, with severe sepsis. The incidence increased over the years, reaching 0.35 per 1000, 0.43 per 1000, and 0.13 per 1000 inhabitants, respectively. During the International Classification of Diseases, Ninth Revision period, we found 17,096 unique patients and of these, only 2789 patients (16%) met two of the code abstraction strategy lists and 14,307 (84%) met one list. The International Classification of Diseases, Tenth Revision period included 46,979 unique patients, of whom 8% met the criteria of all three International Classification of Diseases code abstraction strategies, 7% met two, and 84% met one only. Conclusions: The three different International Classification of Diseases code abstraction strategies generated three almost separate cohorts of patients with severe sepsis. Thus, the International Classification of Diseases code abstraction strategies for recording severe sepsis in use today provide an unsatisfactory way of estimating the true incidence of severe sepsis. Further studies relating International Classification of Diseases code abstraction strategies to the American College of Chest Physicians/Society of Critical Care Medicine scores are needed.
Language-learning disabilities: Paradigms for the nineties.
Wiig, E H
1991-01-01
We are beginning a decade during which many traditional paradigms in education, special education, and speech-language pathology will undergo change. Among paradigms considered promising for speech-language pathology in the schools are collaborative language intervention and strategy training for language and communication. This presentation introduces management models for developing a collaborative language intervention process, among them the Deming Management Method for Total Quality (TQ) (Deming 1986). Implementation models for language assessment and IEP planning and multicultural issues are also introduced (Damico and Nye 1990; Secord and Wiig in press). While attention to processes involved in developing and implementing collaborative language intervention is paramount, content should not be neglected. To this end, strategy training for language and communication is introduced as a viable paradigm. Macro- and micro-level process models for strategy training are featured and general issues are discussed (Ellis, Deshler, and Schumaker 1989; Swanson 1989; Wiig 1989).
Strategies to Improve Regeneration of the Soft Palate Muscles After Cleft Palate Repair
Carvajal Monroy, Paola L.; Grefte, Sander; Kuijpers-Jagtman, Anne Marie; Wagener, Frank A.D.T.G.
2012-01-01
Children with a cleft in the soft palate have difficulties with speech, swallowing, and sucking. These patients are unable to separate the nasal from the oral cavity leading to air loss during speech. Although surgical repair ameliorates soft palate function by joining the clefted muscles of the soft palate, optimal function is often not achieved. The regeneration of muscles in the soft palate after surgery is hampered because of (1) their low intrinsic regenerative capacity, (2) the muscle properties related to clefting, and (3) the development of fibrosis. Adjuvant strategies based on tissue engineering may improve the outcome after surgery by approaching these specific issues. Therefore, this review will discuss myogenesis in the noncleft and cleft palate, the characteristics of soft palate muscles, and the process of muscle regeneration. Finally, novel therapeutic strategies based on tissue engineering to improve soft palate function after surgical repair are presented. PMID:22697475
Strategies to improve regeneration of the soft palate muscles after cleft palate repair.
Carvajal Monroy, Paola L; Grefte, Sander; Kuijpers-Jagtman, Anne Marie; Wagener, Frank A D T G; Von den Hoff, Johannes W
2012-12-01
Children with a cleft in the soft palate have difficulties with speech, swallowing, and sucking. These patients are unable to separate the nasal from the oral cavity leading to air loss during speech. Although surgical repair ameliorates soft palate function by joining the clefted muscles of the soft palate, optimal function is often not achieved. The regeneration of muscles in the soft palate after surgery is hampered because of (1) their low intrinsic regenerative capacity, (2) the muscle properties related to clefting, and (3) the development of fibrosis. Adjuvant strategies based on tissue engineering may improve the outcome after surgery by approaching these specific issues. Therefore, this review will discuss myogenesis in the noncleft and cleft palate, the characteristics of soft palate muscles, and the process of muscle regeneration. Finally, novel therapeutic strategies based on tissue engineering to improve soft palate function after surgical repair are presented.
Strategies for Teaching Advertising: A Summary.
ERIC Educational Resources Information Center
Flory, Joyce
This paper offers techniques and strategies which high school and college teachers of speech communication can use for teaching units and/or courses in advertising. One such technique is role playing, which can involve the corporate chairperson, the executive coordinator, and chairpersons for magazine advertising, outdoor advertising, broadcast…