Autistic traits and attention to speech: Evidence from typically developing individuals.
Korhonen, Vesa; Werner, Stefan
2017-04-01
Individuals with autism spectrum disorder have a preference for attending to non-speech stimuli over speech stimuli. We are interested in whether non-speech preference is only a feature of diagnosed individuals, and whether we can test implicit preference experimentally. In typically developed individuals, serial recall is disrupted more by speech stimuli than by non-speech stimuli. Since the behaviour of individuals with autistic traits resembles that of individuals with autism, we used serial recall to test whether autistic traits influence task performance during irrelevant speech sounds. The errors made on the serial recall task during speech or non-speech sounds were counted as a measure of speech or non-speech preference relative to a no-sound condition. We replicated the serial order effect and found speech to be more disruptive than non-speech sounds, but were unable to find any associations between the autism quotient scores and the non-speech sounds. Our results may indicate a learnt behavioural response to speech sounds.
Doubé, Wendy; Carding, Paul; Flanagan, Kieran; Kaufman, Jordy; Armitage, Hannah
2018-01-01
Children with speech sound disorders benefit from feedback about the accuracy of sounds they make. Home practice can reinforce feedback received from speech pathologists. Games in mobile device applications could encourage home practice, but those currently available are of limited value because they are unlikely to elaborate "Correct"/"Incorrect" feedback with information that can assist in improving the accuracy of the sound. This protocol proposes a "Wizard of Oz" experiment that aims to provide evidence for the provision of effective multimedia feedback for speech sound development. Children with two common speech sound disorders will play a game on a mobile device and make speech sounds when prompted by the game. A human "Wizard" will provide feedback on the accuracy of the sound but the children will perceive the feedback as coming from the game. Groups of 30 young children will be randomly allocated to one of five conditions: four types of feedback and a control which does not play the game. The results of this experiment will inform not only speech sound therapy, but also other types of language learning, both in general, and in multimedia applications. This experiment is a cost-effective precursor to the development of a mobile application that employs pedagogically and clinically sound processes for speech development in young children. PMID:29674986
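The allocation step described in this protocol (groups of 30 children randomly assigned to five conditions) can be illustrated with a minimal Python sketch. The condition labels and participant identifiers below are hypothetical placeholders, since the abstract does not name the four feedback types, and the shuffle-then-slice scheme is only one simple way to obtain equal-sized random groups; the protocol's actual randomisation procedure is not stated.

```python
import random


def allocate(participants, conditions, seed=None):
    """Randomly assign participants to conditions in equal-sized groups."""
    if len(participants) % len(conditions) != 0:
        raise ValueError("participants must divide evenly among conditions")
    rng = random.Random(seed)          # seeded so the allocation is reproducible
    shuffled = list(participants)      # copy; leave the caller's list untouched
    rng.shuffle(shuffled)
    group_size = len(shuffled) // len(conditions)
    return {
        condition: shuffled[i * group_size:(i + 1) * group_size]
        for i, condition in enumerate(conditions)
    }


# Hypothetical labels: four feedback types plus a control that does not play.
CONDITIONS = ["feedback_A", "feedback_B", "feedback_C", "feedback_D", "control"]
children = [f"child_{n:03d}" for n in range(150)]  # 5 groups of 30 (assumed total)
groups = allocate(children, CONDITIONS, seed=1)
```

Fixing the seed makes the allocation auditable after the fact, which is often useful when a protocol is registered before data collection.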
DETECTION AND IDENTIFICATION OF SPEECH SOUNDS USING CORTICAL ACTIVITY PATTERNS
Centanni, T.M.; Sloan, A.M.; Reed, A.C.; Engineer, C.T.; Rennaker, R.; Kilgard, M.P.
2014-01-01
We have developed a classifier capable of locating and identifying speech sounds using activity from rat auditory cortex with an accuracy equivalent to behavioral performance without the need to specify the onset time of the speech sounds. This classifier can identify speech sounds from a large speech set within 40 ms of stimulus presentation. To compare the temporal limits of the classifier to behavior, we developed a novel task that requires rats to identify individual consonant sounds from a stream of distracter consonants. The classifier successfully predicted the ability of rats to accurately identify speech sounds for syllable presentation rates up to 10 syllables per second (up to 17.9 ± 1.5 bits/sec), which is comparable to human performance. Our results demonstrate that the spatiotemporal patterns generated in primary auditory cortex can be used to quickly and accurately identify consonant sounds from a continuous speech stream without prior knowledge of the stimulus onset times. Improved understanding of the neural mechanisms that support robust speech processing in difficult listening conditions could improve the identification and treatment of a variety of speech processing disorders. PMID:24286757
Developing a Weighted Measure of Speech Sound Accuracy
Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.
2010-01-01
Purpose: The purpose is to develop a system for numerically quantifying a speaker’s phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, we describe a system for differentially weighting speech sound errors based on various levels of phonetic accuracy with a Weighted Speech Sound Accuracy (WSSA) score. We then evaluate the reliability and validity of this measure. Method: Phonetic transcriptions are analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy is compared to existing measures, is used to discriminate typical and disordered speech production, and is evaluated to determine whether it is sensitive to changes in phonetic accuracy over time. Results: Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners’ judgments of severity of a child’s speech disorder. The measure separates children with and without speech sound disorders. WSSA scores also capture growth in phonetic accuracy in toddlers’ speech over time. Conclusion: Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children’s speech. PMID:20699344
Tomblin, J. Bruce; Peng, Shu-Chen; Spencer, Linda J.; Lu, Nelson
2011-01-01
Purpose: This study characterized the development of speech sound production in prelingually deaf children with a minimum of 8 years of cochlear implant (CI) experience. Method: Twenty-seven pediatric CI recipients' spontaneous speech samples from annual evaluation sessions were phonemically transcribed. Accuracy for these speech samples was evaluated in piecewise regression models. Results: As a group, pediatric CI recipients showed steady improvement in speech sound production following implantation, but the improvement rate declined after 6 years of device experience. Piecewise regression models indicated that the slope estimating the participants' improvement rate was statistically greater than 0 during the first 6 years postimplantation, but not after 6 years. The group of pediatric CI recipients' accuracy of speech sound production after 4 years of device experience reasonably predicts their speech sound production after 5–10 years of device experience. Conclusions: The development of speech sound production in prelingually deaf children stabilizes after 6 years of device experience, and typically approaches a plateau by 8 years of device use. Early growth in speech before 4 years of device experience did not predict later rates of growth or levels of achievement. However, good predictions could be made after 4 years of device use. PMID:18695018
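The piecewise regression idea in this abstract (one slope before 6 years of device experience, another after) can be sketched as follows. This is a simplified illustration, not the study's model: it fits two independent ordinary least-squares lines on either side of a fixed breakpoint, whereas the authors' models may have been constrained to join at the breakpoint or may have accounted for repeated measures. The data are synthetic and chosen only to show the pattern of a rising segment followed by a plateau.

```python
def ols_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def piecewise_slopes(xs, ys, breakpoint):
    """Fit separate lines up to and from the breakpoint; return both slopes."""
    early = [(x, y) for x, y in zip(xs, ys) if x <= breakpoint]
    late = [(x, y) for x, y in zip(xs, ys) if x >= breakpoint]
    early_slope, _ = ols_line(*zip(*early))
    late_slope, _ = ols_line(*zip(*late))
    return early_slope, late_slope


# Synthetic illustration only (not data from the study): accuracy climbs by
# 10 points per year of device use until year 6, then stays flat.
years = list(range(1, 11))
accuracy = [20 + 10 * min(t, 6) for t in years]
early_slope, late_slope = piecewise_slopes(years, accuracy, breakpoint=6)
```

With real data each slope would come with a standard error, and the comparison reported in the abstract corresponds to testing whether each slope differs from zero.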
Developing a weighted measure of speech sound accuracy.
Preston, Jonathan L; Ramsdell, Heather L; Oller, D Kimbrough; Edwards, Mary Louise; Tobin, Stephen J
2011-02-01
To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound Accuracy (WSSA) score. The authors then evaluate the reliability and validity of this measure. Phonetic transcriptions were analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy was validated against existing measures, was used to discriminate typical and disordered speech production, and was evaluated to examine sensitivity to changes in phonetic accuracy over time. Reliability between transcribers and consistency of scores among different word sets and testing points are compared. Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners' judgments of the severity of a child's speech disorder. The measure separates children with and without speech sound disorders and captures growth in phonetic accuracy in toddlers' speech over time. The measure correlates highly across transcribers, word lists, and testing points. Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children's speech.
Developing a Weighted Measure of Speech Sound Accuracy
ERIC Educational Resources Information Center
Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.
2011-01-01
Purpose: To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound…
NASA Astrophysics Data System (ADS)
Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki
2012-07-01
Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
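The two double-sideband schemes compared in this abstract differ only in whether the carrier itself is transmitted. A minimal Python sketch of the standard textbook definitions follows; the 30 kHz carrier, 1 kHz test tone, and 192 kHz sample rate are illustrative assumptions rather than values taken from the study, and transposed modulation is left out because the abstract does not define it.

```python
import math


def dsb_tc(signal, carrier_hz, sample_rate, depth=1.0):
    """Double-sideband with transmitted carrier: (1 + m*x[n]) * cos(2*pi*fc*n/fs)."""
    return [
        (1.0 + depth * x) * math.cos(2 * math.pi * carrier_hz * n / sample_rate)
        for n, x in enumerate(signal)
    ]


def dsb_sc(signal, carrier_hz, sample_rate):
    """Double-sideband with suppressed carrier: x[n] * cos(2*pi*fc*n/fs)."""
    return [
        x * math.cos(2 * math.pi * carrier_hz * n / sample_rate)
        for n, x in enumerate(signal)
    ]


# Assumed illustrative parameters (not from the abstract).
SAMPLE_RATE = 192_000   # Hz, high enough to represent an ultrasonic carrier
CARRIER_HZ = 30_000     # Hz
TONE_HZ = 1_000         # Hz, stand-in for a speech signal scaled to [-1, 1]

tone = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(1920)]
tc = dsb_tc(tone, CARRIER_HZ, SAMPLE_RATE)
sc = dsb_sc(tone, CARRIER_HZ, SAMPLE_RATE)
```

In DSB-TC the carrier is present even when the input is silent, whereas DSB-SC output falls to zero with the input.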
Cortical activity patterns predict robust speech discrimination ability in noise
Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.
2012-01-01
The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
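The classifier described in this abstract compares a single-trial activity pattern with the average patterns evoked by known sounds, without being told when the stimulus began. A minimal sketch of that idea is shown below, under stated assumptions: activity is represented as a list of time bins, each holding spike counts per recording site; Euclidean distance is used as the similarity measure; and the onset is handled by sliding each template across the trial. None of these choices is confirmed by the abstract, and the sound labels and numbers are made up for illustration.

```python
import math


def average_pattern(trials):
    """Element-wise mean of equally shaped trials (time bins x sites)."""
    n = len(trials)
    return [
        [sum(trial[t][s] for trial in trials) / n for s in range(len(trials[0][t]))]
        for t in range(len(trials[0]))
    ]


def distance(a, b):
    """Euclidean distance between two patterns of identical shape."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


def classify(trial, templates):
    """Return (sound, offset) of the template window closest to the trial."""
    best = None
    for sound, template in templates.items():
        width = len(template)
        # Slide the template over every possible onset in the trial.
        for offset in range(len(trial) - width + 1):
            d = distance(trial[offset:offset + width], template)
            if best is None or d < best[0]:
                best = (d, sound, offset)
    return best[1], best[2]


# Made-up example: two sites, templates two time bins long.
templates = {
    "dad": average_pattern([[[3, 0], [1, 2]], [[3, 0], [1, 2]]]),
    "bad": average_pattern([[[0, 4], [2, 1]], [[0, 2], [2, 1]]]),
}
trial = [[0, 0], [0, 0], [0, 3], [2, 2], [0, 0]]
result = classify(trial, templates)
```

Here the trial is best matched by the "bad" template starting at the third time bin, so the classifier recovers both the identity and the onset of the sound.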
The Long Road to Automation: Neurocognitive Development of Letter-Speech Sound Processing
ERIC Educational Resources Information Center
Froyen, Dries J. W.; Bonte, Milene L.; van Atteveldt, Nienke; Blomert, Leo
2009-01-01
In transparent alphabetic languages, the expected standard for complete acquisition of letter-speech sound associations is within one year of reading instruction. The neural mechanisms underlying the acquisition of letter-speech sound associations have, however, hardly been investigated. The present article describes an ERP study with beginner and…
Jansson-Verkasalo, Eira; Eggers, Kurt; Järvenpää, Anu; Suominen, Kalervo; Van den Bergh, Bea; De Nil, Luc; Kujala, Teija
2014-09-01
Recent theoretical conceptualizations suggest that disfluencies in stuttering may arise from several factors, one of them being atypical auditory processing. The main purpose of the present study was to investigate whether speech sound encoding and central auditory discrimination are affected in children who stutter (CWS). Participants were 10 CWS and 12 typically developing children with fluent speech (TDC). Event-related potentials (ERPs) for syllables and syllable changes [consonant, vowel, vowel-duration, frequency (F0), and intensity changes], critical in speech perception and language development, of CWS were compared to those of TDC. There were no significant group differences in the amplitudes or latencies of the P1 or N2 responses elicited by the standard stimuli. However, the Mismatch Negativity (MMN) amplitude was significantly smaller in CWS than in TDC. For TDC all deviants of the linguistic multifeature paradigm elicited significant MMN amplitudes, comparable with the results found earlier with the same paradigm in 6-year-old children. In contrast, only the duration change elicited a significant MMN in CWS. The results showed that central auditory speech-sound processing was typical at the level of sound encoding in CWS. In contrast, central speech-sound discrimination, as indexed by the MMN for multiple sound features (both phonetic and prosodic), was atypical in the group of CWS. Findings were linked to existing conceptualizations on stuttering etiology. The reader will be able (a) to describe recent findings on central auditory speech-sound processing in individuals who stutter, (b) to describe the measurement of auditory reception and central auditory speech-sound discrimination, (c) to describe the findings of central auditory speech-sound discrimination, as indexed by the mismatch negativity (MMN), in children who stutter. Copyright © 2014 Elsevier Inc. All rights reserved.
Multi-sensory learning and learning to read.
Blomert, Leo; Froyen, Dries
2010-09-01
The basis of literacy acquisition in alphabetic orthographies is the learning of the associations between the letters and the corresponding speech sounds. In spite of this primacy in learning to read, there is only scarce knowledge on how this audiovisual integration process works and which mechanisms are involved. Recent electrophysiological studies of letter-speech sound processing have revealed that normally developing readers take years to automate these associations and dyslexic readers hardly exhibit automation of these associations. It is argued that the reason for this effortful learning may reside in the nature of the audiovisual process that is recruited for the integration of in principle arbitrarily linked elements. It is shown that letter-speech sound integration does not resemble the processes involved in the integration of natural audiovisual objects such as audiovisual speech. The automatic symmetrical recruitment of the assumedly uni-sensory visual and auditory cortices in audiovisual speech integration does not occur for letter and speech sound integration. It is also argued that letter-speech sound integration only partly resembles the integration of arbitrarily linked unfamiliar audiovisual objects. Letter-sound integration and artificial audiovisual objects share the necessity of a narrow time window for integration to occur. However, they differ from these artificial objects, because they constitute an integration of partly familiar elements which acquire meaning through the learning of an orthography. Although letter-speech sound pairs share similarities with audiovisual speech processing as well as with unfamiliar, arbitrary objects, it seems that letter-speech sound pairs develop into unique audiovisual objects that furthermore have to be processed in a unique way in order to enable fluent reading and thus very likely recruit other neurobiological learning mechanisms than the ones involved in learning natural or arbitrary unfamiliar audiovisual associations. Copyright 2010 Elsevier B.V. All rights reserved.
Barton-Hulsey, Andrea; Sevcik, Rose A; Romski, MaryAnn
2018-05-03
A number of intrinsic factors, including expressive speech skills, have been suggested to place children with developmental disabilities at risk for limited development of reading skills. This study examines the relationship between these factors, speech ability, and children's phonological awareness skills. A nonexperimental study design was used to examine the relationship between intrinsic skills of speech, language, print, and letter-sound knowledge to phonological awareness in 42 children with developmental disabilities between the ages of 48 and 69 months. Hierarchical multiple regression was done to determine if speech ability accounted for a unique amount of variance in phonological awareness skill beyond what would be expected by developmental skills inclusive of receptive language and print and letter-sound knowledge. A range of skill in all areas of direct assessment was found. Children with limited speech were found to have emerging skills in print knowledge, letter-sound knowledge, and phonological awareness. Speech ability did not predict a significant amount of variance in phonological awareness beyond what would be expected by developmental skills of receptive language and print and letter-sound knowledge. Children with limited speech ability were found to have receptive language and letter-sound knowledge that supported the development of phonological awareness skills. This study provides implications for practitioners and researchers concerning the factors related to early reading development in children with limited speech ability and developmental disabilities.
The sound symbolism bootstrapping hypothesis for language acquisition and language evolution
Imai, Mutsumi; Kita, Sotaro
2014-01-01
Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture. PMID:25092666
ERIC Educational Resources Information Center
Hodge, Megan M.; Gotzke, Carrie L.
2011-01-01
Listeners' identification of young children's productions of minimally contrastive words and predictive relationships between accurately identified words and intelligibility scores obtained from a 100-word spontaneous speech sample were determined for 36 children with typically developing speech (TDS) and 36 children with speech sound disorders…
... sound different from the way it normally sounds. Causes Some of these disorders develop gradually, but anyone can develop a speech and language impairment suddenly, usually in a trauma. APHASIA Alzheimer disease Brain tumor (more common in aphasia than ...
Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P
2017-01-01
Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year-old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age. PMID:28738063
The sound symbolism bootstrapping hypothesis for language acquisition and language evolution.
Imai, Mutsumi; Kita, Sotaro
2014-09-19
Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
ERIC Educational Resources Information Center
Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen
2017-01-01
Purpose: The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables--words of three or more syllables--are important to consider because unlike…
NASA Astrophysics Data System (ADS)
Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki
2013-07-01
Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed. However, there is room for improvement, particularly in terms of sound quality. BCU speech is accompanied by a strong high-pitched tone and contains some distortion. In this study, the sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulations] and air-conducted (AC) speech was quantitatively evaluated using semantic differential and factor analysis. The results showed that all the types of BCU speech had higher metallic and lower esthetic factor scores than AC speech. On the other hand, transposed speech was generally closer to AC speech than the other types of BCU speech; the transposed speech showed a higher powerfulness factor score than the other types of BCU speech and a higher esthetic factor score than DSB-SC speech. These results provide useful information for further development of the BCUHA.
Nonspeech oral motor treatment issues related to children with developmental speech sound disorders.
Ruscello, Dennis M
2008-07-01
This article examines nonspeech oral motor treatments (NSOMTs) in the population of clients with developmental speech sound disorders. NSOMTs are a collection of nonspeech methods and procedures that claim to influence tongue, lip, and jaw resting postures; increase strength; improve muscle tone; facilitate range of motion; and develop muscle control. In the case of developmental speech sound disorders, NSOMTs are employed before or simultaneous with actual speech production treatment. First, NSOMTs are defined for the reader, and there is a discussion of NSOMTs under the categories of active muscle exercise, passive muscle exercise, and sensory stimulation. Second, different theories underlying NSOMTs along with the implications of the theories are discussed. Finally, a review of pertinent investigations is presented. The application of NSOMTs is questionable due to a number of reservations that include (a) the implied cause of developmental speech sound disorders, (b) neurophysiologic differences between the limbs and oral musculature, (c) the development of new theories of movement and movement control, and (d) the paucity of research literature concerning NSOMTs. There is no substantive evidence to support NSOMTs as interventions for children with developmental speech sound disorders.
Speech perception skills of deaf infants following cochlear implantation: a first report
Houston, Derek M.; Pisoni, David B.; Kirk, Karen Iler; Ying, Elizabeth A.; Miyamoto, Richard T.
2012-01-01
Summary. Objective: We adapted a behavioral procedure that has been used extensively with normal-hearing (NH) infants, the visual habituation (VH) procedure, to assess deaf infants’ discrimination and attention to speech. Methods: Twenty-four NH 6-month-olds, 24 NH 9-month-olds, and 16 deaf infants at various ages before and following cochlear implantation (CI) were tested in a sound booth on their caregiver’s lap in front of a TV monitor. During the habituation phase, each infant was presented with a repeating speech sound (e.g. ‘hop hop hop’) paired with a visual display of a checkerboard pattern on half of the trials (‘sound trials’) and only the visual display on the other half (‘silent trials’). When the infant’s looking time decreased and reached a habituation criterion, a test phase began. This consisted of two trials: an ‘old trial’ that was identical to the ‘sound trials’ and a ‘novel trial’ that consisted of a different repeating speech sound (e.g. ‘ahhh’) paired with the same checkerboard pattern. Results: During the habituation phase, NH infants looked significantly longer during the sound trials than during the silent trials. However, deaf infants who had received cochlear implants (CIs) displayed a much weaker preference for the sound trials. On the other hand, both NH infants and deaf infants with CIs attended significantly longer to the visual display during the novel trial than during the old trial, suggesting that they were able to discriminate the speech patterns. Before receiving CIs, deaf infants did not show any preferences. Conclusions: Taken together, the findings suggest that deaf infants who receive CIs are able to detect and discriminate some speech patterns. However, their overall attention to speech sounds may be less than NH infants’. Attention to speech may impact other aspects of speech perception and spoken language development, such as segmenting words from fluent speech and learning novel words. Implications of the effects of early auditory deprivation and age at CI on speech perception and language development are discussed. PMID:12697350
Ultrasound analysis of tongue contour for the sound [j] in adults and children.
Barberena, Luciana da Silva; Simoni, Simone Nicolini de; Souza, Rosalina Correa Sobrinho de; Moraes, Denis Altieri de Oliveira; Berti, Larissa Cristina; Keske-Soares, Márcia
2017-12-11
To analyze and compare the mean tongue contours and articulatory gestures in the production of the sound [j] in adults and children with typical and atypical speech development. The children with atypical development presented speech sound disorders. The diagnosis was determined by speech assessments. The study sample was composed of 90 individuals divided into three groups: 30 adults with typical speech development aged 19-44 years (AT), 30 children with typical speech development (CT), and 30 children with speech sound disorders, named as atypical in this study, aged four years to eight years and eleven months (CA). Ultrasonography assessment of tongue movements was performed for all groups. Mean tongue contours were compared between the three groups in different vocalic contexts following the sound [j]. The maximum elevation of the tongue tip was considered for delimitation of gestures using the Articulate Assistant Advanced (AAA) software and images in sagittal plane/Mode B. The points that intercepted the tongue curves were analyzed with the statistical tool R. The graphs of tongue contours were obtained adopting a 95% confidence interval. After that, the regions with significant statistical differences (p<0.05) between the CT and CA groups were obtained. The mean tongue contours demonstrated the gesture for the sound [j] in the comparison between typical and atypical children. For the semivowel [j], there is an articulatory gesture of tongue and dorsum towards the center of the hard palate, with significant differences observed between the children. The results showed differences between the groups of children regarding the ability to refine articulatory gestures.
Masapollo, Matthew; Polka, Linda; Ménard, Lucie
2016-03-01
To learn to produce speech, infants must effectively monitor and assess their own speech output. Yet very little is known about how infants perceive speech produced by an infant, which has higher voice pitch and formant frequencies compared to adult or child speech. Here, we tested whether pre-babbling infants (at 4-6 months) prefer listening to vowel sounds with infant vocal properties over vowel sounds with adult vocal properties. A listening preference favoring infant vowels may derive from their higher voice pitch, which has been shown to attract infant attention in infant-directed speech (IDS). In addition, infants' nascent articulatory abilities may induce a bias favoring infant speech given that 4- to 6-month-olds are beginning to produce vowel sounds. We created infant and adult /i/ ('ee') vowels using a production-based synthesizer that simulates the act of speaking in talkers at different ages and then tested infants across four experiments using a sequential preferential listening task. The findings provide the first evidence that infants preferentially attend to vowel sounds with infant voice pitch and/or formants over vowel sounds with no infant-like vocal properties, supporting the view that infants' production abilities influence how they process infant speech. The findings with respect to voice pitch also reveal parallels between IDS and infant speech, raising new questions about the role of this speech register in infant development. Research exploring the underpinnings and impact of this perceptual bias can expand our understanding of infant language development. © 2015 John Wiley & Sons Ltd.
Phonological Encoding in Speech-Sound Disorder: Evidence from a Cross-Modal Priming Experiment
ERIC Educational Resources Information Center
Munson, Benjamin; Krause, Miriam O. P.
2017-01-01
Background: Psycholinguistic models of language production provide a framework for determining the locus of language breakdown that leads to speech-sound disorder (SSD) in children. Aims: To examine whether children with SSD differ from their age-matched peers with typical speech and language development (TD) in the ability phonologically to…
Korean speech sound development in children from bilingual Japanese-Korean environments
Kim, Jeoung Suk; Lee, Jun Ho; Choi, Yoon Mi; Kim, Hyun Gi; Kim, Sung Hwan; Lee, Min Kyung
2010-01-01
Purpose: This study investigates Korean speech sound development, including articulatory error patterns, among the Japanese-Korean children whose mothers are Japanese immigrants to Korea. Methods: The subjects were 28 Japanese-Korean children with normal development born to Japanese women immigrants who lived in Jeonbuk province, Korea. They were assessed through Computerized Speech Lab 4500. The control group consisted of 15 Korean children who lived in the same area. Results: The values of the voice onset time of consonants /ph/, /t/, /th/, and /k*/ among the children were prolonged. The children replaced the lenis sounds with aspirated or fortis sounds rather than replacing the fortis sounds with lenis or aspirated sounds, which are typical among Japanese immigrants. The children showed numerous articulatory errors for /c/ and /l/ sounds (similar to Koreans) rather than errors on /p/ sounds, which are more frequent among Japanese immigrants. The vowel formants of the children showed a significantly prolonged vowel /o/ as compared to that of Korean children (P<0.05). The Japanese immigrants and their children showed a similar substitution /n/ for /ɧ/ [Japanese immigrants (62.5%) vs Japanese-Korean children (14.3%)], which is rarely seen among Koreans. Conclusion: The findings suggest that Korean speech sound development among Japanese-Korean children is influenced not only by the Korean language environment but also by their maternal language. Therefore, appropriate language education programs may be warranted not only for immigrant women but also for their children. PMID:21189968
Speech processing using maximum likelihood continuity mapping
Hogden, John E.
2000-01-01
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
Francisco, Danira Tavares; Wertzner, Haydée Fiszbein
2017-01-01
This study describes the criteria that are used in ultrasound to measure the differences between the tongue contours that produce [s] and [ʃ] sounds in the speech of adults, typically developing children (TDC), and children with speech sound disorder (SSD) with the phonological process of palatal fronting. Overlapping images of the tongue contours that resulted from 35 subjects producing the [s] and [ʃ] sounds were analysed to select 11 spokes on the radial grid that were spread over the tongue contour. The difference was calculated between the mean contour of the [s] and [ʃ] sounds for each spoke. A cluster analysis produced groups with some consistency in the pattern of articulation across subjects and differentiated adults and TDC to some extent and children with SSD with a high level of success. Children with SSD were less likely to show differentiation of the tongue contours between the articulation of [s] and [ʃ].
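The per-spoke comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `mean_contour` and `spoke_differences`, the millimetre units, and the sample values are hypothetical, and the example uses 3 spokes rather than the study's 11 to stay short. The cluster analysis step is not reproduced here.

```python
from statistics import mean

def mean_contour(contours):
    """Average several tongue contours spoke by spoke.

    Each contour is a list of radial distances, one per spoke,
    always listed in the same spoke order.
    """
    return [mean(spoke_values) for spoke_values in zip(*contours)]

def spoke_differences(s_contours, sh_contours):
    """Return mean([s]) minus mean([ʃ]) for every spoke."""
    mean_s = mean_contour(s_contours)
    mean_sh = mean_contour(sh_contours)
    return [a - b for a, b in zip(mean_s, mean_sh)]

# Hypothetical data: two repetitions per sound, three spokes, distances in mm.
s_reps = [[40.0, 52.0, 47.0], [42.0, 54.0, 47.0]]
sh_reps = [[40.0, 57.0, 51.0], [42.0, 59.0, 53.0]]

diffs = spoke_differences(s_reps, sh_reps)
print(diffs)  # one signed difference per spoke
```

A speaker whose differences stay near zero on every spoke would be one showing little differentiation between the two contours, which is the pattern the abstract reports as more likely in children with SSD.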
ERIC Educational Resources Information Center
Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.
2017-01-01
Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…
Sensorimotor influences on speech perception in infancy.
Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F
2015-11-03
The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline
2013-01-01
A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). 
The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional practice, and (d) SLPs' collaboration with other professionals. Readers will (1) recognize that multilingual children with speech sound disorders have both similar and different needs to monolingual children when working with speech-language pathologists; (2) describe the challenges for speech-language pathologists who work with multilingual children; (3) recall the importance of cultural competence for speech-language pathologists; (4) identify methods for international collaboration and consultation; and (5) recognize the importance of engaging with families and people within their local communities for supporting multilingual children in context. Copyright © 2013 Elsevier Inc. All rights reserved.
Visual stimuli in intervention approaches for pre-schoolers diagnosed with phonological delay.
Pedro, Cassandra Ferreira; Lousada, Marisa; Hall, Andreia; Jesus, Luis M T
2018-04-01
The aim of this study was to develop and content validate specific speech and language intervention picture cards: The Letter-Sound (L&S) cards. The present study was also focused on assessing the influence of these cards on letter-sound correspondences and speech sound production. An expert panel of six speech and language therapists analysed and discussed the L&S cards based on several criteria previously established. A Speech and Language Therapist carried out a 6-week therapeutic intervention with a group of seven Portuguese phonologically delayed pre-schoolers aged 5;3 to 6;5. The modified Bland-Altman method revealed good agreement among evaluators, that is the majority of the values was between the agreement limits. Additional outcome measures were collected before and after the therapeutic intervention process. Results indicate that the L&S cards facilitate the acquisition of letter-sound correspondences. Regarding speech sound production, some improvements were also observed at word level. The L&S cards are therefore likely to give phonetic cues, which are crucial for the correct production of therapeutic targets. These visual cues seemed to have helped children with phonological delay develop the above-mentioned skills.
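As a rough illustration of what "values between the agreement limits" means, the sketch below computes the standard two-rater Bland-Altman limits (mean difference plus or minus 1.96 standard deviations). This is an assumption for illustration: the abstract refers to a modified method for its panel of evaluators and does not describe it, and the function names and ratings here are hypothetical.

```python
from statistics import mean, stdev

def bland_altman_limits(rater_a, rater_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired ratings."""
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    bias = mean(diffs)          # average disagreement between the raters
    spread = stdev(diffs)       # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread

def share_within_limits(rater_a, rater_b, z=1.96):
    """Fraction of paired differences that fall inside the limits."""
    _, lower, upper = bland_altman_limits(rater_a, rater_b, z)
    diffs = [a - b for a, b in zip(rater_a, rater_b)]
    inside = sum(1 for d in diffs if lower <= d <= upper)
    return inside / len(diffs)

# Hypothetical scores given by two evaluators to six cards.
rater_a = [4, 5, 3, 4, 5, 2]
rater_b = [4, 4, 3, 5, 5, 2]

bias, lower, upper = bland_altman_limits(rater_a, rater_b)
print(bias, lower, upper, share_within_limits(rater_a, rater_b))
```

"Good agreement" in this sense means a bias close to zero and most differences inside the limits.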
Motor-Based Treatment with and without Ultrasound Feedback for Residual Speech-Sound Errors
ERIC Educational Resources Information Center
Preston, Jonathan L.; Leece, Megan C.; Maas, Edwin
2017-01-01
Background: There is a need to develop effective interventions and to compare the efficacy of different interventions for children with residual speech-sound errors (RSSEs). Rhotics (the r-family of sounds) are frequently in error in American English-speaking children with RSSEs and are commonly targeted in treatment. One treatment approach involves…
Discrimination of speech and non-speech sounds following theta-burst stimulation of the motor cortex
Rogers, Jack C.; Möttönen, Riikka; Boyles, Rowan; Watkins, Kate E.
2014-01-01
Perceiving speech engages parts of the motor system involved in speech production. The role of the motor cortex in speech perception has been demonstrated using low-frequency repetitive transcranial magnetic stimulation (rTMS) to suppress motor excitability in the lip representation and disrupt discrimination of lip-articulated speech sounds (Möttönen and Watkins, 2009). Another form of rTMS, continuous theta-burst stimulation (cTBS), can produce longer-lasting disruptive effects following a brief train of stimulation. We investigated the effects of cTBS on motor excitability and discrimination of speech and non-speech sounds. cTBS was applied for 40 s over either the hand or the lip representation of motor cortex. Motor-evoked potentials recorded from the lip and hand muscles in response to single pulses of TMS revealed no measurable change in motor excitability due to cTBS. This failure to replicate previous findings may reflect the unreliability of measurements of motor excitability related to inter-individual variability. We also measured the effects of cTBS on a listener’s ability to discriminate: (1) lip-articulated speech sounds from sounds not articulated by the lips (“ba” vs. “da”); (2) two speech sounds not articulated by the lips (“ga” vs. “da”); and (3) non-speech sounds produced by the hands (“claps” vs. “clicks”). Discrimination of lip-articulated speech sounds was impaired between 20 and 35 min after cTBS over the lip motor representation. Specifically, discrimination of across-category ba–da sounds presented with an 800-ms inter-stimulus interval was reduced to chance level performance. This effect was absent for speech sounds that do not require the lips for articulation and non-speech sounds. Stimulation over the hand motor representation did not affect discrimination of speech or non-speech sounds. These findings show that stimulation of the lip motor representation disrupts discrimination of speech sounds in an articulatory feature-specific way. 
PMID:25076928
Reading Skills of Students with Speech Sound Disorders at Three Stages of Literacy Development
ERIC Educational Resources Information Center
Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.
2013-01-01
Purpose: The relationship between phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with…
ERIC Educational Resources Information Center
Apel, Kenn; Lawrence, Jessika
2011-01-01
Purpose: In this study, the authors compared the morphological awareness abilities of children with speech sound disorder (SSD) and children with typical speech skills and examined how morphological awareness ability predicted word-level reading and spelling performance above other known contributors to literacy development. Method: Eighty-eight…
Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds
Agnew, Z.K.; McGettigan, C.; Scott, S.K.
2012-01-01
Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557
Auditory-Motor Processing of Speech Sounds
Möttönen, Riikka; Dutton, Rebekah; Watkins, Kate E.
2013-01-01
The motor regions that control movements of the articulators activate during listening to speech and contribute to performance in demanding speech recognition and discrimination tasks. Whether the articulatory motor cortex modulates auditory processing of speech sounds is unknown. Here, we aimed to determine whether the articulatory motor cortex affects the auditory mechanisms underlying discrimination of speech sounds in the absence of demanding speech tasks. Using electroencephalography, we recorded responses to changes in sound sequences, while participants watched a silent video. We also disrupted the lip or the hand representation in left motor cortex using transcranial magnetic stimulation. Disruption of the lip representation suppressed responses to changes in speech sounds, but not piano tones. In contrast, disruption of the hand representation had no effect on responses to changes in speech sounds. These findings show that disruptions within, but not outside, the articulatory motor cortex impair automatic auditory discrimination of speech sounds. The findings provide evidence for the importance of auditory-motor processes in efficient neural analysis of speech sounds. PMID:22581846
Goldrick, Matthew; Keshet, Joseph; Gustafson, Erin; Heller, Jordana; Needle, Jeremy
2016-04-01
Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving towards intended "pig," revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing for objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to reliably observe. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Nishiura, Takanobu; Nakamura, Satoshi
2003-10-01
Humans communicate with each other through speech by focusing on the target speech among environmental sounds in real acoustic environments. We can easily identify the target sound from other environmental sounds. For hands-free speech recognition, the identification of the target speech from environmental sounds is imperative. This mechanism may also be important for a self-moving robot to sense the acoustic environments and communicate with humans. Therefore, this paper first proposes hidden Markov model (HMM)-based environmental sound source identification. Environmental sounds are modeled by three states of HMMs and evaluated using 92 kinds of environmental sounds. The identification accuracy was 95.4%. This paper also proposes a new HMM composition method that composes speech HMMs and an HMM of categorized environmental sounds for robust environmental sound-added speech recognition. As a result of the evaluation experiments, we confirmed that the proposed HMM composition outperforms the conventional HMM composition with speech HMMs and a noise (environmental sound) HMM trained using noise periods prior to the target speech in a captured signal. [Work supported by Ministry of Public Management, Home Affairs, Posts and Telecommunications of Japan.]
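The identification step (score a sound under each class's three-state HMM and pick the best) can be sketched with the forward algorithm. This is a toy illustration under strong assumptions: observations are discrete symbol codes instead of the acoustic features a real system would use, the two classes ("knock", "bell") and every probability are invented, and nothing here reproduces the paper's HMM composition method.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete symbols."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return loglik

def identify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical left-to-right, three-state models over symbols 0, 1, 2.
START = [1.0, 0.0, 0.0]
TRANS = [[0.6, 0.4, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]]
models = {
    "knock": (START, TRANS, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]),
    "bell": (START, TRANS, [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]),
}

print(identify([0, 0, 1, 1, 2, 2], models))
print(identify([2, 2, 1, 1, 0, 0], models))
```

Per-step rescaling keeps the forward variables in a safe numeric range for long sequences; the log-likelihood is recovered by summing the logs of the scale factors.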
Schaadt, Gesa; van der Meer, Elke; Pannekamp, Ann; Oberecker, Regine; Männel, Claudia
2018-01-17
During information processing, individuals benefit from bimodally presented input, as has been demonstrated for speech perception (i.e., printed letters and speech sounds) or the perception of emotional expressions (i.e., facial expression and voice tuning). While typically developing individuals show this bimodal benefit, school children with dyslexia do not. Currently, it is unknown whether the bimodal processing deficit in dyslexia also occurs for visual-auditory speech processing that is independent of reading and spelling acquisition (i.e., no letter-sound knowledge is required). Here, we tested school children with and without spelling problems on their bimodal perception of video-recorded mouth movements pronouncing syllables. We analyzed the event-related potential Mismatch Response (MMR) to visual-auditory speech information and compared this response to the MMR to monomodal speech information (i.e., auditory-only, visual-only). We found a reduced MMR with later onset to visual-auditory speech information in children with spelling problems compared to children without spelling problems. Moreover, when comparing bimodal and monomodal speech perception, we found that children without spelling problems showed significantly larger responses in the visual-auditory experiment compared to the visual-only response, whereas children with spelling problems did not. Our results suggest that children with dyslexia exhibit general difficulties in bimodal speech perception independently of letter-speech sound knowledge, as apparent in altered bimodal speech perception and lacking benefit from bimodal information. This general deficit in children with dyslexia may underlie the previously reported reduced bimodal benefit for letter-speech sound combinations and similar findings in emotion perception. Copyright © 2018 Elsevier Ltd. All rights reserved.
EEG oscillations entrain their phase to high-level features of speech sound.
Zoefel, Benedikt; VanRullen, Rufin
2016-01-01
Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation, and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed "high-level" speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech. Copyright © 2015 Elsevier Inc. All rights reserved.
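The distinction drawn above between the degree of entrainment and the phase difference can be made concrete with a generic phase-locking value. The sketch assumes the instantaneous phases of the EEG and of the sound have already been extracted (for example by band-pass filtering and a Hilbert transform, which is not shown), and the function name `phase_locking` is hypothetical. The abstract does not state which exact measure the authors used.

```python
import cmath
import math

def phase_locking(eeg_phases, sound_phases):
    """Return (degree of locking in [0, 1], mean phase difference in radians).

    The phase difference is EEG phase minus sound phase at each sample.
    """
    vectors = [cmath.exp(1j * (e - s)) for e, s in zip(eeg_phases, sound_phases)]
    mean_vector = sum(vectors) / len(vectors)
    return abs(mean_vector), cmath.phase(mean_vector)

# Hypothetical phases: the EEG follows the sound with a constant lag of pi/4.
sound = [0.1 * k for k in range(200)]
eeg_locked = [s + math.pi / 4 for s in sound]
# No consistent relation: the differences are spread evenly around the circle.
eeg_unlocked = [s + 2 * math.pi * k / 200 for k, s in enumerate(sound)]

print(phase_locking(eeg_locked, sound))
print(phase_locking(eeg_unlocked, sound))
```

Two conditions can share the same locking strength while differing in the returned angle, which is the pattern the abstract reports for original speech versus speech/noise stimuli.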
Stop consonant voicing in young children's speech: Evidence from a cross-sectional study
NASA Astrophysics Data System (ADS)
Ganser, Emily
There are intuitive reasons to believe that speech-sound acquisition and language acquisition should be related in development. Surprisingly, only recently has research begun to parse just how the two might be related. This study investigated possible correlations between speech-sound acquisition and language acquisition, as part of a large-scale, longitudinal study of the relationship between different types of phonological development and vocabulary growth in the preschool years. Productions of voiced and voiceless stop-initial words were recorded from 96 children aged 28-39 months. Voice Onset Time (VOT, in ms) for each token context was calculated. A mixed-model logistic regression was calculated which predicted whether the sound was intended to be voiced or voiceless based on its VOT. This model estimated the slopes of the logistic function for each child. This slope was referred to as Robustness of Contrast (based on Holliday, Reidy, Beckman, and Edwards, 2015), defined as being the degree of categorical differentiation between the production of two speech sounds or classes of sounds, in this case, voiced and voiceless stops. Results showed a wide range of slopes for individual children, suggesting that slope-derived Robustness of Contrast could be a viable means of measuring a child's acquisition of the voicing contrast. Robustness of Contrast was then compared to traditional measures of speech and language skills to investigate whether there was any correlation between the production of stop voicing and broader measures of speech and language development. The Robustness of Contrast measure was found to correlate with all individual measures of speech and language, suggesting that it might indeed be predictive of later language skills.
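The idea of reading a logistic slope as a measure of contrast can be sketched as follows. This is a simplification: it fits an ordinary logistic regression separately for each child by gradient ascent, whereas the study fitted a single mixed-model logistic regression and took child-level slopes from it. The function name `contrast_slope` and all VOT values are hypothetical.

```python
import math
from statistics import mean, pstdev

def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def contrast_slope(vots_ms, is_voiceless, n_iter=5000, lr=1.0):
    """Fit P(voiceless | VOT) = sigmoid(b0 + b1 * VOT) for one child.

    Returns b1 in log-odds per millisecond; a steeper slope means the
    two categories are more sharply separated along VOT.
    """
    mu, sd = mean(vots_ms), pstdev(vots_ms)
    xs = [(v - mu) / sd for v in vots_ms]  # standardise so a fixed step is safe
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, is_voiceless):
            err = y - _sigmoid(b0 + b1 * x)  # gradient of the log-likelihood
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b1 / sd  # convert back to the millisecond scale

labels = [0] * 7 + [1] * 7  # 0 = intended voiced, 1 = intended voiceless
# Hypothetical child A: categories mostly distinct, with a little overlap.
child_a = [5, 10, 12, 15, 18, 20, 45] + [30, 60, 65, 70, 75, 80, 90]
# Hypothetical child B: heavily overlapping categories.
child_b = [10, 20, 30, 40, 50, 60, 70] + [20, 30, 40, 50, 60, 70, 80]

slope_a = contrast_slope(child_a, labels)
slope_b = contrast_slope(child_b, labels)
print(slope_a, slope_b)
```

If a child's two categories do not overlap at all, the maximum-likelihood slope is unbounded, so a per-child fit like this one would need regularisation; pooling across children in a mixed model is one way to avoid that problem.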
Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome
Engineer, Crystal T.; Rahebi, Kimiya C.; Borland, Michael S.; Buell, Elizabeth P.; Centanni, Tracy M.; Fink, Melyssa K.; Im, Kwok W.; Wilson, Linda G.; Kilgard, Michael P.
2015-01-01
Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded more slowly, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial. PMID:26321676
Yoder, Paul J.; Molfese, Dennis; Murray, Micah M.; Key, Alexandra P. F.
2013-01-01
Typically developing (TD) preschoolers and age-matched preschoolers with specific language impairment (SLI) received event-related potentials (ERPs) to four monosyllabic speech sounds prior to treatment and, in the SLI group, after 6 months of grammatical treatment. Before treatment, the TD group processed speech sounds faster than the SLI group. The SLI group increased the speed of their speech processing after treatment. Post-treatment speed of speech processing predicted later impairment in comprehending phrase elaboration in the SLI group. During the treatment phase, change in speed of speech processing predicted growth rate of grammar in the SLI group. PMID:24219693
Phrase-level speech simulation with an airway modulation model of speech production
Story, Brad H.
2012-01-01
Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and the simulation of words and phrases is demonstrated. PMID:23503742
Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn
2018-01-29
To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and add to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties.
Implications for Rehabilitation: Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound disorders. Non-speech oral motor exercise use was most frequently reported in the treatment of dysarthria. Non-speech oral motor exercise use when targeting speech sound disorders is not widely endorsed in the literature.
Speech sound articulation abilities of preschool-age children who stutter.
Clark, Chagit E; Conture, Edward G; Walden, Tedra A; Lambert, Warren E
2013-12-01
The purpose of this study was to assess the association between speech sound articulation and childhood stuttering in a relatively large sample of preschool-age children who do and do not stutter, using the Goldman-Fristoe Test of Articulation-2 (GFTA-2; Goldman & Fristoe, 2000). Participants included 277 preschool-age children who do (CWS; n=128, 101 males) and do not stutter (CWNS; n=149, 76 males). Generalized estimating equations (GEE) were performed to assess between-group (CWS versus CWNS) differences on the GFTA-2. Additionally, within-group correlations were performed to explore the relation between CWS' speech sound articulation abilities and their stuttering frequency and severity, as well as their sound prolongation index (SPI; Schwartz & Conture, 1988). No significant differences were found between the articulation scores of preschool-age CWS and CWNS. However, there was a small gender effect for the 5-year-old age group, with girls generally exhibiting better articulation scores than boys. Additional findings indicated no relation between CWS' speech sound articulation abilities and their stuttering frequency, severity, or SPI. Findings suggest no apparent association between speech sound articulation, as measured by one standardized assessment (GFTA-2), and childhood stuttering for this sample of preschool-age children (N=277). After reading this article, the reader will be able to: (1) discuss salient issues in the articulation literature relative to children who stutter; (2) compare/contrast the present study's methodologies and main findings to those of previous studies that investigated the association between childhood stuttering and speech sound articulation; (3) identify future research needs relative to the association between childhood stuttering and speech sound development; (4) replicate the present study's methodology to expand this body of knowledge. Copyright © 2013 Elsevier Inc. All rights reserved.
Degraded speech sound processing in a rat model of fragile X syndrome
Engineer, Crystal T.; Centanni, Tracy M.; Im, Kwok W.; Rahebi, Kimiya C.; Buell, Elizabeth P.; Kilgard, Michael P.
2014-01-01
Fragile X syndrome is the most common inherited form of intellectual disability and the leading genetic cause of autism. Impaired phonological processing in fragile X syndrome interferes with the development of language skills. Although auditory cortex responses are known to be abnormal in fragile X syndrome, it is not clear how these differences impact speech sound processing. This study provides the first evidence that the cortical representation of speech sounds is impaired in Fmr1 knockout rats, despite normal speech discrimination behavior. Evoked potentials and spiking activity in response to speech sounds, noise burst trains, and tones were significantly degraded in primary auditory cortex, anterior auditory field and the ventral auditory field. Neurometric analysis of speech evoked activity using a pattern classifier confirmed that activity in these fields contains significantly less information about speech sound identity in Fmr1 knockout rats compared to control rats. Responses were normal in the posterior auditory field, which is associated with sound localization. The greatest impairment was observed in the ventral auditory field, which is related to emotional regulation. Dysfunction in the ventral auditory field may contribute to poor emotional regulation in fragile X syndrome and may help explain the observation that later auditory evoked responses are more disturbed in fragile X syndrome compared to earlier responses. Rodent models of fragile X syndrome are likely to prove useful for understanding the biological basis of fragile X syndrome and for testing candidate therapies. PMID:24713347
Children with Speech Sound Disorders at School: Challenges for Children, Parents and Teachers
ERIC Educational Resources Information Center
Daniel, Graham R.; McLeod, Sharynne
2017-01-01
Teachers play a major role in supporting children's educational, social, and emotional development, although they may be unprepared for supporting children with speech sound disorders. Interviews with 34 participants including six focus children, their parents, siblings, friends, teachers and other significant adults in their lives highlighted…
Gangji, Nazneen; Pascoe, Michelle; Smouse, Mantoa
2015-01-01
Swahili is widely spoken in East Africa, but to date there are no culturally and linguistically appropriate materials available for speech-language therapists working in the region. The challenges are further exacerbated by the limited research available on the typical acquisition of Swahili phonology. To describe the speech development of 24 typically developing first language Swahili-speaking children between the ages of 3;0 and 5;11 years in Dar es Salaam, Tanzania. A cross-sectional design was used with six groups of four children in 6-month age bands. Single-word speech samples were obtained from each child using a set of culturally appropriate pictures designed to elicit all consonants and vowels of Swahili. Each child's speech was audio-recorded and phonetically transcribed using International Phonetic Alphabet (IPA) conventions. Children's speech development is described in terms of (1) phonetic inventory, (2) syllable structure inventory, (3) phonological processes and (4) percentage consonants correct (PCC) and percentage vowels correct (PVC). Results suggest a gradual progression in the acquisition of speech sounds and syllables between the ages of 3;0 and 5;11 years. Vowel acquisition was completed and most of the consonants acquired by age 3;0. The fricatives /z, s, h/ were acquired later, at 4 years, and /θ/ and /r/ were the last consonants acquired, at age 5;11. Older children were able to produce speech sounds more accurately and had fewer phonological processes in their speech than younger children. Common phonological processes included lateralization and sound preference substitutions. The study contributes a preliminary set of normative data on speech development of Swahili-speaking children.
Findings are discussed in relation to theories of phonological development, and may be used as a basis for further normative studies with larger numbers of children and ultimately the development of a contextually relevant assessment of the phonology of Swahili-speaking children. © 2014 Royal College of Speech and Language Therapists.
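The percentage consonants correct (PCC) and percentage vowels correct (PVC) scores named in the abstract above are simple proportions, sketched below. This is an illustration only, under stated assumptions: the target and produced transcriptions are assumed to be already aligned segment by segment, the function name `percent_correct` is hypothetical, and the example word and its mispronunciation are invented rather than taken from the study's data.

```python
# The five Swahili vowel phonemes; every other segment is counted as a consonant.
SWAHILI_VOWELS = {"a", "e", "i", "o", "u"}

def percent_correct(aligned_pairs, vowels=SWAHILI_VOWELS):
    """Return (PCC, PVC) for a list of (target, produced) segment pairs.

    Assumes the two transcriptions are pre-aligned; an omitted segment
    can be represented with produced=None, which counts as incorrect.
    """
    cons_total = cons_ok = vow_total = vow_ok = 0
    for target, produced in aligned_pairs:
        if target in vowels:
            vow_total += 1
            vow_ok += int(produced == target)
        else:
            cons_total += 1
            cons_ok += int(produced == target)
    pcc = 100.0 * cons_ok / cons_total if cons_total else float("nan")
    pvc = 100.0 * vow_ok / vow_total if vow_total else float("nan")
    return pcc, pvc

# Hypothetical example: target "samaki" (fish) produced as "tamaki".
target = ["s", "a", "m", "a", "k", "i"]
produced = ["t", "a", "m", "a", "k", "i"]
pcc, pvc = percent_correct(list(zip(target, produced)))
print(f"PCC = {pcc:.1f}%, PVC = {pvc:.1f}%")
```

In practice the scores would be pooled over all words in a child's sample rather than computed for a single word.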
Corollary discharge provides the sensory content of inner speech.
Scott, Mark
2013-09-01
Inner speech is one of the most common, but least investigated, mental activities humans perform. It is an internal copy of one's external voice and so is similar to a well-established component of motor control: corollary discharge. Corollary discharge is a prediction of the sound of one's voice generated by the motor system. This prediction is normally used to filter self-caused sounds from perception, which segregates them from externally caused sounds and prevents the sensory confusion that would otherwise result. The similarity between inner speech and corollary discharge motivates the theory, tested here, that corollary discharge provides the sensory content of inner speech. The results reported here show that inner speech attenuates the impact of external sounds. This attenuation was measured using a context effect (an influence of contextual speech sounds on the perception of subsequent speech sounds), which weakens in the presence of speech imagery that matches the context sound. Results from a control experiment demonstrated this weakening in external speech as well. Such sensory attenuation is a hallmark of corollary discharge.
Left Lateralized Enhancement of Orofacial Somatosensory Processing Due to Speech Sounds
ERIC Educational Resources Information Center
Ito, Takayuki; Johns, Alexis R.; Ostry, David J.
2013-01-01
Purpose: Somatosensory information associated with speech articulatory movements affects the perception of speech sounds and vice versa, suggesting an intimate linkage between speech production and perception systems. However, it is unclear which cortical processes are involved in the interaction between speech sounds and orofacial somatosensory…
Speech training alters consonant and vowel responses in multiple auditory cortex fields
Engineer, Crystal T.; Rahebi, Kimiya C.; Buell, Elizabeth P.; Fink, Melyssa K.; Kilgard, Michael P.
2015-01-01
Speech sounds evoke unique neural activity patterns in primary auditory cortex (A1). Extensive speech sound discrimination training alters A1 responses. While the neighboring auditory cortical fields each contain information about speech sound identity, each field processes speech sounds differently. We hypothesized that while all fields would exhibit training-induced plasticity following speech training, there would be unique differences in how each field changes. In this study, rats were trained to discriminate speech sounds by consonant or vowel in quiet and in varying levels of background speech-shaped noise. Local field potential and multiunit responses were recorded from four auditory cortex fields in rats that had received 10 weeks of speech discrimination training. Our results reveal that training alters speech evoked responses in each of the auditory fields tested. The neural response to consonants was significantly stronger in anterior auditory field (AAF) and A1 following speech training. The neural response to vowels following speech training was significantly weaker in ventral auditory field (VAF) and posterior auditory field (PAF). This differential plasticity of consonant and vowel sound responses may result from the greater paired pulse depression, expanded low frequency tuning, reduced frequency selectivity, and lower tone thresholds, which occurred across the four auditory fields. These findings suggest that alterations in the distributed processing of behaviorally relevant sounds may contribute to robust speech discrimination. PMID:25827927
Visual Influences on Speech Perception in Children with Autism
ERIC Educational Resources Information Center
Iarocci, Grace; Rombough, Adrienne; Yager, Jodi; Weeks, Daniel J.; Chua, Romeo
2010-01-01
The bimodal perception of speech sounds was examined in children with autism as compared to mental age-matched typically developing (TD) children. A computer task was employed wherein only the mouth region of the face was displayed and children reported what they heard or saw when presented with consonant-vowel sounds in unimodal auditory…
Early Intervening for Students with Speech Sound Disorders: Lessons from a School District
ERIC Educational Resources Information Center
Mire, Stephen P.; Montgomery, Judy K.
2009-01-01
The concept of early intervening services was introduced into public school systems with the implementation of the Individuals With Disabilities Education Improvement Act (IDEA) of 2004. This article describes a program developed for students with speech sound disorders that incorporated concepts of early intervening services, response to…
ERIC Educational Resources Information Center
Froyen, Dries; Willems, Gonny; Blomert, Leo
2011-01-01
The phonological deficit theory of dyslexia assumes that degraded speech sound representations might hamper the acquisition of stable letter-speech sound associations necessary for learning to read. However, there is only scarce and mainly indirect evidence for this assumed letter-speech sound association problem. The present study aimed at…
Speech and Language Skills of Parents of Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry
2007-01-01
Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…
Discrimination of brief speech sounds is impaired in rats with auditory cortex lesions
Porter, Benjamin A.; Rosenthal, Tara R.; Ranasinghe, Kamalini G.; Kilgard, Michael P.
2011-01-01
Auditory cortex (AC) lesions impair complex sound discrimination. However, a recent study demonstrated spared performance on an acoustic startle response test of speech discrimination following AC lesions (Floody et al., 2010). The current study reports the effects of AC lesions on two operant speech discrimination tasks. AC lesions caused a modest and quickly recovered impairment in the ability of rats to discriminate consonant-vowel-consonant speech sounds. This result seems to suggest that AC does not play a role in speech discrimination. However, the speech sounds used in both studies differed in many acoustic dimensions and an adaptive change in discrimination strategy could allow the rats to use an acoustic difference that does not require an intact AC to discriminate. Based on our earlier observation that the first 40 ms of the spatiotemporal activity patterns elicited by speech sounds best correlate with behavioral discriminations of these sounds (Engineer et al., 2008), we predicted that eliminating additional cues by truncating speech sounds to the first 40 ms would render the stimuli indistinguishable to a rat with AC lesions. Although the initial discrimination of truncated sounds took longer to learn, the final performance paralleled that of rats using full-length consonant-vowel-consonant sounds. After 20 days of testing, half of the rats using speech onsets received bilateral AC lesions. Lesions severely impaired speech onset discrimination for at least one month post-lesion. These results support the hypothesis that auditory cortex is required to accurately discriminate the subtle differences between similar consonant and vowel sounds. PMID:21167211
Articulation of sounds in Serbian language in patients who learned esophageal speech successfully.
Vekić, Maja; Veselinović, Mila; Mumović, Gordana; Mitrović, Slobodan M
2014-01-01
Articulation of pronounced sounds during the training and subsequent use of esophageal speech is very important because it contributes significantly to intelligibility and aesthetics of spoken words and sentences, as well as of speech and language itself. The aim of this research was to determine the quality of articulation of sounds of Serbian language by groups of sounds in patients who had learned esophageal speech successfully as well as the effect of age and tooth loss on the quality of articulation. This retrospective-prospective study included 16 patients who had undergone total laryngectomy. Having completed the rehabilitation of speech, these patients used esophageal voice and speech. The quality of articulation was tested by the "Global test of articulation." Esophageal speech was rated with grade 5, 4 and 3 in 62.5%, 31.3% and one patient, respectively. Serbian was the native language of all the patients. The study included 30 sounds of Serbian language in 16 subjects (480 total sounds). Only two patients (12.5%) articulated all sounds properly, whereas 87.5% of them had incorrect articulation. The articulation of affricates and fricatives, especially sound /h/ from the group of the fricatives, was found to be the worst in the patients who had successfully mastered esophageal speech. The age and the tooth loss of patients who have mastered esophageal speech do not affect the articulation of sounds in Serbian language.
Temporal plasticity in auditory cortex improves neural discrimination of speech sounds
Engineer, Crystal T.; Shetake, Jai A.; Engineer, Navzer D.; Vrana, Will A.; Wolf, Jordan T.; Kilgard, Michael P.
2017-01-01
Background: Many individuals with language learning impairments exhibit temporal processing deficits and degraded neural responses to speech sounds. Auditory training can improve both the neural and behavioral deficits, though significant deficits remain. Recent evidence suggests that vagus nerve stimulation (VNS) paired with rehabilitative therapies enhances both cortical plasticity and recovery of normal function. Objective/Hypothesis: We predicted that pairing VNS with rapid tone trains would enhance the primary auditory cortex (A1) response to unpaired novel speech sounds. Methods: VNS was paired with tone trains 300 times per day for 20 days in adult rats. Responses to isolated speech sounds, compressed speech sounds, word sequences, and compressed word sequences were recorded in A1 following the completion of VNS-tone train pairing. Results: Pairing VNS with rapid tone trains resulted in stronger, faster, and more discriminable A1 responses to speech sounds presented at conversational rates. Conclusion: This study extends previous findings by documenting that VNS paired with rapid tone trains altered the neural response to novel unpaired speech sounds. Future studies are necessary to determine whether pairing VNS with appropriate auditory stimuli could potentially be used to improve both neural responses to speech sounds and speech perception in individuals with receptive language disorders. PMID:28131520
Hashizume, Hiroshi; Taki, Yasuyuki; Sassa, Yuko; Thyreau, Benjamin; Asano, Michiko; Asano, Kohei; Takeuchi, Hikaru; Nouchi, Rui; Kotozaki, Yuka; Jeong, Hyeonjeong; Sugiura, Motoaki; Kawashima, Ryuta
2014-08-01
Older children are more successful at producing unfamiliar, non-native speech sounds than younger children during the initial stages of learning. To reveal the neuronal underpinning of the age-related increase in the accuracy of non-native speech production, we examined the developmental changes in activation involved in the production of novel speech sounds using functional magnetic resonance imaging. Healthy right-handed children (aged 6-18 years) were scanned while performing an overt repetition task and a perceptual task involving aurally presented non-native and native syllables. Productions of non-native speech sounds were recorded and evaluated by native speakers. The mouth regions in the bilateral primary sensorimotor areas were activated more significantly during the repetition task relative to the perceptual task. The hemodynamic response in the left inferior frontal gyrus pars opercularis (IFG pOp) specific to non-native speech sound production (defined by prior hypothesis) increased with age. Additionally, the accuracy of non-native speech sound production increased with age. These results provide the first evidence of developmental changes in the neural processes underlying the production of novel speech sounds. Our data further suggest that the recruitment of the left IFG pOp during the production of novel speech sounds was possibly enhanced due to the maturation of the neuronal circuits needed for speech motor planning. This, in turn, would lead to improvement in the ability to immediately imitate non-native speech. Copyright © 2014 Wiley Periodicals, Inc.
Confusability of Consonant Phonemes in Sound Discrimination Tasks.
ERIC Educational Resources Information Center
Rudegeair, Robert E.
The findings of Marsh and Sherman's investigation, in 1970, of the speech sound discrimination ability of kindergarten subjects, are discussed in this paper. In the study a comparison was made between performance when speech sounds were presented in isolation and when speech sounds were presented in a word context, using minimal sound contrasts.…
ERIC Educational Resources Information Center
Velleman, Shelley L.; Pearson, Barbara Zurer
2010-01-01
B. Z. Pearson, S. L. Velleman, T. J. Bryant, and T. Charko (2009) demonstrated phonological differences in typically developing children learning African American English as their first dialect vs. General American English only. Extending this research to children with speech sound disorders (SSD) has key implications for intervention. A total of…
Lepistö, T; Silokallio, S; Nieminen-von Wendt, T; Alku, P; Näätänen, R; Kujala, T
2006-10-01
Language development is delayed and deviant in individuals with autism, but proceeds quite normally in those with Asperger syndrome (AS). We investigated auditory-discrimination and orienting in children with AS using an event-related potential (ERP) paradigm that was previously applied to children with autism. ERPs were measured to pitch, duration, and phonetic changes in vowels and to corresponding changes in non-speech sounds. Active sound discrimination was evaluated with a sound-identification task. The mismatch negativity (MMN), indexing sound-discrimination accuracy, showed right-hemisphere dominance in the AS group, but not in the controls. Furthermore, the children with AS had diminished MMN-amplitudes and decreased hit rates for duration changes. In contrast, their MMN to speech pitch changes was parietally enhanced. The P3a, reflecting involuntary orienting to changes, was diminished in the children with AS for speech pitch and phoneme changes, but not for the corresponding non-speech changes. The children with AS differ from controls with respect to their sound-discrimination and orienting abilities. The results of the children with AS are relatively similar to those earlier obtained from children with autism using the same paradigm, although these clinical groups differ markedly in their language development.
ERIC Educational Resources Information Center
LeBlanc, Judith M.
To gain some insight into the problem of deviant speech development in low income populations, this study investigated the environmental factors that encourage the development of normal speech. Two specific questions were examined in this study: (1) If specific vocalized environmental sounds are presented contiguously with reinforcement, will…
ERIC Educational Resources Information Center
Gildersleeve-Neumann, Christina E.; Kester, Ellen S.; Davis, Barbara L.; Pena, Elizabeth D.
2008-01-01
Purpose: English speech acquisition by typically developing 3- to 4-year-old children with monolingual English was compared to English speech acquisition by typically developing 3- to 4-year-old children with bilingual English-Spanish backgrounds. We predicted that exposure to Spanish would not affect the English phonetic inventory but would…
Sound stream segregation: a neuromorphic approach to solve the “cocktail party problem” in real-time
Thakur, Chetan Singh; Wang, Runchun M.; Afshar, Saeed; Hamilton, Tara J.; Tapson, Jonathan C.; Shamma, Shihab A.; van Schaik, André
2015-01-01
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the “cocktail party effect.” It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). 
This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and speech recognition. PMID:26388721
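The temporal-coherence idea in the abstract above (channels whose envelopes are strongly positively correlated with the attended stream are kept, the rest are masked out) can be sketched in a few lines. This is a software illustration under stated assumptions, not the authors' FPGA design: it skips the cochlea front end, band-pass filtering, and reconstruction stages, and the function names, the 0.5 correlation threshold, and the toy envelopes are all hypothetical. The `snr_db` helper shows the usual definition of signal-to-noise ratio in decibels, which is assumed here to match the figure of merit quoted in the abstract.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def coherence_mask(envelopes, attended_channel, threshold=0.5):
    """Keep channels whose envelope is coherent with the attended channel.

    envelopes: one envelope (list of samples over time) per frequency channel.
    attended_channel: index supplied by the attention signal (assumed known).
    Returns one boolean per channel; True means the channel joins the target stream.
    """
    anchor = envelopes[attended_channel]
    return [pearson(env, anchor) > threshold for env in envelopes]

def snr_db(reference, estimate):
    """SNR of an estimate against a clean reference, in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / noise)

# Toy envelopes: channels 0 and 1 share one modulation pattern (target source),
# channels 2 and 3 share a different, uncorrelated pattern (background source).
target_pattern = [1, 0, 1, 0, 1, 0, 1, 0]
other_pattern = [1, 1, 0, 0, 1, 1, 0, 0]
envelopes = [
    target_pattern,
    [0.5 * v + 0.1 for v in target_pattern],
    other_pattern,
    [2 * v for v in other_pattern],
]
mask = coherence_mask(envelopes, attended_channel=0)
print(mask)
```

Applying the mask to the channel outputs and summing the surviving channels would give a crude estimate of the target stream, whose quality could then be scored with `snr_db`.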
Thakur, Chetan Singh; Wang, Runchun M; Afshar, Saeed; Hamilton, Tara J; Tapson, Jonathan C; Shamma, Shihab A; van Schaik, André
2015-01-01
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). 
This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and speech recognition.
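The temporal-coherence rule described in this abstract (channels whose envelopes are strongly positively correlated with the attended channel are assigned to the same stream) can be illustrated with a minimal sketch. This is not the authors' FPGA implementation: there is no cochlea model or modulation filterbank here, and the function names, the `anchor` index standing in for the attention signal, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np

def coherence_mask(envelopes, anchor, threshold=0.5):
    """Mark channels whose envelopes are positively correlated with the anchor.

    envelopes: array of shape (n_channels, n_samples)
    anchor: index of the attended channel (hypothetical stand-in for attention)
    threshold: assumed cut-off on the Pearson correlation
    """
    env = envelopes - envelopes.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(env, axis=1)
    norms[norms == 0] = 1.0  # flat channels get correlation 0 instead of NaN
    env = env / norms[:, None]
    corr = env @ env[anchor]  # Pearson correlation of every channel with the anchor
    return corr > threshold

def snr_db(target, estimate):
    """Signal-to-noise ratio in dB of an estimate relative to the clean target."""
    noise = estimate - target
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(noise ** 2))

# Toy mixture: channels 0-1 share a 4 Hz modulation, channels 2-3 a 7 Hz modulation.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
mod_a = 1.0 + np.sin(2 * np.pi * 4 * t)
mod_b = 1.0 + np.sin(2 * np.pi * 7 * t)
envelopes = np.stack([mod_a, 0.5 * mod_a, mod_b, 2.0 * mod_b])
mask = coherence_mask(envelopes, anchor=0)
```

In this toy case the mask keeps only the channels that share the anchor's modulation; a real system would apply such a mask to time-frequency features before reconstructing the target sound.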
Park, H K; Bradley, J S
2009-09-01
Subjective ratings of the audibility, annoyance, and loudness of music and speech sounds transmitted through 20 different simulated walls were used to identify better single-number ratings of airborne sound insulation. The first part of this research considered standard measures such as the sound transmission class, the weighted sound reduction index (R(w)), and variations of these measures [H. K. Park and J. S. Bradley, J. Acoust. Soc. Am. 126, 208-219 (2009)]. This paper considers a number of other measures including signal-to-noise ratios related to the intelligibility of speech and measures related to the loudness of sounds. An exploration of the importance of the included frequencies showed that the optimum ranges of included frequencies were different for speech and music sounds. Measures related to speech intelligibility were useful indicators of responses to speech sounds but were not as successful for music sounds. A-weighted level differences, signal-to-noise ratios and an A-weighted sound transmission loss measure were good predictors of responses when the included frequencies were optimized for each type of sound. The addition of new spectrum adaptation terms to R(w) values was found to be the most practical approach for achieving more accurate predictions of subjective ratings of transmitted speech and music sounds.
Stekelenburg, Jeroen J; Keetels, Mirjam; Vroomen, Jean
2018-05-01
Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect. Here, we examined whether written text, like visual speech, can induce an illusory change in the perception of speech sounds on both the behavioural and neural levels. In a sound categorization task, we found that both text and visual speech changed the identity of speech sounds from an /aba/-/ada/ continuum, but the size of this audiovisual effect was considerably smaller for text than visual speech. To examine at which level in the information processing hierarchy these multisensory interactions occur, we recorded electroencephalography in an audiovisual mismatch negativity (MMN, a component of the event-related potential reflecting preattentive auditory change detection) paradigm in which deviant text or visual speech was used to induce an illusory change in a sequence of ambiguous sounds halfway between /aba/ and /ada/. We found that only deviant visual speech induced an MMN, but not deviant text, which induced a late P3-like positive potential. These results demonstrate that text has much weaker effects on sound processing than visual speech does, possibly because text has different biological roots than visual speech. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Anderson, Carolyn; Cohen, Wendy
2012-01-01
Children's speech sound development is assessed by comparing speech production with the typical development of speech sounds based on a child's age and developmental profile. One widely used method of sampling is to elicit a single-word sample along with connected speech. Words produced spontaneously rather than imitated may give a more accurate indication of a child's speech development. A published word complexity measure can be used to score later-developing speech sounds and more complex word patterns. There is a need for a screening word list that is quick to administer and reliably differentiates children with typically developing speech from children with patterns of delayed/disordered speech. To identify a short word list based on word complexity that could be spontaneously named by most typically developing children aged 3;00-5;05 years. One hundred and five children aged between 3;00 and 5;05 years from three local authority nursery schools took part in the study. Items from a published speech assessment were modified and extended to include a range of phonemic targets in different word positions in 78 monosyllabic and polysyllabic words. The 78 words were ranked both by phonemic/phonetic complexity as measured by word complexity and by ease of spontaneous production. The ten most complex words (hereafter Triage 10) were named spontaneously by more than 90% of the children. There was no significant difference between the complexity measures for five identified age groups when the data were examined in 6-month groups. A qualitative analysis revealed eight children with profiles of phonological delay or disorder. When these children were considered separately, there was a statistically significant difference (p < 0.005) between the mean word complexity measure of the group compared with the mean for the remaining children in all other age groups. 
The Triage 10 words reliably differentiated children with typically developing speech from those with delayed or disordered speech patterns. The Triage 10 words can be used as a screening tool for triage and general assessment and have the potential to monitor progress during intervention. Further testing is being undertaken to establish reliability with children referred to speech and language therapy services. © 2012 Royal College of Speech and Language Therapists.
A longitudinal study of the bilateral benefit in children with bilateral cochlear implants.
Asp, Filip; Mäki-Torkko, Elina; Karltorp, Eva; Harder, Henrik; Hergils, Leif; Eskilsson, Gunnar; Stenfelt, Stefan
2015-02-01
To study the development of the bilateral benefit in children using bilateral cochlear implants by measurements of speech recognition and sound localization. Bilateral and unilateral speech recognition in quiet and in multi-source noise, and horizontal sound localization, were measured on three occasions during a two-year period, without controlling for age or implant experience. Longitudinal and cross-sectional analyses were performed. Results were compared to cross-sectional data from children with normal hearing. Seventy-eight children aged 5.1-11.9 years, with a mean bilateral cochlear implant experience of 3.3 years and a mean age of 7.8 years at inclusion in the study, participated. Thirty children with normal hearing aged 4.8-9.0 years provided normative data. For children with cochlear implants, bilateral and unilateral speech recognition in quiet was comparable whereas a bilateral benefit for speech recognition in noise and sound localization was found at all three test occasions. Absolute performance was lower than in children with normal hearing. Early bilateral implantation facilitated sound localization. A bilateral benefit for speech recognition in noise and sound localization continues to exist over time for children with bilateral cochlear implants, but no relative improvement is found after three years of bilateral cochlear implant experience.
ERIC Educational Resources Information Center
McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline
2013-01-01
Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context.…
ERIC Educational Resources Information Center
Wren, Yvonne; Harding, Sam; Goldbart, Juliet; Roulstone, Sue
2018-01-01
Background: Multiple interventions have been developed to address speech sound disorder (SSD) in children. Many of these have been evaluated but the evidence for these has not been considered within a model which categorizes types of intervention. The opportunity to carry out a systematic review of interventions for SSD arose as part of a larger…
Speech training alters tone frequency tuning in rat primary auditory cortex
Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Kilgard, Michael P.
2013-01-01
Previous studies in both humans and animals have documented improved performance following discrimination training. This enhanced performance is often associated with cortical response changes. In this study, we tested the hypothesis that long-term speech training on multiple tasks can improve primary auditory cortex (A1) responses compared to rats trained on a single speech discrimination task or experimentally naïve rats. Specifically, we compared the percent of A1 responding to trained sounds, the responses to both trained and untrained sounds, receptive field properties of A1 neurons, and the neural discrimination of pairs of speech sounds in speech trained and naïve rats. Speech training led to accurate discrimination of consonant and vowel sounds, but did not enhance A1 response strength or the neural discrimination of these sounds. Speech training altered tone responses in rats trained on six speech discrimination tasks but not in rats trained on a single speech discrimination task. Extensive speech training resulted in broader frequency tuning, shorter onset latencies, a decreased driven response to tones, and caused a shift in the frequency map to favor tones in the range where speech sounds are the loudest. Both the number of trained tasks and the number of days of training strongly predict the percent of A1 responding to a low frequency tone. Rats trained on a single speech discrimination task performed less accurately than rats trained on multiple tasks and did not exhibit A1 response changes. Our results indicate that extensive speech training can reorganize the A1 frequency map, which may have downstream consequences on speech sound processing. PMID:24344364
Perception of environmental sounds by experienced cochlear implant patients.
Shafiro, Valeriy; Gygi, Brian; Cheng, Min-Yu; Vachhani, Jay; Mulvey, Megan
2011-01-01
Environmental sound perception serves an important ecological function by providing listeners with information about objects and events in their immediate environment. Environmental sounds such as car horns, baby cries, or chirping birds can alert listeners to imminent dangers as well as contribute to one's sense of awareness and well being. Perception of environmental sounds as acoustically and semantically complex stimuli may also involve some factors common to the processing of speech. However, very limited research has investigated the abilities of cochlear implant (CI) patients to identify common environmental sounds, despite patients' general enthusiasm about them. This project (1) investigated the ability of patients with modern-day CIs to perceive environmental sounds, (2) explored associations among speech, environmental sounds, and basic auditory abilities, and (3) examined acoustic factors that might be involved in environmental sound perception. Seventeen experienced postlingually deafened CI patients participated in the study. Environmental sound perception was assessed with a large-item test composed of 40 sound sources, each represented by four different tokens. The relationship between speech and environmental sound perception and the role of working memory and some basic auditory abilities were examined based on patient performance on a battery of speech tests (HINT, CNC, and individual consonant and vowel tests), tests of basic auditory abilities (audiometric thresholds, gap detection, temporal pattern, and temporal order for tones tests), and a backward digit recall test. The results indicated substantially reduced ability to identify common environmental sounds in CI patients (45.3%). Except for vowels, all speech test scores significantly correlated with the environmental sound test scores: r = 0.73 for HINT in quiet, r = 0.69 for HINT in noise, r = 0.70 for CNC, r = 0.64 for consonants, and r = 0.48 for vowels. 
HINT and CNC scores in quiet moderately correlated with the temporal order for tones. However, the correlation between speech and environmental sounds changed little after partialling out the variance due to other variables. Present findings indicate that environmental sound identification is difficult for CI patients. They further suggest that speech and environmental sounds may overlap considerably in their perceptual processing. Certain spectrotemporal processing abilities are separately associated with speech and environmental sound performance. However, they do not appear to mediate the relationship between speech and environmental sounds in CI patients. Environmental sound rehabilitation may be beneficial to some patients. Environmental sound testing may have potential diagnostic applications, especially with difficult-to-test populations and might also be predictive of speech performance for prelingually deafened patients with cochlear implants.
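"Partialling out" the variance due to other variables, as mentioned in this abstract, is commonly done by correlating the residuals left after regressing each score on the covariates. The sketch below shows that general technique on synthetic data; it is not the authors' analysis, and the variable names and simulated values are assumptions chosen only for illustration.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Pearson correlation of x and y after regressing out the covariates.

    covariates: array of shape (n,) or (n, k); an intercept is added here.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of ordinary least-squares fits of x and y on the covariates.
    res_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    res_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])

# Synthetic example in which a single covariate drives both scores.
rng = np.random.default_rng(0)
memory = rng.normal(size=200)                          # hypothetical covariate
speech = memory + 0.1 * rng.normal(size=200)           # hypothetical speech score
environmental = memory + 0.1 * rng.normal(size=200)    # hypothetical sound score
raw_r = float(np.corrcoef(speech, environmental)[0, 1])
partial_r = partial_corr(speech, environmental, memory)
```

In this synthetic case the raw correlation is high and the partial correlation collapses toward zero, because the covariate explains the shared variance. The abstract reports the opposite pattern: the speech and environmental sound correlation changed little after partialling, which is why the authors conclude the other abilities do not mediate it.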
Language and Speech Improvement for Kindergarten and First Grade. A Supplementary Handbook.
ERIC Educational Resources Information Center
Cole, Roberta; And Others
The 16-unit language and speech improvement handbook for kindergarten and first grade students contains an introductory section which includes a discussion of the child's developmental speech and language characteristics, a sound development chart, a speech and hearing language screening test, the Henja articulation test, and a general outline of…
Speech versus non-speech as irrelevant sound: controlling acoustic variation.
Little, Jason S; Martin, Frances Heritage; Thomson, Richard H S
2010-09-01
Functional differences between speech and non-speech within the irrelevant sound effect were investigated using repeated and changing formats of irrelevant sounds in the form of intelligible words and unintelligible signal correlated noise (SCN) versions of the words. Event-related potentials were recorded from 25 females aged between 18 and 25 while they completed a serial order recall task in the presence of irrelevant sound or silence. As expected and in line with the changing-state hypothesis both words and SCN produced robust changing-state effects. However, words produced a greater changing-state effect than SCN indicating that the spectral detail inherent within speech accounts for the greater irrelevant sound effect and changing-state effect typically observed with speech. ERP data in the form of N1 amplitude was modulated within some irrelevant sound conditions suggesting that attentional aspects are involved in the elicitation of the irrelevant sound effect. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Perceptual sensitivity to spectral properties of earlier sounds during speech categorization.
Stilp, Christian E; Assgari, Ashley A
2018-02-28
Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception. Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound.
Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur frequently in everyday speech perception.
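To give a sense of scale for the +1 to +4 dB amplification levels in this abstract, the standard decibel definitions convert a level change into linear power and amplitude ratios. The helper names below are hypothetical; only the formulas are standard.

```python
def db_to_power_ratio(db):
    """Linear power ratio for a level change in dB: 10^(dB/10)."""
    return 10.0 ** (db / 10.0)

def db_to_amplitude_ratio(db):
    """Linear amplitude ratio for a level change in dB: 10^(dB/20)."""
    return 10.0 ** (db / 20.0)

# Ratios for the amplification levels named in the abstract.
ratios = {db: (db_to_power_ratio(db), db_to_amplitude_ratio(db)) for db in (1, 2, 3, 4)}
```

The +3 dB level at which effects first appeared corresponds to roughly a doubling of power, or an amplitude gain of about 1.41, in the amplified frequency regions.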
Speech endpoint detection with non-language speech sounds for generic speech processing applications
NASA Astrophysics Data System (ADS)
McClain, Matthew; Romanowski, Brian
2009-05-01
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise
2012-01-01
Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost four years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 and followed up at 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors were used to predict later speech sound production, PA, and literacy outcomes. Results: Group averages revealed below-average school-age articulation scores and low-average PA, but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom more than 10% of their speech sound errors were atypical had lower PA and literacy scores at school-age than children who produced fewer than 10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores. Conclusions: Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschool may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschool distortions may be resistant to change over time, leading to persisting speech sound production problems. PMID:23184137
Preston, Jonathan L; Hull, Margaret; Edwards, Mary Louise
2013-05-01
To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up at age 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors was used to predict later speech sound production, PA, and literacy outcomes. Group averages revealed below-average school-age articulation scores and low-average PA but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom >10% of their speech sound errors were atypical had lower PA and literacy scores at school age than children who produced <10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores than preschoolers who produced fewer distortion errors. Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschoolers may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschoolers' distortions may be resistant to change over time, leading to persisting speech sound production problems.
Statistical properties of Chinese phonemic networks
NASA Astrophysics Data System (ADS)
Yu, Shuiyuan; Liu, Haitao; Xu, Chunshan
2011-04-01
The study of properties of speech sound systems is of great significance in understanding the human cognitive mechanism and the working principles of speech sound systems. Some properties of speech sound systems, such as the listener-oriented feature and the talker-oriented feature, have been unveiled with the statistical study of phonemes in human languages and the research of the interrelations between human articulatory gestures and the corresponding acoustic parameters. With all the phonemes of speech sound systems treated as a coherent whole, our research, which focuses on the dynamic properties of speech sound systems in operation, investigates some statistical parameters of Chinese phoneme networks based on real text and dictionaries. The findings are as follows: phonemic networks have high connectivity degrees and short average distances; the degrees obey normal distribution and the weighted degrees obey power law distribution; vowels enjoy higher priority than consonants in the actual operation of speech sound systems; the phonemic networks have high robustness against targeted attacks and random errors. In addition, for investigating the structural properties of a speech sound system, a statistical study of dictionaries is conducted, which shows the higher frequency of shorter words and syllables and the tendency that the longer a word is, the shorter the syllables composing it are. From these structural properties and dynamic properties one can derive the following conclusion: the static structure of a speech sound system tends to promote communication efficiency and save articulation effort while the dynamic operation of this system gives preference to reliable transmission and easy recognition. In short, a speech sound system is an effective, efficient and reliable communication system optimized in many aspects.
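The degree and weighted-degree statistics mentioned in this abstract can be illustrated on a toy network. The abstract does not say how the phonemic networks were constructed, so the sketch below assumes one plausible rule (an undirected edge between phonemes that occur next to each other, weighted by how often they do); the function names and the toy transcriptions are hypothetical.

```python
from collections import defaultdict

def build_phoneme_network(words):
    """Count undirected adjacency edges between phonemes.

    words: iterable of phoneme sequences (lists of strings).
    Returns a dict mapping frozenset({a, b}) to the co-occurrence count.
    """
    weights = defaultdict(int)
    for phonemes in words:
        for a, b in zip(phonemes, phonemes[1:]):
            if a != b:  # skip self-loops for simplicity
                weights[frozenset((a, b))] += 1
    return dict(weights)

def degrees(weights):
    """Return (degree, weighted degree) per phoneme."""
    degree = defaultdict(int)
    strength = defaultdict(int)
    for edge, w in weights.items():
        for node in edge:
            degree[node] += 1     # number of distinct neighbours
            strength[node] += w   # total edge weight at this node
    return dict(degree), dict(strength)

# Toy transcriptions, invented for illustration only.
toy_words = [["m", "a"], ["n", "a"], ["m", "a", "o"], ["h", "a", "o"]]
network = build_phoneme_network(toy_words)
degree, strength = degrees(network)
```

Here the vowel "a" ends up with the highest degree and weighted degree, which loosely mirrors the abstract's observation that vowels take priority over consonants, although a toy example of this size proves nothing about real Chinese data.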
Centanni, Tracy M.; Chen, Fuyi; Booker, Anne M.; Engineer, Crystal T.; Sloan, Andrew M.; Rennaker, Robert L.; LoTurco, Joseph J.; Kilgard, Michael P.
2014-01-01
In utero RNAi of the dyslexia-associated gene Kiaa0319 in rats (KIA-) degrades cortical responses to speech sounds and increases trial-by-trial variability in onset latency. We tested the hypothesis that KIA- rats would be impaired at speech sound discrimination. KIA- rats needed twice as much training in quiet conditions to perform at control levels and remained impaired at several speech tasks. Focused training using truncated speech sounds was able to normalize speech discrimination in quiet and background noise conditions. Training also normalized trial-by-trial neural variability and temporal phase locking. Cortical activity from speech trained KIA- rats was sufficient to accurately discriminate between similar consonant sounds. These results provide the first direct evidence that assumed reduced expression of the dyslexia-associated gene KIAA0319 can cause phoneme processing impairments similar to those seen in dyslexia and that intensive behavioral therapy can eliminate these impairments. PMID:24871331
Expertise with artificial non-speech sounds recruits speech-sensitive cortical regions
Leech, Robert; Holt, Lori L.; Devlin, Joseph T.; Dick, Frederic
2009-01-01
Regions of the human temporal lobe show greater activation for speech than for other sounds. These differences may reflect intrinsically specialized domain-specific adaptations for processing speech, or they may be driven by the significant expertise we have in listening to the speech signal. To test the expertise hypothesis, we used a video-game-based paradigm that tacitly trained listeners to categorize acoustically complex, artificial non-linguistic sounds. Before and after training, we used functional MRI to measure how expertise with these sounds modulated temporal lobe activation. Participants’ ability to explicitly categorize the non-speech sounds predicted the change in pre- to post-training activation in speech-sensitive regions of the left posterior superior temporal sulcus, suggesting that emergent auditory expertise may help drive this functional regionalization. Thus, seemingly domain-specific patterns of neural activation in higher cortical regions may be driven in part by experience-based restructuring of high-dimensional perceptual space. PMID:19386919
Brainstem transcription of speech is disrupted in children with autism spectrum disorders
Russo, Nicole; Nicol, Trent; Trommer, Barbara; Zecker, Steve; Kraus, Nina
2009-01-01
Language impairment is a hallmark of autism spectrum disorders (ASD). The origin of the deficit is poorly understood although deficiencies in auditory processing have been detected in both perception and cortical encoding of speech sounds. Little is known about the processing and transcription of speech sounds at earlier (brainstem) levels or about how background noise may impact this transcription process. Unlike cortical encoding of sounds, brainstem representation preserves stimulus features with a degree of fidelity that enables a direct link between acoustic components of the speech syllable (e.g., onsets) to specific aspects of neural encoding (e.g., waves V and A). We measured brainstem responses to the syllable /da/, in quiet and background noise, in children with and without ASD. Children with ASD exhibited deficits in both the neural synchrony (timing) and phase locking (frequency encoding) of speech sounds, despite normal click-evoked brainstem responses. They also exhibited reduced magnitude and fidelity of speech-evoked responses and inordinate degradation of responses by background noise in comparison to typically developing controls. Neural synchrony in noise was significantly related to measures of core and receptive language ability. These data support the idea that abnormalities in the brainstem processing of speech contribute to the language impairment in ASD. Because it is both passively-elicited and malleable, the speech-evoked brainstem response may serve as a clinical tool to assess auditory processing as well as the effects of auditory training in the ASD population. PMID:19635083
Effect of gap detection threshold on consistency of speech in children with speech sound disorder.
Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz
2017-02-01
The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups: typical speech, consistent speech disorder (CSD) and inconsistent speech disorder (ISD). The phonetic gap detection threshold test was used for this study, which is a valid test comprising six syllables with inter-stimulus intervals between 20 and 300 ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, which is caused by a high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
Tervaniemi, M; Kruck, S; De Baene, W; Schröger, E; Alter, K; Friederici, A D
2009-10-01
By recording auditory electrical brain potentials, we investigated whether the basic sound parameters (frequency, duration and intensity) are differentially encoded among speech vs. music sounds by musicians and non-musicians during different attentional demands. To this end, a pseudoword and an instrumental sound of comparable frequency and duration were presented. The accuracy of neural discrimination was tested by manipulations of frequency, duration and intensity. Additionally, the subjects' attentional focus was manipulated by instructions to ignore the sounds while watching a silent movie or to attentively discriminate the different sounds. In both musicians and non-musicians, the pre-attentively evoked mismatch negativity (MMN) component was larger to slight changes in music than in speech sounds. The MMN was also larger to intensity changes in music sounds and to duration changes in speech sounds. During attentional listening, all subjects more readily discriminated changes among speech sounds than among music sounds as indexed by the N2b response strength. Furthermore, during attentional listening, musicians displayed larger MMN and N2b than non-musicians for both music and speech sounds. Taken together, the data indicate that the discriminative abilities in human audition differ between music and speech sounds as a function of the sound-change context and the subjective familiarity of the sound parameters. These findings provide clear evidence for top-down modulatory effects in audition. In other words, the processing of sounds is realized by a dynamically adapting network considering type of sound, expertise and attentional demands, rather than by a strictly modularly organized stimulus-driven system.
Effects of Familiarity and Feeding on Newborn Speech-Voice Recognition
ERIC Educational Resources Information Center
Valiante, A. Grace; Barr, Ronald G.; Zelazo, Philip R.; Brant, Rollin; Young, Simon N.
2013-01-01
Newborn infants preferentially orient to familiar over unfamiliar speech sounds. They are also better at remembering unfamiliar speech sounds for short periods of time if learning and retention occur after a feed than before. It is unknown whether short-term memory for speech is enhanced when the sound is familiar (versus unfamiliar) and, if so,…
Sheft, Stanley; Gygi, Brian; Ho, Kim Thien N.
2012-01-01
Perceptual training with spectrally degraded environmental sounds results in improved environmental sound identification, with benefits shown to extend to untrained speech perception as well. The present study extended those findings to examine longer-term training effects as well as effects of mere repeated exposure to sounds over time. Participants received two pretests (1 week apart) prior to a week-long environmental sound training regimen, which was followed by two posttest sessions, separated by another week without training. Spectrally degraded stimuli, processed with a four-channel vocoder, consisted of a 160-item environmental sound test, word and sentence tests, and a battery of basic auditory abilities and cognitive tests. Results indicated significant improvements in all speech and environmental sound scores between the initial pretest and the last posttest with performance increments following both exposure and training. For environmental sounds (the stimulus class that was trained), the magnitude of positive change that accompanied training was much greater than that due to exposure alone, with improvement for untrained sounds roughly comparable to the speech benefit from exposure. Additional tests of auditory and cognitive abilities showed that speech and environmental sound performance were differentially correlated with tests of spectral and temporal-fine-structure processing, whereas working memory and executive function were correlated with speech, but not environmental sound perception. These findings indicate generalizability of environmental sound training and provide a basis for implementing environmental sound training programs for cochlear implant (CI) patients. PMID:22891070
Assessing Auditory Discrimination Skill of Malay Children Using Computer-based Method.
Ting, H; Yunus, J; Mohd Nordin, M Z
2005-01-01
The purpose of this paper is to investigate the auditory discrimination skill of Malay children using a computer-based method. Currently, most auditory discrimination assessments are conducted manually by speech-language pathologists. These conventional tests are general tests of sound discrimination, which do not reflect the client's specific speech sound errors. Thus, we propose a computer-based Malay auditory discrimination test to automate the whole assessment process and to customize the test according to the client's specific speech sound errors. The ability to discriminate voiced and unvoiced Malay speech sounds was studied in Malay children aged between 7 and 10 years. The study showed no major difficulty for the children in discriminating the Malay speech sounds, except in differentiating the /g/-/k/ sounds. On average, the 7-year-old children failed to discriminate the /g/-/k/ sounds.
Perceptual statistical learning over one week in child speech production.
Richtsmeier, Peter T; Goffman, Lisa
2017-07-01
What cognitive mechanisms account for the trajectory of speech sound development, in particular, gradually increasing accuracy during childhood? An intriguing potential contributor is statistical learning, a type of learning that has been studied frequently in infant perception but less often in child speech production. To assess the relevance of statistical learning to developing speech accuracy, we carried out a statistical learning experiment with four- and five-year-olds in which statistical learning was examined over one week. Children were familiarized with and tested on word-medial consonant sequences in novel words. There was only modest evidence for statistical learning, primarily in the first few productions of the first session. This initial learning effect nevertheless aligns with previous statistical learning research. Furthermore, the overall learning effect was similar to an estimate of weekly accuracy growth based on normative studies. The results implicate other important factors in speech sound development, particularly learning via production. Copyright © 2017 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Wellman, Rachel L.; Lewis, Barbara A.; Freebairn, Lisa A.; Avrich, Allison A.; Hansen, Amy J.; Stein, Catherine M.
2011-01-01
Purpose: The main purpose of this study was to examine how children with isolated speech sound disorders (SSDs; n = 20), children with combined SSDs and language impairment (LI; n = 20), and typically developing children (n = 20), ages 3;3 (years;months) to 6;6, differ in narrative ability. The second purpose was to determine if early narrative…
ERIC Educational Resources Information Center
Pivik, R. T.; Andres, Aline; Badger, Thomas M.
2011-01-01
Early post-natal nutrition influences later development, but there are no studies comparing brain function in healthy infants as a function of dietary intake even though the major infant diets differ significantly in nutrient composition. We studied brain responses (event-related potentials; ERPs) to speech sounds for infants who were fed either…
Hearing Evaluation in Children (For Parents)
... be used to test hearing, depending on a child's age, development, and health status. During behavioral tests, an audiologist carefully watches a child respond to sounds like calibrated speech (speech that ...
Phonological Awareness and Types of Sound Errors in Preschoolers with Speech Sound Disorders
ERIC Educational Resources Information Center
Preston, Jonathan; Edwards, Mary Louise
2010-01-01
Purpose: Some children with speech sound disorders (SSD) have difficulty with literacy-related skills, particularly phonological awareness (PA). This study investigates the PA skills of preschoolers with SSD by using a regression model to evaluate the degree to which PA can be concurrently predicted by types of speech sound errors. Method:…
Georgoulas, George; Georgopoulos, Voula C; Stylios, Chrysostomos D
2006-01-01
This paper proposes a novel integrated methodology to extract features and classify speech sounds with intent to detect the possible existence of a speech articulation disorder in a speaker. Articulation, in effect, is the specific and characteristic way that an individual produces the speech sounds. A methodology to process the speech signal, extract features and finally classify the signal and detect articulation problems in a speaker is presented. The use of support vector machines (SVMs), for the classification of speech sounds and detection of articulation disorders is introduced. The proposed method is implemented on a data set where different sets of features and different schemes of SVMs are tested leading to satisfactory performance.
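As a rough illustration of the classification step described in the abstract above, here is a minimal sketch of a linear support vector machine trained by subgradient descent on the hinge loss. The abstract does not state which features, kernels, or SVM schemes were used, so the two-dimensional toy points, the class labels (+1 and -1), the learning rate, and the regularisation strength are all assumptions made for illustration only.

```python
def train_linear_svm(points, labels, lr=0.1, reg=0.01, epochs=100):
    """Fit weights w and bias b by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Points inside the margin (or misclassified) push the boundary away.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy features (not data from the paper).
toy_points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
toy_labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(toy_points, toy_labels)
```

In practice a library implementation with a kernel would normally replace this hand-written loop; the sketch only shows the margin-based decision rule that makes an SVM a classifier.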
ERIC Educational Resources Information Center
Preston, Jonathan L.; Edwards, Mary Louise
2009-01-01
Children with residual speech sound errors are often underserved clinically, yet there has been a lack of recent research elucidating the specific deficits in this population. Adolescents aged 10-14 with residual speech sound errors (RE) that included rhotics were compared to normally speaking peers on tasks assessing speed and accuracy of speech…
ERIC Educational Resources Information Center
Macrae, Toby; Tyler, Ann A.
2014-01-01
Purpose: The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. Method: In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different…
Processing of speech and non-speech stimuli in children with specific language impairment
NASA Astrophysics Data System (ADS)
Basu, Madhavi L.; Surprenant, Aimee M.
2003-10-01
Specific Language Impairment (SLI) is a developmental language disorder in which children demonstrate varying degrees of difficulties in acquiring a spoken language. One possible underlying cause is that children with SLI have deficits in processing sounds that are of short duration or when they are presented rapidly. Studies so far have compared their performance on speech and nonspeech sounds of unequal complexity. Hence, it is still unclear whether the deficit is specific to the perception of speech sounds or whether it more generally affects the auditory function. The current study aims to answer this question by comparing the performance of children with SLI on speech and nonspeech sounds synthesized from sine-wave stimuli. The children will be tested using the classic categorical perception paradigm that includes both the identification and discrimination of stimuli along a continuum. If there is a deficit in the performance on both speech and nonspeech tasks, it will show that these children have a deficit in processing complex sounds. Poor performance on only the speech sounds will indicate that the deficit is more related to language. The findings will offer insights into the exact nature of the speech perception deficits in children with SLI. [Work supported by ASHF.]
Cognitive Bias for Learning Speech Sounds From a Continuous Signal Space Seems Nonlinguistic.
van der Ham, Sabine; de Boer, Bart
2015-10-01
When learning language, humans have a tendency to produce more extreme distributions of speech sounds than those observed most frequently: In rapid, casual speech, vowel sounds are centralized, yet cross-linguistically, peripheral vowels occur almost universally. We investigate whether adults' generalization behavior reveals selective pressure for communication when they learn skewed distributions of speech-like sounds from a continuous signal space. The domain-specific hypothesis predicts that the emergence of sound categories is driven by a cognitive bias to make these categories maximally distinct, resulting in more skewed distributions in participants' reproductions. However, our participants showed more centered distributions, which goes against this hypothesis, indicating that there are no strong innate linguistic biases that affect learning these speech-like sounds. The centralization behavior can be explained by a lack of communicative pressure to maintain categories. PMID:27648212
ERIC Educational Resources Information Center
Noguchi, Masaki; Hudson Kam, Carla L.
2018-01-01
In human languages, different speech sounds can be contextual variants of a single phoneme, called allophones. Learning which sounds are allophones is an integral part of the acquisition of phonemes. Whether given sounds are separate phonemes or allophones in a listener's language affects speech perception. Listeners tend to be less sensitive to…
ERIC Educational Resources Information Center
Yeni-Komshian, Grace; And Others
This study was designed to compare children and adults on their initial ability to identify and reproduce novel speech sounds and to evaluate their performance after receiving several training sessions in producing these sounds. The novel speech sounds used were two voiceless fricatives which are consonant phonemes in Arabic but which are…
Subtyping Children with Speech Sound Disorders by Endophenotypes
ERIC Educational Resources Information Center
Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.
2011-01-01
Purpose: The present study examined associations of 5 endophenotypes (i.e., measurable skills that are closely associated with speech sound disorders and are useful in detecting genetic influences on speech sound production), oral motor skills, phonological memory, phonological awareness, vocabulary, and speeded naming, with 3 clinical criteria…
Speech perception in individuals with auditory dys-synchrony.
Kumar, U A; Jayaram, M
2011-03-01
This study aimed to evaluate the effect of lengthening the transition duration of selected speech segments upon the perception of those segments in individuals with auditory dys-synchrony. Thirty individuals with auditory dys-synchrony participated in the study, along with 30 age-matched normal hearing listeners. Eight consonant-vowel syllables were used as auditory stimuli. Two experiments were conducted. Experiment one measured the 'just noticeable difference' time: the smallest prolongation of the speech sound transition duration which was noticeable by the subject. In experiment two, speech sounds were modified by lengthening the transition duration by multiples of the just noticeable difference time, and subjects' speech identification scores for the modified speech sounds were assessed. Subjects with auditory dys-synchrony demonstrated poor processing of temporal auditory information. Lengthening of speech sound transition duration improved these subjects' perception of both the placement and voicing features of the speech syllables used. These results suggest that innovative speech processing strategies which enhance temporal cues may benefit individuals with auditory dys-synchrony.
D'Souza, Dean; D'Souza, Hana; Johnson, Mark H; Karmiloff-Smith, Annette
2016-08-01
Typically-developing (TD) infants can construct unified cross-modal percepts, such as a speaking face, by integrating auditory-visual (AV) information. This skill is a key building block upon which higher-level skills, such as word learning, are built. Because word learning is seriously delayed in most children with neurodevelopmental disorders, we assessed the hypothesis that this delay partly results from a deficit in integrating AV speech cues. AV speech integration has rarely been investigated in neurodevelopmental disorders, and never previously in infants. We probed for the McGurk effect, which occurs when the auditory component of one sound (/ba/) is paired with the visual component of another sound (/ga/), leading to the perception of an illusory third sound (/da/ or /tha/). We measured AV integration in 95 infants/toddlers with Down, fragile X, or Williams syndrome, whom we matched on Chronological and Mental Age to 25 TD infants. We also assessed a more basic AV perceptual ability: sensitivity to matching vs. mismatching AV speech stimuli. Infants with Williams syndrome failed to demonstrate a McGurk effect, indicating poor AV speech integration. Moreover, while the TD children discriminated between matching and mismatching AV stimuli, none of the other groups did, hinting at a basic deficit or delay in AV speech processing, which is likely to constrain subsequent language development. Copyright © 2016 Elsevier Inc. All rights reserved.
The influence of (central) auditory processing disorder in speech sound disorders.
Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Vilela, Nadia; Carvallo, Renata Mota Mamede; Wertzner, Haydée Fiszbein
2016-01-01
Considering the importance of auditory information for the acquisition and organization of phonological rules, the assessment of (central) auditory processing contributes to both the diagnosis and targeting of speech therapy in children with speech sound disorders. To study phonological measures and (central) auditory processing of children with speech sound disorder. Clinical and experimental study, with 21 subjects with speech sound disorder aged between 7.0 and 9.11 years, divided into two groups according to their (central) auditory processing disorder. The assessment comprised tests of phonology, speech inconsistency, and metalinguistic abilities. The group with (central) auditory processing disorder demonstrated greater severity of speech sound disorder. The cutoff value obtained for the process density index was the one that best characterized the occurrence of phonological processes for children above 7 years of age. The comparison among the tests evaluated between the two groups showed differences in some phonological and metalinguistic abilities. Children with an index value above 0.54 demonstrated strong tendencies towards presenting a (central) auditory processing disorder, and this measure was effective to indicate the need for evaluation in children with speech sound disorder. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Australian children with cleft palate achieve age-appropriate speech by 5 years of age.
Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison
2017-12-01
Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono-versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. Most of these speech errors approximate typically-developing children by 5 years. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development. Copyright © 2017 Elsevier B.V. All rights reserved.
Reed, Amanda C.; Centanni, Tracy M.; Borland, Michael S.; Matney, Chanel J.; Engineer, Crystal T.; Kilgard, Michael P.
2015-01-01
Objectives: Hearing loss is a commonly experienced disability in a variety of populations including veterans and the elderly and can often cause significant impairment in the ability to understand spoken language. In this study, we tested the hypothesis that neural and behavioral responses to speech will be differentially impaired in an animal model after two forms of hearing loss. Design: Sixteen female Sprague–Dawley rats were exposed to one of two types of broadband noise which was either moderate or intense. In nine of these rats, auditory cortex recordings were taken 4 weeks after noise exposure (NE). The other seven were pretrained on a speech sound discrimination task prior to NE and were then tested on the same task after hearing loss. Results: Following intense NE, rats had few neural responses to speech stimuli. These rats were able to detect speech sounds but were no longer able to discriminate between speech sounds. Following moderate NE, rats had reorganized cortical maps and altered neural responses to speech stimuli but were still able to accurately discriminate between similar speech sounds during behavioral testing. Conclusions: These results suggest that rats are able to adjust to the neural changes after moderate NE and discriminate speech sounds, but they are not able to recover behavioral abilities after intense NE. Animal models could help clarify the adaptive and pathological neural changes that contribute to speech processing in hearing-impaired populations and could be used to test potential behavioral and pharmacological therapies. PMID:25072238
ERIC Educational Resources Information Center
Johnson, Erin Phinney; Pennington, Bruce F.; Lowenstein, Joanna H.; Nittrouer, Susan
2011-01-01
Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena
2015-06-01
The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.
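The "area under the curve" reported in the abstract above measures how well a model's predicted risk separates children with and without SSD. A minimal sketch of that quantity follows, using the pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The scores below are hypothetical and are not taken from the study.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted risks, for illustration only.
ssd_scores = [0.9, 0.7, 0.4]
no_ssd_scores = [0.8, 0.3, 0.2, 0.1]
example_auc = auc(ssd_scores, no_ssd_scores)  # 10 of the 12 pairs are ordered correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why 0.78 is described as good discrimination.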
Liu, B; Wang, Z; Wu, G; Meng, X
2011-04-28
In this paper, we aim to study the cognitive integration of asynchronous natural or non-natural auditory and visual information in videos of real-world events. Videos with asynchronous semantically consistent or inconsistent natural sound or speech were used as stimuli in order to compare the difference and similarity between multisensory integrations of videos with asynchronous natural sound and speech. The event-related potential (ERP) results showed that N1 and P250 components were elicited irrespective of whether natural sounds were consistent or inconsistent with critical actions in videos. Videos with inconsistent natural sound could elicit N400-P600 effects compared to videos with consistent natural sound, which was similar to the results from unisensory visual studies. Videos with semantically consistent or inconsistent speech could both elicit N1 components. Meanwhile, videos with inconsistent speech would elicit N400-LPN effects in comparison with videos with consistent speech, which showed that this semantic processing was probably related to recognition memory. Moreover, the N400 effect elicited by videos with semantically inconsistent speech was larger and later than that elicited by videos with semantically inconsistent natural sound. Overall, multisensory integration of videos with natural sound or speech could be roughly divided into two stages. For the videos with natural sound, the first stage might reflect the connection between the received information and the stored information in memory; and the second one might stand for the evaluation process of inconsistent semantic information. For the videos with speech, the first stage was similar to the first stage of videos with natural sound; while the second one might be related to recognition memory process. Copyright © 2011 IBRO. Published by Elsevier Ltd. All rights reserved.
Phonetic Recalibration Only Occurs in Speech Mode
ERIC Educational Resources Information Center
Vroomen, Jean; Baart, Martijn
2009-01-01
Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…
Speech Sound Disorders in a Community Study of Preschool Children
ERIC Educational Resources Information Center
McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane
2013-01-01
Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…
Increasing Parental Involvement in Speech-Sound Remediation
ERIC Educational Resources Information Center
Roberts, Micah Renee Ferguson
2014-01-01
Speech therapy homework is a key component of a successful speech therapy program, increasing carryover of learned speech sounds. Poor return rate of homework assigned, with a lack of parental involvement, is a problem. The purpose of this project study was to examine what may increase parental participation in speech therapy homework. Guided by…
Sounds Exaggerate Visual Shape
ERIC Educational Resources Information Center
Sweeny, Timothy D.; Guzman-Martinez, Emmanuel; Ortega, Laura; Grabowecky, Marcia; Suzuki, Satoru
2012-01-01
While perceiving speech, people see mouth shapes that are systematically associated with sounds. In particular, a vertically stretched mouth produces a /woo/ sound, whereas a horizontally stretched mouth produces a /wee/ sound. We demonstrate that hearing these speech sounds alters how we see aspect ratio, a basic visual feature that contributes…
A multimedia PDA/PC speech and language therapy tool for patients with aphasia.
Reeves, Nina; Jefferies, Laura; Cunningham, Sally-Jo; Harris, Catherine
2007-01-01
Aphasia is a speech disorder usually caused by stroke or head injury and may involve a variety of communication difficulties. As 30% of stroke sufferers have a persisting speech and language disorder and therapy resources are low, there is clear scope for the development of technology to support patients between therapy sessions. This paper reports on an empirical study which evaluated SoundHelper, a multimedia application to demonstrate how to pronounce target speech sounds. Two prototypes, involving either video or animation, were developed and evaluated with 20 Speech and Language Therapists. Participants responded positively to both, with the video being preferred because of the perceived extra information provided. The potential for use on portable devices, since internet access is limited in hospitals, is explored in the light of opinions of Augmented and Alternative Communication (AAC) device users in the UK and Europe who have expressed a strong desire for more use of internet services.
The Frame Constraint on Experimentally Elicited Speech Errors in Japanese.
Saito, Akie; Inoue, Tomoyoshi
2017-06-01
The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which operates separately from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the same position within a syllable or word. Most of the evidence for the effect comes from analyses of naturally occurring speech errors in Indo-European languages, and there are few studies examining the effect in experimentally elicited speech errors and in other languages. This study examined whether experimentally elicited sound errors in Japanese exhibit the syllable position effect. In Japanese, the sub-syllabic unit known as "mora" is considered to be a basic sound unit in production. Results showed that the syllable position effect occurred in mora errors, suggesting that the frame constrains the ordering of sounds during speech production.
ERIC Educational Resources Information Center
Leech, Robert; Saygin, Ayse Pinar
2011-01-01
Using functional MRI, we investigated whether auditory processing of both speech and meaningful non-linguistic environmental sounds in superior and middle temporal cortex relies on a complex and spatially distributed neural system. We found that evidence for spatially distributed processing of speech and environmental sounds in a substantial…
Dynamic Assessment of Phonological Awareness for Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Gillam, Sandra Laing; Ford, Mikenzi Bentley
2012-01-01
The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…
The speech perception skills of children with and without speech sound disorder.
Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie
To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes-/k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.
Relationship between individual differences in speech processing and cognitive functions.
Ou, Jinghua; Law, Sam-Po; Fung, Roxana
2015-12-01
A growing body of research has suggested that cognitive abilities may play a role in individual differences in speech processing. The present study took advantage of a widespread linguistic phenomenon of sound change to systematically assess the relationships between speech processing and various components of attention and working memory in the auditory and visual modalities among typically developed Cantonese-speaking individuals. The individual variations in speech processing are captured in an ongoing sound change, tone merging in Hong Kong Cantonese, in which typically developed native speakers are reported to lose the distinctions between some tonal contrasts in perception and/or production. Three groups of participants were recruited, with a first group of good perception and production, a second group of good perception but poor production, and a third group of good production but poor perception. Our findings revealed that modality-independent abilities of attentional switching/control and working memory might contribute to individual differences in patterns of speech perception and production as well as discrimination latencies among typically developed speakers. The findings not only have the potential to generalize to speech processing in other languages, but also broaden our understanding of the omnipresent phenomenon of language change in all languages.
The sensorimotor and social sides of the architecture of speech.
Pezzulo, Giovanni; Barca, Laura; D'Ausilio, Alessandro
2014-12-01
Speech is a complex skill to master. In addition to sophisticated phono-articulatory abilities, speech acquisition requires neuronal systems configured for vocal learning, with adaptable sensorimotor maps that couple heard speech sounds with motor programs for speech production; imitation and self-imitation mechanisms that can train the sensorimotor maps to reproduce heard speech sounds; and a "pedagogical" learning environment that supports tutor learning.
NASA Astrophysics Data System (ADS)
Kaddoura, Tarek; Vadlamudi, Karunakar; Kumar, Shine; Bobhate, Prashant; Guo, Long; Jain, Shreepal; Elgendi, Mohamed; Coe, James Y.; Kim, Daniel; Taylor, Dylan; Tymchak, Wayne; Schuurmans, Dale; Zemp, Roger J.; Adatia, Ian
2016-09-01
We hypothesized that an automated speech-recognition-inspired classification algorithm could differentiate between the heart sounds in subjects with and without pulmonary hypertension (PH) and outperform physicians. Heart sounds, electrocardiograms, and mean pulmonary artery pressures (mPAp) were recorded simultaneously. Heart sound recordings were digitized to train and test speech-recognition-inspired classification algorithms. We used mel-frequency cepstral coefficients to extract features from the heart sounds. Gaussian-mixture models classified the features as PH (mPAp ≥ 25 mmHg) or normal (mPAp < 25 mmHg). Physicians blinded to patient data listened to the same heart sound recordings and attempted a diagnosis. We studied 164 subjects: 86 with mPAp ≥ 25 mmHg (mPAp 41 ± 12 mmHg) and 78 with mPAp < 25 mmHg (mPAp 17 ± 5 mmHg) (p < 0.005). The correct diagnostic rate of the automated speech-recognition-inspired algorithm was 74% compared to 56% by physicians (p = 0.005). The false positive rate for the algorithm was 34% versus 50% (p = 0.04) for clinicians. The false negative rate for the algorithm was 23% versus 68% (p = 0.0002) for physicians. We developed an automated speech-recognition-inspired classification algorithm for the acoustic diagnosis of PH that outperforms physicians and could be used to screen for PH and encourage earlier specialist referral.
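The three rates reported in this abstract can be illustrated with a minimal Python sketch. Only the 25 mmHg threshold comes from the abstract. The pressure values, the predictions, and the function names are hypothetical, and the rate definitions (false positives over all normal subjects, false negatives over all PH subjects) are the conventional ones, which may differ from those used in the study.

```python
PH_THRESHOLD_MMHG = 25.0  # from the abstract: PH is defined as mPAp >= 25 mmHg


def is_ph(mpap_mmhg):
    """Reference label derived from the measured mean pulmonary artery pressure."""
    return mpap_mmhg >= PH_THRESHOLD_MMHG


def diagnostic_rates(true_labels, predicted_labels):
    """Return correct, false positive and false negative rates.

    Assumes both classes are present, so neither denominator is zero.
    """
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and predicted:
            fp += 1
        else:
            tn += 1
    return {
        "correct_rate": (tp + tn) / (tp + tn + fp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical pressures (mmHg) and classifier outputs, for illustration only.
mpap = [41.0, 30.0, 25.0, 52.0, 17.0, 12.0, 24.9, 20.0]
predicted = [True, True, False, True, False, True, False, False]
truth = [is_ph(value) for value in mpap]
rates = diagnostic_rates(truth, predicted)
print(rates)
```

The same function could be applied once to the algorithm's predictions and once to the physicians' diagnoses to obtain comparable figures.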
Liu, Chang; Jin, Su-Hyun
2015-11-01
This study investigated whether native listeners processed speech differently from non-native listeners in a speech detection task. Detection thresholds of Mandarin Chinese and Korean vowels and non-speech sounds in noise, frequency selectivity, and the nativeness of Mandarin Chinese and Korean vowels were measured for Mandarin Chinese- and Korean-native listeners. The two groups of listeners exhibited similar non-speech sound detection and frequency selectivity; however, the Korean listeners had better detection thresholds of Korean vowels than Chinese listeners, while the Chinese listeners performed no better at Chinese vowel detection than the Korean listeners. Moreover, thresholds predicted from an auditory model highly correlated with behavioral thresholds of the two groups of listeners, suggesting that detection of speech sounds not only depended on listeners' frequency selectivity, but also might be affected by their native language experience. Listeners evaluated their native vowels with higher nativeness scores than non-native listeners. Native listeners may have advantages over non-native listeners when processing speech sounds in noise, even without the required phonetic processing; however, such native speech advantages might be offset by Chinese listeners' lower sensitivity to vowel sounds, a characteristic possibly resulting from their sparse vowel system and their greater cognitive and attentional demands for vowel processing.
ERIC Educational Resources Information Center
Skahan, Sarah M.; Watson, Maggie; Lof, Gregory L.
2007-01-01
Purpose: This study examined assessment procedures used by speech-language pathologists (SLPs) when assessing children suspected of having speech sound disorders (SSD). This national survey also determined the information participants obtained from clients' speech samples, evaluation of non-native English speakers, and time spent on assessment.…
Bohm, Lauren A; Nelson, Marc E; Driver, Lynn E; Green, Glenn E
2010-12-01
To determine the importance of prelinguistic babbling by studying patterns of speech and language development after cricotracheal resection in aphonic children. Retrospective review of seven previously aphonic children who underwent cricotracheal resection by our pediatric thoracic airway team. The analyzed variables include age, sex, comorbidity, grade of stenosis, length of resected trachea, and communication methods. Data regarding the children's pre- and postsurgical communication methods, along with their utilization of speech therapy services, were obtained via speech-language pathology evaluations, clinical observations, and a standardized telephone survey supplemented by parental documentation. Postsurgical voice quality was assessed using the Pediatric Voice Outcomes Survey. All seven subjects underwent tracheostomy prior to 2 months of age when corrected for prematurity. The subjects remained aphonic for the entire duration of cannulation. Following cricotracheal resection, they experienced an initial delay in speech acquisition. Vegetative functions were the first laryngeal sounds to emerge. Initially, the children were only able to produce these sounds reflexively, but they subsequently gained voluntary control over these laryngeal functions. All subjects underwent an identifiable stage of canonical babbling that often occurred concomitantly with vocalizations. This was followed by the emergence of true speech. The initial delay in speech acquisition observed following decannulation, along with the presence of a postsurgical canonical stage in all study subjects, supports the hypothesis that babbling is necessary for speech and language development. Furthermore, the presence of babbling is universally evident regardless of the age at which speech develops. Finally, there is no demonstrable correlation between preoperative sign language and rate of speech development. Copyright © 2010 The American Laryngological, Rhinological, and Otological Society, Inc.
Sleep and Native Language Interference Affect Non-Native Speech Sound Learning
Earle, F. Sayako; Myers, Emily B.
2015-01-01
Adults learning a new language are faced with a significant challenge: non-native speech sounds that are perceptually similar to sounds in one’s native language can be very difficult to acquire. Sleep and native language interference, two factors that may help to explain this difficulty in acquisition, are addressed in three studies. Results of Experiment 1 showed that participants trained on a non-native contrast at night improved in discrimination 24 hours after training, while those trained in the morning showed no such improvement. Experiments 2 and 3 addressed the possibility that incidental exposure to perceptually similar native language speech sounds during the day interfered with maintenance in the morning group. Taken together, results show that the ultimate success of non-native speech sound learning depends not only on the similarity of learned sounds to the native language repertoire, but also to interference from native language sounds before sleep. PMID:26280264
Sleep and native language interference affect non-native speech sound learning.
Earle, F Sayako; Myers, Emily B
2015-12-01
Adults learning a new language are faced with a significant challenge: non-native speech sounds that are perceptually similar to sounds in one's native language can be very difficult to acquire. Sleep and native language interference, 2 factors that may help to explain this difficulty in acquisition, are addressed in 3 studies. Results of Experiment 1 showed that participants trained on a non-native contrast at night improved in discrimination 24 hr after training, while those trained in the morning showed no such improvement. Experiments 2 and 3 addressed the possibility that incidental exposure to perceptually similar native language speech sounds during the day interfered with maintenance in the morning group. Taken together, results show that the ultimate success of non-native speech sound learning depends not only on the similarity of learned sounds to the native language repertoire, but also to interference from native language sounds before sleep. (c) 2015 APA, all rights reserved.
Speech vs. singing: infants choose happier sounds
Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle
2013-01-01
Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4–13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119
Multilingual Aspects of Speech Sound Disorders in Children. Communication Disorders across Languages
ERIC Educational Resources Information Center
McLeod, Sharynne; Goldstein, Brian
2012-01-01
Multilingual Aspects of Speech Sound Disorders in Children explores both multilingual and multicultural aspects of children with speech sound disorders. The 30 chapters have been written by 44 authors from 16 different countries about 112 languages and dialects. The book is designed to translate research into clinical practice. It is divided into…
ERIC Educational Resources Information Center
Seitz, Aaron R.; Protopapas, Athanassios; Tsushima, Yoshiaki; Vlahou, Eleni L.; Gori, Simone; Grossberg, Stephen; Watanabe, Takeo
2010-01-01
Learning a second language as an adult is particularly effortful when new phonetic representations must be formed. Therefore the processes that allow learning of speech sounds are of great theoretical and practical interest. Here we examined whether perception of single formant transitions, that is, sound components critical in speech perception,…
Preschoolers' real-time coordination of vocal and facial emotional information.
Berman, Jared M J; Chambers, Craig G; Graham, Susan A
2016-02-01
An eye-tracking methodology was used to examine the time course of 3- and 5-year-olds' ability to link speech bearing different acoustic cues to emotion (i.e., happy-sounding, neutral, and sad-sounding intonation) to photographs of faces reflecting different emotional expressions. Analyses of saccadic eye movement patterns indicated that, for both 3- and 5-year-olds, sad-sounding speech triggered gaze shifts to a matching (sad-looking) face from the earliest moments of speech processing. However, it was not until approximately 800ms into a happy-sounding utterance that preschoolers began to use the emotional cues from speech to identify a matching (happy-looking) face. Complementary analyses based on conscious/controlled behaviors (children's explicit points toward the faces) indicated that 5-year-olds, but not 3-year-olds, could successfully match happy-sounding and sad-sounding vocal affect to a corresponding emotional face. Together, the findings clarify developmental patterns in preschoolers' implicit versus explicit ability to coordinate emotional cues across modalities and highlight preschoolers' greater sensitivity to sad-sounding speech as the auditory signal unfolds in time. Copyright © 2015 Elsevier Inc. All rights reserved.
Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen
2017-07-12
The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables (words of three or more syllables) are important to consider because unlike monosyllables, polysyllables have been associated with phonological processing and literacy difficulties in school-aged children. They therefore have the potential to help identify preschoolers most at risk of future literacy difficulties. Participants were 93 preschool children with SSD from the Sound Start Study. Participants completed the Polysyllable Preschool Test (Baker, 2013) as well as phonological processing, receptive vocabulary, and print knowledge tasks. Cluster analysis was completed, and 2 clusters were identified: low polysyllable accuracy and moderate polysyllable accuracy. The clusters were significantly different based on 2 measures of phonological awareness and measures of receptive vocabulary, rapid naming, and digit span. The clusters were not significantly different on sound matching accuracy or letter, sound, or print concept knowledge. The participants' poor performance on print knowledge tasks suggested that as a group, they were at risk of literacy difficulties but that there was a cluster of participants at greater risk: those with both low polysyllable accuracy and poor phonological processing.
Boldt, Robert; Malinen, Sanna; Seppä, Mika; Tikka, Pia; Savolainen, Petri; Hari, Riitta; Carlson, Synnöve
2013-01-01
Earlier studies have shown considerable intersubject synchronization of brain activity when subjects watch the same movie or listen to the same story. Here we investigated the across-subjects similarity of brain responses to speech and non-speech sounds in a continuous audio drama designed for blind people. Thirteen healthy adults listened for ∼19 min to the audio drama while their brain activity was measured with 3 T functional magnetic resonance imaging (fMRI). An intersubject-correlation (ISC) map, computed across the whole experiment to assess the stimulus-driven extrinsic brain network, indicated statistically significant ISC in temporal, frontal and parietal cortices, cingulate cortex, and amygdala. Group-level independent component (IC) analysis was used to parcel out the brain signals into functionally coupled networks, and the dependence of the ICs on external stimuli was tested by comparing them with the ISC map. This procedure revealed four extrinsic ICs, of which two, covering non-overlapping areas of the auditory cortex, were modulated by both speech and non-speech sounds. The two other extrinsic ICs, one left-hemisphere-lateralized and the other right-hemisphere-lateralized, were speech-related and comprised the superior and middle temporal gyri, temporal poles, and the left angular and inferior orbital gyri. In areas of low ISC, four ICs that were defined as intrinsic fluctuated similarly to the time courses of either the speech-sound-related or all-sounds-related extrinsic ICs. These ICs included the superior temporal gyrus, the anterior insula, and the frontal, parietal and midline occipital cortices. Taken together, substantial intersubject synchronization of cortical activity was observed in subjects listening to an audio drama, with results suggesting that speech is processed in two separate networks, one dedicated to the processing of speech sounds and the other to both speech and non-speech sounds. PMID:23734202
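As a rough illustration of what an intersubject correlation measures, the Python sketch below averages the Pearson correlation over all subject pairs for a single brain location. This pairwise-average formulation is one common way to compute ISC and is assumed here; the abstract does not specify the exact procedure, and the time courses and function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_pairwise_isc(time_courses):
    """Average correlation over all subject pairs at one location."""
    pairs = list(combinations(time_courses, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical stimulus-driven signal shared by three subjects; each subject
# sees it with a different gain and baseline, which correlation ignores.
shared = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 1.0, 0.5]
synchronized = [
    [2.0 * v + 1.0 for v in shared],
    [0.5 * v - 3.0 for v in shared],
    list(shared),
]
isc = mean_pairwise_isc(synchronized)
print(round(isc, 6))
```

Repeating the calculation at every location would yield a map of the kind the abstract describes, with a significance threshold then applied to it.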
Ultrasound visual feedback treatment and practice variability for residual speech sound errors
Preston, Jonathan L.; McCabe, Patricia; Rivera-Campos, Ahmed; Whittle, Jessica L.; Landry, Erik; Maas, Edwin
2014-01-01
Purpose: The goals were to (1) test the efficacy of a motor-learning based treatment that includes ultrasound visual feedback for individuals with residual speech sound errors, and (2) explore whether the addition of prosodic cueing facilitates speech sound learning. Method: A multiple baseline single subject design was used, replicated across 8 participants. For each participant, one sound context was treated with ultrasound plus prosodic cueing for 7 sessions, and another sound context was treated with ultrasound but without prosodic cueing for 7 sessions. Sessions included ultrasound visual feedback as well as non-ultrasound treatment. Word-level probes assessing untreated words were used to evaluate retention and generalization. Results: For most participants, increases in accuracy of target sound contexts at the word level were observed with the treatment program regardless of whether prosodic cueing was included. Generalization between onset singletons and clusters was observed, as well as generalization to sentence-level accuracy. There was evidence of retention during post-treatment probes, including at a two-month follow-up. Conclusions: A motor-based treatment program that includes ultrasound visual feedback can facilitate learning of speech sounds in individuals with residual speech sound errors. PMID:25087938
Movement goals and feedback and feedforward control mechanisms in speech production
Perkell, Joseph S.
2010-01-01
Studies of speech motor control are described that support a theoretical framework in which fundamental control variables for phonemic movements are multi-dimensional regions in auditory and somatosensory spaces. Auditory feedback is used to acquire and maintain auditory goals and in the development and function of feedback and feedforward control mechanisms. Several lines of evidence support the idea that speakers with more acute sensory discrimination acquire more distinct goal regions and therefore produce speech sounds with greater contrast. Feedback modification findings indicate that fluently produced sound sequences are encoded as feedforward commands, and feedback control serves to correct mismatches between expected and produced sensory consequences. PMID:22661828
Movement goals and feedback and feedforward control mechanisms in speech production.
Perkell, Joseph S
2012-09-01
Studies of speech motor control are described that support a theoretical framework in which fundamental control variables for phonemic movements are multi-dimensional regions in auditory and somatosensory spaces. Auditory feedback is used to acquire and maintain auditory goals and in the development and function of feedback and feedforward control mechanisms. Several lines of evidence support the idea that speakers with more acute sensory discrimination acquire more distinct goal regions and therefore produce speech sounds with greater contrast. Feedback modification findings indicate that fluently produced sound sequences are encoded as feedforward commands, and feedback control serves to correct mismatches between expected and produced sensory consequences.
Girolametto, Luigi; Weitzman, Elaine; Greenberg, Janice
2012-02-01
This study examined the efficacy of a professional development program for early childhood educators that facilitated emergent literacy skills in preschoolers. The program, led by a speech-language pathologist, focused on teaching alphabet knowledge, print concepts, sound awareness, and decontextualized oral language within naturally occurring classroom interactions. Twenty educators were randomly assigned to experimental and control groups. Educators each recruited 3 to 4 children from their classrooms to participate. The experimental group participated in 18 hr of group training and 3 individual coaching sessions with a speech-language pathologist. The effects of intervention were examined in 30 min of videotaped interaction, including storybook reading and a post-story writing activity. At posttest, educators in the experimental group used a higher rate of utterances that included print/sound references and decontextualized language than the control group. Similarly, the children in the experimental group used a significantly higher rate of utterances that included print/sound references and decontextualized language compared to the control group. These findings suggest that professional development provided by a speech-language pathologist can yield short-term changes in the facilitation of emergent literacy skills in early childhood settings. Future research is needed to determine the impact of this program on the children's long-term development of conventional literacy skills.
Interventions for Speech Sound Disorders in Children
ERIC Educational Resources Information Center
Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.
2010-01-01
With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…
Fraga González, Gorka; Žarić, Gojko; Tijms, Jurgen; Bonte, Milene; van der Molen, Maurits W.
2015-01-01
A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound associations with a special focus on gains in reading fluency. A sample of 44 children with dyslexia and 23 typical readers, aged 8 to 9, was recruited. Children with dyslexia were randomly allocated to either the training program group (n = 23) or a waiting-list control group (n = 21). The training intensively focused on letter-speech sound mapping and consisted of 34 individual sessions of 45 minutes over a five-month period. The children with dyslexia showed substantial reading gains for the main word reading and spelling measures after training, improving at a faster rate than typical readers and waiting-list controls. The results are interpreted within the conceptual framework assuming a multisensory integration deficit as the most proximal cause of dysfluent reading in dyslexia. Trial Registration: ISRCTN register ISRCTN12783279 PMID:26629707
APPLICATION OF MOWRER'S AUTISTIC THEORY TO THE SPEECH HABILITATION OF MENTALLY RETARDED PUPILS.
ERIC Educational Resources Information Center
RIGRODSKY, S.; AND OTHERS
A SPEECH THERAPY METHOD FOR MENTAL RETARDATES WAS DEVELOPED AND EVALUATED. THE METHOD WAS BASED UPON THE ESTABLISHMENT OF FAVORABLE ASSOCIATIONS IN THE CHILD BETWEEN THE WORDS AND SOUNDS OF LANGUAGE AND THE PRODUCER OF THE LANGUAGE, USING STIMULUS-REWARD AND SITUATION-REWARD PRINCIPLES. TRADITIONAL METHODS OF SPEECH THERAPY WERE ADMINISTERED,…
Soskey, Laura N; Allen, Paul D; Bennetto, Loisa
2017-08-01
One of the earliest observable impairments in autism spectrum disorder (ASD) is a failure to orient to speech and other social stimuli. Auditory spatial attention, a key component of orienting to sounds in the environment, has been shown to be impaired in adults with ASD. Additionally, specific deficits in orienting to social sounds could be related to increased acoustic complexity of speech. We aimed to characterize auditory spatial attention in children with ASD and neurotypical controls, and to determine the effect of auditory stimulus complexity on spatial attention. In a spatial attention task, target and distractor sounds were played randomly in rapid succession from speakers in a free-field array. Participants attended to a central or peripheral location, and were instructed to respond to target sounds at the attended location while ignoring nearby sounds. Stimulus-specific blocks evaluated spatial attention for simple non-speech tones, speech sounds (vowels), and complex non-speech sounds matched to vowels on key acoustic properties. Children with ASD had significantly more diffuse auditory spatial attention than neurotypical children when attending front, indicated by increased responding to sounds at adjacent non-target locations. No significant differences in spatial attention emerged based on stimulus complexity. Additionally, in the ASD group, more diffuse spatial attention was associated with more severe ASD symptoms but not with general inattention symptoms. Spatial attention deficits have important implications for understanding social orienting deficits and atypical attentional processes that contribute to core deficits of ASD. Autism Res 2017, 10: 1405-1416. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
Tkach, Jean A.; Chen, Xu; Freebairn, Lisa A.; Schmithorst, Vincent J.; Holland, Scott K.; Lewis, Barbara A.
2011-01-01
Speech sound disorders (SSD) are the largest group of communication disorders observed in children. One explanation for these disorders is that children with SSD fail to form stable phonological representations when acquiring the speech sound system of their language due to poor phonological memory (PM). The goal of this study was to examine PM in…
Acoustic signals for emergency evacuation.
DOT National Transportation Integrated Search
1979-01-01
Previous studies of binaural hearing suggested that speech sounds are less resistant to masking than are nonspeech sounds; experiments demonstrated that, when the nonspeech sounds are given a message to convey, they act more like speech. Earlier rese...
Applications of Hilbert Spectral Analysis for Speech and Sound Signals
NASA Technical Reports Server (NTRS)
Huang, Norden E.
2003-01-01
A new method for analyzing nonlinear and nonstationary data has been developed, and the natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method, with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same number of zero crossings and extrema, and also having symmetric envelopes defined by the local maxima and minima respectively. The IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of embedded structures. This method can be used to process all acoustic signals. Specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound signal enhancement and filtering. Additionally, the acoustical signals from machinery are essentially the way the machines talk to us. The acoustical signals from the machines, whether sound through air or vibration on the machines, can therefore tell us the operating conditions of the machines. Thus, we can use the acoustic signal to diagnose the problems of machines.
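The first part of the IMF definition above (matching numbers of zero crossings and extrema) can be checked with a short Python sketch. This is an illustrative check only, not the decomposition itself: it ignores the envelope-symmetry condition, the function names are hypothetical, and the tolerance of one (counts equal or differing by at most one) is an assumption taken from the usual statement of the criterion rather than from this abstract.

```python
import math


def count_zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)


def count_extrema(samples):
    """Count strict interior local maxima and minima."""
    return sum(
        1
        for prev, cur, nxt in zip(samples, samples[1:], samples[2:])
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt)
    )


def satisfies_imf_counts(samples, tolerance=1):
    """True if zero-crossing and extrema counts differ by at most `tolerance`."""
    difference = abs(count_zero_crossings(samples) - count_extrema(samples))
    return difference <= tolerance


# Hypothetical test signals: five cycles of a sine wave, and the same wave
# shifted upward so that it oscillates without ever crossing zero.
n = 200
tone = [math.sin(2 * math.pi * 5 * (k + 0.25) / n) for k in range(n)]
offset_tone = [2.0 + s for s in tone]

print(satisfies_imf_counts(tone))         # zero-mean oscillation passes
print(satisfies_imf_counts(offset_tone))  # offset oscillation fails
```

The offset signal shows why the criterion matters: an oscillation riding on a nonzero baseline has extrema but no zero crossings, so it would need further decomposition before a Hilbert transform gives a meaningful instantaneous frequency.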
Foreign Subtitles Help but Native-Language Subtitles Harm Foreign Speech Perception
Mitterer, Holger; McQueen, James M.
2009-01-01
Understanding foreign speech is difficult, in part because of unusual mappings between sounds and words. It is known that listeners in their native language can use lexical knowledge (about how words ought to sound) to learn how to interpret unusual speech-sounds. We therefore investigated whether subtitles, which provide lexical information, support perceptual learning about foreign speech. Dutch participants, unfamiliar with Scottish and Australian regional accents of English, watched Scottish or Australian English videos with Dutch, English or no subtitles, and then repeated audio fragments of both accents. Repetition of novel fragments was worse after Dutch-subtitle exposure but better after English-subtitle exposure. Native-language subtitles appear to create lexical interference, but foreign-language subtitles assist speech learning by indicating which words (and hence sounds) are being spoken. PMID:19918371
Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.
2015-01-01
In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient’s spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e. labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggest that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749
ERIC Educational Resources Information Center
McGrath, Lauren M.; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D.; Pennington, Bruce F.
2008-01-01
This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language…
Dimensions of Early Speech Sound Disorders: A Factor Analytic Study
ERIC Educational Resources Information Center
Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Stein, Catherine M.; Shriberg, Lawrence D.; Iyengar, Sudha K.; Taylor, H. Gerry
2006-01-01
The goal of this study was to classify children with speech sound disorders (SSD) empirically, using factor analytic techniques. Participants were 3-7-year olds enrolled in speech/language therapy (N=185). Factor analysis of an extensive battery of speech and language measures provided support for two distinct factors, representing the skill…
ERIC Educational Resources Information Center
Klein, Harriet B.; Liu-Shea, May
2009-01-01
Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…
Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study
ERIC Educational Resources Information Center
Preston, Jonathan L.; Seki, Ayumi
2011-01-01
Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…
ERIC Educational Resources Information Center
Constantino, John N.; Yang, Dan; Gray, Teddi L.; Gross, Maggie M.; Abbacchi, Anna M.; Smith, Sarah C.; Kohn, Catherine E.; Kuhl, Patricia K.
2007-01-01
Autism spectrum disorders (ASDs) are characterized by correlated deficiencies in social and language development. This study explored a fundamental aspect of auditory information processing (AIP) that is dependent on social experience and critical to early language development: the ability to compartmentalize close-sounding speech sounds into…
Learning Vowel Categories from Maternal Speech in Gurindji Kriol
ERIC Educational Resources Information Center
Jones, Caroline; Meakins, Felicity; Muawiyath, Shujau
2012-01-01
Distributional learning is a proposal for how infants might learn early speech sound categories from acoustic input before they know many words. When categories in the input differ greatly in relative frequency and overlap in acoustic space, research in bilingual development suggests that this affects the course of development. In the present…
Optimizing acoustical conditions for speech intelligibility in classrooms
NASA Astrophysics Data System (ADS)
Yang, Wonyoung
High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields (approximately diffuse and non-diffuse) using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB.
In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with SNS = 4 dB and increased to 0.8 and 1.2 s with decreased SNS = 0 dB, for both normal and hearing-impaired listeners. Hearing-impaired listeners required more early energy than normal-hearing listeners. Reflective ceiling barriers and ceiling reflectors (in particular, parallel front-back rows of semi-circular reflectors) achieved the goal of decreasing reverberation with the least speech-level reduction.
The Multisensory Sound Lab: Sounds You Can See and Feel.
ERIC Educational Resources Information Center
Lederman, Norman; Hendricks, Paula
1994-01-01
A multisensory sound lab has been developed at the Model Secondary School for the Deaf (District of Columbia). A special floor allows vibrations to be felt, and a spectrum analyzer displays frequencies and harmonics visually. The lab is used for science education, auditory training, speech therapy, music and dance instruction, and relaxation…
Duke, Mila Morais; Wolfe, Jace; Schafer, Erin
2016-05-01
Cochlear implant (CI) recipients often experience difficulty understanding speech in noise and speech that originates from a distance. Many CI recipients also experience difficulty understanding speech originating from a television. Use of hearing assistance technology (HAT) may improve speech recognition in noise and for signals that originate from more than a few feet from the listener; however, there are no published studies evaluating the potential benefits of a wireless HAT designed to deliver audio signals from a television directly to a CI sound processor. The objective of this study was to compare speech recognition in quiet and in noise of CI recipients with the use of their CI alone and with the use of their CI and a wireless HAT (Cochlear Wireless TV Streamer). A two-way repeated measures design was used to evaluate performance differences obtained in quiet and in competing noise (65 dBA) with the CI sound processor alone and with the sound processor coupled to the Cochlear Wireless TV Streamer. Sixteen users of Cochlear Nucleus 24 Freedom, CI512, and CI422 implants were included in the study. Participants were evaluated in four conditions including use of the sound processor alone and use of the sound processor with the wireless streamer in quiet and in the presence of competing noise at 65 dBA. Speech recognition was evaluated in each condition with two full lists of Computer-Assisted Speech Perception Testing and Training Sentence-Level Test sentences presented from a light-emitting diode television. Speech recognition in noise was significantly better with use of the wireless streamer compared to participants' performance with their CI sound processor alone. There was also a nonsignificant trend toward better performance in quiet with use of the TV Streamer. Performance was significantly poorer when evaluated in noise compared to performance in quiet when the TV Streamer was not used. 
Use of the Cochlear Wireless TV Streamer designed to stream audio from a television directly to a CI sound processor provides better speech recognition in quiet and in noise when compared to performance obtained with use of the CI sound processor alone. American Academy of Audiology.
ERIC Educational Resources Information Center
Holden, Laura K.; Vandali, Andrew E.; Skinner, Margaret W.; Fourakis, Marios S.; Holden, Timothy A.
2005-01-01
One of the difficulties faced by cochlear implant (CI) recipients is perception of low-intensity speech cues. A. E. Vandali (2001) has developed the transient emphasis spectral maxima (TESM) strategy to amplify short-duration, low-level sounds. The aim of the present study was to determine whether speech scores would be significantly higher with…
Effects of fixed labial orthodontic appliances on speech sound production.
Paley, Jonathan S; Cisneros, George J; Nicolay, Olivier F; LeBlanc, Etoile M
2016-05-01
To explore the impact of fixed labial orthodontic appliances on speech sound production. Speech evaluations were performed on 23 patients with fixed labial appliances. Evaluations were performed immediately prior to appliance insertion, immediately following insertion, and 1 and 2 months post insertion. Baseline dental/skeletal variables were correlated with the ability to accommodate the presence of the appliances. Appliance effects were variable: 44% of the subjects were unaffected, 39% were temporarily affected but adapted within 2 months, and 17% of patients showed persistent sound errors at 2 months. Resolution of acquired sound errors was noted by 8 months post-appliance removal. Maladaptation to appliances was correlated to severity of malocclusion as determined by the Grainger's Treatment Priority Index. Sibilant sounds, most notably /s/, were affected most often. (1) Insertion of fixed labial appliances has an effect on speech sound production. (2) Sibilant and stopped sounds are affected, with /s/ being affected most often. (3) Accommodation to fixed appliances depends on the severity of malocclusion.
Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance
ERIC Educational Resources Information Center
Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina
2013-01-01
Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…
ERIC Educational Resources Information Center
Crowe, Kathryn; Cumming, Tamara; McCormack, Jane; Baker, Elise; McLeod, Sharynne; Wren, Yvonne; Roulstone, Sue; Masso, Sarah
2017-01-01
Early childhood educators are frequently called on to support preschool-aged children with speech sound disorders and to engage these children in activities that target their speech production. This study explored factors that acted as facilitators and/or barriers to the provision of computer-based support for children with speech sound disorders…
ERIC Educational Resources Information Center
McKinnon, David H.; McLeod, Sharynne; Reilly, Sheena
2007-01-01
Purpose: The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to…
ERIC Educational Resources Information Center
Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.
2015-01-01
Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…
Speech recognition: Acoustic phonetic and lexical knowledge representation
NASA Astrophysics Data System (ADS)
Zue, V. W.
1983-02-01
The purpose of this program is to develop a speech database facility under which the acoustic characteristics of speech sounds in various contexts can be studied conveniently; investigate the phonological properties of a large lexicon of, say, 10,000 words, and determine to what extent the phonotactic constraints can be utilized in speech recognition; study the acoustic cues that are used to mark word boundaries; develop a test bed in the form of a large-vocabulary, IWR system to study the interactions of acoustic, phonetic and lexical knowledge; and develop a limited continuous speech recognition system with the goal of recognizing any English word from its spelling in order to assess the interactions of higher-level knowledge sources.
Flaherty, Mary; Dent, Micheal L.; Sawusch, James R.
2017-01-01
The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with “d” or “t” and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal. PMID:28562597
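The cue-trading idea described in the abstract above can be illustrated with a minimal sketch. It assumes a simple logistic identification model in which the probability of a "t" response depends on both VOT and first-formant onset frequency. The weights B0, B_VOT and B_F1 and the function names are hypothetical values chosen for illustration only; they are not taken from the study. Under this assumed model, a change in F1 onset shifts the VOT value at which "d" and "t" responses are equally likely, which is what it means for one cue to offset the other.

```python
import math

# Hypothetical weights for illustration only (not estimated from any data).
B0 = -6.0      # intercept
B_VOT = 0.2    # weight per millisecond of voice onset time
B_F1 = 0.005   # weight per hertz of first-formant onset frequency


def p_voiceless(vot_ms, f1_onset_hz):
    """Probability of a "t" response under the assumed logistic model."""
    z = B0 + B_VOT * vot_ms + B_F1 * f1_onset_hz
    return 1.0 / (1.0 + math.exp(-z))


def vot_boundary(f1_onset_hz):
    """VOT (ms) at which "d" and "t" are equally likely for a given F1 onset."""
    return -(B0 + B_F1 * f1_onset_hz) / B_VOT


# Raising F1 onset from 200 Hz to 400 Hz moves the 50% point to a shorter VOT,
# so the same percept is maintained with a different combination of cues.
boundary_low_f1 = vot_boundary(200.0)
boundary_high_f1 = vot_boundary(400.0)
print(boundary_low_f1, boundary_high_f1)
```

In this sketch a trading relation shows up as a difference between the two boundary values; if the F1 weight were zero, the boundary would not move and there would be no trading.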
Knockdown of Dyslexia-Gene Dcdc2 Interferes with Speech Sound Discrimination in Continuous Streams.
Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P
2016-04-27
Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. In the current study, rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) were trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing. 
These animals also exhibited abnormal neural plasticity after training, which may be a mechanism for why some children with dyslexia do not respond to intervention. These results are in stark contrast to our previous work with a different candidate gene, which caused a different set of deficits. Our results shed some light on possible neural and genetic mechanisms causing heterogeneity in the dyslexic population.
Adults with Specific Language Impairment fail to consolidate speech sounds during sleep.
Earle, F Sayako; Landi, Nicole; Myers, Emily B
2018-02-14
Specific Language Impairment (SLI) is a common learning disability that is associated with poor speech sound representations. These differences in representational quality are thought to impose a burden on spoken language processing. The underlying mechanism to account for impoverished speech sound representations remains in debate. Previous findings that implicate sleep as important for building speech representations, combined with reports of atypical sleep in SLI, motivate the current investigation into a potential consolidation mechanism as a source of impoverished representations in SLI. In the current study, we trained individuals with SLI on a new (nonnative) set of speech sounds, and tracked their perceptual accuracy and neural responses to these sounds over two days. Adults with SLI achieved comparable performance to typical controls during training; however, they demonstrated a distinct lack of overnight gains on the next day. We propose that those with SLI may be impaired in the consolidation of acoustic-phonetic information.
Language Experience Affects Grouping of Musical Instrument Sounds
ERIC Educational Resources Information Center
Bhatara, Anjali; Boll-Avetisyan, Natalie; Agus, Trevor; Höhle, Barbara; Nazzi, Thierry
2016-01-01
Language experience clearly affects the perception of speech, but little is known about whether these differences in perception extend to non-speech sounds. In this study, we investigated rhythmic perception of non-linguistic sounds in speakers of French and German using a grouping task, in which complexity (variability in sounds, presence of…
Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K
2018-04-01
The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.
Li, Fangfang; Munson, Benjamin; Edwards, Jan; Yoneyama, Kiyoko; Hall, Kathleen
2011-01-01
Both English and Japanese have two voiceless sibilant fricatives, an anterior fricative /s/ contrasting with a more posterior fricative /ʃ/. When children acquire sibilant fricatives, English children typically substitute [s] for /ʃ/, whereas Japanese children typically substitute [ʃ] for /s/. This study examined English- and Japanese-speaking adults’ perception of children’s productions of voiceless sibilant fricatives to investigate whether the apparent asymmetry in the acquisition of voiceless sibilant fricatives reported previously in the two languages was due in part to how adults perceive children’s speech. The results of this study show that adult speakers of English and Japanese weighed acoustic parameters differently when identifying fricatives produced by children and that these differences explain, in part, the apparent cross-language asymmetry in fricative acquisition. This study shows that generalizations about universal and language-specific patterns in speech-sound development cannot be determined without considering all sources of variation including speech perception. PMID:21361456
Theoretical Aspects of Speech Production.
ERIC Educational Resources Information Center
Stevens, Kenneth N.
1992-01-01
This paper on speech production in children and youth with hearing impairments summarizes theoretical aspects, including the speech production process, sound sources in the vocal tract, vowel production, and consonant production. Examples of spectra for several classes of vowel and consonant sounds in simple syllables are given. (DB)
The auditory representation of speech sounds in human motor cortex
Cheung, Connie; Hamilton, Liberty S; Johnson, Keith; Chang, Edward F
2016-01-01
In humans, listening to speech evokes neural responses in the motor cortex. This has been controversially interpreted as evidence that speech sounds are processed as articulatory gestures. However, it is unclear what information is actually encoded by such neural activity. We used high-density direct human cortical recordings while participants spoke and listened to speech sounds. Motor cortex neural patterns during listening were substantially different than during articulation of the same sounds. During listening, we observed neural activity in the superior and inferior regions of ventral motor cortex. During speaking, responses were distributed throughout somatotopic representations of speech articulators in motor cortex. The structure of responses in motor cortex during listening was organized along acoustic features similar to auditory cortex, rather than along articulatory features as during speaking. Motor cortex does not contain articulatory representations of perceived actions in speech, but rather, represents auditory vocal information. DOI: http://dx.doi.org/10.7554/eLife.12577.001 PMID:26943778
Results of the Sensory Profile in Children with Suspected Childhood Apraxia of Speech
ERIC Educational Resources Information Center
Newmeyer, Amy J.; Grether, Sandra; Aylward, Christa; deGrauw, Ton; Akers, Rachel; Grasha, Carol; Ishikawa, Keiko; White, Jaye
2009-01-01
Speech-sound disorders are common in preschool-age children, and are characterized by difficulty in the planning and production of speech sounds and their combination into words and sentences. The objective of this study was to review and compare the results of the "Sensory Profile" (Dunn, 1999) in children with a specific type of speech-sound…
ERIC Educational Resources Information Center
Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise
2013-01-01
Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…
Visual Feedback of Tongue Movement for Novel Speech Sound Learning
Katz, William F.; Mehta, Sonya
2015-01-01
Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ/; a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing. PMID:26635571
A New Model of Sensorimotor Coupling in the Development of Speech
ERIC Educational Resources Information Center
Westermann, Gert; Miranda, Eduardo Reck
2004-01-01
We present a computational model that learns a coupling between motor parameters and their sensory consequences in vocal production during a babbling phase. Based on the coupling, preferred motor parameters and prototypically perceived sounds develop concurrently. Exposure to an ambient language modifies perception to coincide with the sounds from…
Zeitler, Daniel M; Dorman, Michael F; Natale, Sarah J; Loiselle, Louise; Yost, William A; Gifford, Rene H
2015-09-01
To assess improvements in sound source localization and speech understanding in complex listening environments after unilateral cochlear implantation for single-sided deafness (SSD). Nonrandomized, open, prospective case series. Tertiary referral center. Nine subjects with a unilateral cochlear implant (CI) for SSD (SSD-CI) were tested. Reference groups for the task of sound source localization included young (n = 45) and older (n = 12) normal-hearing (NH) subjects and 27 bilateral CI (BCI) subjects. Unilateral cochlear implantation. Sound source localization was tested with 13 loudspeakers in a 180-degree arc in front of the subject. Speech understanding was tested with the subject seated in an 8-loudspeaker sound system arrayed in a 360-degree pattern. Directionally appropriate noise, originally recorded in a restaurant, was played from each loudspeaker. Speech understanding in noise was tested using the AzBio sentence test, and sound source localization was quantified using root mean square error. All CI subjects showed poorer-than-normal sound source localization. SSD-CI subjects showed a bimodal distribution of scores: six subjects had scores near the mean of those obtained by BCI subjects, whereas three had scores just outside the 95th percentile of NH listeners. Speech understanding improved significantly in the restaurant environment when the signal was presented to the side of the CI. Cochlear implantation for SSD can offer improved speech understanding in complex listening environments and improved sound source localization in both children and adults. On tasks of sound source localization, SSD-CI patients typically perform as well as BCI patients and, in some cases, achieve scores at the upper boundary of normal performance.
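The root mean square error measure named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the abstract gives only the number of loudspeakers (13) and the arc, so the even 15-degree spacing, the convention that 0 degrees is straight ahead, the function names, and the three example trials are all hypothetical.

```python
import math


def speaker_azimuths(n_speakers=13, arc_degrees=180.0):
    """Azimuths in degrees for loudspeakers spaced evenly across a frontal arc.

    Even spacing and a 0-degree centre are assumptions made for this sketch.
    """
    step = arc_degrees / (n_speakers - 1)
    return [-arc_degrees / 2 + i * step for i in range(n_speakers)]


def rms_error(target_deg, response_deg):
    """Root mean square difference between target and response azimuths."""
    if len(target_deg) != len(response_deg) or not target_deg:
        raise ValueError("target and response lists must be non-empty and equal in length")
    squared = [(r - t) ** 2 for t, r in zip(target_deg, response_deg)]
    return math.sqrt(sum(squared) / len(squared))


azimuths = speaker_azimuths()

# Three made-up trials: the listener is off by one speaker, correct, and off by two.
targets = [azimuths[0], azimuths[6], azimuths[12]]
responses = [azimuths[1], azimuths[6], azimuths[10]]
error = rms_error(targets, responses)
print(round(error, 2))
```

A lower value means responses fall closer to the true source positions; a listener who always picks the correct loudspeaker scores 0 degrees.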
Hodge, Megan M; Gotzke, Carrie L
2014-01-01
This study evaluated construct-related validity of the Test of Children's Speech (TOCS). Intelligibility scores obtained using open-set word identification tasks (orthographic transcription) for the TOCS word and sentence tests and rate scores for the TOCS sentence test (words per minute or WPM and intelligible words per minute or IWPM) were compared for a group of 15 adults (18-30 years of age) with normal speech production and three groups of children: 48 3-6 year-olds with typical speech development and neurological histories (TDS), 48 3-6 year-olds with a speech sound disorder of unknown origin and no identified neurological impairment (SSD-UNK), and 22 3-10 year-olds with dysarthria and cerebral palsy (DYS). As expected, mean intelligibility scores and rates increased with age in the TDS group. However, word test intelligibility, WPM and IWPM scores for the 6 year-olds in the TDS group were significantly lower than those for the adults. The DYS group had significantly lower word and sentence test intelligibility and WPM and IWPM scores than the TDS and SSD-UNK groups. Compared to the TDS group, the SSD-UNK group also had significantly lower intelligibility scores for the word and sentence tests, and significantly lower IWPM, but not WPM scores on the sentence test. The results support the construct-related validity of TOCS as a tool for obtaining intelligibility and rate scores that are sensitive to group differences in 3-6 year-old children, with and without speech sound disorders, and to 3+ year-old children with speech disorders, with and without dysarthria. Readers will describe the word and sentence intelligibility and speaking rate performance of children with typically developing speech at age levels of 3, 4, 5 and 6 years, as measured by the Test of Children's Speech, and how these compare with adult speakers and two groups of children with speech disorders. 
They will also recognize what measures on this test differentiate children with speech sound disorders of unknown origin from children with cerebral palsy and dysarthria. Copyright © 2014 Elsevier Inc. All rights reserved.
Fricative Contrast and Coarticulation in Children With and Without Speech Sound Disorders
Mailend, Marja-Liisa
2017-01-01
Purpose: The purpose of this study was, first, to expand our understanding of typical speech development regarding segmental contrast and anticipatory coarticulation, and second, to explore the potential diagnostic utility of acoustic measures of fricative contrast and anticipatory coarticulation in children with speech sound disorders (SSD). Method: In a cross-sectional design, 10 adults, 17 typically developing children, and 11 children with SSD repeated carrier phrases with novel words with fricatives (/s/, /ʃ/). Dependent measures were 2 ratios derived from spectral mean, obtained from perceptually accurate tokens. Group analyses compared adults and typically developing children; individual children with SSD were compared to their respective typically developing peers. Results: Typically developing children demonstrated smaller fricative acoustic contrast than adults but similar coarticulatory patterns. Three children with SSD showed smaller fricative acoustic contrast than their typically developing peers, and 2 children showed abnormal coarticulation. The 2 children with abnormal coarticulation both had a clinical diagnosis of childhood apraxia of speech; no clear pattern was evident regarding SSD subtype for smaller fricative contrast. Conclusions: Children have not reached adult-like speech motor control for fricative production by age 10 even when fricatives are perceptually accurate. Present findings also suggest that abnormal coarticulation but not reduced fricative contrast is SSD-subtype–specific. Supplemental Materials S1: https://doi.org/10.23641/asha.5103070. S2 and S3: https://doi.org/10.23641/asha.5106508 PMID:28654946
GraphoGame – a catalyst for multi-level promotion of literacy in diverse contexts
Ojanen, Emma; Ronimus, Miia; Ahonen, Timo; Chansa-Kabali, Tamara; February, Pamela; Jere-Folotiya, Jacqueline; Kauppinen, Karri-Pekka; Ketonen, Ritva; Ngorosho, Damaris; Pitkänen, Mikko; Puhakka, Suzanne; Sampa, Francis; Walubita, Gabriel; Yalukanda, Christopher; Pugh, Ken; Richardson, Ulla; Serpell, Robert; Lyytinen, Heikki
2015-01-01
GraphoGame (GG) is originally a technology-based intervention method for supporting children with reading difficulties. It is now known that children who face problems in reading acquisition have difficulties in learning to differentiate and manipulate speech sounds and consequently, in connecting these sounds to corresponding letters. GG was developed to provide intensive training in matching speech sounds and larger units of speech to their written counterparts. GG has been shown to benefit children with reading difficulties and the game is now available for all Finnish school children for literacy support. Presently millions of children in Africa fail to learn to read despite years of primary school education. As many African languages have transparent writing systems similar in structure to Finnish, it was hypothesized that GG-based training of letter-sound correspondences could also be effective in supporting children’s learning in African countries. In this article we will describe how GG has been developed from a Finnish dyslexia prevention game to an intervention method that can be used not only to improve children’s reading performance but also to raise teachers’ and parents’ awareness of the development of reading skill and effective reading instruction methods. We will also provide an overview of the GG activities in Zambia, Kenya, Tanzania, and Namibia, and the potential to promote education for all with a combination of scientific research and mobile learning. PMID:26113825
Duration of the speech disfluencies of beginning stutterers.
Zebrowski, P M
1991-06-01
This study compared the duration of within-word disfluencies and the number of repeated units per instance of sound/syllable and whole-word repetitions of beginning stutterers to those produced by age- and sex-matched nonstuttering children. Subjects were 10 stuttering children (9 males and 1 female; mean age 4:1 [years:months]; age range 3:2-5:1) and 10 nonstuttering children (9 males and 1 female; mean age 4:0; age range 2:10-5:1). Mothers of the stuttering children reported that their children had been stuttering for 1 year or less. One 300-word conversational speech sample from each of the stuttering and nonstuttering children was analyzed for (a) mean duration of sound/syllable repetition and sound prolongation, (b) mean number of repeated units per instance of sound/syllable and whole-word repetition, and (c) various related measures of the frequency of all between- and within-word speech disfluencies. There were no significant between-group differences for either the duration of acoustically measured sound/syllable repetitions and sound prolongations or the number of repeated units per instance of sound/syllable and whole-word repetition. Unlike frequency and type of speech disfluency produced, average duration of within-word disfluencies and number of repeated units per repetition do not differentiate the disfluent speech of beginning stutterers and their nonstuttering peers. Additional analyses support findings from previous perceptual work that type and frequency of speech disfluency, not duration, are the principal characteristics listeners use in distinguishing these two talker groups.
Intensive Treatment with Ultrasound Visual Feedback for Speech Sound Errors in Childhood Apraxia
Preston, Jonathan L.; Leece, Megan C.; Maas, Edwin
2016-01-01
Ultrasound imaging is an adjunct to traditional speech therapy that has been shown to be beneficial in the remediation of speech sound errors. Ultrasound biofeedback can be utilized during therapy to provide clients with additional knowledge about their tongue shapes when attempting to produce sounds that are erroneous. The additional feedback may assist children with childhood apraxia of speech (CAS) in stabilizing motor patterns, thereby facilitating more consistent and accurate productions of sounds and syllables. However, due to its specialized nature, ultrasound visual feedback is a technology that is not widely available to clients. Short-term intensive treatment programs are one option that can be utilized to expand access to ultrasound biofeedback. Schema-based motor learning theory suggests that short-term intensive treatment programs (massed practice) may assist children in acquiring more accurate motor patterns. In this case series, three participants ages 10–14 years diagnosed with CAS attended 16 h of speech therapy over a 2-week period to address residual speech sound errors. Two participants had distortions on rhotic sounds, while the third participant demonstrated lateralization of sibilant sounds. During therapy, cues were provided to assist participants in obtaining a tongue shape that facilitated a correct production of the erred sound. Additional practice without ultrasound was also included. Results suggested that all participants showed signs of acquisition of sounds in error. Generalization and retention results were mixed. One participant showed generalization and retention of sounds that were treated; one showed generalization but limited retention; and the third showed no evidence of generalization or retention. Individual characteristics that may facilitate generalization are discussed.
Short-term intensive treatment programs using ultrasound biofeedback may result in the acquisition of more accurate motor patterns and improved articulation of sounds previously in error, with varying levels of generalization and retention. PMID:27625603
Human emotions track changes in the acoustic environment.
Ma, Weiyi; Thompson, William Forde
2015-11-24
Emotional responses to biologically significant events are essential for human survival. Do human emotions lawfully track changes in the acoustic environment? Here we report that changes in acoustic attributes that are well known to interact with human emotions in speech and music also trigger systematic emotional responses when they occur in environmental sounds, including sounds of human actions, animal calls, machinery, or natural phenomena, such as wind and rain. Three changes in acoustic attributes known to signal emotional states in speech and music were imposed upon 24 environmental sounds. Evaluations of stimuli indicated that human emotions track such changes in environmental sounds just as they do for speech and music. Such changes not only influenced evaluations of the sounds themselves, they also affected the way accompanying facial expressions were interpreted emotionally. The findings illustrate that human emotions are highly attuned to changes in the acoustic environment, and reignite a discussion of Charles Darwin's hypothesis that speech and music originated from a common emotional signal system based on the imitation and modification of environmental sounds.
A lab-controlled simulation of a letter-speech sound binding deficit in dyslexia.
Aravena, Sebastián; Snellings, Patrick; Tijms, Jurgen; van der Molen, Maurits W
2013-08-01
Dyslexic and non-dyslexic readers engaged in a short training aimed at learning eight basic letter-speech sound correspondences within an artificial orthography. We examined whether a letter-speech sound binding deficit is behaviorally detectable within the initial steps of learning a novel script. Both letter knowledge and word reading ability within the artificial script were assessed. An additional goal was to investigate the influence of instructional approach on the initial learning of letter-speech sound correspondences. We assigned children from both groups to one of three different training conditions: (a) explicit instruction, (b) implicit associative learning within a computer game environment, or (c) a combination of (a) and (b) in which explicit instruction is followed by implicit learning. Our results indicated that dyslexics were outperformed by the controls on a time-pressured binding task and a word reading task within the artificial orthography, providing empirical support for the view that a letter-speech sound binding deficit is a key factor in dyslexia. A combination of explicit instruction and implicit techniques proved to be a more powerful tool in the initial teaching of letter-sound correspondences than implicit training alone. Copyright © 2013 Elsevier Inc. All rights reserved.
Getting the cocktail party started: masking effects in speech perception
Evans, S; McGettigan, C; Agnew, ZK; Rosen, S; Scott, SK
2016-01-01
Spoken conversations typically take place in noisy environments and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous functional Magnetic Resonance Imaging (fMRI), whilst they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioural task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream, and that individuals who perform better in speech in noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment: activity was found within right-lateralised frontal regions, consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise. PMID:26696297
Non-speech oral motor treatment for children with developmental speech sound disorders.
Lee, Alice S-Y; Gibbon, Fiona E
2015-03-25
Children with developmental speech sound disorders have difficulties in producing the speech sounds of their native language. These speech difficulties could be due to structural, sensory or neurophysiological causes (e.g. hearing impairment), but more often the cause of the problem is unknown. One treatment approach used by speech-language therapists/pathologists is non-speech oral motor treatment (NSOMT). NSOMTs are non-speech activities that aim to stimulate or improve speech production and treat specific speech errors. For example, using exercises such as smiling, pursing, blowing into horns, blowing bubbles, and lip massage to target lip mobility for the production of speech sounds involving the lips, such as /p/, /b/, and /m/. The efficacy of this treatment approach is controversial, and evidence regarding the efficacy of NSOMTs needs to be examined. To assess the efficacy of non-speech oral motor treatment (NSOMT) in treating children with developmental speech sound disorders who have speech errors. In April 2014 we searched the Cochrane Central Register of Controlled Trials (CENTRAL), Ovid MEDLINE (R) and Ovid MEDLINE In-Process & Other Non-Indexed Citations, EMBASE, Education Resources Information Center (ERIC), PsycINFO and 11 other databases. We also searched five trial and research registers, checked the reference lists of relevant titles identified by the search and contacted researchers to identify other possible published and unpublished studies. Randomised and quasi-randomised controlled trials that compared (1) NSOMT versus placebo or control; and (2) NSOMT as adjunctive treatment or speech intervention versus speech intervention alone, for children aged three to 16 years with developmental speech sound disorders, as judged by a speech and language therapist. Individuals with an intellectual disability (e.g. Down syndrome) or a physical disability were not excluded. 
The Trials Search Co-ordinator of the Cochrane Developmental, Psychosocial and Learning Problems Group and one review author ran the searches. Two review authors independently screened titles and abstracts to eliminate irrelevant studies, extracted data from the included studies and assessed risk of bias in each of these studies. In cases of ambiguity or information missing from the paper, we contacted trial authors. This review identified three studies (from four reports) involving a total of 22 children that investigated the efficacy of NSOMT as adjunctive treatment to conventional speech intervention versus conventional speech intervention for children with speech sound disorders. One study, a randomised controlled trial (RCT), included four boys aged seven years one month to nine years six months - all had speech sound disorders, and two had additional conditions (one was diagnosed as "communication impaired" and the other as "multiply disabled"). Of the two quasi-randomised controlled trials, one included 10 children (six boys and four girls), aged five years eight months to six years nine months, with speech sound disorders as a result of tongue thrust, and the other study included eight children (four boys and four girls), aged three to six years, with moderate to severe articulation disorder only. Two studies did not find NSOMT as adjunctive treatment to be more effective than conventional speech intervention alone, as both intervention and control groups made similar improvements in articulation after receiving treatments. One study reported a change in postintervention articulation test results but used an inappropriate statistical test and did not report the results clearly. None of the included studies examined the effects of NSOMTs on any other primary outcomes, such as speech intelligibility, speech physiology and adverse effects, or on any of the secondary outcomes such as listener acceptability. The RCT was judged at low risk for selection bias.
The two quasi-randomised trials used randomisation but did not report the method for generating the random sequence and were judged as having unclear risk of selection bias. The three included studies were deemed to have high risk of performance bias as, given the nature of the intervention, blinding of participants was not possible. Only one study implemented blinding of outcome assessment and was at low risk for detection bias. One study showed high risk of other bias as the baseline characteristics of participants seemed to be unequal. The sample size of each of the included studies was very small, which means it is highly likely that participants in these studies were not representative of its target population. In the light of these serious limitations in methodology, the overall quality of the evidence provided by the included trials is judged to be low. Therefore, further research is very likely to have an important impact on our confidence in the estimate of treatment effect and is likely to change the estimate. The three included studies were small in scale and had a number of serious methodological limitations. In addition, they covered limited types of NSOMTs for treating children with speech sound disorders of unknown origin with the sounds /s/ and /z/. Hence, we judged the overall applicability of the evidence as limited and incomplete. Results of this review are consistent with those of previous reviews: Currently no strong evidence suggests that NSOMTs are an effective treatment or an effective adjunctive treatment for children with developmental speech sound disorders. Lack of strong evidence regarding the treatment efficacy of NSOMTs has implications for clinicians when they make decisions in relation to treatment plans. Well-designed research is needed to carefully investigate NSOMT as a type of treatment for children with speech sound disorders.
Karipidis, Iliana; Pleisch, Georgette; Röthlisberger, Martina; Hofstetter, Christoph; Dornbierer, Dario; Stämpfli, Philipp; Brem, Silvia
2017-02-01
Learning letter-speech sound correspondences is a major step in reading acquisition and is severely impaired in children with dyslexia. Up to now, it remains largely unknown how quickly neural networks adopt specific functions during audiovisual integration of linguistic information when prereading children learn letter-speech sound correspondences. Here, we simulated the process of learning letter-speech sound correspondences in 20 prereading children (6.13-7.17 years) at varying risk for dyslexia by training artificial letter-speech sound correspondences within a single experimental session. Subsequently, we acquired simultaneously event-related potentials (ERP) and functional magnetic resonance imaging (fMRI) scans during implicit audiovisual presentation of trained and untrained pairs. Audiovisual integration of trained pairs correlated with individual learning rates in right superior temporal, left inferior temporal, and bilateral parietal areas and with phonological awareness in left temporal areas. In correspondence, a differential left-lateralized parietooccipitotemporal ERP at 400 ms for trained pairs correlated with learning achievement and familial risk. Finally, a late (650 ms) posterior negativity indicating audiovisual congruency of trained pairs was associated with increased fMRI activation in the left occipital cortex. Taken together, a short (<30 min) letter-speech sound training initializes audiovisual integration in neural systems that are responsible for processing linguistic information in proficient readers. To conclude, the ability to learn grapheme-phoneme correspondences, the familial history of reading disability, and phonological awareness of prereading children account for the degree of audiovisual integration in a distributed brain network. Such findings on emerging linguistic audiovisual integration could allow for distinguishing between children with typical and atypical reading development. Hum Brain Mapp 38:1038-1055, 2017. 
© 2016 Wiley Periodicals, Inc.
Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis
NASA Astrophysics Data System (ADS)
Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert
2005-12-01
A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
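One of the simple classifiers mentioned in this abstract, the minimum-distance classifier, can be sketched as follows. This is only an illustration under stated assumptions: the two features (labelled here as modulation depth and harmonicity) and all numeric values are invented, Euclidean distance to the class mean is assumed as the distance measure, and nothing here reproduces the authors' feature set or results.

```python
import math

def class_centroids(training):
    """Average the feature vectors of each class to get one centroid per label."""
    centroids = {}
    for label, vectors in training.items():
        count = len(vectors)
        centroids[label] = [sum(column) / count for column in zip(*vectors)]
    return centroids

def classify(features, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical training data: [modulation depth, harmonicity], values invented.
training = {
    "clean speech": [[0.9, 0.6], [0.8, 0.7]],
    "speech in noise": [[0.5, 0.4], [0.6, 0.3]],
    "noise": [[0.1, 0.1], [0.2, 0.0]],
    "music": [[0.3, 0.9], [0.4, 1.0]],
}
centroids = class_centroids(training)

print(classify([0.85, 0.65], centroids))
print(classify([0.15, 0.05], centroids))
```

In this toy layout the "speech in noise" centroid sits between the "clean speech" and "noise" centroids. This arrangement was chosen for the illustration and is not a finding of the paper, but it shows how a mixture class can be easy to confuse with its neighbours under a distance rule.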
Simultaneous F0-F1 modifications of Arabic for the improvement of natural-sounding
NASA Astrophysics Data System (ADS)
Ykhlef, F.; Bensebti, M.
2013-03-01
Pitch (F0) modification is one of the most important problems in the area of speech synthesis. Several techniques have been developed in the literature to achieve this goal. The main restrictions of these techniques are in the modification range and the synthesised speech quality, intelligibility and naturalness. The control of formants in a spoken language can significantly improve the naturalness of the synthesised speech. This improvement is mainly dependent on the control of the first formant (F1). Inspired by this observation, this article proposes a new approach that modifies both F0 and F1 of Arabic voiced sounds in order to improve the naturalness of the pitch-shifted speech. The developed strategy takes a parallel processing approach, in which the analysis segments are decomposed into sub-bands in the wavelet domain, modified in the desired sub-band by using a resampling technique and reconstructed without affecting the remaining sub-bands. Pitch marking and voicing detection are performed in the frequency decomposition step based on the comparison of the multi-level approximation and detail signals. The performance of the proposed technique is evaluated by listening tests and compared to the pitch synchronous overlap and add (PSOLA) technique in the third approximation level. Experimental results have shown that manipulating F0 in conjunction with F1 in the wavelet domain gives more natural-sounding synthesised speech than the classical pitch modification technique. This improvement was appropriate for high pitch modifications.
ERIC Educational Resources Information Center
Velleman, Shelley L.
2011-01-01
Although not the focus of her article, phonological development in young children with speech sound disorders of various types is highly germane to Stoel-Gammon's discussion (this issue) for at least two primary reasons. Most obvious is that typical processes and milestones of phonological development are the standards and benchmarks against which…
Differences in Talker Recognition by Preschoolers and Adults
ERIC Educational Resources Information Center
Creel, Sarah C.; Jimenez, Sofia R.
2012-01-01
Talker variability in speech influences language processing from infancy through adulthood and is inextricably embedded in the very cues that identify speech sounds. Yet little is known about developmental changes in the processing of talker information. On one account, children have not yet learned to separate speech sound variability from…
Linkage of Speech Sound Disorder to Reading Disability Loci
ERIC Educational Resources Information Center
Smith, Shelley D.; Pennington, Bruce F.; Boada, Richard; Shriberg, Lawrence D.
2005-01-01
Background: Speech sound disorder (SSD) is a common childhood disorder characterized by developmentally inappropriate errors in speech production that greatly reduce intelligibility. SSD has been found to be associated with later reading disability (RD), and there is also evidence for both a cognitive and etiological overlap between the two…
Techniques for decoding speech phonemes and sounds: A concept
NASA Technical Reports Server (NTRS)
Lokerson, D. C.; Holby, H. G.
1975-01-01
Techniques studied involve conversion of speech sounds into machine-compatible pulse trains. (1) Voltage-level quantizer produces number of output pulses proportional to amplitude characteristics of vowel-type phoneme waveforms. (2) Pulses produced by quantizer of first speech formants are compared with pulses produced by second formants.
Augmenting Comprehension of Speech in Noise with a Facial Avatar and Its Effect on Performance
2010-12-01
develop some aspects of speech more slowly than sighted children. In addition to “bleeping” or blanking the sound of censored words, network...the speech. Movie files were exported at a resolution of 600 by 800 pixels at 30 frames per second and were four seconds in length. It should be...noted that the speech, and synchronized facial movements, began one second after each movie file started. This delay was designed to ensure that the
Vuolo, Janet; Goffman, Lisa
2017-01-01
This exploratory treatment study used phonetic transcription and speech kinematics to examine changes in segmental and articulatory variability. Nine children, ages 4- to 8-years-old, served as participants, including two with childhood apraxia of speech (CAS), five with speech sound disorder (SSD), and two who were typically developing (TD). Children practised producing agent + action phrases in an imitation task (low linguistic load) and a retrieval task (high linguistic load) over five sessions. In the imitation task in session one, both participants with CAS showed high degrees of segmental and articulatory variability. After five sessions, imitation practice resulted in increased articulatory variability for five participants. Retrieval practice resulted in decreased articulatory variability in three participants with SSD. These results suggest that short-term speech production practice in rote imitation disrupts articulatory control in children with and without CAS. In contrast, tasks that require linguistic processing may scaffold learning for children with SSD but not CAS. PMID:27960554
A nationwide survey of nonspeech oral motor exercise use: implications for evidence-based practice.
Lof, Gregory L; Watson, Maggie M
2008-07-01
A nationwide survey was conducted to determine if speech-language pathologists (SLPs) use nonspeech oral motor exercises (NSOMEs) to address children's speech sound problems. For those SLPs who used NSOMEs, the survey also identified (a) the types of NSOMEs used by the SLPs, (b) the SLPs' underlying beliefs about why they use NSOMEs, (c) clinicians' training for these exercises, (d) the application of NSOMEs across various clinical populations, and (e) specific tasks/procedures/tools that are used for intervention. A total of 2,000 surveys were mailed to a randomly selected subgroup of SLPs, obtained from the American Speech-Language-Hearing Association (ASHA) membership roster, who self-identified that they worked in various settings with children who have speech sound problems. The questions required answers that used both a forced choice and Likert-type scales. The response rate was 27.5% (537 out of 2,000). Of these respondents, 85% reported using NSOMEs to deal with children's speech sound production problems. Those SLPs reported that the research literature supports the use of NSOMEs, and that they learned to use these techniques from continuing education events. They also stated that NSOMEs can help improve the speech of children from disparate etiologies, and "warming up" and strengthening the articulators are important components of speech sound therapy. There are theoretical and research data that challenge both the use of NSOMEs and the efficacy of such exercises in resolving speech sound problems. SLPs need to follow the concepts of evidence-based practice in order to determine if these exercises are actually effective in bringing about changes in speech productions.
A new model of sensorimotor coupling in the development of speech.
Westermann, Gert; Reck Miranda, Eduardo
2004-05-01
We present a computational model that learns a coupling between motor parameters and their sensory consequences in vocal production during a babbling phase. Based on the coupling, preferred motor parameters and prototypically perceived sounds develop concurrently. Exposure to an ambient language modifies perception to coincide with the sounds from the language. The model develops motor mirror neurons that are active when an external sound is perceived. An extension to visual mirror neurons for oral gestures is suggested.
NASA Astrophysics Data System (ADS)
Nishiura, Takanobu; Nakamura, Satoshi
2002-11-01
It is very important to capture distant-talking speech with high quality for a hands-free speech interface. A microphone array is an ideal candidate for this purpose. However, this approach requires localizing the target talker. Conventional talker localization algorithms in multiple sound source environments not only have difficulty localizing the multiple sound sources accurately, but also have difficulty localizing the target talker among known multiple sound source positions. To cope with these problems, we propose a new talker localization algorithm consisting of two algorithms. One is a DOA (direction of arrival) estimation algorithm for multiple sound source localization based on the CSP (cross-power spectrum phase) coefficient addition method. The other is a statistical sound source identification algorithm based on a GMM (Gaussian mixture model) for localizing the target talker position among localized multiple sound sources. In this paper, we particularly focus on the talker localization performance based on the combination of these two algorithms with a microphone array. We conducted evaluation experiments in real noisy reverberant environments. As a result, we confirmed that multiple sound signals can be identified accurately as "speech" or "non-speech" by the proposed algorithm. [Work supported by ATR, and MEXT of Japan.]
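The CSP idea referred to in this abstract can be sketched for a single microphone pair: the cross-power spectrum of the two channels is normalized to unit magnitude so that only phase remains, and the peak of its inverse transform gives the inter-channel delay. The sketch below is not the authors' coefficient addition method, which combines several pairs. The function names, the far-field two-microphone geometry, the 16 kHz sample rate, the 0.2 m spacing and the 343 m/s speed of sound are all assumptions made for illustration.

```python
import cmath
import math
import random

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short frames."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out

def csp_coefficients(x1, x2):
    """Phase-only (whitened) circular cross-correlation of two equal-length frames."""
    spec1, spec2 = dft(x1), dft(x2)
    cross = [a * b.conjugate() for a, b in zip(spec1, spec2)]
    # Keep only the phase of each bin; bins with negligible energy are zeroed.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    return [v.real for v in dft(whitened, inverse=True)]

def estimate_delay(x1, x2):
    """Lag in samples by which x1 trails x2 (negative if x1 leads)."""
    csp = csp_coefficients(x1, x2)
    peak = max(range(len(csp)), key=csp.__getitem__)
    return peak if peak <= len(csp) // 2 else peak - len(csp)

def delay_to_doa_deg(delay_samples, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle from broadside for a two-microphone pair."""
    ratio = speed_of_sound * delay_samples / (sample_rate_hz * mic_spacing_m)
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))

# Synthetic check: microphone 1 hears a circularly shifted copy of microphone 2.
rng = random.Random(0)
mic2 = [rng.uniform(-1.0, 1.0) for _ in range(64)]
true_delay = 5
mic1 = [mic2[(t - true_delay) % len(mic2)] for t in range(len(mic2))]

delay = estimate_delay(mic1, mic2)
angle = delay_to_doa_deg(delay, sample_rate_hz=16000, mic_spacing_m=0.2)
print(delay, round(angle, 1))
```

Discarding the magnitude makes the correlation peak sharp regardless of the source spectrum, which is one reason phase-based methods are commonly used in reverberant rooms. A practical implementation would use a fast Fourier transform rather than the naive transform shown here.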
ERIC Educational Resources Information Center
Girolametto, Luigi; Weitzman, Elaine; Greenberg, Janice
2012-01-01
Purpose: This study examined the efficacy of a professional development program for early childhood educators that facilitated emergent literacy skills in preschoolers. The program, led by a speech-language pathologist, focused on teaching alphabet knowledge, print concepts, sound awareness, and decontextualized oral language within naturally…
Influence of Gestational Age and Postnatal Age on Speech Sound Processing in NICU infants
Key, Alexandra P.F.; Lambert, E. Warren; Aschner, Judy L.; Maitre, Nathalie L.
2012-01-01
The study examined the effect of gestational (GA) and postnatal (PNA) age on speech sound perception in infants. Auditory ERPs were recorded in response to speech sounds (CV syllables) in 50 infant NICU patients (born at 24–40 weeks gestation) prior to discharge. Efficiency of speech perception was quantified as absolute difference in mean amplitudes of ERPs in response to vowel (/a/–/u/) and consonant (/b/–/g/, /d/–/g/) contrasts within 150–250, 250–400, 400–700 ms after stimulus onset. Results indicated that both GA and PNA affected speech sound processing. These effects were more pronounced for consonant than vowel contrasts. Increasing PNA was associated with greater sound discrimination in infants born at or after 30 weeks GA, while minimal PNA-related changes were observed for infants with GA less than 30 weeks. Our findings suggest that a certain level of brain maturity at birth is necessary to benefit from postnatal experience in the first 4 months of life, and both gestational and postnatal ages need to be considered when evaluating infant brain responses. PMID:22332725
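The discrimination measure described above (absolute difference in mean ERP amplitude within fixed post-stimulus windows) can be sketched as follows. The array layout, the half-open window convention and the function name are assumptions for illustration. The study's actual preprocessing is not given in the abstract.

```python
import numpy as np

# Analysis windows in ms after stimulus onset, as listed in the abstract
WINDOWS_MS = [(150, 250), (250, 400), (400, 700)]

def window_contrast(erp_a, erp_b, times_ms, windows=WINDOWS_MS):
    """Return the absolute difference in mean amplitude between two averaged
    ERPs for each window (lower bound inclusive, upper bound exclusive)."""
    result = {}
    for lo, hi in windows:
        mask = (times_ms >= lo) & (times_ms < hi)
        result[(lo, hi)] = abs(erp_a[mask].mean() - erp_b[mask].mean())
    return result
```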
Acoustic Event Detection and Classification
NASA Astrophysics Data System (ADS)
Temko, Andrey; Nadeu, Climent; Macho, Dušan; Malkin, Robert; Zieger, Christian; Omologo, Maurizio
The human activity that takes place in meeting rooms or classrooms is reflected in a rich variety of acoustic events (AE), produced either by the human body or by objects handled by humans, so the determination of both the identity of sounds and their position in time may help to detect and describe that human activity. Indeed, speech is usually the most informative sound, but other kinds of AEs may also carry useful information, for example, clapping or laughing inside a speech, a strong yawn in the middle of a lecture, a chair moving or a door slam when the meeting has just started. Additionally, detection and classification of sounds other than speech may be useful to enhance the robustness of speech technologies like automatic speech recognition.
Kujala, T; Kuuluvainen, S; Saalasti, S; Jansson-Verkasalo, E; von Wendt, L; Lepistö, T
2010-09-01
Asperger syndrome, belonging to the autistic spectrum of disorders, involves deficits in social interaction and prosodic use of language but normal development of formal language abilities. Auditory processing involves both hyper- and hypoactive reactivity to acoustic changes. Responses composed of mismatch negativity (MMN) and obligatory components were recorded for five types of deviations in syllables (vowel, vowel duration, consonant, syllable frequency, syllable intensity) with the multi-feature paradigm from 8-12-year old children with Asperger syndrome. Children with Asperger syndrome had larger MMNs for intensity and smaller MMNs for frequency changes than typically developing children, whereas no MMN group differences were found for the other deviant stimuli. Furthermore, children with Asperger syndrome performed more poorly than controls in Comprehension of Instructions subtest of a language test battery. Cortical speech-sound discrimination is aberrant in children with Asperger syndrome. This is evident both as hypersensitive and depressed neural reactions to speech-sound changes, and is associated with features (frequency, intensity) which are relevant for prosodic processing. The multi-feature MMN paradigm, which includes variation and thereby resembles natural speech hearing circumstances, suggests abnormal pattern of speech discrimination in Asperger syndrome, including both hypo- and hypersensitive responses for speech features. 2010 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Schafer, Erin C; Romine, Denise; Musgrave, Elizabeth; Momin, Sadaf; Huynh, Christy
2013-01-01
Previous research has suggested that electrically coupled frequency modulation (FM) systems substantially improved speech-recognition performance in noise in individuals with cochlear implants (CIs). However, there is limited evidence to support the use of electromagnetically coupled (neck loop) FM receivers with contemporary CI sound processors containing telecoils. The primary goal of this study was to compare speech-recognition performance in noise and subjective ratings of adolescents and adults using one of three contemporary CI sound processors coupled to electromagnetically and electrically coupled FM receivers from Oticon. A repeated-measures design was used to compare speech-recognition performance in noise and subjective ratings without and with the FM systems across three test sessions (Experiment 1) and to compare performance at different FM-gain settings (Experiment 2). Descriptive statistics were used in Experiment 3 to describe output differences measured through a CI sound processor. Experiment 1 included nine adolescents or adults with unilateral or bilateral Advanced Bionics Harmony (n = 3), Cochlear Nucleus 5 (n = 3), and MED-EL OPUS 2 (n = 3) CI sound processors. In Experiment 2, seven of the original nine participants were tested. In Experiment 3, electroacoustic output was measured from a Nucleus 5 sound processor when coupled to the electromagnetically coupled Oticon Arc neck loop and electrically coupled Oticon R2. In Experiment 1, participants completed a field trial with each FM receiver and three test sessions that included speech-recognition performance in noise and a subjective rating scale. In Experiment 2, participants were tested in three receiver-gain conditions. Results in both experiments were analyzed using repeated-measures analysis of variance. Experiment 3 involved electroacoustic-test measures to determine the monitor-earphone output of the CI alone and CI coupled to the two FM receivers. 
The results in Experiment 1 suggested that both FM receivers provided significantly better speech-recognition performance in noise than the CI alone; however, the electromagnetically coupled receiver provided significantly better speech-recognition performance in noise and better ratings in some situations than the electrically coupled receiver when set to the same gain. In Experiment 2, the primary analysis suggested significantly better speech-recognition performance in noise for the neck-loop versus electrically coupled receiver, but a second analysis, using the best performance across gain settings for each device, revealed no significant differences between the two FM receivers. Experiment 3 revealed monitor-earphone output differences in the Nucleus 5 sound processor for the two FM receivers when set to the +8 setting used in Experiment 1 but equal output when the electrically coupled device was set to a +16 gain setting and the electromagnetically coupled device was set to the +8 gain setting. Individuals with contemporary sound processors may show more favorable speech-recognition performance in noise with electromagnetically coupled FM systems (i.e., Oticon Arc), which is most likely related to the input processing and signal processing pathway within the CI sound processor for direct input versus telecoil input. Further research is warranted to replicate these findings with a larger sample size and to develop and validate a more objective approach to fitting FM systems to CI sound processors. American Academy of Audiology.
Howard, Ian S.; Messum, Piers
2014-01-01
Words are made up of speech sounds. Almost all accounts of child speech development assume that children learn the pronunciation of first language (L1) speech sounds by imitation, most claiming that the child performs some kind of auditory matching to the elements of ambient speech. However, there is evidence to support an alternative account and we investigate the non-imitative child behavior and well-attested caregiver behavior that this account posits using Elija, a computational model of an infant. Through unsupervised active learning, Elija began by discovering motor patterns, which produced sounds. In separate interaction experiments, native speakers of English, French and German then played the role of his caregiver. In their first interactions with Elija, they were allowed to respond to his sounds if they felt this was natural. We analyzed the interactions through phonemic transcriptions of the caregivers' utterances and found that they interpreted his output within the framework of their native languages. Their form of response was almost always a reformulation of Elija's utterance into well-formed sounds of L1. Elija retained those motor patterns to which a caregiver responded and formed associations between his motor pattern and the response it provoked. Thus in a second phase of interaction, he was able to parse input utterances in terms of the caregiver responses he had heard previously, and respond using his associated motor patterns. This capacity enabled the caregivers to teach Elija to pronounce some simple words in their native languages, by his serial imitation of the words' component speech sounds. Overall, our results demonstrate that the natural responses and behaviors of human subjects to infant-like vocalizations can take a computational model from a biologically plausible initial state through to word pronunciation. This provides support for an alternative to current auditory matching hypotheses for how children learn to pronounce. 
PMID:25333740
Ultrasound Images of the Tongue: A Tutorial for Assessment and Remediation of Speech Sound Errors.
Preston, Jonathan L; McAllister Byun, Tara; Boyce, Suzanne E; Hamilton, Sarah; Tiede, Mark; Phillips, Emily; Rivera-Campos, Ahmed; Whalen, Douglas H
2017-01-03
Diagnostic ultrasound imaging has been a common tool in medical practice for several decades. It provides a safe and effective method for imaging structures internal to the body. There has been a recent increase in the use of ultrasound technology to visualize the shape and movements of the tongue during speech, both in typical speakers and in clinical populations. Ultrasound imaging of speech has greatly expanded our understanding of how sounds articulated with the tongue (lingual sounds) are produced. Such information can be particularly valuable for speech-language pathologists. Among other advantages, ultrasound images can be used during speech therapy to provide (1) illustrative models of typical (i.e. "correct") tongue configurations for speech sounds, and (2) a source of insight into the articulatory nature of deviant productions. The images can also be used as an additional source of feedback for clinical populations learning to distinguish their better productions from their incorrect productions, en route to establishing more effective articulatory habits. Ultrasound feedback is increasingly used by scientists and clinicians as both the expertise of the users increases and as the expense of the equipment declines. In this tutorial, procedures are presented for collecting ultrasound images of the tongue in a clinical context. We illustrate these procedures in an extended example featuring one common error sound, American English /r/. Images of correct and distorted /r/ are used to demonstrate (1) how to interpret ultrasound images, (2) how to assess tongue shape during production of speech sounds, (3) how to categorize tongue shape errors, and (4) how to provide visual feedback to elicit a more appropriate and functional tongue shape. We present a sample protocol for using real-time ultrasound images of the tongue for visual feedback to remediate speech sound errors. Additionally, example data are shown to illustrate outcomes with the procedure.
Klatte, Maria; Lachmann, Thomas; Meis, Markus
2010-01-01
The effects of classroom noise and background speech on speech perception, measured by word-to-picture matching, and listening comprehension, measured by execution of oral instructions, were assessed in first- and third-grade children and adults in a classroom-like setting. For speech perception, in addition to noise, reverberation time (RT) was varied by conducting the experiment in two virtual classrooms with mean RT = 0.47 versus RT = 1.1 s. Children were more impaired than adults by background sounds in both speech perception and listening comprehension. Classroom noise evoked a reliable disruption in children's speech perception even under conditions of short reverberation. RT had no effect on speech perception in silence, but evoked a severe increase in the impairments due to background sounds in all age groups. For listening comprehension, impairments due to background sounds were found in the children, stronger for first- than for third-graders, whereas adults were unaffected. Compared to classroom noise, background speech had a smaller effect on speech perception, but a stronger effect on listening comprehension, remaining significant when speech perception was controlled. This indicates that background speech affects higher-order cognitive processes involved in children's comprehension. Children's ratings of the sound-induced disturbance were low overall and uncorrelated to the actual disruption, indicating that the children did not consciously realize the detrimental effects. The present results confirm earlier findings on the substantial impact of noise and reverberation on children's speech perception, and extend these to classroom-like environmental settings and listening demands closely resembling those faced by children at school.
Environmental Sound Training in Cochlear Implant Users
Sheft, Stanley; Kuvadia, Sejal; Gygi, Brian
2015-01-01
Purpose: The study investigated the effect of a short computer-based environmental sound training regimen on the perception of environmental sounds and speech in experienced cochlear implant (CI) patients. Method: Fourteen CI patients with an average of 5 years of CI experience participated. The protocol consisted of 2 pretests, 1 week apart, followed by 4 environmental sound training sessions conducted on separate days in 1 week, and concluded with 2 posttest sessions, separated by another week without training. Each testing session included an environmental sound test, which consisted of 40 familiar everyday sounds, each represented by 4 different tokens, as well as the Consonant Nucleus Consonant (CNC) word test, and Revised Speech Perception in Noise (SPIN-R) sentence test. Results: Environmental sound scores were lower than scores for either of the speech tests. Following training, there was a significant average improvement of 15.8 points in environmental sound perception, which persisted 1 week later after training was discontinued. No significant improvements were observed for either speech test. Conclusions: The findings demonstrate that environmental sound perception, which remains problematic even for experienced CI patients, can be improved with a home-based computer training regimen. Such computer-based training may thus provide an effective low-cost approach to rehabilitation for CI users, and potentially, other hearing impaired populations. PMID:25633579
NASA Astrophysics Data System (ADS)
Samardzic, Nikolina
The effectiveness of in-vehicle speech communication can be a good indicator of the perception of the overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in the literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle to the vehicle interior sound, specifically their effect on speech intelligibility was quantified, in the framework of the newly developed speech intelligibility evaluation method.
Lastly, as an example of the significance of speech intelligibility evaluation in the context of an applicable listening environment, as indicated in this research, it was found that the jury test participants required on average an approximate 3 dB increase in sound pressure level of speech material while driving and listening compared to when just listening, for an equivalent speech intelligibility performance and the same listening task.
Brainstem Transcription of Speech Is Disrupted in Children with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Russo, Nicole; Nicol, Trent; Trommer, Barbara; Zecker, Steve; Kraus, Nina
2009-01-01
Language impairment is a hallmark of autism spectrum disorders (ASD). The origin of the deficit is poorly understood although deficiencies in auditory processing have been detected in both perception and cortical encoding of speech sounds. Little is known about the processing and transcription of speech sounds at earlier (brainstem) levels or…
Hemispheric Differences in Processing Dichotic Meaningful and Non-Meaningful Words
ERIC Educational Resources Information Center
Yasin, Ifat
2007-01-01
Classic dichotic-listening paradigms reveal a right-ear advantage (REA) for speech sounds as compared to non-speech sounds. This REA is assumed to be associated with a left-hemisphere dominance for meaningful speech processing. This study objectively probed the relationship between ear advantage and hemispheric dominance in a dichotic-listening…
Evidence-Based Practice for Children with Speech Sound Disorders: Part 1 Narrative Review
ERIC Educational Resources Information Center
Baker, Elise; McLeod, Sharynne
2011-01-01
Purpose: This article provides a comprehensive narrative review of intervention studies for children with speech sound disorders (SSD). Its companion paper (Baker & McLeod, 2011) provides a tutorial and clinical example of how speech-language pathologists (SLPs) can engage in evidence-based practice (EBP) for this clinical population. Method:…
Influence of Sound Immersion and Communicative Interaction on the Lombard Effect
ERIC Educational Resources Information Center
Garnier, Maeva; Henrich, Nathalie; Dubois, Daniele
2010-01-01
Purpose: To examine the influence of sound immersion techniques and speech production tasks on speech adaptation in noise. Method: In Experiment 1, we compared the modification of speakers' perception and speech production in noise when noise is played into headphones (with and without additional self-monitoring feedback) or over loudspeakers. We…
ERIC Educational Resources Information Center
McLeod, Sharynne; Crowe, Kathryn; Masso, Sarah; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Susan; Howland, Charlotte
2017-01-01
Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. The aim of this research was to describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. 275 Australian 4-…
The Neural Substrates of Infant Speech Perception
ERIC Educational Resources Information Center
Homae, Fumitaka; Watanabe, Hama; Taga, Gentaro
2014-01-01
Infants often pay special attention to speech sounds, and they appear to detect key features of these sounds. To investigate the neural foundation of speech perception in infants, we measured cortical activation using near-infrared spectroscopy. We presented the following three types of auditory stimuli while 3-month-old infants watched a silent…
Vilela, Nadia; Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Sanches, Seisse Gabriela Gandolfi; Wertzner, Haydée Fiszbein; Carvallo, Renata Mota Mamede
2016-02-01
To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) auditory processing evaluation results. When a (central) auditory processing disorder was present in association with a speech disorder, the children tended to have lower scores on phonological assessments. A greater severity of speech disorder was related to a greater probability of the child having a (central) auditory processing disorder. The use of a cutoff value for the Percentage of Consonants Correct-Revised index successfully distinguished between children with and without a (central) auditory processing disorder. The severity of speech-sound disorder in children was influenced by the presence of (central) auditory processing disorder. The attempt to identify a cutoff value based on a severity index was successful.
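As a rough illustration of the severity index discussed above, the sketch below computes a Percentage of Consonants Correct-Revised score from per-consonant judgments. It assumes the usual convention that PCC-R counts distortions as correct, so only substitutions and omissions count as errors. The label strings and function name are hypothetical, and no cutoff value is included because the abstract does not report one.

```python
def pcc_r(scored_consonants):
    """Percentage of consonants scored as acceptable.

    scored_consonants: list of labels, one per target consonant, drawn from
    'correct', 'distortion', 'substitution', 'omission' (hypothetical labels).
    Distortions are treated as correct, following the assumed PCC-R convention.
    """
    total = len(scored_consonants)
    if total == 0:
        raise ValueError("at least one scored consonant is required")
    acceptable = sum(1 for s in scored_consonants if s in ("correct", "distortion"))
    return 100.0 * acceptable / total
```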
Nonlinear frequency compression: effects on sound quality ratings of speech and music.
Parsa, Vijay; Scollie, Susan; Glista, Danielle; Seelisch, Andreas
2013-03-01
Frequency lowering technologies offer an alternative amplification solution for severe to profound high frequency hearing losses. While frequency lowering technologies may improve audibility of high frequency sounds, the very nature of this processing can affect the perceived sound quality. This article reports the results from two studies that investigated the impact of a nonlinear frequency compression (NFC) algorithm on perceived sound quality. In the first study, the cutoff frequency and compression ratio parameters of the NFC algorithm were varied, and their effect on the speech quality was measured subjectively with 12 normal hearing adults, 12 normal hearing children, 13 hearing impaired adults, and 9 hearing impaired children. In the second study, 12 normal hearing and 8 hearing impaired adult listeners rated the quality of speech in quiet, speech in noise, and music after processing with a different set of NFC parameters. Results showed that the cutoff frequency parameter had more impact on sound quality ratings than the compression ratio, and that the hearing impaired adults were more tolerant to increased frequency compression than normal hearing adults. No statistically significant differences were found in the sound quality ratings of speech-in-noise and music stimuli processed through various NFC settings by hearing impaired listeners. These findings suggest that there may be an acceptable range of NFC settings for hearing impaired individuals where sound quality is not adversely affected. These results may assist an Audiologist in clinical NFC hearing aid fittings for achieving a balance between high frequency audibility and sound quality.
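To make the two parameters named above concrete, here is a sketch of one commonly described form of nonlinear frequency compression. Input frequencies at or below the cutoff pass unchanged, and frequencies above it are compressed on a logarithmic scale by the compression ratio. This mapping is an assumption for illustration. The abstract does not state the exact formula of the algorithm studied.

```python
def nfc_map(f_in_hz, cutoff_hz, compression_ratio):
    """Map an input frequency to its output frequency under an assumed
    log-domain compression above the cutoff."""
    if f_in_hz <= cutoff_hz:
        return float(f_in_hz)          # region below the cutoff is untouched
    p = 1.0 / compression_ratio
    # log(f_out / cutoff) = p * log(f_in / cutoff)
    return cutoff_hz ** (1.0 - p) * f_in_hz ** p
```

Under this assumed mapping, lowering the cutoff brings more of the spectrum into the compressed region.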
Echoes of the spoken past: how auditory cortex hears context during speech perception
Skipper, Jeremy I.
2014-01-01
What do we hear when someone speaks and what does auditory cortex (AC) do with that sound? Given how meaningful speech is, it might be hypothesized that AC is most active when other people talk so that their productions get decoded. Here, neuroimaging meta-analyses show the opposite: AC is least active and sometimes deactivated when participants listened to meaningful speech compared to less meaningful sounds. Results are explained by an active hypothesis-and-test mechanism where speech production (SP) regions are neurally re-used to predict auditory objects associated with available context. By this model, more AC activity for less meaningful sounds occurs because predictions are less successful from context, requiring further hypotheses be tested. This also explains the large overlap of AC co-activity for less meaningful sounds with meta-analyses of SP. An experiment showed a similar pattern of results for non-verbal context. Specifically, words produced less activity in AC and SP regions when preceded by co-speech gestures that visually described those words compared to those words without gestures. Results collectively suggest that what we ‘hear’ during real-world speech perception may come more from the brain than our ears and that the function of AC is to confirm or deny internal predictions about the identity of sounds. PMID:25092665
Influences on infant speech processing: toward a new synthesis.
Werker, J F; Tees, R C
1999-01-01
To comprehend and produce language, we must be able to recognize the sound patterns of our language and the rules for how these sounds "map on" to meaning. Human infants are born with a remarkable array of perceptual sensitivities that allow them to detect the basic properties that are common to the world's languages. During the first year of life, these sensitivities undergo modification reflecting an exquisite tuning to just that phonological information that is needed to map sound to meaning in the native language. We review this transition from language-general to language-specific perceptual sensitivity that occurs during the first year of life and consider whether the changes propel the child into word learning. To account for the broad-based initial sensitivities and subsequent reorganizations, we offer an integrated transactional framework based on the notion of a specialized perceptual-motor system that has evolved to serve human speech, but which functions in concert with other developing abilities. In so doing, we highlight the links between infant speech perception, babbling, and word learning.
Ansari, M S; Rangasayee, R; Ansari, M A H
2017-03-01
Poor auditory speech perception in geriatrics is attributable to neural de-synchronisation due to structural and degenerative changes of ageing auditory pathways. The speech-evoked auditory brainstem response may be useful for detecting alterations that cause loss of speech discrimination. Therefore, this study aimed to compare the speech-evoked auditory brainstem response in adult and geriatric populations with normal hearing. The auditory brainstem responses to click sounds and to a 40 ms speech sound (the Hindi phoneme |da|) were compared in 25 young adults and 25 geriatric people with normal hearing. The latencies and amplitudes of transient peaks representing neural responses to the onset, offset and sustained portions of the speech stimulus in quiet and noisy conditions were recorded. The older group had significantly smaller amplitudes and longer latencies for the onset and offset responses to |da| in noisy conditions. Stimulus-to-response times were longer and the spectral amplitude of the sustained portion of the stimulus was reduced. The overall stimulus level caused significant shifts in latency across the entire speech-evoked auditory brainstem response in the older group. The reduction in neural speech processing in older adults suggests diminished subcortical responsiveness to acoustically dynamic spectral cues. However, further investigations are needed to encode temporal cues at the brainstem level and determine their relationship to speech perception for developing a routine tool for clinical decision-making.
A Generative Model of Speech Production in Broca’s and Wernicke’s Areas
Price, Cathy J.; Crinion, Jenny T.; MacSweeney, Mairéad
2011-01-01
Speech production involves the generation of an auditory signal from the articulators and vocal tract. When the intended auditory signal does not match the produced sounds, subsequent articulatory commands can be adjusted to reduce the difference between the intended and produced sounds. This requires an internal model of the intended speech output that can be compared to the produced speech. The aim of this functional imaging study was to identify brain activation related to the internal model of speech production after activation related to vocalization, auditory feedback, and movement in the articulators had been controlled. There were four conditions: silent articulation of speech, non-speech mouth movements, finger tapping, and visual fixation. In the speech conditions, participants produced the mouth movements associated with the words “one” and “three.” We eliminated auditory feedback from the spoken output by instructing participants to articulate these words without producing any sound. The non-speech mouth movement conditions involved lip pursing and tongue protrusions to control for movement in the articulators. The main difference between our speech and non-speech mouth movement conditions is that prior experience producing speech sounds leads to the automatic and covert generation of auditory and phonological associations that may play a role in predicting auditory feedback. We found that, relative to non-speech mouth movements, silent speech activated Broca’s area in the left dorsal pars opercularis and Wernicke’s area in the left posterior superior temporal sulcus. We discuss these results in the context of a generative model of speech production and propose that Broca’s and Wernicke’s areas may be involved in predicting the speech output that follows articulation. These predictions could provide a mechanism by which rapid movement of the articulators is precisely matched to the intended speech outputs during future articulations. PMID:21954392
Using listening difficulty ratings of conditions for speech communication in rooms
NASA Astrophysics Data System (ADS)
Sato, Hiroshi; Bradley, John S.; Morimoto, Masayuki
2005-03-01
The use of listening difficulty ratings of speech communication in rooms is explored because, in common situations, word recognition scores do not discriminate well among conditions that are near to acceptable. In particular, the benefits of early reflections of speech sounds on listening difficulty were investigated and compared to the known benefits to word intelligibility scores. Listening tests were used to assess word intelligibility and perceived listening difficulty of speech in simulated sound fields. The experiments were conducted in three types of sound fields with constant levels of ambient noise: only direct sound, direct sound with early reflections, and direct sound with early reflections and reverberation. The results demonstrate that (1) listening difficulty can better discriminate among these conditions than can word recognition scores; (2) added early reflections increase the effective signal-to-noise ratio equivalent to the added energy in the conditions without reverberation; (3) the benefit of early reflections on difficulty scores is greater than expected from the simple increase in early arriving speech energy with reverberation; (4) word intelligibility tests are most appropriate for conditions with signal-to-noise (S/N) ratios less than 0 dBA, and where S/N is between 0 and 15 dBA, listening difficulty is a more appropriate evaluation tool.
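Result (2) above says that early reflections raise the effective signal-to-noise ratio by an amount equivalent to the added energy. A minimal sketch of that arithmetic, assuming direct and early-reflection energies simply add and the noise level stays constant (the function name is hypothetical):

```python
import math

def effective_snr_gain_db(direct_energy, early_energy):
    """Increase in effective S/N (dB) when early-reflection energy is added
    to the direct sound, with the noise level held constant."""
    return 10.0 * math.log10((direct_energy + early_energy) / direct_energy)
```

For example, early reflections carrying as much energy as the direct sound would correspond to a gain of about 3 dB.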
Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech.
Švec, Jan G; Granqvist, Svante
2018-03-15
Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to improve their accuracy and reproducibility. Basic information is put together from standards, technical, voice and speech literature, and practical experience of the authors and is explained for nontechnical readers. Variation of SPL with distance, sound level meters and their accuracy, frequency and time weightings, and background noise topics are reviewed. Several calibration procedures for SPL measurements are described for stand-mounted and head-mounted microphones. SPL of voice and speech should be reported together with the mouth-to-microphone distance so that the levels can be related to vocal power. Sound level measurement settings (i.e., frequency weighting and time weighting/averaging) should always be specified. Classified sound level meters should be used to assure measurement accuracy. Head-mounted microphones placed at the proximity of the mouth improve signal-to-noise ratio and can be taken advantage of for voice SPL measurements when calibrated. Background noise levels should be reported besides the sound levels of voice and speech.
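The recommendation above to report the mouth-to-microphone distance can be motivated with a short sketch. It assumes an idealized point source in a free field, where SPL falls by 20·log10 of the distance ratio (about 6 dB per doubling); this idealization is an assumption made here, not a procedure quoted from the tutorial, and it becomes less accurate very close to the mouth or in reverberant rooms. The function name and readings are hypothetical.

```python
import math

def spl_at_distance(spl_ref_db, d_ref_m, d_new_m):
    """Re-express an SPL measured at d_ref_m as the level expected at d_new_m,
    assuming free-field, point-source (inverse-distance) propagation."""
    if d_ref_m <= 0 or d_new_m <= 0:
        raise ValueError("distances must be positive")
    return spl_ref_db - 20.0 * math.log10(d_new_m / d_ref_m)

# Hypothetical reading: 80 dB at 0.30 m from the mouth.
print(round(spl_at_distance(80.0, 0.30, 0.60), 2))  # 73.98
print(round(spl_at_distance(80.0, 0.30, 0.15), 2))  # 86.02
```

The roughly 12 dB spread between the two microphone positions shows why levels reported without a distance are hard to compare across studies.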
Ultrasound biofeedback treatment for persisting childhood apraxia of speech.
Preston, Jonathan L; Brick, Nickole; Landi, Nicole
2013-11-01
The purpose of this study was to evaluate the efficacy of a treatment program that includes ultrasound biofeedback for children with persisting speech sound errors associated with childhood apraxia of speech (CAS). Six children ages 9-15 years participated in a multiple baseline experiment for 18 treatment sessions during which treatment focused on producing sequences involving lingual sounds. Children were cued to modify their tongue movements using visual feedback from real-time ultrasound images. Probe data were collected before, during, and after treatment to assess word-level accuracy for treated and untreated sound sequences. As participants reached preestablished performance criteria, new sequences were introduced into treatment. All participants met the performance criterion (80% accuracy for 2 consecutive sessions) on at least 2 treated sound sequences. Across the 6 participants, performance criterion was met for 23 of 31 treated sequences in an average of 5 sessions. Some participants showed no improvement in untreated sequences, whereas others showed generalization to untreated sequences that were phonetically similar to the treated sequences. Most gains were maintained 2 months after the end of treatment. The percentage of phonemes correct increased significantly from pretreatment to the 2-month follow-up. A treatment program including ultrasound biofeedback is a viable option for improving speech sound accuracy in children with persisting speech sound errors associated with CAS.
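The performance criterion described above (80% accuracy for 2 consecutive sessions) can be written as a simple check. This is an illustrative sketch only: the function name and the probe values are hypothetical, and treating exactly 80% as meeting the criterion is an assumption.

```python
def sessions_to_criterion(accuracies, threshold=0.80, run_length=2):
    """Return the 1-based session at which accuracy has been at or above
    `threshold` for `run_length` consecutive sessions, or None if never."""
    streak = 0
    for session, accuracy in enumerate(accuracies, start=1):
        # Extend the streak on a passing session, reset it otherwise.
        streak = streak + 1 if accuracy >= threshold else 0
        if streak >= run_length:
            return session
    return None

# Hypothetical per-session accuracies for one treated sound sequence.
probe = [0.40, 0.65, 0.85, 0.70, 0.80, 0.90]
print(sessions_to_criterion(probe))  # 6
```

In this made-up series the single passing session 3 does not count, because session 4 falls below the threshold and resets the run.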
NASA Astrophysics Data System (ADS)
Mapp, Peter
2002-11-01
Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported indicating that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported where errors of up to 50% were found. Implications for VA and PA system performance verification will be discussed.
Dunlop, William A.; Enticott, Peter G.; Rajan, Ramesh
2016-01-01
Autism Spectrum Disorder (ASD), characterized by impaired communication skills and repetitive behaviors, can also result in differences in sensory perception. Individuals with ASD often perform normally in simple auditory tasks but poorly compared to typically developed (TD) individuals on complex auditory tasks like discriminating speech from complex background noise. A common trait of individuals with ASD is hypersensitivity to auditory stimulation. No studies to our knowledge consider whether hypersensitivity to sounds is related to differences in speech-in-noise discrimination. We provide novel evidence that individuals with high-functioning ASD show poor performance compared to TD individuals in a speech-in-noise discrimination task with an attentionally demanding background noise, but not in a purely energetic noise. Further, we demonstrate in our small sample that speech-hypersensitivity does not appear to predict performance in the speech-in-noise task. The findings support the argument that an attentional deficit, rather than a perceptual deficit, affects the ability of individuals with ASD to discriminate speech from background noise. Finally, we piloted a novel questionnaire that measures difficulty hearing in noisy environments, and sensitivity to non-verbal and verbal sounds. Psychometric analysis using 128 TD participants provided novel evidence for a difference in sensitivity to non-verbal and verbal sounds, and these findings were reinforced by participants with ASD who also completed the questionnaire. The study was limited by a small and high-functioning sample of participants with ASD. Future work could test larger sample sizes and include lower-functioning ASD participants. PMID:27555814
The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder
Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.
2010-01-01
In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615
Speech sound discrimination training improves auditory cortex responses in a rat model of autism
Engineer, Crystal T.; Centanni, Tracy M.; Im, Kwok W.; Kilgard, Michael P.
2014-01-01
Children with autism often have language impairments and degraded cortical responses to speech. Extensive behavioral interventions can improve language outcomes and cortical responses. Prenatal exposure to the antiepileptic drug valproic acid (VPA) increases the risk for autism and language impairment. Prenatal exposure to VPA also causes weaker and delayed auditory cortex responses in rats. In this study, we document speech sound discrimination ability in VPA exposed rats and document the effect of extensive speech training on auditory cortex responses. VPA exposed rats were significantly impaired at consonant, but not vowel, discrimination. Extensive speech training resulted in both stronger and faster anterior auditory field (AAF) responses compared to untrained VPA exposed rats, and restored responses to control levels. This neural response improvement generalized to non-trained sounds. The rodent VPA model of autism may be used to improve the understanding of speech processing in autism and contribute to improving language outcomes. PMID:25140133
Giraud, Anne Lise; Truy, Eric
2002-01-01
Early visual cortex can be recruited by meaningful sounds in the absence of visual information. This occurs in particular in cochlear implant (CI) patients whose dependency on visual cues in speech comprehension is increased. Such cross-modal interaction mirrors the response of early auditory cortex to mouth movements (speech reading) and may reflect the natural expectancy of the visual counterpart of sounds, lip movements. Here we pursue the hypothesis that visual activations occur specifically in response to meaningful sounds. We performed PET in both CI patients and controls, while subjects listened either to their native language or to a completely unknown language. A recruitment of early visual cortex, the left posterior inferior temporal gyrus (ITG) and the left superior parietal cortex was observed in both groups. While no further activation occurred in the group of normal-hearing subjects, CI patients additionally recruited the right perirhinal/fusiform and mid-fusiform, the right temporo-occipito-parietal (TOP) junction and the left inferior prefrontal cortex (LIPF, Broca's area). This study confirms a participation of visual cortical areas in semantic processing of speech sounds. Observation of early visual activation in normal-hearing subjects shows that auditory-to-visual cross-modal effects can also be recruited under natural hearing conditions. In cochlear implant patients, speech activates the mid-fusiform gyrus in the vicinity of the so-called face area. This suggests that specific cross-modal interaction involving advanced stages in the visual processing hierarchy develops after cochlear implantation and may be the correlate of increased usage of lip-reading.
Kello, Christopher T; Bella, Simone Dalla; Médé, Butovens; Balasubramaniam, Ramesh
2017-10-01
Humans talk, sing and play music. Some species of birds and whales sing long and complex songs. All these behaviours and sounds exhibit hierarchical structure: syllables and notes are positioned within words and musical phrases, words and motives in sentences and musical phrases, and so on. We developed a new method to measure and compare hierarchical temporal structures in speech, song and music. The method identifies temporal events as peaks in the sound amplitude envelope, and quantifies event clustering across a range of timescales using Allan factor (AF) variance. AF variances were analysed and compared for over 200 different recordings from more than 16 different categories of signals, including recordings of speech in different contexts and languages, musical compositions and performances from different genres. Non-human vocalizations from two bird species and two types of marine mammals were also analysed for comparison. The resulting patterns of AF variance across timescales were distinct to each of four natural categories of complex sound: speech, popular music, classical music and complex animal vocalizations. Comparisons within and across categories indicated that nested clustering in longer timescales was more prominent when prosodic variation was greater, and when sounds came from interactions among individuals, including interactions between speakers, musicians, and even killer whales. Nested clustering also was more prominent for music compared with speech, and reflected beat structure for popular music and self-similarity across timescales for classical music. In summary, hierarchical temporal structures reflect the behavioural and social processes underlying complex vocalizations and musical performances. © 2017 The Author(s).
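The Allan factor mentioned above can be sketched for a list of event times. The code uses the standard textbook definition, AF(T) = mean((N[k+1] - N[k])^2) / (2 · mean(N)), where N[k] is the number of events in the k-th adjacent window of length T. It is not the authors' implementation: the envelope peak-picking step is omitted, and the function name and event times are hypothetical.

```python
def allan_factor(event_times, window, duration):
    """Allan factor for one window length `window` (same units as the times).
    Events are counted in adjacent, non-overlapping windows covering
    [0, duration); a trailing partial window is ignored."""
    n_windows = int(duration // window)
    if n_windows < 2:
        raise ValueError("need at least two full windows")
    counts = [0] * n_windows
    for t in event_times:
        k = int(t // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the analysed windows")
    # Squared differences between counts in neighbouring windows.
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)

# Hypothetical event times (seconds): two tight clusters separated by silence.
events = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4]
for window in (0.5, 1.0, 2.0):
    print(window, allan_factor(events, window, duration=4.0))
```

Evenly spaced events give a value near zero, while clustered events give larger values at the timescales where the clustering occurs, which is why the measure is evaluated over a range of window lengths.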
Human emotions track changes in the acoustic environment
Ma, Weiyi; Thompson, William Forde
2015-01-01
Emotional responses to biologically significant events are essential for human survival. Do human emotions lawfully track changes in the acoustic environment? Here we report that changes in acoustic attributes that are well known to interact with human emotions in speech and music also trigger systematic emotional responses when they occur in environmental sounds, including sounds of human actions, animal calls, machinery, or natural phenomena, such as wind and rain. Three changes in acoustic attributes known to signal emotional states in speech and music were imposed upon 24 environmental sounds. Evaluations of stimuli indicated that human emotions track such changes in environmental sounds just as they do for speech and music. Such changes not only influenced evaluations of the sounds themselves, they also affected the way accompanying facial expressions were interpreted emotionally. The findings illustrate that human emotions are highly attuned to changes in the acoustic environment, and reignite a discussion of Charles Darwin’s hypothesis that speech and music originated from a common emotional signal system based on the imitation and modification of environmental sounds. PMID:26553987
Effects of irrelevant sounds on phonological coding in reading comprehension and short-term memory.
Boyle, R; Coltheart, V
1996-05-01
The effects of irrelevant sounds on reading comprehension and short-term memory were studied in two experiments. In Experiment 1, adults judged the acceptability of written sentences during irrelevant speech, accompanied and unaccompanied singing, instrumental music, and in silence. Sentences varied in syntactic complexity: Simple sentences contained a right-branching relative clause (The applause pleased the woman that gave the speech) and syntactically complex sentences included a centre-embedded relative clause (The hay that the farmer stored fed the hungry animals). Unacceptable sentences either sounded acceptable (The dog chased the cat that eight up all his food) or did not (The man praised the child that sight up his spinach). Decision accuracy was impaired by syntactic complexity but not by irrelevant sounds. Phonological coding was indicated by increased errors on unacceptable sentences that sounded correct. These errors rates were unaffected by irrelevant sounds. Experiment 2 examined effects of irrelevant sounds on ordered recall of phonologically similar and dissimilar word lists. Phonological similarity impaired recall. Irrelevant speech reduced recall but did not interact with phonological similarity. The results of these experiments question assumptions about the relationship between speech input and phonological coding in reading and the short-term store.
McKinnon, David H; McLeod, Sharynne; Reilly, Sheena
2007-01-01
The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to schoolchildren with speech disorders. Students with speech disorders were identified from 10,425 students in Australia using a 4-stage process: training in the data collection process, teacher identification, confirmation by a speech-language pathologist, and consultation with district special needs advisors. The prevalence of students with speech disorders was estimated; specifically, 0.33% of students were identified as stuttering, 0.12% as having a voice disorder, and 1.06% as having a speech-sound disorder. There was a higher prevalence of speech disorders in males than in females. As grade level increased, the prevalence of speech disorders decreased. There was no significant difference in the pattern of prevalence across the three speech disorders and four socioeconomic groups; however, students who were identified with a speech disorder were more likely to be in the higher socioeconomic groups. Finally, there was a difference between the perceived and actual level of support that was provided to these students. These prevalence figures are lower than those using initial identification by speech-language pathologists and similar to those using parent report.
The Measurement of the Oral and Nasal Sound Pressure Levels of Speech
ERIC Educational Resources Information Center
Clarke, Wayne M.
1975-01-01
A nasal separator was used to measure the oral and nasal components in the speech of a normal adult Australian population. Results indicated no difference in oral and nasal sound pressure levels for read versus spontaneous speech samples; however, females tended to have a higher nasal component than did males. (Author/TL)
ERIC Educational Resources Information Center
Marks, William J.; Jones, W. Paul; Loe, Scott A.
2013-01-01
This study investigated the use of compressed speech as a modality for assessment of the simultaneous processing function for participants with visual impairment. A 24-item compressed speech test was created using a sound editing program to randomly remove sound elements from aural stimuli, holding pitch constant, with the objective to emulate the…
ERIC Educational Resources Information Center
Overby, Megan S.; Masterson, Julie J.; Preston, Jonathan L.
2015-01-01
Purpose: This archival investigation examined the relationship between preliteracy speech sound production skill (SSPS) and spelling in Grade 3 using a dataset in which children's receptive vocabulary was generally within normal limits, speech therapy was not provided until Grade 2, and phonological awareness instruction was discouraged at the…
ERIC Educational Resources Information Center
Watts Pappas, Nicole; McAllister, Lindy; McLeod, Sharynne
2016-01-01
Parental beliefs and experiences regarding involvement in speech intervention for their child with mild to moderate speech sound disorder (SSD) were explored using multiple, sequential interviews conducted during a course of treatment. Twenty-one interviews were conducted with seven parents of six children with SSD: (1) after their child's initial…
ERIC Educational Resources Information Center
Whitehouse, Andrew J. O.; Bishop, Dorothy V. M.
2008-01-01
Autism is a disorder characterized by a core impairment in social behaviour. A prominent component of this social deficit is poor orienting to speech. It is unclear whether this deficit involves an impairment in allocating attention to speech sounds, or a sensory impairment in processing phonetic information. In this study, event-related…
ERIC Educational Resources Information Center
Peter, Beate
2012-01-01
This study tested the hypothesis that children with speech sound disorder have generalized slowed motor speeds. It evaluated associations among oral and hand motor speeds and measures of speech (articulation and phonology) and language (receptive vocabulary, sentence comprehension, sentence imitation), in 11 children with moderate to severe SSD…
Calibration of Clinical Audio Recording and Analysis Systems for Sound Intensity Measurement.
Maryn, Youri; Zarowski, Andrzej
2015-11-01
Sound intensity is an important acoustic feature of voice/speech signals. Yet recordings are performed with different microphone, amplifier, and computer configurations, and it is therefore crucial to calibrate sound intensity measures of clinical audio recording and analysis systems on the basis of output of a sound-level meter. This study was designed to evaluate feasibility, validity, and accuracy of calibration methods, including audiometric speech noise signals and human voice signals under typical speech conditions. Calibration consisted of 3 comparisons between data from 29 measurement microphone-and-computer systems and data from the sound-level meter: signal-specific comparison with audiometric speech noise at 5 levels, signal-specific comparison with natural voice at 3 levels, and cross-signal comparison with natural voice at 3 levels. Intensity measures from recording systems were then linearly converted into calibrated data on the basis of these comparisons, and validity and accuracy of calibrated sound intensity were investigated. Very strong correlations and quasisimilarity were found between calibrated data and sound-level meter data across calibration methods and recording systems. Calibration of clinical sound intensity measures according to this method is feasible, valid, accurate, and representative for a heterogeneous set of microphones and data acquisition systems in real-life circumstances with distinct noise contexts.
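The linear conversion described above can be sketched as a straight-line fit between uncalibrated recording-system levels and sound-level meter readings. Ordinary least squares is assumed here for illustration; the abstract only states that the conversion was linear. The function names and all numbers are hypothetical.

```python
def fit_linear_calibration(system_db, meter_db):
    """Least-squares line meter_db ~ slope * system_db + intercept."""
    n = len(system_db)
    if n != len(meter_db) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(system_db) / n
    mean_y = sum(meter_db) / n
    sxx = sum((x - mean_x) ** 2 for x in system_db)
    if sxx == 0:
        raise ValueError("system levels must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(system_db, meter_db))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(level_db, slope, intercept):
    """Convert one uncalibrated level into a calibrated level."""
    return slope * level_db + intercept

# Hypothetical paired readings at five presentation levels.
system = [40.0, 50.0, 60.0, 70.0, 80.0]   # recording system, arbitrary dB
meter = [65.0, 75.0, 85.0, 95.0, 105.0]   # sound-level meter, dB SPL
slope, intercept = fit_linear_calibration(system, meter)
print(slope, intercept)                    # 1.0 25.0
print(calibrate(55.0, slope, intercept))   # 80.0
```

Each microphone-and-computer configuration would need its own fitted pair of coefficients, since the offset depends on the gain of the whole recording chain.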
Tumanova, Victoria; Zebrowski, Patricia M; Throneburg, Rebecca N; Kulak Kayikci, Mavis E
2011-01-01
The purpose of this study was to examine the relationship between articulation rate, frequency and duration of disfluencies of different types, and temperament in preschool children who stutter (CWS). In spontaneous speech samples from 19 CWS (mean age=3:9; years:months), we measured articulation rate, the frequency and duration of (a) sound prolongations; (b) sound-syllable repetitions; (c) single syllable whole word repetitions; and (d) clusters. Temperament was assessed with the Children's Behavior Questionnaire (Rothbart et al., 2001). There was a significant negative correlation between articulation rate and average duration of sound prolongations (p<0.01), and between articulation rate and frequency of stuttering-like disfluencies (SLDs) (p<0.05). No other relationships proved statistically significant. Results do not support models of stuttering development that implicate particular characteristics of temperament as proximal contributors to stuttering; however, this is likely due to the fact that current methods, including the ones used in the present study, do not allow for the identification of a functional relationship between temperament and speech production. Findings do indicate that for some CWS, relatively longer sound prolongations co-occur with relatively slower speech rate, which suggests that sound prolongations, across a range of durations, may represent a distinct type of SLD, not just in their obvious perceptual characteristics, but in their potential influence on overall speech production at multiple levels. Readers will be able to describe the relationship between stuttering-like disfluencies, articulation rate and temperament in children who stutter, and discuss different measurements of articulation rate. Copyright © 2010 Elsevier Inc. All rights reserved.
Xie, Zilong; Reetzke, Rachel; Chandrasekaran, Bharath
2018-05-24
Increasing visual perceptual load can reduce pre-attentive auditory cortical activity to sounds, a reflection of the limited and shared attentional resources for sensory processing across modalities. Here, we demonstrate that modulating visual perceptual load can impact the early sensory encoding of speech sounds, and that the impact of visual load is highly dependent on the predictability of the incoming speech stream. Participants (n = 20, 9 females) performed a visual search task of high (target similar to distractors) and low (target dissimilar to distractors) perceptual load, while early auditory electrophysiological responses were recorded to native speech sounds. Speech sounds were presented either in a 'repetitive context', or a less predictable 'variable context'. Independent of auditory stimulus context, pre-attentive auditory cortical activity was reduced during high visual load, relative to low visual load. We applied a data-driven machine learning approach to decode speech sounds from the early auditory electrophysiological responses. Decoding performance was found to be poorer under conditions of high (relative to low) visual load, when the incoming acoustic stream was predictable. When the auditory stimulus context was less predictable, decoding performance was substantially greater for the high (relative to low) visual load conditions. Our results provide support for shared attentional resources between visual and auditory modalities that substantially influence the early sensory encoding of speech signals in a context-dependent manner. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.
Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin
2018-02-21
In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
MMSE Estimator for Children’s Speech with Car and Weather Noise
NASA Astrophysics Data System (ADS)
Sayuthi, V.
2018-04-01
Previous research has noted that most people rely on vehicles as a means of travel, now and in the future, and use them for various purposes. A vehicle is not only a means of travel: passengers can also enjoy entertainment or do work in it. In this study, we examine the speech of a girl in a vehicle that is affected by noise from the car and from the surrounding weather, in this case rain. Vehicle sounds may come from the car engine or the car air conditioner. The minimum mean square error (MMSE) estimator is used to recover the child’s clean speech; in the simulation, the signals are represented as random processes characterized by the autocorrelation of both the child’s voice and the disturbing noise. This MMSE estimator can be regarded as a Wiener filter, since the clean sound is reconstructed from the noisy signal. We expect that the results of this study can serve as a basis for the development of entertainment or communication technology for vehicle passengers in the future, particularly technology using MMSE estimators.
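The link between the MMSE estimator and the Wiener filter mentioned above can be illustrated in its simplest form. The sketch below assumes a zero-mean signal and zero-mean noise that are uncorrelated and white, so the linear MMSE estimate reduces to scaling each noisy sample by signal power / (signal power + noise power). The study itself works with autocorrelation functions; this scalar case, the function names and the numbers are simplifying assumptions made here.

```python
def wiener_gain(signal_power, noise_power):
    """Scalar Wiener (linear MMSE) gain for y = s + n with uncorrelated,
    zero-mean s and n: gain = P_s / (P_s + P_n)."""
    if signal_power < 0 or noise_power < 0:
        raise ValueError("powers must be non-negative")
    total = signal_power + noise_power
    if total == 0:
        raise ValueError("signal and noise power cannot both be zero")
    return signal_power / total

def mmse_estimate(observations, signal_power, noise_power):
    """Apply the scalar gain to every noisy sample."""
    gain = wiener_gain(signal_power, noise_power)
    return [gain * y for y in observations]

# Hypothetical powers: speech three times stronger than the noise.
print(wiener_gain(3.0, 1.0))                    # 0.75
print(mmse_estimate([2.0, -4.0], 1.0, 1.0))     # [1.0, -2.0]
```

The gain approaches 1 when the speech dominates and 0 when the noise dominates, which is the behaviour a frequency-dependent Wiener filter applies band by band.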
Perceptual Learning of Speech under Optimal and Adverse Conditions
Zhang, Xujin; Samuel, Arthur G.
2014-01-01
Humans have a remarkable ability to understand spoken language despite the large amount of variability in speech. Previous research has shown that listeners can use lexical information to guide their interpretation of atypical sounds in speech (Norris, McQueen, & Cutler, 2003). This kind of lexically induced perceptual learning enables people to adjust to the variations in utterances due to talker-specific characteristics, such as individual identity and dialect. The current study investigated perceptual learning in two optimal conditions: conversational speech (Experiment 1) vs. clear speech (Experiment 2), and three adverse conditions: noise (Experiment 3a) vs. two cognitive loads (Experiments 4a & 4b). Perceptual learning occurred in the two optimal conditions and in the two cognitive load conditions, but not in the noise condition. Furthermore, perceptual learning occurred only in the first of two sessions for each participant, and only for atypical /s/ sounds and not for atypical /f/ sounds. This pattern of learning and non-learning reflects a balance between flexibility and stability that the speech system must have to deal with speech variability in the diverse conditions that speech is encountered. PMID:23815478
Benders, Titia
2013-12-01
Exaggeration of the vowel space in infant-directed speech (IDS) is well documented for English, but not consistently replicated in other languages or for other speech-sound contrasts. A second attested, but less discussed, pattern of change in IDS is an overall rise of the formant frequencies, which may reflect an affective speaking style. The present study investigates longitudinally how Dutch mothers change their corner vowels, voiceless fricatives, and pitch when speaking to their infant at 11 and 15 months of age. In comparison to adult-directed speech (ADS), Dutch IDS has a smaller vowel space, higher second and third formant frequencies in the vowels, and a higher spectral frequency in the fricatives. The formants of the vowels and spectral frequency of the fricatives are raised more strongly for infants at 11 than at 15 months, while the pitch is more extreme in IDS to 15-month olds. These results show that enhanced positive affect is the main factor influencing Dutch mothers' realisation of speech sounds in IDS, especially to younger infants. This study provides evidence that mothers' expression of emotion in IDS can influence the realisation of speech sounds, and that the loss or gain of speech clarity may be secondary effects of affect. Copyright © 2013 Elsevier Inc. All rights reserved.
Speech intelligibility in complex acoustic environments in young children
NASA Astrophysics Data System (ADS)
Litovsky, Ruth
2003-04-01
While the auditory system undergoes tremendous maturation during the first few years of life, it has become clear that in complex scenarios when multiple sounds occur and when echoes are present, children's performance is significantly worse than that of their adult counterparts. The ability of children (3-7 years of age) to understand speech in a simulated multi-talker environment and to benefit from spatial separation of the target and competing sounds was investigated. In these studies, competing sources vary in number, location, and content (speech, modulated or unmodulated speech-shaped noise and time-reversed speech). The acoustic spaces were also varied in size and amount of reverberation. Finally, children with chronic otitis media who received binaural training were tested pre- and post-training on a subset of conditions. Results indicated the following. (1) Children experienced significantly more masking than adults, even in the simplest conditions tested. (2) When the target and competing sounds were spatially separated, speech intelligibility improved, but the amount varied with age, type of competing sound, and number of competitors. (3) In a large reverberant classroom there was no benefit of spatial separation. (4) Binaural training improved speech intelligibility performance in children with otitis media. Future work includes similar studies in children with unilateral and bilateral cochlear implants. [Work supported by NIDCD, DRF, and NOHR.]
A Selective Deficit in Phonetic Recalibration by Text in Developmental Dyslexia.
Keetels, Mirjam; Bonte, Milene; Vroomen, Jean
2018-01-01
Upon hearing an ambiguous speech sound, listeners may adjust their perceptual interpretation of the speech input in accordance with contextual information, like accompanying text or lipread speech (i.e., phonetic recalibration; Bertelson et al., 2003). As developmental dyslexia (DD) has been associated with reduced integration of text and speech sounds, we investigated whether this deficit becomes manifest when text is used to induce this type of audiovisual learning. Adults with DD and normal readers were exposed to ambiguous consonants halfway between /aba/ and /ada/ together with text or lipread speech. After this audiovisual exposure phase, they categorized auditory-only ambiguous test sounds. Results showed that individuals with DD, unlike normal readers, did not use text to recalibrate their phoneme categories, whereas their recalibration by lipread speech was spared. Individuals with DD demonstrated similar deficits when ambiguous vowels (halfway between /wIt/ and /wet/) were recalibrated by text. These findings indicate that DD is related to a specific letter-speech sound association deficit that extends over phoneme classes (vowels and consonants), but - as lipreading was spared - does not extend to a more general audio-visual integration deficit. In particular, these results highlight diminished reading-related audiovisual learning in addition to the commonly reported phonological problems in developmental dyslexia.
A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields.
Carlin, Michael A; Elhilali, Mounya
2015-12-01
One of the hallmarks of sound processing in the brain is the ability of the nervous system to adapt to changing behavioral demands and surrounding soundscapes. It can dynamically shift sensory and cognitive resources to focus on relevant sounds. Neurophysiological studies indicate that this ability is supported by adaptively retuning the shapes of cortical spectro-temporal receptive fields (STRFs) to enhance features of target sounds while suppressing those of task-irrelevant distractors. Because an important component of human communication is the ability of a listener to dynamically track speech in noisy environments, the solution obtained by auditory neurophysiology implies a useful adaptation strategy for speech activity detection (SAD). SAD is an important first step in a number of automated speech processing systems, and performance is often reduced in highly noisy environments. In this paper, we describe how task-driven adaptation is induced in an ensemble of neurophysiological STRFs, and show how speech-adapted STRFs reorient themselves to enhance spectro-temporal modulations of speech while suppressing those associated with a variety of nonspeech sounds. We then show how an adapted ensemble of STRFs can better detect speech in unseen noisy environments compared to an unadapted ensemble and a noise-robust baseline. Finally, we use a stimulus reconstruction task to demonstrate how the adapted STRF ensemble better captures the spectrotemporal modulations of attended speech in clean and noisy conditions. Our results suggest that a biologically plausible adaptation framework can be applied to speech processing systems to dynamically adapt feature representations for improving noise robustness.
Hayiou-Thomas, Marianna E; Carroll, Julia M; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J
2017-02-01
This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were assessed at the start of formal reading instruction (age 5½), using measures of phoneme awareness, word-level reading and spelling; and 3 years later (age 8), using measures of word-level reading, spelling and reading comprehension. The presence of early SSD conferred a small but significant risk of poor phonemic skills and spelling at the age of 5½ and of poor word reading at the age of 8. Furthermore, within the group with SSD, the persistence of speech difficulties to the point of school entry was associated with poorer emergent literacy skills, and children with 'disordered' speech errors had poorer word reading skills than children whose speech errors indicated 'delay'. In contrast, the initial severity of SSD was not a significant predictor of reading development. Beyond the domain of speech, the presence of a co-occurring language impairment was strongly predictive of literacy skills and having a family risk of dyslexia predicted additional variance in literacy at both time-points. Early SSD alone has only modest effects on literacy development but when additional risk factors are present, these can have serious negative consequences, consistent with the view that multiple risks accumulate to predict reading disorders. © 2016 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health.
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common.
Weninger, Felix; Eyben, Florian; Schuller, Björn W; Mortillaro, Marcello; Scherer, Klaus R
2013-01-01
Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning either of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow's pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of "the sound that something makes," in order to evaluate the system's auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects.
ERIC Educational Resources Information Center
Nash, Hannah M.; Gooch, Debbie; Hulme, Charles; Mahajan, Yatin; McArthur, Genevieve; Steinmetzger, Kurt; Snowling, Margaret J.
2017-01-01
The "automatic letter-sound integration hypothesis" (Blomert, [Blomert, L., 2011]) proposes that dyslexia results from a failure to fully integrate letters and speech sounds into automated audio-visual objects. We tested this hypothesis in a sample of English-speaking children with dyslexic difficulties (N = 13) and samples of…
Use of Authentic-Speech Technique for Teaching Sound Recognition to EFL Students
ERIC Educational Resources Information Center
Sersen, William J.
2011-01-01
The main objective of this research was to test an authentic-speech technique for improving the sound-recognition skills of EFL (English as a foreign language) students at Roi-Et Rajabhat University. The secondary objective was to determine the correlation, if any, between students' self-evaluation of sound-recognition progress and the actual…
ERIC Educational Resources Information Center
Baker, Elise; McLeod, Sharynne
2011-01-01
Purpose: This article provides both a tutorial and a clinical example of how speech-language pathologists (SLPs) can conduct evidence-based practice (EBP) when working with children with speech sound disorders (SSDs). It is a companion paper to the narrative review of 134 intervention studies for children who have an SSD (Baker & McLeod, 2011).…
ERIC Educational Resources Information Center
Ferati, Mexhid Adem
2012-01-01
To access interactive systems, blind and visually impaired users can leverage their auditory senses by using non-speech sounds. The current structure of non-speech sounds, however, is geared toward conveying user interface operations (e.g., opening a file) rather than large theme-based information (e.g., a history passage) and, thus, is ill-suited…
ERIC Educational Resources Information Center
Preston, Jonathan L.; Felsenfeld, Susan; Frost, Stephen J.; Mencl, W. Einar; Fulbright, Robert K.; Grigorenko, Elena L.; Landi, Nicole; Seki, Ayumi; Pugh, Kenneth R.
2012-01-01
Purpose: To examine neural response to spoken and printed language in children with speech sound errors (SSE). Method: Functional magnetic resonance imaging was used to compare processing of auditorily and visually presented words and pseudowords in 17 children with SSE, ages 8;6[years;months] through 10;10, with 17 matched controls. Results: When…
ERIC Educational Resources Information Center
Waring, R.; Knight, R.
2013-01-01
Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…
NASA Astrophysics Data System (ADS)
Gover, Bradford Noel
The problem of hands-free speech pick-up is introduced, and it is identified how details of the spatial properties of the reverberant field may be useful for enhanced design of microphone arrays. From this motivation, a broadly-applicable measurement system has been developed for the analysis of the directional and spatial variations in reverberant sound fields. Two spherical, 32-element arrays of microphones are used to generate narrow beams over two different frequency ranges, together covering 300–3300 Hz. Using an omnidirectional loudspeaker as excitation in a room, the pressure impulse response in each of 60 steering directions is measured. Through analysis of these responses, the variation of arriving energy with direction is studied. The system was first validated in simple sound fields in an anechoic chamber and in a reverberation chamber. The system characterizes these sound fields as expected, both quantitatively through numerical descriptors and qualitatively from plots of the arriving energy versus direction. The system was then used to measure the sound fields in several actual rooms. Through both qualitative and quantitative output, these sound fields were seen to be highly anisotropic, influenced greatly by the direct sound and early-arriving reflections. Furthermore, the rate of sound decay was not independent of direction, sound being absorbed more rapidly in some directions than in others. These results are discussed in the context of the original motivation, and methods for their application to enhanced speech pick-up using microphone arrays are proposed.
HUMAN SPEECH: A RESTRICTED USE OF THE MAMMALIAN LARYNX
Titze, Ingo R.
2016-01-01
Purpose: Speech has been hailed as unique to human evolution. While the inventory of distinct sounds producible with vocal tract articulators is a great advantage in human oral communication, it is argued here that the larynx as a sound source in speech is limited in its range and capability because a low fundamental frequency is ideal for phonemic intelligibility and source-filter independence. Method: Four existing data sets were combined to make an argument regarding exclusive use of the larynx for speech: (1) range of fundamental frequency, (2) laryngeal muscle activation, (3) vocal fold length in relation to sarcomere length of the major laryngeal muscles, and (4) vocal fold morphological development. Results: Limited data support the notion that speech tends to produce a contracture of the larynx. The morphological design of the human vocal folds, like that of primates and other mammals, is optimized for vocal communication over distances for which higher fundamental frequency, higher intensity, and fewer unvoiced segments are utilized than in conversational speech. Conclusion: The positive message is that raising one’s voice to call, shout, or sing, or executing pitch glides to stretch the vocal folds, can counteract this trend toward a contracted state. PMID:27397113
Result on speech perception after conversion from Spectra® to Freedom®.
Magalhães, Ana Tereza de Matos; Goffi-Gomez, Maria Valéria Schmidt; Hoshino, Ana Cristina; Tsuji, Robinson Koji; Bento, Ricardo Ferreira; Brito, Rubens
2012-04-01
New technology in the Freedom® speech processor for cochlear implants was developed to improve how incoming acoustic sound is processed; this applies not only to new users, but also to previous generations of cochlear implants. The objective was to identify the contribution of this technology, applied to the Nucleus 22®, to speech perception tests in silence and in noise and to audiometric thresholds. A cross-sectional cohort study was undertaken. Seventeen patients were selected. The last map based on the Spectra® was revised and optimized before starting the tests. Troubleshooting was used to identify malfunction. To identify the contribution of the Freedom® technology for the Nucleus 22®, auditory thresholds and speech perception tests were performed in free field in sound-proof booths. Recorded monosyllables and sentences in silence and in noise (SNR = 0 dB) were presented at 60 dB SPL. The nonparametric Wilcoxon test for paired data was used to compare groups. Freedom® applied to the Nucleus 22® showed a statistically significant difference in all speech perception tests and audiometric thresholds. The Freedom® technology improved the performance of speech perception and audiometric thresholds of patients with Nucleus 22®.
Turnover and intent to leave among speech pathologists.
McLaughlin, Emma G H; Adamson, Barbara J; Lincoln, Michelle A; Pallant, Julie F; Cooper, Cary L
2010-05-01
Sound, large scale and systematic research into why health professionals want to leave their jobs is needed. This study used psychometrically-sound tools and logistic regression analyses to determine why Australian speech pathologists were intending to leave their jobs or the profession. Based on data from 620 questionnaires, several variables were found to be significantly related to intent to leave. The speech pathologists intending to look for a new job were more likely to be under 34 years of age, and perceive low levels of job security and benefits of the profession. Those intending to leave the profession were more likely to spend greater than half their time at work on administrative duties, have a higher negative affect score, not have children under 18 years of age, and perceive that speech pathology did not offer benefits that met their professional needs. The findings of this study provide the first evidence regarding the reasons for turnover and attrition in the Australian speech pathology workforce, and can inform the development of strategies to retain a skilled and experienced allied health workforce.
Treatment for Acquired Apraxia of Speech: Examination of Treatment Intensity and Practice Schedule
ERIC Educational Resources Information Center
Wambaugh, Julie L.; Nessler, Christina; Cameron, Rosalea; Mauszycki, Shannon C.
2013-01-01
Purpose: The authors designed this investigation to extend the development of a treatment for acquired apraxia of speech (AOS)--sound production treatment (SPT)--by examining the effects of 2 treatment intensities and 2 schedules of practice. Method: The authors used a multiple baseline design across participants and behaviors with 4 speakers with…
Human neuromagnetic steady-state responses to amplitude-modulated tones, speech, and music.
Lamminmäki, Satu; Parkkonen, Lauri; Hari, Riitta
2014-01-01
Auditory steady-state responses that can be elicited by various periodic sounds inform about subcortical and early cortical auditory processing. Steady-state responses to amplitude-modulated pure tones have been used to scrutinize binaural interaction by frequency-tagging the two ears' inputs at different frequencies. Unlike pure tones, speech and music are physically very complex, as they include many frequency components, pauses, and large temporal variations. To examine the utility of magnetoencephalographic (MEG) steady-state fields (SSFs) in the study of early cortical processing of complex natural sounds, the authors tested the extent to which amplitude-modulated speech and music can elicit reliable SSFs. MEG responses were recorded to 90-s-long binaural tones, speech, and music, amplitude-modulated at 41.1 Hz at four different depths (25, 50, 75, and 100%). The subjects were 11 healthy, normal-hearing adults. MEG signals were averaged in phase with the modulation frequency, and the sources of the resulting SSFs were modeled by current dipoles. After the MEG recording, intelligibility of the speech, musical quality of the music stimuli, naturalness of music and speech stimuli, and the perceived deterioration caused by the modulation were evaluated on visual analog scales. The perceived quality of the stimuli decreased as a function of increasing modulation depth, more strongly for music than speech; yet, all subjects considered the speech intelligible even at the 100% modulation. SSFs were the strongest to tones and the weakest to speech stimuli; the amplitudes increased with increasing modulation depth for all stimuli. 
SSFs to tones were reliably detectable at all modulation depths (in all subjects in the right hemisphere, in 9 subjects in the left hemisphere) and to music stimuli at 50 to 100% depths, whereas speech usually elicited clear SSFs only at 100% depth. The hemispheric balance of SSFs was toward the right hemisphere for tones and speech, whereas SSFs to music showed no lateralization. In addition, the right lateralization of SSFs to the speech stimuli decreased with decreasing modulation depth. The results showed that SSFs can be reliably measured to amplitude-modulated natural sounds, with slightly different hemispheric lateralization for different carrier sounds. With speech stimuli, modulation at 100% depth is required, whereas for music the 75% or even 50% modulation depths provide a reasonable compromise between the signal-to-noise ratio of SSFs and sound quality or perceptual requirements. SSF recordings thus seem feasible for assessing the early cortical processing of natural sounds.
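The averaging of MEG signals in phase with the modulation frequency, as described in the abstract above, can be illustrated with a minimal Python sketch. Only the 41.1 Hz modulation frequency comes from the abstract. The sampling rate (4110 Hz, chosen so that one modulation cycle spans exactly 100 samples), the 10 s duration, the unit response amplitude, and the 10 Hz interference term standing in for unrelated ongoing activity are all assumptions made for illustration, and the function name is hypothetical.

```python
import math

def phase_locked_average(signal, samples_per_cycle):
    """Average consecutive modulation cycles sample by sample."""
    n_cycles = len(signal) // samples_per_cycle
    avg = [0.0] * samples_per_cycle
    for c in range(n_cycles):
        start = c * samples_per_cycle
        for k in range(samples_per_cycle):
            avg[k] += signal[start + k]
    return [v / n_cycles for v in avg]

FM = 41.1        # modulation frequency in Hz (from the abstract)
FS = 4110.0      # assumed sampling rate: 100 samples per cycle
DURATION = 10.0  # assumed recording length in seconds

n_samples = int(round(FS * DURATION))
samples_per_cycle = int(round(FS / FM))

# Simulated recording: a unit-amplitude steady-state response at FM
# plus a five times larger 10 Hz interference (both are assumptions).
signal = [
    1.0 * math.sin(2 * math.pi * FM * i / FS)
    + 5.0 * math.sin(2 * math.pi * 10.0 * i / FS)
    for i in range(n_samples)
]

avg = phase_locked_average(signal, samples_per_cycle)
peak = max(abs(v) for v in avg)
print(f"{peak:.3f}")  # the phase-locked response survives averaging
```

Because the interference is not locked to the 41.1 Hz cycle, it cancels across the 411 averaged cycles, while the phase-locked component keeps its amplitude. Real recordings contain broadband noise that shrinks with averaging rather than cancelling exactly.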
Speech perception: Some new directions in research and theory
Pisoni, David B.
2012-01-01
The perception of speech is one of the most fascinating attributes of human behavior; both the auditory periphery and higher centers help define the parameters of sound perception. In this paper some of the fundamental perceptual problems facing speech sciences are described. The paper focuses on several of the new directions speech perception research is taking to solve these problems. Recent developments suggest that major breakthroughs in research and theory will soon be possible. Current studies of segmentation, invariance, and normalization are described. The paper summarizes some of the new techniques used to understand auditory perception of speech signals and their linguistic significance to the human listener. PMID:4031245
On the Perception of Speech Sounds as Biologically Significant Signals
Pisoni, David B.
2012-01-01
This paper reviews some of the major evidence and arguments currently available to support the view that human speech perception may require the use of specialized neural mechanisms for perceptual analysis. Experiments using synthetically produced speech signals with adults are briefly summarized and extensions of these results to infants and other organisms are reviewed with an emphasis towards detailing those aspects of speech perception that may require some need for specialized species-specific processors. Finally, some comments on the role of early experience in perceptual development are provided as an attempt to identify promising areas of new research in speech perception. PMID:399200
Speech transformation system (spectrum and/or excitation) without pitch extraction
NASA Astrophysics Data System (ADS)
Seneff, S.
1980-07-01
A speech analysis-synthesis system was developed which is capable of independent manipulation of the fundamental frequency and spectral envelope of a speech waveform. The system deconvolved the original speech with the spectral envelope estimate to obtain a model for the excitation. Explicit pitch extraction was not required and, as a consequence, the transformed speech was more natural sounding than would be the case if the excitation were modeled as a sequence of pulses. It is shown that the system has applications in the areas of voice modifications, baseband excited vocoders, time scale modifications, and frequency compression as an aid to the partially deaf.
Using the structure of natural scenes and sounds to predict neural response properties in the brain
NASA Astrophysics Data System (ADS)
Deweese, Michael
2014-03-01
The natural scenes and sounds we encounter in the world are highly structured. The fact that animals and humans are so efficient at processing these sensory signals compared with the latest algorithms running on the fastest modern computers suggests that our brains can exploit this structure. We have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but we also find more exotic structures in the spectrogram representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus (MGBv) and primary auditory cortex (A1), and our model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To our knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory nerve can be predicted based on coding principles and the statistical properties of recorded sounds. We have also developed a biologically-inspired neural network model of primary visual cortex (V1) that can learn a sparse representation of natural scenes using spiking neurons and strictly local plasticity rules. The representation learned by our model is in good agreement with measured receptive fields in V1, demonstrating that sparse sensory coding can be achieved in a realistic biological setting.
ERIC Educational Resources Information Center
Pejovic, Jovana; Molnar, Monika
2017-01-01
Recently it has been proposed that sensitivity to nonarbitrary relationships between speech sounds and objects potentially bootstraps lexical acquisition. However, it is currently unclear whether preverbal infants (e.g., before 6 months of age) with different linguistic profiles are sensitive to such nonarbitrary relationships. Here, the authors…
Criteria for the Segmentation of Vowels on Duplex Oscillograms.
ERIC Educational Resources Information Center
Naeser, Margaret A.
This paper develops criteria for the segmentation of vowels on duplex oscillograms. Previous vowel duration studies have primarily used sound spectrograms. The use of duplex oscillograms, rather than sound spectrograms, permits faster production (real time) at less expense (adding machine paper may be used). The speech signal can be more spread…
Awareness of Rhythm Patterns in Speech and Music in Children with Specific Language Impairments
Cumming, Ruth; Wilson, Angela; Leong, Victoria; Colling, Lincoln J.; Goswami, Usha
2015-01-01
Children with specific language impairments (SLIs) show impaired perception and production of language, and also show impairments in perceiving auditory cues to rhythm [amplitude rise time (ART) and sound duration] and in tapping to a rhythmic beat. Here we explore potential links between language development and rhythm perception in 45 children with SLI and 50 age-matched controls. We administered three rhythmic tasks, a musical beat detection task, a tapping-to-music task, and a novel music/speech task, which varied rhythm and pitch cues independently or together in both speech and music. Via low-pass filtering, the music sounded as though it was played from a low-quality radio and the speech sounded as though it was muffled (heard “behind the door”). We report data for all of the SLI children (N = 45, IQ varying), as well as for two independent subgroupings with intact IQ. One subgroup, “Pure SLI,” had intact phonology and reading (N = 16), the other, “SLI PPR” (N = 15), had impaired phonology and reading. When IQ varied (all SLI children), we found significant group differences in all the rhythmic tasks. For the Pure SLI group, there were rhythmic impairments in the tapping task only. For children with SLI and poor phonology (SLI PPR), group differences were found in all of the filtered speech/music AXB tasks. We conclude that difficulties with rhythmic cues in both speech and music are present in children with SLIs, but that some rhythmic measures are more sensitive than others. The data are interpreted within a “prosodic phrasing” hypothesis, and we discuss the potential utility of rhythmic and musical interventions in remediating speech and language difficulties in children. PMID:26733848
A Novel Radar Sensor for the Non-Contact Detection of Speech Signals
Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi
2010-01-01
Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects. PMID:22399895
Kempe, Vera; Bublitz, Dennis; Brooks, Patricia J
2015-05-01
Is the observed link between musical ability and non-native speech-sound processing due to enhanced sensitivity to acoustic features underlying both musical and linguistic processing? To address this question, native English speakers (N = 118) discriminated Norwegian tonal contrasts and Norwegian vowels. Short tones differing in temporal, pitch, and spectral characteristics were used to measure sensitivity to the various acoustic features implicated in musical and speech processing. Musical ability was measured using Gordon's Advanced Measures of Musical Audiation. Results showed that sensitivity to specific acoustic features played a role in non-native speech-sound processing: Controlling for non-verbal intelligence, prior foreign language-learning experience, and sex, sensitivity to pitch and spectral information partially mediated the link between musical ability and discrimination of non-native vowels and lexical tones. The findings suggest that while sensitivity to certain acoustic features partially mediates the relationship between musical ability and non-native speech-sound processing, complex tests of musical ability also tap into other shared mechanisms. © 2014 The British Psychological Society.
CNTNAP2 Is Significantly Associated With Speech Sound Disorder in the Chinese Han Population.
Zhao, Yun-Jing; Wang, Yue-Ping; Yang, Wen-Zhu; Sun, Hong-Wei; Ma, Hong-Wei; Zhao, Ya-Ru
2015-11-01
Speech sound disorder is the most common communication disorder. Some investigations support the possibility that the CNTNAP2 gene might be involved in the pathogenesis of speech-related diseases. To investigate single-nucleotide polymorphisms in the CNTNAP2 gene, 300 unrelated speech sound disorder patients and 200 normal controls were included in the study. Five single-nucleotide polymorphisms were amplified and directly sequenced. Significant differences were found in the genotype (P = .0003) and allele (P = .0056) frequencies of rs2538976 between patients and controls. The excess frequency of the A allele in the patient group remained significant after Bonferroni correction (P = .0280). A significant haplotype association with rs2710102T/+rs17236239A/+2538976A/+2710117A (P = 4.10 × 10⁻⁶) was identified. A neighboring single-nucleotide polymorphism, rs10608123, was found in complete linkage disequilibrium with rs2538976, and the genotypes exactly corresponded to each other. The authors propose that these CNTNAP2 variants increase the susceptibility to speech sound disorder. The single-nucleotide polymorphisms rs10608123 and rs2538976 may merge into one single-nucleotide polymorphism. © The Author(s) 2015.
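The Bonferroni-corrected value in the abstract above (P = .0280) is consistent with multiplying the uncorrected allele P value (.0056) by the five single-nucleotide polymorphisms tested. A minimal sketch of that arithmetic follows. The function name `bonferroni` is hypothetical, and the choice of five comparisons is an assumption, since the abstract does not state the correction factor.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Return the Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


# Assumption: one comparison per SNP, five SNPs in total.
adjusted = bonferroni(0.0056, 5)
print(f"{adjusted:.4f}")  # 0.0280
```

The cap at 1.0 only matters for large uncorrected values; it keeps the adjusted result a valid probability.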
Auditory-neurophysiological responses to speech during early childhood: Effects of background noise
White-Schwoch, Travis; Davies, Evan C.; Thompson, Elaine C.; Carr, Kali Woodruff; Nicol, Trent; Bradlow, Ann R.; Kraus, Nina
2015-01-01
Early childhood is a critical period of auditory learning, during which children are constantly mapping sounds to meaning. But learning rarely occurs under ideal listening conditions—children are forced to listen against a relentless din. This background noise degrades the neural coding of these critical sounds, in turn interfering with auditory learning. Despite the importance of robust and reliable auditory processing during early childhood, little is known about the neurophysiology underlying speech processing in children so young. To better understand the physiological constraints these adverse listening scenarios impose on speech sound coding during early childhood, auditory-neurophysiological responses were elicited to a consonant-vowel syllable in quiet and background noise in a cohort of typically-developing preschoolers (ages 3–5 yr). Overall, responses were degraded in noise: they were smaller, less stable across trials, slower, and there was poorer coding of spectral content and the temporal envelope. These effects were exacerbated in response to the consonant transition relative to the vowel, suggesting that the neural coding of spectrotemporally-dynamic speech features is more tenuous in noise than the coding of static features—even in children this young. Neural coding of speech temporal fine structure, however, was more resilient to the addition of background noise than coding of temporal envelope information. Taken together, these results demonstrate that noise places a neurophysiological constraint on speech processing during early childhood by causing a breakdown in neural processing of speech acoustics. These results may explain why some listeners have inordinate difficulties understanding speech in noise. 
Speech-elicited auditory-neurophysiological responses offer objective insight into listening skills during early childhood by reflecting the integrity of neural coding in quiet and noise; this paper documents typical response properties in this age group. These normative metrics may be useful clinically to evaluate auditory processing difficulties during early childhood. PMID:26113025
A Smartphone Application for Customized Frequency Table Selection in Cochlear Implants.
Jethanamest, Daniel; Azadpour, Mahan; Zeman, Annette M; Sagi, Elad; Svirsky, Mario A
2017-09-01
A novel smartphone-based software application can facilitate self-selection of frequency allocation tables (FAT) in postlingually deaf cochlear implant (CI) users. CIs use FATs to represent the tonotopic organization of a normal cochlea. Current CI fitting methods typically use a standard FAT for all patients regardless of individual differences in cochlear size and electrode location. In postlingually deaf patients, different amounts of mismatch can result between the frequency-place function they experienced when they had normal hearing and the frequency-place function that results from the standard FAT. For some CI users, an alternative FAT may enhance sound quality or speech perception. Currently, no widely available tools exist to aid real-time selection of different FATs. This study aims to develop a new smartphone tool for this purpose and to evaluate speech perception and sound quality measures in a pilot study of CI subjects using this application. A smartphone application for a widely available mobile platform (iOS) was developed to serve as a preprocessor of auditory input to a clinical CI speech processor and enable interactive real-time selection of FATs. The application's output was validated by measuring electrodograms for various inputs. A pilot study was conducted in six CI subjects. Speech perception was evaluated using word recognition tests. All subjects successfully used the portable application with their clinical speech processors to experience different FATs while listening to running speech. The users were all able to select one table that they judged provided the best sound quality. All subjects chose a FAT different from the standard FAT in their everyday clinical processor. Using the smartphone application, the mean consonant-nucleus-consonant score was 28.5% (SD 16.8) with the default FAT selection and 29.5% (SD 16.4) with a self-selected FAT.
A portable smartphone application enables CI users to self-select frequency allocation tables in real time. Even though the self-selected FATs that were deemed to have better sound quality were only tested acutely (i.e., without long-term experience with them), speech perception scores were not inferior to those obtained with the clinical FATs. This software application may be a valuable tool for improving future methods of CI fitting.
Transitioning from analog to digital audio recording in childhood speech sound disorders.
Shriberg, Lawrence D; McSweeny, Jane L; Anderson, Bruce E; Campbell, Thomas F; Chial, Michael R; Green, Jordan R; Hauner, Katherina K; Moore, Christopher A; Rusiewicz, Heather L; Wilson, David L
2005-06-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants' speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practice. PMID:16019779
ERIC Educational Resources Information Center
Harrison, Linda J.; McLeod, Sharynne; McAllister, Lindy; McCormack, Jane
2017-01-01
This study sought to assess the level of correspondence between parent and teacher report of concern about young children's speech and specialist assessment of speech sound disorders (SSD). A sample of 157 children aged 4-5 years was recruited in preschools and long day care centres in Victoria and New South Wales (NSW). SSD was assessed…
Attentional modulation of informational masking on early cortical representations of speech signals.
Zhang, Changxin; Arnott, Stephen R; Rabaglia, Cristina; Avivi-Reich, Meital; Qi, James; Wu, Xihong; Li, Liang; Schneider, Bruce A
2016-01-01
To recognize speech in a noisy auditory scene, listeners need to perceptually segregate the target talker's voice from other competing sounds (stream segregation). A number of studies have suggested that the attentional demands placed on listeners increase as the acoustic properties and informational content of the competing sounds become more similar to those of the target voice. Hence we would expect attentional demands to be considerably greater when speech is masked by speech than when it is masked by steady-state noise. To investigate the role of attentional mechanisms in the unmasking of speech sounds, event-related potentials (ERPs) were recorded to a syllable masked by noise or competing speech under both active (the participant was asked to respond when the syllable was presented) and passive (no response was required) listening conditions. The results showed that the long-latency auditory response to a syllable (/bi/), presented at different signal-to-masker ratios (SMRs), was similar in both passive and active listening conditions when the masker was a steady-state noise. In contrast, a switch from the passive listening condition to the active one, when the masker was two-talker speech, significantly enhanced the ERPs to the syllable. These results support the hypothesis that the need to engage attentional mechanisms in aid of scene analysis increases as the similarity (both acoustic and informational) between the target speech and the competing background sounds increases. Copyright © 2015 Elsevier B.V. All rights reserved.
Participation of the Classical Speech Areas in Auditory Long-Term Memory
Karabanov, Anke Ninija; Paine, Rainer; Chao, Chi Chao; Schulze, Katrin; Scott, Brian; Hallett, Mark; Mishkin, Mortimer
2015-01-01
Accumulating evidence suggests that storing speech sounds requires transposing rapidly fluctuating sound waves into more easily encoded oromotor sequences. If so, then the classical speech areas in the caudalmost portion of the temporal gyrus (pSTG) and in the inferior frontal gyrus (IFG) may be critical for performing this acoustic-oromotor transposition. We tested this proposal by applying repetitive transcranial magnetic stimulation (rTMS) to each of these left-hemisphere loci, as well as to a nonspeech locus, while participants listened to pseudowords. After 5 minutes these stimuli were re-presented together with new ones in a recognition test. Compared to control-site stimulation, pSTG stimulation produced a highly significant increase in recognition error rate, without affecting reaction time. By contrast, IFG stimulation led only to a weak, non-significant, trend toward recognition memory impairment. Importantly, the impairment after pSTG stimulation was not due to interference with perception, since the same stimulation failed to affect pseudoword discrimination examined with short interstimulus intervals. Our findings suggest that pSTG is essential for transforming speech sounds into stored motor plans for reproducing the sound. Whether or not the IFG also plays a role in speech-sound recognition could not be determined from the present results. PMID:25815813
Stuttering may start with repeating consonants (k, g, t). If stuttering becomes worse, words and phrases are repeated. Later, vocal spasms develop. There is a forced, almost explosive sound to speech. The ...
Perceptual learning of speech under optimal and adverse conditions.
Zhang, Xujin; Samuel, Arthur G
2014-02-01
Humans have a remarkable ability to understand spoken language despite the large amount of variability in speech. Previous research has shown that listeners can use lexical information to guide their interpretation of atypical sounds in speech (Norris, McQueen, & Cutler, 2003). This kind of lexically induced perceptual learning enables people to adjust to the variations in utterances due to talker-specific characteristics, such as individual identity and dialect. The current study investigated perceptual learning in two optimal conditions: conversational speech (Experiment 1) versus clear speech (Experiment 2), and three adverse conditions: noise (Experiment 3a) versus two cognitive loads (Experiments 4a and 4b). Perceptual learning occurred in the two optimal conditions and in the two cognitive load conditions, but not in the noise condition. Furthermore, perceptual learning occurred only in the first of two sessions for each participant, and only for atypical /s/ sounds and not for atypical /f/ sounds. This pattern of learning and nonlearning reflects a balance between flexibility and stability that the speech system must have to deal with speech variability in the diverse conditions that speech is encountered. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Tran, Phuong K; Letowski, Tomasz R; McBride, Maranda E
2013-06-01
Speech signals can be converted into electrical audio signals using either a conventional air conduction (AC) microphone or a contact bone conduction (BC) microphone. The goal of this study was to investigate the effects of the location of a BC microphone on the intensity and frequency spectrum of the recorded speech. Twelve locations, 11 on the talker's head and 1 on the collar bone, were investigated. The speech sounds were three vowels (/u/, /a/, /i/) and two consonants (/m/, /ʃ/). The sounds were produced by 12 talkers. Each sound was recorded simultaneously with two BC microphones and an AC microphone. Analyzed spectral data showed that the BC recordings made at the forehead of the talker were the most similar to the AC recordings, whereas the collar bone recordings were most different. Comparison of the spectral data with speech intelligibility data collected in another study revealed a strong negative relationship between BC speech intelligibility and the degree of deviation of the BC speech spectrum from the AC spectrum. In addition, the head locations that resulted in the highest speech intelligibility were associated with the lowest output signals among all tested locations. Implications of these findings for BC communication are discussed.
... of the palate is because of abnormal speech. The speech has a nasal sound because air is lost through the nose. In such cases the child’s speech should be evaluated by a speech pathologist who, ...
Kalinowski, Joseph; Saltuklaroglu, Tim
2003-04-01
'Choral speech', 'unison speech', or 'imitation speech' has long been known to immediately induce reflexive, spontaneous, and natural sounding fluency, even in the most severe cases of stuttering. Unlike typical post-therapeutic speech, a hallmark characteristic of choral speech is the sense of 'invulnerability' to stuttering, regardless of phonetic context, situational environment, or audience size. We suggest that choral speech immediately inhibits stuttering by engaging mirror systems of neurons, innate primitive neuronal substrates that dominate the initial phases of language development due to their predisposition to reflexively imitate gestural action sequences in a fluent manner. Since mirror systems are primordial in nature, they take precedence over the much later developing stuttering pathology. We suggest that stuttering may best be ameliorated by reengaging mirror neurons via choral speech or one of its derivatives (using digital signal processing technology) to provide gestural mirrors that are nature's way of immediately overriding the central stuttering block. Copyright 2003 Elsevier Science Ltd.
Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Pinheiro da Silva, Joyce; Wertzner, Haydée Fiszbein
2017-05-22
The purpose of the study was to determine the sensitivity and specificity of, and to establish cutoff points for, the severity index Percentage of Consonants Correct - Revised (PCC-R) in Brazilian Portuguese-speaking children with and without speech sound disorders. Participants were 72 children between 5:00 and 7:11 years old: 36 children without speech and language complaints and 36 children with speech sound disorders. The PCC-R was applied to the figure naming and word imitation tasks that are part of the ABFW Child Language Test. Results were statistically analyzed. An ROC curve analysis was performed, and the sensitivity and specificity values of the index were verified. The group of children without speech sound disorders presented greater PCC-R values in both tasks, regardless of the gender of the participants. The cutoff value observed for the picture naming task was 93.4%, with a sensitivity of 0.89 and a specificity of 0.94 (age independent). For the word imitation task, results were age-dependent: for the age group ≤6:5 years old, the cutoff value was 91.0% (sensitivity of 0.77 and specificity of 0.94), and for the age group >6:5 years old, the cutoff value was 93.9% (sensitivity of 0.93 and specificity of 0.94). Given the high sensitivity and specificity of the PCC-R, we can conclude that the index was effective in discriminating and identifying children with and without speech sound disorders.
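The sensitivity and specificity figures in the abstract above follow from applying a PCC-R cutoff to the two groups. The sketch below shows how such values could be computed. The function name and the scores are hypothetical illustrations, not data from the study, and it is an assumption that a child is flagged when the PCC-R falls below the cutoff.

```python
def sensitivity_specificity(scores_ssd, scores_typical, cutoff):
    """Classify a child as 'disordered' when PCC-R < cutoff (assumed rule).

    Sensitivity: share of children with SSD who fall below the cutoff.
    Specificity: share of typical children who are at or above the cutoff.
    """
    true_pos = sum(score < cutoff for score in scores_ssd)
    true_neg = sum(score >= cutoff for score in scores_typical)
    return true_pos / len(scores_ssd), true_neg / len(scores_typical)


# Hypothetical PCC-R percentages, for illustration only.
ssd_scores = [70.0, 85.5, 92.0, 95.1]
typical_scores = [93.4, 96.0, 99.0, 90.0]

sens, spec = sensitivity_specificity(ssd_scores, typical_scores, cutoff=93.4)
print(sens, spec)  # 0.75 0.75
```

An ROC analysis repeats this calculation over many candidate cutoffs and selects the one with the preferred trade-off between the two values.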
ERIC Educational Resources Information Center
McLeod, Sharynne; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Sue; Crowe, Kathryn; Masso, Sarah; White, Paul; Howland, Charlotte
2017-01-01
Purpose: The aim was to evaluate the effectiveness of computer-assisted input-based intervention for children with speech sound disorders (SSD). Method: The Sound Start Study was a cluster-randomized controlled trial. Seventy-nine early childhood centers were invited to participate, 45 were recruited, and 1,205 parents and educators of 4- and…
A Perceptual and Electropalatographic Study of /Esh/ in Young People with Down's Syndrome
ERIC Educational Resources Information Center
Timmins, Claire; Cleland, Joanne; Wood, Sara E.; Hardcastle, William J.; Wishart, Jennifer G.
2009-01-01
Speech production in young people with Down's syndrome has been found to be variable and inconsistent. Errors tend to be more in the production of sounds that typically develop later, for example, fricatives and affricates, rather than stops and nasals. It has been suggested that inconsistency in production is a result of a motor speech deficit.…
A Freely-Available Authoring System for Browser-Based CALL with Speech Recognition
ERIC Educational Resources Information Center
O'Brien, Myles
2017-01-01
A system for authoring browser-based CALL material incorporating Google speech recognition has been developed and made freely available for download. The system provides a teacher with a simple way to set up CALL material, including an optional image, sound or video, which will elicit spoken (and/or typed) answers from the user and check them…
Masterson, Julie J.; Preston, Jonathan L.
2015-01-01
Purpose: This archival investigation examined the relationship between preliteracy speech sound production skill (SSPS) and spelling in Grade 3 using a dataset in which children's receptive vocabulary was generally within normal limits, speech therapy was not provided until Grade 2, and phonological awareness instruction was discouraged at the time data were collected. Method: Participants (N = 250), selected from the Templin Archive (Templin, 2004), varied on prekindergarten SSPS. Participants' real word spellings in Grade 3 were evaluated using a metric of linguistic knowledge, the Computerized Spelling Sensitivity System (Masterson & Apel, 2013). Relationships between kindergarten speech error types and later spellings also were explored. Results: Prekindergarten children in the lowest SSPS (7th percentile) scored poorest among articulatory subgroups on both individual spelling elements (phonetic elements, junctures, and affixes) and acceptable spelling (using relatively more omissions and illegal spelling patterns). Within the 7th percentile subgroup, there were no statistical spelling differences between those with mostly atypical speech sound errors and those with mostly typical speech sound errors. Conclusions: Findings were consistent with predictions from dual route models of spelling that SSPS is one of many variables associated with spelling skill and that children with impaired SSPS are at risk for spelling difficulty. PMID:26380965
Precision of working memory for speech sounds.
Joseph, Sabine; Iverson, Paul; Manohar, Sanjay; Fox, Zoe; Scott, Sophie K; Husain, Masud
2015-01-01
Memory for speech sounds is a key component of models of verbal working memory (WM). But how good is verbal WM? Most investigations assess this using binary report measures to derive a fixed number of items that can be stored. However, recent findings in visual WM have challenged such "quantized" views by employing measures of recall precision with an analogue response scale. WM for speech sounds might rely on both continuous and categorical storage mechanisms. Using a novel speech matching paradigm, we measured WM recall precision for phonemes. Vowel qualities were sampled from a formant space continuum. A probe vowel had to be adjusted to match the vowel quality of a target on a continuous, analogue response scale. Crucially, this provided an index of the variability of a memory representation around its true value and thus allowed us to estimate how memories were distorted from the original sounds. Memory load affected the quality of speech sound recall in two ways. First, there was a gradual decline in recall precision with increasing number of items, consistent with the view that WM representations of speech sounds become noisier with an increase in the number of items held in memory, just as for vision. Based on multidimensional scaling (MDS), the level of noise appeared to be reflected in distortions of the formant space. Second, as memory load increased, there was evidence of greater clustering of participants' responses around particular vowels. A mixture model captured both continuous and categorical responses, demonstrating a shift from continuous to categorical memory with increasing WM load. This suggests that direct acoustic storage can be used for single items, but when more items must be stored, categorical representations must be used.
Pfiffner, Flurin; Kompis, Martin; Stieger, Christof
2009-10-01
To investigate correlations between preoperative hearing thresholds and postoperative aided thresholds and speech understanding of users of Bone-anchored Hearing Aids (BAHA). Such correlations may be useful to estimate the postoperative outcome with BAHA from preoperative data. Retrospective case review. Tertiary referral center. Ninety-two adult unilaterally implanted BAHA users in 3 groups: (A) 24 subjects with a unilateral conductive hearing loss, (B) 38 subjects with a bilateral conductive hearing loss, and (C) 30 subjects with single-sided deafness. Preoperative air-conduction and bone-conduction thresholds and 3-month postoperative aided and unaided sound-field thresholds as well as speech understanding using German 2-digit numbers and monosyllabic words were measured and analyzed. Correlation between preoperative air-conduction and bone-conduction thresholds of the better and of the poorer ear and postoperative aided thresholds as well as correlations between gain in sound-field threshold and gain in speech understanding. Aided postoperative sound-field thresholds correlate best with BC threshold of the better ear (correlation coefficients, r² = 0.237 to 0.419, p = 0.0006 to 0.0064, depending on the group of subjects). Improvements in sound-field threshold correspond to improvements in speech understanding. When estimating expected postoperative aided sound-field thresholds of BAHA users from preoperative hearing thresholds, the BC threshold of the better ear should be used. For the patient groups considered, speech understanding in quiet can be estimated from the improvement in sound-field thresholds.
The McGurk effect in children with autism and Asperger syndrome.
Bebko, James M; Schroeder, Jessica H; Weiss, Jonathan A
2014-02-01
Children with autism may have difficulties in audiovisual speech perception, which has been linked to speech perception and language development. However, little has been done to examine children with Asperger syndrome as a group on tasks assessing audiovisual speech perception, despite this group's often greater language skills. Samples of children with autism, Asperger syndrome, and Down syndrome, as well as a typically developing sample, were presented with an auditory-only condition, a speech-reading condition, and an audiovisual condition designed to elicit the McGurk effect. Children with autism demonstrated unimodal performance at the same level as the other groups, yet showed a lower rate of the McGurk effect compared with the Asperger, Down and typical samples. These results suggest that children with autism may have unique intermodal speech perception difficulties linked to their representations of speech sounds. © 2013 International Society for Autism Research, Wiley Periodicals, Inc.
LISTENING HABITS AND SPEECH SOUND DISCRIMINATION DEVELOPED THROUGH A MULTIPLE SENSORY APPROACH.
ERIC Educational Resources Information Center
SMITH, MARY LOUISE
ALTHOUGH INTENDED PRIMARILY FOR KINDERGARTEN AND PRIMARY TEACHERS, THE SUGGESTIONS FOR LESSONS SHOULD PROVE USEFUL (BY CHANGE IN VOCABULARY AND APPROACH) TO MIDDLE AND UPPER GRADE CHILDREN HAVING DIFFICULTY WITH SOUND DISCRIMINATION. LESSONS SHOULD PROVE HELPFUL TO TEACHERS OF BILINGUAL OR MENTALLY RETARDED PUPILS. IT IS HELD THAT "HOW TO LISTEN"…
Effect of Blast Injury on Auditory Localization in Military Service Members.
Kubli, Lina R; Brungart, Douglas; Northern, Jerry
Among the many advantages of binaural hearing are the abilities to localize sounds in space and to attend to one sound in the presence of many sounds. Binaural hearing provides benefits for all listeners, but it may be especially critical for military personnel who must maintain situational awareness in complex tactical environments with multiple speech and noise sources. There is concern that Military Service Members who have been exposed to one or more high-intensity blasts during their tour of duty may have difficulty with binaural and spatial ability due to degradation in auditory and cognitive processes. The primary objective of this study was to assess the ability of blast-exposed Military Service Members to localize speech sounds in quiet and in multisource environments with one or two competing talkers. Participants were presented with one, two, or three topic-related (e.g., sports, food, travel) sentences under headphones and required to attend to, and then locate the source of, the sentence pertaining to a prespecified target topic within a virtual space. The listener's head position was monitored by a head-mounted tracking device that continuously updated the apparent spatial location of the target and competing speech sounds as the subject turned within the virtual space. Measurements of auditory localization ability included mean absolute error in locating the source of the target sentence, the time it took to locate the target sentence within 30 degrees, target/competitor confusion errors, response time, and cumulative head motion. Twenty-one blast-exposed Active-Duty or Veteran Military Service Members (blast-exposed group) and 33 non-blast-exposed Service Members and beneficiaries (control group) were evaluated. In general, the blast-exposed group performed as well as the control group if the task involved localizing the source of a single speech target. 
However, if the task involved two or three simultaneous talkers, localization ability was compromised for some participants in the blast-exposed group. Blast-exposed participants were less accurate in their localization responses and required more exploratory head movements to find the location of the target talker. Results suggest that blast-exposed participants have more difficulty than non-blast-exposed participants in localizing sounds in complex acoustic environments. This apparent deficit in spatial hearing ability highlights the need to develop new diagnostic tests using complex listening tasks that involve multiple sound sources that require speech segregation and comprehension.
Dynamics of infant cortical auditory evoked potentials (CAEPs) for tone and speech tokens.
Cone, Barbara; Whitaker, Richard
2013-07-01
Cortical auditory evoked potentials (CAEPs) to tones and speech sounds were obtained in infants to: (1) further knowledge of auditory development above the level of the brainstem during the first year of life; (2) establish CAEP input-output functions for tonal and speech stimuli as a function of stimulus level and (3) elaborate the data-base that establishes CAEP in infants tested while awake using clinically relevant stimuli, thus providing methodology that would have translation to pediatric audiological assessment. Hypotheses concerning CAEP development were that the latency and amplitude input-output functions would reflect immaturity in encoding stimulus level. In a second experiment, infants were tested with the same stimuli used to evoke the CAEPs. Thresholds for these stimuli were determined using observer-based psychophysical techniques. The hypothesis was that the behavioral thresholds would be correlated with CAEP input-output functions because of shared cortical response areas known to be active in sound detection. 36 infants, between the ages of 4 and 12 months (mean=8 months, s.d.=1.8 months) and 9 young adults (mean age 21 years) with normal hearing were tested. First, CAEP amplitude and latency input-output functions were obtained for 4 tone bursts and 7 speech tokens. The tone burst stimuli were 50 ms tokens of pure tones at 0.5, 1.0, 2.0 and 4.0 kHz. The speech sound tokens, /a/, /i/, /o/, /u/, /m/, /s/, and /ʃ/, were created from natural speech samples and were also 50 ms in duration. CAEPs were obtained for tone burst and speech token stimuli at 10 dB level decrements in descending order from 70 dB SPL. All CAEP tests were completed while the infants were awake and engaged in quiet play. For the second experiment, observer-based psychophysical methods were used to establish perceptual threshold for the same speech sound and tone tokens. Infant CAEP component latencies were prolonged by 100-150 ms in comparison to adults.
CAEP latency-intensity input-output functions were steeper in infants compared to adults. CAEP amplitude growth functions with respect to stimulus SPL are adult-like at this age, particularly for the earliest component, P1-N1. Infant perceptual thresholds were elevated with respect to those found in adults. Furthermore, perceptual thresholds were higher, on average, than levels at which CAEPs could be obtained. When CAEP amplitudes were plotted with respect to perceptual threshold (dB SL), the infant CAEP amplitude growth slopes were steeper than in adults. Although CAEP latencies indicate immaturity in neural transmission at the level of the cortex, amplitude growth with respect to stimulus SPL is adult-like at this age, particularly for the earliest component, P1-N1. The latency and amplitude input-output functions may provide additional information as to how infants perceive stimulus level. The discrepancy between electrophysiologic and perceptual thresholds may be due to immaturity in perceptual temporal resolution abilities and the broad-band listening strategy employed by infants. The findings from the current study can be translated to the clinical setting. It is possible to use tonal or speech sound tokens to evoke CAEPs in an awake, passively alert infant, and thus determine whether these sounds activate the auditory cortex. This could be beneficial in the verification of hearing aid or cochlear implant benefit. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Improving speech perception in noise for children with cochlear implants.
Gifford, René H; Olund, Amy P; Dejong, Melissa
2011-10-01
Current cochlear implant recipients are achieving increasingly higher levels of speech recognition; however, the presence of background noise continues to significantly degrade speech understanding for even the best performers. Newer generation Nucleus cochlear implant sound processors can be programmed with SmartSound strategies that have been shown to improve speech understanding in noise for adult cochlear implant recipients. The applicability of these strategies for use in children, however, is neither fully understood nor widely accepted. The purpose was to assess speech perception for pediatric cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE™) array in order to determine whether Nucleus sound processor SmartSound strategies yield improved sentence recognition in noise for children who learn language through the implant. Single subject, repeated measures design. Twenty-two experimental subjects with cochlear implants (mean age 11.1 yr) and 25 control subjects with normal hearing (mean age 9.6 yr) participated in this prospective study. Speech reception thresholds (SRT) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the experimental subjects' everyday program incorporating Adaptive Dynamic Range Optimization (ADRO) as well as with the addition of Autosensitivity control (ASC). Adaptive SRTs with the Hearing In Noise Test (HINT) sentences were obtained for all 22 experimental subjects, and performance, in percent correct, was assessed in a fixed +6 dB SNR (signal-to-noise ratio) for a six-subject subset. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the SmartSound setting on the SRT in noise. 
The primary findings mirrored those reported previously with adult cochlear implant recipients in that the addition of ASC to ADRO significantly improved speech recognition in noise for pediatric cochlear implant recipients. The mean degree of improvement in the SRT with the addition of ASC to ADRO was 3.5 dB for a mean SRT of 10.9 dB SNR. Thus, despite the fact that these children have acquired auditory/oral speech and language through the use of their cochlear implant(s) equipped with ADRO, the addition of ASC significantly improved their ability to recognize speech in high levels of diffuse background noise. The mean SRT for the control subjects with normal hearing was 0.0 dB SNR. Given that the mean SRT for the experimental group was 10.9 dB SNR, despite the improvements in performance observed with the addition of ASC, cochlear implants still do not completely overcome the speech perception deficit encountered in noisy environments accompanying the diagnosis of severe-to-profound hearing loss. SmartSound strategies currently available in latest generation Nucleus cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise for pediatric cochlear implant recipients. Despite the reluctance of pediatric audiologists to utilize SmartSound settings for regular use, the results of the current study support the addition of ASC to ADRO for everyday listening environments to improve speech perception in a child's typical everyday program. American Academy of Audiology.
Acoustic analysis of trill sounds.
Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri
2012-04-01
In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
Sound and speech detection and classification in a Health Smart Home.
Fleury, A; Noury, N; Vacher, M; Glasson, H; Seri, J F
2008-01-01
Improvements in medicine increase life expectancy worldwide and create a new bottleneck at the entrance of specialized and equipped institutions. To allow elderly people to stay at home, researchers work on ways to monitor them in their own environment with non-invasive sensors. To meet this goal, smart homes, equipped with many sensors, deliver information on the activities of the person and can help detect distress situations. In this paper, we present a global speech and sound recognition system that can be set up in a flat. We placed eight microphones in the Health Smart Home of Grenoble (a real living flat of 47 m²), and we automatically analyze and sort the different sounds recorded in the flat and the speech uttered (to detect normal or distress French sentences). We introduce the methods for the sound and speech recognition, the post-processing of the data and finally the experimental results obtained in real conditions in the flat.
Elmer, Stefan; Klein, Carina; Kühnis, Jürg; Liem, Franziskus; Meyer, Martin; Jäncke, Lutz
2014-10-01
In this study, we used high-density EEG to evaluate whether speech and music expertise has an influence on the categorization of expertise-related and unrelated sounds. With this purpose in mind, we compared the categorization of speech, music, and neutral sounds between professional musicians, simultaneous interpreters (SIs), and controls in response to morphed speech-noise, music-noise, and speech-music continua. Our hypothesis was that music and language expertise will strengthen the memory representations of prototypical sounds, which act as a perceptual magnet for morphed variants. This means that the prototype would "attract" variants. This so-called magnet effect should be manifested by an increased assignment of morphed items to the trained category, by a reduced maximal slope of the psychometric function, as well as by differential event-related brain responses reflecting memory comparison processes (i.e., N400 and P600 responses). As a main result, we provide first evidence for a domain-specific behavioral bias of musicians and SIs toward the trained categories, namely music and speech. In addition, SIs showed a bias toward musical items, indicating that interpreting training has a generic influence on the cognitive representation of spectrotemporal signals with similar acoustic properties to speech sounds. Notably, EEG measurements revealed clearly distinct N400 and P600 responses to both prototypical and ambiguous items between the three groups at anterior, central, and posterior scalp sites. These differential N400 and P600 responses represent synchronous activity occurring across widely distributed brain networks, and indicate a dynamic recruitment of memory processes that varies as a function of training and expertise.
Sound-direction identification with bilateral cochlear implants.
Neuman, Arlene C; Haravon, Anita; Sislian, Nicole; Waltzman, Susan B
2007-02-01
The purpose of this study was to compare the accuracy of sound-direction identification in the horizontal plane by bilateral cochlear implant users when localization was measured with pink noise and with speech stimuli. Eight adults who were bilateral users of Nucleus 24 Contour devices participated in the study. All had received implants in both ears in a single surgery. Sound-direction identification was measured in a large classroom by using a nine-loudspeaker array. Localization was tested in three listening conditions (bilateral cochlear implants, left cochlear implant, and right cochlear implant), using two different stimuli (a speech stimulus and pink noise bursts) in a repeated-measures design. Sound-direction identification accuracy was significantly better when using two implants than when using a single implant. The mean root-mean-square error was 29 degrees for the bilateral condition, 54 degrees for the left cochlear implant, and 46.5 degrees for the right cochlear implant condition. Unilateral accuracy was similar for right cochlear implant and left cochlear implant performance. Sound-direction identification performance was similar for speech and pink noise stimuli. The data obtained in this study add to the growing body of evidence that sound-direction identification with bilateral cochlear implants is better than with a single implant. The similarity in localization performance obtained with the speech and pink noise supports the use of either stimulus for measuring sound-direction identification.
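The root-mean-square error reported in the abstract above can be illustrated with a minimal sketch. It assumes that each trial pairs a presented loudspeaker azimuth with the azimuth the listener reported, both in degrees; the function name and the trial values are hypothetical and are not taken from the study.

```python
import math


def rms_error_degrees(target_azimuths, response_azimuths):
    """Root-mean-square error between presented and reported azimuths (degrees)."""
    if len(target_azimuths) != len(response_azimuths) or not target_azimuths:
        raise ValueError("need two non-empty sequences of equal length")
    # Square the per-trial angular error, average, then take the square root.
    squared = [(r - t) ** 2 for t, r in zip(target_azimuths, response_azimuths)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical trials (illustration only, not data from the study).
targets = [-90, -45, 0, 45, 90]
responses = [-45, -45, 0, 90, 90]
print(round(rms_error_degrees(targets, responses), 1))  # prints 28.5
```

Because errors are squared before averaging, a few large misjudgements raise the RMS error more than many small ones, which is one reason it is a common summary for localization accuracy.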
Extensions to the Speech Disorders Classification System (SDCS)
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.
2010-01-01
This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…
On the Acoustics of Emotion in Audio: What Speech, Music, and Sound have in Common
Weninger, Felix; Eyben, Florian; Schuller, Björn W.; Mortillaro, Marcello; Scherer, Klaus R.
2013-01-01
Without doubt, there is emotional information in almost any kind of sound received by humans every day: be it the affective state of a person transmitted by means of speech; the emotion intended by a composer while writing a musical piece, or conveyed by a musician while performing it; or the affective state connected to an acoustic event occurring in the environment, in the soundtrack of a movie, or in a radio play. In the field of affective computing, there is currently some loosely connected research concerning each of these phenomena, but a holistic computational model of affect in sound is still lacking. In turn, for tomorrow’s pervasive technical systems, including affective companions and robots, it is expected to be highly beneficial to understand the affective dimensions of “the sound that something makes,” in order to evaluate the system’s auditory environment and its own audio output. This article aims at a first step toward a holistic computational model: starting from standard acoustic feature extraction schemes in the domains of speech, music, and sound analysis, we interpret the worth of individual features across these three domains, considering four audio databases with observer annotations in the arousal and valence dimensions. In the results, we find that by selection of appropriate descriptors, cross-domain arousal and valence regression is feasible, achieving significant correlations with the observer annotations of up to 0.78 for arousal (training on sound and testing on enacted speech) and 0.60 for valence (training on enacted speech and testing on music). The high degree of cross-domain consistency in encoding the two main dimensions of affect may be attributable to the co-evolution of speech and music from multimodal affect bursts, including the integration of nature sounds for expressive effects. PMID:23750144
Unicomb, Rachael; Hewat, Sally; Spencer, Elizabeth; Harrison, Elisabeth
2017-06-01
There is a paucity of evidence to guide treatment for children with co-occurring stuttering and speech sound disorder. Some guidelines suggest treating the two disorders simultaneously using indirect treatment approaches; however, the research supporting these recommendations is over 20 years old. In this clinical case series, we investigate whether these co-occurring disorders could be treated concurrently using direct treatment approaches supported by up-to-date, high-level evidence, and whether this could be done in an efficacious, safe and efficient manner. Five pre-school-aged participants received individual concurrent, direct intervention for both stuttering and speech sound disorder. All participants used the Lidcombe Program, as manualised. Direct treatment for speech sound disorder was individualised based on analysis of each child's sound system. At 12 months post commencement of treatment, all except one participant had completed the Lidcombe Program, and were less than 1.0% syllables stuttered on samples gathered within and beyond the clinic. These four participants completed Stage 1 of the Lidcombe Program in between 14 and 22 clinic visits, consistent with current benchmark data for this programme. At the same assessment point, all five participants exhibited significant increases in percentage of consonants correct and were in alignment with age-expected estimates of this measure. Further, they were treated in an average number of clinic visits that compares favourably with other research on treatment for speech sound disorder. These preliminary results indicate that young children with co-occurring stuttering and speech sound disorder may be treated concurrently using direct treatment approaches. This method of service delivery may have implications for cost and time efficiency and may also address the crucial need for early intervention in both disorders. 
These positive findings highlight the need for further research in the area and contribute to the limited evidence base.
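The two outcome measures named in the abstract above, percentage of syllables stuttered and percentage of consonants correct, are both simple ratios. The sketch below assumes each is computed as a count divided by a total and multiplied by 100; the helper names and the sample counts are hypothetical and are not the study's data.

```python
def percentage(part, whole):
    """Return part/whole expressed as a percentage; whole must be positive."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def percent_syllables_stuttered(stuttered_syllables, total_syllables):
    """Stuttered syllables as a percentage of all syllables in a speech sample."""
    return percentage(stuttered_syllables, total_syllables)


def percent_consonants_correct(correct_consonants, attempted_consonants):
    """Correctly produced consonants as a percentage of consonants attempted."""
    return percentage(correct_consonants, attempted_consonants)


# Hypothetical speech sample (illustration only).
pss = percent_syllables_stuttered(4, 500)
pcc = percent_consonants_correct(171, 190)
print(pss < 1.0)  # prints True: below the 1.0% level mentioned in the abstract
```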
Monkey vocal tracts are speech-ready.
Fitch, W Tecumseh; de Boer, Bart; Mathur, Neil; Ghazanfar, Asif A
2016-12-01
For four decades, the inability of nonhuman primates to produce human speech sounds has been claimed to stem from limitations in their vocal tract anatomy, a conclusion based on plaster casts made from the vocal tract of a monkey cadaver. We used x-ray videos to quantify vocal tract dynamics in living macaques during vocalization, facial displays, and feeding. We demonstrate that the macaque vocal tract could easily produce an adequate range of speech sounds to support spoken language, showing that previous techniques based on postmortem samples drastically underestimated primate vocal capabilities. Our findings imply that the evolution of human speech capabilities required neural changes rather than modifications of vocal anatomy. Macaques have a speech-ready vocal tract but lack a speech-ready brain to control it.
Maggu, Akshay R.; Liu, Fang; Antoniou, Mark; Wong, Patrick C. M.
2016-01-01
Across time, languages undergo changes in phonetic, syntactic, and semantic dimensions. Social, cognitive, and cultural factors contribute to sound change, a phenomenon in which the phonetics of a language undergo changes over time. Individuals who misperceive and produce speech in a slightly divergent manner (called innovators) contribute to variability in the society, eventually leading to sound change. However, the cause of variability in these individuals is still unknown. In this study, we examined whether such misperceptions are represented in neural processes of the auditory system. We investigated behavioral, subcortical (via FFR), and cortical (via P300) manifestations of sound change processing in Cantonese, a Chinese language in which several lexical tones are merging. Across the merging categories, we observed a similar gradation of speech perception abilities in both behavior and the brain (subcortical and cortical processes). Further, we also found that behavioral evidence of tone merging correlated with subjects' encoding at the subcortical and cortical levels. These findings indicate that tone-merger categories, that are indicators of sound change in Cantonese, are represented neurophysiologically with high fidelity. Using our results, we speculate that innovators encode speech in a slightly deviant neurophysiological manner, and thus produce speech divergently that eventually spreads across the community and contributes to sound change. PMID:28066218
Milovanov, Riia; Huotilainen, Minna; Esquef, Paulo A A; Alku, Paavo; Välimäki, Vesa; Tervaniemi, Mari
2009-08-28
We examined 10- to 12-year-old elementary school children's ability to preattentively process sound durations in music and speech stimuli. In total, 40 children had either advanced foreign language production skills and higher musical aptitude or less advanced results in both musicality and linguistic tests. Event-related potential (ERP) recordings of the mismatch negativity (MMN) show that the duration changes in musical sounds are more prominently and accurately processed than changes in speech sounds. Moreover, children with advanced pronunciation and musicality skills displayed enhanced MMNs to duration changes in both speech and musical sounds. Thus, our study provides further evidence for the claim that musical aptitude and linguistic skills are interconnected and that the musical features of the stimuli could have a preponderant role in preattentive duration processing.
On hemispheric differences in evoked potentials to speech stimuli
NASA Technical Reports Server (NTRS)
Galambos, R.; Smith, T. S.; Schulman-Galambos, C.; Osier, H.; Benson, P.
1975-01-01
Subjects were asked to count the number of times a 'target' sound occurred in lists of speech sounds (pa or ba) or pure tones (250 or 600 c/sec) in which one of the sounds (the 'frequent') appeared about four times as often as the target. The responses to targets and frequents were separately averaged from electrodes at the vertex and at symmetrical left and right parietal locations. The expected sequence of deflections, including P3 waves with about 350 msec latency, was found in the responses to target stimuli. Very little difference was found between the right and left hemispheric responses to speech or pure tones, either frequent or target.
Effects of musical expertise on oscillatory brain activity in response to emotional sounds.
Nolden, Sophie; Rigoulot, Simon; Jolicoeur, Pierre; Armony, Jorge L
2017-08-01
Emotions can be conveyed through a variety of channels in the auditory domain, be it via music, non-linguistic vocalizations, or speech prosody. Moreover, recent studies suggest that expertise in one sound category can impact the processing of emotional sounds in other sound categories as they found that musicians process more efficiently emotional musical and vocal sounds than non-musicians. However, the neural correlates of these modulations, especially their time course, are not very well understood. Consequently, we focused here on how the neural processing of emotional information varies as a function of sound category and expertise of participants. Electroencephalogram (EEG) of 20 non-musicians and 17 musicians was recorded while they listened to vocal (speech and vocalizations) and musical sounds. The amplitude of EEG-oscillatory activity in the theta, alpha, beta, and gamma band was quantified and Independent Component Analysis (ICA) was used to identify underlying components of brain activity in each band. Category differences were found in theta and alpha bands, due to larger responses to music and speech than to vocalizations, and in posterior beta, mainly due to differential processing of speech. In addition, we observed greater activation in frontal theta and alpha for musicians than for non-musicians, as well as an interaction between expertise and emotional content of sounds in frontal alpha. The results reflect musicians' expertise in recognition of emotion-conveying music, which seems to also generalize to emotional expressions conveyed by the human voice, in line with previous accounts of effects of expertise on musical and vocal sounds processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Blackman, Graham A.; Hall, Deborah A.
2011-01-01
Purpose: The intense sound generated during functional magnetic resonance imaging (fMRI) complicates studies of speech and hearing. This experiment evaluated the benefits of using active noise cancellation (ANC), which attenuates the level of the scanner sound at the participant's ear by up to 35 dB around the peak at 600 Hz. Method: Speech and…
Chalupper, Josef
2017-01-01
The benefits of combining a cochlear implant (CI) and a hearing aid (HA) in opposite ears on speech perception were examined in 15 adult unilateral CI recipients who regularly use a contralateral HA. A within-subjects design was carried out to assess speech intelligibility testing, listening effort ratings, and a sound quality questionnaire for the conditions CI alone, CIHA together, and HA alone when applicable. The primary outcome of bimodal benefit, defined as the difference between CIHA and CI, was statistically significant for speech intelligibility in quiet as well as for intelligibility in noise across tested spatial conditions. A reduction in effort on top of intelligibility at the highest tested signal-to-noise ratio was found. Moreover, the bimodal listening situation was rated to sound more voluminous, less tinny, and less unpleasant than CI alone. Listening effort and sound quality emerged as feasible and relevant measures to demonstrate bimodal benefit across a clinically representative range of bimodal users. These extended dimensions of speech perception can shed more light on the array of benefits provided by complementing a CI with a contralateral HA. PMID:28874096
Oral breathing and speech disorders in children.
Hitos, Silvia F; Arakaki, Renata; Solé, Dirceu; Weckx, Luc L M
2013-01-01
To assess speech alterations in mouth-breathing children, and to correlate them with the respiratory type, etiology, gender, and age. A total of 439 mouth-breathers were evaluated, aged between 4 and 12 years. The presence of speech alterations in children older than 5 years was considered delayed speech development. The observed alterations were tongue interposition (TI), frontal lisp (FL), articulatory disorders (AD), sound omissions (SO), and lateral lisp (LL). The etiology of mouth breathing, gender, age, respiratory type, and speech disorders were correlated. Speech alterations were diagnosed in 31.2% of patients, unrelated to the respiratory type: oral or mixed. Increased frequency of articulatory disorders and more than one speech disorder were observed in males. TI was observed in 53.3% of patients, followed by AD in 26.3%, and by FL in 21.9%. The co-occurrence of two or more speech alterations was observed in 24.8% of the children. Mouth breathing can affect speech development, socialization, and school performance. Early detection of mouth breathing is essential to prevent and minimize its negative effects on the overall development of individuals. Copyright © 2013 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
Optimizing Classroom Acoustics Using Computer Model Studies.
ERIC Educational Resources Information Center
Reich, Rebecca; Bradley, John
1998-01-01
Investigates conditions relating to the maximum useful-to-detrimental sound ratios present in classrooms and determining the optimum conditions for speech intelligibility. Reveals that speech intelligibility is more strongly influenced by ambient noise levels and that the optimal location for sound absorbing material is on a classroom's upper…
Acoustic assessment of speech privacy curtains in two nursing units
Pope, Diana S.; Miller-Klein, Erik T.
2016-01-01
Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s’ standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, more fragmented nursing unit floor plate shapes should be considered. PMID:26780959
On the importance of early reflections for speech in rooms.
Bradley, J S; Sato, H; Picard, M
2003-06-01
This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.
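The idea that early reflections raise the effective signal level can be sketched as an energy ratio computed from a room impulse response. This is an illustrative calculation, not necessarily the measure used in the paper: the 5 ms direct-sound window, the 50 ms early window, and the function name are all assumptions, and index 0 of the impulse response is assumed to be the arrival of the direct sound.

```python
import math


def early_reflection_gain_db(impulse_response, sample_rate,
                             direct_ms=5.0, early_ms=50.0):
    """Gain (dB) of the energy in the first early_ms over the direct sound alone.

    Window lengths are assumed values chosen for illustration.
    """
    n_direct = int(sample_rate * direct_ms / 1000)
    n_early = int(sample_rate * early_ms / 1000)
    # Energy is the sum of squared pressure samples within each window.
    e_direct = sum(x * x for x in impulse_response[:n_direct])
    e_early = sum(x * x for x in impulse_response[:n_early])
    if e_direct <= 0:
        raise ValueError("no direct-sound energy found in the direct window")
    return 10 * math.log10(e_early / e_direct)


# Hypothetical impulse response sampled at 1 kHz: direct sound at 0 ms,
# one equally strong reflection at 20 ms, one late arrival at 80 ms.
ir = [0.0] * 100
ir[0], ir[20], ir[80] = 1.0, 1.0, 1.0
print(round(early_reflection_gain_db(ir, 1000), 2))  # prints 3.01
```

In this toy case the single early reflection doubles the useful energy, giving about 3 dB, while the late arrival at 80 ms falls outside the early window and does not contribute.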
Tsunoda, Koichi; Sekimoto, Sotaro; Itoh, Kenji
2016-06-01
Conclusions: The results suggest that mother-tongue Japanese (MJ) and non-mother-tongue Japanese (non-MJ) listeners differ in their pattern of brain dominance when listening to sounds from the natural world, in particular insect sounds. These results provide significant support for previous findings from Tsunoda (in 1970). Objectives: This study concentrates on listeners who show clear evidence of a 'speech' brain vs a 'music' brain and determines which side is most active in the processing of insect sounds, using near-infrared spectroscopy. Methods: The present study uses 2-channel near-infrared spectroscopy (NIRS) to provide a more direct measure of left- and right-brain activity while participants listen to each of three types of sounds: Japanese speech, Western violin music, or insect sounds. Data were obtained from 33 participants who showed laterality on opposite sides for Japanese speech and Western music. Results: A majority (80%) of the MJ participants exhibited dominance for insect sounds on the side that was dominant for language, while a majority (62%) of the non-MJ participants exhibited dominance for insect sounds on the side that was dominant for music.
Müller, Rainer; Höhlein, Andreas; Wolf, Annette; Markwardt, Jutta; Schulz, Matthias C; Range, Ursula; Reitemeier, Bernd
2013-01-01
Ablative surgery of oropharyngeal tumors frequently leads to defects in the speech organs, resulting in impairment of speech up to the point of unintelligibility. The aim of the present study was the assessment of selected parameters of speech with and without resection prostheses. The speech sounds of 22 patients suffering from maxillary and mandibular defects were recorded using a digital audio tape (DAT) recorder with and without resection prostheses. Evaluation of the resonance and the production of the sounds /s/, /sch/, and /ch/ was performed by 2 experienced speech therapists. Additionally, the patients completed a non-standardized questionnaire containing a linguistic self-assessment. After prosthesis supply, the number of patients with rhinophonia aperta decreased from 7 to 2 while the number of patients with intelligible speech increased from 2 to 20. Correct production of the sounds /s/, /sch/, and /ch/ increased from 2 to 13 patients. A significant improvement of the evaluated parameters could be observed only in patients with maxillary defects. The linguistic self-assessment showed a higher satisfaction in patients with maxillary defects. In patients with maxillary defects due to ablative tumor surgery, an increase in speech performance and intelligibility is possible by supplying resection prostheses. © 2013 S. Karger GmbH, Freiburg.
Caversaccio, Marco
2014-01-01
Objective. To compare hearing and speech understanding between a new, non-skin-penetrating Baha system (Baha Attract) and the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz, increasing to 20–25 dB above 6000 Hz. However, aided sound field thresholds show smaller differences, and aided speech understanding in quiet and in noise does not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominantly high-frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or noise was found. PMID:25140314
Fuller, Christina; Free, Rolien; Maat, Bert; Başkent, Deniz
2012-08-01
In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence sound and speech perception, and as an extension also the quality of life, by cochlear-implant users. To test this hypothesis, this study explored musical background [using the Dutch Musical Background Questionnaire (DMBQ)], and self-perceived sound and speech perception and quality of life [using the Nijmegen Cochlear Implant Questionnaire (NCIQ) and the Speech Spatial and Qualities of Hearing Scale (SSQ)] in 98 postlingually deafened adult cochlear-implant recipients. In addition to self-perceived measures, speech perception scores (percentage of phonemes recognized in words presented in quiet) were obtained from patient records. The self-perceived hearing performance was associated with the objective speech perception. Forty-one respondents (44% of 94 respondents) indicated some form of formal musical training. Fifteen respondents (18% of 83 respondents) judged themselves as having musical training, experience, and knowledge. No association was observed between musical background (quantified by DMBQ), and self-perceived hearing-related performance or quality of life (quantified by NCIQ and SSQ), or speech perception in quiet.
Real time speech formant analyzer and display
Holland, George E.; Struve, Walter S.; Homer, John F.
1987-01-01
A speech analyzer for interpretation of sound includes a sound input which converts the sound into a signal representing the sound. The signal is passed through a plurality of frequency pass filters to derive a plurality of frequency formants. These formants are converted to voltage signals by frequency-to-voltage converters and then are prepared for visual display in continuous real time. Parameters from the inputted sound are also derived and displayed. The display may then be interpreted by the user. The preferred embodiment includes a microprocessor which is interfaced with a television set for displaying of the sound formants. The microprocessor software enables the sound analyzer to present a variety of display modes for interpretive and therapeutic use by the user.
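The filter-then-convert chain described above can be approximated in software. The following is a minimal sketch, not the patented circuit: it assumes a second-order digital band-pass filter per band and uses a zero-crossing count as a stand-in for the analog frequency-to-voltage converter. The band centres, Q value, and two-tone test signal are hypothetical.

```python
import math

def bandpass(signal, fs, f0, q):
    """Second-order band-pass filter with unity gain at the centre frequency f0."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2 * math.cos(w0) / a0, (1 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        # Direct-form difference equation (the b1 coefficient is zero).
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def zero_crossing_frequency(signal, fs):
    """Estimate the dominant frequency from the rate of sign changes."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return crossings * fs / (2 * len(signal))

fs = 16000
tones = (700.0, 2200.0)   # hypothetical "formant" frequencies in Hz
signal = [sum(math.sin(2 * math.pi * f * n / fs) for f in tones) for n in range(fs)]

bands = (750.0, 2250.0)   # assumed band centres, one filter per formant region
estimates = [
    zero_crossing_frequency(bandpass(signal, fs, f0, q=2.0), fs) for f0 in bands
]
for f0, est in zip(bands, estimates):
    print(f"band centred at {f0:.0f} Hz -> estimated frequency {est:.0f} Hz")
```

Real speech is not a sum of steady tones, so a practical analyzer would run this over short frames and smooth the estimates before display.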
Real time speech formant analyzer and display
Holland, G.E.; Struve, W.S.; Homer, J.F.
1987-02-03
A speech analyzer for interpretation of sound includes a sound input which converts the sound into a signal representing the sound. The signal is passed through a plurality of frequency pass filters to derive a plurality of frequency formants. These formants are converted to voltage signals by frequency-to-voltage converters and then are prepared for visual display in continuous real time. Parameters from the inputted sound are also derived and displayed. The display may then be interpreted by the user. The preferred embodiment includes a microprocessor which is interfaced with a television set for displaying of the sound formants. The microprocessor software enables the sound analyzer to present a variety of display modes for interpretive and therapeutic use by the user. 19 figs.
ERIC Educational Resources Information Center
Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve
2018-01-01
To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
Loebach, Jeremy L.; Pisoni, David B.; Svirsky, Mario A.
2009-01-01
Objective The objective of this study was to assess whether training on speech processed with an 8-channel noise vocoder to simulate the output of a cochlear implant would produce transfer of auditory perceptual learning to the recognition of non-speech environmental sounds, the identification of speaker gender, and the discrimination of talkers by voice. Design Twenty-four normal hearing subjects were trained to transcribe meaningful English sentences processed with a noise vocoder simulation of a cochlear implant. An additional twenty-four subjects served as an untrained control group and transcribed the same sentences in their unprocessed form. All subjects completed pre- and posttest sessions in which they transcribed vocoded sentences to provide an assessment of training efficacy. Transfer of perceptual learning was assessed using a series of closed-set, nonlinguistic tasks: subjects identified talker gender, discriminated the identity of pairs of talkers, and identified ecologically significant environmental sounds from a closed set of alternatives. Results Although both groups of subjects showed significant pre- to posttest improvements, subjects who transcribed vocoded sentences during training performed significantly better at posttest than subjects in the control group. Both groups performed equally well on gender identification and talker discrimination. Subjects who received explicit training on the vocoded sentences, however, performed significantly better on environmental sound identification than the untrained subjects. Moreover, across both groups, pretest speech performance, and to a higher degree posttest speech performance, were significantly correlated with environmental sound identification. For both groups, environmental sounds that were characterized as having more salient temporal information were identified more often than environmental sounds that were characterized as having more salient spectral information. 
Conclusions Listeners trained to identify noise-vocoded sentences showed evidence of transfer of perceptual learning to the identification of environmental sounds. In addition, the correlation between environmental sound identification and sentence transcription indicates that subjects who were better able to utilize the degraded acoustic information to identify the environmental sounds were also better able to transcribe the linguistic content of novel sentences. Both trained and untrained groups performed equally well (~75% correct) on the gender identification task, indicating that training did not have an effect on the ability to identify the gender of talkers. Although better than chance, performance on the talker discrimination task was poor overall (~55%), suggesting that either explicit training is required to reliably discriminate talkers’ voices, or that additional information (perhaps spectral in nature) not present in the vocoded speech is required to excel in such tasks. Taken together, the results suggest that while transfer of auditory perceptual learning with spectrally degraded speech does occur, explicit task-specific training may be necessary for tasks that cannot rely on temporal information alone. PMID:19773659
Anderson, Karen L; Goldstein, Howard
2004-04-01
Children typically learn in classroom environments that have background noise and reverberation that interfere with accurate speech perception. Amplification technology can enhance the speech perception of students who are hard of hearing. This study used a single-subject alternating treatments design to compare the speech recognition abilities of children who are hard of hearing when they were using hearing aids with each of three frequency modulated (FM) or infrared devices. Eight 9-12-year-olds with mild to severe hearing loss repeated Hearing in Noise Test (HINT) sentence lists under controlled conditions in a typical kindergarten classroom with a +10 dB signal-to-noise (S/N) ratio and 1.1 s reverberation time. Participants listened to HINT lists using hearing aids alone and hearing aids in combination with three types of S/N-enhancing devices that are currently used in mainstream classrooms: (a) FM systems linked to personal hearing aids, (b) infrared sound field systems with speakers placed throughout the classroom, and (c) desktop personal sound field FM systems. The infrared ceiling sound field system did not provide benefit beyond that provided by hearing aids alone. Desktop and personal FM systems in combination with personal hearing aids provided substantial improvements in speech recognition. This information can assist in making S/N-enhancing device decisions for students using hearing aids. In a reverberant and noisy classroom setting, classroom sound field devices are not beneficial to speech perception for students with hearing aids, whereas either personal FM or desktop sound field systems provide listening benefits.
Deconvolution of magnetic acoustic change complex (mACC).
Bardy, Fabrice; McMahon, Catherine M; Yau, Shu Hui; Johnson, Blake W
2014-11-01
The aim of this study was to design a novel experimental approach to investigate the morphological characteristics of auditory cortical responses elicited by rapidly changing synthesized speech sounds. Six sound-evoked magnetoencephalographic (MEG) responses were measured to a synthesized train of speech sounds using the vowels /e/ and /u/ in 17 normal hearing young adults. Responses were measured to: (i) the onset of the speech train; (ii) an F0 increment; (iii) an F0 decrement; (iv) an F2 decrement; (v) an F2 increment; and (vi) the offset of the speech train using short (jittered around 135 ms) and long (1500 ms) stimulus onset asynchronies (SOAs). The least squares (LS) deconvolution technique was used to disentangle the overlapping MEG responses in the short SOA condition only. Comparison between the morphology of the recovered cortical responses in the short and long SOA conditions showed high similarity, suggesting that the LS deconvolution technique was successful in disentangling the MEG waveforms. Waveform latencies and amplitudes were different for the two SOA conditions and were influenced by the spectro-temporal properties of the sound sequence. The magnetic acoustic change complex (mACC) for the short SOA condition showed significantly lower amplitudes and shorter latencies compared to the long SOA condition. The F0 transition showed a larger reduction in amplitude from long to short SOA compared to the F2 transition. Lateralization of the cortical responses was observed under some stimulus conditions and appeared to be associated with the spectro-temporal properties of the acoustic stimulus. The LS deconvolution technique provides a new tool to study the properties of the auditory cortical response to rapidly changing sound stimuli. The presence of the cortical auditory evoked responses for rapid transition of synthesized speech stimuli suggests that the temporal code is preserved at the level of the auditory cortex.
Further, the reduced amplitudes and shorter latencies might reflect intrinsic properties of the cortical neurons to rapidly presented sounds. This is the first demonstration of the separation of overlapping cortical responses to rapidly changing speech sounds and offers a potential new biomarker of discrimination of rapid transition of sound. Crown Copyright © 2014. Published by Elsevier Ireland Ltd. All rights reserved.
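The idea of least-squares deconvolution of overlapping responses can be sketched in a few lines. This is a generic, single-response illustration under stated assumptions, not the authors' pipeline: the recording is modelled as the sum of one unknown response waveform repeated at each jittered onset, and the waveform is recovered by solving the normal equations. The onsets, response shape, and function names are hypothetical, and the toy data are noise-free.

```python
def design_matrix(onsets, n_samples, resp_len):
    """Rows are time samples; column j marks samples lying j steps after an onset."""
    X = [[0.0] * resp_len for _ in range(n_samples)]
    for onset in onsets:
        for j in range(resp_len):
            if onset + j < n_samples:
                X[onset + j][j] += 1.0
    return X

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ls_deconvolve(y, onsets, resp_len):
    """Least-squares estimate of the response: solve (X^T X) r = X^T y."""
    X = design_matrix(onsets, len(y), resp_len)
    rows = range(len(y))
    XtX = [[sum(X[t][i] * X[t][j] for t in rows) for j in range(resp_len)]
           for i in range(resp_len)]
    Xty = [sum(X[t][i] * y[t] for t in rows) for i in range(resp_len)]
    return solve(XtX, Xty)

# Toy data: an 8-sample response repeated at jittered onsets that are
# closer together than the response is long, so successive responses overlap.
true_response = [0.0, 1.0, 2.0, 1.0, -1.0, -0.5, 0.0, 0.0]
onsets = [0, 5, 9, 15, 18, 24, 30, 33]
n_samples = 45
y = [0.0] * n_samples
for onset in onsets:
    for j, r in enumerate(true_response):
        y[onset + j] += r

recovered = ls_deconvolve(y, onsets, len(true_response))
print([round(v, 3) for v in recovered])
```

The jitter matters: with perfectly regular onsets the columns of the design matrix become nearly redundant and the estimate is poorly conditioned once noise is present.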
Fluid-acoustic interactions and their impact on pathological voiced speech
NASA Astrophysics Data System (ADS)
Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.
2011-11-01
Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically-realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented into a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
Hidden Markov models in automatic speech recognition
NASA Astrophysics Data System (ADS)
Wrzoskowicz, Adam
1993-11-01
This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
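Recognition with one stochastic model per sound class can be illustrated with the standard forward algorithm: each class model scores the observation sequence, and the class with the highest likelihood wins. The sketch below is a generic discrete-HMM example, not the system described in the article; the two class models, their probabilities, and the two-symbol observation alphabet are all hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

def classify(obs, models):
    """Return the name of the class model that gives obs the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

# Observation symbols (assumed coding): 0 = frame dominated by low-frequency
# energy, 1 = frame dominated by high-frequency energy.
# Each model is (start probabilities, transition matrix, emission matrix).
models = {
    "vowel": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.9, 0.1], [0.8, 0.2]],
    ),
    "fricative": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.2, 0.8], [0.1, 0.9]],
    ),
}

print(classify([0, 0, 0, 1, 0], models))
print(classify([1, 1, 0, 1, 1], models))
```

For long sequences the products underflow, so a practical implementation would work with scaled or log probabilities.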
ERIC Educational Resources Information Center
Ruscello, Dennis M.; Douglas, Cara; Tyson, Tabitha; Durkee, Mark
2005-01-01
A young child with macroglossia of unknown cause was seen for treatment to modify resting tongue posture and improve speech sound production. Evaluation of the treatments indicated positive change in resting tongue posture and a modest change in speech sound production. Treatment for such patients can be complex and must consider orthodontic…
Analyzing Stimulus-Stimulus Pairing Effects on Preferences for Speech Sounds
ERIC Educational Resources Information Center
Petursdottir, Anna Ingeborg; Carp, Charlotte L.; Matthies, Derek W.; Esch, Barbara E.
2011-01-01
Several studies have demonstrated effects of stimulus-stimulus pairing (SSP) on children's vocalizations, but numerous treatment failures have also been reported. The present study attempted to isolate procedural variables related to failures of SSP to condition speech sounds as reinforcers. Three boys diagnosed with autism-spectrum disorders…
Severe Speech Sound Disorders: An Integrated Multimodal Intervention
ERIC Educational Resources Information Center
King, Amie M.; Hengst, Julie A.; DeThorne, Laura S.
2013-01-01
Purpose: This study introduces an integrated multimodal intervention (IMI) and examines its effectiveness for the treatment of persistent and severe speech sound disorders (SSD) in young children. The IMI is an activity-based intervention that focuses simultaneously on increasing the "quantity" of a child's meaningful productions of target words…
Sensory-Cognitive Interaction in the Neural Encoding of Speech in Noise: A Review
Anderson, Samira; Kraus, Nina
2011-01-01
Background Speech-in-noise (SIN) perception is one of the most complex tasks faced by listeners on a daily basis. Although listening in noise presents challenges for all listeners, background noise inordinately affects speech perception in older adults and in children with learning disabilities. Hearing thresholds are an important factor in SIN perception, but they are not the only factor. For successful comprehension, the listener must perceive and attend to relevant speech features, such as the pitch, timing, and timbre of the target speaker’s voice. Here, we review recent studies linking SIN and brainstem processing of speech sounds. Purpose To review recent work that has examined the ability of the auditory brainstem response to complex sounds (cABR), which reflects the nervous system’s transcription of pitch, timing, and timbre, to be used as an objective neural index for hearing-in-noise abilities. Study Sample We examined speech-evoked brainstem responses in a variety of populations, including children who are typically developing, children with language-based learning impairment, young adults, older adults, and auditory experts (i.e., musicians). Data Collection and Analysis In a number of studies, we recorded brainstem responses in quiet and babble noise conditions to the speech syllable /da/ in all age groups, as well as in a variable condition in children in which /da/ was presented in the context of seven other speech sounds. We also measured speech-in-noise perception using the Hearing-in-Noise Test (HINT) and the Quick Speech-in-Noise Test (QuickSIN). Results Children and adults with poor SIN perception have deficits in the subcortical spectrotemporal representation of speech, including low-frequency spectral magnitudes and the timing of transient response peaks. Furthermore, auditory expertise, as engendered by musical training, provides both behavioral and neural advantages for processing speech in noise. 
Conclusions These results have implications for future assessment and management strategies for young and old populations whose primary complaint is difficulty hearing in background noise. The cABR provides a clinically applicable metric for objective assessment of individuals with SIN deficits, for determination of the biologic nature of disorders affecting SIN perception, for evaluation of appropriate hearing aid algorithms, and for monitoring the efficacy of auditory remediation and training. PMID:21241645
Evaluation of synthesized voice approach callouts /SYNCALL/
NASA Technical Reports Server (NTRS)
Simpson, C. A.
1981-01-01
The two basic approaches to the generation of 'synthesized' speech are the use of analog recorded human speech and the construction of speech entirely from algorithms applied to constants describing speech sounds. Given the availability of synthesized speech displays for man-machine systems, research is needed to study suggested applications for speech and design principles for speech displays. The present investigation is concerned with a study for which new performance measures were developed. A number of air carrier approach and landing accidents during low or impaired visibility have been associated with the absence of approach callouts. The purpose of the study was to compare a pilot-not-flying (PNF) approach callout system to a system composed of PNF callouts augmented by an automatic synthesized voice callout system (SYNCALL). Pilots were found to favor the use of a SYNCALL system containing certain modifications.
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.
Berezutskaya, Julia; Freudenburg, Zachary V; Güçlü, Umut; van Gerven, Marcel A J; Ramsey, Nick F
2017-08-16
Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain. SIGNIFICANCE STATEMENT We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. 
Our results show that low-level speech features propagate throughout the perisylvian cortex and potentially contribute to the emergence of "coarse" speech representations in inferior frontal gyrus typically associated with high-level language processing. These findings add to the previous work on auditory processing and underline a distinctive role of inferior frontal gyrus in natural speech comprehension. Copyright © 2017 the authors 0270-6474/17/377906-15$15.00/0.
The development of visual speech perception in Mandarin Chinese-speaking children.
Chen, Liang; Lei, Jianghua
2017-01-01
The present study aimed to investigate the development of visual speech perception in Chinese-speaking children. Children aged 7, 13 and 16 were asked to visually identify both consonant and vowel sounds in Chinese as quickly and accurately as possible. Results revealed (1) an increase in accuracy of visual speech perception between ages 7 and 13 after which the accuracy rate either stagnates or drops; and (2) a U-shaped development pattern in speed of perception with peak performance in 13-year olds. Results also showed that across all age groups, the overall levels of accuracy rose, whereas the response times fell for simplex finals, complex finals and initials. These findings suggest that (1) visual speech perception in Chinese is a developmental process that is acquired over time and is still fine-tuned well into late adolescence; (2) factors other than cross-linguistic differences in phonological complexity and degrees of reliance on visual information are involved in development of visual speech perception.
Effects of environmental sounds on the guessability of animated graphic symbols.
Harmon, Ashley C; Schlosser, Ralf W; Gygi, Brian; Shane, Howard C; Kong, Ying-Yee; Book, Lorraine; Macduff, Kelly; Hearn, Emilia
2014-12-01
Graphic symbols are a necessity for pre-literate children who use aided augmentative and alternative communication (AAC) systems (including non-electronic communication boards and speech generating devices), as well as for mobile technologies using AAC applications. Recently, developers of the Autism Language Program (ALP) Animated Graphics Set have added environmental sounds to animated symbols representing verbs in an attempt to enhance their iconicity. The purpose of this study was to examine the effects of environmental sounds (added to animated graphic symbols representing verbs) in terms of naming. Participants included 46 children with typical development aged 3;0 to 3;11 (years;months). The participants were randomly allocated to a condition of symbols with environmental sounds or a condition without environmental sounds. Results indicated that environmental sounds significantly enhanced the naming accuracy of animated symbols for verbs. Implications in terms of symbol selection, symbol refinement, and future symbol development will be discussed.
Smith, Anne; Goffman, Lisa; Sasisekaran, Jayanthi; Weber-Fox, Christine
2012-01-01
Stuttering is a disorder of speech production that typically arises in the preschool years, and many accounts of its onset and development implicate language and motor processes as critical underlying factors. There have, however, been very few studies of speech motor control processes in preschool children who stutter. Hearing novel nonwords and reproducing them engages multiple neural networks, including those involved in phonological analysis and storage and speech motor programming and execution. We used this task to explore speech motor and language abilities of 31 children aged 4–5 years who were diagnosed as stuttering. We also used sensitive and specific standardized tests of speech and language abilities to determine which of the children who stutter had concomitant language and/or phonological disorders. Approximately half of our sample of stuttering children had language and/or phonological disorders. As previous investigations would suggest, the stuttering children with concomitant language or speech sound disorders produced significantly more errors on the nonword repetition task compared to typically developing children. In contrast, the children who were diagnosed as stuttering, but who had normal speech sound and language abilities, performed the nonword repetition task with equal accuracy compared to their normally fluent peers. Analyses of interarticulator motions during accurate and fluent productions of the nonwords revealed that the children who stutter (without concomitant disorders) showed higher variability in oral motor coordination indices. These results provide new evidence that preschool children diagnosed as stuttering lag their typically developing peers in maturation of speech motor control processes. 
Educational objectives The reader will be able to: (a) discuss why performance on nonword repetition tasks has been investigated in children who stutter; (b) discuss why children who stutter in the current study had a higher incidence of concomitant language deficits compared to several other studies; (c) describe how performance differed on a nonword repetition test between children who stutter who do and do not have concomitant speech or language deficits; (d) make a general statement about speech motor control for nonword production in children who stutter compared to controls. PMID:23218217
Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z
2015-01-01
The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.
Voice and Speech after Laryngectomy
ERIC Educational Resources Information Center
Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka
2006-01-01
The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…
Clinical Validation of a Sound Processor Upgrade in Direct Acoustic Cochlear Implant Subjects
Kludt, Eugen; D’hondt, Christiane; Lenarz, Thomas; Maier, Hannes
2017-01-01
Objective: The objectives of the investigation were to evaluate the effect of a sound processor upgrade on the speech reception threshold in noise and to collect long-term safety and efficacy data after 2½ to 5 years of device use of direct acoustic cochlear implant (DACI) recipients. Study Design: The study was designed as a mono-centric, prospective clinical trial. Setting: Tertiary referral center. Patients: Fifteen patients implanted with a direct acoustic cochlear implant. Intervention: Upgrade with a newer generation of sound processor. Main Outcome Measures: Speech recognition test in quiet and in noise, pure tone thresholds, subject-reported outcome measures. Results: The speech recognition in quiet and in noise is superior after the sound processor upgrade and stable after long-term use of the direct acoustic cochlear implant. The bone conduction thresholds did not decrease significantly after long-term high level stimulation. Conclusions: The new sound processor for the DACI system provides significant benefits for DACI users for speech recognition in both quiet and noise. In particular, the noise program with directional microphones (Zoom) makes conversations in noisy environments much less difficult for DACI patients. Furthermore, the study confirms that the benefits of the sound processor upgrade are available to the DACI recipients even after several years of experience with a legacy sound processor. Finally, our study demonstrates that the DACI system is a safe and effective long-term therapy. PMID:28406848
Fifty years of progress in acoustic phonetics
NASA Astrophysics Data System (ADS)
Stevens, Kenneth N.
2004-10-01
Three events that occurred 50 or 60 years ago shaped the study of acoustic phonetics, and in the following few decades these events influenced research and applications in speech disorders, speech development, speech synthesis, speech recognition, and other subareas in speech communication. These events were: (1) the source-filter theory of speech production (Chiba and Kajiyama; Fant); (2) the development of the sound spectrograph and its interpretation (Potter, Kopp, and Green; Joos); and (3) the birth of research that related distinctive features to acoustic patterns (Jakobson, Fant, and Halle). Following these events there has been systematic exploration of the articulatory, acoustic, and perceptual bases of phonological categories, and some quantification of the sources of variability in the transformation of this phonological representation of speech into its acoustic manifestations. This effort has been enhanced by studies of how children acquire language in spite of this variability and by research on speech disorders. Gaps in our knowledge of this inherent variability in speech have limited the directions of applications such as synthesis and recognition of speech, and have led to the implementation of data-driven techniques rather than theoretical principles. Some examples of advances in our knowledge, and limitations of this knowledge, are reviewed.
NASA Astrophysics Data System (ADS)
Jelinek, H. J.
1986-01-01
This is the Final Report of Electronic Design Associates on its Phase I SBIR project. The purpose of this project is to develop a method for correcting helium speech, as experienced in diver-surface communication. The goal of the Phase I study was to design, prototype, and evaluate a real time helium speech corrector system based upon digital signal processing techniques. The general approach was to develop hardware (an IBM PC board) to digitize helium speech and software (a LAMBDA computer based simulation) to translate the speech. As planned in the study proposal, this initial prototype may now be used to assess expected performance from a self contained real time system which uses an identical algorithm. The Final Report details the work carried out to produce the prototype system. Four major project tasks were completed: (1) a signal processing scheme for converting helium speech to normal sounding speech was generated; (2) the scheme was simulated on a general purpose (LAMBDA) computer, with actual helium speech supplied to the simulation and converted speech generated; (3) an IBM-PC based 14 bit data input/output board was designed and built; and (4) a bibliography of references on speech processing was generated.
Sapienza, C M; Crandell, C C; Curtis, B
1999-09-01
Voice problems are a frequent difficulty that teachers experience. Common complaints by teachers include vocal fatigue and hoarseness. One possible explanation for these symptoms is prolonged elevations in vocal loudness within the classroom. This investigation examined the effectiveness of sound-field frequency modulation (FM) amplification on reducing the sound pressure level (SPL) of the teacher's voice during classroom instruction. Specifically, SPL was examined during speech produced in a classroom lecture by 10 teachers with and without the use of sound-field amplification. Results indicated a significant 2.42-dB decrease in SPL with the use of sound-field FM amplification. These data support the use of sound-field amplification in the vocal hygiene regimen recommended to teachers by speech-language pathologists.
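To put the reported 2.42-dB reduction in perspective, a level difference in decibels can be converted to a ratio using the standard relations ΔL = 20·log10(p1/p2) for sound pressure and ΔL = 10·log10(I1/I2) for acoustic power. The sketch below is illustrative arithmetic only; the function names are hypothetical and do not come from the study.

```python
def pressure_ratio(delta_db: float) -> float:
    """Sound-pressure ratio for a level change in dB (20*log10 rule)."""
    return 10 ** (delta_db / 20)


def power_ratio(delta_db: float) -> float:
    """Acoustic power (intensity) ratio for a level change in dB (10*log10 rule)."""
    return 10 ** (delta_db / 10)


# The abstract reports a 2.42-dB decrease in the teacher's voice level.
reported_change_db = -2.42
p_ratio = pressure_ratio(reported_change_db)
i_ratio = power_ratio(reported_change_db)

print(f"pressure ratio: {p_ratio:.3f}")
print(f"power ratio:    {i_ratio:.3f}")
```

Under these relations, a 2.42-dB decrease means the voice is produced at roughly three quarters of the original pressure amplitude, or a little under 60% of the original acoustic power.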
Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness
Kates, James M.; Arehart, Kathryn H.; Souza, Pamela E.
2013-01-01
Individual factors beyond the audiogram, such as age and cognitive abilities, can influence speech intelligibility and speech quality judgments. This paper develops a neural network framework for combining multiple subject factors into a single model that predicts speech intelligibility and quality for a nonlinear hearing-aid processing strategy. The nonlinear processing approach used in the paper is frequency compression, which is intended to improve the audibility of high-frequency speech sounds by shifting them to lower frequency regions where listeners with high-frequency loss have better hearing thresholds. An ensemble averaging approach is used for the neural network to avoid the problems associated with overfitting. Models are developed for two subject groups, one having nearly normal hearing and the other mild-to-moderate sloping losses. PMID:25669257
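The idea of frequency compression, shifting high-frequency content into a lower region, can be sketched with one simple mapping: frequencies below a cutoff pass unchanged, and frequencies above it are compressed toward the cutoff by a fixed ratio. This is an assumption made for illustration; the abstract does not state the mapping or parameter values used, and the cutoff and ratio below are hypothetical.

```python
def compress_frequency(f_hz: float, cutoff_hz: float = 2000.0, ratio: float = 2.0) -> float:
    """Map an input frequency to its compressed output frequency.

    Hypothetical mapping: unchanged at or below the cutoff, and the distance
    above the cutoff divided by `ratio` otherwise.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be >= 1 for compression")
    if f_hz <= cutoff_hz:
        return f_hz
    return cutoff_hz + (f_hz - cutoff_hz) / ratio


# Illustrative inputs: a low-frequency component and two high-frequency ones.
for f_in in (1000.0, 4000.0, 6000.0):
    print(f_in, "->", compress_frequency(f_in))
```

With the illustrative settings, a 6000 Hz component lands at 4000 Hz, where a listener with a sloping high-frequency loss may have better thresholds, while components below the cutoff are left alone.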
Understanding environmental sounds in sentence context.
Uddin, Sophia; Heald, Shannon L M; Van Hedger, Stephen C; Klos, Serena; Nusbaum, Howard C
2018-03-01
There is debate about how individuals use context to successfully predict and recognize words. One view argues that context supports neural predictions that make use of the speech motor system, whereas other views argue for a sensory or conceptual level of prediction. While environmental sounds can convey clear referential meaning, they are not linguistic signals, and are thus neither produced with the vocal tract nor typically encountered in sentence context. We compared the effect of spoken sentence context on recognition and comprehension of spoken words versus nonspeech, environmental sounds. In Experiment 1, sentence context decreased the amount of signal needed for recognition of spoken words and environmental sounds in similar fashion. In Experiment 2, listeners judged sentence meaning in both high and low contextually constraining sentence frames, when the final word was present or replaced with a matching environmental sound. Results showed that sentence constraint affected decision time similarly for speech and nonspeech, such that high constraint sentences (i.e., frame plus completion) were processed faster than low constraint sentences for speech and nonspeech. Linguistic context facilitates the recognition and understanding of nonspeech sounds in much the same way as for spoken words. This argues against a simple form of a speech-motor explanation of predictive coding in spoken language understanding, and suggests support for conceptual-level predictions. Copyright © 2017 Elsevier B.V. All rights reserved.
Park, H K; Bradley, J S
2009-07-01
This paper reports the results of an evaluation of the merits of standard airborne sound insulation measures with respect to subjective ratings of the annoyance and loudness of transmitted sounds. Subjects listened to speech and music sounds modified to represent transmission through 20 different walls with sound transmission class (STC) ratings from 34 to 58. A number of variations in the standard measures were also considered. These included variations in the 8-dB rule for the maximum allowed deficiency in the STC measure as well as variations in the standard 32-dB total allowed deficiency. Several spectrum adaptation terms were considered in combination with weighted sound reduction index (R(w)) values as well as modifications to the range of included frequencies in the standard rating contour. An STC measure without an 8-dB rule and an R(w) rating with a new spectrum adaptation term were better predictors of annoyance and loudness ratings of speech sounds. R(w) ratings with one of two modified C(tr) spectrum adaptation terms were better predictors of annoyance and loudness ratings of transmitted music sounds. Although some measures were much better predictors of responses to one type of sound than were the standard STC and R(w) values, no measure was remarkably improved for predicting annoyance and loudness ratings of both music and speech sounds.
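The 8-dB rule and the 32-dB total allowed deficiency mentioned above are the two constraints in the usual STC contour-fitting procedure. The following is a minimal sketch of that procedure, assuming the sixteen one-third-octave bands from 125 to 4000 Hz and a reference contour written from memory of the standard classification; the contour offsets should be checked against the standard before any real use. The function name and the example wall data are hypothetical.

```python
# One-third-octave band centre frequencies (Hz) assumed for the STC contour.
STC_BANDS_HZ = [125, 160, 200, 250, 315, 400, 500, 630,
                800, 1000, 1250, 1600, 2000, 2500, 3150, 4000]
# Assumed contour shape, in dB relative to its value at 500 Hz.
STC_OFFSETS_DB = [-16, -13, -10, -7, -4, -1, 0, 1,
                  2, 3, 4, 4, 4, 4, 4, 4]


def stc_rating(tl_db, max_single=8, max_total=32):
    """Return the highest contour value at 500 Hz that satisfies both limits.

    tl_db      : transmission loss per band (dB), same order as STC_BANDS_HZ
    max_single : largest deficiency allowed in any one band (None disables it)
    max_total  : largest allowed sum of deficiencies over all bands
    """
    if len(tl_db) != len(STC_OFFSETS_DB):
        raise ValueError("expected one transmission-loss value per band")
    best = None
    for candidate in range(0, 101):
        # A deficiency is how far the measured value falls below the contour.
        deficiencies = [max(0, candidate + off - tl)
                        for off, tl in zip(STC_OFFSETS_DB, tl_db)]
        if sum(deficiencies) > max_total:
            break  # deficiencies only grow as the contour is raised
        if max_single is not None and max(deficiencies) > max_single:
            break
        best = candidate
    return best


# Hypothetical wall: follows the contour shape but has a 12-dB dip at 125 Hz.
wall = [50 + off for off in STC_OFFSETS_DB]
wall[0] -= 12

print(stc_rating(wall))                   # rating with the 8-dB rule
print(stc_rating(wall, max_single=None))  # rating without the 8-dB rule
```

In this made-up example the low-frequency dip alone limits the rating when the 8-dB rule is applied, whereas dropping the rule lets the total-deficiency limit decide. This is the kind of variation the study compared against listener ratings.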
Articulatory speech synthesis and speech production modelling
NASA Astrophysics Data System (ADS)
Huang, Jun
This dissertation addresses the problem of speech synthesis and speech production modelling based on the fundamental principles of human speech production. Unlike the conventional source-filter model, which assumes the independence of the excitation and the acoustic filter, we treat the entire vocal apparatus as one system consisting of a fluid dynamic aspect and a mechanical part. We model the vocal tract by a three-dimensional moving geometry. We also model the sound propagation inside the vocal apparatus as a three-dimensional nonplane-wave propagation inside a viscous fluid described by Navier-Stokes equations. In our work, we first propose a combined minimum energy and minimum jerk criterion to estimate the dynamic vocal tract movements during speech production. Both theoretical error bound analysis and experimental results show that this method can achieve a very close match at the target points and avoid the abrupt change in articulatory trajectory at the same time. Second, a mechanical vocal fold model is used to compute the excitation signal of the vocal tract. The advantage of this model is that it is closely coupled with the vocal tract system based on fundamental aerodynamics. As a result, we can obtain an excitation signal with much more detail than the conventional parametric vocal fold excitation model. Furthermore, strong evidence of source-tract interaction is observed. Finally, we propose a computational model of the fricative and stop types of sounds based on the physical principles of speech production. The advantage of this model is that it uses an exogenous process to model the additional nonsteady and nonlinear effects due to the flow mode, which are ignored by the conventional source-filter speech production model. A recursive algorithm is used to estimate the model parameters. Experimental results show that this model is able to synthesize good quality fricative and stop types of sounds.
Based on our dissertation work, we carefully argue that the articulatory speech production model has the potential to flexibly synthesize natural-quality speech sounds and to provide a compact computational model for speech production that can be beneficial to a wide range of areas in speech signal processing.
Searchfield, Grant D; Linford, Tania; Kobayashi, Kei; Crowhen, David; Latzel, Matthias
2018-03-01
To compare preference for and performance of manually selected programmes to an automatic sound classifier, the Phonak AutoSense OS. A single blind repeated measures study. Participants were fit with Phonak Virto V90 ITE aids; preferences for different listening programmes were compared across four different sound scenarios (speech in: quiet, noise, loud noise and a car). Following a 4-week trial preferences were reassessed and the user's preferred programme was compared to the automatic classifier for sound quality and hearing in noise (HINT test) using a 12 loudspeaker array. Twenty-five participants with symmetrical moderate-severe sensorineural hearing loss. Participant preferences of manual programme for scenarios varied considerably between and within sessions. A HINT Speech Reception Threshold (SRT) advantage was observed for the automatic classifier over participants' manual selection for speech in quiet, loud noise and car noise. Sound quality ratings were similar for both manual and automatic selections. The use of a sound classifier is a viable alternative to manual programme selection.
Pre-Literacy Skills of Subgroups of Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Raitano, Nancy A.; Pennington, Bruce F.; Tunick, Rachel A.; Boada, Richard; Shriberg, Lawrence D.
2004-01-01
Background: The existing literature has conflicting findings about the literacy outcome of children with speech sound disorders (SSD), which may be due to the heterogeneity within SSD. Previous studies have documented that two important dimensions of heterogeneity are the presence of a comorbid language impairment (LI) and the persistence of SSD,…
Intervention Efficacy and Intensity for Children with Speech Sound Disorder
ERIC Educational Resources Information Center
Allen, Melissa M.
2013-01-01
Purpose: Clinicians do not have an evidence base they can use to recommend optimum intervention intensity for preschool children who present with speech sound disorder (SSD). This study examined the effect of dose frequency on phonological performance and the efficacy of the multiple oppositions approach. Method: Fifty-four preschool children with…
ERIC Educational Resources Information Center
Hickok, G.; Okada, K.; Barr, W.; Pa, J.; Rogalsky, C.; Donnelly, K.; Barde, L.; Grant, A.
2008-01-01
Data from lesion studies suggest that the ability to perceive speech sounds, as measured by auditory comprehension tasks, is supported by temporal lobe systems in both the left and right hemisphere. For example, patients with left temporal lobe damage and auditory comprehension deficits (i.e., Wernicke's aphasics), nonetheless comprehend isolated…
Impact of Aberrant Acoustic Properties on the Perception of Sound Quality in Electrolarynx Speech
ERIC Educational Resources Information Center
Meltzner, Geoffrey S.; Hillman, Robert E.
2005-01-01
A large percentage of patients who have undergone laryngectomy to treat advanced laryngeal cancer rely on an electrolarynx (EL) to communicate verbally. Although serviceable, EL speech is plagued by shortcomings in both sound quality and intelligibility. This study sought to better quantify the relative contributions of previously identified…
Phonological Processing and Reading in Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Rvachew, Susan
2007-01-01
Purpose: To examine the relationship between phonological processing skills prior to kindergarten entry and reading skills at the end of 1st grade, in children with speech sound disorders (SSD). Method: The participants were 17 children with SSD and poor phonological processing skills (SSD-low PP), 16 children with SSD and good phonological…
What Influences Literacy Outcome in Children with Speech Sound Disorder?
ERIC Educational Resources Information Center
Peterson, Robin L.; Pennington, Bruce F.; Shriberg, Lawrence D.; Boada, Richard
2009-01-01
Purpose: In this study, the authors evaluated literacy outcome in children with histories of speech sound disorder (SSD) who were characterized along 2 dimensions: broader language function and persistence of SSD. In previous studies, authors have demonstrated that each dimension relates to literacy but have not disentangled their effects.…
Phonetic Variability in Residual Speech Sound Disorders: Exploration of Subtypes
ERIC Educational Resources Information Center
Preston, Jonathan L.; Koenig, Laura L.
2011-01-01
Purpose: To explore whether subgroups of children with residual speech sound disorders (R-SSDs) can be identified through multiple measures of token-to-token phonetic variability (changes in one spoken production to the next). Method: Children with R-SSDs were recorded during a rapid multisyllabic picture naming task and an oral diadochokinetic…
Correlates of Phonological Awareness in Preschoolers with Speech Sound Disorders
ERIC Educational Resources Information Center
Rvachew, Susan; Grawburg, Meghann
2006-01-01
Purpose: The purpose of this study was to examine the relationships among variables that may contribute to poor phonological awareness (PA) skills in preschool-aged children with speech sound disorders (SSD). Method: Ninety-five 4- and 5-year-old children with SSD were assessed during the spring of their prekindergarten year. Linear structural…
ERIC Educational Resources Information Center
Lousada, M.; Jesus, Luis M. T.; Hall, A.; Joffe, V.
2014-01-01
Background: The effectiveness of two treatment approaches (phonological therapy and articulation therapy) for treatment of 14 children, aged 4;0-6;7 years, with phonologically based speech-sound disorder (SSD) has been previously analysed with severity outcome measures (percentage of consonants correct score, percentage occurrence of phonological…
Stimulus Characteristics of Single-Word Tests of Children's Speech Sound Production
ERIC Educational Resources Information Center
Macrae, Toby
2017-01-01
Purpose: This clinical focus article provides readers with a description of the stimulus characteristics of 12 popular tests of speech sound production. Method: Using significance testing and descriptive analyses, stimulus items were compared in terms of the number of opportunities for production of all consonant singletons, clusters, and rhotic…
Perception of Spectral Contrast by Hearing-Impaired Listeners
ERIC Educational Resources Information Center
Dreisbach, Laura E.; Leek, Marjorie R.; Lentz, Jennifer J.
2005-01-01
The ability to discriminate the spectral shapes of complex sounds is critical to accurate speech perception. Part of the difficulty experienced by listeners with hearing loss in understanding speech sounds in noise may be related to a smearing of the internal representation of the spectral peaks and valleys because of the loss of sensitivity and…
Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech
ERIC Educational Resources Information Center
Švec, Jan G.; Granqvist, Svante
2018-01-01
Purpose: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to…
Inferior Frontal Sensitivity to Common Speech Sounds Is Amplified by Increasing Word Intelligibility
ERIC Educational Resources Information Center
Vaden, Kenneth I., Jr.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.
2011-01-01
The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of…
Nonspeech Oral Motor Treatment Issues Related to Children with Developmental Speech Sound Disorders
ERIC Educational Resources Information Center
Ruscello, Dennis M.
2008-01-01
Purpose: This article examines nonspeech oral motor treatments (NSOMTs) in the population of clients with developmental speech sound disorders. NSOMTs are a collection of nonspeech methods and procedures that claim to influence tongue, lip, and jaw resting postures; increase strength; improve muscle tone; facilitate range of motion; and develop…
Transitioning from Analog to Digital Audio Recording in Childhood Speech Sound Disorders
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Mcsweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.
2005-01-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing…
Binaural Release from Masking for a Speech Sound in Infants, Preschoolers, and Adults.
ERIC Educational Resources Information Center
Nozza, Robert J.
1988-01-01
Binaural masked thresholds for a speech sound (/ba/) were estimated under two interaural phase conditions in three age groups (infants, preschoolers, adults). Differences as a function of both age and condition and effects of reducing intensity for adults were significant in indicating possible developmental binaural hearing changes, especially…
Speaker Identity Supports Phonetic Category Learning
ERIC Educational Resources Information Center
Mani, Nivedita; Schneider, Signe
2013-01-01
Visual cues from the speaker's face, such as the discriminable mouth movements used to produce speech sounds, improve discrimination of these sounds by adults. The speaker's face, however, provides more information than just the mouth movements used to produce speech--it also provides a visual indexical cue of the identity of the speaker. The…
Noise-induced hearing impairment and handicap
NASA Technical Reports Server (NTRS)
1984-01-01
A permanent, noise-induced hearing loss has a doubly harmful effect on speech communications. First, the elevation in the threshold of hearing means that many speech sounds are too weak to be heard, and second, very intense speech sounds may appear to be distorted. The whole question of the impact of noise-induced hearing loss upon the impairments and handicaps experienced by people with such hearing losses was somewhat controversial, partly because of the economic aspects of related practical noise control and workmen's compensation.
NASA Astrophysics Data System (ADS)
Mosko, J. D.; Stevens, K. N.; Griffin, G. R.
1983-08-01
Acoustical analyses were conducted of words produced by four speakers in a motion stress-inducing situation. The aim of the analyses was to document the kinds of changes that occur in the vocal utterances of speakers who are exposed to motion stress and to comment on the implications of these results for the design and development of voice interactive systems. The speakers differed markedly in the types and magnitudes of the changes that occurred in their speech. For some speakers, the stress-inducing experimental condition caused an increase in fundamental frequency, changes in the pattern of vocal fold vibration, shifts in vowel production and changes in the relative amplitudes of sounds containing turbulence noise. All speakers showed greater variability in the experimental condition than in a more relaxed control situation. The variability was manifested in the acoustical characteristics of individual phonetic elements, particularly in unstressed syllables. The kinds of changes and variability observed serve to emphasize the limitations of speech recognition systems based on template matching of patterns that are stored in the system during a training phase. There is need for a better understanding of these phonetic modifications and for developing ways of incorporating knowledge about these changes within a speech recognition system.
Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn
2015-05-01
Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two way repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time. 
Because ClearVoice does not degrade performance in quiet settings, clinicians should consider recommending ClearVoice for routine, full-time use for AB implant recipients. Roger should be used in all instances in which remote microphone technology may assist the user in understanding speech in the presence of noise. American Academy of Audiology.
Obstructive sleep apnea, seizures, and childhood apraxia of speech.
Caspari, Susan S; Strand, Edythe A; Kotagal, Suresh; Bergqvist, Christina
2008-06-01
Associations between obstructive sleep apnea and motor speech disorders in adults have been suggested, though little has been written about possible effects of sleep apnea on speech acquisition in children with motor speech disorders. This report details the medical and speech history of a nonverbal child with seizures and severe apraxia of speech. For 6 years, he made no functional gains in speech production, despite intensive speech therapy. After tonsillectomy for obstructive sleep apnea at age 6 years, he experienced a reduction in seizures and rapid growth in speech production. The findings support a relationship between obstructive sleep apnea and childhood apraxia of speech. The rather late diagnosis and treatment of obstructive sleep apnea, especially in light of what was such a life-altering outcome (gaining functional speech), has significant implications. Most speech sounds develop during ages 2-5 years, which is also the peak time of occurrence of adenotonsillar hypertrophy and childhood obstructive sleep apnea. Hence it is important to establish definitive diagnoses, and to consider early and more aggressive treatments for obstructive sleep apnea, in children with motor speech disorders.
Implications of diadochokinesia in children with speech sound disorder.
Wertzner, Haydée Fiszbein; Pagan-Neves, Luciana de Oliveira; Alves, Renata Ramos; Barrozo, Tatiane Faria
2013-01-01
To verify the performance of children with and without speech sound disorder in oral motor skills measured by oral diadochokinesia according to age and gender and to compare the results by two different methods of analysis. Participants were 72 subjects aged from 5 years to 7 years and 11 months divided into four subgroups according to the presence of speech sound disorder (Study Group and Control Group) and age (<6 years and 5 months and >6 years and 5 months). Diadochokinesia skills were assessed by the repetition of the sequences 'pa', 'ta', 'ka' and 'pataka' measured both manually and by the software Motor Speech Profile®. Gender was statistically different for both groups but it did not influence the number of sequences per second produced. Correlation between the number of sequences per second and age was observed for all sequences (except for 'ka') only for the control group children. Comparison between groups did not indicate differences between the number of sequences per second and age. Results presented strong agreement between the values of oral diadochokinesia measured manually and by MSP. This research demonstrated the importance of using different methods of analysis on the functional evaluation of oro-motor processing aspects of children with speech sound disorder and evidenced the oro-motor difficulties in children aged under eight years.
Auditory Cortex Processes Variation in Our Own Speech
Sitek, Kevin R.; Mathalon, Daniel H.; Roach, Brian J.; Houde, John F.; Niziolek, Caroline A.; Ford, Judith M.
2013-01-01
As we talk, we unconsciously adjust our speech to ensure it sounds the way we intend it to sound. However, because speech production involves complex motor planning and execution, no two utterances of the same sound will be exactly the same. Here, we show that auditory cortex is sensitive to natural variations in self-produced speech from utterance to utterance. We recorded event-related potentials (ERPs) from ninety-nine subjects while they uttered “ah” and while they listened to those speech sounds played back. Subjects' utterances were sorted based on their formant deviations from the previous utterance. Typically, the N1 ERP component is suppressed during talking compared to listening. By comparing ERPs to the least and most variable utterances, we found that N1 was less suppressed to utterances that differed greatly from their preceding neighbors. In contrast, an utterance's difference from the median formant values did not affect N1. Trial-to-trial pitch (f0) deviation and pitch difference from the median similarly did not affect N1. We discuss mechanisms that may underlie the change in N1 suppression resulting from trial-to-trial formant change. Deviant utterances require additional auditory cortical processing, suggesting that speaking-induced suppression mechanisms are optimally tuned for a specific production. PMID:24349399
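The two variability measures contrasted in this abstract, deviation from the preceding utterance versus deviation from the median, can be sketched as follows. This is a simplified illustration that assumes a single formant value (F1, in Hz) per utterance; the abstract does not say how the formant deviations were actually combined or how many trials went into each bin, so the function names, the bin size `k`, and the data are all hypothetical.

```python
from statistics import median


def trial_to_trial_deviation(values):
    """Absolute change from the previous utterance (defined from the 2nd one on)."""
    return [abs(curr - prev) for prev, curr in zip(values, values[1:])]


def deviation_from_median(values):
    """Absolute distance of each utterance from the median of the whole series."""
    centre = median(values)
    return [abs(v - centre) for v in values]


def least_and_most_variable(values, k):
    """Indices of the k utterances closest to, and farthest from, their predecessor."""
    devs = trial_to_trial_deviation(values)
    order = sorted(range(len(devs)), key=lambda i: devs[i])
    # devs[i] belongs to utterance i + 1, because utterance 0 has no predecessor.
    least = sorted(i + 1 for i in order[:k])
    most = sorted(i + 1 for i in order[-k:])
    return least, most


# Made-up F1 values (Hz) for seven successive productions of "ah".
f1_hz = [700, 702, 760, 758, 705, 706, 650]
least, most = least_and_most_variable(f1_hz, k=2)
print("least variable utterances:", least)
print("most variable utterances: ", most)
```

Note that the two measures can disagree: in the made-up series, utterance 3 sits far from the median but close to utterance 2, which is why the study could find an effect for one measure and not the other.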
Brammer, Anthony J; Yu, Gongqiang; Bernstein, Eric R; Cherniack, Martin G; Peterson, Donald R; Tufts, Jennifer B
2014-08-01
An adaptive, delayless, subband feed-forward control structure is employed to improve the speech signal-to-noise ratio (SNR) in the communication channel of a circumaural headset/hearing protector (HPD) from 90 Hz to 11.3 kHz, and to provide active noise control (ANC) from 50 to 800 Hz to complement the passive attenuation of the HPD. The task involves optimizing the speech SNR for each communication channel subband, subject to limiting the maximum sound level at the ear, maintaining a speech SNR preferred by users, and reducing large inter-band gain differences to improve speech quality. The performance of a proof-of-concept device has been evaluated in a pseudo-diffuse sound field when worn by human subjects under conditions of environmental noise and speech that do not pose a risk to hearing, and by simulation for other conditions. For the environmental noises employed in this study, subband speech SNR control combined with subband ANC produced greater improvement in word scores than subband ANC alone, and improved the consistency of word scores across subjects. The simulation employed a subject-specific linear model, and predicted that word scores are maintained in excess of 90% for sound levels outside the HPD of up to ∼115 dBA.
Integrating speech in time depends on temporal expectancies and attention.
Scharinger, Mathias; Steinberg, Johanna; Tavano, Alessandro
2017-08-01
Sensory information that unfolds in time, such as in speech perception, relies on efficient chunking mechanisms in order to yield optimally-sized units for further processing. Whether or not two successive acoustic events receive a one-unit or a two-unit interpretation seems to depend on the fit between their temporal extent and a stipulated temporal window of integration. However, there is ongoing debate on how flexible this temporal window of integration should be, especially for the processing of speech sounds. Furthermore, there is no direct evidence of whether attention may modulate the temporal constraints on the integration window. For this reason, we here examine how different word durations, which lead to different temporal separations of sound onsets, interact with attention. In an electroencephalography (EEG) study, participants actively and passively listened to words where word-final consonants were occasionally omitted. Words had either a natural duration or were artificially prolonged in order to increase the separation of speech sound onsets. Omission responses to incomplete speech input, originating in left temporal cortex, decreased when the critical speech sound was separated from previous sounds by more than 250 msec, i.e., when the separation was larger than the stipulated temporal window of integration (125-150 msec). Attention, on the other hand, only increased omission responses for stimuli with natural durations. We complemented the event-related potential (ERP) analyses by a frequency-domain analysis on the stimulus presentation rate. Notably, the power of stimulation frequency showed the same duration and attention effects as the omission responses. We interpret these findings against the background of existing research on temporal integration windows and further suggest that our findings may be accounted for within the framework of predictive coding. Copyright © 2017 Elsevier Ltd. All rights reserved.
The development of the Nucleus Freedom Cochlear implant system.
Patrick, James F; Busby, Peter A; Gibson, Peter J
2006-12-01
Cochlear Limited (Cochlear) released the fourth-generation cochlear implant system, Nucleus Freedom, in 2005. Freedom is based on 25 years of experience in cochlear implant research and development and incorporates advances in medicine, implantable materials, electronic technology, and sound coding. This article presents the development of Cochlear's implant systems, with an overview of the first 3 generations, and details of the Freedom system: the CI24RE receiver-stimulator, the Contour Advance electrode, the modular Freedom processor, the available speech coding strategies, the input processing options of Smart Sound to improve the signal before coding as electrical signals, and the programming software. Preliminary results from multicenter studies with the Freedom system are reported, demonstrating better levels of performance compared with the previous systems. The final section presents the most recent implant reliability data, with the early findings at 18 months showing improved reliability of the Freedom implant compared with the earlier Nucleus 3 System. Also reported are some of the findings of Cochlear's collaborative research programs to improve recipient outcomes. Included are studies showing the benefits from bilateral implants, electroacoustic stimulation using an ipsilateral and/or contralateral hearing aid, advanced speech coding, and streamlined speech processor programming.
Infant Perception of Atypical Speech Signals
ERIC Educational Resources Information Center
Vouloumanos, Athena; Gelfand, Hanna M.
2013-01-01
The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…
Status Report on Speech Research, No. 27, July-September 1971.
ERIC Educational Resources Information Center
Haskins Labs., New Haven, CT.
This report contains fourteen papers on a wide range of current topics and experiments in speech research, ranging from the relationship between speech and reading to questions of memory and perception of speech sounds. The following papers are included: "How Is Language Conveyed by Speech?;""Reading, the Linguistic Process, and Linguistic…
Learning-induced neural plasticity of speech processing before birth
Partanen, Eino; Kujala, Teija; Näätänen, Risto; Liitola, Auli; Sambeth, Anke; Huotilainen, Minna
2013-01-01
Learning, the foundation of adaptive and intelligent behavior, is based on plastic changes in neural assemblies, reflected by the modulation of electric brain responses. In infancy, auditory learning implicates the formation and strengthening of neural long-term memory traces, improving discrimination skills, in particular those forming the prerequisites for speech perception and understanding. Although previous behavioral observations show that newborns react differentially to unfamiliar sounds vs. familiar sound material that they were exposed to as fetuses, the neural basis of fetal learning has not thus far been investigated. Here we demonstrate direct neural correlates of human fetal learning of speech-like auditory stimuli. We presented variants of words to fetuses; unlike infants with no exposure to these stimuli, the exposed fetuses showed enhanced brain activity (mismatch responses) in response to pitch changes for the trained variants after birth. Furthermore, a significant correlation existed between the amount of prenatal exposure and brain activity, with greater activity being associated with a higher amount of prenatal speech exposure. Moreover, the learning effect was generalized to other types of similar speech sounds not included in the training material. Consequently, our results indicate neural commitment specifically tuned to the speech features heard before birth and their memory representations. PMID:23980148
Parent-child interaction in motor speech therapy.
Namasivayam, Aravind Kumar; Jethava, Vibhuti; Pukonen, Margit; Huynh, Anna; Goshulak, Debra; Kroll, Robert; van Lieshout, Pascal
2018-01-01
This study measures the reliability and sensitivity of a modified Parent-Child Interaction Observation scale (PCIOs) used to monitor the quality of parent-child interaction. The scale is part of a home-training program employed with direct motor speech intervention for children with speech sound disorders. Eighty-four preschool age children with speech sound disorders were provided either high- (2×/week/10 weeks) or low-intensity (1×/week/10 weeks) motor speech intervention. Clinicians completed the PCIOs at the beginning, middle, and end of treatment. Inter-rater reliability (Kappa scores) was determined by an independent speech-language pathologist who assessed videotaped sessions at the midpoint of the treatment block. Intervention sensitivity of the scale was evaluated using a Friedman test for each item and then followed up with Wilcoxon pairwise comparisons where appropriate. We obtained fair-to-good inter-rater reliability (Kappa = 0.33-0.64) for the PCIOs using only video-based scoring. Child-related items were more strongly influenced by differences in treatment intensity than parent-related items, where a greater number of sessions positively influenced parent learning of treatment skills and child behaviors. The adapted PCIOs is reliable and sensitive to monitor the quality of parent-child interactions in a 10-week block of motor speech intervention with adjunct home therapy. Implications for rehabilitation: Parent-centered therapy is considered a cost effective method of speech and language service delivery. However, parent-centered models may be difficult to implement for treatments such as developmental motor speech interventions that require a high degree of skill and training.
For children with speech sound disorders and motor speech difficulties, a translated and adapted version of the parent-child observation scale was found to be sufficiently reliable and sensitive to assess changes in the quality of the parent-child interactions during intervention. In developmental motor speech interventions, high-intensity treatment (2×/week/10 weeks) facilitates greater changes in the parent-child interactions than low intensity treatment (1×/week/10 weeks). On one hand, parents may need to attend more than five sessions with the clinician to learn how to observe and address their child's speech difficulties. On the other hand, children with speech sound disorders may need more than 10 sessions to adapt to structured play settings even when activities and therapy materials are age-appropriate.
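The inter-rater reliability figure reported above (Kappa = 0.33-0.64) is a chance-corrected agreement statistic. A minimal sketch of Cohen's kappa for two raters follows, assuming each rater assigns one categorical score per session. The function name and the ratings are hypothetical illustrations, not data or code from the study, and the abstract does not say which kappa variant was used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Made-up scores for ten videotaped sessions (1 = behaviour present, 0 = absent).
clinician = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
independent_slp = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohen_kappa(clinician, independent_slp)
```

With these made-up ratings the raters agree on 8 of 10 sessions while chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6, which would sit near the upper end of the range reported in the abstract.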
Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D; Senn, Pascal
2013-01-01
To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240, 160 × 120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0-500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Higher frame rate (>7 fps), higher camera resolution (>640 × 480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Webcameras have the potential to improve telecommunication of hearing-impaired individuals.
Abou-Elsaad, Tamer; Baz, Hemmat; Afsah, Omayma; Mansy, Alzahraa
2015-09-01
Even with early surgical repair, the majority of cleft palate children demonstrate articulation errors and have typical cleft palate speech. The aim was to determine the nature of articulation errors of Arabic consonants in Egyptian Arabic-speaking children with velopharyngeal insufficiency (VPI). Thirty Egyptian Arabic-speaking children with VPI due to cleft palate (whether primary repaired or secondary repaired) were studied. Auditory perceptual assessment (APA) of the children's speech was conducted. Nasopharyngoscopy was done to assess the velopharyngeal port (VPP) movements while the child was repeating speech tasks. The Mansoura Arabic Articulation test (MAAT) was performed to analyze the consonant articulation of these children. The most frequent type of articulatory error observed was substitution, more specifically, backing. Pharyngealization of anterior fricatives was the most frequent substitution, especially for the /s/ sound. The most frequent substituting sounds for other sounds were /ʔ/ followed by /k/ and /n/ sounds. Significant correlations were found between the degrees of open nasality and VPP closure and the articulation errors. On the other hand, the sounds (/ʔ/,/ħ/,/ʕ/,/n/,/w/,/j/) were normally articulated by all children in the studied group. The determination of articulation errors in VPI children could guide therapists in designing appropriate speech therapy programs for these cases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Buss, Emily; Leibold, Lori J.; Porter, Heather L.; Grose, John H.
2017-01-01
Children perform more poorly than adults on a wide range of masked speech perception paradigms, but this effect is particularly pronounced when the masker itself is also composed of speech. The present study evaluated two factors that might contribute to this effect: the ability to perceptually isolate the target from masker speech, and the ability to recognize target speech based on sparse cues (glimpsing). Speech reception thresholds (SRTs) were estimated for closed-set, disyllabic word recognition in children (5–16 years) and adults in a one- or two-talker masker. Speech maskers were 60 dB sound pressure level (SPL), and they were either presented alone or in combination with a 50-dB-SPL speech-shaped noise masker. There was an age effect overall, but performance was adult-like at a younger age for the one-talker than the two-talker masker. Noise tended to elevate SRTs, particularly for older children and adults, and when summed with the one-talker masker. Removing time-frequency epochs associated with a poor target-to-masker ratio markedly improved SRTs, with larger effects for younger listeners; the age effect was not eliminated, however. Results were interpreted as indicating that development of speech-in-speech recognition is likely impacted by development of both perceptual masking and the ability to recognize speech based on sparse cues. PMID:28464682
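The masker levels quoted above (a 60 dB SPL speech masker combined with a 50-dB-SPL noise masker) do not add arithmetically. As a rough worked example, assuming the two maskers are uncorrelated so that their powers add, the overall level can be sketched as below. The abstract does not report a combined level, and the function name is a hypothetical illustration.

```python
import math


def combine_levels_db(*levels_db):
    """Overall level in dB of uncorrelated sources, obtained by summing their powers."""
    # Convert each level to relative power, add the powers, convert back to dB.
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)


# Speech masker at 60 dB SPL plus speech-shaped noise at 50 dB SPL.
combined = combine_levels_db(60.0, 50.0)
```

Under this assumption the combined level is 10·log10(10^6 + 10^5), about 60.4 dB SPL, so adding the noise raises the overall masker level by less than half a decibel.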
Speech Intelligibility in Various Noise Conditions with the Nucleus® 5 CP810 Sound Processor.
Dillier, Norbert; Lai, Wai Kong
2015-06-11
The Nucleus® 5 System Sound Processor (CP810, Cochlear™, Macquarie University, NSW, Australia) contains two omnidirectional microphones. They can be configured as a fixed directional microphone combination (called Zoom) or as an adaptive beamformer (called Beam), which adjusts the directivity continuously to maximally reduce the interfering noise. Initial evaluation studies with the CP810 had compared the performance and usability of the new processor with those of the Freedom™ Sound Processor (Cochlear™) for speech in quiet and noise for a subset of the processing options. This study compares the two processing options suggested to be used in noisy environments, Zoom and Beam, for various sound field conditions using a standardized speech in noise matrix test (Oldenburg sentences test). Nine German-speaking subjects who previously had been using the Freedom speech processor and subsequently were upgraded to the CP810 device participated in this series of additional evaluation tests. The speech reception threshold (SRT for 50% speech intelligibility in noise) was determined using sentences presented via loudspeaker at 65 dB SPL in front of the listener and noise presented either via the same loudspeaker (S0N0) or at 90 degrees at either the ear with the sound processor (S0NCI+) or the opposite unaided ear (S0NCI-). The fourth noise condition consisted of three uncorrelated noise sources placed at 90, 180 and 270 degrees. The noise level was adjusted through an adaptive procedure to yield a signal to noise ratio where 50% of the words in the sentences were correctly understood. In spatially separated speech and noise conditions both Zoom and Beam could improve the SRT significantly. For single noise sources, either ipsilateral or contralateral to the cochlear implant sound processor, average improvements with Beam of 12.9 and 7.9 dB in SRT were found.
The average SRT of -8 dB for Beam in the diffuse noise condition (uncorrelated noise from both sides and back) is truly remarkable and comparable to the performance of normal hearing listeners in the same test environment. The static directivity (Zoom) option in the diffuse noise condition still provides a significant benefit of 5.9 dB in comparison with the standard omnidirectional microphone setting. These results indicate that CI recipients may improve their speech recognition in noisy environments significantly using these directional microphone-processing options.
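The adaptive procedure mentioned above converges on the signal-to-noise ratio at which 50% of the words are understood. A minimal sketch of one such rule, a simple 1-up/1-down staircase, is given below. This is an assumption for illustration only and is not the adaptive rule of the Oldenburg sentence test; the function names, step size, trial count, and the deterministic simulated listener are all hypothetical.

```python
def run_staircase(respond, start_snr_db=0.0, step_db=2.0, n_trials=30):
    """1-up/1-down staircase: lower the SNR after a correct trial, raise it after an error.

    `respond` is a callable taking an SNR in dB and returning True if the trial
    was understood. Returns the mean SNR over the second half of the track as
    a rough estimate of the 50% point.
    """
    snr = start_snr_db
    track = []
    for _ in range(n_trials):
        track.append(snr)
        correct = respond(snr)
        # Make the task harder after a success and easier after a failure.
        snr += -step_db if correct else step_db
    tail = track[n_trials // 2:]
    return sum(tail) / len(tail)


def simulated_listener(snr_db, true_srt_db=-8.0):
    """Deterministic stand-in for a listener: understands the sentence at or above the SRT."""
    return snr_db >= true_srt_db


estimated_srt = run_staircase(simulated_listener)
```

With this idealised listener the track descends from the starting SNR and then oscillates around the threshold, so the estimate lands within one step of the simulated SRT. A real listener responds probabilistically, and sentence tests typically score the proportion of words correct per sentence rather than a single yes/no outcome.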
Moore, Brian C J; Füllgrabe, Christian; Stone, Michael A
2011-01-01
To determine preferred parameters of multichannel compression using individually fitted simulated hearing aids and a method of paired comparisons. Fourteen participants with mild to moderate hearing loss listened via a simulated five-channel compression hearing aid fitted using the CAMEQ2-HF method to pairs of speech sounds (a male talker and a female talker) and musical sounds (a percussion instrument, orchestral classical music, and a jazz trio) presented sequentially and indicated which sound of the pair was preferred and by how much. The sounds in each pair were derived from the same token and differed along a single dimension in the type of processing applied. For the speech sounds, participants judged either pleasantness or clarity; in the latter case, the speech was presented in noise at a 2-dB signal-to-noise ratio. For musical sounds, they judged pleasantness. The parameters explored were time delay of the audio signal relative to the gain control signal (the alignment delay), compression speed (attack and release times), bandwidth (5, 7.5, or 10 kHz), and gain at high frequencies relative to that prescribed by CAMEQ2-HF. Pleasantness increased with increasing alignment delay only for the percussive musical sound. Clarity was not affected by alignment delay. There was a trend for pleasantness to decrease slightly with increasing bandwidth, but this was significant only for female speech with fast compression. Judged clarity was significantly higher for the 7.5- and 10-kHz bandwidths than for the 5-kHz bandwidth for both slow and fast compression and for both talker genders. Compression speed had little effect on pleasantness for 50- or 65-dB SPL input levels, but slow compression was generally judged as slightly more pleasant than fast compression for an 80-dB SPL input level. Clarity was higher for slow than for fast compression for input levels of 80 and 65 dB SPL but not for a level of 50 dB SPL. 
Preferences for pleasantness were approximately equal with CAMEQ2-HF gains and with gains slightly reduced at high frequencies and were lower when gains were slightly increased at high frequencies. Speech clarity was not affected by changing the gain at high frequencies. Effects of alignment delay were small except for the percussive sound. A wider bandwidth was slightly preferred for speech clarity. Speech clarity was slightly greater with slow compression, especially at high levels. Preferred high-frequency gains were close to or a little below those prescribed by CAMEQ2-HF.
Telkemeyer, Silke; Rossi, Sonja; Nierhaus, Till; Steinbrink, Jens; Obrig, Hellmuth; Wartenburger, Isabell
2010-01-01
Speech perception requires rapid extraction of the linguistic content from the acoustic signal. The ability to efficiently process rapid changes in auditory information is important for decoding speech and thereby crucial during language acquisition. Investigating functional networks of speech perception in infancy might elucidate neuronal ensembles supporting perceptual abilities that gate language acquisition. Interhemispheric specializations for language have been demonstrated in infants. How these asymmetries are shaped by basic temporal acoustic properties is under debate. We recently provided evidence that newborns process non-linguistic sounds sharing temporal features with language in a differential and lateralized fashion. The present study used the same material while measuring brain responses of 6 and 3 month old infants using simultaneous recordings of electroencephalography (EEG) and near-infrared spectroscopy (NIRS). NIRS reveals that the lateralization observed in newborns remains constant over the first months of life. While fast acoustic modulations elicit bilateral neuronal activations, slow modulations lead to right-lateralized responses. Additionally, auditory-evoked potentials and oscillatory EEG responses show differential responses for fast and slow modulations indicating a sensitivity for temporal acoustic variations. Oscillatory responses reveal an effect of development, that is, 6 but not 3 month old infants show stronger theta-band desynchronization for slowly modulated sounds. Whether this developmental effect is due to increasing fine-grained perception for spectrotemporal sounds in general remains speculative. Our findings support the notion that a more general specialization for acoustic properties can be considered the basis for lateralization of speech perception. The results show that concurrent assessment of vascular based imaging and electrophysiological responses have great potential in the research on language acquisition. 
PMID:21716574
Rossouw, Kate; Pascoe, Michelle
2018-03-19
Bilingualism is common in South Africa, with many children acquiring isiXhosa as a home language and learning English from a young age in nursery or crèche. IsiXhosa is a local language, part of the Bantu language family, widely spoken in the country. Aims: To describe changes in a bilingual child's speech following intervention based on a theoretically motivated and tailored intervention plan. Methods and procedures: This study describes a female isiXhosa-English bilingual child, named Gcobisa (pseudonym) (chronological age 4 years and 2 months) with a speech sound disorder. Gcobisa's speech was assessed and her difficulties categorised according to Dodd's (2005) diagnostic framework. From this, intervention was planned and the language of intervention was selected. Following intervention, Gcobisa's speech was reassessed. Outcomes and results: Gcobisa's speech was categorised as a consistent phonological delay as she presented with gliding of /l/ in both English and isiXhosa, cluster reduction in English and several other age appropriate phonological processes. She was provided with 16 sessions of intervention using a minimal pairs approach, targeting the phonological process of gliding of /l/, which was not considered age appropriate for Gcobisa in isiXhosa when compared to the small set of normative data regarding monolingual isiXhosa development. As a result, the targets and stimuli were in isiXhosa while the main language of instruction was English. This reflects the language mismatch often faced by speech language therapists in South Africa. Gcobisa showed evidence of generalising the target phoneme to English words. Conclusions and implications: The data have theoretical implications regarding bilingual development of isiXhosa-English, as it highlights the ways bilingual development may differ from the monolingual development of this language pair.
It adds to the small set of intervention studies investigating the changes in the speech of bilingual children following intervention. In addition, it contributes to the small amount of data gathered regarding typical bilingual acquisition of this language pair.
Intertrial auditory neural stability supports beat synchronization in preschoolers
Carr, Kali Woodruff; Tierney, Adam; White-Schwoch, Travis; Kraus, Nina
2016-01-01
The ability to synchronize motor movements along with an auditory beat places stringent demands on the temporal processing and sensorimotor integration capabilities of the nervous system. Links between millisecond-level precision of auditory processing and the consistency of sensorimotor beat synchronization implicate fine auditory neural timing as a mechanism for forming stable internal representations of, and behavioral reactions to, sound. Here, for the first time, we demonstrate a systematic relationship between consistency of beat synchronization and trial-by-trial stability of subcortical speech processing in preschoolers (ages 3 and 4 years old). We conclude that beat synchronization might provide a useful window into millisecond-level neural precision for encoding sound in early childhood, when speech processing is especially important for language acquisition and development. PMID:26760457
Speech-Sound Duration Processing in a Second Language is Specific to Phonetic Categories
ERIC Educational Resources Information Center
Nenonen, Sari; Shestakova, Anna; Huotilainen, Minna; Naatanen, Risto
2005-01-01
The mismatch negativity (MMN) component of the auditory event-related potential was used to determine the effect of native language, Russian, on the processing of speech-sound duration in a second language, Finnish, that uses duration as a cue for phonological distinction. The native-language effect was compared with Finnish vowels that either can…
ERIC Educational Resources Information Center
Lalonde, Kaylah; Holt, Rachael Frush
2014-01-01
Purpose: This preliminary investigation explored potential cognitive and linguistic sources of variance in 2-year-olds' speech-sound discrimination by using the toddler change/no-change procedure and examined whether modifications would result in a procedure that can be used consistently with younger 2-year-olds. Method: Twenty typically…
Speech-Sound Disorders and Attention-Deficit/Hyperactivity Disorder Symptoms
ERIC Educational Resources Information Center
Lewis, Barbara A.; Short, Elizabeth J.; Iyengar, Sudha K.; Taylor, H. Gerry; Freebairn, Lisa; Tag, Jessica; Avrich, Allison A.; Stein, Catherine M.
2012-01-01
Purpose: The purpose of this study was to examine the association of speech-sound disorders (SSD) with symptoms of attention-deficit/hyperactivity disorder (ADHD) by the severity of the SSD and the mode of transmission of SSD within the pedigrees of children with SSD. Participants and Methods: The participants were 412 children who were enrolled…
ERIC Educational Resources Information Center
Peter, Beate; Raskind, Wendy H.
2011-01-01
Purpose: To evaluate phenotypic expressions of speech sound disorder (SSD) in multigenerational families with evidence of familial forms of SSD. Method: Members of five multigenerational families (N = 36) produced rapid sequences of monosyllables and disyllables and tapped computer keys with repetitive and alternating movements. Results: Measures…
What Factors Place Children with Speech Sound Disorders at Risk for Reading Problems?
ERIC Educational Resources Information Center
Anthony, Jason L.; Aghara, Rachel Greenblatt; Dunkelberger, Martha J.; Anthony, Teresa I.; Williams, Jeffrey M.; Zhang, Zhou
2011-01-01
Purpose: To identify weaknesses in print awareness and phonological processing that place children with speech sound disorders (SSDs) at increased risk for reading difficulties. Method: Language, literacy, and phonological skills of 3 groups of preschool-age children were compared: a group of 68 children with SSDs, a group of 68 peers with normal…
ERIC Educational Resources Information Center
Brown, Carrie; And Others
This final report describes activities and outcomes of a research project on a sound-to-speech translation system utilizing a graphic mediation interface for students with severe disabilities. The STS/Graphics system is a voice recognition, computer-based system designed to allow individuals with mental retardation and/or severe physical…
ERIC Educational Resources Information Center
Goswami, Usha; Fosker, Tim; Huss, Martina; Mead, Natasha; Szucs, Denes
2011-01-01
Across languages, children with developmental dyslexia have a specific difficulty with the neural representation of the sound structure (phonological structure) of speech. One likely cause of their difficulties with phonology is a perceptual difficulty in auditory temporal processing (Tallal, 1980). Tallal (1980) proposed that basic auditory…
ERIC Educational Resources Information Center
Watson, Maggie M.; Lof, Gregory L.
2009-01-01
Purpose: The purpose of this article was to obtain and organize information from instructors who teach course work on the subject of children's speech sound disorders (SSD) regarding their use of teaching resources, involvement in students' clinical practica, and intervention approaches presented to students. Instructors also reported if they…
Psychometric Characteristics of Single-Word Tests of Children's Speech Sound Production
ERIC Educational Resources Information Center
Flipsen, Peter, Jr.; Ogiela, Diane A.
2015-01-01
Purpose: Our understanding of test construction has improved since the now-classic review by McCauley and Swisher (1984). The current review article examines the psychometric characteristics of current single-word tests of speech sound production in an attempt to determine whether our tests have improved since then. It also provides a resource…
A Longitudinal Investigation of Morpho-Syntax in Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Mortimer, Jennifer; Rvachew, Susan
2010-01-01
Purpose: The intent of this study was to examine the longitudinal morpho-syntactic progression of children with Speech Sound Disorders (SSD) grouped according to Mean Length of Utterance (MLU) scores. Methods: Thirty-seven children separated into four clusters were assessed in their pre-kindergarten and Grade 1 years. Cluster 1 were children with…
Possible-word constraints in Cantonese speech segmentation.
Yip, Michael C
2004-03-01
A Cantonese syllable-spotting experiment was conducted to examine whether the Possible-Word Constraint (PWC), proposed by Norris, McQueen, Cutler, and Butterfield (1997), can apply in Cantonese speech segmentation. In the experiment, listeners were asked to spot the target Cantonese syllable in a series of nonsense sound strings. Results suggested that listeners found it more difficult to spot the target syllable [kDm1] in nonsense sound strings in which it was attached to a single consonant [tkDm1] than in strings in which it was attached to either a vowel [a:kDm1] or a pseudo-syllable [khow1kDm1]. Finally, the current set of results further supports the view that the PWC is a language-universal mechanism for segmenting continuous speech.
Chorna, Olena D; Hamm, Ellyn L; Shrivastava, Hemang; Maitre, Nathalie L
2018-01-01
Atypical maturation of auditory neural processing contributes to preterm-born infants' language delays. Event-related potential (ERP) measurement of speech-sound differentiation might fill a gap in treatment-response biomarkers to auditory interventions. We evaluated whether these markers could measure treatment effects in a quasi-randomized prospective study. Hospitalized preterm infants in passive or active, suck-contingent mother's voice exposure groups were not different at baseline. Post-intervention, the active group had greater increases in /du/-/gu/ differentiation in left frontal and temporal regions. Infants with brain injury had lower baseline /ba/-/ga/ and /du/-/gu/ differentiation than those without. ERP provides valid discriminative, responsive, and predictive biomarkers of infant speech-sound differentiation.
Analysis of speech sounds is left-hemisphere predominant at 100-150ms after sound onset.
Rinne, T; Alho, K; Alku, P; Holi, M; Sinkkonen, J; Virtanen, J; Bertrand, O; Näätänen, R
1999-04-06
Hemispheric specialization of human speech processing has been found in brain imaging studies using fMRI and PET. Due to the restricted time resolution, these methods cannot, however, determine the stage of auditory processing at which this specialization first emerges. We used a dense electrode array covering the whole scalp to record the mismatch negativity (MMN), an event-related brain potential (ERP) automatically elicited by occasional changes in sounds, which ranged from non-phonetic (tones) to phonetic (vowels). MMN can be used to probe auditory central processing on a millisecond scale with no attention-dependent task requirements. Our results indicate that speech processing occurs predominantly in the left hemisphere at the early, pre-attentive level of auditory analysis.
Auditory cortical change detection in adults with Asperger syndrome.
Lepistö, Tuulia; Nieminen-von Wendt, Taina; von Wendt, Lennart; Näätänen, Risto; Kujala, Teija
2007-03-06
The present study investigated whether auditory deficits reported in children with Asperger syndrome (AS) are also present in adulthood. To this end, event-related potentials (ERPs) were recorded from adults with AS for duration, pitch, and phonetic changes in vowels, and for acoustically matched non-speech stimuli. These subjects had enhanced mismatch negativity (MMN) amplitudes particularly for pitch and duration deviants, indicating enhanced sound-discrimination abilities. Furthermore, as reflected by the P3a, their involuntary orienting was enhanced for changes in non-speech sounds, but tended to be deficient for changes in speech sounds. The results are consistent with those reported earlier in children with AS, except for the duration-MMN, which was diminished in children and enhanced in adults.
Johnson, Erin Phinney; Pennington, Bruce F.; Lowenstein, Joanna H.; Nittrouer, Susan
2011-01-01
Purpose: Children with speech sound disorder (SSD) and reading disability (RD) have poor phonological awareness, a problem believed to arise largely from deficits in processing the sensory information in speech, specifically individual acoustic cues. However, such cues are details of acoustic structure. Recent theories suggest that listeners also need to be able to integrate those details to perceive linguistically relevant form. This study examined abilities of children with SSD, RD, and SSD+RD not only to process acoustic cues but also to recover linguistically relevant form from the speech signal. Method: Ten- to 11-year-olds with SSD (n = 17), RD (n = 16), SSD+RD (n = 17), and Controls (n = 16) were tested to examine their sensitivity to (1) voice onset times (VOT); (2) spectral structure in fricative-vowel syllables; and (3) vocoded sentences. Results: Children in all groups performed similarly with VOT stimuli, but children with disorders showed delays on other tasks, although the specifics of their performance varied. Conclusion: Children with poor phonemic awareness not only lack sensitivity to acoustic details, but are also less able to recover linguistically relevant forms. This is contrary to one of the main current theories of the relation between spoken and written language development. PMID:21329941
Speech motor planning and execution deficits in early childhood stuttering.
Walsh, Bridget; Mettel, Kathleen Marie; Smith, Anne
2015-01-01
Five to eight percent of preschool children develop stuttering, a speech disorder with clearly observable, hallmark symptoms: sound repetitions, prolongations, and blocks. While the speech motor processes underlying stuttering have been widely documented in adults, few studies to date have assessed the speech motor dynamics of stuttering near its onset. We assessed fundamental characteristics of speech movements in preschool children who stutter and their fluent peers to determine if atypical speech motor characteristics described for adults are early features of the disorder or arise later in the development of chronic stuttering. Orofacial movement data were recorded from 58 children who stutter and 43 children who do not stutter aged 4;0 to 5;11 (years; months) in a sentence production task. For single speech movements and multiple speech movement sequences, we computed displacement amplitude, velocity, and duration. For the phrase level movement sequence, we computed an index of articulation coordination consistency for repeated productions of the sentence. Boys who stutter, but not girls, produced speech with reduced amplitudes and velocities of articulatory movement. All children produced speech with similar durations. Boys, particularly the boys who stuttered, had more variable patterns of articulatory coordination compared to girls. This study is the first to demonstrate sex-specific differences in speech motor control processes between preschool boys and girls who are stuttering. The sex-specific lag in speech motor development in many boys who stutter likely has significant implications for the dramatically different recovery rates between male and female preschoolers who stutter. Further, our findings document that atypical speech motor development is an early feature of stuttering.
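The three kinematic measures named in the abstract above (displacement amplitude, velocity, and duration) can be illustrated with a minimal sketch. The abstract does not give the authors' exact definitions, so the ones used here (range of position, peak absolute first-difference velocity, and elapsed time of the sampled segment) are assumptions, as are the function name, the units, and the sample values.

```python
def movement_measures(positions_mm, sample_rate_hz):
    """Summarise one sampled movement trajectory.

    Assumed definitions (not taken from the study):
      amplitude     = max position - min position
      peak velocity = largest absolute first difference, scaled to mm/s
      duration      = time spanned by the samples
    """
    if len(positions_mm) < 2:
        raise ValueError("need at least two samples")
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    # First-difference velocity estimate between consecutive samples.
    velocities = [
        (b - a) * sample_rate_hz
        for a, b in zip(positions_mm, positions_mm[1:])
    ]
    return {
        "amplitude_mm": max(positions_mm) - min(positions_mm),
        "peak_velocity_mm_s": max(abs(v) for v in velocities),
        "duration_s": (len(positions_mm) - 1) / sample_rate_hz,
    }


# Hypothetical lower-lip trajectory sampled at 100 Hz (invented values).
measures = movement_measures([0.0, 1.0, 3.0, 6.0, 8.0, 9.0], 100.0)
```

In practice such trajectories are usually smoothed before differentiation, since first differences amplify measurement noise; that step is omitted here for brevity.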
NASA Astrophysics Data System (ADS)
Ryan, Timothy James
The effects of multiple arrivals on the intelligibility of speech produced by live-sound reinforcement systems are examined. The intent is to determine if correlations exist between the manipulation of sound system optimization parameters and the subjective attribute speech intelligibility. Given the number, and wide range, of variables involved, this exploratory research project attempts to narrow the focus of further studies. Investigated variables are delay time between signals arriving from multiple elements of a loudspeaker array, array type and geometry and the two-way interactions of speech-to-noise ratio and array geometry with delay time. Intelligibility scores were obtained through subjective evaluation of binaural recordings, reproduced via headphones, using the Modified Rhyme Test. These word-score results are compared with objective measurements of Speech Transmission Index (STI). Results indicate that both variables, delay time and array geometry, have significant effects on intelligibility. Additionally, it is seen that all three of the possible two-way interactions have significant effects. Results further reveal that the STI measurement method overestimates the decrease in intelligibility due to short delay times between multiple arrivals.
Loebach, Jeremy L; Pisoni, David B; Svirsky, Mario A
2009-12-01
The objective of this study was to assess whether training on speech processed with an eight-channel noise vocoder to simulate the output of a cochlear implant would produce transfer of auditory perceptual learning to the recognition of nonspeech environmental sounds, the identification of speaker gender, and the discrimination of talkers by voice. Twenty-four normal-hearing subjects were trained to transcribe meaningful English sentences processed with a noise vocoder simulation of a cochlear implant. An additional 24 subjects served as an untrained control group and transcribed the same sentences in their unprocessed form. All subjects completed pre- and post-test sessions in which they transcribed vocoded sentences to provide an assessment of training efficacy. Transfer of perceptual learning was assessed using a series of closed set, nonlinguistic tasks: subjects identified talker gender, discriminated the identity of pairs of talkers, and identified ecologically significant environmental sounds from a closed set of alternatives. Although both groups of subjects showed significant pre- to post-test improvements, subjects who transcribed vocoded sentences during training performed significantly better at post-test than those in the control group. Both groups performed equally well on gender identification and talker discrimination. Subjects who received explicit training on the vocoded sentences, however, performed significantly better on environmental sound identification than the untrained subjects. Moreover, across both groups, pre-test speech performance and, to a higher degree, post-test speech performance, were significantly correlated with environmental sound identification. For both groups, environmental sounds that were characterized as having more salient temporal information were identified more often than environmental sounds that were characterized as having more salient spectral information. 
Listeners trained to identify noise-vocoded sentences showed evidence of transfer of perceptual learning to the identification of environmental sounds. In addition, the correlation between environmental sound identification and sentence transcription indicates that subjects who were better able to use the degraded acoustic information to identify the environmental sounds were also better able to transcribe the linguistic content of novel sentences. Both trained and untrained groups performed equally well (approximately 75% correct) on the gender-identification task, indicating that training did not have an effect on the ability to identify the gender of talkers. Although better than chance, performance on the talker discrimination task was poor overall (approximately 55%), suggesting that either explicit training is required to discriminate talkers' voices reliably or that additional information (perhaps spectral in nature) not present in the vocoded speech is required to excel in such tasks. Taken together, the results suggest that although transfer of auditory perceptual learning with spectrally degraded speech does occur, explicit task-specific training may be necessary for tasks that cannot rely on temporal information alone.
Perception of Intersensory Synchrony in Audiovisual Speech: Not that Special
ERIC Educational Resources Information Center
Vroomen, Jean; Stekelenburg, Jeroen J.
2011-01-01
Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired ("unity assumption"). Participants made…
Numerical Models for Sound Propagation in Long Spaces
NASA Astrophysics Data System (ADS)
Lai, Chenly Yuen Cheung
Both reverberation time and steady-state sound field are the key elements for assessing the acoustic condition in an enclosed space. They affect the noise propagation, speech intelligibility, clarity index, and definition. Since the sound field in a long space is non-diffuse, classical room acoustics theory does not apply in this situation. The ray tracing technique and the image source method are two common models used to predict both reverberation time and steady-state sound field in long enclosures. Although both models can give an accurate estimate of reverberation times and steady-state sound field directly or indirectly, they often involve time-consuming calculations. In order to simplify the acoustic consideration, a theoretical formulation has been developed for predicting both steady-state sound fields and reverberation times in street canyons. The prediction model is further developed to predict the steady-state sound field in a long enclosure. Apart from the straight long enclosure, there are other variations such as a cross junction, a long enclosure with a T-intersection, and a U-turn long enclosure. In the present study, theoretical and experimental investigations were conducted to develop formulae for predicting reverberation times and steady-state sound fields in a junction of a street canyon and in a long enclosure with a T-intersection. The theoretical models are validated by comparing the numerical predictions with published experimental results. The theoretical results are also compared with precise indoor measurements and large-scale outdoor experimental results. Most previous acoustical studies of long enclosures have focused on the monopole sound source. Besides non-directional noise sources, many noise sources in long enclosures are dipole-like, such as train noise and fan noise. In order to study the characteristics of directional noise sources, a review of available dipole sources was conducted.
A dipole was constructed, which was subsequently used for experimental studies. In addition, a theoretical model was developed for predicting dipole sound fields. The theoretical model can be used to study the effect of a dipole source on the speech intelligibility in long enclosures.
Ding, Nai; Pan, Xunyi; Luo, Cheng; Su, Naifei; Zhang, Wen; Zhang, Jianfeng
2018-01-31
How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention. SIGNIFICANCE STATEMENT Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention.
Wolfe, Jace; Schafer, Erin; Parkinson, Aaron; John, Andrew; Hudson, Mary; Wheeler, Julie; Mucci, Angie
2013-01-01
The objective of this study was to compare speech recognition in quiet and in noise for cochlear implant recipients using two different types of personal frequency modulation (FM) systems (directly coupled [direct auditory input] versus induction neckloop) with each of two sound processors (Cochlear Nucleus Freedom versus Cochlear Nucleus 5). Two different experiments were conducted within this study. In both these experiments, mixing of the FM signal within the Freedom processor was implemented via the same scheme used clinically for the Freedom sound processor. In Experiment 1, the aforementioned comparisons were conducted with the Nucleus 5 programmed so that the microphone and FM signals were mixed and then the mixed signals were subjected to autosensitivity control (ASC). In Experiment 2, comparisons between the two FM systems and processors were conducted again with the Nucleus 5 programmed to provide a more complex multistage implementation of ASC during the preprocessing stage. This study was a within-subject, repeated-measures design. Subjects were recruited from the patient population at the Hearts for Hearing Foundation in Oklahoma City, OK. Fifteen subjects participated in Experiment 1, and 16 subjects participated in Experiment 2. Subjects were adults who had used either unilateral or bilateral cochlear implants for at least 1 year. In this experiment, no differences were found in speech recognition in quiet obtained with the two different FM systems or the various sound-processor conditions. With each sound processor, speech recognition in noise was better with the directly coupled direct auditory input system relative to the neckloop system. The multistage ASC processing of the Nucleus 5 sound processor provided better performance than the single-stage approach for the Nucleus 5 and the Nucleus Freedom sound processor. 
Speech recognition in noise is substantially affected by the type of sound processor, FM system, and implementation of ASC used by a cochlear implant recipient.
Affective Properties of Mothers' Speech to Infants with Hearing Impairment and Cochlear Implants
ERIC Educational Resources Information Center
Kondaurova, Maria V.; Bergeson, Tonya R.; Xu, Huiping; Kitamura, Christine
2015-01-01
Purpose: The affective properties of infant-directed speech influence the attention of infants with normal hearing to speech sounds. This study explored the affective quality of maternal speech to infants with hearing impairment (HI) during the 1st year after cochlear implantation as compared to speech to infants with normal hearing. Method:…
McNeill, Brigid C; Wolter, Julie; Gillon, Gail T
2017-05-17
This study explored the specific nature of a spelling impairment in children with speech sound disorder (SSD) in relation to metalinguistic predictors of spelling development. The metalinguistic (phoneme, morphological, and orthographic awareness) and spelling development of 28 children ages 6-8 years with a history of inconsistent SSD were compared to those of their age-matched (n = 28) and reading-matched (n = 28) peers. Analysis of the literacy outcomes of children within the cohort with persistent (n = 18) versus resolved (n = 10) SSD was also conducted. The age-matched peers outperformed the SSD group on all measures. Children with SSD performed comparably to their reading-matched peers on metalinguistic measures but exhibited lower spelling scores. Children with persistent SSD generally had less favorable outcomes than children with resolved SSD; however, even children with resolved SSD performed poorly on normative spelling measures. Children with SSD have a specific difficulty with spelling that is not commensurate with their metalinguistic and reading ability. Although low metalinguistic awareness appears to inhibit these children's spelling development, other factors should be considered, such as nonverbal rehearsal during spelling attempts and motoric ability. Integration of speech-production and spelling-intervention goals is important to enhance literacy outcomes for this group.
Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan
2009-06-01
The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p < 0.001) than with the 30 IIDR. Group mean CNC scores at 60 dB SPL, loudness ratings, and the signal to noise ratios-50 for Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.
A systematic review of treatment intensity in speech disorders.
Kaipa, Ramesh; Peterson, Abigail Marie
2016-12-01
Treatment intensity (sometimes referred to as "practice amount") has been well investigated in learning non-speech tasks, but its role in treating speech disorders has not been widely analysed. This study reviewed the literature regarding treatment intensity in speech disorders. A systematic search was conducted in four databases using appropriate search terms. Seven articles from a total of 580 met the inclusion criteria. The speech disorders investigated included speech sound disorders, dysarthria, acquired apraxia of speech and childhood apraxia of speech. All seven studies were evaluated for their methodological quality, research phase and evidence level. Evidence level of reviewed studies ranged from moderate to strong. With regard to the research phase, only one study was considered to be phase III research, which corresponds to the controlled trial phase. The remaining studies were considered to be phase II research, which corresponds to the phase where magnitude of therapeutic effect is assessed. Results suggested that higher treatment intensity was favourable over lower treatment intensity of specific treatment technique(s) for treating childhood apraxia of speech and speech sound (phonological) disorders. Future research should incorporate randomised-controlled designs to establish optimal treatment intensity that is specific to each of the speech disorders.
Lexical and phonological variability in preschool children with speech sound disorder.
Macrae, Toby; Tyler, Ann A; Lewis, Kerry E
2014-02-01
The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.
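The Pearson product-moment correlation coefficient mentioned in the abstract above is the covariance of two measures divided by the product of their standard deviations. The following is a minimal sketch of that calculation; the function name and the score lists are hypothetical and invented for illustration, and they are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sums of cross-products and squares; the 1/(n-1) factors cancel in r.
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five children (not from the study).
word_variability = [0.10, 0.25, 0.40, 0.55, 0.70]
receptive_vocabulary = [110, 104, 101, 95, 90]
r = pearson_r(word_variability, receptive_vocabulary)  # negative for these values
```

A negative coefficient here corresponds to the direction the abstract reports, in which higher word variability goes with smaller receptive vocabularies; with a sample of 18, a moderate coefficient can still fail to reach significance.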
The prevalence of speech disorder in primary school students in Yazd-Iran.
Karbasi, Sedighah Akhavan; Fallah, Razieh; Golestan, Motaharah
2011-01-01
Communication disorders are widespread, disabling problems associated with adverse long-term outcomes that affect individuals, families, and the academic achievement of children in the school years, and that affect vocational choices later in adulthood. The aim of this study was to determine the prevalence of speech disorders, specifically stuttering, voice, and speech-sound disorders, in primary school students in Yazd, Iran. In a descriptive study, 7881 primary school students in Yazd were evaluated for speech disorders using a direct, face-to-face assessment technique in 2005. The overall prevalence of speech disorders was 14.8%: 13.8% had speech-sound disorder, 1.2% stuttering, and 0.47% voice disorder. The prevalence of speech disorders was higher in males (16.7%) than in females (12.7%). The pattern of prevalence of the three speech disorders differed significantly by gender, parental education, and number of family members. There was no significant association between speech disorders and birth order, religion, or paternal consanguinity. These prevalence figures are higher than those of most studies that used parent or teacher reports.
Maitre, Nathalie L.; Slaughter, James C.; Aschner, Judy L.; Key, Alexandra P.
2014-01-01
Neurodevelopmental delays in intensive care neonates are common but difficult to predict. In children, hemisphere differences in cortical processing of speech are predictive of cognitive performance. We hypothesized that hemisphere differences in auditory event-related potentials in intensive care neonates are predictive of neurodevelopment in infancy, even in those born preterm. Event-related potentials to speech sounds were prospectively recorded in 57 infants (gestational age 24–40 weeks) prior to discharge. The Developmental Assessment of Young Children was performed at 6 and 12 months. Hemisphere differences in mean amplitudes increased with postnatal age (P < .01) but not with gestational age. Greater hemisphere differences were associated with improved communication and cognitive scores at 6 and 12 months, but decreased in significance at 12 months after adjusting for socioeconomic and clinical factors. Auditory cortical responses can be used in intensive care neonates to help identify infants at higher risk for delays in infancy. PMID:23864588
Chang, Son-A; Won, Jong Ho; Kim, HyangHee; Oh, Seung-Ha; Tyler, Richard S.; Cho, Chang Hyun
2018-01-01
Background and Objectives: It is important to understand the frequency region of cues used, and not used, by cochlear implant (CI) recipients. Speech and environmental sound recognition by individuals with CI and normal-hearing (NH) was measured. Gradients were also computed to evaluate the pattern of change in identification performance with respect to the low-pass filtering or high-pass filtering cutoff frequencies. Subjects and Methods: Frequency-limiting effects were implemented in the acoustic waveforms by passing the signals through low-pass filters (LPFs) or high-pass filters (HPFs) with seven different cutoff frequencies. Identification of Korean vowels and consonants produced by a male and female speaker and environmental sounds was measured. Crossover frequencies were determined for each identification test, where the LPF and HPF conditions show identical identification scores. Results: CI and NH subjects showed changes in identification performance in a similar manner as a function of cutoff frequency for the LPF and HPF conditions, suggesting that the degraded spectral information in the acoustic signals may similarly constrain the identification performance for both subject groups. However, CI subjects were generally less efficient than NH subjects in using the limited spectral information for speech and environmental sound identification due to the inefficient coding of acoustic cues through the CI sound processors. Conclusions: This finding provides vital information, for Korean, for understanding how the frequency information received for speech and environmental sounds through a CI processor differs from that in normal hearing. PMID:29325391
Chang, Son-A; Won, Jong Ho; Kim, HyangHee; Oh, Seung-Ha; Tyler, Richard S; Cho, Chang Hyun
2017-12-01
It is important to understand the frequency region of cues used, and not used, by cochlear implant (CI) recipients. Speech and environmental sound recognition by individuals with CI and normal-hearing (NH) was measured. Gradients were also computed to evaluate the pattern of change in identification performance with respect to the low-pass filtering or high-pass filtering cutoff frequencies. Frequency-limiting effects were implemented in the acoustic waveforms by passing the signals through low-pass filters (LPFs) or high-pass filters (HPFs) with seven different cutoff frequencies. Identification of Korean vowels and consonants produced by a male and female speaker and environmental sounds was measured. Crossover frequencies were determined for each identification test, where the LPF and HPF conditions show identical identification scores. CI and NH subjects showed changes in identification performance in a similar manner as a function of cutoff frequency for the LPF and HPF conditions, suggesting that the degraded spectral information in the acoustic signals may similarly constrain the identification performance for both subject groups. However, CI subjects were generally less efficient than NH subjects in using the limited spectral information for speech and environmental sound identification due to the inefficient coding of acoustic cues through the CI sound processors. This finding provides vital information, for Korean, for understanding how the frequency information received for speech and environmental sounds through a CI processor differs from that in normal hearing.
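The crossover frequency described above is the cutoff at which the low-pass and high-pass score curves meet. One way to estimate it from scores measured at a few discrete cutoffs is sketched below. The abstract does not state how the authors located the crossover, so the interpolation (linear in score, on a logarithmic frequency axis) is an assumption, and the function name, cutoffs, and scores are hypothetical.

```python
import math


def crossover_frequency(cutoffs_hz, lpf_scores, hpf_scores):
    """Estimate where the LPF and HPF identification curves intersect.

    Cutoffs must be ascending and positive. Returns the crossover in Hz,
    or None if the two curves never meet within the measured range.
    """
    if not (len(cutoffs_hz) == len(lpf_scores) == len(hpf_scores)):
        raise ValueError("all three sequences must have the same length")
    diffs = [lp - hp for lp, hp in zip(lpf_scores, hpf_scores)]
    for i in range(len(diffs) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0:
            return float(cutoffs_hz[i])
        if d0 * d1 < 0:
            # Sign change: interpolate linearly on a log2 frequency axis.
            t = d0 / (d0 - d1)
            lo = math.log2(cutoffs_hz[i])
            hi = math.log2(cutoffs_hz[i + 1])
            return 2.0 ** (lo + t * (hi - lo))
    if diffs and diffs[-1] == 0:
        return float(cutoffs_hz[-1])
    return None


# Hypothetical percent-correct scores at four cutoffs (the study used seven).
cutoffs = [500, 1000, 2000, 4000]
lpf = [20, 40, 80, 95]   # improves as the low-pass cutoff rises
hpf = [95, 80, 40, 20]   # worsens as the high-pass cutoff rises
fc = crossover_frequency(cutoffs, lpf, hpf)  # falls between 1000 and 2000 Hz
```

A logarithmic axis is used because filter cutoffs in such experiments are typically spaced in octaves or similar ratios; on a linear axis the interpolated value would be somewhat higher.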
Using a new, free spectrograph program to critically investigate acoustics
NASA Astrophysics Data System (ADS)
Ball, Edward; Ruiz, Michael J.
2016-11-01
We have developed an online spectrograph program with a bank of over 30 audio clips to visualise a variety of sounds. Our audio library includes everyday sounds such as speech, singing, musical instruments, birds, a baby, cat, dog, sirens, a jet, thunder, and screaming. We provide a link to a video of the sound sources superimposed with their respective spectrograms in real time. Readers can use our spectrograph program to view our library, open their own desktop audio files, and use the program in real time with a computer microphone.
One approach to design of speech emotion database
NASA Astrophysics Data System (ADS)
Uhrin, Dominik; Chmelikova, Zdenka; Tovarek, Jaromir; Partila, Pavol; Voznak, Miroslav
2016-05-01
This article describes a system for evaluating the credibility of recordings with emotional character. The sound recordings form a Czech-language database for training and testing speech emotion recognition systems. These systems are designed to detect human emotions from the voice. Knowledge of a person's emotional state is useful in the security forces and emergency call services. Personnel in action (soldiers, police officers, and firefighters) are often exposed to stress. Information about their emotional state (from the voice) helps the dispatcher adapt control commands for the intervention procedure. Call agents of emergency call services must recognize the mental state of the caller to adjust the mood of the conversation. In this case, the evaluation of the psychological state is the key factor for successful intervention. A quality database of sound recordings is essential for the creation of the mentioned systems. There are quality databases such as the Berlin Database of Emotional Speech or Humaine. These databases were created by actors in an audio studio, which means that the recordings contain simulated emotions, not real ones. Our research aims at creating a database of Czech emotional recordings of real human speech. Collecting sound samples for the database is only one of the tasks. Another, no less important, is to evaluate the significance of recordings from the perspective of emotional states. The design of a methodology for evaluating the credibility of emotional recordings is described in this article. The results describe the advantages and applicability of the developed method.
The Dynamic Nature of Speech Perception
ERIC Educational Resources Information Center
McQueen, James M.; Norris, Dennis; Cutler, Anne
2006-01-01
The speech perception system must be flexible in responding to the variability in speech sounds caused by differences among speakers and by language change over the lifespan of the listener. Indeed, listeners use lexical knowledge to retune perception of novel speech (Norris, McQueen, & Cutler, 2003). In that study, Dutch listeners made…
Normal Aspects of Speech, Hearing, and Language.
ERIC Educational Resources Information Center
Minifie, Fred. D., Ed.; And Others
This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…
Hemispheric Differences in the Effects of Context on Vowel Perception
ERIC Educational Resources Information Center
Sjerps, Matthias J.; Mitterer, Holger; McQueen, James M.
2012-01-01
Listeners perceive speech sounds relative to context. Contextual influences might differ over hemispheres if different types of auditory processing are lateralized. Hemispheric differences in contextual influences on vowel perception were investigated by presenting speech targets and both speech and non-speech contexts to listeners' right or left…
Intensive treatment of speech disorders in robin sequence: a case report.
Pinto, Maria Daniela Borro; Pegoraro-Krook, Maria Inês; Andrade, Laura Katarine Félix de; Correa, Ana Paula Carvalho; Rosa-Lugo, Linda Iris; Dutka, Jeniffer de Cássia Rillo
2017-10-23
To describe the speech of a patient with Pierre Robin Sequence (PRS) and severe speech disorders before and after participating in an Intensive Speech Therapy Program (ISTP). The ISTP consisted of two daily sessions of therapy over a 36-week period, resulting in a total of 360 therapy sessions. The sessions included the phases of establishment, generalization, and maintenance. A combination of strategies, such as modified contrast therapy and speech sound perception training, were used to elicit adequate place of articulation. The ISTP addressed correction of place of production of oral consonants and maximization of movement of the pharyngeal walls with a speech bulb reduction program. Therapy targets were addressed at the phonetic level with a gradual increase in the complexity of the productions hierarchically (e.g., syllables, words, phrases, conversation) while simultaneously addressing the velopharyngeal hypodynamism with speech bulb reductions. Re-evaluation after the ISTP revealed normal speech resonance and articulation with the speech bulb. Nasoendoscopic assessment indicated consistent velopharyngeal closure for all oral sounds with the speech bulb in place. Intensive speech therapy, combined with the use of the speech bulb, yielded positive outcomes in the rehabilitation of a clinical case with severe speech disorders associated with velopharyngeal dysfunction in Pierre Robin Sequence.
The role of reverberation-related binaural cues in the externalization of speech.
Catic, Jasmina; Santurette, Sébastien; Dau, Torsten
2015-08-01
The perception of externalization of speech sounds was investigated with respect to the monaural and binaural cues available at the listeners' ears in a reverberant environment. Individualized binaural room impulse responses (BRIRs) were used to simulate externalized sound sources via headphones. The measured BRIRs were subsequently modified such that the proportion of the response containing binaural vs monaural information was varied. Normal-hearing listeners were presented with speech sounds convolved with such modified BRIRs. Monaural reverberation cues were found to be sufficient for the externalization of a lateral sound source. In contrast, for a frontal source, an increased amount of binaural cues from reflections was required in order to obtain well externalized sound images. It was demonstrated that the interaction between the interaural cues of the direct sound and the reverberation strongly affects the perception of externalization. An analysis of the short-term binaural cues showed that the amount of fluctuations of the binaural cues corresponded well to the externalization ratings obtained in the listening tests. The results further suggested that the precedence effect is involved in the auditory processing of the dynamic binaural cues that are utilized for externalization perception.
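The headphone simulation described above rests on convolving a dry speech signal with a left-ear and a right-ear impulse response. The following is a minimal sketch of that step only, assuming toy sample values; the names `convolve` and `render_binaural` and all numbers are hypothetical and are not taken from the study, which used measured, individualized BRIRs.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(signal, brir_left, brir_right):
    """Return (left, right) ear signals for headphone playback."""
    return convolve(signal, brir_left), convolve(signal, brir_right)


# Toy values for illustration only (not measured data).
speech = [1.0, 0.5, -0.25]
brir_left = [1.0, 0.0, 0.3]         # direct sound plus one reflection
brir_right = [0.8, 0.0, 0.0, 0.2]   # weaker direct sound, later reflection

left, right = render_binaural(speech, brir_left, brir_right)
```

Each output is as long as the signal plus the impulse response minus one sample. A practical implementation would normally use FFT-based convolution, since real BRIRs contain thousands of samples.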
Dance, Stephen; Backus, Bradford; Morales, Lorenzo
2018-01-01
Introduction: The effect of a sound reinforcement system, in terms of speech intelligibility, has been systematically determined under realistic conditions. Different combinations of ambient and reverberant conditions representative of a classroom environment have been investigated. Materials and Methods: By comparing the measured speech transmission index metric with and without the system in the same space under different room acoustics conditions, it was possible to determine when the system was most effective. A new simple criterion, equivalent noise reduction (ENR), was introduced to determine the effectiveness of the sound reinforcement system which can be used to predict the speech transmission index based on the ambient sound pressure and reverberation time with and without amplification. Results: This criterion had a correlation, R² > 0.97. It was found that sound reinforcement provided no benefit if the competing noise level was less than 40 dBA. However, the maximum benefit of such a system was equivalent to a 7.7 dBA noise reduction. Conclusion: Using the ENR model, it would be possible to determine the suitability of implementing sound reinforcement systems in any room, thus providing a tool to determine if natural acoustic treatment or sound field amplification would be of most benefit to the occupants of any particular room. PMID:29785972
Early electrophysiological markers of atypical language processing in prematurely born infants.
Paquette, Natacha; Vannasing, Phetsamone; Tremblay, Julie; Lefebvre, Francine; Roy, Marie-Sylvie; McKerral, Michelle; Lepore, Franco; Lassonde, Maryse; Gallagher, Anne
2015-12-01
Because nervous system development may be affected by prematurity, many prematurely born children present language or cognitive disorders at school age. The goal of this study is to investigate whether these impairments can be identified early in life using electrophysiological auditory event-related potentials (AERPs) and mismatch negativity (MMN). Brain responses to speech and non-speech stimuli were assessed in prematurely born children to identify early electrophysiological markers of language and cognitive impairments. Participants were 74 children (41 full-term, 33 preterm) aged 3, 12, and 36 months. Pre-attentional auditory responses (MMN and AERPs) were assessed using an oddball paradigm, with speech and non-speech stimuli presented in counterbalanced order between participants. Language and cognitive development were assessed using the Bayley Scale of Infant Development, Third Edition (BSID-III). Results show that preterms as young as 3 months old had delayed MMN response to speech stimuli compared to full-terms. A significant negative correlation was also found between MMN latency to speech sounds and the BSID-III expressive language subscale. However, no significant differences between full-terms and preterms were found for the MMN to non-speech stimuli, suggesting preserved pre-attentional auditory discrimination abilities in these children. Identification of early electrophysiological markers for delayed language development could facilitate timely interventions.
Perception and Confusion of Speech Sounds by Adults with a Cochlear Implant
ERIC Educational Resources Information Center
Rodvik, Arne K.
2008-01-01
The aim of this pilot study was to identify the most common speech sound confusions of 5 Norwegian cochlear implanted post-lingually deafened adults. We played recorded nonwords, aCa, iCi and bVb, to our informants, asked them to repeat what they heard, recorded their repetitions and transcribed these phonetically. We arranged the collected data…
ERIC Educational Resources Information Center
Powell, Thomas W.
2008-01-01
Purpose: The use of nonspeech oral motor treatments (NSOMTs) in the management of pediatric speech sound production disorders is controversial. This article serves as a prologue to a clinical forum that examines this topic in depth. Method: Theoretical, historical, and ethical issues are reviewed to create a series of clinical questions that…
Literacy Outcomes of Children with Early Childhood Speech Sound Disorders: Impact of Endophenotypes
ERIC Educational Resources Information Center
Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Hansen, Amy J.; Sucheston, Lara E.; Kuo, Iris; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.
2011-01-01
Purpose: To demonstrate that early childhood speech sound disorders (SSD) and later school-age reading, written expression, and spelling skills are influenced by shared endophenotypes that may be in part genetic. Method: Children with SSD and their siblings were assessed at early childhood (ages 4-6 years) and followed at school age (7-12 years).…
ERIC Educational Resources Information Center
Osnes, Berge; Hugdahl, Kenneth; Hjelmervik, Helene; Specht, Karsten
2012-01-01
In studies on auditory speech perception, participants are often asked to perform active tasks, e.g. decide whether the perceived sound is a speech sound or not. However, information about the stimulus, inherent in such tasks, may induce expectations that cause altered activations not only in the auditory cortex, but also in frontal areas such as…
ERIC Educational Resources Information Center
Overby, Megan; Carrell, Thomas; Bernthal, John
2007-01-01
Purpose: This study examined 2nd-grade teachers' perceptions of the academic, social, and behavioral competence of students with speech sound disorders (SSDs). Method: Forty-eight 2nd-grade teachers listened to 2 groups of sentences differing by intelligibility and pitch but spoken by a single 2nd grader. For each sentence group, teachers rated…
ERIC Educational Resources Information Center
Blau, Vera; Reithler, Joel; van Atteveldt, Nienke; Seitz, Jochen; Gerretsen, Patty; Goebel, Rainer; Blomert, Leo
2010-01-01
Learning to associate auditory information of speech sounds with visual information of letters is a first and critical step for becoming a skilled reader in alphabetic languages. Nevertheless, it remains largely unknown which brain areas subserve the learning and automation of such associations. Here, we employ functional magnetic resonance…
Macrae, Toby; Tyler, Ann A
2014-10-01
The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different tests of articulation/phonology, percent consonants correct, and the number of omission, substitution, distortion, typical, and atypical error patterns used in the production of different wordlists that had similar levels of phonetic and structural complexity. In comparison with children with SSD only, children with SSD and LI used similar numbers but different types of errors, including more omission patterns (p < .001, d = 1.55) and fewer distortion patterns (p = .022, d = 1.03). There were no significant differences in substitution, typical, and atypical error pattern use. Frequent omission error pattern use may reflect a more compromised linguistic system characterized by absent phonological representations for target sounds (see Shriberg et al., 2005). Research is required to examine the diagnostic potential of early frequent omission error pattern use in predicting later diagnoses of co-occurring SSD and LI and/or reading problems.
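The effect sizes reported above are d values for a two-group comparison. As a rough sketch of how such a value is commonly computed, the snippet below implements Cohen's d with a pooled standard deviation. The abstract does not say which variant of d the authors used, so the pooled form is an assumption, and the two groups of counts are hypothetical numbers invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) +
                  (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical omission-pattern counts per child (not study data).
ssd_li = [6, 7, 8, 9, 10]
ssd_only = [4, 5, 6, 7, 8]

d = cohens_d(ssd_li, ssd_only)
```

A positive d here means the first group has the higher mean; by the usual rule of thumb, values around 0.8 or above are described as large.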
Action planning and predictive coding when speaking
Wang, Jun; Mathalon, Daniel H.; Roach, Brian J.; Reilly, James; Keedy, Sarah; Sweeney, John A.; Ford, Judith M.
2014-01-01
Across the animal kingdom, sensations resulting from an animal's own actions are processed differently from sensations resulting from external sources, with self-generated sensations being suppressed. A forward model has been proposed to explain this process across sensorimotor domains. During vocalization, reduced processing of one's own speech is believed to result from a comparison of speech sounds to corollary discharges of intended speech production generated from efference copies of commands to speak. Until now, anatomical and functional evidence validating this model in humans has been indirect. Using EEG with anatomical MRI to facilitate source localization, we demonstrate that inferior frontal gyrus activity during the 300ms before speaking was associated with suppressed processing of speech sounds in auditory cortex around 100ms after speech onset (N1). These findings indicate that an efference copy from speech areas in prefrontal cortex is transmitted to auditory cortex, where it is used to suppress processing of anticipated speech sounds. About 100ms after N1, a subsequent auditory cortical component (P2) was not suppressed during talking. The combined N1 and P2 effects suggest that although sensory processing is suppressed as reflected in N1, perceptual gaps are filled as reflected in the lack of P2 suppression, explaining the discrepancy between sensory suppression and preserved sensory experiences. These findings, coupled with the coherence between relevant brain regions before and during speech, provide new mechanistic understanding of the complex interactions between action planning and sensory processing that provide for differentiated tagging and monitoring of one's own speech, processes disrupted in neuropsychiatric disorders. PMID:24423729
Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D.; Senn, Pascal
2013-01-01
Objective: To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods: Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Results: Higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Conclusion: Webcameras have the potential to improve telecommunication of hearing-impaired individuals. PMID:23359119
Early Vocabulary Development in Children with Bilateral Cochlear Implants
ERIC Educational Resources Information Center
Välimaa, Taina; Kunnari, Sari; Laukkanen-Nevala, Päivi; Lonka, Eila
2018-01-01
Background: Children with unilateral cochlear implants (CIs) may have delayed vocabulary development for an extended period after implantation. Bilateral cochlear implantation is reported to be associated with improved sound localization and enhanced speech perception in noise. This study proposed that bilateral implantation might also promote…
Geometric Constraints on Human Speech Sound Inventories
Dunbar, Ewan; Dupoux, Emmanuel
2016-01-01
We investigate the idea that the languages of the world have developed coherent sound systems in which having one sound increases or decreases the chances of having certain other sounds, depending on shared properties of those sounds. We investigate the geometries of sound systems that are defined by the inherent properties of sounds. We document three typological tendencies in sound system geometries: economy, a tendency for the differences between sounds in a system to be definable on a relatively small number of independent dimensions; local symmetry, a tendency for sound systems to have relatively large numbers of pairs of sounds that differ only on one dimension; and global symmetry, a tendency for sound systems to be relatively balanced. The finding of economy corroborates previous results; the two symmetry properties have not been previously documented. We also investigate the relation between the typology of inventory geometries and the typology of individual sounds, showing that the frequency distribution with which individual sounds occur across languages works in favor of both local and global symmetry. PMID:27462296
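To make the notion of local symmetry above concrete, the sketch below counts the pairs of sounds in an inventory that differ on exactly one feature dimension. This is a simplified illustration under stated assumptions, not the measure defined in the paper: the three binary features, the toy inventory, and the function names are all hypothetical.

```python
from itertools import combinations


def differing_dimensions(a, b):
    """Number of feature dimensions on which two sounds differ."""
    return sum(x != y for x, y in zip(a, b))


def one_dimension_pairs(inventory):
    """Count pairs of sounds that differ on exactly one dimension."""
    return sum(
        1
        for a, b in combinations(inventory.values(), 2)
        if differing_dimensions(a, b) == 1
    )


# Hypothetical features: (voiced, labial, nasal); illustration only.
inventory = {
    "p": (0, 1, 0),
    "b": (1, 1, 0),
    "t": (0, 0, 0),
    "d": (1, 0, 0),
    "m": (1, 1, 1),
}

pairs = one_dimension_pairs(inventory)
total = len(list(combinations(inventory, 2)))
```

In this toy inventory, 5 of the 10 possible pairs are one-dimension pairs (p/b, p/t, b/d, b/m, t/d). An inventory with the same number of sounds but fewer such pairs would count as less locally symmetric under this simplified reading.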
Yoon, Sung Hoon; Nam, Kyoung Won; Yook, Sunhyun; Cho, Baek Hwan; Jang, Dong Pyo; Hong, Sung Hwa; Kim, In Young
2017-03-01
In an effort to improve hearing aid users' satisfaction, recent studies on trainable hearing aids have attempted to implement one or two environmental factors into training. However, it would be more beneficial to train the device based on the owner's personal preferences across a broader range of environmental acoustic conditions. Our study aimed to develop a trainable hearing aid algorithm that can reflect the user's individual preferences across a more extensive set of environmental acoustic conditions (ambient sound level, listening situation, and degree of noise suppression) and to evaluate the perceptual benefit of the proposed algorithm. Ten normal-hearing subjects participated in this study. Each subject trained the algorithm to their personal preference, and the trained data were used to record test sounds in three different settings, which were then used to evaluate the perceptual benefit of the proposed algorithm with the Comparison Mean Opinion Score test. Statistical analysis revealed that of the 10 subjects, four showed significant differences in amplification constant settings between the noise-only and speech-in-noise situations (P < 0.05), and one subject also showed a significant difference between the speech-only and speech-in-noise situations (P < 0.05). Additionally, every subject preferred different β settings for beamforming at each of the input sound levels. The positive findings from this study suggest that the proposed algorithm has the potential to improve hearing aid users' personal satisfaction under various ambient situations.
NASA Astrophysics Data System (ADS)
O'Donnell, Michael J.; Bisnovatyi, Ilia
2000-11-01
Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, or used tangentially, or used in highly restricted specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich general facilities for modeling and manipulating sound comparable in quality and flexibility to graphics. We need coordinated research in several disciplines to improve the use of sound as an interactive information channel. Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the cooperative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for a characterization than typography is sufficient for characterization of visual information. To develop this new conceptual topic of sonic information structure, we need to integrate insights from a number of different disciplines that deal with sound. In particular, we need to coordinate central and foundational studies of the representational models of sound with specific applications that illuminate the good and bad qualities of these models. Each natural or artificial process that generates informative sound, and each perceptual mechanism that derives information from sound, will teach us something about the right structure to attribute to the sound itself. 
The new Sound topic will combine the work of computer scientists with that of numerical mathematicians studying sonification, psychologists, linguists, bioacousticians, and musicians to illuminate the structure of sound from different angles. Each of these disciplines deals with the use of sound to carry a different sort of information, under different requirements and constraints. By combining their insights, we can learn to understand the structure of sound in general.
ERIC Educational Resources Information Center
Lu, Shuang
2013-01-01
The relationship between speech perception and production has been debated for a long time. The Motor Theory of speech perception (Liberman et al., 1989) claims that perceiving speech is identifying the intended articulatory gestures rather than perceiving the sound patterns. It seems to suggest that speech production precedes speech perception,…
Higher order statistical analysis of /x/ in male speech.
Orr, M C; Lithgow, B
2005-03-01
This paper presents a study of kurtosis analysis for the sound /x/ in male speech; /x/ is the sound of the 'o' at the end of words such as 'ago'. The sound analysed for this paper came from the Australian National Database of Spoken Language, more specifically the male speaker 17. The /x/ was isolated and extracted from the database by the author in a quiet booth using standard multimedia software. A 5 millisecond window was used for the analysis as it was shown previously by the author to be the most appropriate size for speech phoneme analysis. The significance of the research presented here is shown in the results where a majority of coefficients had a platykurtic (kurtosis between 0 and 3) value as opposed to the previously held leptokurtic (kurtosis > 3) belief.
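The windowed analysis described above can be sketched as follows: split the signal into consecutive 5 ms windows and compute the (non-excess) kurtosis of each, so that values below 3 are platykurtic and values above 3 are leptokurtic, matching the thresholds quoted in the abstract. The function names, the non-overlapping windowing, and the toy signal are assumptions made for illustration; the paper's exact procedure is not given in the abstract.

```python
def kurtosis(samples):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant window")
    return m4 / (m2 * m2)


def windowed_kurtosis(signal, sample_rate_hz, window_ms=5.0):
    """Kurtosis of each consecutive, non-overlapping window."""
    size = int(sample_rate_hz * window_ms / 1000)
    if size < 1:
        raise ValueError("window is shorter than one sample")
    return [
        kurtosis(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]


# Toy signal at a hypothetical 1 kHz rate, so each 5 ms window has 5 samples.
toy = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
values = windowed_kurtosis(toy, sample_rate_hz=1000)
labels = ["platykurtic" if k < 3 else "leptokurtic" for k in values]
```

At a more realistic rate such as 16 kHz, a 5 ms window would hold 80 samples. Any trailing samples that do not fill a whole window are ignored in this sketch.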
Restoring speech perception with cochlear implants by spanning defective electrode contacts.
Frijns, Johan H M; Snel-Bongers, Jorien; Vellinga, Dirk; Schrage, Erik; Vanpoucke, Filiep J; Briaire, Jeroen J
2013-04-01
Even with six defective contacts, spanning can largely restore speech perception with the HiRes 120 speech processing strategy to the level supported by an intact electrode array. Moreover, the sound quality is not degraded. Previous studies have demonstrated reduced speech perception scores (SPS) with defective contacts in HiRes 120. This study investigated whether replacing defective contacts by spanning, i.e. current steering on non-adjacent contacts, is able to restore speech recognition to the level supported by an intact electrode array. Ten adult cochlear implant recipients (HiRes90K, HiFocus1J) with experience with HiRes 120 participated in this study. Three different defective electrode arrays were simulated (six separate defective contacts, three pairs or two triplets). The participants received three take-home strategies and were asked to evaluate the sound quality in five predefined listening conditions. After 3 weeks, SPS were evaluated with monosyllabic words in quiet and in speech-shaped background noise. The participants rated the sound quality equal for all take-home strategies. SPS with background noise were equal for all conditions tested. However, SPS in quiet (85% phonemes correct on average with the full array) decreased significantly with increasing spanning distance, with a 3% decrease for each spanned contact.
Díaz, Begoña; Baus, Cristina; Escera, Carles; Costa, Albert; Sebastián-Gallés, Núria
2008-01-01
Human beings differ in their ability to master the sounds of their second language (L2). Phonetic training studies have proposed that differences in phonetic learning stem from differences in psychoacoustic abilities rather than speech-specific capabilities. We aimed at finding the origin of individual differences in L2 phonetic acquisition in natural learning contexts. We consider two alternative explanations: a general psychoacoustic origin vs. a speech-specific one. For this purpose, event-related potentials (ERPs) were recorded from two groups of early, proficient Spanish-Catalan bilinguals who differed in their mastery of the Catalan (L2) phonetic contrast /e-ε/. Brain activity in response to acoustic change detection was recorded in three different conditions involving tones of different length (duration condition), frequency (frequency condition), and presentation order (pattern condition). In addition, neural correlates of speech change detection were also assessed for both native (/o/-/e/) and nonnative (/o/-/ö/) phonetic contrasts (speech condition). Participants' discrimination accuracy, reflected electrically as a mismatch negativity (MMN), was similar between the two groups of participants in the three acoustic conditions. Conversely, the MMN was reduced in poor perceivers (PP) when they were presented with speech sounds. Therefore, our results support a speech-specific origin of individual variability in L2 phonetic mastery. PMID:18852470
Using ultrasound visual biofeedback to treat persistent primary speech sound disorders.
Cleland, Joanne; Scobbie, James M; Wrench, Alan A
2015-01-01
Growing evidence suggests that speech intervention using visual biofeedback may benefit people for whom visual skills are stronger than auditory skills (for example, the hearing-impaired population), especially when the target articulation is hard to describe or see. Diagnostic ultrasound can be used to image the tongue and has recently become more compact and affordable leading to renewed interest in it as a practical, non-invasive visual biofeedback tool. In this study, we evaluate its effectiveness in treating children with persistent speech sound disorders that have been unresponsive to traditional therapy approaches. A case series of seven different children (aged 6-11) with persistent speech sound disorders were evaluated. For each child, high-speed ultrasound (121 fps), audio and lip video recordings were made while probing each child's specific errors at five different time points (before, during and after intervention). After intervention, all the children made significant progress on targeted segments, evidenced by both perceptual measures and changes in tongue-shape.
NASA Astrophysics Data System (ADS)
1992-06-01
Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.
Loiselle, Louise H; Dorman, Michael F; Yost, William A; Cook, Sarah J; Gifford, Rene H
2016-08-01
To assess the role of interaural time differences and interaural level differences in (a) sound-source localization, and (b) speech understanding in a cocktail party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Eleven bilateral listeners with MED-EL (Durham, NC) CIs and 8 listeners with hearing-preservation CIs with symmetrical low frequency, acoustic hearing using the MED-EL or Cochlear device were evaluated using 2 tests designed to task binaural hearing, localization, and a simulated cocktail party. Access to interaural cues for localization was constrained by the use of low-pass, high-pass, and wideband noise stimuli. Sound-source localization accuracy for listeners with bilateral CIs in response to the high-pass noise stimulus and sound-source localization accuracy for the listeners with hearing-preservation CIs in response to the low-pass noise stimulus did not differ significantly. Speech understanding in a cocktail party listening environment improved for all listeners when interaural cues, either interaural time difference or interaural level difference, were available. The findings of the current study indicate that similar degrees of benefit to sound-source localization and speech understanding in complex listening environments are possible with 2 very different rehabilitation strategies: the provision of bilateral CIs and the preservation of hearing.
Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J
2018-01-01
Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.
Karipidis, Iliana I; Pleisch, Georgette; Brandeis, Daniel; Roth, Alexander; Röthlisberger, Martina; Schneebeli, Maya; Walitza, Susanne; Brem, Silvia
2018-05-08
During reading acquisition, neural reorganization of the human brain facilitates the integration of letters and speech sounds, which enables successful reading. Neuroimaging and behavioural studies have established that impaired audiovisual integration of letters and speech sounds is a core deficit in individuals with developmental dyslexia. This longitudinal study aimed to identify neural and behavioural markers of audiovisual integration that are related to future reading fluency. We simulated the first step of reading acquisition by performing artificial-letter training with prereading children at risk for dyslexia. Multiple logistic regressions revealed that our training provides new precursors of reading fluency at the beginning of reading acquisition. In addition, an event-related potential around 400 ms and functional magnetic resonance imaging activation patterns in the left planum temporale to audiovisual correspondences improved cross-validated prediction of future poor readers. Finally, an exploratory analysis combining simultaneously acquired electroencephalography and hemodynamic data suggested that modulation of temporoparietal brain regions depended on future reading skills. The multimodal approach demonstrates neural adaptations to audiovisual integration in the developing brain that are related to reading outcome. Despite potential limitations arising from the restricted sample size, our results may have promising implications both for identifying poor-reading children and for monitoring early interventions.
Perceptual centres in speech - an acoustic analysis
NASA Astrophysics Data System (ADS)
Scott, Sophie Kerttu
Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences. P-centres have been modelled both in speech and non speech (music) domains. The three aims of this thesis were a) to test out current P-centre models to determine which best accounted for the experimental data, b) to identify a candidate parameter to map P-centres onto (a local approach), as opposed to the previous global models which rely upon the whole signal to determine the P-centre, and c) to develop a model of P-centre location which could be applied to speech and non speech signals. The first aim was investigated by a series of experiments testing a) whether different models could account for variation between speakers, using speech from different speakers, b) whether rendering the amplitude time plot of a speech signal affects the P-centre of the signal, and c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was carried out by a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift, b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration had more effect on the P-centre than the offset manipulation, and c) testing whether the duration of a vowel affected the P-centre if other attributes (amplitude, spectral contents) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers.
The P-centres of the stimuli corpus were highly predicted by attributes of the increase in amplitude within one output channel of the filterbank. When this was used to make predictions of the P-centres for all the stimuli used in the thesis, 85% of the observed variance was accounted for. The FAIM approach combines aspects of previous speech and non speech models (Gordon 1987, Marcus 1981, Vos and Rasch 1981). P-centres were thus modelled in a non speech specific, local manner.
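The abstract above does not say which attributes of the amplitude increase FAIM uses, so the following is only an illustrative sketch of two generic onset measures on a single-channel amplitude envelope: the position of the steepest rise, and a 10%-90% rise time. The function names, the thresholds and the toy envelope are assumptions made for this example and are not taken from the thesis.

```python
def max_rise_index(envelope):
    """Index i at which the step envelope[i] -> envelope[i + 1] is largest."""
    if len(envelope) < 2:
        raise ValueError("need at least two envelope samples")
    rises = [b - a for a, b in zip(envelope, envelope[1:])]
    return max(range(len(rises)), key=rises.__getitem__)


def rise_time(envelope, sample_rate, low=0.1, high=0.9):
    """Seconds between first reaching low*peak and first reaching high*peak.

    The 10%/90% thresholds are a common convention, assumed here.
    """
    peak = max(envelope)
    if peak <= 0:
        raise ValueError("envelope must contain a positive peak")
    t_low = next(i for i, v in enumerate(envelope) if v >= low * peak)
    t_high = next(i for i, v in enumerate(envelope) if v >= high * peak)
    return (t_high - t_low) / sample_rate


# Toy envelope of one (hypothetical) filterbank channel, sampled at 1 kHz.
envelope = [0.0, 0.05, 0.2, 0.8, 1.0, 0.9, 0.5]
steepest = max_rise_index(envelope)
onset_duration = rise_time(envelope, sample_rate=1000)
```

A slower ramp at the start of the signal lengthens `onset_duration`, which is the kind of rise-time manipulation the second aim describes.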
Mcleod, Sharynne; Baker, Elise
2014-01-01
A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment SLPs typically conducted a parent interview, single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes, substitutions-omissions-distortions-additions (SODA), and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions and 60.3% felt very confident or confident selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than the group therapy. Two-thirds kept up-to-date by reading journal articles monthly or every 6 months. 
There were many similarities with previously reported practices for children with SSD in the US, UK, and the Netherlands, with some (but not all) practices aligning with current research evidence.
Wan, Catherine Y; Bazen, Loes; Baars, Rebecca; Libenson, Amanda; Zipse, Lauryn; Zuk, Jennifer; Norton, Andrea; Schlaug, Gottfried
2011-01-01
Although up to 25% of children with autism are non-verbal, there are very few interventions that can reliably produce significant improvements in speech output. Recently, a novel intervention called Auditory-Motor Mapping Training (AMMT) has been developed, which aims to promote speech production directly by training the association between sounds and articulatory actions using intonation and bimanual motor activities. AMMT capitalizes on the inherent musical strengths of children with autism, and offers activities that they intrinsically enjoy. It also engages and potentially stimulates a network of brain regions that may be dysfunctional in autism. Here, we report an initial efficacy study to provide 'proof of concept' for AMMT. Six non-verbal children with autism participated. Prior to treatment, the children had no intelligible words. They each received 40 individual sessions of AMMT 5 times per week, over an 8-week period. Probe assessments were conducted periodically during baseline, therapy, and follow-up sessions. After therapy, all children showed significant improvements in their ability to articulate words and phrases, with generalization to items that were not practiced during therapy sessions. Because these children had no or minimal vocal output prior to treatment, the acquisition of speech sounds and word approximations through AMMT represents a critical step in expressive language development in children with autism.
Words in Puddles of Sound: Modelling Psycholinguistic Effects in Speech Segmentation
ERIC Educational Resources Information Center
Monaghan, Padraic; Christiansen, Morten H.
2010-01-01
There are numerous models of how speech segmentation may proceed in infants acquiring their first language. We present a framework for considering the relative merits and limitations of these various approaches. We then present a model of speech segmentation that aims to reveal important sources of information for speech segmentation, and to…
ERIC Educational Resources Information Center
Toohill, Bethany J.; Mcleod, Sharynne; Mccormack, Jane
2012-01-01
This study investigated the effect of dialectal difference on identification and rating of severity of speech impairment in children from Indigenous Australian backgrounds. The speech of 15 Indigenous Australian children identified by their parents/caregivers and teachers as having "difficulty talking and making speech sounds" was…
Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.
2010-01-01
A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…
The Tuning of Human Neonates' Preference for Speech
ERIC Educational Resources Information Center
Vouloumanos, Athena; Hauser, Marc D.; Werker, Janet F.; Martin, Alia
2010-01-01
Human neonates prefer listening to speech compared to many nonspeech sounds, suggesting that humans are born with a bias for speech. However, neonates' preference may derive from properties of speech that are not unique but instead are shared with the vocalizations of other species. To test this, thirty neonates and sixteen 3-month-olds were…
Spectral-temporal EEG dynamics of speech discrimination processing in infants during sleep.
Gilley, Phillip M; Uhler, Kristin; Watson, Kaylee; Yoshinaga-Itano, Christine
2017-03-22
Oddball paradigms are frequently used to study auditory discrimination by comparing event-related potential (ERP) responses to a standard, high probability sound and to a deviant, low probability sound. Previous research has established that such paradigms, such as the mismatch response or mismatch negativity, are useful for examining auditory processes in young children and infants across various sleep and attention states. The extent to which oddball ERP responses may reflect subtle discrimination effects, such as speech discrimination, is largely unknown, especially in infants that have not yet acquired speech and language. Mismatch responses for three contrasts (non-speech, vowel, and consonant) were computed as a spectral-temporal probability function in 24 infants, and analyzed at the group level by a modified multidimensional scaling. Immediately following an onset gamma response (30-50 Hz), the emergence of a beta oscillation (12-30 Hz) was temporally coupled with a lower frequency theta oscillation (2-8 Hz). The spectral-temporal probability of this coupling effect relative to a subsequent theta modulation corresponds with discrimination difficulty for non-speech, vowel, and consonant contrast features. The theta modulation effect suggests that unexpected sounds are encoded as a probabilistic measure of surprise. These results support the notion that auditory discrimination is driven by the development of brain networks for predictive processing, and can be measured in infants during sleep. The results presented here have implications for the interpretation of discrimination as a probabilistic process, and may provide a basis for the development of single-subject and single-trial classification in a clinically useful context. An infant's brain is processing information about the environment and performing computations, even during sleep. These computations reflect subtle differences in acoustic feature processing that are necessary for language-learning.
Results from this study suggest that brain responses to deviant sounds in an oddball paradigm follow a cascade of oscillatory modulations. This cascade begins with a gamma response that later emerges as a beta synchronization, which is temporally coupled with a theta modulation, and followed by a second, subsequent theta modulation. The difference in frequency and timing of the theta modulations appears to reflect a measure of surprise. These insights into the neurophysiological mechanisms of auditory discrimination provide a basis for exploring the clinical utility of the MMR TF and other auditory oddball responses.
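For readers unfamiliar with oddball analyses, the sketch below shows the classic time-domain difference wave: the average response to deviants minus the average response to standards. This is not the spectral-temporal probability method used in the study above. The function names and the toy epochs are hypothetical.

```python
def average_epochs(epochs):
    """Point-by-point mean across equally long epochs (lists of samples)."""
    if not epochs:
        raise ValueError("need at least one epoch")
    return [sum(column) / len(epochs) for column in zip(*epochs)]


def mismatch_response(standard_epochs, deviant_epochs):
    """Deviant average minus standard average, sample by sample."""
    standard = average_epochs(standard_epochs)
    deviant = average_epochs(deviant_epochs)
    return [d - s for d, s in zip(deviant, standard)]


# Toy data: two epochs per condition, three samples per epoch.
standards = [[0.0, 1.0, 0.0], [0.0, 3.0, 0.0]]
deviants = [[0.0, 5.0, 1.0], [0.0, 3.0, 1.0]]
difference_wave = mismatch_response(standards, deviants)
```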
Canale, Andrea; Dalmasso, Giulia; Dagna, Federico; Lacilla, Michelangelo; Montuschi, Carla; Rosa, Rosalba Di; Albera, Roberto
2016-08-01
To determine whether speech recognition scores (SRS) differ between adults with long-term auditory deprivation in the implanted ear and adults who received cochlear implant (CI) in the nonsound-deprived ear, either for hearing aid-assisted or due to rapidly deteriorating hearing loss. Retrospective study. Speech recognition scores at evaluations (3 and 14 months postimplantation) conducted with CI alone at 60-dB sound pressure level intensity were compared in 15 patients (4 with bilateral severe hearing loss; 11 with asymmetric hearing loss, 7 of which had contralateral hearing aid), all with long-term auditory deprivation (mean duration 16.9 years) (group A), and in 15 other patients with postlingual hearing loss (10 symmetric, 5 asymmetric with bimodal stimulation) (controls, group B). Comparison of mean percentage of correctly recognized words on speech audiometry at 3 and 14 months showed improvement within each group (P < 0.05). Between-group comparison showed no significant difference at 3 (P = 0.17) or 14 months (P = 0.46). Comparison of SRSs in group A (bimodal stimulation [n = 7] and binaural sound deprivation [n = 4]) versus group B showed no significant differences at 3 (bimodal stimulation P = 0.16; binaural sound deprivation P = 0.19) or 14 months (bimodal stimulation P = 0.14; binaural sound deprivation P = 0.82). Speech recognition scores in monaural and binaural sound-deprived ears did not significantly differ from ears with unilateral cochlear implantation in nonsound-deprived ears when tested with CI alone. Improvement in the implanted worse ear indicates that it could be a potential candidate ear for cochlear implantation even when sound deprived. 4. Laryngoscope, 126:1905-1910, 2016. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
Auditory stream segregation in children with Asperger syndrome
Lepistö, T.; Kuitunen, A.; Sussman, E.; Saalasti, S.; Jansson-Verkasalo, E.; Nieminen-von Wendt, T.; Kujala, T.
2009-01-01
Individuals with Asperger syndrome (AS) often have difficulties in perceiving speech in noisy environments. The present study investigated whether this might be explained by deficient auditory stream segregation ability, that is, by a more basic difficulty in separating simultaneous sound sources from each other. To this end, auditory event-related brain potentials were recorded from a group of school-aged children with AS and a group of age-matched controls using a paradigm specifically developed for studying stream segregation. Differences in the amplitudes of ERP components were found between groups only in the stream segregation conditions and not for simple feature discrimination. The results indicated that children with AS have difficulties in segregating concurrent sound streams, which ultimately may contribute to the difficulties in speech-in-noise perception. PMID:19751798
Phonology, Reading Development, and Dyslexia: A Cross-Linguistic Perspective.
ERIC Educational Resources Information Center
Goswami, Usha
2002-01-01
This article presents a theoretical overview at the cognitive level of the role of phonological awareness in reading development and developmental dyslexia across languages. It is argued that the primary deficit in developmental dyslexia in all languages lies in representing speech sounds: a deficit in phonological representation. (Contains…
Coding strategies for cochlear implants under adverse environments
NASA Astrophysics Data System (ADS)
Tahmina, Qudsia
Cochlear implants are electronic prosthetic devices that restore partial hearing in patients with severe to profound hearing loss. Although most coding strategies have significantly improved the perception of speech in quiet listening conditions, there remain limitations on speech perception under adverse environments such as background noise, reverberation and band-limited channels. We propose strategies that improve the intelligibility of speech transmitted over telephone networks, reverberated speech and speech in the presence of background noise. For telephone processed speech, we propose to examine the effects of adding low-frequency and high-frequency information to the band-limited telephone speech. Four listening conditions were designed to simulate the receiving frequency characteristics of telephone handsets. Results indicated improvement in cochlear implant and bimodal listening when telephone speech was augmented with high frequency information, and therefore this study provides support for the design of algorithms to extend the bandwidth towards higher frequencies. The results also indicated added benefit from hearing aids for bimodal listeners in all four types of listening conditions. Speech understanding in acoustically reverberant environments is always a difficult task for hearing impaired listeners. Reverberated sounds consist of direct sound, early reflections and late reflections. Late reflections are known to be detrimental to speech intelligibility. In this study, we propose a reverberation suppression strategy based on spectral subtraction (SS) to suppress the reverberant energies from late reflections. Results from listening tests for two reverberant conditions (RT60 = 0.3s and 1.0s) indicated significant improvement when stimuli were processed with the SS strategy.
The proposed strategy operates with little to no prior information on the signal and the room characteristics and therefore can potentially be implemented in real-time CI speech processors. For speech in background noise, we propose a mechanism underlying the contribution of harmonics to the benefit of electroacoustic stimulation in cochlear implants. The proposed strategy is based on harmonic modeling and uses a synthesis-driven approach to synthesize the harmonics in voiced segments of speech. Based on objective measures, results indicated improvement in speech quality. This study warrants further work on the development of algorithms to regenerate harmonics of voiced segments in the presence of noise.
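The general idea of spectral subtraction for late reverberation can be sketched as follows: treat a scaled copy of an earlier frame's power spectrum as the estimate of late-reflection energy, subtract it from the current frame, and keep a spectral floor so that no bin goes negative. This is a generic illustration, not the thesis's algorithm. The fixed `delay`, `scale` and `floor` values and the function name are assumptions chosen for the example.

```python
def suppress_late_reverb(power_frames, delay=2, scale=0.5, floor=0.05):
    """Subtract an estimate of late-reverberant power from each frame.

    power_frames: list of frames, each a list of per-bin power values.
    delay: how many frames back the late-reflection estimate is taken from.
    scale: assumed fraction of the earlier frame's power that arrives late.
    floor: fraction of the current power kept as a minimum in every bin.
    """
    cleaned = []
    for t, frame in enumerate(power_frames):
        if t < delay:
            # No earlier frame available yet: pass the frame through.
            cleaned.append(list(frame))
            continue
        earlier = power_frames[t - delay]
        cleaned.append([max(p - scale * q, floor * p)
                        for p, q in zip(frame, earlier)])
    return cleaned


# Toy input: three frames with two frequency bins each.
frames = [[4.0, 2.0], [2.0, 1.0], [1.0, 2.0]]
result = suppress_late_reverb(frames, delay=1, scale=0.5, floor=0.1)
```

In a complete processor the cleaned power would be turned into a per-bin gain and applied to the short-time spectrum before resynthesis. That step is omitted here.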
Multilingual Vocabularies in Automatic Speech Recognition
2000-08-01
monolingual (a few thousands) is an obstacle to a full generalization of the inventories, then moved to the multilingual case. In the approach towards the...direction of language independence. In this monolingual experiment, we developed two types of unit sets for paper, we extend the method presented in [3...sound ji is not assimilated 3.2.1 Monolingual experiments to the corresponding sound in Spanish, but it is left apart as a The baseline model for English
An efficient robust sound classification algorithm for hearing aids.
Nordqvist, Peter; Leijon, Arne
2004-06-01
An efficient robust sound classification algorithm based on hidden Markov models is presented. The system would enable a hearing aid to automatically change its behavior for differing listening environments according to the user's preferences. This work attempts to distinguish between three listening environment categories: speech in traffic noise, speech in babble, and clean speech, regardless of the signal-to-noise ratio. The classifier uses only the modulation characteristics of the signal. The classifier ignores the absolute sound pressure level and the absolute spectrum shape, resulting in an algorithm that is robust against irrelevant acoustic variations. The measured classification hit rate was 96.7%-99.5% when the classifier was tested with sounds representing one of the three environment categories included in the classifier. False-alarm rates were 0.2%-1.7% in these tests. The algorithm is robust and efficient and consumes a small amount of instructions and memory. It is fully possible to implement the classifier in a DSP-based hearing instrument.
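A minimal sketch of classification with hidden Markov models, assuming one discrete-observation HMM per listening environment: score the observation sequence under each model with the scaled forward algorithm and pick the model with the highest log-likelihood. The quantised observation symbols, the toy model parameters and the function names are hypothetical. The paper's actual features are modulation characteristics, which are not reproduced here.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) via the forward algorithm with per-step scaling.

    start[s]: initial state probability; trans[r][s]: transition r -> s;
    emit[s][o]: probability that state s emits symbol o.
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                     for s in range(n)]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_like += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return log_like


def classify(obs, models):
    """Name of the model under which obs is most likely."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Two toy single-state models over symbols {0, 1}; the labels are illustrative.
models = {
    "clean speech": ([1.0], [[1.0]], [[0.9, 0.1]]),
    "speech in babble": ([1.0], [[1.0]], [[0.1, 0.9]]),
}
label = classify([0, 0, 1], models)
```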
Sleep duration predicts behavioral and neural differences in adult speech sound learning.
Earle, F Sayako; Landi, Nicole; Myers, Emily B
2017-01-01
Sleep is important for memory consolidation and contributes to the formation of new perceptual categories. This study examined sleep as a source of variability in typical learners' ability to form new speech sound categories. We trained monolingual English speakers to identify a set of non-native speech sounds at 8PM, and assessed their ability to identify and discriminate between these sounds immediately after training, and at 8AM on the following day. We tracked sleep duration overnight, and found that light sleep duration predicted gains in identification performance, while total sleep duration predicted gains in discrimination ability. Participants obtained an average of less than 6h of sleep, pointing to the degree of sleep deprivation as a potential factor. Behavioral measures were associated with ERP indexes of neural sensitivity to the learned contrast. These results demonstrate that the relative success in forming new perceptual categories depends on the duration of post-training sleep. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Sleep duration predicts behavioral and neural differences in adult speech sound learning
Earle, F. Sayako; Landi, Nicole; Myers, Emily B.
2016-01-01
Sleep is important for memory consolidation and contributes to the formation of new perceptual categories. This study examined sleep as a source of variability in typical learners’ ability to form new speech sound categories. We trained monolingual English speakers to identify a set of non-native speech sounds at 8PM, and assessed their ability to identify and discriminate between these sounds immediately after training, and at 8AM on the following day. We tracked sleep duration overnight, and found that light sleep duration predicted gains in identification performance, while total sleep duration predicted gains in discrimination ability. Participants obtained an average of less than 6 hours of sleep, pointing to the degree of sleep deprivation as a potential factor. Behavioral measures were associated with ERP indexes of neural sensitivity to the learned contrast. These results demonstrate that the relative success in forming new perceptual categories depends on the duration of post-training sleep. PMID:27793703
Lim, Sung-joo; Holt, Lori L
2011-01-01
Although speech categories are defined by multiple acoustic dimensions, some are perceptually weighted more than others and there are residual effects of native-language weightings in non-native speech perception. Recent research on nonlinguistic sound category learning suggests that the distribution characteristics of experienced sounds influence perceptual cue weights: Increasing variability across a dimension leads listeners to rely upon it less in subsequent category learning (Holt & Lotto, 2006). The present experiment investigated the implications of this among native Japanese learning English /r/-/l/ categories. Training was accomplished using a videogame paradigm that emphasizes associations among sound categories, visual information, and players' responses to videogame characters rather than overt categorization or explicit feedback. Subjects who played the game for 2.5h across 5 days exhibited improvements in /r/-/l/ perception on par with 2-4 weeks of explicit categorization training in previous research and exhibited a shift toward more native-like perceptual cue weights. Copyright © 2011 Cognitive Science Society, Inc.
Lim, Sung-joo; Holt, Lori L.
2011-01-01
Although speech categories are defined by multiple acoustic dimensions, some are perceptually-weighted more than others and there are residual effects of native-language weightings in non-native speech perception. Recent research on nonlinguistic sound category learning suggests that the distribution characteristics of experienced sounds influence perceptual cue weights: increasing variability across a dimension leads listeners to rely upon it less in subsequent category learning (Holt & Lotto, 2006). The present experiment investigated the implications of this among native Japanese learning English /r/-/l/ categories. Training was accomplished using a videogame paradigm that emphasizes associations among sound categories, visual information and players’ responses to videogame characters rather than overt categorization or explicit feedback. Subjects who played the game for 2.5 hours across 5 days exhibited improvements in /r/-/l/ perception on par with 2–4 weeks of explicit categorization training in previous research and exhibited a shift toward more native-like perceptual cue weights. PMID:21827533
Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope.
Vanthornhout, Jonas; Decruy, Lien; Wouters, Jan; Simon, Jonathan Z; Francart, Tom
2018-04-01
Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Very often, electrophysiological measures of hearing give insight into the neural processing of sound. However, in most methods, non-speech stimuli are used, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift that makes it possible to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram, and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
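The final step described above, correlating a decoded envelope with the stimulus envelope, can be illustrated with a plain Pearson correlation. The decoder itself is not shown, and the sample values below are invented for the example. The function name is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up envelopes: the stimulus and a hypothetical EEG-based reconstruction.
stimulus_envelope = [1.0, 2.0, 3.0, 4.0]
decoded_envelope = [1.0, 3.0, 2.0, 4.0]
tracking_score = pearson(stimulus_envelope, decoded_envelope)
```

A higher correlation means the reconstruction follows the stimulus envelope more closely.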
ERIC Educational Resources Information Center
Wren, Yvonne; Miller, Laura L.; Peters, Tim J.; Emond, Alan; Roulstone, Sue
2016-01-01
Purpose: The purpose of this study was to determine prevalence and predictors of persistent speech sound disorder (SSD) in children aged 8 years after disregarding children presenting solely with common clinical distortions (i.e., residual errors). Method: Data from the Avon Longitudinal Study of Parents and Children (Boyd et al., 2012) were used.…
Neural correlates of audiotactile phonetic processing in early-blind readers: an fMRI study.
Pishnamazi, Morteza; Nojaba, Yasaman; Ganjgahi, Habib; Amousoltani, Asie; Oghabian, Mohammad Ali
2016-05-01
Reading is a multisensory function that relies on arbitrary associations between auditory speech sounds and symbols from a second modality. Studies of bimodal phonetic perception have mostly investigated the integration of visual letters and speech sounds. Blind readers perform an analogous task by using tactile Braille letters instead of visual letters. The neural underpinnings of audiotactile phonetic processing have not been studied before. We used functional magnetic resonance imaging to reveal the neural correlates of audiotactile phonetic processing in 16 early-blind Braille readers. Braille letters and corresponding speech sounds were presented in unimodal, and congruent/incongruent bimodal configurations. We also used a behavioral task to measure the speed of blind readers in identifying letters presented via tactile and/or auditory modalities. Reaction times for tactile stimuli were faster. The reaction times for bimodal stimuli were equal to those for the slower auditory-only stimuli. fMRI analyses revealed the convergence of unimodal auditory and unimodal tactile responses in areas of the right precentral gyrus and bilateral crus I of the cerebellum. The left and right planum temporale fulfilled the 'max criterion' for bimodal integration, but activities of these areas were not sensitive to the phonetic congruency between sounds and Braille letters. Nevertheless, congruency effects were found in regions of the frontal lobe and cerebellum. Our findings suggest that, unlike sighted readers who are assumed to have amodal phonetic representations, blind readers probably process letters and sounds separately. We suggest that this distinction might be due either to maldevelopment of multisensory neural circuits in early-blind individuals or to inherent differences between Braille and print reading mechanisms.
Is Statistical Learning Constrained by Lower Level Perceptual Organization?
Emberson, Lauren L.; Liu, Ran; Zevin, Jason D.
2013-01-01
In order for statistical information to aid in complex developmental processes such as language acquisition, learning from higher-order statistics (e.g. across successive syllables in a speech stream to support segmentation) must be possible while perceptual abilities (e.g. speech categorization) are still developing. The current study examines how perceptual organization interacts with statistical learning. Adult participants were presented with multiple exemplars from novel, complex sound categories designed to reflect some of the spectral complexity and variability of speech. These categories were organized into sequential pairs and presented such that higher-order statistics, defined based on sound categories, could support stream segmentation. Perceptual similarity judgments and multi-dimensional scaling revealed that participants only perceived three perceptual clusters of sounds and thus did not distinguish the four experimenter-defined categories, creating a tension between lower level perceptual organization and higher-order statistical information. We examined whether the resulting pattern of learning is more consistent with statistical learning being “bottom-up,” constrained by the lower levels of organization, or “top-down,” such that higher-order statistical information of the stimulus stream takes priority over the perceptual organization, and perhaps influences perceptual organization. We consistently find evidence that learning is constrained by perceptual organization. Moreover, participants generalize their learning to novel sounds that occupy a similar perceptual space, suggesting that statistical learning occurs based on regions of or clusters in perceptual space. Overall, these results reveal a constraint on learning of sound sequences, such that statistical information is determined based on lower level organization. These findings have important implications for the role of statistical learning in language acquisition. PMID:23618755
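The "higher-order statistics" that can support stream segmentation are often operationalised as transitional probabilities between successive elements. The sketch below computes them for a toy stream built from two category pairs (A-B and C-D). The labels, the stream and the function name are illustrative assumptions, not the study's stimuli.

```python
from collections import Counter


def transitional_probabilities(sequence):
    """P(next = b | current = a) for every adjacent pair (a, b) in sequence."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Count each element only where it has a successor.
    first_counts = Counter(sequence[:-1])
    return {(a, b): count / first_counts[a]
            for (a, b), count in pair_counts.items()}


# Toy stream built from the pairs AB and CD in the order AB CD CD AB AB CD.
stream = list("ABCDCDABABCD")
tp = transitional_probabilities(stream)
within_pair = tp[("A", "B")]   # transition inside a pair
across_pairs = tp[("B", "C")]  # transition spanning a pair boundary
```

Within-pair transitions are fully predictable in this stream, while transitions across a pair boundary are not. A statistical learner could use that contrast to place segment boundaries.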
Van der Haegen, Lise; Acke, Frederic; Vingerhoets, Guy; Dhooge, Ingeborg; De Leenheer, Els; Cai, Qing; Brysbaert, Marc
2016-12-01
Auditory speech perception, speech production and reading lateralize to the left hemisphere in the majority of healthy right-handers. In this study, we investigated to what extent sensory input underlies the side of language dominance. We measured the lateralization of the three core subprocesses of language in patients who had profound hearing loss in the right ear from birth and in matched control subjects. They took part in a semantic decision listening task involving speech and sound stimuli (auditory perception), a word generation task (speech production) and a passive reading task (reading). The results show that a lack of sensory auditory input on the right side, which is strongly connected to the contralateral left hemisphere, does not lead to atypical lateralization of speech perception. Speech production and reading were also typically left lateralized in all but one patient, contradicting previous small scale studies. Other factors such as genetic constraints presumably overrule the role of sensory input in the development of (a)typical language lateralization. Copyright © 2015 Elsevier Ltd. All rights reserved.
Phonology and Vocal Behavior in Toddlers with Autism Spectrum Disorders
Schoen, Elizabeth; Paul, Rhea; Chawarska, Katyrzyna
2011-01-01
The purpose of this study is to examine the phonological and other vocal productions of children, 18-36 months, with autism spectrum disorder (ASD) and to compare these productions to those of age-matched and language-matched controls. Speech samples were obtained from 30 toddlers with ASD, 11 age-matched toddlers and 23 language-matched toddlers during either parent-child or clinician-child play sessions. Samples were coded for a variety of speech-like and non-speech vocalization productions. Toddlers with ASD produced speech-like vocalizations similar to those of language-matched peers, but produced significantly more atypical non-speech vocalizations when compared to both control groups. Toddlers with ASD show speech-like sound production that is linked to their language level, in a manner similar to that seen in typical development. The main area of difference in vocal development in this population is in the production of atypical vocalizations. Findings suggest that toddlers with autism spectrum disorders might not tune into the language model of their environment. Failure to attend to the ambient language environment negatively impacts the ability to acquire spoken language. PMID:21308998
Speech and oromotor outcome in adolescents born preterm: relationship to motor tract integrity.
Northam, Gemma B; Liégeois, Frédérique; Chong, Wui K; Baker, Kate; Tournier, Jacques-Donald; Wyatt, John S; Baldeweg, Torsten; Morgan, Angela
2012-03-01
To assess speech abilities in adolescents born preterm and investigate whether there is an association between specific speech deficits and brain abnormalities. Fifty adolescents born prematurely (<33 weeks' gestation) with a spectrum of brain injuries were recruited (mean age, 16 years). Speech examination included tests of speech-sound processing and production and speech and oromotor control. Conventional magnetic resonance imaging and diffusion-weighted imaging was acquired in all adolescents born preterm and 30 term-born control subjects. Radiological ratings of brain injury were recorded and the integrity of the primary motor projections was measured (corticospinal tract and speech-motor corticobulbar tract [CST/CBT]). There were no clinical diagnoses of developmental dysarthria, dyspraxia, or a speech-sound disorder, but difficulties in speech and oromotor control were common. A regression analysis revealed that presence of a neurologic impairment, and diffusion-weighted imaging abnormalities in the left CST/CBT were significant independent predictors of poor speech and oromotor outcome. These left-lateralized abnormalities were most evident at the level of the posterior limb of the internal capsule. Difficulties in speech and oromotor control are common in adolescents born preterm, and adolescents with injury to the CST/CBT pathways in the left-hemisphere may be most at risk. Copyright © 2012 Mosby, Inc. All rights reserved.
Speech impairment in Down syndrome: a review.
Kent, Ray D; Vorperian, Houri K
2013-02-01
This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.
Towards parameter-free classification of sound effects in movies
NASA Astrophysics Data System (ADS)
Chu, Selina; Narayanan, Shrikanth; Kuo, C.-C. J.
2005-08-01
The problem of identifying intense events via multimedia data mining in films is investigated in this work. Movies are mainly characterized by dialog, music, and sound effects. We begin our investigation by detecting interesting events through sound effects. Sound effects are neither speech nor music, but are closely associated with interesting events such as car chases and gun shots. In this work, we utilize low-level audio features including MFCC and energy to identify sound effects. Previous work has shown that the Hidden Markov model (HMM) works well for speech/audio signals. However, this technique requires careful model design and parameter selection. In this work, we introduce a framework that avoids this necessity and works well with semi- and non-parametric learning algorithms.
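As a rough illustration of the "energy" feature mentioned in the abstract above, the following minimal sketch computes a frame-wise log energy over a mono signal. The function name, the frame length, the hop size, and the energy floor are all assumptions chosen for illustration; the abstract does not state how the authors framed the audio or computed their features, and MFCC extraction is not shown.

```python
import math


def frame_log_energy(samples, frame_len=400, hop=160, floor=1e-10):
    """Return the natural-log energy of each full frame of a mono signal.

    frame_len and hop are in samples (hypothetical defaults: 25 ms and
    10 ms at a 16 kHz sampling rate). floor avoids log(0) on silence.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    energies = []
    # Slide a window over the signal; trailing partial frames are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame)  # sum of squared amplitudes
        energies.append(math.log(max(energy, floor)))
    return energies


if __name__ == "__main__":
    # Silence followed by a constant-amplitude burst: energy should jump.
    signal = [0.0] * 800 + [0.5] * 800
    print(frame_log_energy(signal))
```

A sudden rise in frame energy of this kind is one simple cue that a loud, non-speech event may be present, although a classifier would normally combine it with spectral features.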
Asymmetries in the Processing of Vowel Height
ERIC Educational Resources Information Center
Scharinger, Mathias; Monahan, Philip J.; Idsardi, William J.
2012-01-01
Purpose: Speech perception can be described as the transformation of continuous acoustic information into discrete memory representations. Therefore, research on neural representations of speech sounds is particularly important for a better understanding of this transformation. Speech perception models make specific assumptions regarding the…
... Toddlers and Older Children Speech While the tongue is remarkably able to compensate and many children have no speech impediments due to tongue-tie, others may. Around the age of three, speech problems, especially articulation of the sounds - l, r, t, d, n, th, sh, and z ...
Treatment model in children with speech disorders and its therapeutic efficiency.
Barberena, Luciana; Keske-Soares, Márcia; Cervi, Taís; Brandão, Mariane
2014-07-01
Introduction: Speech articulation disorders affect the intelligibility of speech. Studies on therapeutic models show the effectiveness of the communication treatment. Objective: To analyze the progress achieved by treatment with the ABAB-Withdrawal and Multiple Probes Model in children with different degrees of phonological disorders. Methods: The diagnosis of speech articulation disorder was determined by speech and hearing evaluation and complementary tests. The subjects of this research were eight children, with an average age of 5:5. The children were distributed into four groups according to the degree of phonological disorder, based on the percentage of correct consonants, as follows: severe, moderate to severe, mild to moderate, and mild. The phonological treatment applied was the ABAB-Withdrawal and Multiple Probes Model. The development of the therapy by generalization was observed through the comparison between the two analyses: contrastive and distinctive features at the moment of evaluation and reevaluation. Results: The following types of generalization were found: to the items not used in the treatment (other words), to another position in the word, within a sound class, to other classes of sounds, and to another syllable structure. Conclusion: The different types of generalization studied showed the expansion of production and proper use of therapy-trained targets in other contexts or untrained environments. Therefore, the analysis of the generalizations proved to be an important criterion to measure the therapeutic efficacy.
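The grouping by percentage of correct consonants described in the abstract above can be sketched as follows. The abstract names the four severity groups but does not give the cutoffs; the thresholds below (above 85%, 65-85%, 50-65%, below 50%) are an assumption based on commonly cited severity bands for this measure, and the function names are hypothetical.

```python
def percent_consonants_correct(correct, total):
    """Percentage of consonants produced correctly in a speech sample."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return 100.0 * correct / total


def severity(pcc):
    """Map a percentage to a severity label.

    Cutoffs are assumed for illustration; they are not stated in the abstract.
    """
    if pcc > 85:
        return "mild"
    if pcc >= 65:
        return "mild to moderate"
    if pcc >= 50:
        return "moderate to severe"
    return "severe"


if __name__ == "__main__":
    # Hypothetical sample: 45 of 60 target consonants produced correctly.
    pcc = percent_consonants_correct(45, 60)
    print(pcc, severity(pcc))
```

Under these assumed cutoffs, a child with 45 of 60 consonants correct scores 75% and would fall in the mild to moderate group.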
A Construction System for CALL Materials from TV News with Captions
NASA Astrophysics Data System (ADS)
Kobayashi, Satoshi; Tanaka, Takashi; Mori, Kazumasa; Nakagawa, Seiichi
Many language learning materials have been published. In language learning, although repetition training is obviously necessary, it is difficult to maintain the learner's interest and motivation using existing learning materials, because those materials are limited in their scope and contents. In addition, we doubt whether the speech sounds used in most materials are natural in various situations. Nowadays, some TV news programs (CNN, ABC, PBS, NHK, etc.) have closed/open captions corresponding to the announcer's speech. We have developed a system that makes Computer Assisted Language Learning (CALL) materials from such captioned newscasts, both for Japanese learners of English and for foreign students learning Japanese. This system computes the synchronization between captions and speech by using HMMs and a forced alignment algorithm. Materials made by the system have the following functions: full/partial text caption display, repetition listening, consulting an electronic dictionary, display of the user's/announcer's sound waveform and pitch contour, and automatic construction of a dictation test. The materials have the following advantages: they present polite and natural speech and cover various and timely topics. Furthermore, the materials offer the following possibilities: automatic creation of listening/understanding tests, and storage/retrieval of many materials. In this paper, we first present the organization of the system. Then, we describe the results of questionnaires on trial use of the materials. As a result, we obtained sufficient accuracy in the synchronization between captions and speech. Overall, we were encouraged to continue research on this system.
Martin, B A; Sigal, A; Kurtzberg, D; Stapells, D R
1997-03-01
This study investigated the effects of decreased audibility produced by high-pass noise masking on cortical event-related potentials (ERPs) N1, N2, and P3 to the speech sounds /ba/ and /da/ presented at 65 and 80 dB SPL. Normal-hearing subjects pressed a button in response to the deviant sound in an oddball paradigm. Broadband masking noise was presented at an intensity sufficient to completely mask the response to the 65-dB SPL speech sounds, and subsequently high-pass filtered at 4000, 2000, 1000, 500, and 250 Hz. With high-pass masking noise, pure-tone behavioral thresholds increased by an average of 38 dB at the high-pass cutoff and by 50 dB one octave above the cutoff frequency. Results show that as the cutoff frequency of the high-pass masker was lowered, ERP latencies to speech sounds increased and amplitudes decreased. The cutoff frequency where these changes first occurred and the rate of the change differed for N1 compared to N2, P3, and the behavioral measures. N1 showed gradual changes as the masker cutoff frequency was lowered. N2, P3, and behavioral measures showed marked changes below a masker cutoff of 2000 Hz. These results indicate that the decreased audibility resulting from the noise masking affects the various ERP components in a differential manner. N1 is related to the presence of audible stimulus energy, being present whether audible stimuli are discriminable or not. In contrast, N2 and P3 were absent when the stimuli were audible but not discriminable (i.e., when the second formant transitions were masked), reflecting stimulus discrimination. These data have implications regarding the effects of decreased audibility on cortical processing of speech sounds and for the study of cortical ERPs in populations with hearing impairment.
Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.
Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth
2017-08-09
Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. 
Using magnetoencephalography, we demonstrate anatomically distinct cortical representations of modulated noise in normal-hearing and hearing-impaired listeners. This work provides the first link among hearing thresholds, the amplitude of cortical representations of modulated sounds, and the ability to understand speech in modulated background noise. In light of previous work, we propose that magnified cortical representations of modulated sounds disrupt the separation of speech from modulated background noise in auditory cortex. Copyright © 2017 Millman et al.
Vocal Age Disguise: The Role of Fundamental Frequency and Speech Rate and Its Perceived Effects.
Skoog Waller, Sara; Eriksson, Mårten
2016-01-01
The relationship between vocal characteristics and perceived age is of interest in various contexts, as is the possibility to affect age perception through vocal manipulation. A few examples of such situations are when age is staged by actors, when ear witnesses make age assessments based on vocal cues only or when offenders (e.g., online groomers) disguise their voice to appear younger or older. This paper investigates how speakers spontaneously manipulate two age-related vocal characteristics (f0 and speech rate) in an attempt to sound younger versus older than their true age, and whether the manipulations correspond to actual age-related changes in f0 and speech rate (Study 1). Further aims of the paper are to determine how successful vocal age disguise is by asking listeners to estimate the age of generated speech samples (Study 2) and to examine whether or not listeners use f0 and speech rate as cues to perceived age. In Study 1, participants from three age groups (20-25, 40-45, and 60-65 years) agreed to read a short text under three voice conditions. There were 12 speakers in each age group (six women and six men). They used their natural voice in one condition, attempted to sound 20 years younger in another and 20 years older in a third condition. In Study 2, 60 participants (listeners) listened to speech samples from the three voice conditions in Study 1 and estimated the speakers' age. Each listener was exposed to all three voice conditions. The results from Study 1 indicated that the speakers increased fundamental frequency (f0) and speech rate when attempting to sound younger and decreased f0 and speech rate when attempting to sound older. Study 2 showed that the voice manipulations had an effect in the sought-after direction, although the achieved mean effect was only 3 years, which is far less than the intended effect of 20 years. Moreover, listeners used speech rate, but not f0, as a cue to speaker age. It was concluded that age disguise by voice can be achieved by naïve speakers even though the perceived effect was smaller than intended.
Language-Specific Developmental Differences in Speech Production: A Cross-Language Acoustic Study
ERIC Educational Resources Information Center
Li, Fangfang
2012-01-01
Speech productions of 40 English- and 40 Japanese-speaking children (aged 2-5) were examined and compared with the speech produced by 20 adult speakers (10 speakers per language). Participants were recorded while repeating words that began with "s" and "sh" sounds. Clear language-specific patterns in adults' speech were found,…
How Our Own Speech Rate Influences Our Perception of Others
ERIC Educational Resources Information Center
Bosker, Hans Rutger
2017-01-01
In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects…
The Frame Constraint on Experimentally Elicited Speech Errors in Japanese
ERIC Educational Resources Information Center
Saito, Akie; Inoue, Tomoyoshi
2017-01-01
The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which is separately operating from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the…
Ingressive Speech Errors: A Service Evaluation of Speech-Sound Therapy in a Child Aged 4;6
ERIC Educational Resources Information Center
Hrastelj, Laura; Knight, Rachael-Anne
2017-01-01
Background: A pattern of ingressive substitutions for word-final sibilants can be identified in a small number of cases in child speech disorder, with growing evidence suggesting it is a phonological difficulty, despite the unusual surface form. Phonological difficulty implies a problem with the cognitive process of organizing speech into sound…
Perceptual context effects of speech and nonspeech sounds: the role of auditory categories.
Aravamudhan, Radhika; Lotto, Andrew J; Hawks, John W
2008-09-01
Williams [(1986). "Role of dynamic information in the perception of coarticulated vowels," Ph.D. thesis, University of Connecticut, Standford, CT] demonstrated that nonspeech contexts had no influence on pitch judgments of nonspeech targets, whereas context effects were obtained when listeners were instructed to perceive the sounds as speech. On the other hand, Holt et al. [(2000). "Neighboring spectral content influences vowel identification," J. Acoust. Soc. Am. 108, 710-722] showed that nonspeech contexts were sufficient to elicit context effects in speech targets. The current study tested a hypothesis that could explain the varying effectiveness of nonspeech contexts: Context effects are obtained only when there are well-established perceptual categories for the target stimuli. Experiment 1 examined context effects in speech and nonspeech signals using four series of stimuli: steady-state vowels that perceptually spanned from /ʊ/-/ɪ/ in isolation and in the context of /w/ (with no steady-state portion) and two nonspeech sine-wave series that mimicked the acoustics of the speech series. In agreement with previous work, context effects were obtained for speech contexts and targets but not for nonspeech analogs. Experiment 2 tested predictions of the hypothesis by testing for nonspeech context effects after the listeners had been trained to categorize the sounds. Following training, context-dependent categorization was obtained for nonspeech stimuli in the training group. These results are presented within a general perceptual-cognitive framework for speech perception research.