Sample records for speech

  1. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...: This is a summary of the Commission's Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...), Internet Protocol Relay (IP Relay), and IP captioned telephone service (IP CTS) as compensable forms of TRS...

  2. 78 FR 49717 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services... Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay... (IP Relay) and video relay service (VRS), the Commission should bundle national STS outreach efforts...

  3. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, J.F.; Ng, L.C.

    1998-03-17

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
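
A minimal sketch of the deconvolution step mentioned in the abstract above. The abstract does not say how the deconvolution is carried out, so this assumes a generic regularized frequency-domain division: with an excitation frame x and an acoustic output frame y of equal length, the transfer function for that frame is estimated as H = Y·conj(X) / (|X|² + eps). The function names, the `eps` regularizer, and the use of circular convolution are illustrative assumptions, not the patented method.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def circular_convolve(x, h):
    """Circular convolution, used here only to build a toy output frame."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


def estimate_transfer_function(excitation, output, eps=1e-12):
    """Estimate H(k) for one frame by regularized spectral division.

    `eps` (an assumed regularizer) keeps the division stable where the
    excitation spectrum is close to zero.
    """
    X = dft(excitation)
    Y = dft(output)
    return [y * x.conjugate() / (abs(x) ** 2 + eps) for x, y in zip(X, Y)]


# Toy frame: made-up numbers, not data from the patent.
excitation = [1.0, 0.5, 0.0, 0.0, -0.25, 0.0, 0.0, 0.0]
true_impulse_response = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]
output = circular_convolve(excitation, true_impulse_response)

H = estimate_transfer_function(excitation, output)
recovered = [value.real for value in idft(H)]
```

In this toy case the recovered impulse response matches the one used to build the output frame, because the excitation spectrum has no zeros. Real speech frames would need windowing and a larger regularizer.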

  4. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  5. Exploring Australian speech-language pathologists' use and perceptions of non-speech oral motor exercises.

    PubMed

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and add to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation: Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound disorders. Non-speech oral motor exercise use was most frequently reported in the treatment of dysarthria. Non-speech oral motor exercise use when targeting speech sound disorders is not widely endorsed in the literature.

  6. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holzrichter, J.F.; Ng, L.C.

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.

  7. Speech processing using maximum likelihood continuity mapping

    DOEpatents

    Hogden, John E.

    2000-01-01

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  8. Speech processing using maximum likelihood continuity mapping

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogden, J.E.

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  9. Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions.

    PubMed

    Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang

    2018-04-04

    Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the declined speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.

  10. The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.

    PubMed

    De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

    2016-10-01

    This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were addressed to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated by an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.

  11. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension.

    PubMed

    Drijvers, Linda; Özyürek, Asli

    2017-01-01

    This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Twenty participants watched videos of an actress uttering an action verb and completed a free-recall task. The videos were presented in 3 speech conditions (2-band noise-vocoding, 6-band noise-vocoding, clear), 3 multimodal conditions (speech + lips blurred, speech + visible speech, speech + visible speech + gesture), and 2 visual-only conditions (visible speech, visible speech + gesture). Accuracy levels were higher when both visual articulators were present compared with 1 or none. The enhancement effects of (a) visible speech, (b) gestural information on top of visible speech, and (c) both visible speech and iconic gestures were larger in 6-band than 2-band noise-vocoding or visual-only conditions. Gestural enhancement in 2-band noise-vocoding did not differ from gestural enhancement in visual-only conditions. When perceiving degraded speech in a visual context, listeners benefit more from having both visual articulators present compared with 1. This benefit was larger at 6-band than 2-band noise-vocoding, where listeners can benefit from both phonological cues from visible speech and semantic cues from iconic gestures to disambiguate speech.

  12. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    PubMed

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Non-fluent speech following stroke is caused by impaired efference copy.

    PubMed

    Feenaughty, Lynda; Basilakos, Alexandra; Bonilha, Leonardo; den Ouden, Dirk-Bart; Rorden, Chris; Stark, Brielle; Fridriksson, Julius

    2017-09-01

    Efference copy is a cognitive mechanism argued to be critical for initiating and monitoring speech; however, the extent to which breakdown of efference copy mechanisms impacts speech production is unclear. This study examined the best mechanistic predictors of non-fluent speech among 88 stroke survivors. Objective speech fluency measures were subjected to a principal component analysis (PCA). The primary PCA factor was then entered into a multiple stepwise linear regression analysis as the dependent variable, with a set of independent mechanistic variables. Participants' ability to mimic audio-visual speech ("speech entrainment response") was the best independent predictor of non-fluent speech. We suggest that this "speech entrainment" factor reflects integrity of internal monitoring (i.e., efference copy) of speech production, which affects speech initiation and maintenance. Results support models of normal speech production and suggest that therapy focused on speech initiation and maintenance may improve speech fluency for individuals with chronic non-fluent aphasia post stroke.
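
The two-stage analysis described in the abstract above (PCA on the fluency measures, then regression with the primary factor as the dependent variable) can be sketched as follows. This is a simplified illustration under stated assumptions: the first principal component is found by power iteration on the sample covariance matrix, and the regression is ordinary least squares on a single predictor rather than the multiple stepwise procedure the study used. All function names and the toy numbers are hypothetical, not study data.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix. The all-ones
    starting vector is an assumption; it must not be orthogonal to the
    leading eigenvector for the iteration to converge to it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate data: no variance in any direction
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data: two made-up "fluency measures" per participant.
fluency_measures = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
# Made-up predictor standing in for one mechanistic variable.
predictor = [10.0, 20.0, 30.0, 40.0]

direction, factor_scores = first_principal_component(fluency_measures)
slope, intercept = ols_slope_intercept(predictor, factor_scores)
```

A stepwise procedure would repeat the regression while adding or removing predictors according to a selection criterion. That selection loop is left out here.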

  14. Autistic traits and attention to speech: Evidence from typically developing individuals.

    PubMed

    Korhonen, Vesa; Werner, Stefan

    2017-04-01

    Individuals with autism spectrum disorder have a preference for attending to non-speech stimuli over speech stimuli. We are interested in whether non-speech preference is only a feature of diagnosed individuals, and whether we can test implicit preference experimentally. In typically developed individuals, serial recall is disrupted more by speech stimuli than by non-speech stimuli. Since behaviour of individuals with autistic traits resembles that of individuals with autism, we have used serial recall to test whether autistic traits influence task performance during irrelevant speech sounds. The errors made on the serial recall task during speech or non-speech sounds were counted as a measure of speech or non-speech preference in relation to no sound condition. We replicated the serial order effect and found the speech to be more disruptive than the non-speech sounds, but were unable to find any associations between the autism quotient scores and the non-speech sounds. Our results may indicate a learnt behavioural response to speech sounds.

  15. Electrophysiological evidence for speech-specific audiovisual integration.

    PubMed

    Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean

    2014-01-01

    Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech-mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech-mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode. © 2013 Published by Elsevier Ltd.

  16. Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

    PubMed

    Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

    2017-09-18

    Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Al, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p < .01) and with mean length of utterance produced during a written picture description (r = .96, p < .01). Correlations between inner speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.

  17. Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study.

    PubMed

    Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P

    2017-01-01

    Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.

  18. Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study

    PubMed Central

    Stefanidou, Chrysi; McCleery, Joseph P.

    2017-01-01

    Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age. PMID:28738063

  19. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    NASA Astrophysics Data System (ADS)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a certain word and distinguished speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations were done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
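
Two of the three evaluation measures named in the abstract above, the correlation coefficient and the root-mean-square of difference, can be sketched as follows. This assumes the correlation is Pearson's r, which the abstract does not specify. Rank-order inconsistency is omitted because the abstract does not define it. The function names and sample rates are illustrative, not taken from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rms_difference(xs, ys):
    """Root-mean-square of the element-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Made-up recognition rates for four hypothetical speakers.
predicted_rates = [0.9, 0.7, 0.5, 0.3]
measured_rates = [0.8, 0.6, 0.4, 0.2]

r = pearson_correlation(predicted_rates, measured_rates)
rms = rms_difference(predicted_rates, measured_rates)
```

The toy values show why the two measures are reported together: a predictor can track the measured rates perfectly in correlation terms while still being offset from them by a constant amount, which only the RMS difference reveals.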

  20. Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech.

    PubMed

    Poole, Matthew L; Brodtmann, Amy; Darby, David; Vogel, Adam P

    2017-04-14

    Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Speech and neuroimaging data described in studies of FTD and PPA were systematically reviewed. A meta-analysis was conducted for speech measures that were used consistently in multiple studies. The methods and nomenclature used to describe speech in these disorders varied between studies. Our meta-analysis identified 3 speech measures which differentiate variants or healthy control-group participants (e.g., nonfluent and logopenic variants of PPA from all other groups, behavioral-variant FTD from a control group). Deficits within the frontal-lobe speech networks are linked to motor speech profiles of the nonfluent variant of PPA and progressive apraxia of speech. Motor speech impairment is rarely reported in semantic and logopenic variants of PPA. Limited data are available on motor speech impairment in the behavioral variant of FTD. Our review identified several measures of speech which may assist with diagnosis and classification, and consolidated the brain-behavior associations relating to speech in FTD, PPA, and progressive apraxia of speech.

  1. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2002-01-01

    Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

  2. Inner Speech and Clarity of Self-Concept in Thought Disorder and Auditory-Verbal Hallucinations

    PubMed Central

    de Sousa, Paulo; Sellwood, William; Spray, Amy; Fernyhough, Charles; Bentall, Richard P.

    2016-01-01

    Eighty patients and thirty controls were interviewed using one interview that promoted personal disclosure and another about everyday topics. Speech was scored using the Thought, Language and Communication scale (TLC). All participants completed the Self-Concept Clarity Scale (SCCS) and the Varieties of Inner Speech Questionnaire (VISQ). Patients scored lower than comparisons on the SCCS. Low scores were associated with the disorganized dimension of TD. Patients also scored significantly higher on condensed and other people in inner speech, but not on dialogical or evaluative inner speech. The poverty of speech dimension of TD was associated with less dialogical inner speech, other people in inner speech, and less evaluative inner speech. Hallucinations were significantly associated with more other people in inner speech and evaluative inner speech. Clarity of self-concept and qualities of inner speech are differentially associated with dimensions of TD. The findings also support inner speech models of hallucinations. PMID:27898489

  3. Inner Speech and Clarity of Self-Concept in Thought Disorder and Auditory-Verbal Hallucinations.

    PubMed

    de Sousa, Paulo; Sellwood, William; Spray, Amy; Fernyhough, Charles; Bentall, Richard P

    2016-12-01

    Eighty patients and thirty controls were interviewed using one interview that promoted personal disclosure and another about everyday topics. Speech was scored using the Thought, Language and Communication scale (TLC). All participants completed the Self-Concept Clarity Scale (SCCS) and the Varieties of Inner Speech Questionnaire (VISQ). Patients scored lower than comparisons on the SCCS. Low scores were associated with the disorganized dimension of TD. Patients also scored significantly higher on condensed and other people in inner speech, but not on dialogical or evaluative inner speech. The poverty of speech dimension of TD was associated with less dialogical inner speech, other people in inner speech, and less evaluative inner speech. Hallucinations were significantly associated with more other people in inner speech and evaluative inner speech. Clarity of self-concept and qualities of inner speech are differentially associated with dimensions of TD. The findings also support inner speech models of hallucinations.

  4. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    PubMed

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  5. Neural pathways for visual speech perception

    PubMed Central

    Bernstein, Lynne E.; Liebenthal, Einat

    2014-01-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611

  6. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    PubMed Central

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2010-01-01

    In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615

  7. Speech entrainment enables patients with Broca’s aphasia to produce fluent speech

    PubMed Central

    Hubbard, H. Isabel; Hudspeth, Sarah Grace; Holland, Audrey L.; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-01-01

    A distinguishing feature of Broca’s aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect ‘speech entrainment’ and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca’s aphasia. In Experiment 1, 13 patients with Broca’s aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca’s area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. Behavioural and functional magnetic resonance imaging data were collected before and after the treatment phase. Patients were able to produce a greater variety of words with and without speech entrainment at 1 and 6 weeks after training. Treatment-related decrease in cortical activation associated with speech entrainment was found in areas of the left posterior-inferior parietal lobe. We conclude that speech entrainment allows patients with Broca’s aphasia to double their speech output compared with spontaneous speech. Neuroimaging results suggest that speech entrainment allows patients to produce fluent speech by providing an external gating mechanism that yokes a ventral language network that encodes conceptual aspects of speech. Preliminary results suggest that training with speech entrainment improves speech production in Broca’s aphasia providing a potential therapeutic method for a disorder that has been shown to be particularly resistant to treatment. PMID:23250889

  8. [Speech fluency developmental profile in Brazilian Portuguese speakers].

    PubMed

    Martins, Vanessa de Oliveira; Andrade, Claudia Regina Furquim de

    2008-01-01

    Background: Speech fluency varies from one individual to the next, fluent or stutterer, depending on several factors. Studies that investigate the influence of age on fluency patterns have been identified; however, these differences were investigated in isolated age groups. Studies of fluency variation across the life span were not found. Aim: To verify the speech fluency developmental profile. Method: Speech samples of 594 fluent participants of both genders, with ages between 2:0 and 99:11 years, speakers of Brazilian Portuguese, were analyzed. Participants were grouped as follows: preschoolers, school-age children, early adolescents, late adolescents, adults and elderly adults. Speech samples were analyzed according to the Speech Fluency Profile variables and were compared regarding: typology of speech disruptions (typical and less typical), speech rate (words and syllables per minute) and frequency of speech disruptions (percentage of speech discontinuity). Results: Although isolated variations were identified, overall there was no significant difference between the age groups for the speech disruption indexes (typical and less typical speech disruptions and percentage of speech discontinuity). Significant differences were observed between the groups when considering speech rate. Conclusion: The development of the neurolinguistic system for speech fluency, in terms of speech disruptions, seems to stabilize during the first years of life, presenting no alterations during the life span. Indexes of speech rate vary across the age groups, indicating patterns of acquisition, development, stabilization and degeneration.

  9. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

    PubMed Central

    GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

    2012-01-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  10. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    ERIC Educational Resources Information Center

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  11. Detection of target phonemes in spontaneous and read speech.

    PubMed

    Mehta, G; Cutler, A

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalise to the recognition of spontaneous speech. In the present study listeners were presented with both spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Responses were, overall, equally fast in each speech mode. However, analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than in unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claims from previous work that listeners pay great attention to prosodic information in the process of recognising speech.

  12. Relative Salience of Speech Rhythm and Speech Rate on Perceived Foreign Accent in a Second Language.

    PubMed

    Polyanskaya, Leona; Ordin, Mikhail; Busa, Maria Grazia

    2017-09-01

    We investigated the independent contribution of speech rate and speech rhythm to perceived foreign accent. To address this issue we used a resynthesis technique that allows neutralizing segmental and tonal idiosyncrasies between identical sentences produced by French learners of English at different proficiency levels and maintaining the idiosyncrasies pertaining to prosodic timing patterns. We created stimuli that (1) preserved the idiosyncrasies in speech rhythm while controlling for the differences in speech rate between the utterances; (2) preserved the idiosyncrasies in speech rate while controlling for the differences in speech rhythm between the utterances; and (3) preserved the idiosyncrasies both in speech rate and speech rhythm. All the stimuli were created in intoned (with imposed intonational contour) and flat (with monotonized, constant F0) conditions. The original and the resynthesized sentences were rated by native speakers of English for degree of foreign accent. We found that both speech rate and speech rhythm influence the degree of perceived foreign accent, but the effect of speech rhythm is larger than that of speech rate. We also found that intonation enhances the perception of fine differences in rhythmic patterns but reduces the perceptual salience of fine differences in speech rate.

  13. Review of Visual Speech Perception by Hearing and Hearing-Impaired People: Clinical Implications

    ERIC Educational Resources Information Center

    Woodhouse, Lynn; Hickson, Louise; Dodd, Barbara

    2009-01-01

    Background: Speech perception is often considered specific to the auditory modality, despite convincing evidence that speech processing is bimodal. The theoretical and clinical roles of speech-reading for speech perception, however, have received little attention in speech-language therapy. Aims: The role of speech-read information for speech…

  14. Electrophysiological Evidence for a Multisensory Speech-Specific Mode of Perception

    ERIC Educational Resources Information Center

    Stekelenburg, Jeroen J.; Vroomen, Jean

    2012-01-01

    We investigated whether the interpretation of auditory stimuli as speech or non-speech affects audiovisual (AV) speech integration at the neural level. Perceptually ambiguous sine-wave replicas (SWS) of natural speech were presented to listeners who were either in "speech mode" or "non-speech mode". At the behavioral level, incongruent lipread…

  15. Free Speech Yearbook: 1972.

    ERIC Educational Resources Information Center

    Tedford, Thomas L., Ed.

    This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…

  16. The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses

    ERIC Educational Resources Information Center

    Adank, Patti

    2012-01-01

    The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…

  17. Methods and apparatus for non-acoustic speech characterization and recognition

    DOEpatents

    Holzrichter, John F.

    1999-01-01

    By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.

  18. Methods and apparatus for non-acoustic speech characterization and recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holzrichter, J.F.

    By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.

  19. Voice technology and BBN

    NASA Technical Reports Server (NTRS)

    Wolf, Jared J.

    1977-01-01

    The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication systems; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems, are described.

  20. A functional speech impression used to fabricate a maxillary speech prosthesis for a partial glossectomy patient.

    PubMed

    Knowles, J C; Chalian, V A; Shanks, J C

    1984-02-01

    Surgery for cancer of the floor of the mouth often results in alteration of the muscles of the tongue and floor of the mouth. Both primary and secondary surgical procedures often result in scar formation with reduced mobility of the tongue during speech and deglutition. Speech is often used as a diagnostic tool in the placement of the anterior teeth during fabrication of a prosthesis. Speech can similarly be used to help determine the proper placement of a speech portion of the prosthesis. The prosthetic rehabilitation approach described lowers the palatal vault with a false palate to enable the tongue to function against it during speech (Fig. 15). Group studies have shown that the design and fabrication of speech prostheses for partial glossectomy patients have significantly improved speech and swallowing for these patients. A speech pathologist is helpful during diagnosis, and speech therapy is necessary for significant speech improvement. Prosthetic rehabilitation alone cannot be expected to improve speech.

  1. Speech Rhythms and Multiplexed Oscillatory Sensory Coding in the Human Brain

    PubMed Central

    Gross, Joachim; Hoogenboom, Nienke; Thut, Gregor; Schyns, Philippe; Panzeri, Stefano; Belin, Pascal; Garrod, Simon

    2013-01-01

    Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations. PMID:24391472

  2. Specific acoustic models for spontaneous and dictated style in Indonesian speech recognition

    NASA Astrophysics Data System (ADS)

    Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.

    2018-03-01

    The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.
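
The two decoding strategies described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `Decoder` type, the `classify_style` callable, and the style keys `"spontaneous"` and `"dictated"` are all hypothetical names introduced here, and each decoder is assumed to return a transcript together with a confidence score.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical decoder type: takes acoustic features and returns
# (transcript, confidence score). Not taken from the paper.
Decoder = Callable[[List[float]], Tuple[str, float]]


def decode_with_classifier(
    features: List[float],
    classify_style: Callable[[List[float]], str],
    decoders: Dict[str, Decoder],
) -> str:
    """System 1 (sketch): predict the speech style first, then decode
    with the acoustic model trained for that style only."""
    style = classify_style(features)
    transcript, _ = decoders[style](features)
    return transcript


def decode_with_confidence(
    features: List[float],
    decoders: Dict[str, Decoder],
) -> str:
    """System 2 (sketch): decode with every style-specific model and
    keep the transcript whose decoding confidence is highest."""
    results = [decode(features) for decode in decoders.values()]
    transcript, _ = max(results, key=lambda result: result[1])
    return transcript
```

The first function runs one decoder per utterance, while the second runs both and compares scores, so the second costs roughly twice the decoding time under these assumptions.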

  3. Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers

    PubMed Central

    Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng

    2014-01-01

    Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious way to build a speech acoustic model of impaired speech is to employ adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates the above-mentioned two issues on dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques like maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired speech acoustic model is measured in terms of word error rate (WER), with further assessments, including phoneme insertion, substitution and deletion rates. Unimpaired speech when combined with limited high-quality speech-impaired data improves performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech based on the statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor in WER in dysarthric speech for all levels of severity. The results show that the speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data. PMID:24466004
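
For readers unfamiliar with the metric used in this abstract, word error rate is conventionally the number of substitutions, deletions and insertions in a minimum-edit alignment, divided by the number of reference tokens. The sketch below is a generic illustration, not the scoring tool used in the study; the function names `wer_counts` and `wer` are made up here, and tokens are assumed to be whitespace-separated (words, or phoneme symbols for phoneme-level rates).

```python
def wer_counts(reference: str, hypothesis: str):
    """Return (substitutions, deletions, insertions) for one
    minimum-edit alignment of hypothesis against reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # cell[i][j] = (total edits, subs, dels, ins) aligning ref[:i] with hyp[:j]
    cell = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cell[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, len(hyp) + 1):
        cell[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)            # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, n)    # substitution
            t, s, d, n = cell[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = cell[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Tuples compare by total edits first, so the total is minimal;
            # ties between equally cheap alignments are broken arbitrarily.
            cell[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = cell[len(ref)][len(hyp)]
    return subs, dels, ins


def wer(reference: str, hypothesis: str) -> float:
    """Edits divided by reference length (reference must be non-empty)."""
    subs, dels, ins = wer_counts(reference, hypothesis)
    return (subs + dels + ins) / len(reference.split())
```

Because insertions are counted against the reference length, WER can exceed 1.0 when the hypothesis is much longer than the reference.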

  4. Speech outcomes in Cantonese patients after glossectomy.

    PubMed

    Wong, Ripley Kit; Poon, Esther Sok-Man; Woo, Cynthia Yuen-Man; Chan, Sabina Ching-Shun; Wong, Elsa Siu-Ping; Chu, Ada Wai-Sze

    2007-08-01

    We sought to determine the major factors affecting speech production of Cantonese-speaking glossectomized patients. Error patterns were analyzed. Forty-one Cantonese-speaking subjects who had undergone glossectomy ≥ 6 months previously were recruited. Speech production evaluation included (1) phonetic error analysis in nonsense syllables; (2) speech intelligibility in sentences evaluated by naive listeners; (3) overall speech intelligibility in conversation evaluated by experienced speech therapists. Patients receiving adjuvant radiotherapy had significantly poorer segmental and connected speech production. Total or subtotal glossectomy also resulted in poor speech outcomes. Patients having free flap reconstruction showed the best speech outcomes. Patients without lymph node metastasis had significantly better speech scores when compared with patients with lymph node metastasis. Initial consonant production had the worst scores, while vowel production was the least affected. Speech outcomes of Cantonese-speaking glossectomized patients depended on the severity of the disease. Initial consonants had the greatest effect on speech intelligibility.

  5. Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age

    PubMed Central

    Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik

    2015-01-01

    Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker’s age. Here, we report two experiments on age estimation by “naïve” listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers’ natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60–65 years) speakers in comparison with younger (20–25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40–45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed. PMID:26236259

  6. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    PubMed

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, electrodermal (skin conductance) responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting increased listening effort when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  7. Giving speech a hand: gesture modulates activity in auditory cortex during speech perception.

    PubMed

    Hubbard, Amy L; Wilson, Stephen M; Callan, Daniel E; Dapretto, Mirella

    2009-03-01

    Viewing hand gestures during face-to-face communication affects speech perception and comprehension. Despite the visible role played by gesture in social interactions, relatively little is known about how the brain integrates hand gestures with co-occurring speech. Here we used functional magnetic resonance imaging (fMRI) and an ecologically valid paradigm to investigate how beat gesture, a fundamental type of hand gesture that marks speech prosody, might impact speech perception at the neural level. Subjects underwent fMRI while listening to spontaneously-produced speech accompanied by beat gesture, nonsense hand movement, or a still body; as additional control conditions, subjects also viewed beat gesture, nonsense hand movement, or a still body all presented without speech. Validating behavioral evidence that gesture affects speech perception, bilateral nonprimary auditory cortex showed greater activity when speech was accompanied by beat gesture than when speech was presented alone. Further, the left superior temporal gyrus/sulcus showed stronger activity when speech was accompanied by beat gesture than when speech was accompanied by nonsense hand movement. Finally, the right planum temporale was identified as a putative multisensory integration site for beat gesture and speech (i.e., here activity in response to speech accompanied by beat gesture was greater than the summed responses to speech alone and beat gesture alone), indicating that this area may be pivotally involved in synthesizing the rhythmic aspects of both speech and gesture. Taken together, these findings suggest a common neural substrate for processing speech and gesture, likely reflecting their joint communicative role in social interactions.

  8. Giving Speech a Hand: Gesture Modulates Activity in Auditory Cortex During Speech Perception

    PubMed Central

    Hubbard, Amy L.; Wilson, Stephen M.; Callan, Daniel E.; Dapretto, Mirella

    2008-01-01

    Viewing hand gestures during face-to-face communication affects speech perception and comprehension. Despite the visible role played by gesture in social interactions, relatively little is known about how the brain integrates hand gestures with co-occurring speech. Here we used functional magnetic resonance imaging (fMRI) and an ecologically valid paradigm to investigate how beat gesture – a fundamental type of hand gesture that marks speech prosody – might impact speech perception at the neural level. Subjects underwent fMRI while listening to spontaneously-produced speech accompanied by beat gesture, nonsense hand movement, or a still body; as additional control conditions, subjects also viewed beat gesture, nonsense hand movement, or a still body all presented without speech. Validating behavioral evidence that gesture affects speech perception, bilateral nonprimary auditory cortex showed greater activity when speech was accompanied by beat gesture than when speech was presented alone. Further, the left superior temporal gyrus/sulcus showed stronger activity when speech was accompanied by beat gesture than when speech was accompanied by nonsense hand movement. Finally, the right planum temporale was identified as a putative multisensory integration site for beat gesture and speech (i.e., here activity in response to speech accompanied by beat gesture was greater than the summed responses to speech alone and beat gesture alone), indicating that this area may be pivotally involved in synthesizing the rhythmic aspects of both speech and gesture. Taken together, these findings suggest a common neural substrate for processing speech and gesture, likely reflecting their joint communicative role in social interactions. PMID:18412134

  9. Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

    ERIC Educational Resources Information Center

    Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

    2018-01-01

    To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…

  10. Multilevel Analysis in Analyzing Speech Data

    ERIC Educational Resources Information Center

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by the human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  11. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  12. The influence of visual and auditory information on the perception of speech and non-speech oral movements in patients with left hemisphere lesions.

    PubMed

    Schmid, Gabriele; Thielmann, Anke; Ziegler, Wolfram

    2009-03-01

    Patients with lesions of the left hemisphere often suffer from oral-facial apraxia, apraxia of speech, and aphasia. In these patients, visual features often play a critical role in speech and language therapy, when pictured lip shapes or the therapist's visible mouth movements are used to facilitate speech production and articulation. This demands audiovisual processing both in speech and language treatment and in the diagnosis of oral-facial apraxia. The purpose of this study was to investigate differences in audiovisual perception of speech as compared to non-speech oral gestures. Bimodal and unimodal speech and non-speech items were used and additionally discordant stimuli constructed, which were presented for imitation. This study examined a group of healthy volunteers and a group of patients with lesions of the left hemisphere. Patients made substantially more errors than controls, but the factors influencing imitation accuracy were more or less the same in both groups. Error analyses in both groups suggested different types of representations for speech as compared to the non-speech domain, with speech having a stronger weight on the auditory modality and non-speech processing on the visual modality. Additionally, this study was able to show that the McGurk effect is not limited to speech.

  13. The Levels of Speech Usage Rating Scale: Comparison of Client Self-Ratings with Speech Pathologist Ratings

    ERIC Educational Resources Information Center

    Gray, Christina; Baylor, Carolyn; Eadie, Tanya; Kendall, Diane; Yorkston, Kathryn

    2012-01-01

    Background: The term "speech usage" refers to what people want or need to do with their speech to fulfil the communication demands in their life roles. Speech-language pathologists (SLPs) need to know about clients' speech usage to plan appropriate interventions to meet their life participation goals. The Levels of Speech Usage is a…

  14. Speech Anxiety: The Importance of Identification in the Basic Speech Course.

    ERIC Educational Resources Information Center

    Mandeville, Mary Y.

    A study investigated speech anxiety in the basic speech course by means of pre and post essays. Subjects, 73 students in 3 classes in the basic speech course at a southwestern multiuniversity, wrote a two-page essay on their perceptions of their speech anxiety before the first speaking project. Students discussed speech anxiety in class and were…

  15. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    NASA Astrophysics Data System (ADS)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews the state of the art in automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcript text by matching with the system's library. This is particularly useful in speech rehabilitation therapies as it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as our depth of knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.

  16. Obstructive sleep apnea, seizures, and childhood apraxia of speech.

    PubMed

    Caspari, Susan S; Strand, Edythe A; Kotagal, Suresh; Bergqvist, Christina

    2008-06-01

    Associations between obstructive sleep apnea and motor speech disorders in adults have been suggested, though little has been written about possible effects of sleep apnea on speech acquisition in children with motor speech disorders. This report details the medical and speech history of a nonverbal child with seizures and severe apraxia of speech. For 6 years, he made no functional gains in speech production, despite intensive speech therapy. After tonsillectomy for obstructive sleep apnea at age 6 years, he experienced a reduction in seizures and rapid growth in speech production. The findings support a relationship between obstructive sleep apnea and childhood apraxia of speech. The rather late diagnosis and treatment of obstructive sleep apnea, especially in light of what was such a life-altering outcome (gaining functional speech), has significant implications. Most speech sounds develop during ages 2-5 years, which is also the peak time of occurrence of adenotonsillar hypertrophy and childhood obstructive sleep apnea. Hence it is important to establish definitive diagnoses, and to consider early and more aggressive treatments for obstructive sleep apnea, in children with motor speech disorders.

  17. Visual Phonetic Processing Localized Using Speech and Non-Speech Face Gestures in Video and Point-Light Displays

    PubMed Central

    Bernstein, Lynne E.; Jiang, Jintao; Pantazis, Dimitrios; Lu, Zhong-Lin; Joshi, Anand

    2011-01-01

    The talking face affords multiple types of information. To isolate cortical sites with responsibility for integrating linguistically relevant visual speech cues, speech and non-speech face gestures were presented in natural video and point-light displays during fMRI scanning at 3.0T. Participants with normal hearing viewed the stimuli and also viewed localizers for the fusiform face area (FFA), the lateral occipital complex (LOC), and the visual motion (V5/MT) regions of interest (ROIs). The FFA, the LOC, and V5/MT were significantly less activated for speech relative to non-speech and control stimuli. Distinct activation of the posterior superior temporal sulcus and the adjacent middle temporal gyrus to speech, independent of media, was obtained in group analyses. Individual analyses showed that speech and non-speech stimuli were associated with adjacent but different activations, with the speech activations more anterior. We suggest that the speech activation area is the temporal visual speech area (TVSA), and that it can be localized with the combination of stimuli used in this study. PMID:20853377

  18. Relationship between speech motor control and speech intelligibility in children with speech sound disorders.

    PubMed

    Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F

    2013-01-01

    The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3:11 to 6:7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For CSIM, each time a recorded word was played to the listeners they were asked to look at a list of 12 words (multiple-choice format) and circle the word while for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may potentially be a limiting factor in connected speech intelligibility and highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.

  19. A common functional neural network for overt production of speech and gesture.

    PubMed

    Marstaller, L; Burianová, H

    2015-01-22

    The perception of co-speech gestures, i.e., hand movements that co-occur with speech, has been investigated by several studies. The results show that the perception of co-speech gestures engages a core set of frontal, temporal, and parietal areas. However, no study has yet investigated the neural processes underlying the production of co-speech gestures. Specifically, it remains an open question whether Broca's area is central to the coordination of speech and gestures as has been suggested previously. The objective of this study was to use functional magnetic resonance imaging to (i) investigate the regional activations underlying overt production of speech, gestures, and co-speech gestures, and (ii) examine functional connectivity with Broca's area. We hypothesized that co-speech gesture production would activate frontal, temporal, and parietal regions that are similar to areas previously found during co-speech gesture perception and that both speech and gesture as well as co-speech gesture production would engage a neural network connected to Broca's area. Whole-brain analysis confirmed our hypothesis and showed that co-speech gesturing did engage brain areas that form part of networks known to subserve language and gesture. Functional connectivity analysis further revealed a functional network connected to Broca's area that is common to speech, gesture, and co-speech gesture production. This network consists of brain areas that play essential roles in motor control, suggesting that the coordination of speech and gesture is mediated by a shared motor control network. Our findings thus lend support to the idea that speech can influence co-speech gesture production on a motoric level. Copyright © 2014 IBRO. Published by Elsevier Ltd. All rights reserved.

  20. Female voice communications in high level aircraft cockpit noises--part II: vocoder and automatic speech recognition systems.

    PubMed

    Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M

    1998-11-01

    The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high level aircraft cockpit noise conditions experienced in military aviation. In Part I, (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83) female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of the male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in a Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises were understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.

  1. Brain networks engaged in audiovisual integration during speech perception revealed by persistent homology-based network filtration.

    PubMed

    Kim, Heejung; Hahm, Jarang; Lee, Hyekyoung; Kang, Eunjoo; Kang, Hyejin; Lee, Dong Soo

    2015-05-01

    The human brain naturally integrates audiovisual information to improve speech perception. However, in noisy environments, understanding speech is difficult and may require much effort. Although the brain network is supposed to be engaged in speech perception, it is unclear how speech-related brain regions are connected during natural bimodal audiovisual or unimodal speech perception with counterpart irrelevant noise. To investigate the topological changes of speech-related brain networks at all possible thresholds, we used a persistent homological framework through hierarchical clustering, such as single linkage distance, to analyze the connected component of the functional network during speech perception using functional magnetic resonance imaging. For speech perception, bimodal (audio-visual speech cue) or unimodal speech cues with counterpart irrelevant noise (auditory white-noise or visual gum-chewing) were delivered to 15 subjects. In terms of positive relationship, similar connected components were observed in bimodal and unimodal speech conditions during filtration. However, during speech perception by congruent audiovisual stimuli, the tighter couplings of left anterior temporal gyrus-anterior insula component and right premotor-visual components were observed than auditory or visual speech cue conditions, respectively. Interestingly, visual speech is perceived under white noise by tight negative coupling in the left inferior frontal region-right anterior cingulate, left anterior insula, and bilateral visual regions, including right middle temporal gyrus, right fusiform components. In conclusion, the speech brain network is tightly positively or negatively connected, and can reflect efficient or effortful processes during natural audiovisual integration or lip-reading, respectively, in speech perception.

  2. Sensorimotor Oscillations Prior to Speech Onset Reflect Altered Motor Networks in Adults Who Stutter

    PubMed Central

    Mersov, Anna-Maria; Jobst, Cecilia; Cheyne, Douglas O.; De Nil, Luc

    2016-01-01

    Adults who stutter (AWS) have demonstrated atypical coordination of motor and sensory regions during speech production. Yet little is known of the speech-motor network in AWS in the brief time window preceding audible speech onset. The purpose of the current study was to characterize neural oscillations in the speech-motor network during preparation for and execution of overt speech production in AWS using magnetoencephalography (MEG). Twelve AWS and 12 age-matched controls were presented with 220 words, each word embedded in a carrier phrase. Controls were presented with the same word list as their matched AWS participant. Neural oscillatory activity was localized using minimum-variance beamforming during two time periods of interest: speech preparation (prior to speech onset) and speech execution (following speech onset). Compared to controls, AWS showed stronger beta (15–25 Hz) suppression in the speech preparation stage, followed by stronger beta synchronization in the bilateral mouth motor cortex. AWS also recruited the right mouth motor cortex significantly earlier in the speech preparation stage compared to controls. Exaggerated motor preparation is discussed in the context of reduced coordination in the speech-motor network of AWS. It is further proposed that exaggerated beta synchronization may reflect a more strongly inhibited motor system that requires a stronger beta suppression to disengage prior to speech initiation. These novel findings highlight critical differences in the speech-motor network of AWS that occur prior to speech onset and emphasize the need to investigate further the speech-motor assembly in the stuttering population. PMID:27642279

  3. The Speech multi features fusion perceptual hash algorithm based on tensor decomposition

    NASA Astrophysics Data System (ADS)

    Huang, Y. B.; Fan, M. H.; Zhang, Q. Y.

    2018-03-01

    With constant progress in modern speech communication technologies, speech data is prone to corruption by noise or malicious tampering. To give the speech perceptual hash algorithm strong robustness and high efficiency, this paper proposes a speech perceptual hash algorithm based on tensor decomposition and multiple features. The algorithm obtains the speech components by wavelet packet decomposition and analyses their perceptual features. The LPCC, LSP and ISP features of each speech component are extracted to constitute the speech feature tensor. Speech authentication is done by generating the hash values through quantization of the feature matrix using the median value. Experimental results show that the proposed algorithm is robust to content-preserving operations compared with similar algorithms. It is able to resist attack by common background noise. The algorithm is also computationally efficient, and is able to meet the real-time requirements of speech communication and complete speech authentication quickly.
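
    The quantization step in the abstract above can be illustrated with a minimal sketch. It covers only the final stage: binarizing a feature matrix against its median and comparing two hashes. The wavelet packet decomposition, the LPCC/LSP/ISP extraction and the tensor decomposition are omitted. The function names, the toy matrices and the 0.25 threshold are hypothetical, and matching by bit error rate is a common perceptual-hashing convention assumed here, not something the abstract states.

```python
from statistics import median


def perceptual_hash(features):
    """Binarize a feature matrix: 1 where a value exceeds the matrix median."""
    flat = [value for row in features for value in row]
    mid = median(flat)
    return [1 if value > mid else 0 for value in flat]


def bit_error_rate(hash_a, hash_b):
    """Fraction of positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)


def is_same_content(hash_a, hash_b, threshold=0.25):
    """Assumed decision rule: accept when the bit error rate is small."""
    return bit_error_rate(hash_a, hash_b) <= threshold


# Toy feature matrices (hypothetical values, not real speech features).
original = [[0.9, 0.1, 0.4], [0.7, 0.2, 0.8]]
noisy = [[0.85, 0.15, 0.42], [0.72, 0.18, 0.79]]   # small perturbation
tampered = [[0.1, 0.9, 0.8], [0.2, 0.7, 0.3]]      # content changed

h_original = perceptual_hash(original)
h_noisy = perceptual_hash(noisy)
h_tampered = perceptual_hash(tampered)

print(is_same_content(h_original, h_noisy))     # small noise keeps the hash
print(is_same_content(h_original, h_tampered))  # changed content flips bits
```

    Thresholding against the median leaves the hash unchanged under small perturbations that do not move values across the median, which is the robustness property the abstract claims for background noise.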

  4. Associations between speech features and phenotypic severity in Treacher Collins syndrome

    PubMed Central

    2014-01-01

    Background: Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Methods: Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5–74 years, median 34 years) divided into three groups comprising children 5–10 years (n = 4), adolescents 11–18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0–6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Results: Children and adolescents presented with significantly higher speech composite scores (median 4, range 1–6) than adults (median 1, range 0–5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31–99) than in adults (98%, range 93–100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Conclusions: Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services. PMID:24775909

  5. Associations between speech features and phenotypic severity in Treacher Collins syndrome.

    PubMed

    Asten, Pamela; Akre, Harriet; Persson, Christina

    2014-04-28

    Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. 
    Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services.

  6. Conflict monitoring in speech processing: An fMRI study of error detection in speech production and perception.

    PubMed

    Gauvin, Hanna S; De Baene, Wouter; Brass, Marcel; Hartsuiker, Robert J

    2016-02-01

    To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated whether internal verbal monitoring takes place through the speech perception system, as proposed by perception-based theories of speech monitoring, or whether mechanisms independent of perception are applied, as proposed by production-based theories of speech monitoring. With the use of fMRI during a tongue twister task we observed that error detection in internal speech during noise-masked overt speech production and error detection in speech perception both recruit the same neural network, which includes pre-supplementary motor area (pre-SMA), dorsal anterior cingulate cortex (dACC), anterior insula (AI), and inferior frontal gyrus (IFG). Although production and perception recruit similar areas, as proposed by perception-based accounts, we did not find activation in superior temporal areas (which are typically associated with speech perception) during internal speech monitoring in speech production as hypothesized by these accounts. On the contrary, results are highly compatible with a domain general approach to speech monitoring, by which internal speech monitoring takes place through detection of conflict between response options, which is subsequently resolved by a domain general executive center (e.g., the ACC). Copyright © 2015 Elsevier Inc. All rights reserved.

  7. A novel radar sensor for the non-contact detection of speech signals.

    PubMed

    Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi

    2010-01-01

    Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects.

  8. A Novel Radar Sensor for the Non-Contact Detection of Speech Signals

    PubMed Central

    Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi

    2010-01-01

    Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects. PMID:22399895

  9. Acoustics in human communication: evolving ideas about the nature of speech.

    PubMed

    Cooper, F S

    1980-07-01

    This paper discusses changes in attitude toward the nature of speech during the past half century. After reviewing early views on the subject, it considers the role of speech spectrograms, speech articulation, speech perception, messages and computers, and the nature of fluent speech.

  10. Music and Speech Perception in Children Using Sung Speech

    PubMed Central

    Nie, Yingjiu; Galvin, John J.; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners. PMID:29609496

  11. Music and Speech Perception in Children Using Sung Speech.

    PubMed

    Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.

  12. THE COMPREHENSION OF RAPID SPEECH BY THE BLIND, PART III.

    ERIC Educational Resources Information Center

    FOULKE, EMERSON

    A REVIEW OF THE RESEARCH ON THE COMPREHENSION OF RAPID SPEECH BY THE BLIND IDENTIFIES FIVE METHODS OF SPEECH COMPRESSION--SPEECH CHANGING, ELECTROMECHANICAL SAMPLING, COMPUTER SAMPLING, SPEECH SYNTHESIS, AND FREQUENCY DIVIDING WITH THE HARMONIC COMPRESSOR. THE SPEECH CHANGING AND ELECTROMECHANICAL SAMPLING METHODS AND THE NECESSARY APPARATUS HAVE…

  13. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  14. 75 FR 5794 - Notice of Public Information Collection Being Reviewed by the Federal Communications Commission...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-04

    ... Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities; IP Captioned Telephone..., the Commission released Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities, CC Docket No. 98-67, Declaratory Ruling, published at 68 FR...

  15. Extensions to the Speech Disorders Classification System (SDCS)

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  16. Linguistic contributions to speech-on-speech masking for native and non-native listeners: language familiarity and semantic content.

    PubMed

    Brouwer, Susanne; Van Engen, Kristin J; Calandruccio, Lauren; Bradlow, Ann R

    2012-02-01

    This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. © 2012 Acoustical Society of America

  17. Linguistic contributions to speech-on-speech masking for native and non-native listeners: Language familiarity and semantic content

    PubMed Central

    Brouwer, Susanne; Van Engen, Kristin J.; Calandruccio, Lauren; Bradlow, Ann R.

    2012-01-01

    This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener’s knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. PMID:22352516

  18. Auditory-Perceptual Learning Improves Speech Motor Adaptation in Children

    PubMed Central

    Shiller, Douglas M.; Rochon, Marie-Lyne

    2015-01-01

    Auditory feedback plays an important role in children’s speech development by providing the child with information about speech outcomes that is used to learn and fine-tune speech motor plans. The use of auditory feedback in speech motor learning has been extensively studied in adults by examining oral motor responses to manipulations of auditory feedback during speech production. Children are also capable of adapting speech motor patterns to perceived changes in auditory feedback; however, it is not known whether their capacity for motor learning is limited by immature auditory-perceptual abilities. Here, the link between speech perceptual ability and the capacity for motor learning was explored in two groups of 5–7-year-old children who underwent a period of auditory perceptual training followed by tests of speech motor adaptation to altered auditory feedback. One group received perceptual training on a speech acoustic property relevant to the motor task while a control group received perceptual training on an irrelevant speech contrast. Learned perceptual improvements led to an enhancement in speech motor adaptation (proportional to the perceptual change) only for the experimental group. The results indicate that children’s ability to perceive relevant speech acoustic properties has a direct influence on their capacity for sensory-based speech motor adaptation. PMID:24842067

  19. Cortical activation patterns correlate with speech understanding after cochlear implantation

    PubMed Central

    Olds, Cristen; Pollonini, Luca; Abaya, Homer; Larky, Jannine; Loy, Megan; Bortfeld, Heather; Beauchamp, Michael S.; Oghalai, John S.

    2015-01-01

    Objectives: Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, we used functional near-infrared spectroscopy (fNIRS) to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design: We studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. We used fNIRS to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). We also used environmental sounds as a control stimulus. Behavioral measures consisted of the Speech Reception Threshold, CNC words, and AzBio Sentence tests measured in quiet. Results: Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation to normal speech to that of scrambled speech directly correlated with the CNC Words and AzBio Sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced cortical activations in all implanted participants. Conclusions: Together, these data indicate that the responses we measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation. PMID:26709749

  20. Measures to Evaluate the Effects of DBS on Speech Production

    PubMed Central

    Weismer, Gary; Yunusova, Yana; Bunton, Kate

    2011-01-01

    The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066

  1. Spectral and temporal changes to speech produced in the presence of energetic and informational maskers.

    PubMed

    Cooke, Martin; Lu, Youyi

    2010-10-01

    Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.

  2. Speech Comprehension Difficulties in Chronic Tinnitus and Its Relation to Hyperacusis

    PubMed Central

    Vielsmeier, Veronika; Kreuzer, Peter M.; Haubner, Frank; Steffens, Thomas; Semmler, Philipp R. O.; Kleinjung, Tobias; Schlee, Winfried; Langguth, Berthold; Schecklmann, Martin

    2016-01-01

    Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2) compare subjective reports of speech comprehension difficulties with behavioral measurements in a standardized speech comprehension test and to (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessments (pure tone audiometry, tinnitus pitch, and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments (“How would you rate your ability to understand speech?”; “How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?”). Results: Subjectively-reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both for general and in noisy environment) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated with speech comprehension difficulties in noisy environments, but not with speech comprehension difficulties in general. Conclusion: Speech comprehension deficits are frequent among tinnitus patients. Whereas speech comprehension deficits in quiet environments are primarily due to peripheral hearing loss, speech comprehension deficits in noisy environments are related to both peripheral hearing loss and dysfunctional central auditory processing. Disturbed speech comprehension in noisy environments might be modulated by a central inhibitory deficit. In addition, attentional and cognitive aspects may play a role. PMID:28018209

  3. Speech Comprehension Difficulties in Chronic Tinnitus and Its Relation to Hyperacusis.

    PubMed

    Vielsmeier, Veronika; Kreuzer, Peter M; Haubner, Frank; Steffens, Thomas; Semmler, Philipp R O; Kleinjung, Tobias; Schlee, Winfried; Langguth, Berthold; Schecklmann, Martin

    2016-01-01

    Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2) compare subjective reports of speech comprehension difficulties with behavioral measurements in a standardized speech comprehension test and to (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessments (pure tone audiometry, tinnitus pitch, and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments ("How would you rate your ability to understand speech?"; "How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?"). Results: Subjectively-reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both for general and in noisy environment) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated with speech comprehension difficulties in noisy environments, but not with speech comprehension difficulties in general. Conclusion: Speech comprehension deficits are frequent among tinnitus patients. Whereas speech comprehension deficits in quiet environments are primarily due to peripheral hearing loss, speech comprehension deficits in noisy environments are related to both peripheral hearing loss and dysfunctional central auditory processing. Disturbed speech comprehension in noisy environments might be modulated by a central inhibitory deficit. In addition, attentional and cognitive aspects may play a role.

  4. The effect of filtered speech feedback on the frequency of stuttering

    NASA Astrophysics Data System (ADS)

    Rami, Manish Krishnakant

    2000-10-01

    This study investigated the effects of filtered components of speech and whispered speech on the frequency of stuttering. It is known that choral speech, shadowing, and altered auditory feedback are the only conditions which induce fluency without any more effort than is normally required to speak on the part of people who stutter. All these conditions use speech as a second signal. This experiment examined the role of components of the speech signal as delineated by the source-filter theory of speech production. Three filtered speech signals, a whispered speech signal, and a choral speech signal formed the stimuli. It was postulated that if the speech signal in whole was necessary for producing fluency in people who stutter, then all other conditions except choral speech should fail to produce fluency enhancement. If the glottal source alone was adequate in restoring fluency, then only the conditions of NAF and whispered speech should fail in promoting fluency. In the event that full filter characteristics are necessary for the fluency creating effects, then all conditions except the choral speech and whispered speech should fail to produce fluency. If any part of the filter characteristics is sufficient in yielding fluency, then only the NAF and the approximate glottal source should fail to demonstrate an increase in the amount of fluency. Twelve adults who stuttered read passages under the six conditions while receiving auditory feedback consisting of one of the six experimental conditions: (a) NAF; (b) approximate glottal source; (c) glottal source and first formant; (d) glottal source and first two formants; and (e) whispered speech. Frequencies of stuttering were obtained for each condition and submitted to descriptive and inferential statistical analysis. Statistically significant differences in means were found within the choral feedback conditions. Specifically, the choral speech, the source and first formant, source and the first two formants, and the whispered speech conditions all decreased the frequency of stuttering while the approximate glottal source did not. It is suggested that articulatory events, chiefly the encoded speech output of the vocal tract origin, afford effective cues and induce fluent speech in people who stutter.

  5. Effects of Within-Talker Variability on Speech Intelligibility in Mandarin-Speaking Adult and Pediatric Cochlear Implant Patients

    PubMed Central

    Su, Qiaotong; Galvin, John J.; Zhang, Guoping; Li, Yongxin

    2016-01-01

    Cochlear implant (CI) speech performance is typically evaluated using well-enunciated speech produced at a normal rate by a single talker. CI users often have greater difficulty with variations in speech production encountered in everyday listening. Within a single talker, speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context. The coarse spectral resolution afforded by the CI limits perception of voice pitch, which is an important cue for speech prosody and for tonal languages such as Mandarin Chinese. In this study, sentence recognition from the Mandarin speech perception database was measured in adult and pediatric Mandarin-speaking CI listeners for a variety of speaking styles: voiced speech produced at slow, normal, and fast speaking rates; whispered speech; voiced emotional speech; and voiced shouted speech. Recognition of Mandarin Hearing in Noise Test sentences was also measured. Results showed that performance was significantly poorer with whispered speech relative to the other speaking styles and that performance was significantly better with slow speech than with fast or emotional speech. Results also showed that adult and pediatric performance was significantly poorer with Mandarin Hearing in Noise Test than with Mandarin speech perception sentences at the normal rate. The results suggest that adult and pediatric Mandarin-speaking CI patients are highly susceptible to whispered speech, due to the lack of lexically important voice pitch cues and perhaps other qualities associated with whispered speech. The results also suggest that test materials may contribute to differences in performance observed between adult and pediatric CI users. PMID:27363714

  6. EEG oscillations entrain their phase to high-level features of speech sound.

    PubMed

    Zoefel, Benedikt; VanRullen, Rufin

    2016-01-01

    Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation, and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed "high-level" speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

    ERIC Educational Resources Information Center

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…

  8. Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech

    ERIC Educational Resources Information Center

    Poole, Matthew L.; Brodtmann, Amy; Darby, David; Vogel, Adam P.

    2017-01-01

    Purpose: Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Method: Speech and…

  9. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

    ERIC Educational Resources Information Center

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

    2018-01-01

    Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…

  10. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    ERIC Educational Resources Information Center

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  11. Inner Speech's Relationship with Overt Speech in Poststroke Aphasia

    ERIC Educational Resources Information Center

    Stark, Brielle C.; Geva, Sharon; Warburton, Elizabeth A.

    2017-01-01

    Purpose: Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech…

  12. TEACHER'S GUIDE TO HIGH SCHOOL SPEECH.

    ERIC Educational Resources Information Center

    JENKINSON, EDWARD B., ED.

    THIS GUIDE TO HIGH SCHOOL SPEECH FOCUSES ON SPEECH AS ORAL COMPOSITION, STRESSING THE IMPORTANCE OF CLEAR THINKING AND COMMUNICATION. THE PROPOSED 1-SEMESTER BASIC COURSE IN SPEECH ATTEMPTS TO IMPROVE THE STUDENT'S ABILITY TO COMPOSE AND DELIVER SPEECHES, TO THINK AND LISTEN CRITICALLY, AND TO UNDERSTAND THE SOCIAL FUNCTION OF SPEECH. IN ADDITION…

  13. Chinese speech intelligibility and its relationship with the speech transmission index for children in elementary school classrooms.

    PubMed

    Peng, Jianxin; Yan, Nanjie; Wang, Dan

    2015-01-01

    The present study investigated Chinese speech intelligibility in 28 classrooms from nine different elementary schools in Guangzhou, China. The subjective Chinese speech intelligibility in the classrooms was evaluated with children in grades 2, 4, and 6 (7 to 12 years old). Acoustical measurements were also performed in these classrooms. Subjective Chinese speech intelligibility scores and objective speech intelligibility parameters, such as speech transmission index (STI), were obtained at each listening position for all tests. The relationship between subjective Chinese speech intelligibility scores and STI was revealed and analyzed. The effects of age on Chinese speech intelligibility scores were compared. Results indicate high correlations between subjective Chinese speech intelligibility scores and STI for grades 2, 4, and 6 children. Chinese speech intelligibility scores increase with increase of age under the same STI condition. The differences in scores among different age groups decrease as STI increases. To achieve 95% Chinese speech intelligibility scores, the STIs required for grades 2, 4, and 6 children are 0.75, 0.69, and 0.63, respectively.

  14. A causal test of the motor theory of speech perception: A case of impaired speech production and spared speech perception

    PubMed Central

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.

    2015-01-01

    In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient’s spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e., labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggests that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  15. Affective Properties of Mothers' Speech to Infants With Hearing Impairment and Cochlear Implants

    PubMed Central

    Bergeson, Tonya R.; Xu, Huiping; Kitamura, Christine

    2015-01-01

    Purpose: The affective properties of infant-directed speech influence the attention of infants with normal hearing to speech sounds. This study explored the affective quality of maternal speech to infants with hearing impairment (HI) during the 1st year after cochlear implantation as compared to speech to infants with normal hearing. Method: Mothers of infants with HI and mothers of infants with normal hearing matched by age (NH-AM) or hearing experience (NH-EM) were recorded playing with their infants during 3 sessions over a 12-month period. Speech samples of 25 s were low-pass filtered, leaving intonation but not speech information intact. Sixty adults rated the stimuli along 5 scales: positive/negative affect and intention to express affection, to encourage attention, to comfort/soothe, and to direct behavior. Results: Low-pass filtered speech to HI and NH-EM groups was rated as more positive, affective, and comforting compared with such speech to the NH-AM group. Speech to infants with HI and with NH-AM was rated as more directive than speech to the NH-EM group. Mothers decreased affective qualities in speech to all infants but increased directive qualities in speech to infants with NH-EM over time. Conclusions: Mothers fine-tune communicative intent in speech to their infant's developmental stage. They adjust affective qualities to infants' hearing experience rather than to chronological age but adjust directive qualities of speech to the chronological age of their infants. PMID:25679195

  16. Freedom of racist speech: Ego and expressive threats.

    PubMed

    White, Mark H; Crandall, Christian S

    2017-09-01

    Do claims of "free speech" provide cover for prejudice? We investigate whether this defense of racist or hate speech serves as a justification for prejudice. In a series of 8 studies (N = 1,624), we found that explicit racial prejudice is a reliable predictor of the "free speech defense" of racist expression. Participants endorsed free speech values for singing racist songs or posting racist comments on social media; people high in prejudice endorsed free speech more than people low in prejudice (meta-analytic r = .43). This endorsement was not principled: high levels of prejudice did not predict endorsement of free speech values when identical speech was directed at coworkers or the police. Participants low in explicit racial prejudice actively avoided endorsing free speech values in racialized conditions compared to nonracial conditions, but participants high in racial prejudice increased their endorsement of free speech values in racialized conditions. Three experiments failed to find evidence that defense of racist speech by the highly prejudiced was based in self-relevant or self-protective motives. Two experiments found evidence that the free speech argument protected participants' own freedom to express their attitudes; the defense of others' racist speech seems motivated more by threats to autonomy than threats to self-regard. These studies serve as an elaboration of the Justification-Suppression Model (Crandall & Eshleman, 2003) of prejudice expression. The justification of racist speech by endorsing fundamental political values can serve to buffer racial and hate speech from normative disapproval. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  17. Speech Entrainment Compensates for Broca's Area Damage

    PubMed Central

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-01-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443

  18. The role of hearing ability and speech distortion in the facilitation of articulatory motor cortex.

    PubMed

    Nuttall, Helen E; Kennedy-Higgins, Daniel; Devlin, Joseph T; Adank, Patti

    2017-01-08

    Excitability of articulatory motor cortex is facilitated when listening to speech in challenging conditions. Beyond this, however, we have little knowledge of what listener-specific and speech-specific factors engage articulatory facilitation during speech perception. For example, it is unknown whether speech motor activity is independent or dependent on the form of distortion in the speech signal. It is also unknown if speech motor facilitation is moderated by hearing ability. We investigated these questions in two experiments. We applied transcranial magnetic stimulation (TMS) to the lip area of primary motor cortex (M1) in young, normally hearing participants to test if lip M1 is sensitive to the quality (Experiment 1) or quantity (Experiment 2) of distortion in the speech signal, and if lip M1 facilitation relates to the hearing ability of the listener. Experiment 1 found that lip motor evoked potentials (MEPs) were larger during perception of motor-distorted speech that had been produced using a tongue depressor, and during perception of speech presented in background noise, relative to natural speech in quiet. Experiment 2 did not find evidence of motor system facilitation when speech was presented in noise at signal-to-noise ratios where speech intelligibility was at 50% or 75%, which were significantly less severe noise levels than used in Experiment 1. However, there was a significant interaction between noise condition and hearing ability, which indicated that when speech stimuli were correctly classified at 50%, speech motor facilitation was observed in individuals with better hearing, whereas individuals with relatively worse but still normal hearing showed more activation during perception of clear speech. These findings indicate that the motor system may be sensitive to the quantity, but not quality, of degradation in the speech signal. 
Data support the notion that motor cortex complements auditory cortex during speech perception, and point to a role for the motor cortex in compensating for differences in hearing ability. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. How visual timing and form information affect speech and non-speech processing.

    PubMed

    Kim, Jeesun; Davis, Chris

    2014-10-01

    Auditory speech processing is facilitated when the talker's face/head movements are seen. This effect is typically explained in terms of visual speech providing form and/or timing information. We determined the effect of both types of information on a speech/non-speech task (non-speech stimuli were spectrally rotated speech). All stimuli were presented paired with the talker's static or moving face. Two types of moving face stimuli were used: full-face versions (both spoken form and timing information available) and modified face versions (only timing information provided by peri-oral motion available). The results showed that the peri-oral timing information facilitated response time for speech and non-speech stimuli compared to a static face. An additional facilitatory effect was found for full-face versions compared to the timing condition; this effect only occurred for speech stimuli. We propose the timing effect was due to cross-modal phase resetting; the form effect to cross-modal priming. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C [Livermore, CA]; Holzrichter, John F [Berkeley, CA]; Ng, Lawrence C [Danville, CA]

    2006-08-08

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  1. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2004-03-23

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  2. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-02-14

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  3. Children perceive speech onsets by ear and eye

    PubMed Central

    Jerger, Susan; Damian, Markus F.; Tye-Murray, Nancy; Abdi, Hervé

    2016-01-01

    Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see dynamic face articulate consonant/rhyme b/ag; hear non-intact onset/rhyme: −b/ag) vs. auditorily (see still face; hear exactly same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children – like adults – perceive speech onsets multimodally. Findings are critical for incorporating visual speech into developmental theories of speech perception. PMID:26752548

  4. Impairments of speech fluency in Lewy body spectrum disorder.

    PubMed

    Ash, Sharon; McMillan, Corey; Gross, Rachel G; Cook, Philip; Gunawardena, Delani; Morgan, Brianna; Boller, Ashley; Siderowf, Andrew; Grossman, Murray

    2012-03-01

    Few studies have examined connected speech in demented and non-demented patients with Parkinson's disease (PD). We assessed the speech production of 35 patients with Lewy body spectrum disorder (LBSD), including non-demented PD patients, patients with PD dementia (PDD), and patients with dementia with Lewy bodies (DLB), in a semi-structured narrative speech sample in order to characterize impairments of speech fluency and to determine the factors contributing to reduced speech fluency in these patients. Both demented and non-demented PD patients exhibited reduced speech fluency, characterized by reduced overall speech rate and long pauses between sentences. Reduced speech rate in LBSD correlated with measures of between-utterance pauses, executive functioning, and grammatical comprehension. Regression analyses related non-fluent speech, grammatical difficulty, and executive difficulty to atrophy in frontal brain regions. These findings indicate that multiple factors contribute to slowed speech in LBSD, and this is mediated in part by disease in frontal brain regions. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. Speech processing using conditional observable maximum likelihood continuity mapping

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogden, John; Nix, David

    A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.

  6. Walking the talk--speech activates the leg motor cortex.

    PubMed

    Liuzzi, Gianpiero; Ellger, Tanja; Flöel, Agnes; Breitenstein, Caterina; Jansen, Andreas; Knecht, Stefan

    2008-09-01

    Speech may have evolved from earlier modes of communication based on gestures. Consistent with such a motor theory of speech, cortical orofacial and hand motor areas are activated by both speech production and speech perception. However, the extent of speech-related activation of the motor cortex remains unclear. Therefore, we examined if reading and listening to continuous prose also activates non-brachiofacial motor representations like the leg motor cortex. We found corticospinal excitability of bilateral leg muscle representations to be enhanced by speech production and silent reading. Control experiments showed that speech production yielded stronger facilitation of the leg motor system than non-verbal tongue-mouth mobilization and silent reading more than a visuo-attentional task thus indicating speech-specificity of the effect. In the frame of the motor theory of speech this finding suggests that the system of gestural communication, from which speech may have evolved, is not confined to the hand but includes gestural movements of other body parts as well.

  7. Religious Speech in the Military: Freedoms and Limitations

    DTIC Science & Technology

    2011-01-01

    abridging the freedom of speech.” Speech is construed broadly and includes both oral and written speech, as well as expressive conduct and displays when...intended to convey a message that is likely to be understood.7 Religious speech is certainly included. As a bedrock constitutional right, freedom of speech has...to good order and discipline or of a nature to bring discredit upon the armed forces)—the First Amendment’s freedom of speech will not provide them

  8. Dog-directed speech: why do we use it and do dogs pay attention to it?

    PubMed Central

    Ben-Aderet, Tobey; Gallego-Abenza, Mario

    2017-01-01

    Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. PMID:28077769

  9. The prevalence of stuttering, voice, and speech-sound disorders in primary school students in Australia.

    PubMed

    McKinnon, David H; McLeod, Sharynne; Reilly, Sheena

    2007-01-01

    The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to schoolchildren with speech disorders. Students with speech disorders were identified from 10,425 students in Australia using a 4-stage process: training in the data collection process, teacher identification, confirmation by a speech-language pathologist, and consultation with district special needs advisors. The prevalence of students with speech disorders was estimated; specifically, 0.33% of students were identified as stuttering, 0.12% as having a voice disorder, and 1.06% as having a speech-sound disorder. There was a higher prevalence of speech disorders in males than in females. As grade level increased, the prevalence of speech disorders decreased. There was no significant difference in the pattern of prevalence across the three speech disorders and four socioeconomic groups; however, students who were identified with a speech disorder were more likely to be in the higher socioeconomic groups. Finally, there was a difference between the perceived and actual level of support that was provided to these students. These prevalence figures are lower than those using initial identification by speech-language pathologists and similar to those using parent report.

  10. Role of contextual cues on the perception of spectrally reduced interrupted speech.

    PubMed

    Patro, Chhayakanta; Mendel, Lisa Lucks

    2016-08-01

    Understanding speech within an auditory scene is constantly challenged by interfering noise in suboptimal listening environments when noise hinders the continuity of the speech stream. In such instances, a typical auditory-cognitive system perceptually integrates available speech information and "fills in" missing information in the light of semantic context. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal hearing counterparts. This inefficiency in perceptual integration of speech could be attributed to further degradations in the spectral-temporal domain imposed by CIs making it difficult to utilize the contextual evidence effectively. To address these issues, 20 normal hearing adults listened to speech that was spectrally reduced and spectrally reduced interrupted in a manner similar to CI processing. The Revised Speech Perception in Noise test, which includes contextually rich and contextually poor sentences, was used to evaluate the influence of semantic context on speech perception. Results indicated that listeners benefited more from semantic context when they listened to spectrally reduced speech alone. For the spectrally reduced interrupted speech, contextual information was not as helpful under significant spectral reductions, but became beneficial as the spectral resolution improved. These results suggest top-down processing facilitates speech perception up to a point, and it fails to facilitate speech understanding when the speech signals are significantly degraded.

  11. Dog-directed speech: why do we use it and do dogs pay attention to it?

    PubMed

    Ben-Aderet, Tobey; Gallego-Abenza, Mario; Reby, David; Mathevon, Nicolas

    2017-01-11

    Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. © 2017 The Author(s).

  12. The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech.

    PubMed

    Crosse, Michael J; Lalor, Edmund C

    2014-04-01

    Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.

  13. 76 FR 69737 - Information Collections Being Reviewed by the Federal Communications Commission for Extension...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-09

    ...: Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities... Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities; Americans with Disabilities Act of 1990, CC Docket No. 98-67, CG Docket No. 10-123, Second Report and Order, Order on...

  14. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

    ERIC Educational Resources Information Center

    Drijvers, Linda; Ozyurek, Asli

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…

  15. Compressed Speech: Potential Application for Air Force Technical Training. Final Report, August 73-November 73.

    ERIC Educational Resources Information Center

    Dailey, K. Anne

    Time-compressed speech (also called compressed speech, speeded speech, or accelerated speech) is an extension of the normal recording procedure for reproducing the spoken word. Compressed speech can be used to achieve dramatic reductions in listening time without significant loss in comprehension. The implications of such temporal reductions in…

  16. Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder

    ERIC Educational Resources Information Center

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2006-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…

  17. Poor Speech Perception Is Not a Core Deficit of Childhood Apraxia of Speech: Preliminary Findings

    ERIC Educational Resources Information Center

    Zuk, Jennifer; Iuzzini-Seigel, Jenya; Cabbage, Kathryn; Green, Jordan R.; Hogan, Tiffany P.

    2018-01-01

    Purpose: Childhood apraxia of speech (CAS) is hypothesized to arise from deficits in speech motor planning and programming, but the influence of abnormal speech perception in CAS on these processes is debated. This study examined speech perception abilities among children with CAS with and without language impairment compared to those with…

  18. Neural tracking of attended versus ignored speech is differentially affected by hearing loss.

    PubMed

    Petersen, Eline Borch; Wöstmann, Malte; Obleser, Jonas; Lunner, Thomas

    2017-01-01

    Hearing loss manifests as a reduced ability to understand speech, particularly in multitalker situations. In these situations, younger normal-hearing listeners' brains are known to track attended speech through phase-locking of neural activity to the slow-varying envelope of the speech. This study investigates how hearing loss, compensated by hearing aids, affects the neural tracking of the speech-onset envelope in elderly participants with varying degree of hearing loss (n = 27, 62-86 yr; hearing thresholds 11-73 dB hearing level). In an active listening task, a to-be-attended audiobook (signal) was presented either in quiet or against a competing to-be-ignored audiobook (noise) presented at three individualized signal-to-noise ratios (SNRs). The neural tracking of the to-be-attended and to-be-ignored speech was quantified through the cross-correlation of the electroencephalogram (EEG) and the temporal envelope of speech. We primarily investigated the effects of hearing loss and SNR on the neural envelope tracking. First, we found that elderly hearing-impaired listeners' neural responses reliably track the envelope of to-be-attended speech more than to-be-ignored speech. Second, hearing loss relates to the neural tracking of to-be-ignored speech, resulting in a weaker differential neural tracking of to-be-attended vs. to-be-ignored speech in listeners with worse hearing. Third, neural tracking of to-be-attended speech increased with decreasing background noise. Critically, the beneficial effect of reduced noise on neural speech tracking decreased with stronger hearing loss. In sum, our results show that a common sensorineural processing deficit, i.e., hearing loss, interacts with central attention mechanisms and reduces the differential tracking of attended and ignored speech. The present study investigates the effect of hearing loss in older listeners on the neural tracking of competing speech. 
Interestingly, we observed that whereas internal degradation (hearing loss) relates to the neural tracking of ignored speech, external sound degradation (ratio between attended and ignored speech; signal-to-noise ratio) relates to tracking of attended speech. This provides the first evidence for hearing loss affecting the ability to neurally track speech. Copyright © 2017 the American Physiological Society.
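
The abstract above quantifies neural tracking through the cross-correlation of the EEG with the temporal envelope of speech. A minimal sketch of that idea follows, using only the Python standard library. The function name `normalized_cross_correlation`, the toy signals, and the 3-sample delay are illustrative assumptions and are not taken from the study, which would also involve filtering, multiple channels, and statistical testing.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0


def normalized_cross_correlation(eeg, envelope, max_lag):
    """Correlate envelope[t] with eeg[t + lag] for lag = 0..max_lag (in samples)."""
    result = []
    for lag in range(max_lag + 1):
        # The neural response is assumed to follow the stimulus, so the EEG
        # is shifted forward by `lag` samples relative to the envelope.
        stimulus = envelope[: len(envelope) - lag]
        response = eeg[lag:]
        result.append(pearson(stimulus, response))
    return result


# Hypothetical toy data: the "EEG" is simply the envelope delayed by 3 samples.
envelope = [math.sin(0.3 * t) + 0.5 * math.sin(0.11 * t) for t in range(200)]
eeg = [0.0] * 3 + envelope[:-3]

xcorr = normalized_cross_correlation(eeg, envelope, max_lag=10)
best_lag = max(range(len(xcorr)), key=lambda k: xcorr[k])
print(best_lag, round(xcorr[best_lag], 3))
```

In this toy case the peak of the cross-correlation falls at the delay that was built into the data; with real recordings the peak height at physiologically plausible lags would serve as the tracking measure.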

  19. Asymmetric Dynamic Attunement of Speech and Gestures in the Construction of Children's Understanding.

    PubMed

    De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A

    2016-01-01

    As children learn they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. 12 children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task were coded, on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while the balance leans more toward speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children are older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. 
A higher score on math or past science tasks is related to less asymmetry between gestures and speech. The picture that emerges from our analyses suggests that the relation between gestures, speech and cognition is more complex than previously thought. We suggest that temporal differences and asymmetry in influence between gestures and speech arise from simultaneous coordination of synergies.
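
The abstract above applies cross recurrence quantification analysis (CRQA) to two coupled time series of skill levels. The sketch below shows only the core of that technique for categorical series: a cross recurrence matrix, its overall recurrence rate, and a diagonal recurrence profile that indicates which series tends to lead. The helper names and the toy skill-level sequences are hypothetical, and full CRQA measures such as determinism or laminarity are not computed here.

```python
def cross_recurrence_matrix(a, b):
    """matrix[i][j] is 1 when a[i] and b[j] are in the same state, else 0."""
    return [[1 if x == y else 0 for y in b] for x in a]


def recurrence_rate(matrix):
    """Share of recurrent points in the whole matrix."""
    total = sum(len(row) for row in matrix)
    return sum(sum(row) for row in matrix) / total


def diagonal_profile(matrix, max_lag):
    """Recurrence rate along each diagonal j - i = lag, for lag in -max_lag..max_lag."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    profile = {}
    for lag in range(-max_lag, max_lag + 1):
        cells = [matrix[i][i + lag] for i in range(n_rows) if 0 <= i + lag < n_cols]
        profile[lag] = sum(cells) / len(cells) if cells else 0.0
    return profile


# Hypothetical skill levels per time step: speech repeats the gesture level
# one step later, so gestures lead speech by one step.
gestures = [1, 2, 2, 3, 3, 4, 4, 5]
speech = [1] + gestures[:-1]

matrix = cross_recurrence_matrix(gestures, speech)
profile = diagonal_profile(matrix, max_lag=2)
leading_lag = max(profile, key=profile.get)
print(recurrence_rate(matrix), profile, leading_lag)
```

With rows indexed by gestures and columns by speech, a peak at a positive lag means the speech series revisits states that gestures reached earlier, which is one simple way to read temporal asymmetry off the recurrence plot.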

  20. Short-term effect of short, intensive speech therapy on articulation and resonance in Ugandan patients with cleft (lip and) palate.

    PubMed

    Luyten, A; Bettens, K; D'haeseleer, E; Hodges, A; Galiwango, G; Vermeersch, H; Van Lierde, K

    2016-01-01

    The purpose of the current study was to assess the short-term effectiveness of short and intensive speech therapy provided to patients with cleft (lip and) palate (C(L)P) in terms of articulation and resonance. Five Ugandan patients (age: 7.3-19.6 years) with non-syndromic C(L)P received six hours of individualized speech therapy in three to four days. Speech therapy focused on correct phonetic placement and contrasts between oral and nasal airflow and resonance. Speech evaluations performed before and immediately after speech therapy, including perceptual and instrumental assessment techniques, were compared. Post-therapy, improvement of speech was noted for most of the patients, although to varying degrees. Clinically relevant progress of objective nasalance values and/or articulation was obtained in four patients. Overall, two patients showed normal speech intelligibility, while three patients required additional speech therapy. These preliminary short-term results demonstrate that short and intensive speech therapy can be effective for patients with C(L)P in countries with limited access to speech-language therapy. However, further research is needed on the long-term effectiveness and the advantages of applying this treatment protocol in countries with good access to speech therapy. The reader will be able to (1) list the challenges in resource poor-countries to achieve access to speech-language therapy services, (2) describe when the application of speech therapy is appropriate in patients with C(L)P, (3) describe the speech therapy that can be applied to reduce compensatory articulation and resonance disorders in patients with C(L)P, and (4) list the (possible) advantages of short, intensive speech therapy for both resource-poor and developed countries. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Systematic studies of modified vocalization: the effect of speech rate on speech production measures during metronome-paced speech in persons who stutter.

    PubMed

    Davidow, Jason H

    2014-01-01

    Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control speech rate between conditions limits our ability to determine if the changes were necessary for fluency. This study examined the effect of speech rate on several speech production variables during one-syllable-per-beat metronomic speech in order to determine changes that may be important for fluency during this fluency-inducing condition. Thirteen persons who stutter (PWS), aged 18-62 years, completed a series of speaking tasks. Several speech production variables were compared between conditions produced at different metronome beat rates, and between a control condition and a metronome-paced speech condition produced at a rate equal to the control condition. Vowel duration, voice onset time, pressure rise time and phonated intervals were significantly impacted by metronome beat rate. Voice onset time and the percentage of short (30-100 ms) phonated intervals significantly decreased from the control condition to the equivalent rate metronome-paced speech condition. A reduction in the percentage of short phonated intervals may be important for fluency during syllable-based metronome-paced speech for PWS. Future studies should continue examining the necessity of this reduction. In addition, speech rate must be controlled in future fluency-inducing condition studies, including neuroimaging investigations, in order for this research to make a substantial contribution to finding the fluency-inducing mechanism of fluency-inducing conditions. © 2013 Royal College of Speech and Language Therapists.

  2. Intensive treatment of speech disorders in robin sequence: a case report.

    PubMed

    Pinto, Maria Daniela Borro; Pegoraro-Krook, Maria Inês; Andrade, Laura Katarine Félix de; Correa, Ana Paula Carvalho; Rosa-Lugo, Linda Iris; Dutka, Jeniffer de Cássia Rillo

    2017-10-23

    To describe the speech of a patient with Pierre Robin Sequence (PRS) and severe speech disorders before and after participating in an Intensive Speech Therapy Program (ISTP). The ISTP consisted of two daily sessions of therapy over a 36-week period, resulting in a total of 360 therapy sessions. The sessions included the phases of establishment, generalization, and maintenance. A combination of strategies, such as modified contrast therapy and speech sound perception training, were used to elicit adequate place of articulation. The ISTP addressed correction of place of production of oral consonants and maximization of movement of the pharyngeal walls with a speech bulb reduction program. Therapy targets were addressed at the phonetic level with a gradual increase in the complexity of the productions hierarchically (e.g., syllables, words, phrases, conversation) while simultaneously addressing the velopharyngeal hypodynamism with speech bulb reductions. Re-evaluation after the ISTP revealed normal speech resonance and articulation with the speech bulb. Nasoendoscopic assessment indicated consistent velopharyngeal closure for all oral sounds with the speech bulb in place. Intensive speech therapy, combined with the use of the speech bulb, yielded positive outcomes in the rehabilitation of a clinical case with severe speech disorders associated with velopharyngeal dysfunction in Pierre Robin Sequence.

  3. Speech Inconsistency in Children With Childhood Apraxia of Speech, Language Impairment, and Speech Delay: Depends on the Stimuli.

    PubMed

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Green, Jordan R

    2017-05-24

    The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and speech delay. Participants included 48 children ranging from 4;7 to 17;8 (years;months) with CAS (n = 10), CAS + language impairment (n = 10), speech delay (n = 10), language impairment (n = 9), or typical development (n = 9). Speech inconsistency was assessed at phonemic and token-to-token levels using a variety of stimuli. Children with CAS and CAS + language impairment performed equivalently on all inconsistency assessments. Children with language impairment evidenced high levels of speech inconsistency on the phrase "buy Bobby a puppy." Token-to-token inconsistency of monosyllabic words and the phrase "buy Bobby a puppy" was sensitive and specific in differentiating children with CAS and speech delay, whereas inconsistency calculated on other stimuli (e.g., multisyllabic words) was less efficacious in differentiating between these disorders. Speech inconsistency is a core feature of CAS and is efficacious in differentiating between children with CAS and speech delay; however, sensitivity and specificity are stimuli dependent.
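
The abstract above evaluates inconsistency scores as a sensitive and specific diagnostic marker. As a reminder of how those two quantities are computed, here is a minimal sketch that applies a cutoff to a score and counts the resulting classifications. The scores, group labels, and the 0.5 cutoff are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(scores, has_condition, threshold):
    """Treat score >= threshold as a positive test and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        positive = score >= threshold
        if condition and positive:
            tp += 1  # affected child correctly flagged
        elif condition:
            fn += 1  # affected child missed
        elif positive:
            fp += 1  # unaffected child wrongly flagged
        else:
            tn += 1  # unaffected child correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical token-to-token inconsistency scores (proportion of inconsistent tokens).
scores = [0.8, 0.7, 0.9, 0.4, 0.2, 0.3, 0.6, 0.1]
has_cas = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(scores, has_cas, threshold=0.5)
print(sens, spec)
```

Because both values depend on the cutoff and on which stimuli produced the scores, repeating the calculation per stimulus type is one way to see the stimulus dependence the abstract reports.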

  4. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech From Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech.

    PubMed

    Shriberg, Lawrence D; Strand, Edythe A; Fourakis, Marios; Jakielski, Kathy J; Hall, Sheryl D; Karlsson, Heather B; Mabie, Heather L; McSweeny, Jane L; Tilkens, Christie M; Wilson, David L

    2017-04-14

    Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. PM and other scores were obtained for 264 participants in 6 groups: CAS in idiopathic, neurogenetic, and complex neurodevelopmental disorders; adult-onset apraxia of speech (AAS) consequent to stroke and primary progressive apraxia of speech; and idiopathic speech delay. Participants with CAS and AAS had significantly lower scores than typically speaking reference participants and speech delay controls on measures posited to assess representational and transcoding processes. Representational deficits differed between CAS and AAS groups, with support for both underspecified linguistic representations and memory/access deficits in CAS, but for only the latter in AAS. CAS-AAS similarities in the age-sex standardized percentages of occurrence of the most frequent type of inappropriate pauses (abrupt) and significant differences in the standardized occurrence of appropriate pauses were consistent with speech processing findings. Results support the hypotheses of core representational and transcoding speech processing deficits in CAS and theoretical coherence of the PM's pause-speech elements with these deficits.

  5. Cortical Integration of Audio-Visual Information

    PubMed Central

    Vander Wyk, Brent C.; Ramsay, Gordon J.; Hudac, Caitlin M.; Jones, Warren; Lin, David; Klin, Ami; Lee, Su Mei; Pelphrey, Kevin A.

    2013-01-01

    We investigated the neural basis of audio-visual processing in speech and non-speech stimuli. Physically identical auditory stimuli (speech and sinusoidal tones) and visual stimuli (animated circles and ellipses) were used in this fMRI experiment. Relative to unimodal stimuli, each of the multimodal conjunctions showed increased activation in largely non-overlapping areas. The conjunction of Ellipse and Speech, which most resembles naturalistic audiovisual speech, showed higher activation in the right inferior frontal gyrus, fusiform gyri, left posterior superior temporal sulcus, and lateral occipital cortex. The conjunction of Circle and Tone, an arbitrary audio-visual pairing with no speech association, activated middle temporal gyri and lateral occipital cortex. The conjunction of Circle and Speech showed activation in lateral occipital cortex, and the conjunction of Ellipse and Tone did not show increased activation relative to unimodal stimuli. Further analysis revealed that middle temporal regions, although identified as multimodal only in the Circle-Tone condition, were more strongly active to Ellipse-Speech or Circle-Speech, but regions that were identified as multimodal for Ellipse-Speech were always strongest for Ellipse-Speech. Our results suggest that combinations of auditory and visual stimuli may together be processed by different cortical networks, depending on the extent to which speech or non-speech percepts are evoked. PMID:20709442

  6. Speech privacy and annoyance considerations in the acoustic environment of passenger cars of high-speed trains.

    PubMed

    Jeon, Jin Yong; Hong, Joo Young; Jang, Hyung Suk; Kim, Jae Hyeon

    2015-12-01

    It is necessary to consider not only annoyance of interior noises but also speech privacy to achieve acoustic comfort in a passenger car of a high-speed train because speech from other passengers can be annoying. This study aimed to explore an optimal acoustic environment to satisfy speech privacy and reduce annoyance in a passenger car. Two experiments were conducted using speech sources and compartment noise of a high speed train with varying speech-to-noise ratios (SNRA) and background noise levels (BNL). Speech intelligibility was tested in experiment I, and in experiment II, perceived speech privacy, annoyance, and acoustic comfort of combined sounds with speech and background noise were assessed. The results show that speech privacy and annoyance were significantly influenced by the SNRA. In particular, the acoustic comfort was evaluated as acceptable when the SNRA was less than -6 dB for both speech privacy and noise annoyance. In addition, annoyance increased significantly as the BNL exceeded 63 dBA, whereas the effect of the background-noise level on the speech privacy was not significant. These findings suggest that an optimal level of interior noise in a passenger car might exist between 59 and 63 dBA, taking normal speech levels into account.

  7. Using on-line altered auditory feedback treating Parkinsonian speech

    NASA Astrophysics Data System (ADS)

    Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

    2005-09-01

    Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered feedback (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured-monologue and reading) under three conditions: baseline, with SE but without feedback, and with SE with feedback. The speech samples were randomly presented and rated for speech intelligibility goodness using UPDRS-III item 18 and the speaking rate. The results indicated that SpeechEasy is well tolerated and AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted [Work partially supported by Janus Dev. Group, Inc.].

  8. An acoustic comparison of two women's infant- and adult-directed speech

    NASA Astrophysics Data System (ADS)

    Andruski, Jean; Katz-Gershon, Shiri

    2003-04-01

    In addition to having prosodic characteristics that are attractive to infant listeners, infant-directed (ID) speech shares certain characteristics of adult-directed (AD) clear speech, such as increased acoustic distance between vowels, that might be expected to make ID speech easier for adults to perceive in noise than AD conversational speech. However, perceptual tests of two women's ID productions by Andruski and Bessega [J. Acoust. Soc. Am. 112, 2355] showed that this is not always the case. In a word identification task that compared ID speech with AD clear and conversational speech, one speaker's ID productions were less well identified than AD clear speech, but better identified than AD conversational speech. For the second woman, ID speech was the least accurately identified of the three speech registers. For both speakers, hard words (infrequent words with many lexical neighbors) were also at an increased disadvantage relative to easy words (frequent words with few lexical neighbors) in speech registers that were less accurately perceived. This study will compare several acoustic properties of these women's productions, including pitch and formant-frequency characteristics. Results of the acoustic analyses will be examined with the original perceptual results to suggest reasons for differences in listeners' accuracy in identifying these two women's ID speech in noise.

  9. Objective speech quality evaluation of real-time speech coders

    NASA Astrophysics Data System (ADS)

    Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

    1984-02-01

    This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape-recordings of the input speech and the corresponding output speech of a real-time speech coder.

  10. Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

    PubMed

    Shin, Yu-Jeong; Ko, Seung-O

    2015-12-01

    Velopharyngeal dysfunction in cleft palate patients following primary palate repair may result in nasal air emission, hypernasality, articulation disorder, and poor intelligibility of speech. Among conservative treatment methods, a speech aid prosthesis combined with speech therapy is widely used. However, because treatment typically takes more than a year and its predictability is low, some clinicians prefer surgical intervention. The purpose of this report was therefore to draw attention to the effectiveness of the speech aid prosthesis by presenting a case that was successfully treated. In this clinical report, a speech bulb reduction program with intensive speech therapy was applied to a patient with velopharyngeal dysfunction, and treatment was completed within 5 months, an unusually short period for speech aid therapy. Furthermore, the advantages of pre-operative speech aid therapy are discussed.

  11. Sources of Variability in Children’s Language Growth

    PubMed Central

    Huttenlocher, Janellen; Waterfall, Heidi; Vasilyeva, Marina; Vevea, Jack; Hedges, Larry V.

    2010-01-01

    The present longitudinal study examines the role of caregiver speech in language development, especially syntactic development, using 47 parent-child pairs of diverse SES background from 14 to 46 months. We assess the diversity (variety) of words and syntactic structures produced by caregivers and children. We use lagged correlations to examine language growth and its relation to caregiver speech. Results show substantial individual differences among children, and indicate that diversity of earlier caregiver speech significantly predicts corresponding diversity in later child speech. For vocabulary, earlier child speech also predicts later caregiver speech, suggesting mutual influence. However, for syntax, earlier child speech does not significantly predict later caregiver speech, suggesting a causal flow from caregiver to child. Finally, demographic factors, notably SES, are related to language growth, and are, at least partially, mediated by differences in caregiver speech, showing the pervasive influence of caregiver speech on language growth. PMID:20832781
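
The abstract above uses lagged correlations to relate earlier caregiver speech to later child speech and vice versa. The sketch below illustrates the basic computation across dyads for two observation waves. The function names, the two-wave layout, and all diversity counts are hypothetical placeholders and do not reproduce the study's data or its full modelling.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def lagged_correlation(predictor_waves, outcome_waves, wave, lag):
    """Correlate the predictor at `wave` with the outcome at `wave + lag` across dyads."""
    return pearson(predictor_waves[wave], outcome_waves[wave + lag])


# Hypothetical diversity scores: one list per wave, one value per parent-child pair.
caregiver = [[10, 20, 30, 40, 50], [14, 19, 33, 38, 51]]
child = [[3, 6, 5, 9, 8], [12, 22, 29, 41, 52]]

caregiver_to_child = lagged_correlation(caregiver, child, wave=0, lag=1)
child_to_caregiver = lagged_correlation(child, caregiver, wave=0, lag=1)
print(round(caregiver_to_child, 3), round(child_to_caregiver, 3))
```

Comparing the two directions is what allows a reading such as "mutual influence" versus "flow from caregiver to child", although a correlation of this kind cannot by itself establish causation.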

  12. Rate and rhythm control strategies for apraxia of speech in nonfluent primary progressive aphasia.

    PubMed

    Beber, Bárbara Costa; Berbert, Monalise Costa Batista; Grawer, Ruth Siqueira; Cardoso, Maria Cristina de Almeida Freitas

    2018-01-01

    The nonfluent/agrammatic variant of primary progressive aphasia is characterized by apraxia of speech and agrammatism. Apraxia of speech limits patients' communication due to slow speaking rate, sound substitutions, articulatory groping, false starts and restarts, segmentation of syllables, and increased difficulty with increasing utterance length. Speech and language therapy is known to benefit individuals with apraxia of speech due to stroke, but little is known about its effects in primary progressive aphasia. This is a case report of a 72-year-old, illiterate housewife, who was diagnosed with nonfluent primary progressive aphasia and received speech and language therapy for apraxia of speech. Rate and rhythm control strategies for apraxia of speech were trained to improve initiation of speech. We discuss the importance of these strategies to alleviate apraxia of speech in this condition and the future perspectives in the area.

  13. Multistage audiovisual integration of speech: dissociating identification and detection.

    PubMed

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias S

    2011-02-01

    Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech signal. Here, we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers, the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multistage account of audiovisual integration of speech in which the many attributes of the audiovisual speech signal are integrated by separate integration processes.

  14. Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds

    PubMed Central

    Agnew, Z.K.; McGettigan, C.; Scott, S.K.

    2012-01-01

    Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557

  15. How our own speech rate influences our perception of others.

    PubMed

    Bosker, Hans Rutger

    2017-08-01

    In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects induced by our own speech through 6 experiments, specifically targeting rate normalization (i.e., perceiving phonetic segments relative to surrounding speech rate). Experiment 1 revealed that hearing prerecorded fast or slow context sentences altered the perception of ambiguous vowels, replicating earlier work. Experiment 2 demonstrated that talking at a fast or slow rate prior to target presentation also altered target perception, though the effect of preceding speech rate was reduced. Experiment 3 showed that silent talking (i.e., inner speech) at fast or slow rates did not modulate the perception of others, suggesting that the effect of self-produced speech rate in Experiment 2 arose through monitoring of the external speech signal. Experiment 4 demonstrated that, when participants were played back their own (fast/slow) speech, no reduction of the effect of preceding speech rate was observed, suggesting that the additional task of speech production may be responsible for the reduced effect in Experiment 2. Finally, Experiments 5 and 6 replicate Experiments 2 and 3 with new participant samples. Taken together, these results suggest that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in dialogue settings. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  16. Systematic Studies of Modified Vocalization: The Effect of Speech Rate on Speech Production Measures during Metronome-Paced Speech in Persons Who Stutter

    ERIC Educational Resources Information Center

    Davidow, Jason H.

    2014-01-01

    Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control…

  17. Functional speech disorders: clinical manifestations, diagnosis, and management.

    PubMed

    Duffy, J R

    2016-01-01

    Acquired psychogenic or functional speech disorders are a subtype of functional neurologic disorders. They can mimic organic speech disorders and, although any aspect of speech production can be affected, they manifest most often as dysphonia, stuttering, or prosodic abnormalities. This chapter reviews the prevalence of functional speech disorders, the spectrum of their primary clinical characteristics, and the clues that help distinguish them from organic neurologic diseases affecting the sensorimotor networks involved in speech production. Diagnosis of a speech disorder as functional can be supported by sometimes rapidly achieved positive outcomes of symptomatic speech therapy. The general principles of such therapy are reviewed. © 2016 Elsevier B.V. All rights reserved.

  18. A model of serial order problems in fluent, stuttered and agrammatic speech.

    PubMed

    Howell, Peter

    2007-10-01

    Many models of speech production have attempted to explain dysfluent speech. Most models assume that the disruptions that occur when speech is dysfluent arise because the speakers make errors while planning an utterance. In this contribution, a model of the serial order of speech is described that does not make this assumption. It involves the coordination or 'interlocking' of linguistic planning and execution stages at the language-speech interface. The model is examined to determine whether it can distinguish two forms of dysfluent speech (stuttered and agrammatic speech) that are characterized by iteration and omission of whole words and parts of words.

  19. Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope.

    PubMed

    Vanthornhout, Jonas; Decruy, Lien; Wouters, Jan; Simon, Jonathan Z; Francart, Tom

    2018-04-01

    Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Very often, electrophysiological measures of hearing give insight into the neural processing of sound. However, in most methods, non-speech stimuli are used, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift that makes it possible to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram, and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.

  20. Population responses in primary auditory cortex simultaneously represent the temporal envelope and periodicity features in natural speech.

    PubMed

    Abrams, Daniel A; Nicol, Trent; White-Schwoch, Travis; Zecker, Steven; Kraus, Nina

    2017-05-01

    Speech perception relies on a listener's ability to simultaneously resolve multiple temporal features in the speech signal. Little is known regarding neural mechanisms that enable the simultaneous coding of concurrent temporal features in speech. Here we show that two categories of temporal features in speech, the low-frequency speech envelope and periodicity cues, are processed by distinct neural mechanisms within the same population of cortical neurons. We measured population activity in primary auditory cortex of anesthetized guinea pig in response to three variants of a naturally produced sentence. Results show that the envelope of population responses closely tracks the speech envelope, and this cortical activity more closely reflects wider bandwidths of the speech envelope compared to narrow bands. Additionally, neuronal populations represent the fundamental frequency of speech robustly with phase-locked responses. Importantly, these two temporal features of speech are simultaneously observed within neuronal ensembles in auditory cortex in response to clear, conversation, and compressed speech exemplars. Results show that auditory cortical neurons are adept at simultaneously resolving multiple temporal features in extended speech sentences using discrete coding mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Cognitive Load in Voice Therapy Carry-Over Exercises.

    PubMed

    Iwarsson, Jenny; Morris, David Jackson; Balling, Laura Winther

    2017-01-01

    The cognitive load generated by online speech production may vary with the nature of the speech task. This article examines 3 speech tasks used in voice therapy carry-over exercises, in which a patient is required to adopt and automatize new voice behaviors, ultimately in daily spontaneous communication. Twelve subjects produced speech in 3 conditions: rote speech (weekdays), sentences in a set form, and semispontaneous speech. Subjects simultaneously performed a secondary visual discrimination task for which response times were measured. On completion of each speech task, subjects rated their experience on a questionnaire. Response times from the secondary, visual task were found to be shortest for the rote speech, longer for the semispontaneous speech, and longest for the sentences within the set framework. Principal components derived from the subjective ratings were found to be linked to response times on the secondary visual task. Acoustic measures reflecting fundamental frequency distribution and vocal fold compression varied across the speech tasks. The results indicate that consideration should be given to the selection of speech tasks during the process leading to automation of revised speech behavior and that self-reports may be a reliable index of cognitive load.

  2. Speech identification in noise: Contribution of temporal, spectral, and visual speech cues.

    PubMed

    Kim, Jeesun; Davis, Chris; Groot, Christopher

    2009-12-01

    This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual speech cues. Due to a ceiling effect, the degree to which these effects combined could not be determined. In noise, these facilitation effects were more marked and were additive. Experiment 2 examined consonant and vowel identification in the context of CVC or VCV syllables presented in noise. For consonants, both spectral and visual speech cues facilitated identification and these effects were additive. For vowels, the effect of combined cues was underadditive, with the effect of spectral cues reduced when presented with visual speech cues. Analysis indicated that without visual speech, spectral cues facilitated the transmission of place information and vowel height, whereas with visual speech, they facilitated lip rounding, with little impact on the transmission of place information.

  3. Motor speech skills in children with Down syndrome: A descriptive study.

    PubMed

    Rupela, Vani; Velleman, Shelley L; Andrianopoulos, Mary V

    2016-10-01

    Motor speech characteristics of children with Down syndrome (DS) have historically been viewed as either Childhood Dysarthria (CD) or, more infrequently, as Childhood Apraxia of Speech (CAS). The objective of this study was to investigate motor speech deficits in a systematic manner, considering characteristics from both CAS and CD. Motor speech assessments were carried out on seven 3;4-8;11-year old children with DS in comparison with younger, typically-developing children using a Language-Neutral Assessment of Motor Speech for young children (LAMS). Additionally, the motor speech and non-speech oral motor skills of all participants were analysed qualitatively using an investigator checklist of characteristics of CAS, CD and Motor Speech Disorder-Not Otherwise Specified (MSD-NOS). Results indicated that the children with DS exhibited symptoms of CAS, CD and MSD-NOS, with variability within the group and overlapping symptoms of the disorder types. This finding is different from previous assumptions that children with DS have either CD or CAS. The motor speech disorder accompanying DS is complex. The data provide some preliminary descriptions of motor speech disorders in this population and some tools that clinicians would find useful when assessing motor speech skills of young children with DS.

  4. The effect of speech rate on stuttering frequency, phonated intervals, speech effort, and speech naturalness during chorus reading.

    PubMed

    Davidow, Jason H; Ingham, Roger J

    2013-01-01

    This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also examined. An added purpose was to determine if chorus reading could be further refined so as to provide a perceptual guide for gauging the level of physical effort exerted during speech production. A repeated-measures design was used to compare data obtained during control reading conditions and during several chorus reading conditions produced at different speech rates. Participants included 8 persons who stutter (PWS) between the ages of 16 and 32 years. There were significant reductions in the frequency of short PIs from the habitual reading condition during slower chorus conditions, no change when speech rates were matched between habitual reading and chorus conditions, and an increase in the frequency of short PIs during chorus reading produced at a faster rate than the habitual condition. Speech rate did not have an effect on stuttering frequency during chorus reading. In general, speech effort ratings improved and naturalness ratings worsened as speech rate decreased. These results provide evidence that (a) a reduction in the frequency of short PIs is not necessary for fluency improvement during chorus reading, and (b) speech rate may be altered to provide PWS with a more appropriate reference for how physically effortful normally fluent speech production should be. Future investigations should examine the necessity of changes in the activation of neural regions during chorus reading, the possibility of defining individualized units on a 9-point effort scale, and if there are upper and lower speech rate boundaries for receiving ratings of "highly natural sounding" speech during chorus reading. 
    The reader will be able to: (1) describe the effect of changes in speech rate on the frequency of short phonated intervals during chorus reading, (2) describe changes to speaker-judged speech effort as speech rate changes during chorus reading, and (3) describe the effect of changes in speech rate on listener-judged naturalness ratings during chorus reading. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. Discrimination of speech and non-speech sounds following theta-burst stimulation of the motor cortex

    PubMed Central

    Rogers, Jack C.; Möttönen, Riikka; Boyles, Rowan; Watkins, Kate E.

    2014-01-01

    Perceiving speech engages parts of the motor system involved in speech production. The role of the motor cortex in speech perception has been demonstrated using low-frequency repetitive transcranial magnetic stimulation (rTMS) to suppress motor excitability in the lip representation and disrupt discrimination of lip-articulated speech sounds (Möttönen and Watkins, 2009). Another form of rTMS, continuous theta-burst stimulation (cTBS), can produce longer-lasting disruptive effects following a brief train of stimulation. We investigated the effects of cTBS on motor excitability and discrimination of speech and non-speech sounds. cTBS was applied for 40 s over either the hand or the lip representation of motor cortex. Motor-evoked potentials recorded from the lip and hand muscles in response to single pulses of TMS revealed no measurable change in motor excitability due to cTBS. This failure to replicate previous findings may reflect the unreliability of measurements of motor excitability related to inter-individual variability. We also measured the effects of cTBS on a listener’s ability to discriminate: (1) lip-articulated speech sounds from sounds not articulated by the lips (“ba” vs. “da”); (2) two speech sounds not articulated by the lips (“ga” vs. “da”); and (3) non-speech sounds produced by the hands (“claps” vs. “clicks”). Discrimination of lip-articulated speech sounds was impaired between 20 and 35 min after cTBS over the lip motor representation. Specifically, discrimination of across-category ba–da sounds presented with an 800-ms inter-stimulus interval was reduced to chance level performance. This effect was absent for speech sounds that do not require the lips for articulation and non-speech sounds. Stimulation over the hand motor representation did not affect discrimination of speech or non-speech sounds. 
    These findings show that stimulation of the lip motor representation disrupts discrimination of speech sounds in an articulatory feature-specific way. PMID:25076928

  6. Discrimination of speech and non-speech sounds following theta-burst stimulation of the motor cortex.

    PubMed

    Rogers, Jack C; Möttönen, Riikka; Boyles, Rowan; Watkins, Kate E

    2014-01-01

    Perceiving speech engages parts of the motor system involved in speech production. The role of the motor cortex in speech perception has been demonstrated using low-frequency repetitive transcranial magnetic stimulation (rTMS) to suppress motor excitability in the lip representation and disrupt discrimination of lip-articulated speech sounds (Möttönen and Watkins, 2009). Another form of rTMS, continuous theta-burst stimulation (cTBS), can produce longer-lasting disruptive effects following a brief train of stimulation. We investigated the effects of cTBS on motor excitability and discrimination of speech and non-speech sounds. cTBS was applied for 40 s over either the hand or the lip representation of motor cortex. Motor-evoked potentials recorded from the lip and hand muscles in response to single pulses of TMS revealed no measurable change in motor excitability due to cTBS. This failure to replicate previous findings may reflect the unreliability of measurements of motor excitability related to inter-individual variability. We also measured the effects of cTBS on a listener's ability to discriminate: (1) lip-articulated speech sounds from sounds not articulated by the lips ("ba" vs. "da"); (2) two speech sounds not articulated by the lips ("ga" vs. "da"); and (3) non-speech sounds produced by the hands ("claps" vs. "clicks"). Discrimination of lip-articulated speech sounds was impaired between 20 and 35 min after cTBS over the lip motor representation. Specifically, discrimination of across-category ba-da sounds presented with an 800-ms inter-stimulus interval was reduced to chance level performance. This effect was absent for speech sounds that do not require the lips for articulation and non-speech sounds. Stimulation over the hand motor representation did not affect discrimination of speech or non-speech sounds. 
    These findings show that stimulation of the lip motor representation disrupts discrimination of speech sounds in an articulatory feature-specific way.

  7. A Generative Model of Speech Production in Broca’s and Wernicke’s Areas

    PubMed Central

    Price, Cathy J.; Crinion, Jenny T.; MacSweeney, Mairéad

    2011-01-01

    Speech production involves the generation of an auditory signal from the articulators and vocal tract. When the intended auditory signal does not match the produced sounds, subsequent articulatory commands can be adjusted to reduce the difference between the intended and produced sounds. This requires an internal model of the intended speech output that can be compared to the produced speech. The aim of this functional imaging study was to identify brain activation related to the internal model of speech production after activation related to vocalization, auditory feedback, and movement in the articulators had been controlled. There were four conditions: silent articulation of speech, non-speech mouth movements, finger tapping, and visual fixation. In the speech conditions, participants produced the mouth movements associated with the words “one” and “three.” We eliminated auditory feedback from the spoken output by instructing participants to articulate these words without producing any sound. The non-speech mouth movement conditions involved lip pursing and tongue protrusions to control for movement in the articulators. The main difference between our speech and non-speech mouth movement conditions is that prior experience producing speech sounds leads to the automatic and covert generation of auditory and phonological associations that may play a role in predicting auditory feedback. We found that, relative to non-speech mouth movements, silent speech activated Broca’s area in the left dorsal pars opercularis and Wernicke’s area in the left posterior superior temporal sulcus. We discuss these results in the context of a generative model of speech production and propose that Broca’s and Wernicke’s areas may be involved in predicting the speech output that follows articulation. These predictions could provide a mechanism by which rapid movement of the articulators is precisely matched to the intended speech outputs during future articulations. PMID:21954392

  8. Method and apparatus for obtaining complete speech signals for speech recognition applications

    NASA Technical Reports Server (NTRS)

    Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)

    2009-01-01

    The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
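
    The circular-buffer idea in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the class name, the buffer capacity, and the use of integers in place of audio frames are all hypothetical, and the endpoint detection step (including the Hidden Markov Model) is not shown.

```python
from collections import deque


class FrameRecorder:
    """Keeps the most recent audio frames in a fixed-size circular buffer."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest frame when full.
        self.buffer = deque(maxlen=capacity)

    def record(self, frame):
        """Append one frame of the continuously recorded audio stream."""
        self.buffer.append(frame)

    def frames_before(self, count):
        """Return up to `count` of the most recent frames, oldest first."""
        if count <= 0:
            return []
        return list(self.buffer)[-count:]


# Hypothetical usage: integers stand in for audio frames.
recorder = FrameRecorder(capacity=5)
for frame in range(8):            # frames 0..7 arrive before the user command
    recorder.record(frame)

pre_command = recorder.frames_before(3)   # frames recorded just before the command
post_command = [8, 9]                     # frames that arrive after the command
augmented_signal = pre_command + post_command
```

    Prepending frames from before the command is what lets a recognizer recover speech that began slightly before the user signalled the start.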

  9. Automatic Speech Recognition from Neural Signals: A Focused Review.

    PubMed

    Herff, Christian; Schultz, Tanja

    2016-01-01

    Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the risk of bothering bystanders, or an inability to produce speech (e.g., patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition (ASR) technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for ASR from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography). As a first example of ASR techniques applied to neural signals, we discuss the Brain-to-text system.

  10. Multisensory integration of speech sounds with letters vs. visual speech: only visual speech induces the mismatch negativity.

    PubMed

    Stekelenburg, Jeroen J; Keetels, Mirjam; Vroomen, Jean

    2018-05-01

    Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect. Here, we examined whether written text, like visual speech, can induce an illusory change in the perception of speech sounds on both the behavioural and neural levels. In a sound categorization task, we found that both text and visual speech changed the identity of speech sounds from an /aba/-/ada/ continuum, but the size of this audiovisual effect was considerably smaller for text than visual speech. To examine at which level in the information processing hierarchy these multisensory interactions occur, we recorded electroencephalography in an audiovisual mismatch negativity (MMN, a component of the event-related potential reflecting preattentive auditory change detection) paradigm in which deviant text or visual speech was used to induce an illusory change in a sequence of ambiguous sounds halfway between /aba/ and /ada/. We found that only deviant visual speech induced an MMN, but not deviant text, which induced a late P3-like positive potential. These results demonstrate that text has much weaker effects on sound processing than visual speech does, possibly because text has different biological roots than visual speech. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  11. Longitudinal follow-up to evaluate speech disorders in early-treated patients with infantile-onset Pompe disease.

    PubMed

    Zeng, Yin-Ting; Hwu, Wuh-Liang; Torng, Pao-Chuan; Lee, Ni-Chung; Shieh, Jeng-Yi; Lu, Lu; Chien, Yin-Hsiu

    2017-05-01

    Patients with infantile-onset Pompe disease (IOPD) can be treated by recombinant human acid alpha glucosidase (rhGAA) replacement beginning at birth with excellent survival rates, but they still commonly present with speech disorders. This study investigated the progress of speech disorders in these early-treated patients and ascertained the relationship with treatments. Speech disorders, including hypernasal resonance, articulation disorders, and speech intelligibility, were scored by speech-language pathologists using auditory perception in seven early-treated patients over a period of 6 years. Statistical analysis of the first and last evaluations of the patients was performed with the Wilcoxon signed-rank test. A total of 29 speech samples were analyzed. All the patients suffered from hypernasality, articulation disorder, and impairment in speech intelligibility at the age of 3 years. The conditions were stable, and 2 patients developed normal or near normal speech during follow-up. Speech therapy and a high dose of rhGAA appeared to improve articulation in 6 of the 7 patients (86%, p = 0.028) by decreasing the omission of consonants, which consequently increased speech intelligibility (p = 0.041). Severity of hypernasality greatly reduced only in 2 patients (29%, p = 0.131). Speech disorders were common even in early and successfully treated patients with IOPD; however, aggressive speech therapy and high-dose rhGAA could improve their speech disorders. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  12. Careers in Speech Communication.

    ERIC Educational Resources Information Center

    Speech Communication Association, New York, NY.

    Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…

  13. The development of co-speech gesture in the communication of children with autism spectrum disorders.

    PubMed

    Sowden, Hannah; Clegg, Judy; Perkins, Michael

    2013-12-01

    Co-speech gestures have a close semantic relationship to speech in adult conversation. In typically developing children co-speech gestures which give additional information to speech facilitate the emergence of multi-word speech. A difficulty with integrating audio-visual information is known to exist for individuals with Autism Spectrum Disorder (ASD), which may affect development of the speech-gesture system. A longitudinal observational study was conducted with four children with ASD, aged 2;4 to 3;5 years. Participants were video-recorded for 20 min every 2 weeks during their attendance on an intervention programme. Recording continued for up to 8 months, thus affording a rich analysis of gestural practices from pre-verbal to multi-word speech across the group. All participants combined gesture with either speech or vocalisations. Co-speech gestures providing additional information to speech were observed to be either absent or rare. Findings suggest that children with ASD do not make use of the facilitating communicative effects of gesture in the same way as typically developing children.

  14. SAM: speech-aware applications in medicine to support structured data entry.

    PubMed Central

    Wormek, A. K.; Ingenerf, J.; Orthner, H. F.

    1997-01-01

    In the last two years, improvement in speech recognition technology has directed the medical community's interest to porting and using such innovations in clinical systems. The acceptance of speech recognition systems in clinical domains increases with recognition speed, large medical vocabulary, high accuracy, continuous speech recognition, and speaker independence. Although some commercial speech engines approach these requirements, the greatest benefit can be achieved in adapting a speech recognizer to a specific medical application. The goals of our work are first, to develop a speech-aware core component which is able to establish connections to speech recognition engines of different vendors. This is realized in SAM. Second, with applications based on SAM we want to support the physician in his/her routine clinical care activities. Within the STAMP project (STAndardized Multimedia report generator in Pathology), we extend SAM by combining a structured data entry approach with speech recognition technology. Another speech-aware application in the field of Diabetes care is connected to a terminology server. The server delivers a controlled vocabulary which can be used for speech recognition. PMID:9357730

  15. Increase in Speech Recognition due to Linguistic Mismatch Between Target and Masker Speech: Monolingual and Simultaneous Bilingual Performance

    PubMed Central

    Calandruccio, Lauren; Zhou, Haibo

    2014-01-01

    Purpose: To examine whether improved speech recognition during linguistically mismatched target–masker experiments is due to linguistic unfamiliarity of the masker speech or linguistic dissimilarity between the target and masker speech. Method: Monolingual English speakers (n = 20) and English–Greek simultaneous bilinguals (n = 20) listened to English sentences in the presence of competing English and Greek speech. Data were analyzed using mixed-effects regression models to determine differences in English recognition performance between the 2 groups and 2 masker conditions. Results: English sentence recognition for monolinguals and simultaneous English–Greek bilinguals improved when the masker speech changed from competing English to competing Greek speech. Conclusion: The improvement in speech recognition that has been observed for linguistically mismatched target–masker experiments cannot be simply explained by the masker language being linguistically unknown or unfamiliar to the listeners. Listeners can improve their speech recognition in linguistically mismatched target–masker experiments even when the listener is able to obtain meaningful linguistic information from the masker speech. PMID:24167230

  16. Audibility-based predictions of speech recognition for children and adults with normal hearing.

    PubMed

    McCreery, Ryan W; Stelmachowicz, Patricia G

    2011-12-01

    This study investigated the relationship between audibility and predictions of speech recognition for children and adults with normal hearing. The Speech Intelligibility Index (SII) is used to quantify the audibility of speech signals and can be applied to transfer functions to predict speech recognition scores. Although the SII is used clinically with children, relatively few studies have evaluated SII predictions of children's speech recognition directly. Children have required more audibility than adults to reach maximum levels of speech understanding in previous studies. Furthermore, children may require greater bandwidth than adults for optimal speech understanding, which could influence frequency-importance functions used to calculate the SII. Speech recognition was measured for 116 children and 19 adults with normal hearing. Stimulus bandwidth and background noise level were varied systematically in order to evaluate speech recognition as predicted by the SII and derive frequency-importance functions for children and adults. Results suggested that children required greater audibility to reach the same level of speech understanding as adults. However, differences in performance between adults and children did not vary across frequency bands. © 2011 Acoustical Society of America
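
    The role of frequency-importance functions in the SII can be shown with a short sketch. It uses the general form of the index, a sum over frequency bands of band importance multiplied by band audibility. The four bands and their weights below are hypothetical values chosen for illustration; they are not the functions derived in the study or the values of any standard.

```python
import math


def speech_intelligibility_index(band_importance, band_audibility):
    """Weighted sum of per-band audibility.

    band_importance: weights for each frequency band, summing to 1.
    band_audibility: proportion of speech audible in each band, from 0 to 1.
    """
    if len(band_importance) != len(band_audibility):
        raise ValueError("one importance weight is needed per audibility value")
    if not math.isclose(sum(band_importance), 1.0, abs_tol=1e-9):
        raise ValueError("band importance weights must sum to 1")
    if any(not 0.0 <= a <= 1.0 for a in band_audibility):
        raise ValueError("audibility values must lie between 0 and 1")
    return sum(i * a for i, a in zip(band_importance, band_audibility))


# Hypothetical four-band example (low to high frequency).
importance = [0.125, 0.25, 0.375, 0.25]
audibility = [1.0, 0.5, 0.5, 0.0]   # highest band inaudible, e.g. limited bandwidth
sii = speech_intelligibility_index(importance, audibility)
print(sii)  # 0.4375
```

    Changing the importance weights while holding audibility fixed changes the index, which is why age-specific frequency-importance functions would matter for predicting children's scores.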

  17. Speech fluency profile on different tasks for individuals with Parkinson's disease.

    PubMed

    Juste, Fabiola Staróbole; Andrade, Claudia Regina Furquim de

    2017-07-20

    To characterize the speech fluency profile of patients with Parkinson's disease. Study participants were 40 individuals of both genders aged 40 to 80 years divided into 2 groups: Research Group - RG (20 individuals with a diagnosis of Parkinson's disease) and Control Group - CG (20 individuals with no communication or neurological disorders). For all of the participants, three speech samples involving different tasks were collected: monologue, individual reading, and automatic speech. The RG presented a significantly larger number of speech disruptions, both stuttering-like and typical dysfluencies, and a higher percentage of speech discontinuity in the monologue and individual reading tasks compared with the CG. Both groups presented a reduced number of speech disruptions (stuttering-like and typical dysfluencies) in the automatic speech task; the groups presented similar performance in this task. Regarding speech rate, individuals in the RG produced fewer words and syllables per minute than those in the CG in all speech tasks. Participants of the RG presented altered parameters of speech fluency compared with those of the CG; however, this change in fluency cannot be considered a stuttering disorder.

  18. Is Presurgery and Early Postsurgery Performance Related to Speech and Language Outcomes at 3 Years of Age for Children with Cleft Palate?

    ERIC Educational Resources Information Center

    Chapman, Kathy L.

    2004-01-01

    This study examined the relationship between presurgery speech measures and speech and language performance at 39 months as well as the relationship between early postsurgery speech measures and speech and language performance at 39 months of age. Fifteen children with cleft lip and palate participated in the study. Spontaneous speech samples were…

  19. The Effect of Speech Rate on Stuttering Frequency, Phonated Intervals, Speech Effort, and Speech Naturalness during Chorus Reading

    ERIC Educational Resources Information Center

    Davidow, Jason H.; Ingham, Roger J.

    2013-01-01

    Purpose: This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also…

  20. Speech Inconsistency in Children with Childhood Apraxia of Speech, Language Impairment, and Speech Delay: Depends on the Stimuli

    ERIC Educational Resources Information Center

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P.; Green, Jordan R.

    2017-01-01

    Purpose: The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and…

  1. The sensorimotor and social sides of the architecture of speech.

    PubMed

    Pezzulo, Giovanni; Barca, Laura; D'Ausilio, Alessandro

    2014-12-01

    Speech is a complex skill to master. In addition to sophisticated phono-articulatory abilities, speech acquisition requires neuronal systems configured for vocal learning, with adaptable sensorimotor maps that couple heard speech sounds with motor programs for speech production; imitation and self-imitation mechanisms that can train the sensorimotor maps to reproduce heard speech sounds; and a "pedagogical" learning environment that supports tutor learning.

  2. Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

    ERIC Educational Resources Information Center

    Cowan, Gloria; Khatchadourian, Desiree

    2003-01-01

    Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…

  3. The speech naturalness of people who stutter speaking under delayed auditory feedback as perceived by different groups of listeners.

    PubMed

    Van Borsel, John; Eeckhout, Hannelore

    2008-09-01

    This study investigated listeners' perception of the speech naturalness of people who stutter (PWS) speaking under delayed auditory feedback (DAF), with particular attention to possible listener differences. Three panels of judges consisting of 14 stuttering individuals, 14 speech language pathologists, and 14 naive listeners rated the naturalness of speech samples of stuttering and non-stuttering individuals using a 9-point interval scale. Results clearly indicate that these three groups evaluate naturalness differently. Naive listeners appear to be more severe in their judgements than speech language pathologists and stuttering listeners, and speech language pathologists are apparently more severe than PWS. The three listener groups showed similar trends with respect to the relationship between speech naturalness and speech rate. Results from all three groups indicated that for PWS, the slower a speaker's rate was, the less natural the speech was judged to sound. The three listener groups also showed similar trends with regard to naturalness of the stuttering versus the non-stuttering individuals. All three panels considered the speech of the non-stuttering participants more natural. The reader will be able to: (1) discuss the speech naturalness of people who stutter speaking under delayed auditory feedback, (2) discuss listener differences about the naturalness of people who stutter speaking under delayed auditory feedback, and (3) discuss the importance of speech rate for the naturalness of speech.

  4. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech From Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

    PubMed Central

    Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other scores were obtained for 264 participants in 6 groups: CAS in idiopathic, neurogenetic, and complex neurodevelopmental disorders; adult-onset apraxia of speech (AAS) consequent to stroke and primary progressive apraxia of speech; and idiopathic speech delay. Results: Participants with CAS and AAS had significantly lower scores than typically speaking reference participants and speech delay controls on measures posited to assess representational and transcoding processes. Representational deficits differed between CAS and AAS groups, with support for both underspecified linguistic representations and memory/access deficits in CAS, but for only the latter in AAS. CAS–AAS similarities in the age–sex standardized percentages of occurrence of the most frequent type of inappropriate pauses (abrupt) and significant differences in the standardized occurrence of appropriate pauses were consistent with speech processing findings. Conclusions: Results support the hypotheses of core representational and transcoding speech processing deficits in CAS and theoretical coherence of the PM's pause-speech elements with these deficits. PMID:28384751

  5. Scaling and universality in the human voice.

    PubMed

    Luque, Jordi; Luque, Bartolo; Lacasa, Lucas

    2015-04-06

    Speech is a distinctive complex feature of human capabilities. In order to understand the physics underlying speech production, in this work, we empirically analyse the statistics of large human speech datasets spanning several languages. We first show that during speech, the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg-Richter-like law in speech. We further show that such 'earthquakes in speech' show temporal correlations, as the interevent statistics are again power-law distributed. As this feature takes place in the intraphoneme range, we conjecture that the process responsible for this complex phenomenon is not cognitive, but it resides in the physiological (mechanical) mechanisms of speech production. Moreover, we show that these waiting time distributions are scale invariant under a renormalization group transformation, suggesting that the process of speech generation is indeed operating close to a critical point. These results are put in contrast with current paradigms in speech processing, which point towards low dimensional deterministic chaos as the origin of nonlinear traits in speech fluctuations. As these latter fluctuations are indeed the aspects that humanize synthetic speech, these findings may have an impact in future speech synthesis technologies. Results are robust and independent of the communication language or the number of speakers, pointing towards a universal pattern and yet another hint of complexity in human speech. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  6. Speech training alters consonant and vowel responses in multiple auditory cortex fields

    PubMed Central

    Engineer, Crystal T.; Rahebi, Kimiya C.; Buell, Elizabeth P.; Fink, Melyssa K.; Kilgard, Michael P.

    2015-01-01

    Speech sounds evoke unique neural activity patterns in primary auditory cortex (A1). Extensive speech sound discrimination training alters A1 responses. While the neighboring auditory cortical fields each contain information about speech sound identity, each field processes speech sounds differently. We hypothesized that while all fields would exhibit training-induced plasticity following speech training, there would be unique differences in how each field changes. In this study, rats were trained to discriminate speech sounds by consonant or vowel in quiet and in varying levels of background speech-shaped noise. Local field potential and multiunit responses were recorded from four auditory cortex fields in rats that had received 10 weeks of speech discrimination training. Our results reveal that training alters speech evoked responses in each of the auditory fields tested. The neural response to consonants was significantly stronger in anterior auditory field (AAF) and A1 following speech training. The neural response to vowels following speech training was significantly weaker in ventral auditory field (VAF) and posterior auditory field (PAF). This differential plasticity of consonant and vowel sound responses may result from the greater paired pulse depression, expanded low frequency tuning, reduced frequency selectivity, and lower tone thresholds, which occurred across the four auditory fields. These findings suggest that alterations in the distributed processing of behaviorally relevant sounds may contribute to robust speech discrimination. PMID:25827927

  7. Infants’ brain responses to speech suggest Analysis by Synthesis

    PubMed Central

    Kuhl, Patricia K.; Ramírez, Rey R.; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-01-01

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners’ knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca’s area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of “motherese” on early language learning, and (iii) the “social-gating” hypothesis and humans’ development of social understanding. PMID:25024207

  8. Infants' brain responses to speech suggest analysis by synthesis.

    PubMed

    Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-08-05

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding.

  9. Systematic Studies of Modified Vocalization: The Effect of Speech Rate on Speech Production Measures During Metronome-Paced Speech in Persons who Stutter

    PubMed Central

    Davidow, Jason H.

    2013-01-01

    Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control speech rate between conditions limits our ability to determine if the changes were necessary for fluency. Aims: This study examined the effect of speech rate on several speech production variables during one-syllable-per-beat metronomic speech, in order to determine changes that may be important for fluency during this fluency-inducing condition. Methods and Procedures: Thirteen persons who stutter (PWS), aged 18–62 years, completed a series of speaking tasks. Several speech production variables were compared between conditions produced at different metronome beat rates, and between a control condition and a metronome-paced speech condition produced at a rate equal to the control condition. Outcomes & Results: Vowel duration, voice onset time, pressure rise time, and phonated intervals were significantly impacted by metronome beat rate. Voice onset time and the percentage of short (30–100 ms) phonated intervals significantly decreased from the control condition to the equivalent rate metronome-paced speech condition. Conclusions & Implications: A reduction in the percentage of short phonated intervals may be important for fluency during syllable-based metronome-paced speech for PWS. Future studies should continue examining the necessity of this reduction. In addition, speech rate must be controlled in future fluency-inducing condition studies, including neuroimaging investigations, in order for this research to make a substantial contribution to finding the fluency-inducing mechanism of fluency-inducing conditions. PMID:24372888

  10. Reference-free automatic quality assessment of tracheoesophageal speech.

    PubMed

    Huang, Andy; Falk, Tiago H; Chan, Wai-Yip; Parsa, Vijay; Doyle, Philip

    2009-01-01

    Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.

  11. An analysis of the masking of speech by competing speech using self-report data.

    PubMed

    Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot

    2009-01-01

    Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.

  12. Multichannel Speech Enhancement Based on Generalized Gamma Prior Distribution with Its Online Adaptive Estimation

    NASA Astrophysics Data System (ADS)

    Dat, Tran Huy; Takeda, Kazuya; Itakura, Fumitada

    We present a multichannel speech enhancement method based on MAP speech spectral magnitude estimation using a generalized gamma model of the speech prior distribution, where the model parameters are adapted from actual noisy speech in a frame-by-frame manner. The use of a more general prior distribution with online adaptive estimation is shown to be effective for speech spectral estimation in noisy environments. Furthermore, the multichannel information, in the form of cross-channel statistics, is shown to be useful for better adapting the prior distribution parameters to the actual observation, resulting in better performance of the speech enhancement algorithm. We tested the proposed algorithm on an in-car speech database and obtained significant improvements in speech recognition performance, particularly under non-stationary noise conditions such as music, air-conditioner and open window.

  13. Some Effects of Training on the Perception of Synthetic Speech

    PubMed Central

    Schwab, Eileen C.; Nusbaum, Howard C.; Pisoni, David B.

    2012-01-01

    The present study was conducted to determine the effects of training on the perception of synthetic speech. Three groups of subjects were tested with synthetic speech using the same tasks before and after training. One group was trained with synthetic speech. A second group went through the identical training procedures using natural speech. The third group received no training. Although performance of the three groups was the same prior to training, significant differences on the post-test measures of word recognition were observed: the group trained with synthetic speech performed much better than the other two groups. A six-month follow-up indicated that the group trained with synthetic speech displayed long-term retention of the knowledge and experience gained with prior exposure to synthetic speech generated by a text-to-speech system. PMID:2936671

  14. Lexical Effects on Second Language Acquisition

    ERIC Educational Resources Information Center

    Kemp, Renee Lorraine

    2017-01-01

    Speech production and perception are inextricably linked systems. Speakers modify their speech in response to listener characteristics, such as age, hearing ability, and language background. Listener-oriented modifications in speech production, commonly referred to as clear speech, have also been found to affect speech perception by enhancing…

  15. Sperry Univac speech communications technology

    NASA Technical Reports Server (NTRS)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  16. Auditory-Motor Processing of Speech Sounds

    PubMed Central

    Möttönen, Riikka; Dutton, Rebekah; Watkins, Kate E.

    2013-01-01

    The motor regions that control movements of the articulators activate during listening to speech and contribute to performance in demanding speech recognition and discrimination tasks. Whether the articulatory motor cortex modulates auditory processing of speech sounds is unknown. Here, we aimed to determine whether the articulatory motor cortex affects the auditory mechanisms underlying discrimination of speech sounds in the absence of demanding speech tasks. Using electroencephalography, we recorded responses to changes in sound sequences, while participants watched a silent video. We also disrupted the lip or the hand representation in left motor cortex using transcranial magnetic stimulation. Disruption of the lip representation suppressed responses to changes in speech sounds, but not piano tones. In contrast, disruption of the hand representation had no effect on responses to changes in speech sounds. These findings show that disruptions within, but not outside, the articulatory motor cortex impair automatic auditory discrimination of speech sounds. The findings provide evidence for the importance of auditory-motor processes in efficient neural analysis of speech sounds. PMID:22581846

  17. System And Method For Characterizing Voiced Excitations Of Speech And Acoustic Signals, Removing Acoustic Noise From Speech, And Synthesizi

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-04-25

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  18. Contextual modulation of reading rate for direct versus indirect speech quotations.

    PubMed

    Yao, Bo; Scheepers, Christoph

    2011-12-01

    In human communication, direct speech (e.g., Mary said: "I'm hungry") is perceived to be more vivid than indirect speech (e.g., Mary said [that] she was hungry). However, the processing consequences of this distinction are largely unclear. In two experiments, participants were asked to either orally (Experiment 1) or silently (Experiment 2, eye-tracking) read written stories that contained either a direct speech or an indirect speech quotation. The context preceding those quotations described a situation that implied either a fast-speaking or a slow-speaking quoted protagonist. It was found that this context manipulation affected reading rates (in both oral and silent reading) for direct speech quotations, but not for indirect speech quotations. This suggests that readers are more likely to engage in perceptual simulations of the reported speech act when reading direct speech as opposed to meaning-equivalent indirect speech quotations, as part of a more vivid representation of the former. Copyright © 2011 Elsevier B.V. All rights reserved.

  19. Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders.

    PubMed

    Liss, Julie M; Utianski, Rene; Lansford, Kaitlin

    2013-01-01

    Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. Copyright © 2013 S. Karger AG, Basel.

  20. Crosslinguistic Application of English-Centric Rhythm Descriptors in Motor Speech Disorders

    PubMed Central

    Liss, Julie M.; Utianski, Rene; Lansford, Kaitlin

    2014-01-01

    Background: Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. Objective: The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. Summary: This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. PMID:24157596

  1. An investigation of articulatory setting using real-time magnetic resonance imaging

    PubMed Central

    Ramanarayanan, Vikram; Goldstein, Louis; Byrd, Dani; Narayanan, Shrikanth S.

    2013-01-01

    This paper presents an automatic procedure to analyze articulatory setting in speech production using real-time magnetic resonance imaging of the moving human vocal tract. The procedure extracts frames corresponding to inter-speech pauses, speech-ready intervals and absolute rest intervals from magnetic resonance imaging sequences of read and spontaneous speech elicited from five healthy speakers of American English and uses automatically extracted image features to quantify vocal tract posture during these intervals. Statistical analyses show significant differences between vocal tract postures adopted during inter-speech pauses and those at absolute rest before speech; the latter also exhibits a greater variability in the adopted postures. In addition, the articulatory settings adopted during inter-speech pauses in read and spontaneous speech are distinct. The results suggest that adopted vocal tract postures differ on average during rest positions, ready positions and inter-speech pauses, and might, in that order, involve an increasing degree of active control by the cognitive speech planning mechanism. PMID:23862826

  2. Alternative Speech Communication System for Persons with Severe Speech Disorders

    NASA Astrophysics Data System (ADS)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  3. Neurophysiological Influence of Musical Training on Speech Perception

    PubMed Central

    Shahin, Antoine J.

    2011-01-01

    Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remain a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL. PMID:21716639

  4. Neurophysiological influence of musical training on speech perception.

    PubMed

    Shahin, Antoine J

    2011-01-01

    Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remain a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL.

  5. Speech production gains following constraint-induced movement therapy in children with hemiparesis.

    PubMed

    Allison, Kristen M; Reidy, Teressa Garcia; Boyle, Mary; Naber, Erin; Carney, Joan; Pidcock, Frank S

    2017-01-01

    The purpose of this study was to investigate changes in speech skills of children who have hemiparesis and speech impairment after participation in a constraint-induced movement therapy (CIMT) program. While case studies have reported collateral speech gains following CIMT, the effect of CIMT on speech production has not previously been directly investigated to the knowledge of these investigators. Eighteen children with hemiparesis and co-occurring speech impairment participated in a 21-day clinical CIMT program. The Goldman-Fristoe Test of Articulation-2 (GFTA-2) was used to assess children's articulation of speech sounds before and after the intervention. Changes in percent of consonants correct (PCC) on the GFTA-2 were used as a measure of change in speech production. Children made significant gains in PCC following CIMT. Gains were similar in children with left and right-sided hemiparesis, and across age groups. This study reports significant collateral gains in speech production following CIMT and suggests benefits of CIMT may also spread to speech motor domains.

  6. Long short-term memory for speaker generalization in supervised speech separation

    PubMed Central

    Chen, Jitong; Wang, DeLiang

    2017-01-01

    Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. To improve speaker generalization, a separation model based on long short-term memory (LSTM) is proposed, which naturally accounts for temporal dynamics of speech. Systematic evaluation shows that the proposed model substantially outperforms a DNN-based model on unseen speakers and unseen noises in terms of objective speech intelligibility. Analyzing LSTM internal representations reveals that LSTM captures long-term speech contexts. It is also found that the LSTM model is more advantageous for low-latency speech separation and it, without future frames, performs better than the DNN model with future frames. The proposed model represents an effective approach for speaker- and noise-independent speech separation. PMID:28679261

  7. The right hemisphere is highlighted in connected natural speech production and perception.

    PubMed

    Alexandrou, Anna Maria; Saarinen, Timo; Mäkelä, Sasu; Kujala, Jan; Salmelin, Riitta

    2017-05-15

    Current understanding of the cortical mechanisms of speech perception and production stems mostly from studies that focus on single words or sentences. However, it has been suggested that processing of real-life connected speech may rely on additional cortical mechanisms. In the present study, we examined the neural substrates of natural speech production and perception with magnetoencephalography by modulating three central features related to speech: amount of linguistic content, speaking rate and social relevance. The amount of linguistic content was modulated by contrasting natural speech production and perception to speech-like non-linguistic tasks. Meaningful speech was produced and perceived at three speaking rates: normal, slow and fast. Social relevance was probed by having participants attend to speech produced by themselves and an unknown person. These speech-related features were each associated with distinct spatiospectral modulation patterns that involved cortical regions in both hemispheres. Natural speech processing markedly engaged the right hemisphere in addition to the left. In particular, the right temporo-parietal junction, previously linked to attentional processes and social cognition, was highlighted in the task modulations. The present findings suggest that its functional role extends to active generation and perception of meaningful, socially relevant speech. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Predicting speech intelligibility with a multiple speech subsystems approach in children with cerebral palsy.

    PubMed

    Lee, Jimin; Hustad, Katherine C; Weismer, Gary

    2014-10-01

    Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (speech motor impairment [SMI] group) and 9 judged to be free of dysarthria (no SMI [NSMI] group). Data from children with CP were compared to data from age-matched typically developing children. Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and typically developing groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (adjusted R² = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R² analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Children in the SMI group had articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
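
    For readers unfamiliar with the two statistics the abstract reports, the sketch below shows the standard textbook definitions of adjusted R² and incremental R². The sample and predictor counts mirror the abstract (22 children, nine variables), but the raw R² values are hypothetical inputs, not figures from the study, and the function names are illustrative.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

def incremental_r2(r2_full, r2_without_variable):
    """Variance uniquely added by one variable: full model minus reduced model."""
    return r2_full - r2_without_variable

# Hypothetical raw R^2 of 0.90 with n = 22 observations and p = 9 predictors.
example = adjusted_r2(r2=0.90, n_samples=22, n_predictors=9)
# Hypothetical reduced-model R^2 of 0.85 after dropping one variable.
gain = incremental_r2(0.90, 0.85)
```

    The adjustment penalizes each added predictor, which matters when, as here, the number of predictors is large relative to the sample size.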

  9. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech From Speech Delay: I. Development and Description of the Pause Marker

    PubMed Central

    Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: The goal of this article (PM I) is to describe the rationale for and development of the Pause Marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech from speech delay. Method: The authors describe and prioritize 7 criteria with which to evaluate the research and clinical utility of a diagnostic marker for childhood apraxia of speech, including evaluation of the present proposal. An overview is given of the Speech Disorders Classification System, including extensions completed in the same approximately 3-year period in which the PM was developed. Results: The finalized Speech Disorders Classification System includes a nosology and cross-classification procedures for childhood and persistent speech disorders and motor speech disorders (Shriberg, Strand, & Mabie, 2017). A PM is developed that provides procedural and scoring information, and citations to papers and technical reports that include audio exemplars of the PM and reference data used to standardize PM scores are provided. Conclusions: The PM described here is an acoustic-aided perceptual sign that quantifies one aspect of speech precision in the linguistic domain of phrasing. This diagnostic marker can be used to discriminate early or persistent childhood apraxia of speech from speech delay. PMID:28384779

  10. Do not throw out the baby with the bath water: choosing an effective baseline for a functional localizer of speech processing.

    PubMed

    Stoppelman, Nadav; Harpaz, Tamar; Ben-Shachar, Michal

    2013-05-01

    Speech processing engages multiple cortical regions in the temporal, parietal, and frontal lobes. Isolating speech-sensitive cortex in individual participants is of major clinical and scientific importance. This task is complicated by the fact that responses to sensory and linguistic aspects of speech are tightly packed within the posterior superior temporal cortex. In functional magnetic resonance imaging (fMRI), various baseline conditions are typically used in order to isolate speech-specific from basic auditory responses. Using a short, continuous sampling paradigm, we show that reversed ("backward") speech, a commonly used auditory baseline for speech processing, removes much of the speech responses in frontal and temporal language regions of adult individuals. On the other hand, signal correlated noise (SCN) serves as an effective baseline for removing primary auditory responses while maintaining strong signals in the same language regions. We show that the response to reversed speech in left inferior frontal gyrus decays significantly faster than the response to speech, thus suggesting that this response reflects bottom-up activation of speech analysis followed up by top-down attenuation once the signal is classified as nonspeech. The results overall favor SCN as an auditory baseline for speech processing.

  11. The Use of Electroencephalography in Language Production Research: A Review

    PubMed Central

    Ganushchak, Lesya Y.; Christoffels, Ingrid K.; Schiller, Niels O.

    2011-01-01

    Speech production research long avoided electrophysiological experiments because of the suspicion that artifacts caused by muscle activity during overt speech would lead to a poor signal-to-noise ratio in the measurements. Therefore, researchers have sought to assess speech production by using indirect speech production tasks, such as tacit or implicit naming, delayed naming, or meta-linguistic tasks, such as phoneme-monitoring. Covert speech may, however, involve different processes than overt speech production. Recently, overt speech has been investigated using electroencephalography (EEG). The steadily rising number of published papers indicates increasing interest in, and demand for, overt speech research within the field of cognitive neuroscience of language. Our main goal here is to review all currently available results of overt speech production involving EEG measurements, such as picture naming, Stroop naming, and reading aloud. We conclude that overt speech production can be successfully studied using electrophysiological measures, for instance, event-related brain potentials (ERPs). We will discuss possible relevant components in the ERP waveform of speech production and aim to address the issue of how to interpret the results of ERP research using overt speech, and whether the ERP components in language production are comparable to results from other fields. PMID:21909333

  12. Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging.

    PubMed

    Hagedorn, Christina; Proctor, Michael; Goldstein, Louis; Wilson, Stephen M; Miller, Bruce; Gorno-Tempini, Maria Luisa; Narayanan, Shrikanth S

    2017-04-14

    Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided and the nature of pathomechanisms of apraxic speech discussed. One adult male speaker with apraxia of speech was imaged using real-time MRI while producing spontaneous speech, repeated naming tasks, and self-paced repetition of word pairs designed to elicit speech errors. Articulatory data were analyzed, and speech errors were detected using time series reflecting articulatory activity in regions of interest. Real-time MRI captured two types of apraxic gestural intrusion errors in a word pair repetition task. Gestural intrusion errors in nonrepetitive speech, multiple silent initiation gestures at the onset of speech, and covert (unphonated) articulation of entire monosyllabic words were also captured. Real-time MRI and accompanying analytical methods capture and quantify many features of apraxic speech that have been previously observed using other modalities while offering high spatial resolution. This patient's apraxia of speech affected the ability to select only the appropriate vocal tract gestures for a target utterance, suppressing others, and to coordinate them in time.

  13. Describing Speech Usage in Daily Activities in Typical Adults.

    PubMed

    Anderson, Laine; Baylor, Carolyn R; Eadie, Tanya L; Yorkston, Kathryn M

    2016-01-01

    "Speech usage" refers to what people want or need to do with their speech to meet communication demands in life roles. The purpose of this study was to contribute to validation of the Levels of Speech Usage scale by providing descriptive data from a sample of adults without communication disorders, comparing this scale to a published Occupational Voice Demands scale and examining predictors of speech usage levels. This is a survey design. Adults aged ≥25 years without reported communication disorders were recruited nationally to complete an online questionnaire. The questionnaire included the Levels of Speech Usage scale, questions about relevant occupational and nonoccupational activities (eg, socializing, hobbies, childcare, and so forth), and demographic information. Participants were also categorized according to the Koufman and Isaacson occupational voice demands scale. A total of 276 participants completed the questionnaires. People who worked for pay tended to report higher levels of speech usage than those who did not work for pay. Regression analyses showed employment to be the major contributor to speech usage; however, considerable variance left unaccounted for suggests that determinants of speech usage and the relationship between speech usage, employment, and other life activities are not yet fully defined. The Levels of Speech Usage may be a viable instrument to systematically rate speech usage because it captures both occupational and nonoccupational speech demands. These data from a sample of typical adults may provide a reference to help in interpreting the impact of communication disorders on speech usage patterns. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  14. Patterns of Post-Stroke Brain Damage that Predict Speech Production Errors in Apraxia of Speech and Aphasia Dissociate

    PubMed Central

    Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

    2015-01-01

    Background and Purpose: Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions regarding whether AOS emerges from a unique pattern of brain damage or as a sub-element of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Methods: Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The Apraxia of Speech Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with AOS and/or aphasia. Localized brain damage was identified using structural MRI, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS and/or aphasia, and brain damage. Results: The pattern of brain damage associated with AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS and/or aphasia were associated with damage to the temporal lobe and the inferior pre-central frontal regions. Conclusion: AOS likely occurs in conjunction with aphasia due to the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. PMID:25908457

  15. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 42 Public Health 5 2010-10-01 2010-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...

  16. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 5 2013-10-01 2013-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...

  17. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 5 2011-10-01 2011-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...

  18. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 42 Public Health 5 2014-10-01 2014-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...

  19. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 5 2012-10-01 2012-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...

  20. Freedom of Speech: A Clear and Present Need to Teach. ERIC Report.

    ERIC Educational Resources Information Center

    Boileau, Don M.

    1983-01-01

    Presents annotations of 21 documents in the ERIC system on the following subjects: (1) theory of freedom of speech; (2) theorists; (3) research on freedom of speech; (4) broadcasting and freedom of speech; and (5) international questions of freedom of speech. (PD)

  1. Retrieval from Memory: Vulnerable or Inviolable?

    ERIC Educational Resources Information Center

    Jones, Dylan M.; Marsh, John E.; Hughes, Robert W.

    2012-01-01

    We show that retrieval from semantic memory is vulnerable even to the mere presence of speech. Irrelevant speech impairs semantic fluency--namely, lexical retrieval cued by a semantic category name--but only if it is meaningful (forward speech compared to reversed speech or words compared to nonwords). Moreover, speech related semantically to the…

  2. Freedom of Speech Newsletter, September, 1975.

    ERIC Educational Resources Information Center

    Allen, Winfred G., Jr., Ed.

    The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…

  3. Voice and Speech after Laryngectomy

    ERIC Educational Resources Information Center

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  4. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  5. The Relationship between Speech Production and Speech Perception Deficits in Parkinson's Disease

    ERIC Educational Resources Information Center

    De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

    2016-01-01

    Purpose: This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Method: Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified…

  6. Speech Characteristics Associated with Three Genotypes of Ataxia

    ERIC Educational Resources Information Center

    Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana

    2011-01-01

    Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…

  7. Speech Acts across Cultures: Challenges to Communication in a Second Language. Studies on Language Acquisition, 11.

    ERIC Educational Resources Information Center

    Gass, Susan M., Ed.; Neu, Joyce, Ed.

    Articles on speech acts and intercultural communication include: "Investigating the Production of Speech Act Sets" (Andrew Cohen); "Non-Native Refusals: A Methodological Perspective" (Noel Houck, Susan M. Gass); "Natural Speech Act Data versus Written Questionnaire Data: How Data Collection Method Affects Speech Act…

  8. The "Checkers" Speech and Televised Political Communication.

    ERIC Educational Resources Information Center

    Flaningam, Carl

    Richard Nixon's 1952 "Checkers" speech was an innovative use of television for political communication. Like television news itself, the campaign fund crisis behind the speech can be thought of in the same terms as other television melodrama, with the speech serving as its climactic episode. The speech adapted well to television because…

  9. Phonemic Characteristics of Apraxia of Speech Resulting from Subcortical Hemorrhage

    ERIC Educational Resources Information Center

    Peach, Richard K.; Tonkovich, John D.

    2004-01-01

    Reports describing subcortical apraxia of speech (AOS) have received little consideration in the development of recent speech processing models because the speech characteristics of patients with this diagnosis have not been described precisely. We describe a case of AOS with aphasia secondary to basal ganglia hemorrhage. Speech-language symptoms…

  10. The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

    ERIC Educational Resources Information Center

    Ye, Ruijuan

    2010-01-01

    This paper carries out a tentative analysis of Barack Obama's victory speech in terms of the interpersonal metafunction, which aims to help readers understand and evaluate the speech regarding its suitability, and thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…

  11. The Effectiveness of Clear Speech as a Masker

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  12. 76 FR 44326 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-25

    ... Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities; Structure and Practices of the Video Relay Service Program AGENCY: Federal Communications Commission. ACTION...-minute video relay service (``VRS'') compensation rates, and adopts per-minute compensation rates for the...

  13. Advanced Persuasive Speaking, English, Speech: 5114.112.

    ERIC Educational Resources Information Center

    Dade County Public Schools, Miami, FL.

    Developed as a high school quinmester unit on persuasive speaking, this guide provides the teacher with teaching strategies for a course which analyzes speeches from "Vital Speeches of the Day," political speeches, TV commercials, and other types of speeches. Practical use of persuasive methods for school, community, county, state, and…

  14. ON THE NATURE OF SPEECH SCIENCE.

    ERIC Educational Resources Information Center

    PETERSON, GORDON E.

    IN THIS ARTICLE THE NATURE OF THE DISCIPLINE OF SPEECH SCIENCE IS CONSIDERED AND THE VARIOUS BASIC AND APPLIED AREAS OF THE DISCIPLINE ARE DISCUSSED. THE BASIC AREAS ENCOMPASS THE VARIOUS PROCESSES OF THE PHYSIOLOGY OF SPEECH PRODUCTION, THE ACOUSTICAL CHARACTERISTICS OF SPEECH, INCLUDING THE SPEECH WAVE TYPES AND THE INFORMATION-BEARING ACOUSTIC…

  15. Automated Speech Rate Measurement in Dysarthria

    ERIC Educational Resources Information Center

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  16. Neural and Behavioral Mechanisms of Clear Speech

    ERIC Educational Resources Information Center

    Luque, Jenna Silver

    2017-01-01

    Clear speech is a speaking style that has been shown to improve intelligibility in adverse listening conditions, for various listener and talker populations. Clear-speech phonetic enhancements include a slowed speech rate, expanded vowel space, and expanded pitch range. Although clear-speech phonetic enhancements have been demonstrated across a…

  17. Left Lateralized Enhancement of Orofacial Somatosensory Processing Due to Speech Sounds

    ERIC Educational Resources Information Center

    Ito, Takayuki; Johns, Alexis R.; Ostry, David J.

    2013-01-01

    Purpose: Somatosensory information associated with speech articulatory movements affects the perception of speech sounds and vice versa, suggesting an intimate linkage between speech production and perception systems. However, it is unclear which cortical processes are involved in the interaction between speech sounds and orofacial somatosensory…

  18. Audiovisual Cues and Perceptual Learning of Spectrally Distorted Speech

    ERIC Educational Resources Information Center

    Pilling, Michael; Thomas, Sharon

    2011-01-01

    Two experiments investigate the effectiveness of audiovisual (AV) speech cues (cues derived from both seeing and hearing a talker speak) in facilitating perceptual learning of spectrally distorted speech. Speech was distorted through an eight channel noise-vocoder which shifted the spectral envelope of the speech signal to simulate the properties…

  19. 78 FR 73859 - Information Collections Being Reviewed by the Federal Communications Commission Under Delegated...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-09

    ...: Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities... enable the Commission to collect waiver reports from Telecommunications Relay Service (TRS) providers... Report and Order and Order on Reconsideration in Telecommunications Relay Services and Speech-to-Speech...

  20. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…

  1. The influence of speech rate and accent on access and use of semantic information.

    PubMed

    Sajin, Stanislav M; Connine, Cynthia M

    2017-04-01

    Circumstances in which the speech input is presented in sub-optimal conditions generally lead to processing costs affecting spoken word recognition. The current study indicates that some processing demands imposed by listening to difficult speech can be mitigated by feedback from semantic knowledge. A set of lexical decision experiments examined how foreign accented speech and word duration impact access to semantic knowledge in spoken word recognition. Results indicate that when listeners process accented speech, the reliance on semantic information increases. Speech rate was not observed to influence semantic access, except in the setting in which unusually slow accented speech was presented. These findings support interactive activation models of spoken word recognition in which attention is modulated based on speech demands.

  2. Comparison of formant detection methods used in speech processing applications

    NASA Astrophysics Data System (ADS)

    Belean, Bogdan

    2013-11-01

    The paper describes time-frequency representations of the speech signal and the significance of formants in speech processing applications. Speech formants can be used in emotion recognition, sex discrimination, or the diagnosis of various neurological diseases. Given the various applications of formant detection in speech signals, two methods for detecting formants are presented. First, the poles obtained from a complex analysis of the LPC coefficients are used for formant detection. The second approach uses the Kalman filter for formant prediction along the speech signal. Results are presented for both approaches on real-life speech spectrograms. The features of the two methods are also compared, in order to establish which method is more suitable for different speech processing applications.
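
    The first method in the abstract, reading formants off the poles of an LPC model, can be sketched as follows. This is a generic textbook version under stated assumptions, not the paper's implementation: the autocorrelation method, the Hamming window, the bandwidth threshold, and the synthetic two-resonance test signal are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC: solve the normal equations R a = r."""
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1 : len(windowed) + order]  # lags 0..order
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)]
                         for i in range(order)])
    predictor = np.linalg.solve(toeplitz, r[1 : order + 1])
    return np.concatenate(([1.0], -predictor))  # A(z) = 1 - sum_k a_k z^-k

def formants_from_lpc(frame, sample_rate, order=8, max_bandwidth=400.0):
    """Formant candidates: angles of the narrow-bandwidth poles of 1/A(z)."""
    poles = np.roots(lpc_coefficients(frame, order))
    poles = poles[np.imag(poles) > 0]  # one pole per conjugate pair
    freqs = np.angle(poles) * sample_rate / (2 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * sample_rate / np.pi
    keep = (freqs > 90.0) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])

def resonator_signal(freqs_hz, bandwidths_hz, sample_rate, n_samples, seed=0):
    """Synthetic test input: white noise through an all-pole resonator filter."""
    a = np.array([1.0])
    for f, bw in zip(freqs_hz, bandwidths_hz):
        radius = np.exp(-np.pi * bw / sample_rate)
        theta = 2 * np.pi * f / sample_rate
        a = np.convolve(a, [1.0, -2 * radius * np.cos(theta), radius ** 2])
    excitation = np.random.default_rng(seed).standard_normal(n_samples)
    out = np.zeros(n_samples)
    for n in range(n_samples):
        acc = excitation[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * out[n - k]  # recursive (feedback) part of the filter
        out[n] = acc
    return out

sample_rate = 8000
frame = resonator_signal([700.0, 1800.0], [80.0, 100.0], sample_rate, 2048)
estimated = formants_from_lpc(frame, sample_rate, order=4)
print(np.round(estimated))
```

    The model order is set to 4 here only because the synthetic signal has exactly two resonances; real speech typically needs a higher order, and the extra poles it produces are what the bandwidth threshold is meant to discard.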

  3. Long term rehabilitation of a total glossectomy patient.

    PubMed

    Bachher, Gurmit Kaur; Dholam, Kanchan P

    2010-09-01

    Malignant tumours of the oral cavity that require resection of the tongue result in severe deficiencies in speech and deglutition. Speech misarticulation leads to loss of speech intelligibility, which can prevent or limit communication. Prosthodontic rehabilitation involves fabrication of a Palatal Augmentation Prosthesis (PAP) following partial glossectomy and a mandibular tongue prosthesis after total glossectomy [1]. Speech analysis of a total glossectomy patient rehabilitated with a tongue prosthesis was done with the help of Dr. Speech Software Version 4 (Tiger DRS, Inc., Seattle) twelve years after treatment. Speech therapy sessions along with a prosthesis helped him to correct the dental sounds by using the lower lip and upper dentures (labio-dentals). It was noticed that speech intelligibility, intonation pattern, speech articulation and overall loudness were noticeably improved.

  4. Speech perception and production in severe environments

    NASA Astrophysics Data System (ADS)

    Pisoni, David B.

    1990-09-01

    The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes were examined in speech production under these adverse conditions through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.

  5. Reaction times of normal listeners to laryngeal, alaryngeal, and synthetic speech.

    PubMed

    Evitts, Paul M; Searl, Jeff

    2006-12-01

    The purpose of this study was to compare listener processing demands when decoding alaryngeal compared to laryngeal speech. Fifty-six listeners were presented with single words produced by 1 proficient speaker from 5 different modes of speech: normal, tracheoesophageal (TE), esophageal (ES), electrolaryngeal (EL), and synthetic speech (SS). Cognitive processing load was indexed by listener reaction time (RT). To account for significant durational differences among the modes of speech, an RT ratio was calculated (stimulus duration divided by RT). Results indicated that the cognitive processing load was greater for ES and EL relative to normal speech. TE and normal speech did not differ in terms of RT ratio, suggesting fairly comparable cognitive demands placed on the listener. SS required greater cognitive processing load than normal and alaryngeal speech. The results are discussed relative to alaryngeal speech intelligibility and the role of the listener. Potential clinical applications and directions for future research are also presented.

  6. Should visual speech cues (speechreading) be considered when fitting hearing aids?

    NASA Astrophysics Data System (ADS)

    Grant, Ken

    2002-05-01

    When talker and listener are face-to-face, visual speech cues become an important part of the communication environment, and yet, these cues are seldom considered when designing hearing aids. Models of auditory-visual speech recognition highlight the importance of complementary versus redundant speech information for predicting auditory-visual recognition performance. Thus, for hearing aids to work optimally when visual speech cues are present, it is important to know whether the cues provided by amplification and the cues provided by speechreading complement each other. In this talk, data will be reviewed that show nonmonotonicity between auditory-alone speech recognition and auditory-visual speech recognition, suggesting that efforts designed solely to improve auditory-alone recognition may not always result in improved auditory-visual recognition. Data will also be presented showing that one of the most important speech cues for enhancing auditory-visual speech recognition performance, voicing, is often the cue that benefits least from amplification.

  7. Brain 'talks over' boring quotes: top-down activation of voice-selective areas while listening to monotonous direct speech quotations.

    PubMed

    Yao, Bo; Belin, Pascal; Scheepers, Christoph

    2012-04-15

    In human communication, direct speech (e.g., Mary said, "I'm hungry") is perceived as more vivid than indirect speech (e.g., Mary said that she was hungry). This vividness distinction has previously been found to underlie silent reading of quotations: Using functional magnetic resonance imaging (fMRI), we found that direct speech elicited higher brain activity in the temporal voice areas (TVA) of the auditory cortex than indirect speech, consistent with an "inner voice" experience in reading direct speech. Here we show that listening to monotonously spoken direct versus indirect speech quotations also engenders differential TVA activity. This suggests that individuals engage in top-down simulations or imagery of enriched supra-segmental acoustic representations while listening to monotonous direct speech. The findings shed new light on the acoustic nature of the "inner voice" in understanding direct speech. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. Clear speech and lexical competition in younger and older adult listeners.

    PubMed

    Van Engen, Kristin J

    2017-08-01

    This study investigated whether clear speech reduces the cognitive demands of lexical competition by crossing speaking style with lexical difficulty. Younger and older adults identified more words in clear versus conversational speech and more easy words than hard words. An initial analysis suggested that the effect of lexical difficulty was reduced in clear speech, but more detailed analyses within each age group showed this interaction was significant only for older adults. The results also showed that both groups improved over the course of the task and that clear speech was particularly helpful for individuals with poorer hearing: for younger adults, clear speech eliminated hearing-related differences that affected performance on conversational speech. For older adults, clear speech was generally more helpful to listeners with poorer hearing. These results suggest that clear speech affords perceptual benefits to all listeners and, for older adults, mitigates the cognitive challenge associated with identifying words with many phonological neighbors.

  9. Recognizing intentions in infant-directed speech: evidence for universals.

    PubMed

    Bryant, Gregory A; Barrett, H Clark

    2007-08-01

    In all languages studied to date, distinct prosodic contours characterize different intention categories of infant-directed (ID) speech. This vocal behavior likely exists universally as a species-typical trait, but little research has examined whether listeners can accurately recognize intentions in ID speech using only vocal cues, without access to semantic information. We recorded native-English-speaking mothers producing four intention categories of utterances (prohibition, approval, comfort, and attention) as both ID and adult-directed (AD) speech, and we then presented the utterances to Shuar adults (South American hunter-horticulturalists). Shuar subjects were able to reliably distinguish ID from AD speech and were able to reliably recognize the intention categories in both types of speech, although performance was significantly better with ID speech. This is the first demonstration that adult listeners in an indigenous, nonindustrialized, and nonliterate culture can accurately infer intentions from both ID speech and AD speech in a language they do not speak.

  10. An integrated approach to improving noisy speech perception

    NASA Astrophysics Data System (ADS)

    Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail

    2002-05-01

    For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques removing and/or reducing various kinds of distortions and interference (primarily unmasking and normalization in time and frequency fields), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality ultimate results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to perform noisy speech records decoding for courts, law enforcement and emergency services, accident investigation bodies, etc.

  11. Measurement of trained speech patterns in stuttering: interjudge and intrajudge agreement of experts by means of modified time-interval analysis.

    PubMed

    Alpermann, Anke; Huber, Walter; Natke, Ulrich; Willmes, Klaus

    2010-09-01

    Improved fluency after stuttering therapy is usually measured by the percentage of stuttered syllables. However, outcome studies rarely evaluate the use of trained speech patterns that speakers use to manage stuttering. This study investigated whether the modified time interval analysis can distinguish between trained speech patterns, fluent speech, and stuttered speech. Seventeen German experts on stuttering judged a speech sample on two occasions. Speakers of the sample were stuttering adults, who were not undergoing therapy, as well as participants in a fluency shaping and a stuttering modification therapy. Results showed satisfactory inter-judge and intra-judge agreement above 80%. Intervals with trained speech patterns were identified as consistently as stuttered and fluent intervals. We discuss limitations of the study, as well as implications of our findings for the development of training for identification of trained speech patterns and future outcome studies. The reader will be able to (a) explain different methods to measure the use of trained speech patterns, (b) evaluate whether German experts are able to discriminate intervals with trained speech patterns reliably from fluent and stuttered intervals and (c) describe how the measurement of trained speech patterns can contribute to outcome studies.

  12. A systematic review of treatment intensity in speech disorders.

    PubMed

    Kaipa, Ramesh; Peterson, Abigail Marie

    2016-12-01

    Treatment intensity (sometimes referred to as "practice amount") has been well-investigated in learning non-speech tasks, but its role in treating speech disorders has not been largely analysed. This study reviewed the literature regarding treatment intensity in speech disorders. A systematic search was conducted in four databases using appropriate search terms. Seven articles from a total of 580 met the inclusion criteria. The speech disorders investigated included speech sound disorders, dysarthria, acquired apraxia of speech and childhood apraxia of speech. All seven studies were evaluated for their methodological quality, research phase and evidence level. Evidence level of reviewed studies ranged from moderate to strong. With regard to the research phase, only one study was considered to be phase III research, which corresponds to the controlled trial phase. The remaining studies were considered to be phase II research, which corresponds to the phase where magnitude of therapeutic effect is assessed. Results suggested that higher treatment intensity was favourable over lower treatment intensity of specific treatment technique(s) for treating childhood apraxia of speech and speech sound (phonological) disorders. Future research should incorporate randomised-controlled designs to establish optimal treatment intensity that is specific to each of the speech disorders.

  13. Speech and nonspeech: What are we talking about?

    PubMed

    Maas, Edwin

    2017-08-01

    Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.

  14. Speech planning happens before speech execution: online reaction time methods in the study of apraxia of speech.

    PubMed

    Maas, Edwin; Mailend, Marja-Liisa

    2012-10-01

    The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Following a brief description of limitations of offline perceptual methods, we provide a narrative review of various types of RT paradigms from the (speech) motor programming and psycholinguistic literatures and their (thus far limited) application with AOS. On the basis of the review of the literature, we conclude that with careful consideration of potential challenges and caveats, RT approaches hold great promise to advance our understanding of AOS, in particular with respect to the speech planning processes that generate the speech signal before initiation. A deeper understanding of the nature and time course of speech planning and its disruptions in AOS may enhance diagnosis and treatment for AOS. Only a handful of published studies on apraxia of speech have used reaction time methods. However, these studies have provided deeper insight into speech planning impairments in AOS based on a variety of experimental paradigms.

  15. Neural integration of iconic and unrelated coverbal gestures: a functional MRI study.

    PubMed

    Green, Antonia; Straube, Benjamin; Weis, Susanne; Jansen, Andreas; Willmes, Klaus; Konrad, Kerstin; Kircher, Tilo

    2009-10-01

    Gestures are an important part of interpersonal communication, for example by illustrating physical properties of speech contents (e.g., "the ball is round"). The meaning of these so-called iconic gestures is strongly intertwined with speech. We investigated the neural correlates of the semantic integration for verbal and gestural information. Participants watched short videos of five speech and gesture conditions performed by an actor, including variation of language (familiar German vs. unfamiliar Russian), variation of gesture (iconic vs. unrelated), as well as isolated familiar language, while brain activation was measured using functional magnetic resonance imaging. For familiar speech with either of both gesture types contrasted to Russian speech-gesture pairs, activation increases were observed at the left temporo-occipital junction. Apart from this shared location, speech with iconic gestures exclusively engaged left occipital areas, whereas speech with unrelated gestures activated bilateral parietal and posterior temporal regions. Our results demonstrate that the processing of speech with speech-related versus speech-unrelated gestures occurs in two distinct but partly overlapping networks. The distinct processing streams (visual versus linguistic/spatial) are interpreted in terms of "auxiliary systems" allowing the integration of speech and gesture in the left temporo-occipital region.

  16. Engaged listeners: shared neural processing of powerful political speeches

    PubMed Central

    Häcker, Frank E. K.; Honey, Christopher J.; Hasson, Uri

    2015-01-01

    Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners’ brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. PMID:25653012
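
    As a rough illustration of inter-subject correlation, the sketch below uses a leave-one-out variant: each listener's time course is correlated with the average time course of all other listeners. The abstract does not say which variant the authors used, so this choice, the function name, and the synthetic data are all assumptions.

```python
import numpy as np


def leave_one_out_isc(data: np.ndarray) -> np.ndarray:
    """Correlate each subject's time course with the mean of all other subjects.

    data has shape (n_subjects, n_timepoints) for a single brain region.
    Returns one Pearson r per subject.
    """
    n_subjects = data.shape[0]
    if n_subjects < 2:
        raise ValueError("need at least two subjects")
    total = data.sum(axis=0)
    isc = np.empty(n_subjects)
    for i in range(n_subjects):
        others_mean = (total - data[i]) / (n_subjects - 1)
        isc[i] = np.corrcoef(data[i], others_mean)[0, 1]
    return isc


# Synthetic data only: a shared signal plus subject-specific noise.
rng = np.random.default_rng(0)
shared = np.sin(np.linspace(0.0, 20.0, 200))
strong = shared + 0.5 * rng.standard_normal((10, 200))        # strongly shared response
weak = 0.2 * shared + 1.0 * rng.standard_normal((10, 200))    # weakly shared response
isc_strong = leave_one_out_isc(strong)
isc_weak = leave_one_out_isc(weak)
```

    In this toy setup the mean of `isc_strong` exceeds the mean of `isc_weak`, which mirrors the kind of contrast the abstract reports between powerful and weak speeches without reproducing its analysis.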

  17. The interlanguage speech intelligibility benefit for native speakers of Mandarin: Production and perception of English word-final voicing contrasts

    PubMed Central

    Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.

    2009-01-01

    This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as 'cub' and 'cup') in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in an effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271

  18. Some articulatory details of emotional speech

    NASA Astrophysics Data System (ADS)

    Lee, Sungbok; Yildirim, Serdar; Bulut, Murtaza; Kazemzadeh, Abe; Narayanan, Shrikanth

    2005-09-01

    Differences in speech articulation among four emotion types, neutral, anger, sadness, and happiness, are investigated by analyzing tongue tip, jaw, and lip movement data collected from one male and one female speaker of American English. The data were collected using an electromagnetic articulography (EMA) system while subjects produced simulated emotional speech. Pitch, root-mean-square (rms) energy and the first three formants were estimated for vowel segments. For both speakers, angry speech exhibited the largest rms energy and largest articulatory activity in terms of displacement range and movement speed. Happy speech is characterized by the largest pitch variability. It has higher rms energy than neutral speech but articulatory activity is rather comparable to, or less than, neutral speech. That is, happy speech is more prominent in voicing activity than in articulation. Sad speech exhibits the longest sentence duration and lower rms energy. However, its articulatory activity is no less than neutral speech. Interestingly, for the male speaker, articulation for vowels in sad speech is consistently more peripheral (i.e., more forwarded displacements) when compared to other emotions. However, this does not hold for the female subject. These and other results will be discussed in detail with associated acoustics and perceived emotional qualities. [Work supported by NIH.]

  19. Processing of speech signals for physical and sensory disabilities.

    PubMed Central

    Levitt, H

    1995-01-01

    Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities. PMID:7479816

  20. Loss tolerant speech decoder for telecommunications

    NASA Technical Reports Server (NTRS)

    Prieto, Jr., Jaime L. (Inventor)

    1999-01-01

    A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
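
    The control flow described above (store parameters from good frames in a past-history buffer, substitute extrapolated parameters when a frame is lost) can be sketched as follows. This is not the patented method: the trained FIR neural network is replaced here by a simple linear extrapolation from the last two parameter vectors, and the class name, method names, buffer length, and the choice to store substituted parameters back into the history are all assumptions made for illustration.

```python
from collections import deque
from typing import List, Optional, Sequence


class ConcealingDecoder:
    """Toy frame-loss concealment using a past-history buffer of parameter vectors."""

    def __init__(self, history: int = 4) -> None:
        self.past: deque = deque(maxlen=history)

    def _extrapolate(self) -> List[float]:
        # Stand-in for the neural one-step extrapolator: linear extrapolation.
        if not self.past:
            raise RuntimeError("no history available to extrapolate from")
        if len(self.past) == 1:
            return list(self.past[-1])
        prev, last = self.past[-2], self.past[-1]
        return [2.0 * b - a for a, b in zip(prev, last)]

    def next_parameters(self, frame: Optional[Sequence[float]]) -> List[float]:
        """Return parameters for this frame; None marks a lost or corrupted frame."""
        params = list(frame) if frame is not None else self._extrapolate()
        self.past.append(params)  # assumption: substitutes also enter the history
        return params


# Hypothetical parameter vectors; the third frame is lost.
decoder = ConcealingDecoder()
received = [[1.0, 10.0], [2.0, 9.0], None, [4.0, 7.0]]
decoded = [decoder.next_parameters(f) for f in received]
```

    The downstream consumer sees a parameter vector for every frame, so it continues as usual whether or not a frame was lost.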

  1. Effects of linear and nonlinear speech rate changes on speech intelligibility in stationary and fluctuating maskers

    PubMed Central

    Cooke, Martin; Aubanel, Vincent

    2017-01-01

    Algorithmic modifications to the durational structure of speech designed to avoid intervals of intense masking lead to increases in intelligibility, but the basis for such gains is not clear. The current study addressed the possibility that the reduced information load produced by speech rate slowing might explain some or all of the benefits of durational modifications. The study also investigated the influence of masker stationarity on the effectiveness of durational changes. Listeners identified keywords in sentences that had undergone linear and nonlinear speech rate changes resulting in overall temporal lengthening in the presence of stationary and fluctuating maskers. Relative to unmodified speech, a slower speech rate produced no intelligibility gains for the stationary masker, suggesting that a reduction in information rate does not underlie intelligibility benefits of durationally modified speech. However, both linear and nonlinear modifications led to substantial intelligibility increases in fluctuating noise. One possibility is that overall increases in speech duration provide no new phonetic information in stationary masking conditions, but that temporal fluctuations in the background increase the likelihood of glimpsing additional salient speech cues. Alternatively, listeners may have benefitted from an increase in the difference in speech rates between the target and background. PMID:28618803

  2. Development of a Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Evaluation of Sound Quality Using a Semantic Differential Method

    NASA Astrophysics Data System (ADS)

    Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

    2013-07-01

    Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed. However, there is room for improvement particularly in terms of sound quality. BCU speech is accompanied by a strong high-pitched tone and contains some distortion. In this study, the sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulations] and air-conducted (AC) speech was quantitatively evaluated using semantic differential and factor analysis. The results showed that all the types of BCU speech had higher metallic and lower esthetic factor scores than AC speech. On the other hand, transposed speech was closer than the other types of BCU speech to AC speech generally; the transposed speech showed a higher powerfulness factor score than the other types of BCU speech and a higher esthetic factor score than DSB-SC speech. These results provide useful information for further development of the BCUHA.
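
    The difference between two of the modulation types named above can be shown with textbook definitions: DSB-TC multiplies the carrier by (1 + m·x(t)), so the carrier itself is transmitted, whereas DSB-SC multiplies the carrier by x(t) alone. The sketch below assumes a 30 kHz carrier, a 192 kHz sample rate, and a 1 kHz test tone; none of these values come from the abstract, and transposed modulation is not covered.

```python
import numpy as np


def dsb_tc(message: np.ndarray, carrier_hz: float, sample_rate_hz: float,
           mod_index: float = 1.0) -> np.ndarray:
    """Double-sideband with transmitted carrier: (1 + m*x(t)) * cos(2*pi*fc*t)."""
    t = np.arange(len(message)) / sample_rate_hz
    return (1.0 + mod_index * message) * np.cos(2.0 * np.pi * carrier_hz * t)


def dsb_sc(message: np.ndarray, carrier_hz: float, sample_rate_hz: float) -> np.ndarray:
    """Double-sideband with suppressed carrier: x(t) * cos(2*pi*fc*t)."""
    t = np.arange(len(message)) / sample_rate_hz
    return message * np.cos(2.0 * np.pi * carrier_hz * t)


# Assumed illustration values (not from the abstract).
fs, fc, fm = 192_000.0, 30_000.0, 1_000.0
n = 1920                                   # 10 ms, so FFT bins are 100 Hz apart
tone = np.sin(2.0 * np.pi * fm * np.arange(n) / fs)

tc_spectrum = np.abs(np.fft.rfft(dsb_tc(tone, fc, fs)))
sc_spectrum = np.abs(np.fft.rfft(dsb_sc(tone, fc, fs)))
carrier_bin = int(fc / (fs / n))           # bin index of the carrier frequency
sideband_bin = int((fc + fm) / (fs / n))   # bin index of the upper sideband
```

    In the DSB-TC spectrum the carrier bin carries substantial energy, while in the DSB-SC spectrum it is essentially empty and only the sidebands remain. A transmitted carrier is one plausible source of the strong high-pitched tone the abstract mentions, although the abstract does not attribute it to any particular modulation type.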

  3. Lexical and phonological variability in preschool children with speech sound disorder.

    PubMed

    Macrae, Toby; Tyler, Ann A; Lewis, Kerry E

    2014-02-01

    The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.
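
    For readers unfamiliar with the statistic used in this study, the Pearson product-moment correlation coefficient can be computed as in the sketch below. The function name and the five paired scores are hypothetical and chosen to give a perfectly negative relationship; they are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores: higher word variability paired with smaller vocabulary.
word_variability = [1.0, 2.0, 3.0, 4.0, 5.0]
receptive_vocabulary = [10.0, 8.0, 6.0, 4.0, 2.0]
r = pearson_r(word_variability, receptive_vocabulary)
```

    A negative r of this kind corresponds to the direction of association the abstract describes, where high variability goes with lower scores on the other measure.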

  4. Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

    PubMed

    Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

    2017-02-01

    The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups of typical speech, consistent speech disorder (CSD) and inconsistent speech disorder (ISD). The phonetic gap detection threshold test was used for this study, which is a valid test comprising six syllables with inter-stimulus intervals between 20 and 300 ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, which is caused by a high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.

    PubMed

    Williams, Jason A

    2012-06-01

    The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.

  6. Developing a Weighted Measure of Speech Sound Accuracy

    PubMed Central

    Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

    2010-01-01

    Purpose: The purpose is to develop a system for numerically quantifying a speaker’s phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, we describe a system for differentially weighting speech sound errors based on various levels of phonetic accuracy with a Weighted Speech Sound Accuracy (WSSA) score. We then evaluate the reliability and validity of this measure. Method: Phonetic transcriptions are analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy is compared to existing measures, is used to discriminate typical and disordered speech production, and is evaluated to determine whether it is sensitive to changes in phonetic accuracy over time. Results: Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners’ judgments of severity of a child’s speech disorder. The measure separates children with and without speech sound disorders. WSSA scores also capture growth in phonetic accuracy in toddlers’ speech over time. Conclusion: Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children’s speech. PMID:20699344

  7. Duration, Pitch, and Loudness in Kunqu Opera Stage Speech.

    PubMed

    Han, Qichao; Sundberg, Johan

    2017-03-01

    Kunqu is a special type of opera within the Chinese tradition with 600 years of history. In it, stage speech is used for the spoken dialogue. It is performed in the Mandarin of the Ming Dynasty and is a much more dominant part of the play than singing. Stage speech deviates considerably from normal conversational speech with respect to duration, loudness and pitch. This paper compares these properties in stage speech and conversational speech. A famous, highly experienced female singer performed stage speech and read the same lyrics in a conversational speech mode. Clear differences are found. As compared with conversational speech, stage speech had longer word and sentence duration and word duration was less variable. Average sound level was 16 dB higher. Also mean fundamental frequency was considerably higher and more varied. Within sentences, both loudness and fundamental frequency tended to vary according to a low-high-low pattern. Some of the findings fail to support current opinions regarding the characteristics of stage speech, and in this sense the study demonstrates the relevance of objective measurements in descriptions of vocal styles. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  8. Infant-Directed Visual Prosody: Mothers’ Head Movements and Speech Acoustics

    PubMed Central

    Smith, Nicholas A.; Strader, Heather L.

    2014-01-01

    Acoustical changes in the prosody of mothers’ speech to infants are distinct and near universal. However, less is known about the visible properties of mothers’ infant-directed (ID) speech, and their relation to speech acoustics. Mothers’ head movements were tracked as they interacted with their infants using ID speech, and compared to movements accompanying their adult-directed (AD) speech. Movement measures along three dimensions of head translation, and three axes of head rotation were calculated. Overall, more head movement was found for ID than AD speech, suggesting that mothers exaggerate their visual prosody in a manner analogous to the acoustical exaggerations in their speech. Regression analyses examined the relation between changing head position and changing acoustical pitch (F0) over time. Head movements and voice pitch were more strongly related in ID speech than in AD speech. When these relations were examined across time windows of different durations, stronger relations were observed for shorter time windows (< 5 sec). However, the particular form of these more local relations did not extend or generalize to longer time windows. This suggests that the multimodal correspondences in speech prosody are variable in form, and occur within limited time spans. PMID:25242907

  9. Processing of Speech Signals for Physical and Sensory Disabilities

    NASA Astrophysics Data System (ADS)

    Levitt, Harry

    1995-10-01

    Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

  10. VOT in speech-disordered individuals: History, theory, data, reminiscence

    NASA Astrophysics Data System (ADS)

    Weismer, Gary

    2004-05-01

    Forty years ago Lisker and Abramson published their landmark paper on VOT; the speech-research world has never been the same. The concept of VOT as a measure relevant to phonology, speech physiology, and speech perception made it a prime choice for scientists who saw an opportunity to exploit the techniques and analytic frameworks of "speech science" in the study of speech disorders. Modifications of VOT in speech disorders have been used to draw specific inferences concerning phonological representations, glottal-supraglottal timing, and speech intelligibility. This presentation will provide a review of work on VOT in speech disorders, including (among others) stuttering, hearing impairment, and neurogenic disorders. An attempt will be made to collect published data in summary graphic form, and to discuss their implications. Emphasis will be placed on how VOT has been used to inform theories of disordered speech production. I will close with some personal comments about the influence (unbeknownst to them) these two outstanding scientists had on me in the 1970s, when under the spell of their work I first became aware that the world of speech research did not start and end with moving parts.

  11. Cortical activity patterns predict robust speech discrimination ability in noise

    PubMed Central

    Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

    2012-01-01

    The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331

  12. Speech fluency profile in Williams-Beuren syndrome: a preliminary study.

    PubMed

    Rossi, Natalia Freitas; Souza, Deise Helena de; Moretti-Ferreira, Danilo; Giacheti, Célia Maria

    2009-01-01

    The speech fluency pattern attributed to individuals with Williams-Beuren syndrome (WBS) is supported by the effectiveness of the phonological loop. Some studies have reported the occurrence of speech disruptions caused by lexical and semantic deficits. However, the type and frequency of such speech disruptions have not been well elucidated. To determine the speech fluency profile of individuals with WBS and to compare the speech performance of these individuals to a control group matched by gender and mental age. Twelve subjects with Williams-Beuren syndrome, chronologically aged between 6.6 and 23.6 years and mental age ranging from 4.8 to 14.3 years, were evaluated. They were compared with another group consisting of 12 subjects with similar mental age and with no speech or learning difficulties. Speech fluency parameters were assessed according to the ABFW Language Test: type and frequency of speech disruptions and speech rate. The obtained results were compared between the groups. In comparison with individuals of similar mental age and typical speech and language development, the group with Williams-Beuren syndrome showed a greater percentage of speech discontinuity, and an increased frequency of common hesitations and word repetition. The speech fluency profile presented by individuals with WBS in this study suggests that the presence of disfluencies can be caused by deficits in the lexical, semantic, and syntactic processing of verbal information. The authors stress that further systematic investigations on the subject are warranted.

  13. Dissecting choral speech: properties of the accompanist critical to stuttering reduction.

    PubMed

    Kiefte, Michael; Armson, Joy

    2008-01-01

    The effects of choral speech and altered auditory feedback (AAF) on stuttering frequency were compared to identify those properties of choral speech that make it a more effective condition for stuttering reduction. Seventeen adults who stutter (AWS) participated in an experiment consisting of special choral speech conditions that were manipulated to selectively eliminate specific differences between choral speech and AAF. Consistent with previous findings, results showed that both choral speech and AAF reduced stuttering compared to solo reading. Although reductions under AAF were substantial, they were less dramatic than those for choral speech. Stuttering reduction for choral speech was highly robust even when the accompanist's voice temporally lagged that of the AWS, when there was no opportunity for dynamic interplay between the AWS and accompanist, and when the accompanist was replaced by the AWS's own voice, all of which approximate specific features of AAF. Choral speech was also highly effective in reducing stuttering across changes in speech rate and for both familiar and unfamiliar passages. We concluded that differences in properties between choral speech and AAF other than those that were manipulated in this experiment must account for differences in stuttering reduction. The reader will be able to (1) describe differences in stuttering reduction associated with altered auditory feedback compared to choral speech conditions and (2) describe differences between delivery of a second voice signal as an altered rendition of the speaker's own voice (altered auditory feedback) and alterations in the voice of an accompanist (choral speech).

  14. Speech sound disorder at 4 years: prevalence, comorbidities, and predictors in a community cohort of children.

    PubMed

    Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena

    2015-06-01

    The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.

  15. Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

    PubMed

    Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

    2012-12-01

    Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.

  16. Perception drives production across sensory modalities: A network for sensorimotor integration of visual speech.

    PubMed

    Venezia, Jonathan H; Fillmore, Paul; Matchin, William; Isenberg, A Lisette; Hickok, Gregory; Fridriksson, Julius

    2016-02-01

    Sensory information is critical for movement control, both for defining the targets of actions and providing feedback during planning or ongoing movements. This holds for speech motor control as well, where both auditory and somatosensory information have been shown to play a key role. Recent clinical research demonstrates that individuals with severe speech production deficits can show a dramatic improvement in fluency during online mimicking of an audiovisual speech signal suggesting the existence of a visuomotor pathway for speech motor control. Here we used fMRI in healthy individuals to identify this new visuomotor circuit for speech production. Participants were asked to perceive and covertly rehearse nonsense syllable sequences presented auditorily, visually, or audiovisually. The motor act of rehearsal, which is prima facie the same whether or not it is cued with a visible talker, produced different patterns of sensorimotor activation when cued by visual or audiovisual speech (relative to auditory speech). In particular, a network of brain regions including the left posterior middle temporal gyrus and several frontoparietal sensorimotor areas activated more strongly during rehearsal cued by a visible talker versus rehearsal cued by auditory speech alone. Some of these brain regions responded exclusively to rehearsal cued by visual or audiovisual speech. This result has significant implications for models of speech motor control, for the treatment of speech output disorders, and for models of the role of speech gesture imitation in development. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Methodology for speech assessment in the Scandcleft project--an international randomized clinical trial on palatal surgery: experiences from a pilot study.

    PubMed

    Lohmander, A; Willadsen, E; Persson, C; Henningsson, G; Bowden, M; Hutters, B

    2009-07-01

    To present the methodology for speech assessment in the Scandcleft project and discuss issues from a pilot study. Description of methodology and blinded test for speech assessment. Speech samples and instructions for data collection and analysis for comparisons of speech outcomes across five included languages were developed and tested. PARTICIPANTS AND MATERIALS: Randomly selected video recordings of 10 5-year-old children from each language (n = 50) were included in the project. Speech material consisted of test consonants in single words, connected speech, and syllable chains with nasal consonants. Five experienced speech and language pathologists participated as observers. Narrow phonetic transcription of test consonants translated into cleft speech characteristics, ordinal scale rating of resonance, and perceived velopharyngeal closure (VPC). A velopharyngeal composite score (VPC-sum) was extrapolated from raw data. Intra-agreement comparisons were performed. Range for intra-agreement for consonant analysis was 53% to 89%, for hypernasality on high vowels in single words the range was 20% to 80%, and the agreement between the VPC-sum and the overall rating of VPC was 78%. Pooling data of speakers of different languages in the same trial and comparing speech outcome across trials seems possible if the assessment of speech concerns consonants and is confined to speech units that are phonetically similar across languages. Agreed conventions and rules are important. A composite variable for perceptual assessment of velopharyngeal function during speech seems usable; whereas, the method for hypernasality evaluation requires further testing.

  18. Perception drives production across sensory modalities: A network for sensorimotor integration of visual speech

    PubMed Central

    Venezia, Jonathan H.; Fillmore, Paul; Matchin, William; Isenberg, A. Lisette; Hickok, Gregory; Fridriksson, Julius

    2015-01-01

    Sensory information is critical for movement control, both for defining the targets of actions and providing feedback during planning or ongoing movements. This holds for speech motor control as well, where both auditory and somatosensory information have been shown to play a key role. Recent clinical research demonstrates that individuals with severe speech production deficits can show a dramatic improvement in fluency during online mimicking of an audiovisual speech signal suggesting the existence of a visuomotor pathway for speech motor control. Here we used fMRI in healthy individuals to identify this new visuomotor circuit for speech production. Participants were asked to perceive and covertly rehearse nonsense syllable sequences presented auditorily, visually, or audiovisually. The motor act of rehearsal, which is prima facie the same whether or not it is cued with a visible talker, produced different patterns of sensorimotor activation when cued by visual or audiovisual speech (relative to auditory speech). In particular, a network of brain regions including the left posterior middle temporal gyrus and several frontoparietal sensorimotor areas activated more strongly during rehearsal cued by a visible talker versus rehearsal cued by auditory speech alone. Some of these brain regions responded exclusively to rehearsal cued by visual or audiovisual speech. This result has significant implications for models of speech motor control, for the treatment of speech output disorders, and for models of the role of speech gesture imitation in development. PMID:26608242

  19. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    NASA Astrophysics Data System (ADS)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.

  20. Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and nonnative English speakers.

    PubMed

    Calandruccio, Lauren; Bradlow, Ann R; Dhar, Sumitrajit

    2014-04-01

    Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared with native-accented English speech was reported in Calandruccio et al. (2010a). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. A mixed-model design with within-subject (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech and high-intelligibility, moderate-intelligibility, and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Three listener groups were tested, including monolingual English speakers with normal hearing, nonnative English speakers with normal hearing, and monolingual English speakers with hearing loss. The nonnative English speakers were from various native language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetric mild sloping to moderate sensorineural hearing loss. Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the key words within the sentences (100 key words per masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and listener groups.
    Monolingual English speakers with normal hearing benefited when the competing speech signal was foreign accented compared with native accented, allowing for improved speech recognition. Various levels of intelligibility across the foreign-accented speech maskers did not influence results. Neither the nonnative English-speaking listeners with normal hearing nor the monolingual English speakers with hearing loss benefited from masking release when the masker was changed from native-accented to foreign-accented English. Slight modifications between the target and the masker speech allowed monolingual English speakers with normal hearing to improve their recognition of native-accented English, even when the competing speech was highly intelligible. Further research is needed to determine which modifications within the competing speech signal caused the Mandarin-accented English to be less effective with respect to masking. Determining the influences within the competing speech that make it less effective as a masker or determining why monolingual normal-hearing listeners can take advantage of these differences could help improve speech recognition for those with hearing loss in the future. American Academy of Audiology.

  1. [Improving speech comprehension using a new cochlear implant speech processor].

    PubMed

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise.
    Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg sentences in the clinical setting S(0)N(CI), with speech signal at 0 degrees and noise lateral to the CI at 90 degrees. With the convincing findings from our evaluations of this multicenter study cohort, a trial with the Freedom 24 sound processor for all suitable CI users is recommended. For evaluating the benefits of a new processor, the comparative assessment paradigm used in our study design would be considered ideal for use with individual patients.

  2. Cortical oscillations and entrainment in speech processing during working memory load.

    PubMed

    Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten

    2018-02-02

    Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity (< 13 Hz). We found that increases in both types of WM load (background noise and n-back level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  3. Significance of parametric spectral ratio methods in detection and recognition of whispered speech

    NASA Astrophysics Data System (ADS)

    Mathur, Arpit; Reddy, Shankar M.; Hegde, Rajesh M.

    2012-12-01

    In this article the significance of a new parametric spectral ratio method that can be used to detect whispered speech segments within normally phonated speech is described. Adaptation methods based on the maximum likelihood linear regression (MLLR) are then used to realize a mismatched train-test style speech recognition system. This proposed parametric spectral ratio method computes a ratio spectrum of the linear prediction (LP) and the minimum variance distortion-less response (MVDR) methods. The smoothed ratio spectrum is then used to detect whispered segments of speech within neutral speech segments effectively. The proposed LP-MVDR ratio method exhibits robustness at different SNRs as indicated by the whisper diarization experiments conducted on the CHAINS and the cell phone whispered speech corpus. The proposed method also performs reasonably better than the conventional methods for whisper detection. In order to integrate the proposed whisper detection method into a conventional speech recognition engine with minimal changes, adaptation methods based on the MLLR are used herein. The hidden Markov models corresponding to neutral mode speech are adapted to the whispered mode speech data in the whispered regions as detected by the proposed ratio method. The performance of this method is first evaluated on whispered speech data from the CHAINS corpus. The second set of experiments is conducted on the cell phone corpus of whispered speech. This corpus is collected using a setup that is used commercially for handling public transactions. The proposed whisper speech recognition system exhibits reasonably better performance when compared to several conventional methods. The results shown indicate the possibility of a whispered speech recognition system for cell phone based transactions.

  4. Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under “Cocktail-Party” Listening Conditions

    PubMed Central

    Gao, Yayue; Wang, Qian; Ding, Yu; Wang, Changming; Li, Haifeng; Wu, Xihong; Qu, Tianshu; Li, Liang

    2017-01-01

    Human listeners are able to selectively attend to target speech in a noisy environment with multiple-people talking. Using recordings of scalp electroencephalogram (EEG), this study investigated how selective attention facilitates the cortical representation of target speech under a simulated “cocktail-party” listening condition with speech-on-speech masking. The result shows that the cortical representation of target-speech signals under the multiple-people talking condition was specifically improved by selective attention relative to the non-selective-attention listening condition, and the beta-band activity was most strongly modulated by selective attention. Moreover, measured with the Granger Causality value, selective attention to the single target speech in the mixed-speech complex enhanced the following four causal connectivities for the beta-band oscillation: the ones (1) from site FT7 to the right motor area, (2) from the left frontal area to the right motor area, (3) from the central frontal area to the right motor area, and (4) from the central frontal area to the right frontal area. However, the selective-attention-induced change in beta-band causal connectivity from the central frontal area to the right motor area, but not other beta-band causal connectivities, was significantly correlated with the selective-attention-induced change in the cortical beta-band representation of target speech. These findings suggest that under the “cocktail-party” listening condition, the beta-band oscillation in EEGs to target speech is specifically facilitated by selective attention to the target speech that is embedded in the mixed-speech complex. The selective attention-induced unmasking of target speech may be associated with the improved beta-band functional connectivity from the central frontal area to the right motor area, suggesting a top-down attentional modulation of the speech-motor process. PMID:28239344

  5. A Networking of Community-Based Speech Therapy: Borabue District, Maha Sarakham.

    PubMed

    Pumnum, Tawitree; Kum-ud, Weawta; Prathanee, Benjamas

    2015-08-01

    Most children with cleft lip and palate (CLP) have articulation problems because of compensatory articulation disorders from velopharyngeal insufficiency. Theoretically, children should receive speech therapy from a speech and language pathologist (SLP) 1-2 sessions per week. In developing countries, particularly Thailand, most of them cannot reach standard speech services because speech services and SLPs are limited. Networking of a Community-Based Speech Model might be an appropriate way to solve this problem. The objective was to study the effectiveness of a networking of Khon Kaen University (KKU) Community-Based Speech Model, Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham, in decreasing the number of articulation errors for children with CLP. Participants were six children with CLP who lived in Borabue and the surrounding district, Maha Sarakham, and had medical records in Srinagarind Hospital. They were assessed for pre- and post-articulation errors and provided speech therapy by SLP via teaching on service for speech assistant (SA). Then, children with CLP received speech correction (SC) by SA based on assignment and caregivers practiced home program for a year. Networking of Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham significantly reduced the number of post-articulation errors for 3 children with CLP. There were factors affecting the results in treatment of other children as follows: delayed speech and language development, hypernasality, and consistency of SC at local hospital and home. A networking of KKU Community-Based Speech Model, Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham was a good way to enhance speech therapy in Thailand or other developing countries that have limited speech services or lack professionals.

  6. Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners.

    PubMed

    Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim

    2015-06-15

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  7. Real-time continuous visual biofeedback in the treatment of speech breathing disorders following childhood traumatic brain injury: report of one case.

    PubMed

    Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C

    1999-01-01

    The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks was devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.

  8. Neural evidence for predictive coding in auditory cortex during speech production.

    PubMed

    Okada, Kayoko; Matchin, William; Hickok, Gregory

    2018-02-01

    Recent models of speech production suggest that motor commands generate forward predictions of the auditory consequences of those commands, that these forward predictions can be used to monitor and correct speech output, and that this system is hierarchically organized (Hickok, Houde, & Rong, Neuron, 69(3), 407-422, 2011; Pickering & Garrod, Behavioral and Brain Sciences, 36(4), 329-347, 2013). Recent psycholinguistic research has shown that internally generated speech (i.e., imagined speech) produces different types of errors than does overt speech (Oppenheim & Dell, Cognition, 106(1), 528-537, 2008; Oppenheim & Dell, Memory & Cognition, 38(8), 1147-1160, 2010). These studies suggest that articulated speech might involve predictive coding at more levels than imagined speech. The current fMRI experiment investigates neural evidence of predictive coding in speech production. Twenty-four participants from UC Irvine were recruited for the study. Participants were scanned while they were visually presented with a sequence of words that they reproduced in sync with a visual metronome. On each trial, they were cued to either silently articulate the sequence or to imagine the sequence without overt articulation. As expected, silent articulation and imagined speech both engaged a left hemisphere network previously implicated in speech production. A contrast of silent articulation with imagined speech revealed greater activation for articulated speech in inferior frontal cortex, premotor cortex and the insula in the left hemisphere, consistent with greater articulatory load. Although both conditions were silent, this contrast also produced significantly greater activation in auditory cortex in dorsal superior temporal gyrus in both hemispheres. We suggest that these activations reflect forward predictions arising from additional levels of the perceptual/motor hierarchy that are involved in monitoring the intended speech output.

  9. Frontal Top-Down Signals Increase Coupling of Auditory Low-Frequency Oscillations to Continuous Speech in Human Listeners

    PubMed Central

    Park, Hyojin; Ince, Robin A.A.; Schyns, Philippe G.; Thut, Gregor; Gross, Joachim

    2015-01-01

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. PMID:26028433

  10. Selective Attention Enhances Beta-Band Cortical Oscillation to Speech under "Cocktail-Party" Listening Conditions.

    PubMed

    Gao, Yayue; Wang, Qian; Ding, Yu; Wang, Changming; Li, Haifeng; Wu, Xihong; Qu, Tianshu; Li, Liang

    2017-01-01

    Human listeners are able to selectively attend to target speech in a noisy environment with multiple-people talking. Using recordings of scalp electroencephalogram (EEG), this study investigated how selective attention facilitates the cortical representation of target speech under a simulated "cocktail-party" listening condition with speech-on-speech masking. The result shows that the cortical representation of target-speech signals under the multiple-people talking condition was specifically improved by selective attention relative to the non-selective-attention listening condition, and the beta-band activity was most strongly modulated by selective attention. Moreover, measured with the Granger Causality value, selective attention to the single target speech in the mixed-speech complex enhanced the following four causal connectivities for the beta-band oscillation: the ones (1) from site FT7 to the right motor area, (2) from the left frontal area to the right motor area, (3) from the central frontal area to the right motor area, and (4) from the central frontal area to the right frontal area. However, the selective-attention-induced change in beta-band causal connectivity from the central frontal area to the right motor area, but not other beta-band causal connectivities, was significantly correlated with the selective-attention-induced change in the cortical beta-band representation of target speech. These findings suggest that under the "cocktail-party" listening condition, the beta-band oscillation in EEGs to target speech is specifically facilitated by selective attention to the target speech that is embedded in the mixed-speech complex. The selective attention-induced unmasking of target speech may be associated with the improved beta-band functional connectivity from the central frontal area to the right motor area, suggesting a top-down attentional modulation of the speech-motor process.

  11. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogden, J.

    The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.

  12. Speech training alters tone frequency tuning in rat primary auditory cortex

    PubMed Central

    Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Kilgard, Michael P.

    2013-01-01

    Previous studies in both humans and animals have documented improved performance following discrimination training. This enhanced performance is often associated with cortical response changes. In this study, we tested the hypothesis that long-term speech training on multiple tasks can improve primary auditory cortex (A1) responses compared to rats trained on a single speech discrimination task or experimentally naïve rats. Specifically, we compared the percent of A1 responding to trained sounds, the responses to both trained and untrained sounds, receptive field properties of A1 neurons, and the neural discrimination of pairs of speech sounds in speech trained and naïve rats. Speech training led to accurate discrimination of consonant and vowel sounds, but did not enhance A1 response strength or the neural discrimination of these sounds. Speech training altered tone responses in rats trained on six speech discrimination tasks but not in rats trained on a single speech discrimination task. Extensive speech training resulted in broader frequency tuning, shorter onset latencies, a decreased driven response to tones, and caused a shift in the frequency map to favor tones in the range where speech sounds are the loudest. Both the number of trained tasks and the number of days of training strongly predict the percent of A1 responding to a low frequency tone. Rats trained on a single speech discrimination task performed less accurately than rats trained on multiple tasks and did not exhibit A1 response changes. Our results indicate that extensive speech training can reorganize the A1 frequency map, which may have downstream consequences on speech sound processing. PMID:24344364

  13. Pulse Vector-Excitation Speech Encoder

    NASA Technical Reports Server (NTRS)

    Davidson, Grant; Gersho, Allen

    1989-01-01

    Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.

  14. Versatile simulation testbed for rotorcraft speech I/O system design

    NASA Technical Reports Server (NTRS)

    Simpson, Carol A.

    1986-01-01

    A versatile simulation testbed for the design of a rotorcraft speech I/O system is described in detail. The testbed will be used to evaluate alternative implementations of synthesized speech displays and speech recognition controls for the next generation of Army helicopters including the LHX. The message delivery logic is discussed as well as the message structure, the speech recognizer command structure and features, feedback from the recognizer, and random access to controls via speech command.

  15. Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome

    PubMed Central

    Engineer, Crystal T.; Rahebi, Kimiya C.; Borland, Michael S.; Buell, Elizabeth P.; Centanni, Tracy M.; Fink, Melyssa K.; Im, Kwok W.; Wilson, Linda G.; Kilgard, Michael P.

    2015-01-01

    Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded slower, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial. PMID:26321676

  16. Noise reduction improves memory for target language speech in competing native but not foreign language speech.

    PubMed

    Ng, Elaine Hoi Ning; Rudner, Mary; Lunner, Thomas; Rönnberg, Jerker

    2015-01-01

    A hearing aid noise reduction (NR) algorithm reduces the adverse effect of competing speech on memory for target speech for individuals with hearing impairment with high working memory capacity. In the present study, we investigated whether the positive effect of NR could be extended to individuals with low working memory capacity, as well as how NR influences recall performance for target native speech when the masker language is non-native. A sentence-final word identification and recall (SWIR) test was administered to 26 experienced hearing aid users. In this test, target spoken native language (Swedish) sentence lists were presented in competing native (Swedish) or foreign (Cantonese) speech with or without a binary masking NR algorithm. After each sentence list, free recall of sentence-final words was prompted. Working memory capacity was measured using a reading span (RS) test. Recall performance was associated with RS. However, the benefit obtained from NR was not associated with RS. Recall performance was more disrupted by native than foreign speech babble, and NR improved recall performance in native but not foreign competing speech. Noise reduction improved memory for speech heard in competing speech for hearing aid users. Memory for native speech was more disrupted by native babble than foreign babble, but the disruptive effect of native speech babble was reduced to that of foreign babble when there was NR.

  17. An evaluation of the effectiveness of PROMPT therapy in improving speech production accuracy in six children with cerebral palsy.

    PubMed

    Ward, Roslyn; Leitão, Suze; Strauss, Geoff

    2014-08-01

    This study evaluates perceptual changes in speech production accuracy in six children (3-11 years) with moderate-to-severe speech impairment associated with cerebral palsy before, during, and after participation in a motor-speech intervention program (Prompts for Restructuring Oral Muscular Phonetic Targets). An A1BCA2 single subject research design was implemented. Subsequent to the baseline phase (phase A1), phase B targeted each participant's first intervention priority on the PROMPT motor-speech hierarchy. Phase C then targeted one level higher. Weekly speech probes were administered, containing trained and untrained words at the two levels of intervention, plus an additional level that served as a control goal. The speech probes were analysed for motor-speech-movement-parameters and perceptual accuracy. Analysis of the speech probe data showed all participants recorded a statistically significant change. Between phases A1-B and B-C 6/6 and 4/6 participants, respectively, recorded a statistically significant increase in performance level on the motor speech movement patterns targeted during the training of that intervention. The preliminary data presented in this study make a contribution to providing evidence that supports the use of a treatment approach aligned with dynamic systems theory to improve the motor-speech movement patterns and speech production accuracy in children with cerebral palsy.

  18. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogden, J.

    Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples, presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.

  19. Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

    PubMed Central

    Yu, Chengzhu; Wójcicki, Kamil K.; Loizou, Philipos C.; Hansen, John H. L.; Johnson, Michael T.

    2014-01-01

    Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
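
    The two mask error types named in this abstract (misses and false alarms) can be illustrated with a minimal sketch. It computes the plain, unweighted hit rate minus false-alarm rate of an estimated binary mask against an ideal one. This is not the loudness-weighted measure proposed in the study, whose weighting is not given in the abstract; the function name and the nested-list mask layout (rows as frequency channels, columns as time frames) are assumptions made for illustration.

```python
def hit_false_alarm(ideal_mask, estimated_mask):
    """Return (hit_rate, false_alarm_rate, hit_minus_fa) for two binary masks.

    Both masks are equally shaped nested lists of 0/1 values; a 1 in the
    ideal mask marks a speech-present T-F unit (hypothetical layout).
    """
    hits = misses = false_alarms = correct_rejections = 0
    for ideal_row, est_row in zip(ideal_mask, estimated_mask):
        for ideal, est in zip(ideal_row, est_row):
            if ideal and est:
                hits += 1                 # speech-present unit kept
            elif ideal:
                misses += 1               # speech-present unit discarded
            elif est:
                false_alarms += 1         # speech-absent unit kept
            else:
                correct_rejections += 1   # speech-absent unit discarded

    present = hits + misses
    absent = false_alarms + correct_rejections
    hit_rate = hits / present if present else 0.0
    fa_rate = false_alarms / absent if absent else 0.0
    return hit_rate, fa_rate, hit_rate - fa_rate


ideal = [[1, 1, 0, 0],
         [1, 0, 0, 0]]
estimated = [[1, 0, 1, 0],
             [1, 0, 0, 0]]
hit_rate, fa_rate, score = hit_false_alarm(ideal, estimated)
print(hit_rate, fa_rate, score)
```

    In this toy case the estimated mask has one miss among three speech-present units and one false alarm among five speech-absent units. A weighted variant would replace the unit counts with per-unit weights.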

  20. Audio-Visual Speech Perception Is Special

    ERIC Educational Resources Information Center

    Tuomainen, J.; Andersen, T.S.; Tiippana, K.; Sams, M.

    2005-01-01

    In face-to-face conversation speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and…

  1. Phonetic Recalibration Only Occurs in Speech Mode

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  2. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  3. Intensive Speech and Language Therapy for Older Children with Cerebral Palsy: A Systems Approach

    ERIC Educational Resources Information Center

    Pennington, Lindsay; Miller, Nick; Robson, Sheila; Steen, Nick

    2010-01-01

    Aim: To investigate whether speech therapy using a speech systems approach to controlling breath support, phonation, and speech rate can increase the speech intelligibility of children with dysarthria and cerebral palsy (CP). Method: Sixteen children with dysarthria and CP participated in a modified time series design. Group characteristics were…

  4. Speech Sound Disorders in a Community Study of Preschool Children

    ERIC Educational Resources Information Center

    McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane

    2013-01-01

    Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…

  5. Status Report on Speech Research, No. 27, July-September 1971.

    ERIC Educational Resources Information Center

    Haskins Labs., New Haven, CT.

    This report contains fourteen papers on a wide range of current topics and experiments in speech research, ranging from the relationship between speech and reading to questions of memory and perception of speech sounds. The following papers are included: "How Is Language Conveyed by Speech?"; "Reading, the Linguistic Process, and Linguistic…

  6. Monkey Lipsmacking Develops Like the Human Speech Rhythm

    ERIC Educational Resources Information Center

    Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.

    2012-01-01

    Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…

  7. Increasing Parental Involvement in Speech-Sound Remediation

    ERIC Educational Resources Information Center

    Roberts, Micah Renee Ferguson

    2014-01-01

    Speech therapy homework is a key component of a successful speech therapy program, increasing carryover of learned speech sounds. Poor return rate of homework assigned, with a lack of parental involvement, is a problem. The purpose of this project study was to examine what may increase parental participation in speech therapy homework. Guided by…

  8. Analysis of False Starts in Spontaneous Speech.

    ERIC Educational Resources Information Center

    O'Shaughnessy, Douglas

    A primary difference between spontaneous speech and read speech concerns the use of false starts, where a speaker interrupts the flow of speech to restart his or her utterance. A study examined the acoustic aspects of such restarts in a widely-used speech database, examining approximately 1000 utterances, about 10% of which contained a restart.…

  9. Foundational Tuning: How Infants' Attention to Speech Predicts Language Development

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Curtin, Suzanne

    2014-01-01

    Orienting biases for speech may provide a foundation for language development. Although human infants show a bias for listening to speech from birth, the relation of a speech bias to later language development has not been established. Here, we examine whether infants' attention to speech directly predicts expressive vocabulary. Infants…

  10. Facial Speech Gestures: The Relation between Visual Speech Processing, Phonological Awareness, and Developmental Dyslexia in 10-Year-Olds

    ERIC Educational Resources Information Center

    Schaadt, Gesa; Männel, Claudia; van der Meer, Elke; Pannekamp, Ann; Friederici, Angela D.

    2016-01-01

    Successful communication in everyday life crucially involves the processing of auditory and visual components of speech. Viewing our interlocutor and processing visual components of speech facilitates speech processing by triggering auditory processing. Auditory phoneme processing, analyzed by event-related brain potentials (ERP), has been shown…

  11. Speech in the Junior High School. Michigan Speech Association Curriculum Guide Series, No. 4.

    ERIC Educational Resources Information Center

    Herman, Deldee; Ratliffe, Sharon

    Designed to provide the student with experience in oral communication, this curriculum guide presents a one-semester speech course for junior high school students with "normal" rather than defective speech. The eight units cover speech in social interaction; group discussion and business meetings; demonstrations and reports; creative dramatics;…

  12. Philosophical Perspectives on Values and Ethics in Speech Communication.

    ERIC Educational Resources Information Center

    Becker, Carl B.

    There are three very different concerns of communication ethics: (1) applied speech ethics, (2) ethical rules or standards, and (3) metaethical issues. In the area of applied speech ethics, communications theorists attempt to determine whether a speech act is moral or immoral by focusing on the content and effects of specific speech acts. Specific…

  13. 78 FR 63152 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-23

    ...] Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities... for telecommunications relay services (TRS) by eliminating standards for Internet-based relay services... comments, identified by CG Docket No. 03-123, by any of the following methods: Electronic Filers: Comments...

  14. Tracking Change in Children with Severe and Persisting Speech Difficulties

    ERIC Educational Resources Information Center

    Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

    2013-01-01

    Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…

  15. A Note On Deletion Rules in Fast Speech.

    ERIC Educational Resources Information Center

    Hewlett, Nigel

    In fast speech, certain segments pronounced in careful speech may be deleted. Rules of a generative phonology have been used to account for fast speech forms. An alternative approach is suggested which views fast speech deletions as merely limiting cases of segment reduction, under conditions of increased tempo and/or casualness. To complement…

  16. The Genres of "Shouted Speech" in Cheke Holo.

    ERIC Educational Resources Information Center

    Boswell, Freddy

    Speech genres in Cheke Holo (CH) have not been studied extensively. Speech genres related to shouted speech in CH deserve more study because shouted speech is commonly used. Culturally speaking, shouted speech is a natural expression of the importance and centrality of the CH community and surrounding authority structures, and has a very strong hortatory…

  17. Tongue-Palate Contact Pressure, Oral Air Pressure, and Acoustics of Clear Speech

    ERIC Educational Resources Information Center

    Searl, Jeff; Evitts, Paul M.

    2013-01-01

    Purpose: The authors compared articulatory contact pressure (ACP), oral air pressure (Po), and speech acoustics for conversational versus clear speech. They also assessed the relationship of these measures to listener perception. Method: Twelve adults with normal speech produced monosyllables in a phrase using conversational and clear speech.…

  18. Freedom of Speech and Press: Exceptions to the First Amendment

    DTIC Science & Technology

    2007-04-26

    ...some types of speech. For example, the Court has decided that the First Amendment provides no protection to obscenity, child pornography, or speech...

  19. Compressed Speech Technology: Implications for Learning and Instruction.

    ERIC Educational Resources Information Center

    Sullivan, LeRoy L.

    This paper first traces the historical development of speech compression technology, which has made it possible to alter the spoken rate of a pre-recorded message without excessive distortion. Terms used to describe techniques employed as the technology evolved are discussed, including rapid speech, rate altered speech, cut-and-spliced speech, and…

  20. Contemporary Reflections on Speech-Based Language Learning

    ERIC Educational Resources Information Center

    Gustafson, Marianne

    2009-01-01

    In "The Relation of Language to Mental Development and of Speech to Language Teaching," S.G. Davidson displayed several timeless insights into the role of speech in developing language and reasons for using speech as the basis for instruction for children who are deaf and hard of hearing. His understanding that speech includes more than merely…

  1. The Effects of Simulated Stuttering and Prolonged Speech on the Neural Activation Patterns of Stuttering and Nonstuttering Adults

    ERIC Educational Resources Information Center

    De Nil, Luc F.; Beal, Deryk S.; Lafaille, Sophie J.; Kroll, Robert M.; Crawley, Adrian P.; Gracco, Vincent L.

    2008-01-01

    Functional magnetic resonance imaging was used to investigate the neural correlates of passive listening, habitual speech and two modified speech patterns (simulated stuttering and prolonged speech) in stuttering and nonstuttering adults. Within-group comparisons revealed increased right hemisphere biased activation of speech-related regions…

  2. Visual and Auditory Input in Second-Language Speech Processing

    ERIC Educational Resources Information Center

    Hardison, Debra M.

    2010-01-01

    The majority of studies in second-language (L2) speech processing have involved unimodal (i.e., auditory) input; however, in many instances, speech communication involves both visual and auditory sources of information. Some researchers have argued that multimodal speech is the primary mode of speech perception (e.g., Rosenblum 2005). Research on…

  3. Speech and Language Skills of Parents of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry

    2007-01-01

    Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…

  4. Key considerations in designing a speech brain-computer interface.

    PubMed

    Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Chabardès, Stéphan; Yvert, Blaise

    2016-11-01

    Restoring communication in case of aphasia is a key challenge for neurotechnologies. To this end, brain-computer strategies can be envisioned to allow artificial speech synthesis from the continuous decoding of neural signals underlying speech imagination. Such speech brain-computer interfaces do not exist yet and their design should consider three key choices that need to be made: the choice of appropriate brain regions to record neural activity from, the choice of an appropriate recording technique, and the choice of a neural decoding scheme in association with an appropriate speech synthesis method. These key considerations are discussed here in light of (1) the current understanding of the functional neuroanatomy of cortical areas underlying overt and covert speech production, (2) the available literature making use of a variety of brain recording techniques to better characterize and address the challenge of decoding cortical speech signals, and (3) the different speech synthesis approaches that can be considered depending on the level of speech representation (phonetic, acoustic or articulatory) envisioned to be decoded at the core of a speech BCI paradigm. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  5. Systematic studies of modified vocalization: effects of speech rate and instatement style during metronome stimulation.

    PubMed

    Davidow, Jason H; Bothe, Anne K; Richardson, Jessica D; Andreatta, Richard D

    2010-12-01

    This study introduces a series of systematic investigations intended to clarify the parameters of the fluency-inducing conditions (FICs) in stuttering. Participants included 11 adults, aged 20-63 years, with typical speech-production skills. A repeated measures design was used to examine the relationships between several speech production variables (vowel duration, voice onset time, fundamental frequency, intraoral pressure, pressure rise time, transglottal airflow, and phonated intervals) and speech rate and instatement style during metronome-entrained rhythmic speech. Measures of duration (vowel duration, voice onset time, and pressure rise time) differed across different metronome conditions. When speech rates were matched between the control condition and metronome condition, voice onset time was the only variable that changed. Results confirm that speech rate and instatement style can influence speech production variables during the production of fluency-inducing conditions. Future studies of normally fluent speech and of stuttered speech must control both features and should further explore the importance of voice onset time, which may be influenced by rate during metronome stimulation in a way that the other variables are not.

  6. Automated Speech Rate Measurement in Dysarthria.

    PubMed

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-06-01

    In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. The new algorithm was trained and tested using Dutch speech samples of 36 speakers with no history of speech impairment and 40 speakers with mild to moderate dysarthria. We tested the algorithm under various conditions: according to speech task type (sentence reading, passage reading, and storytelling) and algorithm optimization method (speaker group optimization and individual speaker optimization). Correlations between automated and human SR determination were calculated for each condition. High correlations between automated and human SR determination were found in the various testing conditions. The new algorithm measures SR in a sufficiently reliable manner. It is currently being integrated in a clinical software tool for assessing and managing prosody in dysarthric speech. Further research is needed to fine-tune the algorithm to severely dysarthric speech, to make the algorithm less sensitive to background noise, and to evaluate how the algorithm deals with syllabic consonants.

  7. Optimal speech level for speech transmission in a noisy environment for young adults and aged persons

    NASA Astrophysics Data System (ADS)

    Sato, Hayato; Ota, Ryo; Morimoto, Masayuki; Sato, Hiroshi

    2005-04-01

    Assessing the sound environment of classrooms for the aged is an important issue, because classrooms can be used by the aged for lifelong learning, especially in an aging society. Hence, hearing loss due to aging is a significant factor for classrooms. In this study, the optimal speech level in noisy fields for both young adults and aged persons was investigated. Listening difficulty ratings and word intelligibility scores for familiar words were used to evaluate speech transmission performance. The results of the tests demonstrated that the optimal speech level for moderate background noise (i.e., less than around 60 dBA) was fairly constant. Meanwhile, the optimal speech level depended on the speech-to-noise ratio when the background noise level exceeded around 60 dBA. The minimum speech level required to minimize difficulty ratings was higher for the aged than for the young. However, the minimum difficulty ratings for both the young and the aged occurred at speech levels in the range of 70 to 80 dBA.

  8. Datalink in air traffic management: Human factors issues in communications.

    PubMed

    Stedmon, Alex W; Sharples, Sarah; Littlewood, Robert; Cox, Gemma; Patel, Harshada; Wilson, John R

    2007-07-01

    This paper examines issues underpinning the potential move in aviation away from real speech radiotelephony (R/T) communications towards datalink communications involving text and synthetic speech communications. Using a novel air traffic control (ATC) task, two experiments are reported. Experiment 1 compared the use of speech and text while Experiment 2 compared the use of real and synthetic speech communications. Results indicated that generally there were no significant differences between speech and text communications and that either type could be used without any main effects on performance. However, a number of specific differences were observed across the different phases of the scenarios indicating that workload levels may be more varied when speech communications are used. Experiment 2 illustrated that participants placed a greater level of trust in real speech than synthetic speech, and trusted true communications more than false communications (regardless of whether they were real or synthetic voices). The findings are considered in terms of datalink initiatives for future air traffic management, the importance placed on real speech R/T communications, and the need to develop more natural synthetic speech in this application area.

  9. Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

    PubMed Central

    Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.

    2014-01-01

    Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594

  10. Fluency variation in adolescents.

    PubMed

    Furquim de Andrade, Claudia Regina; de Oliveira Martins, Vanessa

    2007-10-01

    The speech fluency profiles of fluent adolescent speakers of Brazilian Portuguese were examined with respect to gender and neurolinguistic variations. Speech samples of 130 male and female adolescents, aged between 12;0 and 17;11 years, were gathered. They were analysed according to type of speech disruption, speech rate, and frequency of speech disruptions. Statistical analysis did not find significant differences between genders for the variables studied. However, regarding the phases of adolescence (early: 12;0-14;11 years; late: 15;0-17;11 years), statistical differences were observed for all of the variables. As for neurolinguistic maturation, a decrease in the number of speech disruptions and an increase in speech rate occurred during the final phase of adolescence, indicating that the maturation of the motor and linguistic processes exerted an influence over the fluency profile of speech.

  11. Untrained listeners' ratings of speech disorders in a group with cleft palate: a comparison with speech and language pathologists' ratings.

    PubMed

    Brunnegård, Karin; Lohmander, Anette; van Doorn, Jan

    2009-01-01

    Hypernasal resonance, audible nasal air emission and/or nasal turbulence, and articulation errors are typical speech disorders associated with the speech of children with cleft lip and palate. Several studies indicate that hypernasal resonance tends to be perceived negatively by listeners. Most perceptual studies of speech disorders related to cleft palate are carried out with speech and language pathologists as listeners, whereas only a few studies have been conducted to explore how judgements by untrained listeners compare with expert assessments. These types of studies can be used to determine whether children for whom speech and language pathologists recommend intervention have a significant speech deviance that is also detected by untrained listeners. The aim of this study was to compare ratings by untrained listeners with ratings by speech and language pathologists for cleft palate speech. An assessment form for untrained listeners was developed using statements and a five-point scale. The assessment form was tailored to facilitate comparison with expert judgements. Twenty-eight untrained listeners assessed the speech of 26 speakers with cleft palate and ten speakers without cleft in a comparison group. This assessment was compared with the joint assessment of two expert speech and language pathologists. Listener groups generally agreed on which speakers were nasal. The untrained listeners detected hyper- and hyponasality when it was present in speech and considered moderate to severe hypernasality to be serious enough to call for intervention. The expert listeners assessed audible nasal air emission and/or nasal turbulence to be present in twice as many speakers as the untrained listeners, who were much less sensitive to audible nasal air emission and/or nasal turbulence. The results of untrained listeners' ratings in this study in the main confirm the ratings of speech and language pathologists and show that cleft palate speech disorders may have an impact in the everyday life of the speaker.

  12. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers

    PubMed Central

    Liu, Yin; Yin, Heng; Zhang, Junpeng; Zhang, Jing; Zhang, Jiang

    2017-01-01

    Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: initial and final. In cleft palate speech, the resonance disorders occur at the finals and the voiced initials, while the articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed. It is an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which has the largest number of cleft palate patients in China. The cleft palate speech data includes 824 speech segments, and the control samples contain 228 speech segments. The syllables are first extracted from the speech utterances. The proposed syllable extraction method avoids the training stage, and achieves a good performance for both voiced and unvoiced speech. Then, the syllables are classified as having “quasi-unvoiced” or “quasi-voiced” initials. Respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed. The rough locations of syllable and initial/final boundaries are refined in the second segmentation step, in order to improve the robustness of segmentation accuracy. The experiments show that the initial/final segmentation accuracies for syllables with quasi-unvoiced initials are higher than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials, and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 for all the syllables is 91.69%. For the control samples, P30 for all the syllables is 91.24%. PMID:28926572

  13. Auditory and cognitive factors underlying individual differences in aided speech-understanding among older adults

    PubMed Central

    Humes, Larry E.; Kidd, Gary R.; Lentz, Jennifer J.

    2013-01-01

    This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures), as well as the Speech, Spatial, and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance. 
    PMID:24098273

  14. The Auditory-Brainstem Response to Continuous, Non-repetitive Speech Is Modulated by the Speech Envelope and Reflects Speech Processing

    PubMed Central

    Reichenbach, Chagit S.; Braiman, Chananel; Schiff, Nicholas D.; Hudspeth, A. J.; Reichenbach, Tobias

    2016-01-01

    The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnosis of brainstem function. PMID:27303286

  15. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers.

    PubMed

    He, Ling; Liu, Yin; Yin, Heng; Zhang, Junpeng; Zhang, Jing; Zhang, Jiang

    2017-01-01

    Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: initial and final. In cleft palate speech, the resonance disorders occur at the finals and the voiced initials, while the articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed. It is an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which has the largest number of cleft palate patients in China. The cleft palate speech data includes 824 speech segments, and the control samples contain 228 speech segments. The syllables are first extracted from the speech utterances. The proposed syllable extraction method avoids the training stage, and achieves a good performance for both voiced and unvoiced speech. Then, the syllables are classified as having "quasi-unvoiced" or "quasi-voiced" initials. Respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed. The rough locations of syllable and initial/final boundaries are refined in the second segmentation step, in order to improve the robustness of segmentation accuracy. The experiments show that the initial/final segmentation accuracies for syllables with quasi-unvoiced initials are higher than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials, and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 for all the syllables is 91.69%. For the control samples, P30 for all the syllables is 91.24%.

  16. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus)

    PubMed Central

    Flaherty, Mary; Dent, Micheal L.; Sawusch, James R.

    2017-01-01

    The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with “d” or “t” and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal. PMID:28562597

  17. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus).

    PubMed

    Flaherty, Mary; Dent, Micheal L; Sawusch, James R

    2017-01-01

    The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.

  18. Functional assessment and treatment of perseverative speech about restricted topics in an adolescent with Asperger syndrome.

    PubMed

    Fisher, Wayne W; Rodriguez, Nicole M; Owen, Todd M

    2013-01-01

    A functional analysis showed that a 14-year-old boy with Asperger syndrome displayed perseverative speech (or "restricted interests") reinforced by attention. To promote appropriate speech in a turn-taking format, we implemented differential reinforcement (DR) of nonperseverative speech and DR of on-topic speech within a multiple schedule with stimuli that signaled the contingencies in effect and who was to select the topic. Both treatments reduced perseverative speech, but only DR of on-topic speech increased appropriate turn taking during conversation. Treatment effects were maintained when implemented by family members and novel therapists. © Society for the Experimental Analysis of Behavior.

  19. Cortical Responses to Chinese Phonemes in Preschoolers Predict Their Literacy Skills at School Age.

    PubMed

    Hong, Tian; Shuai, Lan; Frost, Stephen J; Landi, Nicole; Pugh, Kenneth R; Shu, Hua

    2018-01-01

    We investigated whether preschoolers with poor phonological awareness (PA) skills had an impaired cortical basis for detecting speech features, and whether speech perception influences future literacy outcomes in preschoolers. We recorded ERP responses to speech in 52 Chinese preschoolers. The results showed that the poor PA group processed speech changes differently from the control group in mismatch negativity (MMN) and late discriminative negativity (LDN). Furthermore, speech perception in kindergarten could predict literacy outcomes after literacy acquisition. These results suggest that impairment in detecting speech features occurs before formal reading instruction, and that speech perception plays an important role in reading development.

  20. Relationship between listeners' nonnative speech recognition and categorization abilities

    PubMed Central

    Atagi, Eriko; Bent, Tessa

    2015-01-01

    Enhancement of the perceptual encoding of talker characteristics (indexical information) in speech can facilitate listeners' recognition of linguistic content. The present study explored this indexical-linguistic relationship in nonnative speech processing by examining listeners' performance on two tasks: nonnative accent categorization and nonnative speech-in-noise recognition. Results indicated substantial variability across listeners in their performance on both the accent categorization and nonnative speech recognition tasks. Moreover, listeners' accent categorization performance correlated with their nonnative speech-in-noise recognition performance. These results suggest that having more robust indexical representations for nonnative accents may allow listeners to more accurately recognize the linguistic content of nonnative speech. PMID:25618098

  1. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    NASA Technical Reports Server (NTRS)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system able to detect incorrectly pronounced speech, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In tests on continuous speech, the system provided above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.
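
    As an illustration only (the report's actual scoring normalization and decoding are not described in this record), the two percentages quoted above can be read as rates measured at a fixed score threshold. The sketch below assumes hypothetical per-word recognizer scores and ground-truth pronunciation labels; the function name, the scores, and the threshold are all invented for the example.

```python
from typing import Sequence, Tuple


def acceptance_rejection_rates(
    scores: Sequence[float],
    is_correct: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (correct-acceptance rate, correct-rejection rate).

    A word is accepted when its score is at or above the threshold.
    scores and is_correct are hypothetical inputs, one entry per spoken word.
    """
    if len(scores) != len(is_correct):
        raise ValueError("scores and is_correct must have the same length")

    n_correct = sum(1 for c in is_correct if c)
    n_incorrect = len(is_correct) - n_correct
    if n_correct == 0 or n_incorrect == 0:
        raise ValueError("need at least one correct and one incorrect word")

    # Correctly pronounced words that the threshold accepts.
    accepted_correct = sum(
        1 for s, c in zip(scores, is_correct) if c and s >= threshold
    )
    # Incorrectly pronounced words that the threshold rejects.
    rejected_incorrect = sum(
        1 for s, c in zip(scores, is_correct) if not c and s < threshold
    )
    return accepted_correct / n_correct, rejected_incorrect / n_incorrect


# Made-up scores: the first four words are pronounced correctly.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.3, 0.6, 0.1]
example_labels = [True, True, True, True, False, False, False, False]
rates = acceptance_rejection_rates(example_scores, example_labels, threshold=0.5)
```

    Raising the threshold trades acceptance of correct words for rejection of incorrect ones, which is why a single operating point has to be chosen for a tutoring application.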

  2. Speech and Speech-Related Quality of Life After Late Palate Repair: A Patient's Perspective.

    PubMed

    Schönmeyr, Björn; Wendby, Lisa; Sharma, Mitali; Jacobson, Lia; Restrepo, Carolina; Campbell, Alex

    2015-07-01

    Many patients with cleft palate deformities worldwide receive treatment at a later age than is recommended for normal speech to develop. The outcomes after late palate repairs in terms of speech and quality of life (QOL) still remain largely unstudied. In the current study, questionnaires were used to assess the patients' perception of speech and QOL before and after primary palate repair. All of the patients were operated at a cleft center in northeast India and had a cleft palate with a normal lip or with a cleft lip that had been previously repaired. A total of 134 patients (7-35 years) were interviewed preoperatively and 46 patients (7-32 years) were assessed in the postoperative survey. The survey showed that scores based on the speech handicap index, concerning speech and speech-related QOL, did not improve postoperatively. In fact, the questionnaires indicated that the speech became more unpredictable (P < 0.01) and that nasal regurgitation became worse (P < 0.01) for some patients after surgery. A total of 78% of the patients were still satisfied with the surgery and all of the patients reported that their self-confidence had improved after the operation. Thus, the majority of interviewed patients who underwent late primary palate repair were satisfied with the surgery. At the same time, speech and speech-related QOL did not improve according to the speech handicap index-based survey. Speech predictability may even become worse and nasal regurgitation may increase after late palate repair, according to these results.

  3. Speech deficits in serious mental illness: a cognitive resource issue?

    PubMed

    Cohen, Alex S; McGovern, Jessica E; Dinzeo, Thomas J; Covington, Michael A

    2014-12-01

    Speech deficits, notably those involved in psychomotor retardation, blunted affect, alogia and poverty of content of speech, are pronounced in a wide range of serious mental illnesses (e.g., schizophrenia, unipolar depression, bipolar disorders). The present project evaluated the degree to which these deficits manifest as a function of cognitive resource limitations. We examined natural speech from 52 patients meeting criteria for serious mental illnesses (i.e., severe functional deficits with a concomitant diagnosis of schizophrenia, unipolar and/or bipolar affective disorders) and 30 non-psychiatric controls using a range of objective, computer-based measures tapping speech production ("alogia"), variability ("blunted vocal affect") and content ("poverty of content of speech"). Subjects produced natural speech during a baseline condition and while engaging in an experimentally-manipulated cognitively-effortful task. For correlational analysis, cognitive ability was measured using a standardized battery. Generally speaking, speech deficits did not differ as a function of SMI diagnosis. However, every speech production and content measure was significantly abnormal in SMI versus control groups. Speech variability measures generally did not differ between groups. For both patients and controls as a group, speech during the cognitively-effortful task was sparser and less rich in content. Relative to controls, patients were abnormal under cognitive load with respect only to average pause length. Correlations between the speech variables and cognitive ability were only significant for this same variable: average pause length. Results suggest that certain speech deficits, notably involving pause length, may manifest as a function of cognitive resource limitations. Implications for treatment, research and assessment are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Gesture helps learners learn, but not merely by guiding their visual attention.

    PubMed

    Wakefield, Elizabeth; Novack, Miriam A; Congdon, Eliza L; Franconeri, Steven; Goldin-Meadow, Susan

    2018-04-16

    Teaching a new concept through gestures (hand movements that accompany speech) facilitates learning above-and-beyond instruction through speech alone (e.g., Singer & Goldin-Meadow, ). However, the mechanisms underlying this phenomenon are still under investigation. Here, we use eye tracking to explore one often proposed mechanism: gesture's ability to direct visual attention. Behaviorally, we replicate previous findings: Children perform significantly better on a posttest after learning through Speech+Gesture instruction than through Speech Alone instruction. Using eye tracking measures, we show that children who watch a math lesson with gesture do allocate their visual attention differently from children who watch a math lesson without gesture: they look more to the problem being explained, less to the instructor, and are more likely to synchronize their visual attention with information presented in the instructor's speech (i.e., follow along with speech) than children who watch the no-gesture lesson. The striking finding is that, even though these looking patterns positively predict learning outcomes, the patterns do not mediate the effects of training condition (Speech Alone vs. Speech+Gesture) on posttest success. We find instead a complex relation between gesture and visual attention in which gesture moderates the impact of visual looking patterns on learning: following along with speech predicts learning for children in the Speech+Gesture condition, but not for children in the Speech Alone condition. Gesture's beneficial effects on learning thus come not merely from its ability to guide visual attention, but also from its ability to synchronize with speech and affect what learners glean from that speech. © 2018 John Wiley & Sons Ltd.

  5. Temporal Sensitivity Measured Shortly After Cochlear Implantation Predicts 6-Month Speech Recognition Outcome.

    PubMed

    Erb, Julia; Ludwig, Alexandra Annemarie; Kunke, Dunja; Fuchs, Michael; Obleser, Jonas

    2018-04-24

    Psychoacoustic tests assessed shortly after cochlear implantation are useful predictors of the rehabilitative speech outcome. While largely independent, both spectral and temporal resolution tests are important to provide an accurate prediction of speech recognition. However, rapid tests of temporal sensitivity are currently lacking. Here, we propose a simple amplitude modulation rate discrimination (AMRD) paradigm that is validated by predicting future speech recognition in adult cochlear implant (CI) patients. In 34 newly implanted patients, we used an adaptive AMRD paradigm, where broadband noise was modulated at the speech-relevant rate of ~4 Hz. In a longitudinal study, speech recognition in quiet was assessed using the closed-set Freiburger number test shortly after cochlear implantation (t0) as well as the open-set Freiburger monosyllabic word test 6 months later (t6). Both AMRD thresholds at t0 (r = -0.51) and speech recognition scores at t0 (r = 0.56) predicted speech recognition scores at t6. However, AMRD and speech recognition at t0 were uncorrelated, suggesting that those measures capture partially distinct perceptual abilities. A multiple regression model predicting 6-month speech recognition outcome with deafness duration and speech recognition at t0 improved from adjusted R² = 0.30 to adjusted R² = 0.44 when AMRD threshold was added as a predictor. These findings identify AMRD thresholds as a reliable, nonredundant predictor above and beyond established speech tests for CI outcome. This AMRD test could potentially be developed into a rapid clinical temporal-resolution test to be integrated into the postoperative test battery to improve the reliability of speech outcome prognosis.
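
    For readers unfamiliar with the statistic behind the regression comparison above, the adjusted coefficient of determination penalizes the ordinary R² for the number of predictors, so adding a predictor raises it only when the gain in fit outweighs the penalty. The standard formula is 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors. The sketch below is a minimal illustration of that formula; the helper name and the input values are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a linear model with an intercept.

    r2: ordinary coefficient of determination
    n:  number of observations
    p:  number of predictors, not counting the intercept
    """
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical values: the same raw fit is penalized more with three
# predictors than with two.
two_predictors = adjusted_r2(0.5, n=34, p=2)
three_predictors = adjusted_r2(0.5, n=34, p=3)
```

    Because the penalty grows with p, an increase in the adjusted value after adding a predictor indicates that the new predictor explains variance beyond what the extra parameter alone would be expected to absorb.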

  6. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces

    PubMed Central

    Bocquelet, Florent; Hueber, Thomas; Girin, Laurent; Savariaux, Christophe; Yvert, Blaise

    2016-01-01

    Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open the way to future speech BCI applications using such an articulatory-based speech synthesizer. PMID:27880768

  7. Speech motor planning and execution deficits in early childhood stuttering.

    PubMed

    Walsh, Bridget; Mettel, Kathleen Marie; Smith, Anne

    2015-01-01

    Five to eight percent of preschool children develop stuttering, a speech disorder with clearly observable, hallmark symptoms: sound repetitions, prolongations, and blocks. While the speech motor processes underlying stuttering have been widely documented in adults, few studies to date have assessed the speech motor dynamics of stuttering near its onset. We assessed fundamental characteristics of speech movements in preschool children who stutter and their fluent peers to determine if atypical speech motor characteristics described for adults are early features of the disorder or arise later in the development of chronic stuttering. Orofacial movement data were recorded from 58 children who stutter and 43 children who do not stutter aged 4;0 to 5;11 (years; months) in a sentence production task. For single speech movements and multiple speech movement sequences, we computed displacement amplitude, velocity, and duration. For the phrase level movement sequence, we computed an index of articulation coordination consistency for repeated productions of the sentence. Boys who stutter, but not girls, produced speech with reduced amplitudes and velocities of articulatory movement. All children produced speech with similar durations. Boys, particularly the boys who stuttered, had more variable patterns of articulatory coordination compared to girls. This study is the first to demonstrate sex-specific differences in speech motor control processes between preschool boys and girls who are stuttering. The sex-specific lag in speech motor development in many boys who stutter likely has significant implications for the dramatically different recovery rates between male and female preschoolers who stutter. Further, our findings document that atypical speech motor development is an early feature of stuttering.

  8. Sound frequency affects speech emotion perception: results from congenital amusia

    PubMed Central

    Lolli, Sydney L.; Lewenstein, Ari D.; Basurto, Julian; Winnik, Sean; Loui, Psyche

    2015-01-01

    Congenital amusics, or “tone-deaf” individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech. PMID:26441718

  9. The effect of simultaneous text on the recall of noise-degraded speech.

    PubMed

    Grossman, Irina; Rajan, Ramesh

    2017-05-01

    Written and spoken language utilize the same processing system, enabling text to modulate speech processing. We investigated how simultaneously presented text affected speech recall in babble noise using a retrospective recall task. Participants were presented with text-speech sentence pairs in multitalker babble noise and then prompted to recall what they heard or what they read. In Experiment 1, sentence pairs were either congruent or incongruent and they were presented in silence or at 1 of 4 noise levels. Audio and Visual control groups were also tested with sentences presented in only 1 modality. Congruent text facilitated accurate recall of degraded speech; incongruent text had no effect. Text and speech were seldom confused for each other. A consideration of the effects of language background found that monolingual English speakers outperformed early multilinguals at recalling degraded speech; however, the effects of text on speech processing were analogous. Experiment 2 considered whether the benefit provided by matching text was maintained when the congruency of the text and speech becomes more ambiguous because of the addition of partially mismatching text-speech sentence pairs that differed only on their final keyword and because of the use of low signal-to-noise ratios. The experiment focused on monolingual English speakers; the results showed that even though participants commonly confused text-for-speech during incongruent text-speech pairings, these confusions could not fully account for the benefit provided by matching text. Thus, we uniquely demonstrate that congruent text benefits the recall of noise-degraded speech. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  10. Alpha and Beta Oscillations Index Semantic Congruency between Speech and Gestures in Clear and Degraded Speech.

    PubMed

    Drijvers, Linda; Özyürek, Asli; Jensen, Ole

    2018-06-19

    Previous work revealed that visual semantic information conveyed by gestures can enhance degraded speech comprehension, but the mechanisms underlying these integration processes under adverse listening conditions remain poorly understood. We used MEG to investigate how oscillatory dynamics support speech-gesture integration when integration load is manipulated by auditory (e.g., speech degradation) and visual semantic (e.g., gesture congruency) factors. Participants were presented with videos of an actress uttering an action verb in clear or degraded speech, accompanied by a matching (mixing gesture + "mixing") or mismatching (drinking gesture + "walking") gesture. In clear speech, alpha/beta power was more suppressed in the left inferior frontal gyrus and motor and visual cortices when integration load increased in response to mismatching versus matching gestures. In degraded speech, beta power was less suppressed over posterior STS and medial temporal lobe for mismatching compared with matching gestures, showing that integration load was lowest when speech was degraded and mismatching gestures could not be integrated and disambiguate the degraded signal. Our results thus provide novel insights on how low-frequency oscillatory modulations in different parts of the cortex support the semantic audiovisual integration of gestures in clear and degraded speech: When speech is clear, the left inferior frontal gyrus and motor and visual cortices engage because higher-level semantic information increases semantic integration load. When speech is degraded, posterior STS/middle temporal gyrus and medial temporal lobe are less engaged because integration load is lowest when visual semantic information does not aid lexical retrieval and speech and gestures cannot be integrated.

  11. Using others' words: conversational use of reported speech by individuals with aphasia and their communication partners.

    PubMed

    Hengst, Julie A; Frame, Simone R; Neuman-Stritzel, Tiffany; Gannaway, Rachel

    2005-02-01

    Reported speech, wherein one quotes or paraphrases the speech of another, has been studied extensively as a set of linguistic and discourse practices. Researchers agree that reported speech is pervasive, found across languages, and used in diverse contexts. However, to date, there have been no studies of the use of reported speech among individuals with aphasia. Grounded in an interactional sociolinguistic perspective, the study presented here documents and analyzes the use of reported speech by 7 adults with mild to moderately severe aphasia and their routine communication partners. Each of the 7 pairs was videotaped in 4 everyday activities at home or around the community, yielding over 27 hr of conversational interaction for analysis. A coding scheme was developed that identified 5 types of explicitly marked reported speech: direct, indirect, projected, indexed, and undecided. Analysis of the data documented reported speech as a common discourse practice used successfully by the individuals with aphasia and their communication partners. All participants produced reported speech at least once, and across all observations the target pairs produced 400 reported speech episodes (RSEs), 149 by individuals with aphasia and 251 by their communication partners. For all participants, direct and indirect forms were the most prevalent (70% of RSEs). Situated discourse analysis of specific episodes of reported speech used by 3 of the pairs provides detailed portraits of the diverse interactional, referential, social, and discourse functions of reported speech and explores ways that the pairs used reported speech to successfully frame talk despite their ongoing management of aphasia.

  12. Patterns of poststroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate.

    PubMed

    Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

    2015-06-01

    Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions on whether AOS emerges from a unique pattern of brain damage or as a subelement of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The AOS Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with both AOS and aphasia. Localized brain damage was identified using structural magnetic resonance imaging, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS or aphasia, and brain damage. The pattern of brain damage associated with AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS or aphasia were associated with damage to the temporal lobe and the inferior precentral frontal regions. AOS likely occurs in conjunction with aphasia because of the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. © 2015 American Heart Association, Inc.

  13. Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

    PubMed

    Davidow, Jason H; Grossman, Heather L; Edge, Robin L

    2018-05-01

    Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.

  14. [Nature of speech disorders in Parkinson disease].

    PubMed

    Pawlukowska, W; Honczarenko, K; Gołąb-Janowska, M

    2013-01-01

    The aim of the study was to discuss the physiology and pathology of speech and to review the literature on speech disorders in Parkinson disease. The most effective methods for diagnosing speech disorders in Parkinson disease were also emphasized. Articulatory, respiratory, acoustic and pragmatic factors contributing to the exacerbation of the speech disorders were then discussed. Furthermore, the study dealt with the most important types of speech treatment techniques available (pharmacological and behavioral), and the significance of Lee Silverman Voice Treatment was highlighted.

  15. Surgical speech disorders.

    PubMed

    Shen, Tianjie; Sie, Kathleen C Y

    2014-11-01

    Most speech disorders of childhood are treated with speech therapy. However, two conditions, ankyloglossia and velopharyngeal dysfunction, may be amenable to surgical intervention. It is important for surgeons to work with experienced speech language pathologists to diagnose the speech disorder. Children with articulation disorders related to ankyloglossia may benefit from frenuloplasty. Children with velopharyngeal dysfunction should have standardized clinical evaluation and instrumental assessment of velopharyngeal function. Surgeons should develop a treatment protocol to optimize speech outcomes while minimizing morbidity. Copyright © 2014 Elsevier Inc. All rights reserved.

  16. Audiovisual speech perception development at varying levels of perceptual processing

    PubMed Central

    Lalonde, Kaylah; Holt, Rachael Frush

    2016-01-01

    This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children. PMID:27106318

  17. [Restoration of speech function in oncological patients with maxillary defects].

    PubMed

    Matiakin, E G; Chuchkov, V M; Akhundov, A A; Azizian, R I; Romanov, I S; Chuchkov, M V; Agapov, V V

    2009-01-01

    Speech quality was evaluated in 188 patients with acquired maxillary defects. Prosthetic treatment of 29 patients was preceded by pharmacopsychotherapy. Sixty-three patients had lessons with a logopedist and 66 practiced self-tuition based on the specially developed test. Thirty patients were examined for the quality of speech without preliminary preparation. Speech quality was assessed by auditory and spectral analysis. The main forms of impaired speech quality in the patients with maxillary defects were marked rhinophonia and impaired articulation. The proposed analytical tests were based on a combination of "difficult" vowels and consonants. The use of a removable prosthesis with an obturator failed to correct the affected speech function but created prerequisites for the formation of the correct speech stereotype. Results of the study suggest a relationship between the quality of speech in subjects with maxillary defects and their intellectual faculties as well as the desire to overcome this drawback. The proposed tests are designed to activate the neuromuscular apparatus responsible for the generation of speech. Lessons with a speech therapist give a powerful emotional incentive to the patients and promote their efforts toward restoration of speaking ability. Pharmacopsychotherapy and self-control are other efficacious tools for the improvement of speech quality in patients with maxillary defects.

  18. Audiovisual speech perception development at varying levels of perceptual processing.

    PubMed

    Lalonde, Kaylah; Holt, Rachael Frush

    2016-04-01

    This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children.

  19. Speech Perception and Short Term Memory Deficits in Persistent Developmental Speech Disorder

    PubMed Central

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2008-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech perception and short-term memory. Nine adults with a persistent familial developmental speech disorder without language impairment were compared with 20 controls on tasks requiring the discrimination of fine acoustic cues for word identification and on measures of verbal and nonverbal short-term memory. Significant group differences were found in the slopes of the discrimination curves for first formant transitions for word identification with stop gaps of 40 and 20 ms with effect sizes of 1.60 and 1.56. Significant group differences also occurred on tests of nonverbal rhythm and tonal memory, and verbal short-term memory with effect sizes of 2.38, 1.56 and 1.73. No group differences occurred in the use of stop gap durations for word identification. Because frequency-based speech perception and short-term verbal and nonverbal memory deficits both persisted into adulthood in the speech-impaired adults, these deficits may be involved in the persistence of speech disorders without language impairment. PMID:15896836

  20. Relations between affective music and speech: evidence from dynamics of affective piano performance and speech production.

    PubMed

    Liu, Xiaoluan; Xu, Yi

    2015-01-01

    This study compares affective piano performance with speech production from the perspective of dynamics: unlike previous research, this study uses finger force and articulatory effort as indexes reflecting the dynamics of affective piano performance and speech production respectively. Moreover, for the first time physical constraints such as piano fingerings and speech articulatory constraints are included due to their potential contribution to different patterns of dynamics. A piano performance experiment and speech production experiment were conducted in four emotions: anger, fear, happiness and sadness. The results show that in both piano performance and speech production, anger and happiness generally have high dynamics while sadness has the lowest dynamics. Fingerings interact with fear in the piano experiment and articulatory constraints interact with anger in the speech experiment, i.e., large physical constraints produce significantly higher dynamics than small physical constraints in piano performance under the condition of fear and in speech production under the condition of anger. Using production experiments, this study firstly supports previous perception studies on relations between affective music and speech. Moreover, this is the first study to show quantitative evidence for the importance of considering motor aspects such as dynamics in comparing music performance and speech production in which motor mechanisms play a crucial role.

  1. Predicting Speech Intelligibility with A Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    PubMed Central

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD). Results: Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Conclusions: Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems. PMID:24824584

  2. Engaged listeners: shared neural processing of powerful political speeches.

    PubMed

    Schmälzle, Ralf; Häcker, Frank E K; Honey, Christopher J; Hasson, Uri

    2015-08-01

    Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners' brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. © The Author (2015). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
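
    The inter-subject correlation analysis named in this abstract can be illustrated with a minimal sketch. It assumes the common leave-one-out variant (each listener's response time course is correlated with the average of all other listeners); the abstract does not state which variant the authors used, and the function names and the numbers below are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def leave_one_out_isc(timecourses):
    """Mean correlation between each listener and the average of the others."""
    scores = []
    for i, tc in enumerate(timecourses):
        others = [t for j, t in enumerate(timecourses) if j != i]
        mean_others = [sum(vals) / len(others) for vals in zip(*others)]
        scores.append(pearson(tc, mean_others))
    return sum(scores) / len(scores)


# Hypothetical response time courses for one brain region, three listeners each.
powerful = [[1, 3, 2, 5, 4], [1, 4, 2, 6, 4], [2, 3, 1, 5, 5]]
weak = [[1, 3, 2, 5, 4], [4, 1, 5, 2, 3], [2, 2, 4, 1, 5]]

# Listeners with more similar time courses yield the higher ISC.
print(leave_one_out_isc(powerful) > leave_one_out_isc(weak))
```

    In this toy data the "powerful" group was constructed to have closely aligned time courses, so the comparison only demonstrates the measure, not the study's result.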

  3. Comparing speech and nonspeech context effects across timescales in coarticulatory contexts.

    PubMed

    Viswanathan, Navin; Kelty-Stephen, Damian G

    2018-02-01

    Context effects are ubiquitous in speech perception and reflect the ability of human listeners to successfully perceive highly variable speech signals. In the study of how listeners compensate for coarticulatory variability, past studies have used similar effects of speech and tone analogues of speech as strong support for speech-neutral, general auditory mechanisms for compensation for coarticulation. In this manuscript, we revisit compensation for coarticulation by replacing standard button-press responses with mouse-tracking responses and examining both standard geometric measures of uncertainty as well as newer information-theoretic measures that separate fast from slow mouse movements. We found that when our analyses were restricted to end-state responses, tones and speech contexts appeared to produce similar effects. However, a more detailed time-course analysis revealed systematic differences between speech and tone contexts such that listeners' responses to speech contexts, but not to tone contexts, changed across the experimental session. Analyses of the time course of effects within trials using mouse tracking indicated that speech contexts elicited fewer x-position flips but more area under the curve (AUC) and maximum deviation (MD), and they did so in the slower portions of mouse-tracking movements. Our results indicate critical differences between the time course of speech and nonspeech context effects and suggest that general auditory explanations, motivated by their apparent similarity, be reexamined.
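
    The three geometric mouse-tracking measures named in this abstract (x-position flips, AUC, and MD) can be sketched as follows. The definitions used here are common mouse-tracking conventions and are an assumption, since the abstract does not define them: deviation is the perpendicular distance from the straight line joining the first and last cursor samples, AUC is the signed trapezoidal area of that deviation along the line, and an x flip is a reversal of horizontal direction. The function name and sample trajectories are hypothetical.

```python
def mouse_measures(traj):
    """Return (max deviation, signed AUC, x flips) for a list of (x, y) samples."""
    (x0, y0), (x1, y1) = traj[0], traj[-1]
    length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5

    # Signed perpendicular distance of each sample from the direct path
    # (positive when the sample lies to the right of the direction of travel).
    dev = [((x - x0) * (y1 - y0) - (y - y0) * (x1 - x0)) / length for x, y in traj]

    # Progress of each sample along the direct path.
    prog = [((x - x0) * (x1 - x0) + (y - y0) * (y1 - y0)) / length for x, y in traj]

    max_dev = max(abs(d) for d in dev)
    auc = sum(
        (prog[i + 1] - prog[i]) * (dev[i + 1] + dev[i]) / 2
        for i in range(len(traj) - 1)
    )

    # Count reversals of horizontal direction, ignoring samples with no x movement.
    steps = [b[0] - a[0] for a, b in zip(traj, traj[1:]) if b[0] != a[0]]
    flips = sum(1 for s, t in zip(steps, steps[1:]) if (s > 0) != (t > 0))
    return max_dev, auc, flips


# Hypothetical trajectories from a start button at (0, 0) to a response at (0, 10).
direct = [(0, 0), (0, 5), (0, 10)]
wavering = [(0, 0), (3, 2), (-1, 5), (2, 8), (0, 10)]

print(mouse_measures(direct))
print(mouse_measures(wavering))
```

    The information-theoretic measures mentioned in the abstract are not covered by this sketch.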

  4. Childhood apraxia of speech: A survey of praxis and typical speech characteristics.

    PubMed

    Malmenholt, Ann; Lohmander, Anette; McAllister, Anita

    2017-07-01

    The purpose of this study was to investigate current knowledge of the diagnosis childhood apraxia of speech (CAS) in Sweden and compare speech characteristics and symptoms to those of earlier survey findings in mainly English-speakers. In a web-based questionnaire 178 Swedish speech-language pathologists (SLPs) anonymously answered questions about their perception of typical speech characteristics for CAS. They graded their own assessment skills and estimated clinical occurrence. The seven top speech characteristics reported as typical for children with CAS were: inconsistent speech production (85%), sequencing difficulties (71%), oro-motor deficits (63%), vowel errors (62%), voicing errors (61%), consonant cluster deletions (54%), and prosodic disturbance (53%). Motor-programming deficits described as lack of automatization of speech movements were perceived by 82%. All listed characteristics were consistent with the American Speech-Language-Hearing Association (ASHA) consensus-based features, Strand's 10-point checklist, and the diagnostic model proposed by Ozanne. The mode for clinical occurrence was 5%. The number of suspected cases of CAS in the clinical caseload was approximately one new patient per year per SLP. The results support and add to findings from studies of CAS in English-speaking children with similar speech characteristics regarded as typical. Possibly, these findings could contribute to cross-linguistic consensus on CAS characteristics.

  5. Speech-recognition interfaces for music information retrieval

    NASA Astrophysics Data System (ADS)

    Goto, Masataka

    2005-09-01

    This paper describes two hands-free music information retrieval (MIR) systems that enable a user to retrieve and play back a musical piece by saying its title or the artist's name. Although various interfaces for MIR have been proposed, speech-recognition interfaces suitable for retrieving musical pieces have not been studied. Our MIR-based jukebox systems employ two different speech-recognition interfaces for MIR, speech completion and speech spotter, which exploit intentionally controlled nonverbal speech information in original ways. The first is a music retrieval system with the speech-completion interface that is suitable for music stores and car-driving situations. When a user only remembers part of the name of a musical piece or an artist and utters only a remembered fragment, the system helps the user recall and enter the name by completing the fragment. The second is a background-music playback system with the speech-spotter interface that can enrich human-human conversation. When a user is talking to another person, the system allows the user to enter voice commands for music playback control by spotting a special voice-command utterance in face-to-face or telephone conversations. Experimental results from use of these systems have demonstrated the effectiveness of the speech-completion and speech-spotter interfaces. (Video clips: http://staff.aist.go.jp/m.goto/MIR/speech-if.html)

  6. Relations between affective music and speech: evidence from dynamics of affective piano performance and speech production

    PubMed Central

    Liu, Xiaoluan; Xu, Yi

    2015-01-01

    This study compares affective piano performance with speech production from the perspective of dynamics: unlike previous research, this study uses finger force and articulatory effort as indexes reflecting the dynamics of affective piano performance and speech production respectively. Moreover, for the first time physical constraints such as piano fingerings and speech articulatory constraints are included due to their potential contribution to different patterns of dynamics. A piano performance experiment and speech production experiment were conducted in four emotions: anger, fear, happiness and sadness. The results show that in both piano performance and speech production, anger and happiness generally have high dynamics while sadness has the lowest dynamics. Fingerings interact with fear in the piano experiment and articulatory constraints interact with anger in the speech experiment, i.e., large physical constraints produce significantly higher dynamics than small physical constraints in piano performance under the condition of fear and in speech production under the condition of anger. Using production experiments, this study firstly supports previous perception studies on relations between affective music and speech. Moreover, this is the first study to show quantitative evidence for the importance of considering motor aspects such as dynamics in comparing music performance and speech production in which motor mechanisms play a crucial role. PMID:26217252

  7. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing.

    PubMed

    Jørgensen, Søren; Dau, Torsten

    2011-09-01

    A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure as the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNR(env), at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation frequency selective process provides a key measure of speech intelligibility. © 2011 Acoustical Society of America

  8. Segmental intelligibility of synthetic speech produced by rule.

    PubMed

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  9. Segmental intelligibility of synthetic speech produced by rule

    PubMed Central

    Logan, John S.; Greene, Beth G.; Pisoni, David B.

    2012-01-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884

  10. Do Older Listeners With Hearing Loss Benefit From Dynamic Pitch for Speech Recognition in Noise?

    PubMed

    Shen, Jing; Souza, Pamela E

    2017-10-12

    Dynamic pitch, the variation in the fundamental frequency of speech, aids older listeners' speech perception in noise. It is unclear, however, whether some older listeners with hearing loss benefit from strengthened dynamic pitch cues for recognizing speech in certain noise scenarios and how this relative benefit may be associated with individual factors. We first examined older individuals' relative benefit between natural and strong dynamic pitches for better speech recognition in noise. Further, we reported the individual factors of the 2 groups of listeners who benefit differently from natural and strong dynamic pitches. Speech reception thresholds of 13 older listeners with mild-moderate hearing loss were measured using target speech with 3 levels of dynamic pitch strength. Individuals' ability to benefit from dynamic pitch was defined as the speech reception threshold difference between speeches with and without dynamic pitch cues. The relative benefit of natural versus strong dynamic pitch varied across individuals. However, this relative benefit remained consistent for the same individuals across those background noises with temporal modulation. Those listeners who benefited more from strong dynamic pitch reported better subjective speech perception abilities. Strong dynamic pitch may be more beneficial than natural dynamic pitch for some older listeners to recognize speech better in noise, particularly when the noise has temporal modulation.

  11. Speech and language support: How physicians can identify and treat speech and language delays in the office setting.

    PubMed

    Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

    2014-01-01

    Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. The tool targets early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society's Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children's speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation.

  12. Word pair classification during imagined speech using direct brain recordings

    NASA Astrophysics Data System (ADS)

    Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José Del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.

    2016-05-01

    People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70-150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications.
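
    One way to picture "a non-linear time alignment into the SVM kernel" is a kernel built on dynamic time warping (DTW). The sketch below is an assumption made for illustration: the abstract does not say which alignment method or kernel form the authors used, and the function names, the `gamma` parameter, and the feature sequences are hypothetical. Note also that a Gaussian-style kernel on DTW distance is not guaranteed to be positive semi-definite.

```python
from math import exp, inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_kernel(a, b, gamma=0.1):
    """Similarity in (0, 1]; equals 1 when the sequences align perfectly."""
    return exp(-gamma * dtw_distance(a, b))


# Hypothetical high-gamma time features for three trials.
trial_a = [0, 1, 3, 1, 0]
trial_b = [0, 0, 1, 3, 3, 1, 0]   # same shape as trial_a, spoken more slowly
trial_c = [3, 1, 0, 0, 1]         # different shape

print(dtw_kernel(trial_a, trial_b))
print(dtw_kernel(trial_a, trial_c))
```

    A matrix of such pairwise similarities could then be handed to an SVM that accepts a precomputed kernel, so that trials differing only in timing are treated as similar.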

  13. Developing a weighted measure of speech sound accuracy.

    PubMed

    Preston, Jonathan L; Ramsdell, Heather L; Oller, D Kimbrough; Edwards, Mary Louise; Tobin, Stephen J

    2011-02-01

    To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound Accuracy (WSSA) score. The authors then evaluate the reliability and validity of this measure. Phonetic transcriptions were analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy was validated against existing measures, was used to discriminate typical and disordered speech production, and was evaluated to examine sensitivity to changes in phonetic accuracy over time. Reliability between transcribers and consistency of scores among different word sets and testing points are compared. Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners' judgments of the severity of a child's speech disorder. The measure separates children with and without speech sound disorders and captures growth in phonetic accuracy in toddlers' speech over time. The measure correlates highly across transcribers, word lists, and testing points. Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children's speech.

  14. Word pair classification during imagined speech using direct brain recordings

    PubMed Central

    Martin, Stephanie; Brunner, Peter; Iturrate, Iñaki; Millán, José del R.; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.

    2016-01-01

    People that cannot communicate due to neurological disorders would benefit from an internal speech decoder. Here, we showed the ability to classify individual words during imagined speech from electrocorticographic signals. In a word imagery task, we used high gamma (70–150 Hz) time features with a support vector machine model to classify individual words from a pair of words. To account for temporal irregularities during speech production, we introduced a non-linear time alignment into the SVM kernel. Classification accuracy reached 88% in a two-class classification framework (50% chance level), and average classification accuracy across fifteen word-pairs was significant across five subjects (mean = 58%; p < 0.05). We also compared classification accuracy between imagined speech, overt speech and listening. As predicted, higher classification accuracy was obtained in the listening and overt speech conditions (mean = 89% and 86%, respectively; p < 0.0001), where speech stimuli were directly presented. The results provide evidence for a neural representation for imagined words in the temporal lobe, frontal lobe and sensorimotor cortex, consistent with previous findings in speech perception and production. These data represent a proof of concept study for basic decoding of speech imagery, and delineate a number of key challenges to usage of speech imagery neural representations for clinical applications. PMID:27165452

  15. Speech-Language Dissociations, Distractibility, and Childhood Stuttering

    PubMed Central

    Conture, Edward G.; Walden, Tedra A.; Lambert, Warren E.

    2015-01-01

    Purpose: This study investigated the relation among speech-language dissociations, attentional distractibility, and childhood stuttering. Method: Participants were 82 preschool-age children who stutter (CWS) and 120 who do not stutter (CWNS). Correlation-based statistics (Bates, Appelbaum, Salcedo, Saygin, & Pizzamiglio, 2003) identified dissociations across 5 norm-based speech-language subtests. The Behavioral Style Questionnaire Distractibility subscale measured attentional distractibility. Analyses addressed (a) between-groups differences in the number of children exhibiting speech-language dissociations; (b) between-groups distractibility differences; (c) the relation between distractibility and speech-language dissociations; and (d) whether interactions between distractibility and dissociations predicted the frequency of total, stuttered, and nonstuttered disfluencies. Results: More preschool-age CWS exhibited speech-language dissociations compared with CWNS, and more boys exhibited dissociations compared with girls. In addition, male CWS were less distractible than female CWS and female CWNS. For CWS, but not CWNS, less distractibility (i.e., greater attention) was associated with more speech-language dissociations. Last, interactions between distractibility and dissociations did not predict speech disfluencies in CWS or CWNS. Conclusions: The present findings suggest that for preschool-age CWS, attentional processes are associated with speech-language dissociations. Future investigations are warranted to better understand the directionality of effect of this association (e.g., inefficient attentional processes → speech-language dissociations vs. inefficient attentional processes ← speech-language dissociations). PMID:26126203

  16. Vowel Space Characteristics of Speech Directed to Children With and Without Hearing Loss

    PubMed Central

    Wieland, Elizabeth A.; Burnham, Evamarie B.; Kondaurova, Maria; Bergeson, Tonya R.

    2015-01-01

    Purpose: This study examined vowel characteristics in adult-directed (AD) and infant-directed (ID) speech to children with hearing impairment who received cochlear implants or hearing aids compared with speech to children with normal hearing. Method: Mothers' AD and ID speech to children with cochlear implants (Study 1, n = 20) or hearing aids (Study 2, n = 11) was compared with mothers' speech to controls matched on age and hearing experience. The first and second formants of vowels /i/, /ɑ/, and /u/ were measured, and vowel space area and dispersion were calculated. Results: In both studies, vowel space was modified in ID compared with AD speech to children with and without hearing loss. Study 1 showed larger vowel space area and dispersion in ID compared with AD speech regardless of infant hearing status. The pattern of effects of ID and AD speech on vowel space characteristics in Study 2 was similar to that in Study 1, but depended partly on children's hearing status. Conclusion: Given previously demonstrated associations between expanded vowel space in ID compared with AD speech and enhanced speech perception skills, this research supports a focus on vowel pronunciation in developing intervention strategies for improving speech-language skills in children with hearing impairment. PMID:25658071
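
    As a rough illustration of the two measures named in this abstract, the sketch below computes vowel space area as the area of the triangle spanned by the (F1, F2) points of /i/, /ɑ/, and /u/, and dispersion as the mean distance of those points from their centroid. Both definitions are assumptions, since the abstract does not give the formulas, and the formant values (in Hz) and function names are hypothetical, not data from the study.

```python
from math import hypot


def vowel_space_area(corners):
    """Polygon area (shoelace formula) over a list of (F1, F2) points."""
    total = 0.0
    n = len(corners)
    for k in range(n):
        f1a, f2a = corners[k]
        f1b, f2b = corners[(k + 1) % n]
        total += f1a * f2b - f1b * f2a
    return abs(total) / 2


def vowel_dispersion(corners):
    """Mean Euclidean distance of each vowel from the centroid of the vowel space."""
    c1 = sum(p[0] for p in corners) / len(corners)
    c2 = sum(p[1] for p in corners) / len(corners)
    return sum(hypot(p[0] - c1, p[1] - c2) for p in corners) / len(corners)


# Hypothetical mean (F1, F2) values in Hz for /i/, /ɑ/, /u/.
ad_speech = [(300, 2300), (750, 1200), (350, 900)]
id_speech = [(280, 2600), (850, 1250), (330, 800)]

print(vowel_space_area(ad_speech), vowel_space_area(id_speech))
print(vowel_dispersion(ad_speech) < vowel_dispersion(id_speech))
```

    With these made-up values the ID vowel triangle is the larger and more dispersed of the two, which mirrors the direction of the Study 1 result but carries no evidential weight.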

  17. Speech and oromotor outcome in adolescents born preterm: relationship to motor tract integrity.

    PubMed

    Northam, Gemma B; Liégeois, Frédérique; Chong, Wui K; Baker, Kate; Tournier, Jacques-Donald; Wyatt, John S; Baldeweg, Torsten; Morgan, Angela

    2012-03-01

    To assess speech abilities in adolescents born preterm and investigate whether there is an association between specific speech deficits and brain abnormalities. Fifty adolescents born prematurely (<33 weeks' gestation) with a spectrum of brain injuries were recruited (mean age, 16 years). Speech examination included tests of speech-sound processing and production and speech and oromotor control. Conventional magnetic resonance imaging and diffusion-weighted imaging was acquired in all adolescents born preterm and 30 term-born control subjects. Radiological ratings of brain injury were recorded and the integrity of the primary motor projections was measured (corticospinal tract and speech-motor corticobulbar tract [CST/CBT]). There were no clinical diagnoses of developmental dysarthria, dyspraxia, or a speech-sound disorder, but difficulties in speech and oromotor control were common. A regression analysis revealed that presence of a neurologic impairment, and diffusion-weighted imaging abnormalities in the left CST/CBT were significant independent predictors of poor speech and oromotor outcome. These left-lateralized abnormalities were most evident at the level of the posterior limb of the internal capsule. Difficulties in speech and oromotor control are common in adolescents born preterm, and adolescents with injury to the CST/CBT pathways in the left-hemisphere may be most at risk. Copyright © 2012 Mosby, Inc. All rights reserved.

  18. Applications of orofacial myofunctional techniques to speech therapy.

    PubMed

    Landis, C F

    1994-11-01

    A speech-language pathologist describes how she uses oral myofunctional therapy techniques in the treatment of speech articulation disorders, voice disorders, stuttering and apraxia of speech. Specific exercises are detailed.

  19. Submucous Clefts

    MedlinePlus

    ... of the palate is because of abnormal speech. The speech has a nasal sound because air is lost through the nose. In such cases the child’s speech should be evaluated by a speech pathologist who, ...

  20. Speech-on-speech masking with variable access to the linguistic content of the masker speech for native and non-native speakers of English

    PubMed Central

    Calandruccio, Lauren; Bradlow, Ann R.; Dhar, Sumitrajit

    2013-01-01

    Background: Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared to native-accented English speech was reported in Calandruccio, Dhar and Bradlow (2010). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. Purpose: The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. Research Design: A mixed model design with within- (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech, and high-intelligibility, moderate-intelligibility and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Study Sample: Three listener groups were tested including monolingual English speakers with normal hearing, non-native speakers of English with normal hearing, and monolingual speakers of English with hearing loss. The non-native speakers of English were from various native-language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetrical, mild sloping to moderate sensorineural hearing loss. Data Collection and Analysis: Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the keywords within the sentences (100 keywords/masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and the listener groups. Results: Monolingual speakers of English with normal hearing benefited when the competing speech signal was foreign-accented compared to native-accented allowing for improved speech recognition. Various levels of intelligibility across the foreign-accented speech maskers did not influence results. Neither the non-native English listeners with normal hearing, nor the monolingual English speakers with hearing loss benefited from masking release when the masker was changed from native-accented to foreign-accented English. Conclusions: Slight modifications between the target and the masker speech allowed monolingual speakers of English with normal hearing to improve their recognition of native-accented English even when the competing speech was highly intelligible. Further research is needed to determine which modifications within the competing speech signal caused the Mandarin-accented English to be less effective with respect to masking. Determining the influences within the competing speech that make it less effective as a masker, or determining why monolingual normal-hearing listeners can take advantage of these differences could help improve speech recognition for those with hearing loss in the future. PMID:25126683

  1. Speaking of Race, Speaking of Sex: Hate Speech, Civil Rights, and Civil Liberties.

    ERIC Educational Resources Information Center

    Gates, Henry Louis, Jr.; And Others

    The essays of this collection explore the restriction of speech and the hate speech codes that attempt to restrict bigoted or offensive speech and punish those who engage in it. These essays generally argue that speech restrictions are dangerous and counterproductive, but they acknowledge that it is very difficult to distinguish between…

  2. Parents and Speech Therapist Perception of Parental Involvement in Kailila Therapy Center, Jakarta, Indonesia

    ERIC Educational Resources Information Center

    Jane, Griselda; Tunjungsari, Harini

    2015-01-01

    Parental involvement in a speech therapy has not been prioritized in most therapy centers in Indonesia. One of the therapy centers that has recognized the importance of parental involvement is Kailila Speech Therapy Center. In Kailila speech therapy center, parental involvement in children's speech therapy is an obligation that has been…

  3. Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

    ERIC Educational Resources Information Center

    Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

    2009-01-01

    Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

  4. Verbal Short-Term Memory Span in Speech-Disordered Children: Implications for Articulatory Coding in Short-Term Memory.

    ERIC Educational Resources Information Center

    Raine, Adrian; And Others

    1991-01-01

    Children with speech disorders had lower short-term memory capacity and smaller word length effect than control children. Children with speech disorders also had reduced speech-motor activity during rehearsal. Results suggest that speech rate may be a causal determinant of verbal short-term memory capacity. (BC)

  5. Speech, "Inner Speech," and the Development of Short-Term Memory: Effects of Picture-Labeling on Recall.

    ERIC Educational Resources Information Center

    Hitch, Graham J.; And Others

    1991-01-01

    Reports on experiments to determine effects of overt speech on children's use of inner speech in short-term memory. Word length and phonemic similarity had greater effects on older children and when pictures were labeled at presentation. Suggests that speaking or listening to speech activates an internal articulatory loop. (Author/GH)

  6. Affective Properties of Mothers' Speech to Infants with Hearing Impairment and Cochlear Implants

    ERIC Educational Resources Information Center

    Kondaurova, Maria V.; Bergeson, Tonya R.; Xu, Huiping; Kitamura, Christine

    2015-01-01

    Purpose: The affective properties of infant-directed speech influence the attention of infants with normal hearing to speech sounds. This study explored the affective quality of maternal speech to infants with hearing impairment (HI) during the 1st year after cochlear implantation as compared to speech to infants with normal hearing. Method:…

  7. The Speech Discipline in Crisis--A Cause for Hope.

    ERIC Educational Resources Information Center

    Lanigan, Richard L.

    Speech communication is a distinct discipline, but one in a healthy state of conflict between theory and practice. The crisis in the speech discipline (and in academia generally) exists because speech does not present itself as a consumable value; quality program decisions are not made; speech is often conceived as only one subject matter; general…

  8. Speech-Language Pathologists' Assessment Practices for Children with Suspected Speech Sound Disorders: Results of a National Survey

    ERIC Educational Resources Information Center

    Skahan, Sarah M.; Watson, Maggie; Lof, Gregory L.

    2007-01-01

    Purpose: This study examined assessment procedures used by speech-language pathologists (SLPs) when assessing children suspected of having speech sound disorders (SSD). This national survey also determined the information participants obtained from clients' speech samples, evaluation of non-native English speakers, and time spent on assessment.…

  9. Evidence-Based Systematic Review: Effects of Nonspeech Oral Motor Exercises on Speech

    ERIC Educational Resources Information Center

    McCauley, Rebecca J.; Strand, Edythe; Lof, Gregory L.; Schooling, Tracy; Frymark, Tobi

    2009-01-01

    Purpose: The purpose of this systematic review was to examine the current evidence for the use of oral motor exercises (OMEs) on speech (i.e., speech physiology, speech production, and functional speech outcomes) as a means of supporting further research and clinicians' use of evidence-based practice. Method: The peer-reviewed literature from 1960…

  10. Audiovisual Perception of Noise Vocoded Speech in Dyslexic and Non-Dyslexic Adults: The Role of Low-Frequency Visual Modulations

    ERIC Educational Resources Information Center

    Megnin-Viggars, Odette; Goswami, Usha

    2013-01-01

    Visual speech inputs can enhance auditory speech information, particularly in noisy or degraded conditions. The natural statistics of audiovisual speech highlight the temporal correspondence between visual and auditory prosody, with lip, jaw, cheek and head movements conveying information about the speech envelope. Low-frequency spatial and…

  11. Hate Speech or Free Speech: Can Broad Campus Speech Regulations Survive Current Judicial Reasoning?

    ERIC Educational Resources Information Center

    Heiser, Gregory M.; Rossow, Lawrence F.

    1993-01-01

    Federal courts have found speech regulations overbroad in suits against the University of Michigan and the University of Wisconsin System. Attempts to assess the theoretical justification and probable fate of broad speech regulations that have not been explicitly rejected by the courts. Concludes that strong arguments for broader regulation will…

  12. Difficulties in Automatic Speech Recognition of Dysarthric Speakers and Implications for Speech-Based Applications Used by the Elderly: A Literature Review

    ERIC Educational Resources Information Center

    Young, Victoria; Mihailidis, Alex

    2010-01-01

    Despite their growing presence in home computer applications and various telephony services, commercial automatic speech recognition technologies are still not easily employed by everyone; especially individuals with speech disorders. In addition, relatively little research has been conducted on automatic speech recognition performance with older…

  13. Speech Characteristics of 8-Year-Old Children: Findings from a Prospective Population Study

    ERIC Educational Resources Information Center

    Wren, Yvonne; McLeod, Sharynne; White, Paul; Miller, Laura L.; Roulstone, Sue

    2013-01-01

    Speech disorder that continues into middle childhood is rarely studied compared with speech disorder in the early years. Speech production in single words, connected speech and nonword repetition was assessed for 7390 eight-year-old children within the Avon Longitudinal Study of Parents and Children (ALSPAC). The majority (n=6399) had typical…

  14. Vocoders and Speech Perception: Uses of Computer-Based Speech Analysis-Synthesis in Stimulus Generation.

    ERIC Educational Resources Information Center

    Tierney, Joseph; Mack, Molly

    1987-01-01

    Stimuli used in research on the perception of the speech signal have often been obtained from simple filtering and distortion of the speech waveform, sometimes accompanied by noise. However, for more complex stimulus generation, the parameters of speech can be manipulated, after analysis and before synthesis, using various types of algorithms to…

  15. Attitudes toward Speech Disorders: Sampling the Views of Cantonese-Speaking Americans.

    ERIC Educational Resources Information Center

    Bebout, Linda; Arthur, Bradford

    1997-01-01

    A study of 60 Chinese Americans and 46 controls found the Chinese Americans were more likely to believe persons with speech disorders could improve speech by "trying hard," to view people using deaf speech and people with cleft palates as perhaps being emotionally disturbed, and to regard deaf speech as a limitation. (Author/CR)

  16. SPEECH HABILITATION IN THE SCHOOLS FOR THE CLEFT PALATE CHILD, THE NEW YORK STATE EDUCATION DEPARTMENT PROCEEDINGS (MARCH 17-20, 1965).

    ERIC Educational Resources Information Center

    VAN HATTUM, ROLLAND J.; AND OTHERS

    DESIGNED TO STRENGTHEN THE SKILLS, COMPETENCIES, AND KNOWLEDGE OF SPEECH CORRECTION TEACHERS, THIS SUMMARY OF A SPECIAL STUDY INSTITUTE CONTAINS A SERIES OF PRESENTATIONS. SPEAKERS DISCUSS ASPECTS OF CLEFT PALATE INCLUDING SPEECH, SPEECH ANATOMY, SURGICAL AND DENTAL MANAGEMENT, DIAGNOSIS, AND SPEECH THERAPY. SPEAKERS REPRESENT MEDICAL AND…

  17. Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2012

    2012-01-01

    The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be…

  18. A Meta-Analytic Review of Gender Variations in Children's Language Use: Talkativeness, Affiliative Speech, and Assertive Speech

    ERIC Educational Resources Information Center

    Leaper, Campbell; Smith, Tara E.

    2004-01-01

    Three sets of meta-analyses examined gender effects on children's language use. Each set of analyses considered an aspect of speech that is considered to be gender typed: talkativeness, affiliative speech, and assertive speech. Statistically significant average effect sizes were obtained with all three language constructs. On average, girls were…

  19. Staff Report to the Senior Department Official on Recognition Compliance Issues. Recommendation Page: American Speech-Language-Hearing Association

    ERIC Educational Resources Information Center

    US Department of Education, 2010

    2010-01-01

    The American Speech-Language-Hearing Association, Council on Academic Accreditation in Audiology and Speech-Language Pathology (CAA) is a national accrediting agency of graduate education programs in audiology or speech-language pathology. The CAA currently accredits or preaccredits 319 programs (247 in speech-language pathology and 72 in…

  20. The role of the speech-language pathologist in home care.

    PubMed

    Giles, Melanie; Barker, Mary; Hayes, Amanda

    2014-06-01

    Speech language pathologists play an important role in the care of patients with speech, language, or swallowing difficulties that can result from a variety of medical conditions. This article describes how speech language pathologists assess and treat these conditions and the red flags that suggest a referral to a speech language pathologist is indicated.

  1. Increase in Speech Recognition Due to Linguistic Mismatch between Target and Masker Speech: Monolingual and Simultaneous Bilingual Performance

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Zhou, Haibo

    2014-01-01

    Purpose: To examine whether improved speech recognition during linguistically mismatched target-masker experiments is due to linguistic unfamiliarity of the masker speech or linguistic dissimilarity between the target and masker speech. Method: Monolingual English speakers (n = 20) and English-Greek simultaneous bilinguals (n = 20) listened to…

  2. Speech systems research at Texas Instruments

    NASA Technical Reports Server (NTRS)

    Doddington, George R.

    1977-01-01

    An assessment of automatic speech processing technology is presented. Fundamental problems in the development and the deployment of automatic speech processing systems are defined and a technology forecast for speech systems is presented.

  3. The Atlanta Motor Speech Disorders Corpus: Motivation, Development, and Utility.

    PubMed

    Laures-Gore, Jacqueline; Russell, Scott; Patel, Rupal; Frankel, Michael

    2016-01-01

    This paper describes the design and collection of a comprehensive spoken language dataset from speakers with motor speech disorders in Atlanta, Ga., USA. This collaborative project aimed to gather a spoken database consisting of nonmainstream American English speakers residing in the Southeastern US in order to provide a more diverse perspective of motor speech disorders. Ninety-nine adults with an acquired neurogenic disorder resulting in a motor speech disorder were recruited. Stimuli include isolated vowels, single words, sentences with contrastive focus, sentences with emotional content and prosody, sentences with acoustic and perceptual sensitivity to motor speech disorders, as well as 'The Caterpillar' and 'The Grandfather' passages. Utility of this data in understanding the potential interplay of dialect and dysarthria was demonstrated with a subset of the speech samples existing in the database. The Atlanta Motor Speech Disorders Corpus will enrich our understanding of motor speech disorders through the examination of speech from a diverse group of speakers. © 2016 S. Karger AG, Basel.

  4. Shot through with voices: Dissociation mediates the relationship between varieties of inner speech and auditory hallucination proneness

    PubMed Central

    Alderson-Day, Ben; McCarthy-Jones, Simon; Bedford, Sarah; Collins, Hannah; Dunne, Holly; Rooke, Chloe; Fernyhough, Charles

    2014-01-01

    Inner speech is a commonly experienced but poorly understood phenomenon. The Varieties of Inner Speech Questionnaire (VISQ; McCarthy-Jones & Fernyhough, 2011) assesses four characteristics of inner speech: dialogicality, evaluative/motivational content, condensation, and the presence of other people. Prior findings have linked anxiety and proneness to auditory hallucinations (AH) to these types of inner speech. This study extends that work by examining how inner speech relates to self-esteem and dissociation, and their combined impact upon AH-proneness. 156 students completed the VISQ and measures of self-esteem, dissociation and AH-proneness. Correlational analyses indicated that evaluative inner speech and other people in inner speech were associated with lower self-esteem and greater frequency of dissociative experiences. Dissociation and VISQ scores, but not self-esteem, predicted AH-proneness. Structural equation modelling supported a mediating role for dissociation between specific components of inner speech (evaluative and other people) and AH-proneness. Implications for the development of “hearing voices” are discussed. PMID:24980910

  5. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    PubMed

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.

  6. Gender Disparities in Speech-evoked Auditory Brainstem Response in Healthy Adults.

    PubMed

    Jalaei, Bahram; Zakaria, Mohd Normani; Mohd Azmi, Mohd Hafiz Afifi; Nik Othman, Nik Adilah; Sidek, Dinsuhaimi

    2017-04-01

    Gender disparities in speech-evoked auditory brainstem response (speech-ABR) outcomes have been reported, but the literature is limited. The present study was performed to further verify this issue and determine the influence of head size on speech-ABR results between genders. Twenty-nine healthy Malaysian subjects (14 males and 15 females) aged 19 to 30 years participated in this study. After measuring the head circumference, speech-ABR was recorded using a synthesized syllable /da/ from the right ear of each participant. Speech-ABR peak amplitudes, peak latencies, and composite onset measures were computed and analyzed. Significant gender disparities were noted in the transient component but not in the sustained component of speech-ABR. Statistically higher V/A amplitudes and less steep V/A slopes were found in females. These gender differences were partially affected after controlling for head size. Head size is not the main contributing factor for gender disparities in speech-ABR outcomes. Gender-specific normative data can be useful when recording speech-ABR for clinical purposes.

  7. Measuring Speech Comprehensibility in Students with Down Syndrome

    PubMed Central

    Woynaroski, Tiffany; Camarata, Stephen

    2016-01-01

    Purpose: There is an ongoing need to develop assessments of spontaneous speech that focus on whether the child's utterances are comprehensible to listeners. This study sought to identify the attributes of a stable ratings-based measure of speech comprehensibility, which enabled examining the criterion-related validity of an orthography-based measure of the comprehensibility of conversational speech in students with Down syndrome. Method: Participants were 10 elementary school students with Down syndrome and 4 unfamiliar adult raters. Averaged across-observer Likert ratings of speech comprehensibility were called a ratings-based measure of speech comprehensibility. The proportion of utterance attempts fully glossed constituted an orthography-based measure of speech comprehensibility. Results: Averaging across 4 raters on four 5-min segments produced a reliable (G = .83) ratings-based measure of speech comprehensibility. The ratings-based measure was strongly (r > .80) correlated with the orthography-based measure for both the same and different conversational samples. Conclusion: Reliable and valid measures of speech comprehensibility are achievable with the resources available to many researchers and some clinicians. PMID:27299989
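The two measures defined in the abstract above (Likert ratings averaged across raters, and the proportion of utterance attempts fully glossed) and their correlation can be sketched as follows. This is an illustrative outline only, not the authors' analysis. The function names are hypothetical, the generalizability coefficient (G) is not computed here, and Pearson's r is assumed to be the correlation reported.

```python
from statistics import mean

def ratings_measure(ratings_by_rater):
    """Ratings-based measure: Likert ratings averaged across raters for one speaker."""
    return mean(ratings_by_rater)

def orthography_measure(glossed_flags):
    """Orthography-based measure: proportion of utterance attempts fully glossed.

    Each flag is True when the transcriber could gloss the whole utterance.
    """
    return sum(glossed_flags) / len(glossed_flags)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

In use, each speaker would contribute one value per measure, and `pearson_r` would be applied to the two resulting lists across speakers.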

  8. Reliance on auditory feedback in children with childhood apraxia of speech.

    PubMed

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Guarino, Anthony J; Green, Jordan R

    2015-01-01

    Children with childhood apraxia of speech (CAS) have been hypothesized to continuously monitor their speech through auditory feedback to minimize speech errors. We used an auditory masking paradigm to determine the effect of attenuating auditory feedback on speech in 30 children: 9 with CAS, 10 with speech delay, and 11 with typical development. The masking only affected the speech of children with CAS as measured by voice onset time and vowel space area. These findings provide preliminary support for greater reliance on auditory feedback among children with CAS. Readers of this article should be able to (i) describe the motivation for investigating the role of auditory feedback in children with CAS; (ii) report the effects of feedback attenuation on speech production in children with CAS, speech delay, and typical development, and (iii) understand how the current findings may support a feedforward program deficit in children with CAS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  9. Speech perception in individuals with auditory dys-synchrony.

    PubMed

    Kumar, U A; Jayaram, M

    2011-03-01

    This study aimed to evaluate the effect of lengthening the transition duration of selected speech segments upon the perception of those segments in individuals with auditory dys-synchrony. Thirty individuals with auditory dys-synchrony participated in the study, along with 30 age-matched normal hearing listeners. Eight consonant-vowel syllables were used as auditory stimuli. Two experiments were conducted. Experiment one measured the 'just noticeable difference' time: the smallest prolongation of the speech sound transition duration which was noticeable by the subject. In experiment two, speech sounds were modified by lengthening the transition duration by multiples of the just noticeable difference time, and subjects' speech identification scores for the modified speech sounds were assessed. Subjects with auditory dys-synchrony demonstrated poor processing of temporal auditory information. Lengthening of speech sound transition duration improved these subjects' perception of both the placement and voicing features of the speech syllables used. These results suggest that innovative speech processing strategies which enhance temporal cues may benefit individuals with auditory dys-synchrony.

  10. How does cognitive load influence speech perception? An encoding hypothesis.

    PubMed

    Mitterer, Holger; Mattys, Sven L

    2017-01-01

    Two experiments investigated the conditions under which cognitive load exerts an effect on the acuity of speech perception. These experiments extend earlier research by using a different speech perception task (four-interval oddity task) and by implementing cognitive load through a task often thought to be modular, namely, face processing. In the cognitive-load conditions, participants were required to remember two faces presented before the speech stimuli. In Experiment 1, performance in the speech-perception task under cognitive load was not impaired in comparison to a no-load baseline condition. In Experiment 2, we modified the load condition minimally such that it required encoding of the two faces simultaneously with the speech stimuli. As a reference condition, we also used a visual search task that in earlier experiments had led to poorer speech perception. Both concurrent tasks led to decrements in the speech task. The results suggest that speech perception is affected even by loads thought to be processed modularly, and that, critically, encoding in working memory might be the locus of interference.

  11. Speech and pause characteristics in multiple sclerosis: A preliminary study of speakers with high and low neuropsychological test performance

    PubMed Central

    Feenaughty, Lynda; Tjaden, Kris; Benedict, Ralph H.B.; Weinstock-Guttman, Bianca

    2017-01-01

    This preliminary study investigated how cognitive-linguistic status in multiple sclerosis (MS) is reflected in two speech tasks (i.e. oral reading, narrative) that differ in cognitive-linguistic demand. Twenty individuals with MS were selected to comprise High and Low performance groups based on clinical tests of executive function and information processing speed and efficiency. Ten healthy controls were included for comparison. Speech samples were audio-recorded and measures of global speech timing were obtained. Results indicated predicted differences in global speech timing (i.e. speech rate and pause characteristics) for speech tasks differing in cognitive-linguistic demand, but the magnitude of these task-related differences was similar for all speaker groups. Findings suggest that assumptions concerning the cognitive-linguistic demands of reading aloud as compared to spontaneous speech may need to be re-considered for individuals with cognitive impairment. Qualitative trends suggest that additional studies investigating the association between cognitive-linguistic and speech motor variables in MS are warranted. PMID:23294227

  12. Effects of utterance length and vocal loudness on speech breathing in older adults.

    PubMed

    Huber, Jessica E

    2008-12-31

    Age-related reductions in pulmonary elastic recoil and respiratory muscle strength can affect how older adults generate subglottal pressure required for speech production. The present study examined age-related changes in speech breathing by manipulating utterance length and loudness during a connected speech task (monologue). Twenty-three older adults and twenty-eight young adults produced a monologue at comfortable loudness and pitch and with multi-talker babble noise playing in the room to elicit louder speech. Dependent variables included sound pressure level, speech rate, and lung volume initiation, termination, and excursion. Older adults produced shorter utterances than young adults overall. Age-related effects were larger for longer utterances. Older adults demonstrated very different lung volume adjustments for loud speech than young adults. These results suggest that older adults have a more difficult time when the speech system is being taxed by both utterance length and loudness. The data were consistent with the hypothesis that both young and older adults use utterance length in premotor speech planning processes.

  13. Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.

    PubMed

    Borrie, Stephanie A

    2015-03-01

    This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal (the AV advantage) has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.

  14. Common variation in the autism risk gene CNTNAP2, brain structural connectivity and multisensory speech integration.

    PubMed

    Ross, Lars A; Del Bene, Victor A; Molholm, Sophie; Jae Woo, Young; Andrade, Gizely N; Abrahams, Brett S; Foxe, John J

    2017-11-01

    Three lines of evidence motivated this study. 1) CNTNAP2 variation is associated with autism risk and speech-language development. 2) CNTNAP2 variations are associated with differences in white matter (WM) tracts comprising the speech-language circuitry. 3) Children with autism show impairment in multisensory speech perception. Here, we asked whether an autism risk-associated CNTNAP2 single nucleotide polymorphism in neurotypical adults was associated with multisensory speech perception performance, and whether such a genotype-phenotype association was mediated through white matter tract integrity in speech-language circuitry. Risk genotype at rs7794745 was associated with decreased benefit from visual speech and lower fractional anisotropy (FA) in several WM tracts (right precentral gyrus, left anterior corona radiata, right retrolenticular internal capsule). These structural connectivity differences were found to mediate the effect of genotype on audiovisual speech perception, shedding light on possible pathogenic pathways in autism and biological sources of inter-individual variation in audiovisual speech processing in neurotypicals. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. A multimodal spectral approach to characterize rhythm in natural speech.

    PubMed

    Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

    2016-01-01

    Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.

  16. Visual Feedback of Tongue Movement for Novel Speech Sound Learning

    PubMed Central

    Katz, William F.; Mehta, Sonya

    2015-01-01

    Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ/; a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing. PMID:26635571

  17. Neuronal populations in the occipital cortex of the blind synchronize to the temporal dynamics of speech

    PubMed Central

    Van Ackeren, Markus Johannes; Barbero, Francesca M; Mattioni, Stefania; Bottini, Roberto

    2018-01-01

    The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives. PMID:29338838

  18. Contralateral functional reorganization of the speech supplementary motor area following neurosurgical tumor resection.

    PubMed

    Chivukula, Srinivas; Pikul, Brian K; Black, Keith L; Pouratian, Nader; Bookheimer, Susan Y

    2018-05-18

    We evaluated plasticity in speech supplementary motor area (SMA) tissue in two patients using functional magnetic resonance imaging (fMRI), following resection of tumors in or associated with the dominant hemisphere speech SMA. Patient A underwent resection of an anaplastic astrocytoma NOS associated with the left speech SMA, experienced SMA syndrome-related mutism postoperatively, but experienced full recovery 14 months later. FMRI performed 32 months after surgery demonstrated a migration of speech SMA to homologous contralateral hemispheric regional tissue. Patient B underwent resection of an oligodendroglioma NOS in the left speech SMA, and postoperatively experienced speech hesitancy, latency and poor fluency, which gradually resolved over 18 months. FMRI performed at 64 months after surgery showed a reorganization of speech SMA to the contralateral hemisphere. These data support the hypothesis of dynamic, time-based plasticity in speech SMA tissue, and may represent a noninvasive neural marker for SMA syndrome recovery. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. [Effects of fundamental frequency and speech rate on impression formation].

    PubMed

    Uchida, Teruhisa; Nakaune, Naoko

    2004-12-01

    This study investigated the systematic relationship between nonverbal features of speech and personality trait ratings of the speaker. In Study 1, fundamental frequency (F0) in original speech was converted into five levels from 64% to 156.25%. Then 132 undergraduates rated each of the converted speeches in terms of personality traits. In Study 2, 134 undergraduates similarly rated the speech stimuli, which had five speech rate levels as well as two F0 levels. Results showed that listener ratings along Big Five dimensions were mostly independent. Each dimension had a slightly different change profile over the five levels of F0 and speech rate. A quadratic regression equation provided a good approximation for each rating as a function of F0 or speech rate. The quadratic regression equations put together would provide us with a rough estimate of personality trait impression as a function of prosodic features. The functional relationship among F0, speech rate, and trait ratings was shown as a curved surface in the three-dimensional space.

  20. A hybrid technique for speech segregation and classification using a sophisticated deep neural network

    PubMed Central

    Nawaz, Tabassam; Mehmood, Zahid; Rashid, Muhammad; Habib, Hafiz Adnan

    2018-01-01

    Recent research on speech segregation and music fingerprinting has led to improvements in speech segregation and music identification algorithms. Speech and music segregation generally involves the identification of music followed by speech segregation. However, music segregation becomes a challenging task in the presence of noise. This paper proposes a novel method of speech segregation for unlabelled stationary noisy audio signals using the deep belief network (DBN) model. The proposed method successfully segregates a music signal from noisy audio streams. A recurrent neural network (RNN)-based hidden layer segregation model is applied to remove stationary noise. Dictionary-based Fisher algorithms are employed for speech classification. The proposed method is tested on three datasets (TIMIT, MIR-1K, and MusicBrainz), and the results indicate the robustness of the proposed method for speech segregation. The qualitative and quantitative analyses carried out on the three datasets demonstrate the efficiency of the proposed method compared to the state-of-the-art speech segregation and classification-based methods. PMID:29558485

  1. International aspirations for speech-language pathologists' practice with multilingual children with speech sound disorders: development of a position paper.

    PubMed

    McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline

    2013-01-01

    A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). 
    The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional practice, and (d) SLPs' collaboration with other professionals. Readers will (1) recognize that multilingual children with speech sound disorders have both similar and different needs to monolingual children when working with speech-language pathologists; (2) describe the challenges for speech-language pathologists who work with multilingual children; (3) recall the importance of cultural competence for speech-language pathologists; (4) identify methods for international collaboration and consultation; and (5) recognize the importance of engaging with families and people within their local communities for supporting multilingual children in context. Copyright © 2013 Elsevier Inc. All rights reserved.

  2. Utility and accuracy of perceptual voice and speech distinctions in the diagnosis of Parkinson's disease, PSP and MSA-P.

    PubMed

    Miller, Nick; Nath, Uma; Noble, Emma; Burn, David

    2017-06-01

    To determine if perceptual speech measures distinguish people with Parkinson's disease (PD), multiple system atrophy with predominant parkinsonism (MSA-P) and progressive supranuclear palsy (PSP). Speech-language therapists blind to patient characteristics employed clinical rating scales to evaluate speech/voice in 24 people with clinically diagnosed PD, 17 with PSP and 9 with MSA-P, matched for disease duration (mean 4.9 years, standard deviation 2.2). No consistent intergroup differences appeared on specific speech/voice variables. People with PD were significantly less impaired on overall speech/voice severity. Analyses by severity suggested that further investigation of laryngeal, resonance and fluency changes may help characterize individual groups. MSA-P and PSP compared with PD were distinguished by severity of speech/voice deterioration, but individual speech/voice parameters failed to consistently differentiate groups.

  3. The impact of extrinsic demographic factors on Cantonese speech acquisition.

    PubMed

    To, Carol K S; Cheung, Pamela S P; McLeod, Sharynne

    2013-05-01

    This study modeled the associations between extrinsic demographic factors and children's speech acquisition in Hong Kong Cantonese. The speech of 937 Cantonese-speaking children aged 2;4 to 6;7 in Hong Kong was assessed using a standardized speech test. Demographic information regarding household income, paternal education, maternal education, presence of siblings and having a domestic helper as the main caregiver was collected via parent questionnaires. After controlling for age and sex, higher maternal education and higher household income were significantly associated with better speech skills; however, these variables explained a negligible amount of variance. Paternal education, number of siblings and having a foreign domestic helper were not associated with a child's speech acquisition. Extrinsic factors only exerted minimal influence on children's speech acquisition. A large amount of unexplained variance in speech ability still warrants further research.

  4. Spatio-Temporal Progression of Cortical Activity Related to Continuous Overt and Covert Speech Production in a Reading Task.

    PubMed

    Brumberg, Jonathan S; Krusienski, Dean J; Chakrabarti, Shreya; Gunduz, Aysegul; Brunner, Peter; Ritaccio, Anthony L; Schalk, Gerwin

    2016-01-01

    How the human brain plans, executes, and monitors continuous and fluent speech has remained largely elusive. For example, previous research has defined the cortical locations most important for different aspects of speech function, but has not yet yielded a definition of the temporal progression of involvement of those locations as speech progresses either overtly or covertly. In this paper, we uncovered the spatio-temporal evolution of neuronal population-level activity related to continuous overt speech, and identified those locations that shared activity characteristics across overt and covert speech. Specifically, we asked subjects to repeat continuous sentences aloud or silently while we recorded electrical signals directly from the surface of the brain (electrocorticography (ECoG)). We then determined the relationship between cortical activity and speech output across different areas of cortex and at sub-second timescales. The results highlight a spatio-temporal progression of cortical involvement in the continuous speech process that initiates utterances in frontal-motor areas and ends with the monitoring of auditory feedback in superior temporal gyrus. Direct comparison of cortical activity related to overt versus covert conditions revealed a common network of brain regions involved in speech that may implement orthographic and phonological processing. Our results provide one of the first characterizations of the spatiotemporal electrophysiological representations of the continuous speech process, and also highlight the common neural substrate of overt and covert speech. These results thereby contribute to a refined understanding of speech functions in the human brain.

  5. Spatio-Temporal Progression of Cortical Activity Related to Continuous Overt and Covert Speech Production in a Reading Task

    PubMed Central

    Brumberg, Jonathan S.; Krusienski, Dean J.; Chakrabarti, Shreya; Gunduz, Aysegul; Brunner, Peter; Ritaccio, Anthony L.; Schalk, Gerwin

    2016-01-01

    How the human brain plans, executes, and monitors continuous and fluent speech has remained largely elusive. For example, previous research has defined the cortical locations most important for different aspects of speech function, but has not yet yielded a definition of the temporal progression of involvement of those locations as speech progresses either overtly or covertly. In this paper, we uncovered the spatio-temporal evolution of neuronal population-level activity related to continuous overt speech, and identified those locations that shared activity characteristics across overt and covert speech. Specifically, we asked subjects to repeat continuous sentences aloud or silently while we recorded electrical signals directly from the surface of the brain (electrocorticography (ECoG)). We then determined the relationship between cortical activity and speech output across different areas of cortex and at sub-second timescales. The results highlight a spatio-temporal progression of cortical involvement in the continuous speech process that initiates utterances in frontal-motor areas and ends with the monitoring of auditory feedback in superior temporal gyrus. Direct comparison of cortical activity related to overt versus covert conditions revealed a common network of brain regions involved in speech that may implement orthographic and phonological processing. Our results provide one of the first characterizations of the spatiotemporal electrophysiological representations of the continuous speech process, and also highlight the common neural substrate of overt and covert speech. These results thereby contribute to a refined understanding of speech functions in the human brain. PMID:27875590

  6. Evidence of degraded representation of speech in noise, in the aging midbrain and cortex

    PubMed Central

    Simon, Jonathan Z.; Anderson, Samira

    2016-01-01

    Humans have a remarkable ability to track and understand speech in unfavorable conditions, such as in background noise, but speech understanding in noise does deteriorate with age. Results from several studies have shown that in younger adults, low-frequency auditory cortical activity reliably synchronizes to the speech envelope, even when the background noise is considerably louder than the speech signal. However, cortical speech processing may be limited by age-related decreases in the precision of neural synchronization in the midbrain. To understand better the neural mechanisms contributing to impaired speech perception in older adults, we investigated how aging affects midbrain and cortical encoding of speech when presented in quiet and in the presence of a single-competing talker. Our results suggest that central auditory temporal processing deficits in older adults manifest in both the midbrain and in the cortex. Specifically, midbrain frequency following responses to a speech syllable are more degraded in noise in older adults than in younger adults. This suggests a failure of the midbrain auditory mechanisms needed to compensate for the presence of a competing talker. Similarly, in cortical responses, older adults show larger reductions than younger adults in their ability to encode the speech envelope when a competing talker is added. Interestingly, older adults showed an exaggerated cortical representation of speech in both quiet and noise conditions, suggesting a possible imbalance between inhibitory and excitatory processes, or diminished network connectivity that may impair their ability to encode speech efficiently. PMID:27535374

  7. How may the basal ganglia contribute to auditory categorization and speech perception?

    PubMed Central

    Lim, Sung-Joo; Fiez, Julie A.; Holt, Lori L.

    2014-01-01

    Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood. PMID:25136291

  8. Clear Speech Modifications in Children Aged 6-10

    NASA Astrophysics Data System (ADS)

    Taylor, Griffin Lijding

    Modifications to speech production made by adult talkers in response to instructions to speak clearly have been well documented in the literature. Targeting adult populations has been motivated by efforts to improve speech production for the benefit of the communication partners; however, many adults also have communication partners who are children. Surprisingly, there is limited literature on whether children can change their speech production when cued to speak clearly. Pettinato, Tuomainen, Granlund, and Hazan (2016) showed that by age 12, children exhibited enlarged vowel space areas and reduced articulation rate when prompted to speak clearly, but did not produce any other adult-like clear speech modifications in connected speech. Moreover, Syrett and Kawahara (2013) suggested that preschoolers produced longer and more intense vowels when prompted to speak clearly at the word level. These findings contrasted with adult talkers who show significant temporal and spectral differences between speech produced in control and clear speech conditions. Therefore, it was the purpose of this study to analyze changes in temporal and spectral characteristics of speech production that children aged 6-10 made in these experimental conditions. It is important to elucidate the clear speech profile of this population to better understand which adult-like clear speech modifications they make spontaneously and which modifications are still developing. Understanding these baselines will advance future studies that measure the impact of more explicit instructions and children's abilities to better accommodate their interlocutors, which is a critical component of children's pragmatic and speech-motor development.

  9. Speech acoustic markers of early stage and prodromal Huntington's disease: a marker of disease onset?

    PubMed

    Vogel, Adam P; Shirbin, Christopher; Churchyard, Andrew J; Stout, Julie C

    2012-12-01

    Speech disturbances (e.g., altered prosody) have been described in symptomatic Huntington's Disease (HD) individuals; however, the extent to which speech changes in gene positive pre-manifest (PreHD) individuals is largely unknown. The speech of individuals carrying the mutant HTT gene is a behavioural/motor/cognitive marker demonstrating some potential as an objective indicator of early HD onset and disease progression. Speech samples were acquired from 30 individuals carrying the mutant HTT gene (13 PreHD, 17 early stage HD) and 15 matched controls. Participants read a passage, produced a monologue and said the days of the week. Data were analysed acoustically for measures of timing, frequency and intensity. There was a clear effect of group across most acoustic measures, so that speech performance differed in line with disease progression. Comparisons across groups revealed significant differences between the control and the early stage HD group on measures of timing (e.g., speech rate). Participants carrying the mutant HTT gene presented with slower rates of speech, took longer to say words and produced greater silences between and within words compared to healthy controls. Importantly, speech rate showed a significant correlation to burden of disease scores. The speech of early stage HD differed significantly from controls. The speech of PreHD, although not reaching significance, tended to lie between the performance of controls and early stage HD. This suggests that changes in speech production appear to be developing prior to diagnosis. Copyright © 2012 Elsevier Ltd. All rights reserved.

  10. An ALE meta-analysis on the audiovisual integration of speech signals.

    PubMed

    Erickson, Laura C; Heeg, Elizabeth; Rauschecker, Josef P; Turkeltaub, Peter E

    2014-11-01

    The brain improves speech processing through the integration of audiovisual (AV) signals. Situations involving AV speech integration may be crudely dichotomized into those where auditory and visual inputs contain (1) equivalent, complementary signals (validating AV speech) or (2) inconsistent, different signals (conflicting AV speech). This simple framework may allow the systematic examination of broad commonalities and differences between AV neural processes engaged by various experimental paradigms frequently used to study AV speech integration. We conducted an activation likelihood estimation meta-analysis of 22 functional imaging studies comprising 33 experiments, 311 subjects, and 347 foci examining "conflicting" versus "validating" AV speech. Experimental paradigms included content congruency, timing synchrony, and perceptual measures, such as the McGurk effect or synchrony judgments, across AV speech stimulus types (sublexical to sentence). Colocalization of conflicting AV speech experiments revealed consistency across at least two contrast types (e.g., synchrony and congruency) in a network of dorsal stream regions in the frontal, parietal, and temporal lobes. There was consistency across all contrast types (synchrony, congruency, and percept) in the bilateral posterior superior/middle temporal cortex. Although fewer studies were available, validating AV speech experiments were localized to other regions, such as ventral stream visual areas in the occipital and inferior temporal cortex. These results suggest that while equivalent, complementary AV speech signals may evoke activity in regions related to the corroboration of sensory input, conflicting AV speech signals recruit widespread dorsal stream areas likely involved in the resolution of conflicting sensory signals. Copyright © 2014 Wiley Periodicals, Inc.

  11. Cortical Tracking of Global and Local Variations of Speech Rhythm during Connected Natural Speech Perception.

    PubMed

    Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

    2018-06-19

    During natural speech perception, listeners must track the global speaking rate, that is, the overall rate of incoming linguistic information, as well as transient, local speaking rate variations occurring within the global speaking rate. Here, we address the hypothesis that this tracking mechanism is achieved through coupling of cortical signals to the amplitude envelope of the perceived acoustic speech signals. Cortical signals were recorded with magnetoencephalography (MEG) while participants perceived spontaneously produced speech stimuli at three global speaking rates (slow, normal/habitual, and fast). As is inherent to spontaneously produced speech, these stimuli also featured local variations in speaking rate. The coupling between cortical and acoustic speech signals was evaluated using audio-MEG coherence. Modulations in audio-MEG coherence spatially differentiated between tracking of global speaking rate, highlighting the temporal cortex bilaterally and the right parietal cortex, and sensitivity to local speaking rate variations, emphasizing the left parietal cortex. Cortical tuning to the temporal structure of natural connected speech thus seems to require the joint contribution of both auditory and parietal regions. These findings suggest that cortical tuning to speech rhythm operates on two functionally distinct levels: one encoding the global rhythmic structure of speech and the other associated with online, rapidly evolving temporal predictions. Thus, it may be proposed that speech perception is shaped by evolutionary tuning, a preference for certain speaking rates, and predictive tuning, associated with cortical tracking of the constantly changing rate of linguistic information in a speech stream.

  12. Segmenting words from natural speech: subsegmental variation in segmental cues.

    PubMed

    Rytting, C Anton; Brew, Chris; Fosler-Lussier, Eric

    2010-06-01

    Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We use this new representation to re-evaluate a key computational model of word segmentation. One finding is that high levels of phonetic variability degrade the model's performance. While robustness to phonetic variability may be intrinsically valuable, this finding needs to be complemented by parallel studies of the actual abilities of children to segment phonetically variable speech.

  13. A Comparison of LBG and ADPCM Speech Compression Techniques

    NASA Astrophysics Data System (ADS)

    Bachu, Rajesh G.; Patel, Jignasa; Barkana, Buket D.

    Speech compression is the technology of converting human speech into an efficiently encoded representation that can later be decoded to produce a close approximation of the original signal. In all speech there is a degree of predictability, and speech coding techniques exploit this to reduce bit rates yet still maintain a suitable level of quality. This paper is a study and implementation of the Linde-Buzo-Gray (LBG) and Adaptive Differential Pulse Code Modulation (ADPCM) algorithms to compress speech signals. We implemented the methods using MATLAB 7.0. The methods used in this study gave good results and performance in compressing the speech, and listening tests showed that efficient and high-quality coding is achieved.
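
As an illustration of the LBG technique named in this record, the following is a minimal Python sketch of codebook training by repeated splitting and nearest-neighbour refinement. It is not the authors' MATLAB implementation: the function names (`lbg`, `nearest`, `centroid`), the additive split perturbation `eps`, the stopping tolerance, and the toy training vectors are all assumptions chosen for the example.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(training, size, eps=0.01, tol=1e-4, max_iter=100):
    """Grow a codebook by splitting; `size` is assumed to be a power of two."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two copies shifted by +eps and -eps.
        codebook = [[c + s * eps for c in cw] for cw in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            cells = [[] for _ in codebook]
            dist = 0.0
            for v in training:
                i = nearest(codebook, v)
                cells[i].append(v)
                dist += sum((a - b) ** 2 for a, b in zip(codebook[i], v))
            dist /= len(training)
            # Move each codeword to the centroid of its cell; empty cells stay put.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            if prev - dist <= tol * dist:  # average distortion stopped improving
                break
            prev = dist
    return codebook


# Hypothetical 2-D "feature vectors" forming two well-separated clusters.
training = [[0, 0], [0, 1], [1, 0], [1, 1],
            [10, 10], [10, 11], [11, 10], [11, 11]]
codebook = lbg(training, 2)
# Encoding replaces a vector by the index of its nearest codeword.
index = nearest(codebook, [9, 9])
```

In a speech coder the training vectors would be frames of speech features rather than toy points, and only the codeword indices would be transmitted; the decoder looks the codewords up in a shared copy of the codebook.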

  14. Effects of Production Training and Perception Training on Lexical Tone Perception--Are the Effects Domain General or Domain Specific?

    ERIC Educational Resources Information Center

    Lu, Shuang

    2013-01-01

    The relationship between speech perception and production has been debated for a long time. The Motor Theory of speech perception (Liberman et al., 1989) claims that perceiving speech is identifying the intended articulatory gestures rather than perceiving the sound patterns. It seems to suggest that speech production precedes speech perception,…

  15. Using Others' Words: Conversational Use of Reported Speech by Individuals with Aphasia and Their Communication Partners.

    ERIC Educational Resources Information Center

    Hengst, Julie A.; Frame, Simone R.; Neuman-Stritzel, Tiffany; Gannaway, Rachel

    2005-01-01

    Reported speech, wherein one quotes or paraphrases the speech of another, has been studied extensively as a set of linguistic and discourse practices. Researchers agree that reported speech is pervasive, found across languages, and used in diverse contexts. However, to date, there have been no studies of the use of reported speech among…

  16. Elements of a Plan-Based Theory of Speech Acts. Technical Report No. 141.

    ERIC Educational Resources Information Center

    Cohen, Philip R.; Perrault, C. Raymond

    This report proposes that people often plan their speech acts to affect their listeners' beliefs, goals, and emotional states and that such language use can be modeled by viewing speech acts as operators in a planning system, allowing both physical and speech acts to be integrated into plans. Methodological issues of how speech acts should be…

  17. Hello World, It's Me: Bringing the Basic Speech Communication Course into the Digital Age

    ERIC Educational Resources Information Center

    Kirkwood, Jessica; Gutgold, Nichola D.; Manley, Destiny

    2011-01-01

    During the past decade, instructors of speech communication have been adapting the introductory speech course to keep up with the television age. Learning units in speech textbooks now teach how to speak well on television, as well as how to interpret speeches in the media. This article argues that the computer age invites adaptation of the…

  18. Speed-Accuracy Tradeoffs in Speech Production

    DTIC Science & Technology

    2017-06-01

    imaging data of speech production. A theoretical framework for considering Fitts’ law in the domain of speech production is elucidated. Methodological ...articulatory kinematics conform to Fitts’ law. A second, associated goal is to address the methodological challenges inherent in performing Fitts-style...analysis on rtMRI data of speech production. Methodological challenges include segmenting continuous speech into specific motor tasks, defining key

  19. The Effects of Direct and Indirect Speech Acts on Native English and ESL Speakers' Perception of Teacher Written Feedback

    ERIC Educational Resources Information Center

    Baker, Wendy; Hansen Bricker, Rachel

    2010-01-01

    This study explores how second language (L2) learners perceive indirect (hedging or indirect speech acts) and direct written teacher feedback. Though research suggests that indirect speech acts may be more difficult to interpret than direct speech acts ([Champagne, 2001] and [Holtgraves, 1999]), using indirect speech acts is often encouraged in…

  20. Particularities of Speech Readiness for Schooling in Pre-School Children Having General Speech Underdevelopment: A Social and Pedagogical Aspect

    ERIC Educational Resources Information Center

    Emelyanova, Irina A.; Borisova, Elena A.; Shapovalova, Olga E.; Karynbaeva, Olga V.; Vorotilkina, Irina M.

    2018-01-01

    The research is relevant because of the need to create pedagogical conditions for the correction and development of speech in children with general speech underdevelopment. Such children characteristically have difficulty generating a coherent utterance, which prevents sufficient speech readiness for schooling from forming in them as well…

  1. Is Birdsong More Like Speech or Music?

    PubMed

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues, but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate, is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Frontal Brain Electrical Activity (EEG) and Heart Rate in Response to Affective Infant-Directed (ID) Speech in 9-Month-Old Infants

    ERIC Educational Resources Information Center

    Santesso, Diane L.; Schmidt, Louis A.; Trainor, Laurel J.

    2007-01-01

    Many studies have shown that infants prefer infant-directed (ID) speech to adult-directed (AD) speech. ID speech functions to aid language learning, obtain and/or maintain an infant's attention, and create emotional communication between the infant and caregiver. We examined psychophysiological responses to ID speech that varied in affective…

  3. Effects of Multisensory Speech Training and Visual Phonics on Speech Production of a Hearing-Impaired Child.

    ERIC Educational Resources Information Center

    Zaccagnini, Cindy M.; Antia, Shirin D.

    1993-01-01

    This study of the effects of intensive multisensory speech training on the speech production of a profoundly hearing-impaired child (age nine) found that the addition of Visual Phonics hand cues did not result in speech production gains. All six target phonemes were generalized to new words and maintained after the intervention was discontinued.…

  4. Effects and modeling of phonetic and acoustic confusions in accented speech.

    PubMed

    Fung, Pascale; Liu, Yi

    2005-11-01

    Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone is found to lie acoustically between two baseform models and can be equally recognized as either one. We propose that it is necessary to analyze and model these confusions separately in order to improve accented speech recognition without degrading standard speech recognition. Since low phonetic confusion units in accented speech do not give rise to automatic speech recognition errors, we focus on analyzing and reducing phonetic and acoustic confusability under high phonetic confusion conditions. We propose using a likelihood ratio test to measure phonetic confusion and an asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are used in an augmented pronunciation dictionary, while phonetic units with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, with a significant 5.7% absolute WER reduction, without degrading standard speech recognition.

  5. Influence of compact disk recording protocols on reliability and comparability of speech audiometry outcomes: acoustic analysis.

    PubMed

    Di Berardino, F; Tognola, G; Paglialonga, A; Alpini, D; Grandori, F; Cesarani, A

    2010-08-01

    To assess whether different compact disk recording protocols, used to prepare speech test material, affect the reliability and comparability of speech audiometry testing. We conducted acoustic analysis of compact disks used in clinical practice, to determine whether speech material had been recorded using similar procedures. To assess the impact of different recording procedures on speech test outcomes, normal hearing subjects were tested using differently prepared compact disks, and their psychometric curves compared. Acoustic analysis revealed that speech material had been recorded using different protocols. The major difference was the gain between the levels at which the speech material and the calibration signal had been recorded. Although correct calibration of the audiometer was performed for each compact disk before testing, speech recognition thresholds and maximum intelligibility thresholds differed significantly between compact disks (p < 0.05), and were influenced by the gain between the recording level of the speech material and the calibration signal. To ensure the reliability and comparability of speech test outcomes obtained using different compact disks, it is recommended to check for possible differences in the recording gains used to prepare the compact disks, and then to compensate for any differences before testing.

  6. Preschoolers Benefit From Visually Salient Speech Cues

    PubMed Central

    Holt, Rachael Frush

    2015-01-01

    Purpose: This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. The authors also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method: Twelve adults and 27 typically developing 3- and 4-year-old children completed 3 audiovisual (AV) speech integration tasks: matching, discrimination, and recognition. The authors compared AV benefit for visually salient and less visually salient speech discrimination contrasts and assessed the visual saliency of consonant confusions in auditory-only and AV word recognition. Results: Four-year-olds and adults demonstrated visual influence on all measures. Three-year-olds demonstrated visual influence on speech discrimination and recognition measures. All groups demonstrated greater AV benefit for the visually salient discrimination contrasts. AV recognition benefit in 4-year-olds and adults depended on the visual saliency of speech sounds. Conclusions: Preschoolers can demonstrate AV speech integration. Their AV benefit results from efficient use of visually salient speech cues. Four-year-olds, but not 3-year-olds, used visual phonological knowledge to take advantage of visually salient speech cues, suggesting possible developmental differences in the mechanisms of AV benefit. PMID:25322336

  7. Audiovisual integration of speech in a patient with Broca's Aphasia

    PubMed Central

    Andersen, Tobias S.; Starrfelt, Randi

    2015-01-01

    Lesions to Broca's area cause aphasia characterized by a severe impairment of the ability to speak, with comparatively intact speech perception. However, some studies have found effects on speech perception under adverse listening conditions, indicating that Broca's area is also involved in speech perception. While these studies have focused on auditory speech perception, other studies have shown that Broca's area is activated by visual speech perception. Furthermore, one preliminary report found that a patient with Broca's aphasia did not experience the McGurk illusion, suggesting that an intact Broca's area is necessary for audiovisual integration of speech. Here we describe a patient with Broca's aphasia who experienced the McGurk illusion. This indicates that an intact Broca's area is not necessary for audiovisual integration of speech. The McGurk illusions this patient experienced were atypical, which could be due to Broca's area having a more subtle role in audiovisual integration of speech. The McGurk illusions of a control subject with Wernicke's aphasia were, however, also atypical. This indicates that the atypical McGurk illusions were due to deficits in speech processing that are not specific to Broca's aphasia. PMID:25972819

  8. Fifty years of progress in acoustic phonetics

    NASA Astrophysics Data System (ADS)

    Stevens, Kenneth N.

    2004-10-01

    Three events that occurred 50 or 60 years ago shaped the study of acoustic phonetics, and in the following few decades these events influenced research and applications in speech disorders, speech development, speech synthesis, speech recognition, and other subareas in speech communication. These events were: (1) the source-filter theory of speech production (Chiba and Kajiyama; Fant); (2) the development of the sound spectrograph and its interpretation (Potter, Kopp, and Green; Joos); and (3) the birth of research that related distinctive features to acoustic patterns (Jakobson, Fant, and Halle). Following these events there has been systematic exploration of the articulatory, acoustic, and perceptual bases of phonological categories, and some quantification of the sources of variability in the transformation of this phonological representation of speech into its acoustic manifestations. This effort has been enhanced by studies of how children acquire language in spite of this variability and by research on speech disorders. Gaps in our knowledge of this inherent variability in speech have limited the directions of applications such as synthesis and recognition of speech, and have led to the implementation of data-driven techniques rather than theoretical principles. Some examples of advances in our knowledge, and limitations of this knowledge, are reviewed.

  9. Adaptation to spectrally-rotated speech.

    PubMed

    Green, Tim; Rosen, Stuart; Faulkner, Andrew; Paterson, Ruth

    2013-08-01

    Much recent interest surrounds listeners' abilities to adapt to various transformations that distort speech. An extreme example is spectral rotation, in which the spectrum of low-pass filtered speech is inverted around a center frequency (2 kHz here). Spectral shape and its dynamics are completely altered, rendering speech virtually unintelligible initially. However, intonation, rhythm, and contrasts in periodicity and aperiodicity are largely unaffected. Four normal hearing adults underwent 6 h of training with spectrally-rotated speech using Continuous Discourse Tracking. They and an untrained control group completed pre- and post-training speech perception tests, for which talkers differed from the training talker. Significantly improved recognition of spectrally-rotated sentences was observed for trained, but not untrained, participants. However, there were no significant improvements in the identification of medial vowels in /bVd/ syllables or intervocalic consonants. Additional tests were performed with speech materials manipulated so as to isolate the contribution of various speech features. These showed that preserving intonational contrasts did not contribute to the comprehension of spectrally-rotated speech after training, and suggested that improvements involved adaptation to altered spectral shape and dynamics, rather than just learning to focus on speech features relatively unaffected by the transformation.
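
    The frequency mapping behind spectral rotation can be sketched in a few lines. The example below is not the processing chain used in the study. It assumes speech that has already been low-pass filtered and sampled at 8 kHz (an assumption), in which case flipping the sign of every other sample mirrors the spectrum so that a component at f Hz moves to 4000 - f Hz, that is, an inversion around 2 kHz. The function names are hypothetical, and a single cosine tone stands in for speech.

```python
import cmath
import math


def rotate_spectrum(samples):
    """Multiply by (-1)^n, which maps frequency f to fs/2 - f."""
    return [x if n % 2 == 0 else -x for n, x in enumerate(samples)]


def dominant_frequency(samples, fs):
    """Return the frequency (Hz) of the strongest bin of a naive DFT."""
    n_samples = len(samples)
    best_bin, best_mag = 0, -1.0
    for k in range(n_samples // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * fs / n_samples


if __name__ == "__main__":
    fs = 8000   # assumed sampling rate, so fs/2 = 4 kHz and the pivot is 2 kHz
    n = 64
    tone = [math.cos(2 * math.pi * 500 * i / fs) for i in range(n)]
    print("original peak (Hz):", dominant_frequency(tone, fs))
    print("rotated peak (Hz): ", dominant_frequency(rotate_spectrum(tone), fs))
```

    The sign flip leaves the amplitude envelope of the signal untouched while relocating its spectral content, which is consistent with the abstract's observation that rhythm and periodicity contrasts survive while spectral shape is completely altered.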

  10. Exposure to hate speech increases prejudice through desensitization.

    PubMed

    Soral, Wiktor; Bilewicz, Michał; Winiewski, Mikołaj

    2018-03-01

    In three studies (two representative nationwide surveys, N = 1,007, N = 682; and one experimental, N = 76) we explored the effects of exposure to hate speech on outgroup prejudice. Following the General Aggression Model, we suggest that frequent and repetitive exposure to hate speech leads to desensitization to this form of verbal violence and subsequently to lower evaluations of the victims and greater distancing, thus increasing outgroup prejudice. In the first survey study, we found that lower sensitivity to hate speech was a positive mediator of the relationship between frequent exposure to hate speech and outgroup prejudice. In the second study, we obtained a crucial confirmation of these effects. After desensitization training individuals were less sensitive to hate speech and more prejudiced toward hate speech victims than their counterparts in the control condition. In the final study, we replicated several previous effects and additionally found that the effects of exposure to hate speech on prejudice were mediated by a lower sensitivity to hate speech, and not by lower sensitivity to social norms. Altogether, our studies are the first to elucidate the effects of exposure to hate speech on outgroup prejudice. © 2017 Wiley Periodicals, Inc.

  11. High-frequency neural activity predicts word parsing in ambiguous speech streams.

    PubMed

    Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-12-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.

  12. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    PubMed

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct the articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we present the multimodal speech capture system (MSCS), which records and displays the kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. The collected speech modalities (tongue motion, lip gestures, and voice) are visualized not only in real time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly that of the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities with a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods, which are mostly subjective and may vary from one SLP to another.

  13. Automated recognition of helium speech. Phase I: Investigation of microprocessor based analysis/synthesis system

    NASA Astrophysics Data System (ADS)

    Jelinek, H. J.

    1986-01-01

    This is the Final Report of Electronic Design Associates on its Phase I SBIR project. The purpose of this project is to develop a method for correcting helium speech, as experienced in diver-surface communication. The goal of the Phase I study was to design, prototype, and evaluate a real-time helium speech corrector system based upon digital signal processing techniques. The general approach was to develop hardware (an IBM PC board) to digitize helium speech and software (a LAMBDA computer-based simulation) to translate the speech. As planned in the study proposal, this initial prototype may now be used to assess expected performance from a self-contained real-time system which uses an identical algorithm. The Final Report details the work carried out to produce the prototype system. The four major project tasks were: (1) a signal processing scheme for converting helium speech to normal-sounding speech was generated; (2) the signal processing scheme was simulated on a general-purpose (LAMBDA) computer, with actual helium speech supplied to the simulation and the converted speech generated; (3) an IBM PC-based 14-bit data input/output board was designed and built; and (4) a bibliography of references on speech processing was generated.

  14. Heterosexuals' attitudes toward hate crimes and hate speech against gays and lesbians: old-fashioned and modern heterosexism.

    PubMed

    Cowan, Gloria; Heiple, Becky; Marquez, Carolyn; Khatchadourian, Désirée; McNevin, Michelle

    2005-01-01

    Modern racism and sexism have been studied to examine the different ways that prejudice can be expressed; yet, little attention has been given to modern heterosexism. This study examined the extent to which modern heterosexism and old-fashioned heterosexism predict acceptance of hate crimes against gays and lesbians and perceptions of hate speech. Male (n = 74) and female (n = 95) heterosexual college students completed a survey consisting of scales that assessed modern and old-fashioned heterosexism, acceptance of violence against gays and lesbians, attitudes toward the harm of hate speech and its offensiveness, and the importance of freedom of speech. Results indicated strong negative relations between both modern and old-fashioned heterosexism and the perceived harm of hate speech. When old-fashioned heterosexism, modern heterosexism, and the importance of freedom of speech were combined to predict hate crime and hate speech attitudes, only old-fashioned heterosexism predicted acceptance of hate crimes. All three predictors contributed to the perception of the harm of hate speech. Gender differences in the role of the importance of freedom of speech in predicting attitudes toward hate crimes and hate speech are noted.

  15. High-frequency neural activity predicts word parsing in ambiguous speech streams

    PubMed Central

    Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie

    2016-01-01

    During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. PMID:27605528

  16. Ongoing slow oscillatory phase modulates speech intelligibility in cooperation with motor cortical activity.

    PubMed

    Onojima, Takayuki; Kitajo, Keiichi; Mizuhara, Hiroaki

    2017-01-01

    Neural oscillation is attracting attention as an underlying mechanism for speech recognition. Speech intelligibility is enhanced by the synchronization of speech rhythms and slow neural oscillation, which is typically observed in human scalp electroencephalography (EEG). In addition to the effect of neural oscillation, it has been proposed that speech recognition is enhanced by the identification of a speaker's motor signals, which are used for speech production. To verify the relationship between the effect of neural oscillation and motor cortical activity, we measured scalp EEG, and simultaneous EEG and functional magnetic resonance imaging (fMRI), during a speech recognition task in which participants were required to recognize spoken words embedded in noise. We proposed an index to quantitatively evaluate the EEG phase effect on behavioral performance. The results showed that the delta and theta EEG phase before speech inputs modulated the participant's response time when conducting speech recognition tasks. The simultaneous EEG-fMRI experiment showed that slow EEG activity was correlated with motor cortical activity. These results suggested that the effect of the slow oscillatory phase was associated with the activity of the motor cortex during speech recognition.

  17. Did you or I say pretty, rude or brief? An ERP study of the effects of speaker's identity on emotional word processing.

    PubMed

    Pinheiro, Ana P; Rezaii, Neguine; Nestor, Paul G; Rauber, Andréia; Spencer, Kevin M; Niznikiewicz, Margaret

    2016-02-01

    During speech comprehension, multiple cues need to be integrated at a millisecond speed, including semantic information, as well as voice identity and affect cues. A processing advantage has been demonstrated for self-related stimuli when compared with non-self stimuli, and for emotional relative to neutral stimuli. However, very few studies investigated self-other speech discrimination and, in particular, how emotional valence and voice identity interactively modulate speech processing. In the present study we probed how the processing of words' semantic valence is modulated by speaker's identity (self vs. non-self voice). Sixteen healthy subjects listened to 420 prerecorded adjectives differing in voice identity (self vs. non-self) and semantic valence (neutral, positive and negative), while electroencephalographic data were recorded. Participants were instructed to decide whether the speech they heard was their own (self-speech condition), someone else's (non-self speech), or if they were unsure. The ERP results demonstrated interactive effects of speaker's identity and emotional valence on both early (N1, P2) and late (Late Positive Potential - LPP) processing stages: compared with non-self speech, self-speech with neutral valence elicited more negative N1 amplitude, self-speech with positive valence elicited more positive P2 amplitude, and self-speech with both positive and negative valence elicited more positive LPP. ERP differences between self and non-self speech occurred in spite of similar accuracy in the recognition of both types of stimuli. Together, these findings suggest that emotion and speaker's identity interact during speech processing, in line with observations of partially dependent processing of speech and speaker information. Copyright © 2016. Published by Elsevier Inc.

  18. Evaluation of speech outcomes using English version of the Speech Handicap Index in a cohort of head and neck cancer patients.

    PubMed

    Dwivedi, Raghav C; St Rose, Suzanne; Chisholm, Edward J; Bisase, Brian; Amen, Furrat; Nutting, Christopher M; Clarke, Peter M; Kerawala, Cyrus J; Rhys-Evans, Peter H; Harrington, Kevin J; Kazi, Rehan

    2012-06-01

    The aim of this study was to explore post-treatment speech impairments using the English version of the Speech Handicap Index (SHI), the first speech-specific questionnaire, in a cohort of oral cavity (OC) and oropharyngeal (OP) cancer patients. Sixty-three consecutive OC and OP cancer patients in follow-up participated in this study. Descriptive analyses are presented as percentages, while the Mann-Whitney U-test and Kruskal-Wallis test were used for the quantitative variables. Statistical Package for Social Science-15 statistical software (SPSS Inc., Chicago, IL) was used for the statistical analyses. Over a third (36.1%) of patients reported their speech as either average or bad. Speech intelligibility and articulation were the main speech concerns for 58.8% and 52.9% of OC and 31.6% and 34.2% of OP cancer patients, respectively. Feeling incompetent and being less outgoing were the speech-related psychosocial concerns for 64.7% and 23.5% of OC and 15.8% and 18.4% of OP cancer patients, respectively. Worse speech outcomes were noted for oral tongue and base of tongue cancers vs. tonsillar cancers, with mean (SD) values of 56.7 (31.3) and 52.0 (38.4) vs. 10.9 (14.8) (P<0.001), and for late vs. early T stage cancers, 65.0 (29.9) vs. 29.3 (32.7) (P<0.005). The English version of the SHI is a reliable, valid and useful tool for the evaluation of speech in head and neck cancer (HNC) patients. Over one-third of OC and OP cancer patients reported speech problems in their day-to-day life. Advanced T-stage tumors affecting the oral tongue or base of tongue are particularly associated with poor speech outcomes. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. The relationship between the neural computations for speech and music perception is context-dependent: an activation likelihood estimate study.

    PubMed

    LaCroix, Arianna N; Diaz, Alvaro F; Rogalsky, Corianne

    2015-01-01

    The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music.

  20. The eye as a window to the listening brain: neural correlates of pupil size as a measure of cognitive listening load.

    PubMed

    Zekveld, Adriana A; Heslenfeld, Dirk J; Johnsrude, Ingrid S; Versfeld, Niek J; Kramer, Sophia E

    2014-11-01

    An important aspect of hearing is the degree to which listeners have to deploy effort to understand speech. One promising measure of listening effort is task-evoked pupil dilation. Here, we use functional magnetic resonance imaging (fMRI) to identify the neural correlates of pupil dilation during comprehension of degraded spoken sentences in 17 normal-hearing listeners. Subjects listened to sentences degraded in three different ways: the target female speech was masked by fluctuating noise, by speech from a single male speaker, or the target speech was noise-vocoded. The degree of degradation was individually adapted such that 50% or 84% of the sentences were intelligible. Control conditions included clear speech in quiet, and silent trials. The peak pupil dilation was larger for the 50% compared to the 84% intelligibility condition, and largest for speech masked by the single-talker masker, followed by speech masked by fluctuating noise, and smallest for noise-vocoded speech. Activation in the bilateral superior temporal gyrus (STG) showed the same pattern, with most extensive activation for speech masked by the single-talker masker. Larger peak pupil dilation was associated with more activation in the bilateral STG, bilateral ventral and dorsal anterior cingulate cortex and several frontal brain areas. A subset of the temporal region sensitive to pupil dilation was also sensitive to speech intelligibility and degradation type. These results show that pupil dilation during speech perception in challenging conditions reflects both auditory and cognitive processes that are recruited to cope with degraded speech and the need to segregate target speech from interfering sounds. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Auditory-motor interactions in pediatric motor speech disorders: neurocomputational modeling of disordered development.

    PubMed

    Terband, H; Maassen, B; Guenther, F H; Brumberg, J

    2014-01-01

    Differentiating the symptom complex due to phonological-level disorders, speech delay and pediatric motor speech disorders is a controversial issue in the field of pediatric speech and language pathology. The present study investigated the developmental interaction between neurological deficits in auditory and motor processes using computational modeling with the DIVA model. In a series of computer simulations, we investigated the effect of a motor processing deficit alone (MPD), and the effect of a motor processing deficit in combination with an auditory processing deficit (MPD+APD) on the trajectory and endpoint of speech motor development in the DIVA model. Simulation results showed that a motor programming deficit predominantly leads to deterioration on the phonological level (phonemic mappings) when auditory self-monitoring is intact, and on the systemic level (systemic mapping) if auditory self-monitoring is impaired. These findings suggest a close relation between quality of auditory self-monitoring and the involvement of phonological vs. motor processes in children with pediatric motor speech disorders. It is suggested that MPD+APD might be involved in typically apraxic speech output disorders and MPD in pediatric motor speech disorders that also have a phonological component. Possibilities to verify these hypotheses using empirical data collected from human subjects are discussed. The reader will be able to: (1) identify the difficulties in studying disordered speech motor development; (2) describe the differences in speech motor characteristics between SSD and subtype CAS; (3) describe the different types of learning that occur in the sensory-motor system during babbling and early speech acquisition; (4) identify the neural control subsystems involved in speech production; (5) describe the potential role of auditory self-monitoring in developmental speech disorders. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. The development and validation of the speech quality instrument.

    PubMed

    Chen, Stephanie Y; Griffin, Brianna M; Mancuso, Dean; Shiau, Stephanie; DiMattia, Michelle; Cellum, Ilana; Harvey Boyd, Kelly; Prevoteau, Charlotte; Kohlberg, Gavriel D; Spitzer, Jaclyn B; Lalwani, Anil K

    2017-12-08

    Although speech perception tests are available to evaluate hearing, there is no standardized validated tool to quantify speech quality. The objective of this study is to develop a validated tool to measure quality of speech heard. Prospective instrument validation study of 35 normal hearing adults recruited at a tertiary referral center. Participants listened to 44 speech clips of male/female voices reciting the Rainbow Passage. Speech clips included original and manipulated excerpts capturing goal qualities such as mechanical and garbled. Listeners rated clips on a 10-point visual analog scale (VAS) of 18 characteristics (e.g. cartoonish, garbled). Skewed distribution analysis identified mean ratings in the upper and lower 2-point limits of the VAS (ratings of 8-10, 0-2, respectively); items with inconsistent responses were eliminated. The test was pruned to a final instrument of nine speech clips that clearly define qualities of interest: speech-like, male/female, cartoonish, echo-y, garbled, tinny, mechanical, rough, breathy, soothing, hoarse, like, pleasant, natural. Mean ratings were highest for original female clips (8.8) and lowest for not-speech manipulation (2.1). Factor analysis identified two subsets of characteristics: internal consistency demonstrated Cronbach's alpha of 0.95 and 0.82 per subset. Test-retest reliability of total scores was high, with an intraclass correlation coefficient of 0.76. The Speech Quality Instrument (SQI) is a concise, valid tool for assessing speech quality as an indicator for hearing performance. SQI may be a valuable outcome measure for cochlear implant recipients who, despite achieving excellent speech perception, often experience poor speech quality. 2b. Laryngoscope, 2017. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.

  3. The relationship between the neural computations for speech and music perception is context-dependent: an activation likelihood estimate study

    PubMed Central

    LaCroix, Arianna N.; Diaz, Alvaro F.; Rogalsky, Corianne

    2015-01-01

    The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music. PMID:26321976

  4. Neural Oscillations Carry Speech Rhythm through to Comprehension

    PubMed Central

    Peelle, Jonathan E.; Davis, Matthew H.

    2012-01-01

    A key feature of speech is the quasi-regular rhythmic information contained in its slow amplitude modulations. In this article we review the information conveyed by speech rhythm, and the role of ongoing brain oscillations in listeners’ processing of this content. Our starting point is the fact that speech is inherently temporal, and that rhythmic information conveyed by the amplitude envelope contains important markers for place and manner of articulation, segmental information, and speech rate. Behavioral studies demonstrate that amplitude envelope information is relied upon by listeners and plays a key role in speech intelligibility. Extending behavioral findings, data from neuroimaging – particularly electroencephalography (EEG) and magnetoencephalography (MEG) – point to phase locking by ongoing cortical oscillations to low-frequency information (~4–8 Hz) in the speech envelope. This phase modulation effectively encodes a prediction of when important events (such as stressed syllables) are likely to occur, and acts to increase sensitivity to these relevant acoustic cues. We suggest a framework through which such neural entrainment to speech rhythm can explain effects of speech rate on word and segment perception (i.e., that the perception of phonemes and words in connected speech is influenced by preceding speech rate). Neuroanatomically, acoustic amplitude modulations are processed largely bilaterally in auditory cortex, with intelligible speech resulting in differential recruitment of left-hemisphere regions. Notable among these is lateral anterior temporal cortex, which we propose functions in a domain-general fashion to support ongoing memory and integration of meaningful input. Together, the reviewed evidence suggests that low-frequency oscillations in the acoustic speech signal form the foundation of a rhythmic hierarchy supporting spoken language, mirrored by phase-locked oscillations in the human brain. PMID:22973251

  5. Auditory Neuropathy Spectrum Disorder (ANSD) (For Parents)

    MedlinePlus

    ... speech-language-pathologist, who will monitor speech and language development to make sure the child is on track. ... Speech-Language Therapy Cochlear Implants Delayed Speech or Language Development Your Child's Checkup: Newborn Hearing Evaluation in Children ...

  6. Effects of interior aircraft noise on speech intelligibility and annoyance

    NASA Technical Reports Server (NTRS)

    Pearsons, K. S.; Bennett, R. L.

    1977-01-01

    Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that above an 85 dB background level the speech intelligibility results had a minimal effect on annoyance ratings. Below this level, people rated the background as less annoying if there was adequate speech intelligibility.

  7. Age-Related Differences in Lexical Access Relate to Speech Recognition in Noise

    PubMed Central

    Carroll, Rebecca; Warzybok, Anna; Kollmeier, Birger; Ruigendijk, Esther

    2016-01-01

    Vocabulary size has been suggested as a useful measure of “verbal abilities” that correlates with speech recognition scores. Knowing more words is linked to better speech recognition. How vocabulary knowledge translates to general speech recognition mechanisms, how these mechanisms relate to offline speech recognition scores, and how they may be modulated by acoustical distortion or age, is less clear. Age-related differences in linguistic measures may predict age-related differences in speech recognition in noise performance. We hypothesized that speech recognition performance can be predicted by the efficiency of lexical access, which refers to the speed with which a given word can be searched and accessed relative to the size of the mental lexicon. We tested speech recognition in a clinical German sentence-in-noise test at two signal-to-noise ratios (SNRs), in 22 younger (18–35 years) and 22 older (60–78 years) listeners with normal hearing. We also assessed receptive vocabulary, lexical access time, verbal working memory, and hearing thresholds as measures of individual differences. Age group, SNR level, vocabulary size, and lexical access time were significant predictors of individual speech recognition scores, but working memory and hearing threshold were not. Interestingly, longer accessing times were correlated with better speech recognition scores. Hierarchical regression models for each subset of age group and SNR showed very similar patterns: the combination of vocabulary size and lexical access time contributed most to speech recognition performance; only for the younger group at the better SNR (yielding about 85% correct speech recognition) did vocabulary size alone predict performance. Our data suggest that successful speech recognition in noise is mainly modulated by the efficiency of lexical access. 
This suggests that older adults’ poorer performance in the speech recognition task may have arisen from reduced efficiency in lexical access; with an average vocabulary size similar to that of younger adults, they were still slower in lexical access. PMID:27458400

  8. Speech summer camp for treating articulation disorders in cleft palate patients.

    PubMed

    Pamplona, Carmen; Ysunza, Antonio; Patiño, Carmeluza; Ramírez, Elena; Drucker, Mónica; Mazón, Juán J

    2005-03-01

    Compensatory articulation disorder (CAD) severely affects speech intelligibility of cleft palate children. CAD must be treated with speech therapy. Children can manage articulation better when they use language in event contexts such as everyday routines. The purpose of this paper is to study and compare two modalities of speech intervention in cleft palate children with associated CAD. The first modality is a conventional approach providing speech therapy in 1-h sessions, twice a week. The second modality is a speech summer camp in which children received therapy 4 h per day, 5 days a week for a period of 3 weeks. We aimed to determine whether a speech summer camp could significantly enhance articulation in cleft palate (CP) children with CAD. Forty-five children with repaired cleft palates who exhibited CAD were studied. A matched control group of 45 children with repaired cleft palate who also exhibited CAD was identified. The patients included in the first group attended a speech summer camp for 3 weeks. The matched control subjects included in the second group received speech therapy aimed at correcting CAD twice per week in 1-h sessions. At the onset of either the summer camp or the speech therapy period, the severity of CAD was evenly distributed with non-significant differences across both groups of patients (p > 0.05). After the summer camp (3 weeks) or 12 months of speech therapy sessions at a frequency of twice per week, both groups of patients showed a significant decrease in the severity of their CAD (p < 0.05). However, when the distribution of the severity of CAD was compared at the end of the summer camp or the speech therapy period, non-significant differences were found between both groups of patients (p > 0.05). A speech summer camp is a valid and efficient method for providing speech therapy in cleft palate children with compensatory articulation disorder.

  9. Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

    PubMed

    Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

    2018-04-27

    In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Speech-Like Rhythm in a Voiced and Voiceless Orangutan Call

    PubMed Central

    Lameira, Adriano R.; Hardus, Madeleine E.; Bartlett, Adrian M.; Shumaker, Robert W.; Wich, Serge A.; Menken, Steph B. J.

    2015-01-01

    The evolutionary origins of speech remain obscure. Recently, it was proposed that speech derived from monkey facial signals which exhibit a speech-like rhythm of ∼5 open-close lip cycles per second. In monkeys, these signals may also be vocalized, offering a plausible evolutionary stepping stone towards speech. Three essential predictions remain, however, to be tested to assess this hypothesis' validity: (i) Great apes, our closest relatives, should likewise produce 5 Hz-rhythm signals, (ii) speech-like rhythm should involve calls articulatorily similar to consonants and vowels given that speech rhythm is the direct product of stringing together these two basic elements, and (iii) speech-like rhythm should be experience-based. Via cinematic analyses we demonstrate that an ex-entertainment orangutan produces two calls at a speech-like rhythm, coined “clicks” and “faux-speech.” Like voiceless consonants, clicks required no vocal fold action, but did involve independent manoeuvring over lips and tongue. In parallel to vowels, faux-speech showed harmonic and formant modulations, implying vocal fold and supralaryngeal action. This rhythm was several times faster than orangutan chewing rates, as observed in monkeys and humans. Critically, this rhythm was seven-fold faster, and contextually distinct, than any other known rhythmic calls described to date in the largest database of the orangutan repertoire ever assembled. The first two predictions advanced by this study are validated and, based on parsimony and exclusion of potential alternative explanations, initial support is given to the third prediction. Irrespective of the putative origins of these calls and underlying mechanisms, our findings demonstrate irrevocably that great apes are not respiratorily, articulatorily, or neurologically constrained for the production of consonant- and vowel-like calls at speech rhythm.
Orangutan clicks and faux-speech confirm the importance of rhythmic speech antecedents within the primate lineage, and highlight potential articulatory homologies between great ape calls and human consonants and vowels. PMID:25569211

  11. Age-Related Differences in Lexical Access Relate to Speech Recognition in Noise.

    PubMed

    Carroll, Rebecca; Warzybok, Anna; Kollmeier, Birger; Ruigendijk, Esther

    2016-01-01

    Vocabulary size has been suggested as a useful measure of "verbal abilities" that correlates with speech recognition scores. Knowing more words is linked to better speech recognition. How vocabulary knowledge translates to general speech recognition mechanisms, how these mechanisms relate to offline speech recognition scores, and how they may be modulated by acoustical distortion or age, is less clear. Age-related differences in linguistic measures may predict age-related differences in speech recognition in noise performance. We hypothesized that speech recognition performance can be predicted by the efficiency of lexical access, which refers to the speed with which a given word can be searched and accessed relative to the size of the mental lexicon. We tested speech recognition in a clinical German sentence-in-noise test at two signal-to-noise ratios (SNRs), in 22 younger (18-35 years) and 22 older (60-78 years) listeners with normal hearing. We also assessed receptive vocabulary, lexical access time, verbal working memory, and hearing thresholds as measures of individual differences. Age group, SNR level, vocabulary size, and lexical access time were significant predictors of individual speech recognition scores, but working memory and hearing threshold were not. Interestingly, longer accessing times were correlated with better speech recognition scores. Hierarchical regression models for each subset of age group and SNR showed very similar patterns: the combination of vocabulary size and lexical access time contributed most to speech recognition performance; only for the younger group at the better SNR (yielding about 85% correct speech recognition) did vocabulary size alone predict performance. Our data suggest that successful speech recognition in noise is mainly modulated by the efficiency of lexical access. 
This suggests that older adults' poorer performance in the speech recognition task may have arisen from reduced efficiency in lexical access; with an average vocabulary size similar to that of younger adults, they were still slower in lexical access.

  12. Start/End Delays of Voiced and Unvoiced Speech Signals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Herrnstein, A

    Recent experiments using low power EM-radar like sensors (e.g., GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place. Similarly the end time of a voiced speech segment can be measured. Secondly it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications, it is useful to know typical duration times of these unvoiced speech segments. A corpus, assembled earlier of spoken "Timit" words, phrases, and sentences and recorded using simultaneously measured acoustic and EM-sensor glottal signals, from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech, using the acoustic signal, and the onset (or end) of voiced speech using the EM sensor signal, the average duration times for unvoiced segments preceding onset of vocalization were found to be 300 ms, and for following segments, 500 ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal, as the onset-time marker for the voiced speech segment and end marker for the unvoiced segment. Then, by subtracting 300 ms from the onset time mark of voicing, the unvoiced speech segment start time is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven to be useful for work in our laboratory, a great deal of additional work remains to validate such data for use with general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
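
    The segment-timing rule described in this abstract can be illustrated with a minimal sketch. It assumes only the two average durations reported above (300 ms before voicing onset, 500 ms after voicing ends); the names Segment and unvoiced_segments, and the clamping of start times at zero, are hypothetical choices made for illustration and are not taken from the report.

```python
from dataclasses import dataclass
from typing import Tuple

# Average unvoiced durations reported in the abstract (milliseconds).
PRE_VOICING_MS = 300.0   # unvoiced speech preceding the onset of voicing
POST_VOICING_MS = 500.0  # unvoiced speech following the end of voicing


@dataclass
class Segment:
    """A time interval in milliseconds (hypothetical helper type)."""
    start_ms: float
    end_ms: float


def unvoiced_segments(voiced: Segment) -> Tuple[Segment, Segment]:
    """Estimate the unvoiced segments around one voiced segment.

    The voiced onset and end are assumed to come from the EM-sensed
    glottal signal. The preceding unvoiced segment starts 300 ms before
    voicing onset (clamped at 0, an assumption of this sketch) and the
    following segment ends 500 ms after voicing stops.
    """
    preceding = Segment(max(0.0, voiced.start_ms - PRE_VOICING_MS), voiced.start_ms)
    following = Segment(voiced.end_ms, voiced.end_ms + POST_VOICING_MS)
    return preceding, following


if __name__ == "__main__":
    before, after = unvoiced_segments(Segment(1000.0, 1800.0))
    print(before, after)
```

    Because the 300 ms and 500 ms figures are corpus averages from 16 male speakers, a sketch like this gives only a rough window around each voiced segment, not a per-utterance measurement.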

  13. Impaired Feedforward Control and Enhanced Feedback Control of Speech in Patients with Cerebellar Degeneration

    PubMed Central

    Agnew, Zarinah; Nagarajan, Srikantan; Houde, John; Ivry, Richard B.

    2017-01-01

    The cerebellum has been hypothesized to form a crucial part of the speech motor control network. Evidence for this comes from patients with cerebellar damage, who exhibit a variety of speech deficits, as well as imaging studies showing cerebellar activation during speech production in healthy individuals. To date, the precise role of the cerebellum in speech motor control remains unclear, as it has been implicated in both anticipatory (feedforward) and reactive (feedback) control. Here, we assess both anticipatory and reactive aspects of speech motor control, comparing the performance of patients with cerebellar degeneration and matched controls. Experiment 1 tested feedforward control by examining speech adaptation across trials in response to a consistent perturbation of auditory feedback. Experiment 2 tested feedback control, examining online corrections in response to inconsistent perturbations of auditory feedback. Both male and female patients and controls were tested. The patients were impaired in adapting their feedforward control system relative to controls, exhibiting an attenuated anticipatory response to the perturbation. In contrast, the patients produced even larger compensatory responses than controls, suggesting an increased reliance on sensory feedback to guide speech articulation in this population. Together, these results suggest that the cerebellum is crucial for maintaining accurate feedforward control of speech, but relatively uninvolved in feedback control. SIGNIFICANCE STATEMENT Speech motor control is a complex activity that is thought to rely on both predictive, feedforward control as well as reactive, feedback control. While the cerebellum has been shown to be part of the speech motor control network, its functional contribution to feedback and feedforward control remains controversial. 
Here, we use real-time auditory perturbations of speech to show that patients with cerebellar degeneration are impaired in adapting feedforward control of speech but retain the ability to make online feedback corrections; indeed, the patients show an increased sensitivity to feedback. These results indicate that the cerebellum forms a crucial part of the feedforward control system for speech but is not essential for online, feedback control. PMID:28842410

  14. The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

    PubMed

    Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

    2009-06-01

    The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p < 0.001) than with the 30 IIDR. Group mean CNC scores at 60 dB SPL, loudness ratings, and the signal to noise ratios-50 for Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.

  15. Systematic Studies of Modified Vocalization: Speech Production Changes During a Variation of Metronomic Speech in Persons Who Do and Do Not Stutter

    PubMed Central

    Davidow, Jason H.; Bothe, Anne K.; Ye, Jun

    2011-01-01

    The most common way to induce fluency using rhythm requires persons who stutter to speak one syllable or one word to each beat of a metronome, but stuttering can also be eliminated when the stimulus is of a particular duration (e.g., 1 s). The present study examined stuttering frequency, speech production changes, and speech naturalness during rhythmic speech that alternated 1 s of reading with 1 s of silence. A repeated-measures design was used to compare data obtained during a control reading condition and during rhythmic reading in 10 persons who stutter (PWS) and 10 normally fluent controls. Ratings for speech naturalness were also gathered from naïve listeners. Results showed that mean vowel duration increased significantly, and the percentage of short phonated intervals decreased significantly, for both groups from the control to the experimental condition. Mean phonated interval length increased significantly for the fluent controls. Mean speech naturalness ratings during the experimental condition were approximately 7 on a 1–9 scale (1 = highly natural; 9 = highly unnatural), and these ratings were significantly correlated with vowel duration and phonated intervals for PWS. The findings indicate that PWS may be altering vocal fold vibration duration to obtain fluency during this rhythmic speech style, and that vocal fold vibration duration may have an impact on speech naturalness during rhythmic speech. Future investigations should examine speech production changes and speech naturalness during variations of this rhythmic condition. 
Educational Objectives The reader will be able to: (1) describe changes (from a control reading condition) in speech production variables when alternating between 1 s of reading and 1 s of silence, (2) describe which rhythmic conditions have been found to sound and feel the most natural, (3) describe methodological issues for studies about alterations in speech production variables during fluency-inducing conditions, and (4) describe which fluency-inducing conditions have been shown to involve a reduction in short phonated intervals. PMID:21664528

  16. Brainstem Encoding of Aided Speech in Hearing Aid Users with Cochlear Dead Region(s).

    PubMed

    Hassaan, Mohammad Ramadan; Ibraheem, Ola Abdallah; Galhom, Dalia Helal

    2016-07-01

    Neural encoding of speech begins with the analysis of the signal as a whole broken down into its sinusoidal components in the cochlea, which has to be conserved up to the higher auditory centers. Some of these components target the dead regions of the cochlea causing little or no excitation. Measuring aided speech-evoked auditory brainstem response elicited by speech stimuli with different spectral maxima can give insight into the brainstem encoding of aided speech with spectral maxima at these dead regions. This research aims to study the impact of dead regions of the cochlea on speech processing at the brainstem level after a long period of hearing aid use. This study comprised 30 ears without dead regions and 46 ears with dead regions at low, mid, or high frequencies. For all ears, we measured the aided speech-evoked auditory brainstem response using speech stimuli of low, mid, and high spectral maxima. Aided speech-evoked auditory brainstem response was producible in all subjects. Responses evoked by stimuli with spectral maxima at dead regions had longer latencies and smaller amplitudes when compared with the control group or the responses of other stimuli. The presence of cochlear dead regions affects brainstem encoding of speech with spectral maxima perpendicular to these regions. Brainstem neuroplasticity and the extrinsic redundancy of speech can minimize the impact of dead regions in chronic hearing aid users.

  17. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    PubMed Central

    Preston, Jonathan L.; Seki, Ayumi

    2012-01-01

    Purpose The purposes are to (1) describe the assessment of residual speech sound disorders (SSD) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations, and (2) describe how assessment of domains such as speech motor control and phonological awareness can provide a more complete understanding of SSDs in bilinguals. Method A review of Japanese phonology is provided to offer a context for understanding the transfer of Japanese to English productions. A case study of an 11-year-old is presented, demonstrating parallel speech assessments in English and Japanese. Speech motor and phonological awareness tasks were conducted in both languages. Results Several patterns were observed in the participant’s English that could be plausibly explained by the influence of Japanese phonology. However, errors indicating a residual SSD were observed in both Japanese and English. A speech motor assessment suggested possible speech motor control problems, and phonological awareness was judged to be within the typical range of performance in both languages. Conclusion Understanding the phonological characteristics of L1 can help clinicians recognize speech patterns in L2 associated with transfer. Once these differences are understood, patterns associated with a residual SSD can be identified. Supplementing a relational speech analysis with measures of speech motor control and phonological awareness can provide a more comprehensive understanding of a client’s strengths and needs. PMID:21386046

  18. A positron emission tomography study of the neural basis of informational and energetic masking effects in speech perception

    NASA Astrophysics Data System (ADS)

    Scott, Sophie K.; Rosen, Stuart; Wickham, Lindsay; Wise, Richard J. S.

    2004-02-01

    Positron emission tomography (PET) was used to investigate the neural basis of the comprehension of speech in unmodulated noise ("energetic" masking, dominated by effects at the auditory periphery), and when presented with another speaker ("informational" masking, dominated by more central effects). Each type of signal was presented at four different signal-to-noise ratios (SNRs) (+3, 0, -3, -6 dB for the speech-in-speech, +6, +3, 0, -3 dB for the speech-in-noise), with listeners instructed to listen for meaning to the target speaker. Consistent with behavioral studies, there was SNR-dependent activation associated with the comprehension of speech in noise, with no SNR-dependent activity for the comprehension of speech-in-speech (at low or negative SNRs). There was, in addition, activation in bilateral superior temporal gyri which was associated with the informational masking condition. The extent to which this activation of classical "speech" areas of the temporal lobes might delineate the neural basis of the informational masking is considered, as is the relationship of these findings to the interfering effects of unattended speech and sound on more explicit working memory tasks. This study is a novel demonstration of candidate neural systems involved in the perception of speech in noisy environments, and of the processing of multiple speakers in the dorso-lateral temporal lobes.

  19. Speech recognition in one- and two-talker maskers in school-age children and adults: Development of perceptual masking and glimpsing

    PubMed Central

    Buss, Emily; Leibold, Lori J.; Porter, Heather L.; Grose, John H.

    2017-01-01

    Children perform more poorly than adults on a wide range of masked speech perception paradigms, but this effect is particularly pronounced when the masker itself is also composed of speech. The present study evaluated two factors that might contribute to this effect: the ability to perceptually isolate the target from masker speech, and the ability to recognize target speech based on sparse cues (glimpsing). Speech reception thresholds (SRTs) were estimated for closed-set, disyllabic word recognition in children (5–16 years) and adults in a one- or two-talker masker. Speech maskers were 60 dB sound pressure level (SPL), and they were either presented alone or in combination with a 50-dB-SPL speech-shaped noise masker. There was an age effect overall, but performance was adult-like at a younger age for the one-talker than the two-talker masker. Noise tended to elevate SRTs, particularly for older children and adults, and when summed with the one-talker masker. Removing time-frequency epochs associated with a poor target-to-masker ratio markedly improved SRTs, with larger effects for younger listeners; the age effect was not eliminated, however. Results were interpreted as indicating that development of speech-in-speech recognition is likely impacted by development of both perceptual masking and the ability to recognize speech based on sparse cues. PMID:28464682

  20. Differentiating primary progressive aphasias in a brief sample of connected speech

    PubMed Central

    Evans, Emily; O'Shea, Jessica; Powers, John; Boller, Ashley; Weinberg, Danielle; Haley, Jenna; McMillan, Corey; Irwin, David J.; Rascovsky, Katya; Grossman, Murray

    2013-01-01

    Objective: A brief speech expression protocol that can be administered and scored without special training would aid in the differential diagnosis of the 3 principal forms of primary progressive aphasia (PPA): nonfluent/agrammatic PPA, logopenic variant PPA, and semantic variant PPA. Methods: We used a picture-description task to elicit a short speech sample, and we evaluated impairments in speech-sound production, speech rate, lexical retrieval, and grammaticality. We compared the results with those obtained by a longer, previously validated protocol and further validated performance with multimodal imaging to assess the neuroanatomical basis of the deficits. Results: We found different patterns of impaired grammar in each PPA variant, and additional language production features were impaired in each: nonfluent/agrammatic PPA was characterized by speech-sound errors; logopenic variant PPA by dysfluencies (false starts and hesitations); and semantic variant PPA by poor retrieval of nouns. Strong correlations were found between this brief speech sample and a lengthier narrative speech sample. A composite measure of grammaticality and other measures of speech production were correlated with distinct regions of gray matter atrophy and reduced white matter fractional anisotropy in each PPA variant. Conclusions: These findings provide evidence that large-scale networks are required for fluent, grammatical expression; that these networks can be selectively disrupted in PPA syndromes; and that quantitative analysis of a brief speech sample can reveal the corresponding distinct speech characteristics. PMID:23794681

  1. Speech and language support: How physicians can identify and treat speech and language delays in the office setting

    PubMed Central

    Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

    2014-01-01

    Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. The tool aimed to help physicians achieve three main goals: early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society’s Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children’s speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation. PMID:24627648

  2. Role of Visual Speech in Phonological Processing by Children With Hearing Loss

    PubMed Central

    Jerger, Susan; Tye-Murray, Nancy; Abdi, Hervé

    2011-01-01

    Purpose This research assessed the influence of visual speech on phonological processing by children with hearing loss (HL). Method Children with HL and children with normal hearing (NH) named pictures while attempting to ignore auditory or audiovisual speech distractors whose onsets relative to the pictures were either congruent, conflicting in place of articulation, or conflicting in voicing—for example, the picture “pizza” coupled with the distractors “peach,” “teacher,” or “beast,” respectively. Speed of picture naming was measured. Results The conflicting conditions slowed naming, and phonological processing by children with HL displayed the age-related shift in sensitivity to visual speech seen in children with NH, although with developmental delay. Younger children with HL exhibited a disproportionately large influence of visual speech and a negligible influence of auditory speech, whereas older children with HL showed a robust influence of auditory speech with no benefit to performance from adding visual speech. The congruent conditions did not speed naming in children with HL, nor did the addition of visual speech influence performance. Unexpectedly, the /∧/-vowel congruent distractors slowed naming in children with HL and decreased articulatory proficiency. Conclusions Results for the conflicting conditions are consistent with the hypothesis that speech representations in children with HL (a) are initially disproportionally structured in terms of visual speech and (b) become better specified with age in terms of auditorily encoded information. PMID:19339701

  3. The speech perception skills of children with and without speech sound disorder.

    PubMed

    Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

    To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Understanding the abstract role of speech in communication at 12 months.

    PubMed

    Martin, Alia; Onishi, Kristine H; Vouloumanos, Athena

    2012-04-01

    Adult humans recognize that even unfamiliar speech can communicate information between third parties, demonstrating an ability to separate communicative function from linguistic content. We examined whether 12-month-old infants understand that speech can communicate before they understand the meanings of specific words. Specifically, we test the understanding that speech permits the transfer of information about a Communicator's target object to a Recipient. Initially, the Communicator selectively grasped one of two objects. In test, the Communicator could no longer reach the objects. She then turned to the Recipient and produced speech (a nonsense word) or non-speech (coughing). Infants looked longer when the Recipient selected the non-target than the target object when the Communicator had produced speech but not coughing (Experiment 1). Looking time patterns differed from the speech condition when the Recipient rather than the Communicator produced the speech (Experiment 2), and when the Communicator produced a positive emotional vocalization (Experiment 3), but did not differ when the Recipient had previously received information about the target by watching the Communicator's selective grasping (Experiment 4). Thus infants understand the information-transferring properties of speech and recognize some of the conditions under which others' information states can be updated. These results suggest that infants possess an abstract understanding of the communicative function of speech, providing an important potential mechanism for language and knowledge acquisition. Copyright © 2011 Elsevier B.V. All rights reserved.

  5. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar

    PubMed Central

    Shin, Young Hoon; Seo, Jiwon

    2016-01-01

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker’s vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing. PMID:27801867

  6. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar.

    PubMed

    Shin, Young Hoon; Seo, Jiwon

    2016-10-29

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker's vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.

  7. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: Introduction

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: The goal of this article is to introduce the pause marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech (CAS) from speech delay.

  8. Children with a cochlear implant: characteristics and determinants of speech recognition, speech-recognition growth rate, and speech production.

    PubMed

    Wie, Ona Bø; Falkenberg, Eva-Signe; Tvete, Ole; Tomblin, Bruce

    2007-05-01

    The objectives of the study were to describe the characteristics of the first 79 prelingually deaf cochlear implant users in Norway and to investigate to what degree the variation in speech recognition, speech-recognition growth rate, and speech production could be explained by the characteristics of the child, the cochlear implant, the family, and the educational setting. Data gathered longitudinally were analysed using descriptive statistics, multiple regression, and growth-curve analysis. The results show that more than 50% of the variation could be explained by these characteristics. Daily user-time, non-verbal intelligence, mode of communication, length of CI experience, and educational placement had the highest effect on the outcome. The results also indicate that children educated in a bilingual approach to education have better speech perception and faster speech perception growth rate with increased focus on spoken language.

  9. An Analysis of The Parameters Used In Speech ABR Assessment Protocols.

    PubMed

    Sanfins, Milaine D; Hatzopoulos, Stavros; Donadon, Caroline; Diniz, Thais A; Borges, Leticia R; Skarzynski, Piotr H; Colella-Santos, Maria Francisca

    2018-04-01

    The aim of this study was to assess the parameters of choice, such as duration, intensity, rate, polarity, number of sweeps, window length, stimulated ear, fundamental frequency, first formant, and second formant, from previously published speech ABR studies. To identify candidate articles, five databases were assessed using the following keyword descriptors: speech ABR, ABR-speech, speech auditory brainstem response, auditory evoked potential to speech, speech-evoked brainstem response, and complex sounds. The search identified 1288 articles published between 2005 and 2015. After filtering the total number of papers according to the inclusion and exclusion criteria, 21 studies were selected. Analyzing the protocol details used in 21 studies suggested that there is no consensus to date on a speech-ABR protocol and that the parameters of analysis used are quite variable between studies. This inhibits the wider generalization and extrapolation of data across languages and studies.

  10. The persuasiveness of synthetic speech versus human speech.

    PubMed

    Stern, S E; Mullennix, J W; Dyson, C; Wilson, S J

    1999-12-01

    Is computer-synthesized speech as persuasive as the human voice when presenting an argument? After completing an attitude pretest, 193 participants were randomly assigned to listen to a persuasive appeal under three conditions: a high-quality synthesized speech system (DECtalk Express), a low-quality synthesized speech system (Monologue), and a tape recording of a human voice. Following the appeal, participants completed a posttest attitude survey and a series of questionnaires designed to assess perceptions of speech qualities, perceptions of the speaker, and perceptions of the message. The human voice was generally perceived more favorably than the computer-synthesized voice, and the speaker was perceived more favorably when the voice was a human voice than when it was computer synthesized. There was, however, no evidence that computerized speech, as compared with the human voice, affected persuasion or perceptions of the message. Actual or potential applications of this research include issues that should be considered when designing synthetic speech systems.

  11. On the importance of early reflections for speech in rooms.

    PubMed

    Bradley, J S; Sato, H; Picard, M

    2003-06-01

    This paper presents the results of new studies based on speech intelligibility tests in simulated sound fields and analyses of impulse response measurements in rooms used for speech communication. The speech intelligibility test results confirm the importance of early reflections for achieving good conditions for speech in rooms. The addition of early reflections increased the effective signal-to-noise ratio and related speech intelligibility scores for both impaired and nonimpaired listeners. The new results also show that for common conditions where the direct sound is reduced, it is only possible to understand speech because of the presence of early reflections. Analyses of measured impulse responses in rooms intended for speech show that early reflections can increase the effective signal-to-noise ratio by up to 9 dB. A room acoustics computer model is used to demonstrate that the relative importance of early reflections can be influenced by the room acoustics design.

  12. Lip-read me now, hear me better later: cross-modal transfer of talker-familiarity effects.

    PubMed

    Rosenblum, Lawrence D; Miller, Rachel M; Sanchez, Kauyumari

    2007-05-01

    There is evidence that for both auditory and visual speech perception, familiarity with the talker facilitates speech recognition. Explanations of these effects have concentrated on the retention of talker information specific to each of these modalities. It could be, however, that some amodal, talker-specific articulatory-style information facilitates speech perception in both modalities. If this is true, then experience with a talker in one modality should facilitate perception of speech from that talker in the other modality. In a test of this prediction, subjects were given about 1 hr of experience lipreading a talker and were then asked to recover speech in noise from either this same talker or a different talker. Results revealed that subjects who lip-read and heard speech from the same talker performed better on the speech-in-noise task than did subjects who lip-read from one talker and then heard speech from a different talker.

  13. Perceptual analysis of speech following traumatic brain injury in childhood.

    PubMed

    Cahill, Louise M; Murdoch, Bruce E; Theodoros, Deborah G

    2002-05-01

    To investigate perceptually the speech dimensions, oromotor function, and speech intelligibility of a group of individuals with traumatic brain injury (TBI) acquired in childhood. The speech of 24 children with TBI was analysed perceptually and compared with that of a group of non-neurologically impaired children matched for age and sex. The 16 dysarthric TBI subjects were significantly less intelligible than the control subjects, and demonstrated significant impairment in 12 of the 33 speech dimensions rated. In addition, the eight non-dysarthric TBI subjects were significantly impaired in many areas of oromotor function on the Frenchay Dysarthria Assessment, indicating some degree of pre-clinical speech impairment. The results of the perceptual analysis are discussed in terms of the possible underlying pathophysiological bases of the deviant speech features identified, and the need for a comprehensive instrumental assessment, to more accurately determine the level of breakdown in the speech production mechanism in children following TBI.

  14. Inner Speech: Development, Cognitive Functions, Phenomenology, and Neurobiology

    PubMed Central

    2015-01-01

    Inner speech—also known as covert speech or verbal thinking—has been implicated in theories of cognitive development, speech monitoring, executive function, and psychopathology. Despite a growing body of knowledge on its phenomenology, development, and function, approaches to the scientific study of inner speech have remained diffuse and largely unintegrated. This review examines prominent theoretical approaches to inner speech and methodological challenges in its study, before reviewing current evidence on inner speech in children and adults from both typical and atypical populations. We conclude by considering prospects for an integrated cognitive science of inner speech, and present a multicomponent model of the phenomenon informed by developmental, cognitive, and psycholinguistic considerations. Despite its variability among individuals and across the life span, inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes. PMID:26011789

  15. Model-Based Speech Signal Coding Using Optimized Temporal Decomposition for Storage and Broadcasting Applications

    NASA Astrophysics Data System (ADS)

    Athaudage, Chandranath R. N.; Bradley, Alan B.; Lech, Margaret

    2003-12-01

    A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech and its application to low-rate speech coding in storage and broadcasting is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, the event localization was performed based on a spectral stability criterion. Although this approach gave reasonably good results, there was no assurance on the optimality of the event locations. In the present work, we have optimized the event localizing task using a dynamic programming-based optimization strategy. Simulation results show that an improved TD model accuracy can be achieved. A methodology of incorporating the optimized TD algorithm within the standard MELP speech coder for the efficient compression of speech spectral information is also presented. The performance evaluation results revealed that the proposed speech coding scheme achieves 50%-60% compression of speech spectral information with negligible degradation in the decoded speech quality.

  16. Speech in spinocerebellar ataxia.

    PubMed

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Attitudes toward speech disorders: sampling the views of Cantonese-speaking Americans.

    PubMed

    Bebout, L; Arthur, B

    1997-01-01

    Speech-language pathologists who serve clients from cultural backgrounds that are not familiar to them may encounter culturally influenced attitudinal differences. A questionnaire with statements about 4 speech disorders (dysfluency, cleft palate, speech of the deaf, and misarticulations) was given to a focus group of Chinese Americans and a comparison group of non-Chinese Americans. The focus group was much more likely to believe that persons with speech disorders could improve their own speech by "trying hard," was somewhat more likely to say that people who use deaf speech and people with cleft palates might be "emotionally disturbed," and generally more likely to view deaf speech as a limitation. The comparison group was more pessimistic about stuttering children's acceptance by their peers than was the focus group. The two subject groups agreed about other items, such as the likelihood that older children with articulation problems are "less intelligent" than their peers.

  18. Central Presbycusis: A Review and Evaluation of the Evidence

    PubMed Central

    Humes, Larry E.; Dubno, Judy R.; Gordon-Salant, Sandra; Lister, Jennifer J.; Cacace, Anthony T.; Cruickshanks, Karen J.; Gates, George A.; Wilson, Richard H.; Wingfield, Arthur

    2018-01-01

    Background: The authors reviewed the evidence regarding the existence of age-related declines in central auditory processes and the consequences of any such declines for everyday communication. Purpose: This report summarizes the review process and presents its findings. Data Collection and Analysis: The authors reviewed 165 articles germane to central presbycusis. Of the 165 articles, 132 articles with a focus on human behavioral measures for either speech or nonspeech stimuli were selected for further analysis. Results: For 76 smaller-scale studies of speech understanding in older adults reviewed, the following findings emerged: (1) the three most commonly studied behavioral measures were speech in competition, temporally distorted speech, and binaural speech perception (especially dichotic listening); (2) for speech in competition and temporally degraded speech, hearing loss proved to have a significant negative effect on performance in most of the laboratory studies; (3) significant negative effects of age, unconfounded by hearing loss, were observed in most of the studies of speech in competing speech, time-compressed speech, and binaural speech perception; and (4) the influence of cognitive processing on speech understanding has been examined much less frequently, but when included, significant positive associations with speech understanding were observed. For 36 smaller-scale studies of the perception of nonspeech stimuli by older adults reviewed, the following findings emerged: (1) the three most frequently studied behavioral measures were gap detection, temporal discrimination, and temporal-order discrimination or identification; (2) hearing loss was seldom a significant factor; and (3) negative effects of age were almost always observed. For 18 studies reviewed that made use of test batteries and medium-to-large sample sizes, the following findings emerged: (1) all studies included speech-based measures of auditory processing; (2) 4 of the 18 studies included nonspeech stimuli; (3) for the speech-based measures, monaural speech in a competing-speech background, dichotic speech, and monaural time-compressed speech were investigated most frequently; (4) the most frequently used tests were the Synthetic Sentence Identification (SSI) test with Ipsilateral Competing Message (ICM), the Dichotic Sentence Identification (DSI) test, and time-compressed speech; (5) many of these studies using speech-based measures reported significant effects of age, but most of these studies were confounded by declines in hearing, cognition, or both; (6) for nonspeech auditory-processing measures, the focus was on measures of temporal processing in all four studies; (7) effects of cognition on nonspeech measures of auditory processing have been studied less frequently, with mixed results, whereas the effects of hearing loss on performance were minimal due to judicious selection of stimuli; and (8) there is a paucity of observational studies using test batteries and longitudinal designs. Conclusions: Based on this review of the scientific literature, there is insufficient evidence to confirm the existence of central presbycusis as an isolated entity. On the other hand, recent evidence has been accumulating in support of the existence of central presbycusis as a multifactorial condition that involves age- and/or disease-related changes in the auditory system and in the brain. Moreover, there is a clear need for additional research in this area. PMID:22967738

  19. Speech Research: A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications, 1 October-31 December 1971.

    ERIC Educational Resources Information Center

    Turney, Michael T.; And Others

    This report on speech research contains papers describing experiments involving both information processing and speech production. The papers concerned with information processing cover such topics as peripheral and central processes in vision, separate speech and nonspeech processing in dichotic listening, and dichotic fusion along an acoustic…

  20. When Infants Talk, Infants Listen: Pre-Babbling Infants Prefer Listening to Speech with Infant Vocal Properties

    ERIC Educational Resources Information Center

    Masapollo, Matthew; Polka, Linda; Ménard, Lucie

    2016-01-01

    To learn to produce speech, infants must effectively monitor and assess their own speech output. Yet very little is known about how infants perceive speech produced by an infant, which has higher voice pitch and formant frequencies compared to adult or child speech. Here, we tested whether pre-babbling infants (at 4-6 months) prefer listening to…

  1. The design of a device for hearer and feeler differentiation, part A. [speech modulated hearing device]

    NASA Technical Reports Server (NTRS)

    Creecy, R.

    1974-01-01

    A speech modulated white noise device is reported that gives the rhythmic characteristics of a speech signal for intelligible reception by deaf persons. The signal is composed of random amplitudes and frequencies as modulated by the speech envelope characteristics of rhythm and stress. Time intensity parameters of speech are conveyed through the vibro-tactile sensation stimuli.

  2. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    ERIC Educational Resources Information Center

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  3. Pragmatic Analyses of Martin Luther King (Jr)'s Speech: "I Have a Dream"--An Introspective Prognosis

    ERIC Educational Resources Information Center

    Josiah, Ubong E.; Oghenerho, Gift

    2015-01-01

    This paper investigates the speech of Martin Luther King (Jr.) titled: "I Have a Dream", presented in 1963 at the Lincoln Memorial. This speech is selected for use because it involves a speaker and an audience who belong to a particular speech community. The speech is about the failed promises by the Americans whose dream advocate…

  4. Why Should Speech Rate (Tempo) Be Integrated into Pronunciation Teaching Curriculum

    ERIC Educational Resources Information Center

    Yurtbasi, Meti

    2015-01-01

    The pace of speech i.e. tempo can be varied to our mood of the moment. Fast speech can convey urgency, whereas slower speech can be used for emphasis. In public speaking, orators produce powerful effects by varying the loudness and pace of their speech. The juxtaposition of very loud and very quiet utterances is a device often used by those trying…

  5. Age-Related Differences in Speech Rate Perception Do Not Necessarily Entail Age-Related Differences in Speech Rate Use

    ERIC Educational Resources Information Center

    Heffner, Christopher C.; Newman, Rochelle S.; Dilley, Laura C.; Idsardi, William J.

    2015-01-01

    Purpose: A new literature has suggested that speech rate can influence the parsing of words quite strongly in speech. The purpose of this study was to investigate differences between younger adults and older adults in the use of context speech rate in word segmentation, given that older adults perceive timing information differently from younger…

  6. The Cortical Organization of Speech Processing: Feedback Control and Predictive Coding the Context of a Dual-Stream Model

    ERIC Educational Resources Information Center

    Hickok, Gregory

    2012-01-01

    Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the…

  7. Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

    NASA Astrophysics Data System (ADS)

    Samardzic, Nikolina

    The effectiveness of in-vehicle speech communication can be a good indicator of the perception of the overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in the literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle to the vehicle interior sound, specifically their effect on speech intelligibility was quantified, in the framework of the newly developed speech intelligibility evaluation method. Lastly, as an example of the significance of speech intelligibility evaluation in the context of an applicable listening environment, as indicated in this research, it was found that the jury test participants required on average an approximate 3 dB increase in sound pressure level of speech material while driving and listening compared to when just listening, for an equivalent speech intelligibility performance and the same listening task.

  8. The Timing and Effort of Lexical Access in Natural and Degraded Speech

    PubMed Central

    Wagner, Anita E.; Toffanin, Paolo; Başkent, Deniz

    2016-01-01

    Understanding speech is effortless in ideal situations, and although adverse conditions, such as caused by hearing impairment, often render it an effortful task, they do not necessarily suspend speech comprehension. A prime example of this is speech perception by cochlear implant users, whose hearing prostheses transmit speech as a significantly degraded signal. It is yet unknown how mechanisms of speech processing deal with such degraded signals, and whether they are affected by effortful processing of speech. This paper compares the automatic process of lexical competition between natural and degraded speech, and combines gaze fixations, which capture the course of lexical disambiguation, with pupillometry, which quantifies the mental effort involved in processing speech. Listeners’ ocular responses were recorded during disambiguation of lexical embeddings with matching and mismatching durational cues. Durational cues were selected due to their substantial role in listeners’ quick limitation of the number of lexical candidates for lexical access in natural speech. Results showed that lexical competition increased mental effort in processing natural stimuli in particular in presence of mismatching cues. Signal degradation reduced listeners’ ability to quickly integrate durational cues in lexical selection, and delayed and prolonged lexical competition. The effort of processing degraded speech was increased overall, and because it had its sources at the pre-lexical level this effect can be attributed to listening to degraded speech rather than to lexical disambiguation. In sum, the course of lexical competition was largely comparable for natural and degraded speech, but showed crucial shifts in timing, and different sources of increased mental effort. We argue that well-timed progress of information from sensory to pre-lexical and lexical stages of processing, which is the result of perceptual adaptation during speech development, is the reason why in ideal situations speech is perceived as an undemanding task. Degradation of the signal or the receiver channel can quickly bring this well-adjusted timing out of balance and lead to increase in mental effort. Incomplete and effortful processing at the early pre-lexical stages has its consequences on lexical processing as it adds uncertainty to the forming and revising of lexical hypotheses. PMID:27065901

  9. Extensions to the Speech Disorders Classification System (SDCS)

    PubMed Central

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for pediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three subtypes of motor speech disorders. Part II describes the Madison Speech Assessment Protocol (MSAP), an approximately two-hour battery of 25 measures that includes 15 speech tests and tasks. Part III describes the Competence, Precision, and Stability Analytics (CPSA) framework, a current set of approximately 90 perceptual- and acoustic-based indices of speech, prosody, and voice used to quantify and classify subtypes of Speech Sound Disorders (SSD). A companion paper, Shriberg, Fourakis, et al. (2010) provides reliability estimates for the perceptual and acoustic data reduction methods used in the SDCS. The agreement estimates in the companion paper support the reliability of SDCS methods and illustrate the complementary roles of perceptual and acoustic methods in diagnostic analyses of SSD of unknown origin. Examples of research using the extensions to the SDCS described in the present report include diagnostic findings for a sample of youth with motor speech disorders associated with galactosemia (Shriberg, Potter, & Strand, 2010) and a test of the hypothesis of apraxia of speech in a group of children with autism spectrum disorders (Shriberg, Paul, Black, & van Santen, 2010). All SDCS methods and reference databases running in the PEPPER (Programs to Examine Phonetic and Phonologic Evaluation Records; [Shriberg, Allen, McSweeny, & Wilson, 2001]) environment will be disseminated without cost when complete. PMID:20831378

  10. A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.

    PubMed

    Magnotti, John F; Beauchamp, Michael S

    2017-02-01

    Audiovisual speech integration combines information from auditory speech (talker's voice) and visual speech (talker's mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same source, a process known as causal inference. A well-known illusion, the McGurk Effect, consists of incongruent audiovisual syllables, such as auditory "ba" + visual "ga" (AbaVga), that are integrated to produce a fused percept ("da"). This illusion raises two fundamental questions: first, given the incongruence between the auditory and visual syllables in the McGurk stimulus, why are they integrated; and second, why does the McGurk effect not occur for other, very similar syllables (e.g., AgaVba). We describe a simplified model of causal inference in multisensory speech perception (CIMS) that predicts the perception of arbitrary combinations of auditory and visual speech. We applied this model to behavioral data collected from 60 subjects perceiving both McGurk and non-McGurk incongruent speech stimuli. The CIMS model successfully predicted both the audiovisual integration observed for McGurk stimuli and the lack of integration observed for non-McGurk stimuli. An identical model without causal inference failed to accurately predict perception for either form of incongruent speech. The CIMS model uses causal inference to provide a computational framework for studying how the brain performs one of its most important tasks, integrating auditory and visual speech cues to allow us to communicate with others.

  11. Self-Administered Computer Therapy for Apraxia of Speech: Two-Period Randomized Control Trial With Crossover.

    PubMed

    Varley, Rosemary; Cowell, Patricia E; Dyson, Lucy; Inglis, Lesley; Roper, Abigail; Whiteside, Sandra P

    2016-03-01

    There is currently little evidence on effective interventions for poststroke apraxia of speech. We report outcomes of a trial of self-administered computer therapy for apraxia of speech. Effects of speech intervention on naming and repetition of treated and untreated words were compared with those of a visuospatial sham program. The study used a parallel-group, 2-period, crossover design, with participants receiving 2 interventions. Fifty participants with chronic and stable apraxia of speech were randomly allocated to 1 of 2 order conditions: speech-first condition versus sham-first condition. Period 1 design was equivalent to a randomized controlled trial. We report results for this period and profile the effect of the period 2 crossover. Period 1 results revealed significant improvement in naming and repetition only in the speech-first group. The sham-first group displayed improvement in speech production after speech intervention in period 2. Significant improvement of treated words was found in both naming and repetition, with little generalization to structurally similar and dissimilar untreated words. Speech gains were largely maintained after withdrawal of intervention. There was a significant relationship between treatment dose and response. However, average self-administered dose was modest for both groups. Future software design would benefit from incorporation of social and gaming components to boost motivation. Single-word production can be improved in chronic apraxia of speech with behavioral intervention. Self-administered computerized therapy is a promising method for delivering high-intensity speech/language rehabilitation. URL: http://orcid.org/0000-0002-1278-0601. Unique identifier: ISRCTN88245643. © 2016 American Heart Association, Inc.

  12. Use of amplitude modulation cues recovered from frequency modulation for cochlear implant users when original speech cues are severely degraded.

    PubMed

    Won, Jong Ho; Shim, Hyun Joon; Lorenzi, Christian; Rubinstein, Jay T

    2014-06-01

    Won et al. (J Acoust Soc Am 132:1113-1119, 2012) reported that cochlear implant (CI) speech processors generate amplitude-modulation (AM) cues recovered from broadband speech frequency modulation (FM) and that CI users can use these cues for speech identification in quiet. The present study was designed to extend this finding for a wide range of listening conditions, where the original speech cues were severely degraded by manipulating either the acoustic signals or the speech processor. The manipulation of the acoustic signals included the presentation of background noise, simulation of reverberation, and amplitude compression. The manipulation of the speech processor included changing the input dynamic range and the number of channels. For each of these conditions, multiple levels of speech degradation were tested. Speech identification was measured for CI users and compared for stimuli having both AM and FM information (intact condition) or FM information only (FM condition). Each manipulation degraded speech identification performance for both intact and FM conditions. Performance for the intact and FM conditions became similar for stimuli having the most severe degradations. Identification performance generally overlapped for the intact and FM conditions. Moreover, identification performance for the FM condition was better than chance performance even at the maximum level of distortion. Finally, significant correlations were found between speech identification scores for the intact and FM conditions. Altogether, these results suggest that despite poor frequency selectivity, CI users can make efficient use of AM cues recovered from speech FM in difficult listening situations.

  13. Speech Deficits in Serious Mental Illness: A Cognitive Resource Issue?

    PubMed Central

    Cohen, Alex S.; McGovern, Jessica E.; Dinzeo, Thomas J.; Covington, Michael A.

    2014-01-01

    Speech deficits, notably those involved in psychomotor retardation, blunted affect, alogia and poverty of content of speech, are pronounced in a wide range of serious mental illnesses (e.g., schizophrenia, unipolar depression, bipolar disorders). The present project evaluated the degree to which these deficits manifest as a function of cognitive resource limitations. We examined natural speech from 52 patients meeting criteria for serious mental illnesses (i.e., severe functional deficits with a concomitant diagnosis of schizophrenia, unipolar and/or bipolar affective disorders) and 30 non-psychiatric controls using a range of objective, computer-based measures tapping speech production (“alogia”), variability (“blunted vocal affect”) and content (“poverty of content of speech”). Subjects produced natural speech during a baseline condition and while engaging in an experimentally-manipulated cognitively-effortful task. For correlational analysis, cognitive ability was measured using a standardized battery. Generally speaking, speech deficits did not differ as a function of SMI diagnosis. However, every speech production and content measure was significantly abnormal in SMI versus control groups. Speech variability measures generally did not differ between groups. For both patients and controls as a group, speech during the cognitively-effortful task was sparser and less rich in content. Relative to controls, patients were abnormal under cognitive load with respect only to average pause length. Correlations between the speech variables and cognitive ability were only significant for this same variable: average pause length. Results suggest that certain speech deficits, notably involving pause length, may manifest as a function of cognitive resource limitations. Implications for treatment, research and assessment are discussed. PMID:25464920

  14. Spatial Release From Masking in Simulated Cochlear Implant Users With and Without Access to Low-Frequency Acoustic Hearing

    PubMed Central

    Dietz, Mathias; Hohmann, Volker; Jürgens, Tim

    2015-01-01

    For normal-hearing listeners, speech intelligibility improves if speech and noise are spatially separated. While this spatial release from masking has already been quantified in normal-hearing listeners in many studies, it is less clear how spatial release from masking changes in cochlear implant listeners with and without access to low-frequency acoustic hearing. Spatial release from masking depends on differences in access to speech cues due to hearing status and hearing device. To investigate the influence of these factors on speech intelligibility, the present study measured speech reception thresholds in spatially separated speech and noise for 10 different listener types. A vocoder was used to simulate cochlear implant processing and low-frequency filtering was used to simulate residual low-frequency hearing. These forms of processing were combined to simulate cochlear implant listening, listening based on low-frequency residual hearing, and combinations thereof. Simulated cochlear implant users with additional low-frequency acoustic hearing showed better speech intelligibility in noise than simulated cochlear implant users without acoustic hearing and had access to more spatial speech cues (e.g., higher binaural squelch). Cochlear implant listener types showed higher spatial release from masking with bilateral access to low-frequency acoustic hearing than without. A binaural speech intelligibility model with normal binaural processing showed overall good agreement with measured speech reception thresholds, spatial release from masking, and spatial speech cues. This indicates that differences in speech cues available to listener types are sufficient to explain the changes of spatial release from masking across these simulated listener types. PMID:26721918
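
    As an illustration of the quantity discussed in this record, spatial release from masking is commonly expressed as the difference between the speech reception threshold (SRT) measured with speech and noise co-located and the SRT measured with them spatially separated. The sketch below assumes that convention; the function name and the threshold values are hypothetical and are not taken from the study.

```python
def spatial_release_from_masking(srt_colocated_db, srt_separated_db):
    """Return spatial release from masking (SRM) in dB.

    Assumed convention: SRM = SRT(co-located) - SRT(separated),
    so a positive value means spatial separation helped the listener.
    """
    return srt_colocated_db - srt_separated_db


# Hypothetical thresholds in dB SNR (lower SRT = better intelligibility).
srm = spatial_release_from_masking(srt_colocated_db=-2.0, srt_separated_db=-8.5)
print(f"SRM = {srm:.1f} dB")
```

    With these made-up thresholds the listener tolerates 6.5 dB more noise when speech and noise are separated.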

  15. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    PubMed

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.
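
    The reliability figures in this record are kappa statistics. As a minimal sketch of how chance-corrected agreement between two raters can be computed, the code below implements unweighted Cohen's kappa. The abstract does not say which kappa variant the study used, so the unweighted form is an assumption, and the ratings shown are invented for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical ordinal ratings for eight children (not study data).
rater_a = [1, 1, 2, 2, 3, 3, 4, 4]
rater_b = [1, 1, 2, 3, 3, 3, 4, 4]
print(round(cohen_kappa(rater_a, rater_b), 3))
```

    For ordinal scales a weighted kappa is often preferred because it penalizes near-misses less than distant disagreements; the unweighted form is shown only because it is the simplest to follow.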

  16. Content analysis of the professional journal of the College of Speech Therapists II: coming of age and growing maturity, 1946-65.

    PubMed

    Stansfield, Jois; Armstrong, Linda

    2016-07-01

    Following a content analysis of the first 10 years of the UK professional journal Speech, this study was conducted to survey the published work of the speech (and language) therapy profession in the 20 years following the unification of two separate professional bodies into the College of Speech Therapists. To understand better the development of the speech (and language) therapy profession in the UK in order to support the development of an online history of the speech and language therapy profession in the UK. The 40 issues of the professional journal of the College of Speech Therapists published between 1946 and 1965 (Speech and later Speech Pathology and Therapy) were examined using content analysis and the content compared with that of the same journal as it appeared from 1935 to the end of the Second World War (1945). Many aspects of the journal and its authored papers were retained from the earlier years, for example, the range of authors' professions, their location mainly in the UK, their number of contributions and the length of papers. Changes and developments included the balance of original to republished papers, the description and discussion of new professional issues, and an extended range of client groups/disorders. The journal and its articles reflect the growing maturity of the newly unified profession of speech therapy and give an indication both of the expanding depth of knowledge available to speech therapists and of the rapidly increasing breadth of their work over this period. © 2016 Royal College of Speech and Language Therapists.

  17. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    PubMed

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
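
    The stimulus manipulation described in this record, removing target components whose local SNR falls below a criterion, can be sketched as follows. This is a simplified illustration and not the authors' procedure: it assumes the target and the interferers are already available as matching grids of time-frequency power values, and the function name and numbers are hypothetical.

```python
import math


def discard_low_snr(target_power, masker_power, criterion_db):
    """Zero every target time-frequency cell whose local SNR is below criterion_db.

    target_power and masker_power are equally shaped lists of rows of
    non-negative power values (rows = time frames, columns = frequency bands).
    """
    kept = []
    for target_row, masker_row in zip(target_power, masker_power):
        out_row = []
        for t, m in zip(target_row, masker_row):
            if t <= 0.0:
                local_snr_db = -math.inf  # no target energy in this cell
            elif m <= 0.0:
                local_snr_db = math.inf   # no masker energy in this cell
            else:
                local_snr_db = 10.0 * math.log10(t / m)
            out_row.append(t if local_snr_db >= criterion_db else 0.0)
        kept.append(out_row)
    return kept


# Hypothetical 2 x 2 power grids (not study data).
target = [[4.0, 1.0], [0.5, 2.0]]
masker = [[1.0, 1.0], [1.0, 4.0]]
print(discard_low_snr(target, masker, criterion_db=0.0))
```

    Raising the criterion removes more of the target; the study's question is at which criterion value intelligibility starts to suffer for separated versus collocated talkers.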

  18. The influence of cochlear spectral processing on the timing and amplitude of the speech-evoked auditory brain stem response

    PubMed Central

    Nuttall, Helen E.; Moore, David R.; Barry, Johanna G.; Krumbholz, Katrin

    2015-01-01

    The speech-evoked auditory brain stem response (speech ABR) is widely considered to provide an index of the quality of neural temporal encoding in the central auditory pathway. The aim of the present study was to evaluate the extent to which the speech ABR is shaped by spectral processing in the cochlea. High-pass noise masking was used to record speech ABRs from delimited octave-wide frequency bands between 0.5 and 8 kHz in normal-hearing young adults. The latency of the frequency-delimited responses decreased from the lowest to the highest frequency band by up to 3.6 ms. The observed frequency-latency function was compatible with model predictions based on wave V of the click ABR. The frequency-delimited speech ABR amplitude was largest in the 2- to 4-kHz frequency band and decreased toward both higher and lower frequency bands despite the predominance of low-frequency energy in the speech stimulus. We argue that the frequency dependence of speech ABR latency and amplitude results from the decrease in cochlear filter width with decreasing frequency. The results suggest that the amplitude and latency of the speech ABR may reflect interindividual differences in cochlear, as well as central, processing. The high-pass noise-masking technique provides a useful tool for differentiating between peripheral and central effects on the speech ABR. It can be used for further elucidating the neural basis of the perceptual speech deficits that have been associated with individual differences in speech ABR characteristics. PMID:25787954

  19. Nonverbal oral apraxia in primary progressive aphasia and apraxia of speech.

    PubMed

    Botha, Hugo; Duffy, Joseph R; Strand, Edythe A; Machulda, Mary M; Whitwell, Jennifer L; Josephs, Keith A

    2014-05-13

    The goal of this study was to explore the prevalence of nonverbal oral apraxia (NVOA), its association with other forms of apraxia, and associated imaging findings in patients with primary progressive aphasia (PPA) and progressive apraxia of speech (PAOS). Patients with a degenerative speech or language disorder were prospectively recruited and diagnosed with a subtype of PPA or with PAOS. All patients had comprehensive speech and language examinations. Voxel-based morphometry was performed to determine whether atrophy of a specific region correlated with the presence of NVOA. Eighty-nine patients were identified, of which 34 had PAOS, 9 had agrammatic PPA, 41 had logopenic aphasia, and 5 had semantic dementia. NVOA was very common among patients with PAOS but was found in patients with PPA as well. Several patients exhibited only one of NVOA or apraxia of speech. Among patients with apraxia of speech, the severity of the apraxia of speech was predictive of NVOA, whereas ideomotor apraxia severity was predictive of the presence of NVOA in those without apraxia of speech. Bilateral atrophy of the prefrontal cortex anterior to the premotor area and supplementary motor area was associated with NVOA. Apraxia of speech, NVOA, and ideomotor apraxia are at least partially separable disorders. The association of NVOA and apraxia of speech likely results from the proximity of the area reported here and the premotor area, which has been implicated in apraxia of speech. The association of ideomotor apraxia and NVOA among patients without apraxia of speech could represent disruption of modules shared by nonverbal oral movements and limb movements.

  20. Nonverbal oral apraxia in primary progressive aphasia and apraxia of speech

    PubMed Central

    Botha, Hugo; Duffy, Joseph R.; Strand, Edythe A.; Machulda, Mary M.; Whitwell, Jennifer L.

    2014-01-01

    Objective: The goal of this study was to explore the prevalence of nonverbal oral apraxia (NVOA), its association with other forms of apraxia, and associated imaging findings in patients with primary progressive aphasia (PPA) and progressive apraxia of speech (PAOS). Methods: Patients with a degenerative speech or language disorder were prospectively recruited and diagnosed with a subtype of PPA or with PAOS. All patients had comprehensive speech and language examinations. Voxel-based morphometry was performed to determine whether atrophy of a specific region correlated with the presence of NVOA. Results: Eighty-nine patients were identified, of which 34 had PAOS, 9 had agrammatic PPA, 41 had logopenic aphasia, and 5 had semantic dementia. NVOA was very common among patients with PAOS but was found in patients with PPA as well. Several patients exhibited only one of NVOA or apraxia of speech. Among patients with apraxia of speech, the severity of the apraxia of speech was predictive of NVOA, whereas ideomotor apraxia severity was predictive of the presence of NVOA in those without apraxia of speech. Bilateral atrophy of the prefrontal cortex anterior to the premotor area and supplementary motor area was associated with NVOA. Conclusions: Apraxia of speech, NVOA, and ideomotor apraxia are at least partially separable disorders. The association of NVOA and apraxia of speech likely results from the proximity of the area reported here and the premotor area, which has been implicated in apraxia of speech. The association of ideomotor apraxia and NVOA among patients without apraxia of speech could represent disruption of modules shared by nonverbal oral movements and limb movements. PMID:24727315
