Variations in Articulatory Movement with Changes in Speech Task.
ERIC Educational Resources Information Center
Tasko, Stephen M.; McClean, Michael D.
2004-01-01
Studies of normal and disordered articulatory movement often rely on the use of short, simple speech tasks. However, the severity of speech disorders can be observed to vary markedly with task. Understanding task-related variations in articulatory kinematic behavior may allow for an improved understanding of normal and disordered speech motor…
Comparing Motor Skills in Autism Spectrum Individuals With and Without Speech Delay
Barbeau, Elise B.; Meilleur, Andrée‐Anne S.; Zeffiro, Thomas A.
2015-01-01
Movement atypicalities in speed, coordination, posture, and gait have been observed across the autism spectrum (AS), and atypicalities in coordination are more commonly observed in AS individuals without delayed speech (DSM‐IV Asperger) than in those with atypical or delayed speech onset. However, few studies have provided quantitative data to support these mostly clinical observations. Here, we compared perceptual and motor performance between 30 typically developing individuals and 39 AS individuals (21 with speech delay and 18 without speech delay) to examine the associations between limb movement control and atypical speech development. Groups were matched for age, intelligence, and sex. The experimental design included: an inspection time task, which measures visual processing speed; the Purdue Pegboard, which measures finger dexterity, bimanual performance, and hand‐eye coordination; the Annett Peg Moving Task, which measures unimanual goal‐directed arm movement; and a simple reaction time task. We used analysis of covariance to investigate group differences in task performance and linear regression models to explore potential associations between intelligence, language skills, simple reaction time, and visually guided movement performance. AS participants without speech delay performed more slowly than typical participants on the Purdue Pegboard subtests. AS participants without speech delay also showed poorer bimanual coordination than those with speech delay. Visual processing speed was slightly faster in both AS groups than in the typical group. Altogether, these results suggest that AS individuals with and without speech delay differ in visually guided and visually triggered behavior and show that early language skills are associated with slower movement in simple and complex motor tasks. Autism Res 2015, 8: 682–693. © 2015 The Authors Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research PMID:25820662
Autonomic Correlates of Speech Versus Nonspeech Tasks in Children and Adults
Arnold, Hayley S.; MacPherson, Megan K.; Smith, Anne
2015-01-01
Purpose To assess autonomic arousal associated with speech and nonspeech tasks in school-age children and young adults. Method Measures of autonomic arousal (electrodermal level, electrodermal response amplitude, blood pulse volume, and heart rate) were recorded prior to, during, and after the performance of speech and nonspeech tasks by twenty 7- to 9-year-old children and twenty 18- to 22-year-old adults. Results Across age groups, autonomic arousal was higher for speech tasks than for nonspeech tasks, based on peak electrodermal response amplitude and blood pulse volume. Children demonstrated greater relative arousal, based on heart rate and blood pulse volume, for nonspeech oral motor tasks than adults did, but showed mean arousal levels for speech tasks similar to those of adults. Children demonstrated sex differences in autonomic arousal; specifically, autonomic arousal remained high for school-age boys, but not girls, in a more complex open-ended narrative task that followed a simple sentence production task. Conclusions Speech tasks elicit greater autonomic arousal than nonspeech tasks, and children demonstrate greater autonomic arousal for nonspeech oral motor tasks than adults. Sex differences in autonomic arousal associated with speech tasks in school-age children are discussed relative to speech-language differences between boys and girls. PMID:24686989
A multimodal spectral approach to characterize rhythm in natural speech.
Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta
2016-01-01
Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.
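A minimal sketch of the multimodal coherence analysis described above: compute the magnitude-squared coherence between the slow amplitude envelopes of an EMG channel and the speech acoustics, then inspect the speech-rhythm band. The signals, sampling rate, and band edges below are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
from scipy.signal import butter, coherence, filtfilt, hilbert

def amplitude_envelope(x, fs, cutoff=10.0):
    """Hilbert envelope, low-pass filtered to keep only slow modulations."""
    env = np.abs(hilbert(x))
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

fs = 1000.0  # assumed common sampling rate after resampling both signals
t = np.arange(0, 60, 1 / fs)
emg = np.random.randn(t.size)    # stand-in for a recorded EMG channel
audio = np.random.randn(t.size)  # stand-in for the speech waveform

# long analysis windows give the frequency resolution needed below 5 Hz
f, Cxy = coherence(amplitude_envelope(emg, fs), amplitude_envelope(audio, fs),
                   fs=fs, nperseg=int(8 * fs))
band = (f >= 0.5) & (f <= 5.0)  # typical speech-rhythm range
print("peak EMG-acoustic coherence in 0.5-5 Hz:", Cxy[band].max())
```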
Neurophysiology of Speech Differences in Childhood Apraxia of Speech
Preston, Jonathan L.; Molfese, Peter J.; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia; Landi, Nicole
2014-01-01
Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016
Measuring listening effort: driving simulator vs. simple dual-task paradigm
Wu, Yu-Hsiang; Aksan, Nazan; Rizzo, Matthew; Stangl, Elizabeth; Zhang, Xuyang; Bentler, Ruth
2014-01-01
Objectives The dual-task paradigm has been widely used to measure listening effort. The primary objectives of the study were to (1) investigate the effect of hearing aid amplification and hearing aid directional technology on listening effort measured by a more complex, real-world dual-task paradigm, and (2) compare the results obtained with this paradigm to those from a simpler laboratory-style dual-task paradigm. Design The listening effort of adults with hearing impairment was measured using two dual-task paradigms, wherein participants performed a speech recognition task simultaneously with either a driving task in a simulator or a visual reaction-time task in a sound-treated booth. The speech materials and road noises for the speech recognition task were recorded in a van traveling on the highway in three hearing aid conditions: unaided, aided with omnidirectional processing (OMNI), and aided with directional processing (DIR). The change in driving task or visual reaction-time task performance across conditions quantified the change in listening effort. Results Compared to the driving-only condition, driving performance declined significantly with the addition of the speech recognition task. Although the speech recognition score was higher in the OMNI and DIR conditions than in the unaided condition, driving performance was similar across these three conditions, suggesting that listening effort was not affected by amplification and directional processing. Results from the simple dual-task paradigm showed a similar trend: hearing aid technologies improved speech recognition performance but did not affect performance in the visual reaction-time task (i.e., did not reduce listening effort). The correlation between listening effort measured using the driving paradigm and the visual reaction-time paradigm was significant. The finding that better speech recognition performance did not result in reduced listening effort for our older participants (56 to 85 years old) was inconsistent with literature evaluating younger (approximately 20 years old) adults with normal hearing. Because of this, a follow-up study was conducted, in which the visual reaction-time dual-task experiment using the same speech materials and road noises was repeated with younger adults with normal hearing. Contrary to the findings with older participants, the results indicated that the directional technology significantly improved performance in both the speech recognition and visual reaction-time tasks. Conclusions Adding a speech listening task to driving undermined driving performance. Hearing aid technologies significantly improved speech recognition while driving but did not significantly reduce listening effort. Listening effort measured by dual-task experiments using a simulated real-world driving task and a conventional laboratory-style task was generally consistent. For a given listening environment, the benefit of hearing aid technologies on listening effort measured in younger adults with normal hearing may not fully translate to older listeners with hearing impairment. PMID:25083599
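The quantity both paradigms rest on is a dual-task cost: the decline in secondary-task performance once the speech recognition task is added. A toy computation, with invented numbers rather than data from the study:

```python
# Toy dual-task cost computation. All numbers are made up for illustration.
single_task_rt = 0.45  # s, visual reaction time with no speech task
dual_task_rt = {"unaided": 0.61, "OMNI": 0.60, "DIR": 0.59}

cost = {cond: (rt - single_task_rt) / single_task_rt
        for cond, rt in dual_task_rt.items()}
print(cost)  # near-identical costs across conditions -> no measured effort benefit
```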
Abnormal Brain Dynamics Underlie Speech Production in Children with Autism Spectrum Disorder.
Pang, Elizabeth W; Valica, Tatiana; MacDonald, Matt J; Taylor, Margot J; Brian, Jessica; Lerch, Jason P; Anagnostou, Evdokia
2016-02-01
A large proportion of children with autism spectrum disorder (ASD) have speech and/or language difficulties. While a number of structural and functional neuroimaging methods have been used to explore the brain differences in ASD with regards to speech and language comprehension and production, the neurobiology of basic speech function in ASD has not been examined. Magnetoencephalography (MEG) is a neuroimaging modality with high spatial and temporal resolution that can be applied to the examination of brain dynamics underlying speech as it can capture the fast responses fundamental to this function. We acquired MEG from 21 children with high-functioning autism (mean age: 11.43 years) and 21 age- and sex-matched controls as they performed a simple oromotor task, a phoneme production task and a phonemic sequencing task. Results showed significant differences in activation magnitude and peak latencies in primary motor cortex (Brodmann Area 4), motor planning areas (BA 6), temporal sequencing and sensorimotor integration areas (BA 22/13) and executive control areas (BA 9). Our findings of significant functional brain differences between these two groups on these simple oromotor and phonemic tasks suggest that these deficits may be foundational and could underlie the language deficits seen in ASD. © 2015 The Authors Autism Research published by Wiley Periodicals, Inc. on behalf of International Society for Autism Research.
Barista: A Framework for Concurrent Speech Processing by USC-SAIL
Can, Doğan; Gibson, James; Vaz, Colin; Georgiou, Panayiotis G.; Narayanan, Shrikanth S.
2016-01-01
We present Barista, an open-source framework for concurrent speech processing based on the Kaldi speech recognition toolkit and the libcppa actor library. With Barista, we aim to provide an easy-to-use, extensible framework for constructing highly customizable concurrent (and/or distributed) networks for a variety of speech processing tasks. Each Barista network specifies a flow of data between simple actors, concurrent entities communicating by message passing, modeled after Kaldi tools. Leveraging the fast and reliable concurrency and distribution mechanisms provided by libcppa, Barista allows demanding speech processing tasks, such as real-time speech recognizers and complex training workflows, to be scheduled and executed on parallel (and/or distributed) hardware. Barista is released under the Apache License v2.0. PMID:27610047
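As a rough illustration of the actor-based dataflow Barista builds on: independent workers connected by message queues, each handling one stage of a speech pipeline. This sketches the general idea only; Barista itself is C++ built on libcppa and Kaldi, and its real API differs, so the stage functions and queue wiring below are invented for illustration.

```python
import queue
import threading

def actor(inbox, outbox, work):
    """Apply `work` to every message until a None sentinel arrives."""
    while (msg := inbox.get()) is not None:
        outbox.put(work(msg))
    outbox.put(None)  # propagate shutdown downstream

frames, feats, results = queue.Queue(), queue.Queue(), queue.Queue()
# hypothetical stage functions standing in for Kaldi-style components
threading.Thread(target=actor, args=(frames, feats, lambda x: ("feat", x))).start()
threading.Thread(target=actor, args=(feats, results, lambda x: ("decoded", x))).start()

for chunk in ["audio-0", "audio-1"]:  # stand-ins for incoming audio buffers
    frames.put(chunk)
frames.put(None)

while (out := results.get()) is not None:
    print(out)  # ('decoded', ('feat', 'audio-0')), ...
```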
The effect of compression and attention allocation on speech intelligibility
NASA Astrophysics Data System (ADS)
Choi, Sangsook; Carrell, Thomas
2003-10-01
Research investigating the effects of amplitude compression on speech intelligibility for individuals with sensorineural hearing loss has demonstrated contradictory results [Souza and Turner (1999)]. Because percent-correct measures may not be the best indicator of compression effectiveness, a combined speech intelligibility and motor coordination task was developed to provide data that may more thoroughly explain the perception of compressed speech signals. In the present study, a pursuit rotor task [Dlhopolsky (2000)] was employed along with a word identification task to measure the amount of attention required to perceive compressed and non-compressed words in noise. Monosyllabic words were mixed with speech-shaped noise at a fixed signal-to-noise ratio and compressed using a wide dynamic range compression scheme. Participants with normal hearing identified each word with or without a simultaneous pursuit-rotor task. Participants also completed the pursuit-rotor task without simultaneous word presentation. It was expected that performance on the additional motor task would reflect the effect of compression better than simple word-accuracy measures. Results were complex. For example, in some conditions an irrelevant task actually improved performance on a simultaneous listening task. This suggests there might be an optimal level of attention required for recognition of monosyllabic words.
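The abstract mentions a wide dynamic range compression scheme; below is a minimal single-band sketch of how such a compressor can work (an envelope follower plus level-dependent gain). The threshold, ratio, and time constants are illustrative assumptions, not the study's settings.

```python
import numpy as np

def wdrc(x, fs, threshold_db=-40.0, ratio=3.0, attack_ms=5.0, release_ms=50.0):
    """Very simplified single-band wide dynamic range compressor."""
    a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = np.zeros_like(x)
    level = 0.0
    for i, s in enumerate(np.abs(x)):      # envelope follower
        a = a_att if s > level else a_rel
        level = a * level + (1 - a) * s
        env[i] = level
    level_db = 20 * np.log10(np.maximum(env, 1e-8))
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)  # compress levels above threshold
    return x * 10 ** (gain_db / 20.0)

fs = 16000
t = np.arange(0, 1, 1 / fs)
word = np.sin(2 * np.pi * 440 * t) * np.linspace(0.01, 1.0, t.size)  # stand-in
compressed = wdrc(word, fs)
```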
Oral-Motor and Motor-Speech Characteristics of Children with Autism.
ERIC Educational Resources Information Center
Adams, Lynn
1998-01-01
This study compared the oral-motor and motor-speech characteristics of four young children with autism and four nonautistic children. Three tasks requiring oral motor movements, simple syllable productions, and complex syllable productions were utilized. Significant differences were found in scores on the oral-motor movements and the…
A Flexible Question-and-Answer Task for Measuring Speech Understanding.
Best, Virginia; Streeter, Timothy; Roverud, Elin; Mason, Christine R; Kidd, Gerald
2016-11-24
This report introduces a new speech task based on simple questions and answers. The task differs from a traditional sentence recall task in that it involves an element of comprehension and can be implemented in an ongoing fashion. It also contains two target items (the question and the answer) that may be associated with different voices and locations to create dynamic listening scenarios. A set of 227 questions was created, covering six broad categories (days of the week, months of the year, numbers, colors, opposites, and sizes). All questions and their one-word answers were spoken by 11 female and 11 male talkers. In this study, listeners were presented with question-answer pairs and asked to indicate whether the answer was true or false. Responses were given as simple button or key presses, which are quick to make and easy to score. Two preliminary experiments are presented that illustrate different ways of implementing the basic task. In the first experiment, question-answer pairs were presented in speech-shaped noise, and performance was compared across subjects, question categories, and time, to examine the different sources of variability. In the second experiment, sequences of question-answer pairs were presented amidst competing conversations in an ongoing, spatially dynamic listening scenario. Overall, the question-and-answer task appears to be feasible and could be implemented flexibly in a number of different ways. © The Author(s) 2016.
Mooij, Anne H; Huiskamp, Geertjan J M; Gosselaar, Peter H; Ferrier, Cyrille H
2016-02-01
Electrocorticographic (ECoG) mapping of high gamma activity induced by language tasks has been proposed as a more patient-friendly alternative to electrocortical stimulation mapping (ESM), the gold standard in pre-surgical language mapping of epilepsy patients. However, ECoG mapping often reveals more language areas than are considered critical with ESM. We investigated whether critical language areas can be identified with a listening task consisting of speech and music phrases. Nine patients with implanted subdural grid electrodes listened to an audio fragment in which music and speech alternated. We analysed ECoG power in the 65-95 Hz band and obtained task-related activity patterns in electrodes over language areas. We compared the spatial distribution of sites that discriminated between listening to speech and music to ESM results using sensitivity and specificity calculations. Our listening task of alternating speech and music phrases had a low sensitivity (0.32) but a high specificity (0.95). The high specificity indicates that this test does indeed point to areas that are critical to language processing. Our test cannot replace ESM, but this short and simple task can give a reliable indication of where to find critical language areas, better than ECoG mapping using language tasks alone. Copyright © 2015 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
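Sensitivity and specificity here are computed per electrode site against ESM as ground truth. A worked example of the arithmetic, with hypothetical electrode counts chosen only to reproduce the reported 0.32/0.95:

```python
# hypothetical electrode counts, not data from the study
tp, fn = 8, 17   # ESM-positive sites the listening task did / did not flag
tn, fp = 95, 5   # ESM-negative sites correctly / incorrectly flagged

sensitivity = tp / (tp + fn)  # 8 / 25  = 0.32
specificity = tn / (tn + fp)  # 95 / 100 = 0.95
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```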
Integrating Text-to-Speech Software into Pedagogically Sound Teaching and Learning Scenarios
ERIC Educational Resources Information Center
Rughooputh, S. D. D. V.; Santally, M. I.
2009-01-01
This paper presents a new technique for the delivery of classes--an instructional technique that will no doubt revolutionize teaching and learning, whether for on-campus, blended or online modules. It is based on the simple task of instructionally incorporating text-to-speech software embedded in the lecture slides that will simulate exactly the…
Challenges in discriminating profanity from hate speech
NASA Astrophysics Data System (ADS)
Malmasi, Shervin; Zampieri, Marcos
2018-03-01
In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best results for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, and may require features that capture a deeper understanding of the text than is possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
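A minimal scikit-learn sketch of the kind of n-gram classification pipeline described above. The paper's feature set (word n-grams, skip-grams, cluster-based representations) and its ensemble/stacking systems are richer; the corpus, labels, and single linear classifier below are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# tiny illustrative corpus; labels: 0 = clean, 1 = profanity, 2 = hate speech
texts = ["have a nice day",
         "this is damn annoying",
         "go back to where you came from"]
labels = [0, 1, 2]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["what a damn mess"]))
```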
Lee, J D; Caven, B; Haake, S; Brown, T L
2001-01-01
As computer applications for cars emerge, a speech-based interface offers an appealing alternative to the visually demanding direct manipulation interface. However, speech-based systems may pose cognitive demands that could undermine driving safety. This study used a car-following task to evaluate how a speech-based e-mail system affects drivers' response to the periodic braking of a lead vehicle. The study included 24 drivers between the ages of 18 and 24 years. A baseline condition with no e-mail system was compared with a simple and a complex e-mail system in both simple and complex driving environments. The results show a 30% (310 ms) increase in reaction time when the speech-based system is used. Subjective workload ratings and probe questions also indicate that speech-based interaction introduces a significant cognitive load, which was highest for the complex e-mail system. These data show that a speech-based interface is not a panacea that eliminates the potential distraction of in-vehicle computers. Actual or potential applications of this research include design of in-vehicle information systems and evaluation of their contributions to driver distraction.
Speech target modulates speaking induced suppression in auditory cortex
Ventura, Maria I; Nagarajan, Srikantan S; Houde, John F
2009-01-01
Background Previous magnetoencephalography (MEG) studies have demonstrated speaking-induced suppression (SIS) in the auditory cortex during vocalization tasks wherein the M100 response to a subject's own speaking is reduced compared to the response when they hear playback of their speech. Results The present MEG study investigated the effects of utterance rapidity and complexity on SIS: The greatest difference between speak and listen M100 amplitudes (i.e., most SIS) was found in the simple speech task. As the utterances became more rapid and complex, SIS was significantly reduced (p = 0.0003). Conclusion These findings are highly consistent with our model of how auditory feedback is processed during speaking, where incoming feedback is compared with an efference-copy derived prediction of expected feedback. Thus, the results provide further insights about how speech motor output is controlled, as well as the computational role of auditory cortex in transforming auditory feedback. PMID:19523234
Dunlop, William A.; Enticott, Peter G.; Rajan, Ramesh
2016-01-01
Autism Spectrum Disorder (ASD), characterized by impaired communication skills and repetitive behaviors, can also result in differences in sensory perception. Individuals with ASD often perform normally in simple auditory tasks but poorly compared to typically developed (TD) individuals on complex auditory tasks like discriminating speech from complex background noise. A common trait of individuals with ASD is hypersensitivity to auditory stimulation. No studies to our knowledge consider whether hypersensitivity to sounds is related to differences in speech-in-noise discrimination. We provide novel evidence that individuals with high-functioning ASD show poor performance compared to TD individuals in a speech-in-noise discrimination task with an attentionally demanding background noise, but not in a purely energetic noise. Further, we demonstrate in our small sample that speech-hypersensitivity does not appear to predict performance in the speech-in-noise task. The findings support the argument that an attentional deficit, rather than a perceptual deficit, affects the ability of individuals with ASD to discriminate speech from background noise. Finally, we piloted a novel questionnaire that measures difficulty hearing in noisy environments, and sensitivity to non-verbal and verbal sounds. Psychometric analysis using 128 TD participants provided novel evidence for a difference in sensitivity to non-verbal and verbal sounds, and these findings were reinforced by participants with ASD who also completed the questionnaire. The study was limited by a small and high-functioning sample of participants with ASD. Future work could test larger sample sizes and include lower-functioning ASD participants. PMID:27555814
Merrill, Anne M; Karcher, Nicole R; Cicero, David C; Becker, Theresa M; Docherty, Anna R; Kerns, John G
2017-03-01
People with schizophrenia exhibit wide-ranging cognitive deficits, including slower processing speed and decreased cognitive control. Disorganized speech symptoms, such as communication impairment, have been associated with poor cognitive control task performance (e.g., goal maintenance and working memory). Whether communication impairment is associated with poorer performance on a broader range of non-cognitive control measures is unclear. In the current study, people with schizophrenia (n = 51) and non-psychiatric controls (n = 26) completed speech interviews allowing for reliable quantitative assessment of communication impairment. Participants also completed multiple goal maintenance and working memory tasks. In addition, we examined (a) simple measures of processing speed involving highly automatic prepotent responses and (b) a non-cognitive control measure of general task performance. Schizophrenia communication impairment was significantly associated with poor performance in all cognitive domains, with the largest association found with processing speed (r_s = -0.52). Further, communication impairment was also associated with the non-cognitive control measure of poor general task performance (r_s = -0.43). In contrast, alogia, a negative speech symptom, and positive symptoms were less, if at all, related to cognitive task performance. Overall, this study suggests that communication impairment in schizophrenia may be associated with relatively generalized poor cognitive task performance. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Why are background telephone conversations distracting?
Marsh, John E; Ljung, Robert; Jahncke, Helena; MacCutcheon, Douglas; Pausch, Florian; Ball, Linden J; Vachon, François
2018-06-01
Telephone conversation is ubiquitous within the office setting. Overhearing a telephone conversation, whereby only one of the two speakers is heard, is subjectively more annoying and objectively more distracting than overhearing a full conversation. The present study sought to determine whether this "halfalogue" effect is attributable to unexpected offsets and onsets within the background speech (acoustic unexpectedness) or to the tendency to predict the unheard part of the conversation (semantic [un]predictability), and whether these effects can be shielded against through top-down cognitive control. In Experiment 1, participants performed an office-related task in quiet or in the presence of halfalogue and dialogue background speech. The irrelevant speech was either meaningful or meaningless. The halfalogue effect was only present in the meaningful-speech condition. Experiment 2 addressed whether higher task engagement could shield against the halfalogue effect by manipulating the font of the to-be-read material. Although the halfalogue effect was found with an easy-to-read font (fluent text), the use of a difficult-to-read font (disfluent text) eliminated the effect. The halfalogue effect is thus attributable to the semantic (un)predictability, not the acoustic unexpectedness, of background telephone conversation and can be prevented by simple means such as increasing the level of engagement required by the focal task. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Markett, Sebastian; Bleek, Benjamin; Reuter, Martin; Prüss, Holger; Richardt, Kirsten; Müller, Thilo; Yaruss, J Scott; Montag, Christian
2016-10-01
Idiopathic stuttering is a fluency disorder characterized by impairments during speech production. Deficits in the motor control circuits of the basal ganglia have been implicated in idiopathic stuttering but it is unclear how these impairments relate to the disorder. Previous work has indicated a possible deficiency in motor inhibition in children who stutter. To extend these findings to adults, we designed two experiments to probe executive motor control in people who stutter using manual reaction time tasks that do not rely on speech production. We used two versions of the stop-signal reaction time task, a measure for inhibitory motor control that has been shown to rely on the basal ganglia circuits. We show increased stop-signal reaction times in two independent samples of adults who stutter compared to age- and sex-matched control groups. Additional measures involved simple reaction time measurements and a task-switching task where no group difference was detected. Results indicate a deficiency in inhibitory motor control in people who stutter in a task that does not rely on overt speech production and cannot be explained by general deficits in executive control or speeded motor execution. This finding establishes the stop-signal reaction time as a possible target for future experimental and neuroimaging studies on fluency disorders and is a further step towards unraveling the contribution of motor control deficits to idiopathic stuttering. Copyright © 2016 Elsevier Ltd. All rights reserved.
Brain activity during auditory and visual phonological, spatial and simple discrimination tasks.
Salo, Emma; Rinne, Teemu; Salonen, Oili; Alho, Kimmo
2013-02-16
We used functional magnetic resonance imaging to measure human brain activity during tasks demanding selective attention to auditory or visual stimuli delivered in concurrent streams. Auditory stimuli were syllables spoken by different voices and occurring in central or peripheral space. Visual stimuli were centrally or more peripherally presented letters in darker or lighter fonts. The participants performed a phonological, spatial or "simple" (speaker-gender or font-shade) discrimination task in either modality. Within each modality, we expected a clear distinction between brain activations related to nonspatial and spatial processing, as reported in previous studies. However, within each modality, different tasks activated largely overlapping areas in modality-specific (auditory and visual) cortices, as well as in the parietal and frontal brain regions. These overlaps may be due to effects of attention common for all three tasks within each modality or interaction of processing task-relevant features and varying task-irrelevant features in the attended-modality stimuli. Nevertheless, brain activations caused by auditory and visual phonological tasks overlapped in the left mid-lateral prefrontal cortex, while those caused by the auditory and visual spatial tasks overlapped in the inferior parietal cortex. These overlapping activations reveal areas of multimodal phonological and spatial processing. There was also some evidence for intermodal attention-related interaction. Most importantly, activity in the superior temporal sulcus elicited by unattended speech sounds was attenuated during the visual phonological task in comparison with the other visual tasks. This effect might be related to suppression of processing irrelevant speech presumably distracting the phonological task involving the letters. Copyright © 2012 Elsevier B.V. All rights reserved.
Konig, Alexandra; Satt, Aharon; Sorin, Alex; Hoory, Ran; Derreumaux, Alexandre; David, Renaud; Robert, Phillippe H
2018-01-01
Various types of dementia and Mild Cognitive Impairment (MCI) are manifested as irregularities in human speech and language, which have proven to be strong predictors of disease presence and progression. Therefore, automatic speech analytics provided by a mobile application may be a useful tool in providing additional indicators for assessment and detection of early-stage dementia and MCI. 165 participants (subjects with subjective cognitive impairment (SCI), MCI patients, Alzheimer's disease (AD) and mixed dementia (MD) patients) were recorded with a mobile application while performing several short vocal cognitive tasks during a regular consultation. These tasks included verbal fluency, picture description, counting down and a free speech task. The voice recordings were processed in two steps: in the first step, vocal markers were extracted using speech signal processing techniques; in the second, the vocal markers were tested to assess their 'power' to distinguish between SCI, MCI, AD and MD. The second step included training automatic classifiers for detecting MCI and AD, based on machine learning methods, and testing the detection accuracy. The fluency and free speech tasks obtained the highest accuracy rates for classifying AD vs. MD vs. MCI vs. SCI. Using the data, we demonstrated classification accuracy as follows: SCI vs. AD = 92% accuracy; SCI vs. MD = 92% accuracy; SCI vs. MCI = 86% accuracy and MCI vs. AD = 86%. Our results indicate the potential value of vocal analytics and the use of a mobile application for accurate automatic differentiation between SCI, MCI and AD. This tool can provide the clinician with meaningful information for assessment and monitoring of people with MCI and AD based on a non-invasive, simple and low-cost method. Copyright © Bentham Science Publishers.
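A hedged sketch of the two-step scheme the abstract describes: extract simple vocal markers from each recording, then train a classifier on them. The markers below (pause ratio, energy statistics) and the random training data are illustrative stand-ins for the study's features and recordings.

```python
import numpy as np
from sklearn.svm import SVC

def vocal_markers(signal, fs, frame_ms=25):
    """Crude per-recording features: pause ratio and frame-energy statistics."""
    frame = int(fs * frame_ms / 1000)
    n = len(signal) // frame
    energies = np.array([np.sum(signal[i * frame:(i + 1) * frame] ** 2)
                         for i in range(n)])
    pauses = energies < 0.1 * np.median(energies)  # very rough pause detector
    return [pauses.mean(), energies.mean(), energies.std()]

# placeholder data: 10 "SCI" and 10 "MCI" recordings of 10 s at 16 kHz
X = np.array([vocal_markers(np.random.randn(16000 * 10), 16000)
              for _ in range(20)])
y = np.array([0] * 10 + [1] * 10)  # 0 = SCI, 1 = MCI (placeholder labels)
clf = SVC().fit(X, y)
```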
Clustered functional MRI of overt speech production.
Sörös, Peter; Sokoloff, Lisa Guttman; Bose, Arpita; McIntosh, Anthony R; Graham, Simon J; Stuss, Donald T
2006-08-01
To investigate the neural network of overt speech production, event-related fMRI was performed in 9 young healthy adult volunteers. A clustered image acquisition technique was chosen to minimize speech-related movement artifacts. Functional images were acquired during the production of oral movements and of speech of increasing complexity (isolated vowel as well as monosyllabic and trisyllabic utterances). This imaging technique and behavioral task enabled depiction of the articulo-phonologic network of speech production from the supplementary motor area at the cranial end to the red nucleus at the caudal end. Speaking a single vowel and performing simple oral movements involved very similar activation of the cortical and subcortical motor systems. More complex, polysyllabic utterances were associated with additional activation in the bilateral cerebellum, reflecting increased demand on speech motor control, and additional activation in the bilateral temporal cortex, reflecting the stronger involvement of phonologic processing.
Telephone-quality pathological speech classification using empirical mode decomposition.
Kaleem, M F; Ghoraani, B; Guergachi, A; Krishnan, S
2011-01-01
This paper presents a computationally simple and effective methodology based on empirical mode decomposition (EMD) for classification of telephone-quality normal and pathological speech signals. EMD is used to decompose continuous normal and pathological speech signals into intrinsic mode functions, which are analyzed to extract physically meaningful and unique temporal and spectral features. Using continuous speech samples from a database of 51 normal and 161 pathological speakers, which has been modified to simulate telephone-quality speech under different levels of noise, a linear classifier is used with the feature vector thus obtained to obtain a high classification accuracy, thereby demonstrating the effectiveness of the methodology. The classification accuracy reported in this paper (89.7% for signal-to-noise ratio 30 dB) is a significant improvement over previously reported results for the same task, and demonstrates the utility of our methodology for cost-effective remote voice pathology assessment over telephone channels.
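A minimal sketch of the EMD step: decompose a stand-in speech signal into intrinsic mode functions (IMFs) and derive simple per-IMF features. It assumes the third-party PyEMD package; the paper's actual temporal/spectral features and linear classifier are more elaborate.

```python
import numpy as np
from PyEMD import EMD  # third-party package (pip install EMD-signal), assumed here

fs = 8000  # telephone-quality sampling rate
t = np.arange(0, 1.0, 1 / fs)
speech = np.sin(2 * np.pi * 150 * t) + 0.3 * np.random.randn(t.size)  # stand-in

imfs = EMD().emd(speech)  # intrinsic mode functions, highest frequency first
features = []
for imf in imfs:
    energy = np.sum(imf ** 2)
    zcr = np.mean(np.diff(np.sign(imf)) != 0)  # crude per-IMF frequency cue
    features.extend([energy, zcr])
print(len(imfs), "IMFs ->", len(features), "features")
```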
Cognitive Load in Voice Therapy Carry-Over Exercises.
Iwarsson, Jenny; Morris, David Jackson; Balling, Laura Winther
2017-01-01
The cognitive load generated by online speech production may vary with the nature of the speech task. This article examines 3 speech tasks used in voice therapy carry-over exercises, in which a patient is required to adopt and automatize new voice behaviors, ultimately in daily spontaneous communication. Twelve subjects produced speech in 3 conditions: rote speech (weekdays), sentences in a set form, and semispontaneous speech. Subjects simultaneously performed a secondary visual discrimination task for which response times were measured. On completion of each speech task, subjects rated their experience on a questionnaire. Response times from the secondary, visual task were found to be shortest for the rote speech, longer for the semispontaneous speech, and longest for the sentences within the set framework. Principal components derived from the subjective ratings were found to be linked to response times on the secondary visual task. Acoustic measures reflecting fundamental frequency distribution and vocal fold compression varied across the speech tasks. The results indicate that consideration should be given to the selection of speech tasks during the process leading to automation of revised speech behavior and that self-reports may be a reliable index of cognitive load.
Electromagnetic articulography treatment for an adult with Broca's aphasia and apraxia of speech.
Katz, W F; Bharadwaj, S V; Carstens, B
1999-12-01
Electromagnetic articulography (EMA) was explored as a means of remediating [s]/[symbol in text] articulation deficits in the speech of an adult with Broca's aphasia and apraxia of speech. Over a 1-month period, the subject was provided with 2 different treatments in a counterbalanced procedure: (1) visually guided biofeedback concerning tongue-tip position and (2) a foil treatment in which a computer program delivered voicing-contrast stimuli for simple repetition. Kinematic and perceptual data suggest improvement resulting from visually guided biofeedback, both for nonspeech oral and, to a lesser extent, speech motor tasks. In contrast, the phonetic contrast treated in the foil condition showed only marginal improvement during the therapy session, with performance dropping back to baseline 10 weeks post-treatment. Although preliminary, the findings suggest that visual biofeedback concerning tongue-tip position can be used to treat nonspeech oral and (to a lesser extent) speech motor behavior in adults with Broca's aphasia and apraxia of speech.
McArdle, J J; Mari, Z; Pursley, R H; Schulz, G M; Braun, A R
2009-02-01
We investigated whether the Bereitschaftspotential (BP), an event related potential believed to reflect motor planning, would be modulated by language-related parameters prior to speech. We anticipated that articulatory complexity would produce effects on the BP distribution similar to those demonstrated for complex limb movements. We also hypothesized that lexical semantic operations would independently impact the BP. Eighteen participants performed 3 speech tasks designed to differentiate lexical semantic and articulatory contributions to the BP. EEG epochs were time-locked to the earliest source of speech movement per trial. Lip movements were assessed using EMG recordings. Doppler imaging was used to determine the onset of tongue movement during speech, providing a means of identification and elimination of potential artifact. Compared to simple repetition, complex articulations produced an anterior shift in the maximum midline BP. Tasks requiring lexical search and selection augmented these effects and independently elicited a left lateralized asymmetry in the frontal distribution. The findings indicate that the BP is significantly modulated by linguistic processing, suggesting that the premotor system might play a role in lexical access. These novel findings support the notion that the motor systems may play a significant role in the formulation of language.
Abnormal motor cortex excitability during linguistic tasks in adductor-type spasmodic dysphonia.
Suppa, A; Marsili, L; Giovannelli, F; Di Stasio, F; Rocchi, L; Upadhyay, N; Ruoppolo, G; Cincotta, M; Berardelli, A
2015-08-01
In healthy subjects (HS), transcranial magnetic stimulation (TMS) applied during 'linguistic' tasks discloses excitability changes in the dominant hemisphere primary motor cortex (M1). We investigated 'linguistic' task-related cortical excitability modulation in patients with adductor-type spasmodic dysphonia (ASD), a speech-related focal dystonia. We studied 10 ASD patients and 10 HS. Speech examination included voice cepstral analysis. We investigated the dominant/non-dominant M1 excitability at baseline, during 'linguistic' (reading aloud/silent reading/producing simple phonation) and 'non-linguistic' tasks (looking at non-letter strings/producing oral movements). Motor evoked potentials (MEPs) were recorded from the contralateral hand muscles. We measured the cortical silent period (CSP) length and tested MEPs in HS and patients performing the 'linguistic' tasks with different voice intensities. We also examined MEPs in HS and ASD during hand-related 'action-verb' observation. Patients were studied under and not-under botulinum neurotoxin-type A (BoNT-A). In HS, TMS over the dominant M1 elicited larger MEPs during 'reading aloud' than during the other 'linguistic'/'non-linguistic' tasks. Conversely, in ASD, TMS over the dominant M1 elicited increased-amplitude MEPs during 'reading aloud' and 'syllabic phonation' tasks. CSP length was shorter in ASD than in HS and remained unchanged in both groups performing 'linguistic'/'non-linguistic' tasks. In HS and ASD, 'linguistic' task-related excitability changes were present regardless of the different voice intensities. During hand-related 'action-verb' observation, MEPs decreased in HS, whereas in ASD they increased. In ASD, BoNT-A improved speech, as demonstrated by cepstral analysis and restored the TMS abnormalities. ASD reflects dominant hemisphere excitability changes related to 'linguistic' tasks; BoNT-A returns these excitability changes to normal. © 2015 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Teaching Students to Visualize: Nine Key Questions for Success
ERIC Educational Resources Information Center
Rader, Laura A.
2009-01-01
The seemingly simple task associated with formal reading instruction may be problematic for many students with speech and language delays who often enter school with meager literacy experiences (B. K. Gunn, D. C. Simmons, & E. J. Kame'enui, 1999). However, the challenges that students face may be reduced when reading instruction includes…
Working memory training to improve speech perception in noise across languages
Ingvalson, Erin M.; Dhar, Sumitrajit; Wong, Patrick C. M.; Liu, Hanjun
2015-01-01
Working memory capacity has been linked to performance on many higher cognitive tasks, including the ability to perceive speech in noise. Current efforts to train working memory have demonstrated that working memory performance can be improved, suggesting that working memory training may lead to improved speech perception in noise. A further advantage of working memory training to improve speech perception in noise is that working memory training materials are often simple, such as letters or digits, making them easily translatable across languages. The current effort tested the hypothesis that working memory training would be associated with improved speech perception in noise and that materials would easily translate across languages. Native Mandarin Chinese and native English speakers completed ten days of reversed digit span training. Reading span and speech perception in noise both significantly improved following training, whereas untrained controls showed no gains. These data suggest that working memory training may be used to improve listeners' speech perception in noise and that the materials may be quickly adapted to a wide variety of listeners. PMID:26093435
How does cognitive load influence speech perception? An encoding hypothesis.
Mitterer, Holger; Mattys, Sven L
2017-01-01
Two experiments investigated the conditions under which cognitive load exerts an effect on the acuity of speech perception. These experiments extend earlier research by using a different speech perception task (four-interval oddity task) and by implementing cognitive load through a task often thought to be modular, namely, face processing. In the cognitive-load conditions, participants were required to remember two faces presented before the speech stimuli. In Experiment 1, performance in the speech-perception task under cognitive load was not impaired in comparison to a no-load baseline condition. In Experiment 2, we modified the load condition minimally such that it required encoding of the two faces simultaneously with the speech stimuli. As a reference condition, we also used a visual search task that in earlier experiments had led to poorer speech perception. Both concurrent tasks led to decrements in the speech task. The results suggest that speech perception is affected even by loads thought to be processed modularly, and that, critically, encoding in working memory might be the locus of interference.
A CAI System for Visually Impaired Children to Improve Abilities of Orientation and Mobility
NASA Astrophysics Data System (ADS)
Yoneda, Takahiro; Kudo, Hiroaki; Minagawa, Hiroki; Ohnishi, Noboru; Matsubara, Shizuya
Some visually impaired children have difficulty with simple locomotion and need orientation and mobility training. We developed a computer-assisted instruction system that assists this training. The user receives a task via a tactile map and synthesized speech, then walks around a room according to the task. After the task ends, the system reports the deviation of the walked path from the target path via both auditory and tactile feedback, so the user can understand how well he or she walked. We describe the details of the proposed system and task, and the experimental results with three visually impaired children.
Speech fluency profile on different tasks for individuals with Parkinson's disease.
Juste, Fabiola Staróbole; Andrade, Claudia Regina Furquim de
2017-07-20
To characterize the speech fluency profile of patients with Parkinson's disease. Study participants were 40 individuals of both genders, aged 40 to 80 years, divided into 2 groups: Research Group - RG (20 individuals with a diagnosis of Parkinson's disease) and Control Group - CG (20 individuals with no communication or neurological disorders). For all participants, three speech samples involving different tasks were collected: monologue, individual reading, and automatic speech. The RG presented a significantly larger number of speech disruptions, both stuttering-like and typical dysfluencies, and a higher percentage of speech discontinuity in the monologue and individual reading tasks compared with the CG. Both groups presented a reduced number of speech disruptions (stuttering-like and typical dysfluencies) in the automatic speech task, and the groups presented similar performance in this task. Regarding speech rate, individuals in the RG produced a lower number of words and syllables per minute compared with those in the CG in all speech tasks. Participants in the RG presented altered parameters of speech fluency compared with those of the CG; however, this change in fluency cannot be considered a stuttering disorder.
ERIC Educational Resources Information Center
Treurniet, William
A study applied artificial neural networks, trained with the back-propagation learning algorithm, to modelling phonemes extracted from the DARPA TIMIT multi-speaker, continuous speech database. A number of proposed network architectures were applied to the phoneme classification task, ranging from the simple feedforward multilayer network to more…
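In the spirit of the study above, a hedged sketch of a feedforward network trained with back-propagation for frame-level phoneme classification. The feature dimension, class count, and random data are placeholders, not the study's TIMIT setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier  # back-propagation under the hood

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 26))  # e.g., 26 filterbank coefficients per frame
y = rng.integers(0, 39, 500)    # 39 phoneme classes, a common TIMIT folding

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X, y)
print("training accuracy:", net.score(X, y))
```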
The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.
De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet
2016-10-01
This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectively assessed through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were administered to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated with an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.
Psychometric Functions of Dual-Task Paradigms for Measuring Listening Effort.
Wu, Yu-Hsiang; Stangl, Elizabeth; Zhang, Xuyang; Perkins, Joanna; Eilers, Emily
The purpose of the study was to characterize the psychometric functions that describe task performance in dual-task listening effort measures as a function of signal-to-noise ratio (SNR). Younger adults with normal hearing (YNH, n = 24; experiment 1) and older adults with hearing impairment (n = 24; experiment 2) were recruited. Dual-task paradigms wherein the participants performed a primary speech recognition task simultaneously with a secondary task were conducted at a wide range of SNRs. Two different secondary tasks were used: an easy task (i.e., a simple visual reaction-time task) and a hard task (i.e., the incongruent Stroop test). The reaction time (RT) quantified the performance of the secondary task. For both participant groups and for both easy and hard secondary tasks, the curves that described the RT as a function of SNR were peak-shaped. The RT increased as SNR changed from favorable to intermediate SNRs, and then decreased as SNRs moved from intermediate to unfavorable SNRs. The RT reached its peak (longest time) at the SNRs at which the participants could understand 30 to 50% of the speech. In experiments 1 and 2, the dual-task trials that had the same SNR were conducted in one block. To determine whether the peak shape of the RT curves was specific to the blocked SNR presentation order used in these experiments, YNH participants were recruited (n = 25; experiment 3) and dual-task measures, wherein the SNR was varied from trial to trial (i.e., nonblocked), were conducted. The results indicated that, similar to the first two experiments, the RT curves had a peak shape. Secondary task performance was poorer at the intermediate SNRs than at the favorable and unfavorable SNRs. This pattern was observed for both YNH participants and older participants with hearing impairment, and was not affected by either task type (easy or hard secondary task) or SNR presentation order (blocked or nonblocked). The shorter RT at the unfavorable SNRs (speech intelligibility < 30%) possibly reflects that the participants experienced cognitive overload and/or disengaged from the listening task. The implication of using the dual-task paradigm as a listening effort measure is discussed.
Synergetic Organization in Speech Rhythm
NASA Astrophysics Data System (ADS)
Cummins, Fred
The Speech Cycling Task is a novel experimental paradigm developed together with Robert Port and Keiichi Tajima at Indiana University. In a task of this sort, subjects repeat a phrase containing multiple prominent, or stressed, syllables in time with an auditory metronome, which can be simple or complex. A phase-based collective variable is defined in the acoustic speech signal. This paper reports on two experiments using speech cycling which together reveal many of the hallmarks of hierarchically coupled oscillatory processes. The first experiment requires subjects to place the final stressed syllable of a small phrase at specified phases within the overall Phrase Repetition Cycle (PRC). It is clearly demonstrated that only three patterns, characterized by phases around 1/3, 1/2 or 2/3, are reliably produced, and these points are attractors for other target phases. The system is thus multistable, and the attractors correspond to stable couplings between the metrical foot and the PRC. A second experiment examines the behavior of these attractors at increased rates. Faster rates lead to mode jumps between attractors. Previous experiments have also illustrated hysteresis as the system moves from one mode to the next. The dynamical organization is particularly interesting from a modeling point of view, as there is no single part of the speech production system which cycles at the level of either the metrical foot or the phrase repetition cycle. That is, there is no continuous kinematic observable in the system. Nonetheless, there is strong evidence that the macroscopic behavior of the entire production system is correctly described as hierarchically coupled oscillators. There are many parallels between this organization and the forms of inter-limb coupling observed in locomotion and rhythmic manual tasks.
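The phase-based collective variable reduces to simple arithmetic: the onset of the final stressed syllable expressed as a fraction of the phrase repetition cycle. The times below are placeholders for values measured from the acoustic signal.

```python
def relative_phase(cycle_start, cycle_end, syllable_onset):
    """Onset time as a fraction of the phrase repetition cycle (0..1)."""
    return (syllable_onset - cycle_start) / (cycle_end - cycle_start)

# placeholder measurements (seconds): a 1.8 s cycle, onset at 0.6 s
print(relative_phase(0.0, 1.8, 0.6))  # ~0.33, near the 1/3 attractor
```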
Speech training alters tone frequency tuning in rat primary auditory cortex
Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Kilgard, Michael P.
2013-01-01
Previous studies in both humans and animals have documented improved performance following discrimination training. This enhanced performance is often associated with cortical response changes. In this study, we tested the hypothesis that long-term speech training on multiple tasks can improve primary auditory cortex (A1) responses compared to rats trained on a single speech discrimination task or experimentally naïve rats. Specifically, we compared the percent of A1 responding to trained sounds, the responses to both trained and untrained sounds, receptive field properties of A1 neurons, and the neural discrimination of pairs of speech sounds in speech trained and naïve rats. Speech training led to accurate discrimination of consonant and vowel sounds, but did not enhance A1 response strength or the neural discrimination of these sounds. Speech training altered tone responses in rats trained on six speech discrimination tasks but not in rats trained on a single speech discrimination task. Extensive speech training resulted in broader frequency tuning, shorter onset latencies, a decreased driven response to tones, and caused a shift in the frequency map to favor tones in the range where speech sounds are the loudest. Both the number of trained tasks and the number of days of training strongly predict the percent of A1 responding to a low frequency tone. Rats trained on a single speech discrimination task performed less accurately than rats trained on multiple tasks and did not exhibit A1 response changes. Our results indicate that extensive speech training can reorganize the A1 frequency map, which may have downstream consequences on speech sound processing. PMID:24344364
Multitasking During Degraded Speech Recognition in School-Age Children
Grieco-Calub, Tina M.; Ward, Kristina M.; Brehm, Laurel
2017-01-01
Multitasking requires individuals to allocate their cognitive resources across different tasks. The purpose of the current study was to assess school-age children’s multitasking abilities during degraded speech recognition. Children (8 to 12 years old) completed a dual-task paradigm including a sentence recognition (primary) task containing speech that was either unprocessed or noise-band vocoded with 8, 6, or 4 spectral channels and a visual monitoring (secondary) task. Children’s accuracy and reaction time on the visual monitoring task were quantified during the dual-task paradigm in each condition of the primary task and compared with single-task performance. Children experienced dual-task costs in the 6- and 4-channel conditions of the primary speech recognition task with decreased accuracy on the visual monitoring task relative to baseline performance. In all conditions, children’s dual-task performance on the visual monitoring task was strongly predicted by their single-task (baseline) performance on the task. Results suggest that children’s proficiency with the secondary task contributes to the magnitude of dual-task costs while multitasking during degraded speech recognition. PMID:28105890
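Dual-task costs of the kind reported above are commonly expressed as the proportional change from single-task (baseline) performance. A minimal sketch of that arithmetic, with hypothetical accuracy and RT values rather than the study's data:

```python
def dual_task_cost(single, dual, higher_is_better=True):
    """Percent cost of adding a second task, relative to single-task baseline."""
    change = (dual - single) / single * 100.0
    return -change if higher_is_better else change

# Hypothetical visual-monitoring scores: accuracy drops, RT slows under dual task
print(dual_task_cost(single=0.95, dual=0.82))                            # ~13.7% cost
print(dual_task_cost(single=480.0, dual=545.0, higher_is_better=False))  # ~13.5% cost
```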
[Speech-related tremor of lips: a focal task-specific tremor].
Morita, Shuhei; Takagi, Rieko; Miwa, Hideto; Kondo, Tomoyoshi
2002-04-01
We report a 66-year-old Japanese woman in whom tremor of the lips appeared during speech. Her past and family histories were unremarkable. On neurological examination, there was no abnormal finding except the lip tremor. Laboratory findings were all within normal limits. Her MRI and EEG were normal. Surface EMG studies revealed that regular grouped discharges at a frequency of about 4-5 Hz appeared in the orbicularis oris muscle only during voluntary speaking. The tremor was not observed during purposeless phonation or vocalization of a simple word, suggesting that it was not a vocal tremor but a task-specific tremor related to speaking. Administration of a beta-blocker and consumption of a small amount of alcohol effectively improved the tremor, possibly suggesting that this type of tremor is a clinical variant of essential tremor.
Speech task effects on acoustic measure of fundamental frequency in Cantonese-speaking children.
Ma, Estella P-M; Lam, Nina L-N
2015-12-01
Speaking fundamental frequency (F0) is a voice measure frequently used to document changes in vocal performance over time. Knowing the intra-subject variability of speaking F0 has implications for its clinical usefulness. The present study examined the speaking F0 elicited from three speech tasks in Cantonese-speaking children. The study also compared the variability of speaking F0 elicited from the different speech tasks. Fifty-six vocally healthy Cantonese-speaking children (31 boys and 25 girls) aged between 7;0 and 10;11 (years;months) participated. For each child, speaking F0 was elicited using speech tasks at three linguistic levels (sustained vowel /a/ prolongation, reading aloud a sentence, and reading aloud a passage). Two types of variability, within-session (trial-to-trial) and across-session (test-retest), were compared across speech tasks. Significant differences in mean speaking F0 values were found between speech tasks. The mean speaking F0 value elicited from sustained vowel phonations was significantly higher than those elicited from the connected speech tasks. The variability of speaking F0 was higher in sustained vowel prolongation than in connected speech. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
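Trial-to-trial variability of the kind compared above is often summarized as a coefficient of variation. A minimal sketch with hypothetical F0 values, not the study's data:

```python
import numpy as np

def f0_cv(trial_f0_hz):
    """Coefficient of variation (%) of mean speaking F0 across trials."""
    trials = np.asarray(trial_f0_hz, dtype=float)
    return trials.std(ddof=1) / trials.mean() * 100.0

vowel_trials = [265.0, 271.3, 258.8]    # sustained /a/, one child, hypothetical
passage_trials = [242.1, 243.5, 241.0]  # passage reading, same child, hypothetical
print(f0_cv(vowel_trials), f0_cv(passage_trials))  # vowel task more variable
```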
Vowel reduction across tasks for male speakers of American English.
Kuo, Christina; Weismer, Gary
2016-07-01
This study examined acoustic variation of vowels within speakers across speech tasks. The overarching goal of the study was to understand within-speaker variation as one index of the range of normal speech motor behavior for American English vowels. Ten male speakers of American English performed four speech tasks including citation form sentence reading with a clear-speech style (clear-speech), citation form sentence reading (citation), passage reading (reading), and conversational speech (conversation). Eight monophthong vowels in a variety of consonant contexts were studied. Clear-speech was operationally defined as the reference point for describing variation. Acoustic measures associated with the conventions of vowel targets were obtained and examined. These included temporal midpoint formant frequencies for the first three formants (F1, F2, and F3) and the derived Euclidean distances in the F1-F2 and F2-F3 planes. Results indicated that reduction toward the center of the F1-F2 and F2-F3 planes increased in magnitude across the tasks in the order of clear-speech, citation, reading, and conversation. The cross-task variation was comparable for all speakers despite fine-grained individual differences. The characteristics of systematic within-speaker acoustic variation across tasks have potential implications for the understanding of the mechanisms of speech motor control and motor speech disorders.
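The reduction measure described above has a simple geometric form. Below is a minimal sketch, with hypothetical formant values, of one way to quantify it: the Euclidean distance in the F1-F2 plane between a task token and its clear-speech reference, with clear-speech serving as the reference point as in the study.

```python
import math

def formant_distance(f1_task, f2_task, f1_clear, f2_clear):
    """Euclidean distance (Hz) in the F1-F2 plane from the clear-speech reference."""
    return math.hypot(f1_task - f1_clear, f2_task - f2_clear)

# Hypothetical /i/ tokens: reduction in conversation pulls the vowel toward
# the center of the F1-F2 plane, away from its clear-speech position
print(formant_distance(f1_task=340, f2_task=2150, f1_clear=310, f2_clear=2350))
```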
Application of advanced speech technology in manned penetration bombers
NASA Astrophysics Data System (ADS)
North, R.; Lea, W.
1982-03-01
This report documents research on the potential use of speech technology in a manned penetration bomber aircraft (B-52/G and H). The objectives of the project were to analyze the pilot/copilot crewstation tasks over a three-hour-and-forty-minute mission and determine the tasks that would benefit most from conversion to speech recognition/generation, determine the technological feasibility of each of the identified tasks, and prioritize these tasks based on these criteria. Secondary objectives of the program were to articulate research strategies for the application of speech technologies in airborne environments and to develop guidelines for briefing user commands on the potential of using speech technologies in the cockpit. The results of this study indicated that, for the B-52 crewmember, speech recognition would be most beneficial for retrieving chart and procedural data contained in the flight manuals. The feasibility assessment indicated that the checklist and procedural retrieval tasks would be highly feasible for a speech recognition system.
Stenbäck, Victoria; Hällgren, Mathias; Lyxell, Björn; Larsby, Birgitta
2015-06-01
Cognitive functions and speech-recognition-in-noise were evaluated with a cognitive test battery, assessing response inhibition using the Hayling task, working memory capacity (WMC) and verbal information processing, and an auditory test of speech recognition. The cognitive tests were performed in silence whereas the speech recognition task was presented in noise. Thirty young normally-hearing individuals participated in the study. The aim of the study was to investigate one executive function, response inhibition, and whether it is related to individual working memory capacity (WMC), and how speech-recognition-in-noise relates to WMC and inhibitory control. The results showed a significant difference between initiation and response inhibition, suggesting that the Hayling task taps cognitive activity responsible for executive control. Our findings also suggest that high verbal ability was associated with better performance in the Hayling task. We also present findings suggesting that individuals who perform well on tasks involving response inhibition, and WMC, also perform well on a speech-in-noise task. Our findings indicate that capacity to resist semantic interference can be used to predict performance on speech-in-noise tasks. © 2015 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
Effects of speech intelligibility level on concurrent visual task performance.
Payne, D G; Peters, L J; Birkmire, D P; Bonto, M A; Anastasi, J S; Wenger, M J
1994-09-01
Four experiments were performed to determine if changes in the level of speech intelligibility in an auditory task have an impact on performance in concurrent visual tasks. The auditory task used in each experiment was a memory search task in which subjects memorized a set of words and then decided whether auditorily presented probe items were members of the memorized set. The visual tasks used were an unstable tracking task, a spatial decision-making task, a mathematical reasoning task, and a probability monitoring task. Results showed that performance on the unstable tracking and probability monitoring tasks was unaffected by the level of speech intelligibility on the auditory task, whereas accuracy in the spatial decision-making and mathematical processing tasks was significantly worse at low speech intelligibility levels. The findings are interpreted within the framework of multiple resource theory.
Toward a Natural Speech Understanding System
1989-10-01
…error rates for distinctive words produced in isolation by a single speaker, and their simple programming requirements. Template-matching systems rank…
Walking while talking: Young adults flexibly allocate resources between speech and gait.
Raffegeau, Tiphanie E; Haddad, Jeffrey M; Huber, Jessica E; Rietdyk, Shirley
2018-05-26
Walking while talking is an ideal multitask behavior for assessing how young healthy adults manage concurrent tasks, as it is well practiced, cognitively demanding, and has real consequences for impaired performance in either task. Since the association between cognitive tasks and gait appears stronger when the gait task is more challenging, this study systematically manipulated gait challenge to understand how young adults accomplish the multitask behavior of walking while talking. Sixteen young adults (21 ± 1.6 years, 9 males) performed three gait tasks with and without speech: unobstructed gait (easy), obstacle crossing (moderate), and obstacle crossing while carrying a tray (difficult). Participants also provided a speech sample while seated as a baseline indicator of speech. The speech task was to speak extemporaneously about a topic (e.g., first car). Gait speed and the duration of silent pauses during speaking were determined. Silent pauses reflect cognitive processes involved in speech production and language planning. When speaking and walking without obstacles, gait speed decreased (relative to walking without speaking) but silent pause duration did not change (relative to seated speech). These changes are consistent with the idea that, in the easy gait task, participants placed greater value on speech pauses than on gait speed, likely due to the negative social consequences of impaired speech. In the moderate and difficult gait tasks, both parameters changed: gait speed decreased and silent pauses increased. Walking while talking is a cognitively demanding task for healthy young adults, despite being a well-practiced habitual activity. These findings are consistent with the integrated model of task prioritization from Yogev-Seligmann et al. [1]. Copyright © 2018 Elsevier B.V. All rights reserved.
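Silent pause duration, the speech measure used above, can be approximated from a recording's energy envelope. A minimal sketch under assumed parameters (10 ms frames, a relative energy threshold, and a 150 ms minimum pause, none of which are taken from the study):

```python
import numpy as np

def silent_pause_durations(x, sr, frame_ms=10, rel_thresh=0.05, min_pause_ms=150):
    """Durations (s) of low-energy runs lasting at least min_pause_ms."""
    frame = int(sr * frame_ms / 1000)
    n = len(x) // frame
    rms = np.sqrt(np.mean(x[:n * frame].reshape(n, frame) ** 2, axis=1))
    silent = rms < rel_thresh * rms.max()
    pauses, run = [], 0
    for is_silent in list(silent) + [False]:  # trailing False flushes the last run
        if is_silent:
            run += 1
        else:
            if run * frame_ms >= min_pause_ms:
                pauses.append(run * frame_ms / 1000.0)
            run = 0
    return pauses

# Hypothetical signal: 1 s of "speech" (noise), 0.3 s of silence, 1 s of "speech"
sr = 16000
x = np.concatenate([np.random.randn(sr), np.zeros(int(0.3 * sr)), np.random.randn(sr)])
print(silent_pause_durations(x, sr))  # ~[0.3]
```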
Clarissa Spoken Dialogue System for Procedure Reading and Navigation
NASA Technical Reports Server (NTRS)
Hieronymus, James; Dowding, John
2004-01-01
Speech is the most natural modality humans use to communicate with other people, agents, and complex systems. A spoken dialogue system must be robust to noise and able to mimic human conversational behavior, such as correcting misunderstandings, answering simple questions about the task, and understanding most well-formed inquiries or commands. The system aims to understand the meaning of the human utterance, and if it does not, it discards the utterance as being meant for someone else. The first operational system is Clarissa, a conversational procedure reader and navigator, which will be used in a System Development Test Objective (SDTO) on the International Space Station (ISS) during Expedition 10. In the present environment, one astronaut reads the procedure on a Manual Procedure Viewer (MPV) or paper and has to stop to read or turn pages, shifting focus from the task. Clarissa is designed to read and navigate ISS procedures entirely with speech, while the astronaut has his eyes and hands engaged in performing the task. The system also provides an MPV-like graphical interface so the procedure can be read visually. A demo of the system will be given.
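The accept-or-discard behavior described above can be illustrated with a toy confidence gate. This is a minimal sketch, not Clarissa's actual grammar, thresholds, or API; the command set, threshold value, and function name are all assumptions for illustration.

```python
from typing import Optional

# Hypothetical command vocabulary for a procedure reader
COMMANDS = {"next step", "previous step", "read step", "go to step"}

def route_utterance(hypothesis: str, confidence: float,
                    threshold: float = 0.6) -> Optional[str]:
    """Return a command to execute, or None if the speech is not for the system."""
    text = hypothesis.lower().strip()
    if confidence < threshold or text not in COMMANDS:
        return None  # treat low-confidence or off-grammar speech as side talk
    return text

print(route_utterance("Next step", 0.91))         # -> 'next step'
print(route_utterance("uh, hand me that", 0.34))  # -> None (discarded)
```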
Speech responses and dual-task performance - Better time-sharing or asymmetric transfer?
NASA Technical Reports Server (NTRS)
Vidulich, Michael A.
1988-01-01
The value of speech controls in a dual-task experiment that also evaluated asymmetric transfer effects is considered. There was no evidence of asymmetric transfer in spite of significant effects supporting the advantage of mixing manual and speech responses. The data suggest that speech controls can be used to enhance performance in operational multiple-task environments.
To speak or not to speak - A multiple resource perspective
NASA Technical Reports Server (NTRS)
Tsang, P. S.; Hartzell, E. J.; Rothschild, R. A.
1985-01-01
The desirability of employing speech response in a dynamic dual-task situation was discussed from a multiple resource perspective. A secondary task technique was employed to examine the time-sharing performance of five dual tasks with various degrees of resource overlap according to the structure-specific resource model of Wickens (1980). The primary task was a visual/manual tracking task which required spatial processing. The secondary task was either another tracking task or a spatial transformation task with one of four input (visual or auditory) and output (manual or speech) configurations. The results show that dual-task performance was best when the primary tracking task was paired with the visual/speech transformation task. This finding was explained by an interaction of the stimulus-central processing-response compatibility of the transformation task and the degree of resource competition between the time-shared tasks. Implications for the utility of speech response were discussed.
Speed-difficulty trade-off in speech: Chinese versus English
Sun, Yao; Latash, Elizaveta M.; Mikaelian, Irina L.
2011-01-01
This study continues the investigation of the previously described speed-difficulty trade-off in picture description tasks. In particular, we tested the hypothesis that Mandarin Chinese and American English are similar in showing logarithmic dependences between speech time and index of difficulty (ID), while they differ significantly in the amount of time needed to describe simple pictures, that this difference increases for more complex pictures, and that it is associated with a proportional difference in the number of syllables used. Subjects (eight Chinese speakers and eight English speakers) were tested in pairs. One subject (the Speaker) described simple pictures, while the other subject (the Performer) tried to reproduce the pictures based on the verbal description as quickly as possible with a set of objects. The Chinese speakers initiated speech production significantly faster than the English speakers. Speech time scaled linearly with ln(ID) in all subjects, but the regression coefficient was significantly higher in the English speakers as compared with the Chinese speakers. The number of errors was somewhat lower in the Chinese participants (not significantly). The Chinese pairs also showed a shorter delay between the initiation of speech and initiation of action by the Performer, shorter movement time by the Performer, and shorter overall performance time. The number of syllables scaled with ID, and the Chinese speakers used significantly smaller numbers of syllables. Speech rate was comparable between the two groups, about 3 syllables/s; it dropped for more complex pictures (higher ID). When asked to reproduce the same pictures without speaking, movement time scaled linearly with ln(ID); the Chinese performers were slower than the English performers. We conclude that natural languages show a speed-difficulty trade-off similar to Fitts' law; the trade-offs in movement and speech production are likely to originate at a cognitive level. The time advantage of the Chinese participants originates neither from a similarity between the simple pictures and Chinese written characters nor from sloppier performance. It is linked to using fewer syllables to transmit the same information. We suggest that natural languages may differ in informational density, defined as the amount of information transmitted by a given number of syllables. PMID:21479658
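The trade-off described above has the Fitts-like form T = a + b·ln(ID), so the key parameters fall out of a simple linear regression of speech time on ln(ID). A minimal sketch with hypothetical group means, not the study's data:

```python
import numpy as np

ids = np.array([2.0, 4.0, 8.0, 16.0])         # index of difficulty, hypothetical
speech_time = np.array([1.9, 2.8, 3.6, 4.3])  # s, hypothetical group means

# Fit T = a + b * ln(ID); polyfit returns the slope first for degree 1
slope, intercept = np.polyfit(np.log(ids), speech_time, 1)
print(f"T = {intercept:.2f} + {slope:.2f} * ln(ID)")
```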
LaCroix, Arianna N.; Diaz, Alvaro F.; Rogalsky, Corianne
2015-01-01
The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music. PMID:26321976
Buchan, Julie N; Munhall, Kevin G
2012-01-01
Audiovisual speech perception is an everyday occurrence of multisensory integration. Conflicting visual speech information can influence the perception of acoustic speech (namely the McGurk effect), and auditory and visual speech are integrated over a rather wide range of temporal offsets. This research examined whether the addition of a concurrent cognitive load task would affect the audiovisual integration in a McGurk speech task and whether the cognitive load task would cause more interference at increasing offsets. The amount of integration was measured by the proportion of responses in incongruent trials that did not correspond to the audio (McGurk response). An eye-tracker was also used to examine whether the amount of temporal offset and the presence of a concurrent cognitive load task would influence gaze behavior. Results from this experiment show a very modest but statistically significant decrease in the number of McGurk responses when subjects also performed a cognitive load task, an effect that was relatively constant across the various temporal offsets. Participants' gaze behavior was also influenced by the addition of a cognitive load task. Gaze was less centralized on the face, less time was spent looking at the mouth, and more time was spent looking at the eyes when a concurrent cognitive load task was added to the speech task.
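The integration measure described above is simply the proportion of incongruent trials whose response departs from the auditory token. A minimal sketch with hypothetical trial records:

```python
def mcgurk_proportion(trials):
    """trials: (response, audio_token) pairs from incongruent AV trials."""
    non_auditory = sum(1 for resp, audio in trials if resp != audio)
    return non_auditory / len(trials)

# Hypothetical trials: auditory /ba/ dubbed onto visual /ga/
trials = [("da", "ba"), ("ba", "ba"), ("da", "ba"), ("tha", "ba")]
print(mcgurk_proportion(trials))  # 0.75 -> strong audiovisual integration
```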
Task-dependent modulation of the visual sensory thalamus assists visual-speech recognition.
Díaz, Begoña; Blank, Helen; von Kriegstein, Katharina
2018-05-14
The cerebral cortex modulates early sensory processing via feedback connections to sensory pathway nuclei. The functions of this top-down modulation for human behavior are poorly understood. Here, we show that top-down modulation of the visual sensory thalamus (the lateral geniculate body, LGN) is involved in visual-speech recognition. In two independent functional magnetic resonance imaging (fMRI) studies, LGN response increased when participants processed fast-varying features of articulatory movements required for visual-speech recognition, as compared to temporally more stable features required for face identification with the same stimulus material. The LGN response during the visual-speech task correlated positively with visual-speech recognition scores across participants. In addition, the task-dependent modulation was present for speech movements and did not occur for control conditions involving non-speech biological movements. In face-to-face communication, visual speech recognition is used to enhance or even enable understanding of what is said. Speech recognition is commonly explained in frameworks focusing on cerebral cortex areas. Our findings suggest that task-dependent modulation at subcortical sensory stages has an important role in communication: together with similar findings in the auditory modality, they imply that task-dependent modulation of the sensory thalami is a general mechanism for optimizing speech recognition. Copyright © 2018. Published by Elsevier Inc.
Aging and the Vulnerability of Speech to Dual Task Demands
Kemper, Susan; Schmalzried, RaLynn; Hoffman, Lesa; Herman, Ruth
2010-01-01
Tracking a digital pursuit rotor task was used to measure dual task costs of language production by young and older adults. Tracking performance by both groups was affected by dual task demands: time on target declined and tracking error increased as dual task demands increased from the baseline condition to a moderately demanding dual task condition to a more demanding dual task condition. When dual task demands were moderate, older adults’ speech rate declined but their fluency, grammatical complexity, and content were unaffected. When the dual task was more demanding, older adults’ speech, like young adults’ speech, became highly fragmented, ungrammatical, and incoherent. Vocabulary, working memory, processing speed, and inhibition affected vulnerability to dual task costs: vocabulary provided some protection for sentence length and grammaticality, working memory conferred some protection for grammatical complexity, and processing speed provided some protection for speech rate, propositional density, coherence, and lexical diversity. Further, vocabulary and working memory capacity provided more protection for older adults than for young adults although the protective effect of processing speed was somewhat reduced for older adults as compared to the young adults. PMID:21186917
Relationship between Speech Production and Perception in People Who Stutter
Lu, Chunming; Long, Yuhang; Zheng, Lifen; Shi, Guang; Liu, Li; Ding, Guosheng; Howell, Peter
2016-01-01
Speech production difficulties are apparent in people who stutter (PWS). PWS also have difficulties in speech perception compared to controls. It is unclear whether the speech perception difficulties in PWS are independent of, or related to, their speech production difficulties. To investigate this issue, functional MRI data were collected on 13 PWS and 13 controls whilst the participants performed a speech production task and a speech perception task. PWS performed poorer than controls in the perception task and the poorer performance was associated with a functional activity difference in the left anterior insula (part of the speech motor area) compared to controls. PWS also showed a functional activity difference in this and the surrounding area [left inferior frontal cortex (IFC)/anterior insula] in the production task compared to controls. Conjunction analysis showed that the functional activity differences between PWS and controls in the left IFC/anterior insula coincided across the perception and production tasks. Furthermore, Granger Causality Analysis on the resting-state fMRI data of the participants showed that the causal connection from the left IFC/anterior insula to an area in the left primary auditory cortex (Heschl’s gyrus) differed significantly between PWS and controls. The strength of this connection correlated significantly with performance in the perception task. These results suggest that speech perception difficulties in PWS are associated with anomalous functional activity in the speech motor area, and the altered functional connectivity from this area to the auditory area plays a role in the speech perception difficulties of PWS. PMID:27242487
ERIC Educational Resources Information Center
Casini, Laurence; Burle, Boris; Nguyen, Noel
2009-01-01
Time is essential to speech. The duration of speech segments plays a critical role in the perceptual identification of these segments, and therefore in that of spoken words. Here, using a French word identification task, we show that vowels are perceived as shorter when attention is divided between two tasks, as compared to a single task control…
Children's perception of their synthetically corrected speech production.
Strömbergsson, Sofia; Wengelin, Asa; House, David
2014-06-01
We explore children's perception of their own speech - in its online form, in its recorded form, and in synthetically modified forms. Children with phonological disorder (PD) and children with typical speech and language development (TD) performed tasks of evaluating accuracy of the different types of speech stimuli, either immediately after having produced the utterance or after a delay. In addition, they performed a task designed to assess their ability to detect synthetic modification. Both groups showed high performance in tasks involving evaluation of other children's speech, whereas in tasks of evaluating one's own speech, the children with PD were less accurate than their TD peers. The children with PD were less sensitive to misproductions in immediate conjunction with their production of an utterance, and more accurate after a delay. Within-category modification often passed undetected, indicating a satisfactory quality of the generated speech. Potential clinical benefits of using corrective re-synthesis are discussed.
A Nonword Repetition Task for Speakers with Misarticulations: The Syllable Repetition Task (SRT)
Shriberg, Lawrence D.; Lohmeier, Heather L.; Campbell, Thomas F.; Dollaghan, Christine A.; Green, Jordan R.; Moore, Christopher A.
2010-01-01
Purpose Conceptual and methodological confounds occur when nonword (nonsense) repetition tasks are administered to speakers who do not have the target speech sounds in their phonetic inventories or who habitually misarticulate targeted speech sounds. We describe a nonword repetition task, the Syllable Repetition Task (SRT), that eliminates this confound and report findings from three validity studies. Method Ninety-five preschool children with Speech Delay and 63 with Typical Speech completed an assessment battery that included the Nonword Repetition Task (NRT; Dollaghan & Campbell, 1998) and the SRT. SRT stimuli include only four of the earliest occurring consonants and one early occurring vowel. Results Study 1 findings indicated that the SRT eliminated the speech confound in nonword testing with speakers who misarticulate. Study 2 findings indicated that the accuracy of the SRT in identifying expressive language impairment was comparable to findings for the NRT. Study 3 findings illustrated the SRT's potential to interrogate speech processing constraints underlying poor nonword repetition accuracy. Results supported both memorial and auditory-perceptual encoding constraints underlying nonword repetition errors in children with speech-language impairment. Conclusion The SRT appears to be a psychometrically stable and substantively informative nonword repetition task for emerging genetic and other research with speakers who misarticulate. PMID:19635944
Jezova, D; Hlavacova, N; Dicko, I; Solarikova, P; Brezina, I
2016-07-01
Repeated or chronic exposure to stressors is associated with changes in neuroendocrine responses depending on the type, intensity, number and frequency of stress exposures as well as previous stress experience. The aim of the study was to test the hypothesis that salivary cortisol and cardiovascular responses to real-life psychosocial stressors related to public performance can cross-adapt with responses to psychosocial stress induced by public speech in a laboratory setting. The sample consisted of 22 healthy male volunteers, who were either actors (students of dramatic arts) or non-actors (students of other fields). The stress task consisted of a 15 min anticipatory preparation phase and 15 min of public speech on an emotionally charged topic. The actors, who were accustomed to public speaking, responded with a rise in salivary cortisol as well as blood pressure to the laboratory public speech. The values of salivary cortisol, systolic blood pressure and state anxiety were lower in actors compared to non-actors. Unlike non-actors, subjects with experience in public speaking did not show a stress-induced rise in heart rate. Evaluation of personality traits revealed that actors scored significantly higher in extraversion than the subjects in the non-actor group. In conclusion, neuroendocrine responses to real-life stressors in actors can partially cross-adapt with responses to psychosocial stress in a laboratory setting. The most evident adaptation was at the level of heart rate responses. Public speech tasks may thus help to evaluate artists' ability to cope with real-life stress by simple laboratory testing.
Cognitive control components and speech symptoms in people with schizophrenia.
Becker, Theresa M; Cicero, David C; Cowan, Nelson; Kerns, John G
2012-03-30
Previous schizophrenia research suggests poor cognitive control is associated with schizophrenia speech symptoms. However, cognitive control is a broad construct. Two important cognitive control components are poor goal maintenance and poor verbal working memory storage. In the current research, people with schizophrenia (n=45) performed three cognitive tasks that varied in their goal maintenance and verbal working memory storage demands. Speech symptoms were assessed using clinical rating scales, ratings of disorganized speech from typed transcripts, and self-reported disorganization. Overall, alogia was associated with both goal maintenance and verbal working memory tasks. Objectively rated disorganized speech was associated with poor goal maintenance and with a task that included both goal maintenance and verbal working memory storage demands. In contrast, self-reported disorganization was unrelated to either amount of objectively rated disorganized speech or to cognitive control task performance, instead being associated with negative mood symptoms. Overall, our results suggest that alogia is associated with both poor goal maintenance and poor verbal working memory storage and that disorganized speech is associated with poor goal maintenance. In addition, patients' own assessment of their disorganization is related to negative mood, but perhaps not to objective disorganized speech or to cognitive control task performance. Published by Elsevier Ireland Ltd.
Bailey, Dallin J; Blomgren, Michael; DeLong, Catharine; Berggren, Kiera; Wambaugh, Julie L
2017-06-22
The purpose of this article is to quantify and describe stuttering-like disfluencies in speakers with acquired apraxia of speech (AOS), utilizing the Lidcombe Behavioural Data Language (LBDL). Additional purposes include measuring test-retest reliability and examining the effect of speech sample type on disfluency rates. Two types of speech samples were elicited from 20 persons with AOS and aphasia: repetition of mono- and multisyllabic words from a protocol for assessing AOS (Duffy, 2013), and connected speech tasks (Nicholas & Brookshire, 1993). Sampling was repeated at 1 and 4 weeks following initial sampling. Stuttering-like disfluencies were coded using the LBDL, which is a taxonomy that focuses on motoric aspects of stuttering. Disfluency rates ranged from 0% to 13.1% for the connected speech task and from 0% to 17% for the word repetition task. There was no significant effect of speech sampling time on disfluency rate in the connected speech task, but there was a significant effect of time for the word repetition task. There was no significant effect of speech sample type. Speakers demonstrated both major types of stuttering-like disfluencies as categorized by the LBDL (fixed postures and repeated movements). Connected speech samples yielded more reliable tallies over repeated measurements. Suggestions are made for modifying the LBDL for use in AOS in order to further add to systematic descriptions of motoric disfluencies in this disorder.
Whitfield, Jason A; Goberman, Alexander M
2017-06-22
Everyday communication is carried out concurrently with other tasks. Therefore, determining how dual tasks interfere with newly learned speech motor skills can offer insight into the cognitive mechanisms underlying speech motor learning in Parkinson disease (PD). The current investigation examines a recently learned speech motor sequence under dual-task conditions. A previously learned sequence of 6 monosyllabic nonwords was examined using a dual-task paradigm. Participants repeated the sequence while concurrently performing a visuomotor task, and performance on both tasks was measured in single- and dual-task conditions. The younger adult group exhibited little to no dual-task interference on the accuracy and duration of the sequence. The older adult group exhibited variability in dual-task costs, with the group as a whole exhibiting an intermediate, though significant, amount of dual-task interference. The PD group exhibited the largest degree of bidirectional dual-task interference among all the groups. These data suggest that PD affects the later stages of speech motor learning, as the dual-task condition interfered with production of the recently learned sequence beyond the effect of normal aging. Because the basal ganglia are critical for the later stages of motor sequence learning, the observed deficits may result from the underlying neural dysfunction associated with PD.
Rethinking the connection between working memory and language impairment.
Archibald, Lisa M D; Harder Griebeling, Katherine
2016-05-01
Working memory deficits have been found for children with specific language impairment (SLI) on tasks imposing increasing short-term memory load with or without an additional, consistent (and simple) processing load. The aim was to examine the processing function of working memory in children with low language (LL) by employing tasks imposing increasing processing loads with constant storage demands, individually adjusted based on each participant's short-term memory capacity. School-age groups with LL (n = 17) and typical language with either average (n = 28) or above-average nonverbal intelligence (n = 15) completed complex working memory-span tasks varying processing load while keeping storage demands constant, tasks varying storage demands while keeping processing load constant, simple storage-span tasks, and measures of language and nonverbal intelligence. Teachers completed questionnaires about cognition and learning. Significantly lower scores were found for the LL group than for either matched group on storage-based tasks, but no group differences were found on the tasks varying processing load. Teachers' ratings of oral expression and mathematics abilities discriminated those who did or did not complete the most challenging cognitive tasks. The results implicate a deficit in the phonological storage component but not in the central executive component of working memory for children with LL. Teacher ratings may reveal personality traits related to perseverance of effort in cognitive research. © 2015 Royal College of Speech and Language Therapists.
Effects of Concurrent Motor, Linguistic, or Cognitive Tasks on Speech Motor Performance
ERIC Educational Resources Information Center
Dromey, Christopher; Benson, April
2003-01-01
This study examined the influence of 3 different types of concurrent tasks on speech motor performance. The goal was to uncover potential differences in speech movements relating to the nature of the secondary task. Twenty young adults repeated sentences either with or without simultaneous distractor activities. These distractions included a motor…
Autonomic Correlates of Speech versus Nonspeech Tasks in Children and Adults
ERIC Educational Resources Information Center
Arnold, Hayley S.; MacPherson, Megan K.; Smith, Anne
2014-01-01
Purpose: To assess autonomic arousal associated with speech and nonspeech tasks in school-age children and young adults. Method: Measures of autonomic arousal (electrodermal level, electrodermal response amplitude, blood pulse volume, and heart rate) were recorded prior to, during, and after the performance of speech and nonspeech tasks by twenty…
Audiovisual speech perception development at varying levels of perceptual processing
Lalonde, Kaylah; Holt, Rachael Frush
2016-01-01
This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children. PMID:27106318
Oral Motor Abilities Are Task Dependent: A Factor Analytic Approach to Performance Rate.
Staiger, Anja; Schölderle, Theresa; Brendel, Bettina; Bötzel, Kai; Ziegler, Wolfram
2017-01-01
Measures of performance rates in speech-like or volitional nonspeech oral motor tasks are frequently used to draw inferences about articulation rate abnormalities in patients with neurologic movement disorders. The study objective was to investigate the structural relationship between rate measures of speech and of oral motor behaviors different from speech. A total of 130 patients with neurologic movement disorders and 130 healthy subjects participated in the study. Rate data was collected for oral reading (speech), rapid syllable repetition (speech-like), and rapid single articulator movements (nonspeech). The authors used factor analysis to determine whether the different rate variables reflect the same or distinct constructs. The behavioral data were most appropriately captured by a measurement model in which the different task types loaded onto separate latent variables. The data on oral motor performance rates show that speech tasks and oral motor tasks such as rapid syllable repetition or repetitive single articulator movements measure separate traits.
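The modeling approach described above can be illustrated with exploratory factor analysis on simulated data: three rate variables are generated so that oral reading reflects one latent trait while the two oral motor tasks reflect another, matching the study's conclusion that speech and oral motor tasks measure separate traits. This uses scikit-learn's exploratory factor analysis, not the authors' confirmatory measurement models; all data are simulated.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 260  # roughly the study's combined sample size
speech_trait, oral_trait = rng.normal(size=(2, n))
X = np.column_stack([
    speech_trait + 0.3 * rng.normal(size=n),  # oral reading rate (speech)
    oral_trait + 0.3 * rng.normal(size=n),    # rapid syllable repetition rate
    oral_trait + 0.3 * rng.normal(size=n),    # single-articulator movement rate
])

fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
print(np.round(fa.components_, 2))  # loadings separate reading from oral motor tasks
```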
Strategies to combat auditory overload during vehicular command and control.
Abel, Sharon M; Ho, Geoffrey; Nakashima, Ann; Smith, Ingrid
2014-09-01
Strategies to combat auditory overload were studied. Normal-hearing males were tested in a sound isolated room in a mock-up of a military land vehicle. Two tasks were presented concurrently, in quiet and vehicle noise. For Task 1 dichotic phrases were delivered over a communications headset. Participants encoded only those beginning with a preassigned call sign (Baron or Charlie). For Task 2, they agreed or disagreed with simple equations presented either over loudspeakers, as text on the laptop monitor, in both the audio and the visual modalities, or not at all. Accuracy was significantly better by 20% on Task 2 when the equations were presented visually or audiovisually. Scores were at least 78% correct for dichotic phrases presented over the headset, with a right ear advantage of 7%, given the 5 dB speech-to-noise ratio. The left ear disadvantage was particularly apparent in noise, where the interaural difference was 12%. Relatively lower scores in the left ear, in noise, were observed for phrases beginning with Charlie. These findings underscore the benefit of delivering higher priority communications to the dominant ear, the importance of selecting speech sounds that are resilient to noise masking, and the advantage of using text in cases of degraded audio. Reprint & Copyright © 2014 Association of Military Surgeons of the U.S.
Marsh, John E; Yang, Jingqi; Qualter, Pamela; Richardson, Cassandra; Perham, Nick; Vachon, François; Hughes, Robert W
2018-06-01
Task-irrelevant speech impairs short-term serial recall appreciably. On the interference-by-process account, the processing of physical (i.e., precategorical) changes in speech yields order cues that conflict with the serial-ordering process deployed to perform the serial recall task. In this view, the postcategorical properties (e.g., phonology, meaning) of speech play no role. The present study reassessed the implications of recent demonstrations of auditory postcategorical distraction in serial recall that have been taken as support for an alternative, attentional-diversion, account of the irrelevant speech effect. Focusing on the disruptive effect of emotionally valent compared with neutral words on serial recall, we show that the distracter-valence effect is eliminated under conditions (high task-encoding load) thought to shield against attentional diversion, whereas the general effect of speech (neutral words compared with quiet) remains unaffected (Experiment 1). Furthermore, the distracter-valence effect generalizes to a task that does not require the processing of serial order (the missing-item task), whereas the effect of speech per se is attenuated in this task (Experiment 2). We conclude that postcategorical auditory distraction phenomena in serial short-term memory (STM) are incidental: they are observable in such a setting but, unlike the acoustically driven irrelevant speech effect, are not integral to it. As such, the findings support a duplex-mechanism account over a unitary view of auditory distraction. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
ERIC Educational Resources Information Center
Lockart, Rebekah; McLeod, Sharynne
2013-01-01
Purpose: To investigate speech-language pathology students' ability to identify errors and transcribe typical and atypical speech in Cantonese, a nonnative language. Method: Thirty-three English-speaking speech-language pathology students completed 3 tasks in an experimental within-subjects design. Results: Task 1 (baseline) involved transcribing…
ERIC Educational Resources Information Center
Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.
2012-01-01
In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…
Automated Discovery of Speech Act Categories in Educational Games
ERIC Educational Resources Information Center
Rus, Vasile; Moldovan, Cristian; Niraula, Nobal; Graesser, Arthur C.
2012-01-01
In this paper we address the important task of automated discovery of speech act categories in dialogue-based, multi-party educational games. Speech acts are important in dialogue-based educational systems because they help infer the student speaker's intentions (the task of speech act classification) which in turn is crucial to providing adequate…
ERIC Educational Resources Information Center
Bailey, Dallin J.; Dromey, Christopher
2015-01-01
Purpose: The purpose of this study was to examine divided attention over a large age range by looking at the effects of 3 nonspeech tasks on concurrent speech motor performance. The nonspeech tasks were designed to facilitate measurement of bidirectional interference, allowing examination of their sensitivity to speech activity. A cross-sectional…
Syntactic processing as a marker for cognitive impairment in amyotrophic lateral sclerosis
Tsermentseli, Stella; Leigh, P. Nigel; Taylor, Lorna J.; Radunovic, Aleksandar; Catani, Marco; Goldstein, Laura H.
2016-01-01
Despite recent interest in cognitive changes in patients with amyotrophic lateral sclerosis (ALS), investigations of language function looking at the level of word, sentence and discourse processing are relatively scarce. Data were obtained from 26 patients with sporadic ALS and 26 healthy controls matched for age, education, gender, anxiety, depression and executive function performance. Standardized language tasks included confrontation naming, semantic access, and syntactic comprehension. Quantitative production analysis (QPA) was used to analyse connected speech samples of the Cookie Theft picture description task. Results showed that the ALS patients were impaired on standardized measures of grammatical comprehension and action/verb semantics. At the level of discourse, ALS patients were impaired on measures of syntactic complexity and fluency; however, the latter could be better explained by disease related factors. Discriminant analysis revealed that syntactic measures differentiated ALS patients from controls. In conclusion, patients with ALS exhibit deficits in receptive and expressive language on tasks of comprehension and connected speech production, respectively. Our findings suggest that syntactic processing deficits seem to be the predominant feature of language impairment in ALS and that these deficits can be detected by relatively simple language tests. PMID:26312952
Messaoud-Galusi, Souhila; Hazan, Valerie; Rosen, Stuart
2012-01-01
Purpose The claim that speech perception abilities are impaired in dyslexia was investigated in a group of 62 dyslexic children and 51 average readers matched in age. Method To test whether there was robust evidence of speech perception deficits in children with dyslexia, speech perception in noise and quiet was measured using eight different tasks involving the identification and discrimination of a complex and highly natural synthetic ‘pea’-‘bee’ contrast (copy synthesised from natural models) and the perception of naturally-produced words. Results Children with dyslexia, on average, performed more poorly than average readers in the synthetic syllables identification task in quiet and in across-category discrimination (but not when tested using an adaptive procedure). They did not differ from average readers on two tasks of word recognition in noise or identification of synthetic syllables in noise. For all tasks, a majority of individual children with dyslexia performed within norms. Finally, speech perception generally did not correlate with pseudo-word reading or phonological processing, the core skills related to dyslexia. Conclusions On the tasks and speech stimuli we used, most children with dyslexia do not appear to show a consistent deficit in speech perception. PMID:21930615
Automatic processing of tones and speech stimuli in children with specific language impairment.
Uwer, Ruth; Albrecht, Ronald; von Suchodoletz, W
2002-08-01
It is well known from behavioural experiments that children with specific language impairment (SLI) have difficulties discriminating consonant-vowel (CV) syllables such as /ba/, /da/, and /ga/. Mismatch negativity (MMN) is an auditory event-related potential component that represents the outcome of an automatic comparison process. It could, therefore, be a promising tool for assessing central auditory processing deficits for speech and non-speech stimuli in children with SLI. MMN is typically evoked by occasionally occurring 'deviant' stimuli in a sequence of identical 'standard' sounds. In this study, MMN was elicited by simple tone stimuli, which differed in frequency (1000 versus 1200 Hz) and duration (175 versus 100 ms), and by digitized CV syllables, which differed in place of articulation (/ba/, /da/, and /ga/), in children with expressive and receptive SLI and in healthy control children (n=21 in each group, 46 males and 17 females; age range 5 to 10 years). Mean MMN amplitudes were compared between groups. Additionally, behavioural discrimination performance was assessed. Children with SLI had attenuated MMN amplitudes to speech stimuli, but there was no significant difference between the two diagnostic subgroups. MMN to tone stimuli did not differ between the groups. Children with SLI made more errors in the discrimination task, but discrimination scores did not correlate with MMN amplitudes. The present data suggest that children with SLI show a specific deficit in automatic discrimination of CV syllables differing in place of articulation, whereas the processing of simple tone differences seems to be unimpaired.
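The MMN quantification described here reduces to a difference wave: the averaged response to deviants minus the averaged response to standards, with amplitude taken over a latency window. Below is a minimal sketch of that computation, assuming baseline-corrected epochs stored as NumPy arrays; the function name, the synthetic data, and the 100–250 ms window are illustrative assumptions, not the study's exact parameters.

```python
import numpy as np

def mmn_amplitude(standard_epochs, deviant_epochs, times, window=(0.10, 0.25)):
    """Mean MMN amplitude from the deviant-minus-standard difference wave.

    standard_epochs, deviant_epochs: arrays of shape (n_trials, n_samples)
    times: sample times in seconds, shape (n_samples,)
    window: latency window (s) over which the difference wave is averaged
    """
    # Average across trials to get each ERP, then subtract.
    difference_wave = deviant_epochs.mean(axis=0) - standard_epochs.mean(axis=0)
    mask = (times >= window[0]) & (times <= window[1])
    # MMN is a negativity, so a more negative value means a larger MMN.
    return difference_wave[mask].mean()

# Illustrative use with synthetic epochs (500 Hz sampling, -100..400 ms).
times = np.arange(-0.1, 0.4, 0.002)
rng = np.random.default_rng(0)
standards = rng.normal(0.0, 1.0, (200, times.size))
deviants = (rng.normal(0.0, 1.0, (60, times.size))
            - 1.5 * np.exp(-((times - 0.18) / 0.04) ** 2))
print(f"MMN amplitude: {mmn_amplitude(standards, deviants, times):.2f} (a.u.)")
```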
The speech perception skills of children with and without speech sound disorder.
Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie
To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger-scale study. Copyright © 2017 Elsevier Inc. All rights reserved.
Feenaughty, Lynda; Tjaden, Kris; Benedict, Ralph H.B.; Weinstock-Guttman, Bianca
2017-01-01
This preliminary study investigated how cognitive-linguistic status in multiple sclerosis (MS) is reflected in two speech tasks (i.e. oral reading, narrative) that differ in cognitive-linguistic demand. Twenty individuals with MS were selected to comprise High and Low performance groups based on clinical tests of executive function and information processing speed and efficiency. Ten healthy controls were included for comparison. Speech samples were audio-recorded and measures of global speech timing were obtained. Results indicated predicted differences in global speech timing (i.e. speech rate and pause characteristics) for speech tasks differing in cognitive-linguistic demand, but the magnitude of these task-related differences was similar for all speaker groups. Findings suggest that assumptions concerning the cognitive-linguistic demands of reading aloud as compared to spontaneous speech may need to be re-considered for individuals with cognitive impairment. Qualitative trends suggest that additional studies investigating the association between cognitive-linguistic and speech motor variables in MS are warranted. PMID:23294227
Pals, Carina; Sarampalis, Anastasios; van Dijk, Mart; Başkent, Deniz
2018-05-11
Residual acoustic hearing in electric-acoustic stimulation (EAS) can benefit cochlear implant (CI) users in increased sound quality, speech intelligibility, and improved tolerance to noise. The goal of this study was to investigate whether the low-pass-filtered acoustic speech in simulated EAS can provide the additional benefit of reducing listening effort for the spectrotemporally degraded signal of noise-band-vocoded speech. Listening effort was investigated using a dual-task paradigm as a behavioral measure, and the NASA Task Load indeX as a subjective self-report measure. The primary task of the dual-task paradigm was identification of sentences presented in three experiments at three fixed intelligibility levels: at near-ceiling, 50%, and 79% intelligibility, achieved by manipulating the presence and level of speech-shaped noise in the background. Listening effort for the primary intelligibility task was reflected in the performance on the secondary, visual response time task. Experimental speech processing conditions included monaural or binaural vocoder, with added low-pass-filtered speech (to simulate EAS) or without (to simulate CI). In Experiment 1, in quiet with intelligibility near-ceiling, additional low-pass-filtered speech reduced listening effort compared with binaural vocoder, in line with our expectations, although not compared with monaural vocoder. In Experiments 2 and 3, for speech in noise, added low-pass-filtered speech allowed the desired intelligibility levels to be reached at less favorable speech-to-noise ratios, as expected. It is interesting that this came without the cost of increased listening effort usually associated with poor speech-to-noise ratios; at 50% intelligibility, even a reduction in listening effort on top of the increased tolerance to noise was observed. The NASA Task Load indeX did not capture these differences. The dual-task results provide partial evidence for a potential decrease in listening effort as a result of adding low-frequency acoustic speech to noise-band-vocoded speech. Whether these findings translate to CI users with residual acoustic hearing will need to be addressed in future research because the quality and frequency range of low-frequency acoustic sound available to listeners with hearing loss may differ from our idealized simulations, and additional factors, such as advanced age and varying etiology, may also play a role.
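For readers unfamiliar with these simulations, the processing chain lends itself to a compact sketch: noise-band vocoding preserves per-band envelopes while replacing temporal fine structure with noise, and the simulated EAS condition adds unprocessed low-pass speech back to the vocoded signal. The band count, band edges, and cutoff frequencies below are illustrative assumptions, not the study's parameters; the input is assumed to be a float array sampled above twice the top band edge.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def bandpass(x, lo, hi, fs):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def lowpass(x, cutoff, fs):
    sos = butter(4, cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def noise_vocode(speech, fs, n_bands=8, f_lo=100.0, f_hi=6000.0):
    """Noise-band vocoder: keep band envelopes, discard temporal fine structure."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)  # log-spaced bands (assumption)
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass(speech, lo, hi, fs)
        # Envelope via Hilbert magnitude, smoothed at 50 Hz (assumption).
        envelope = lowpass(np.abs(hilbert(band)), 50.0, fs)
        carrier = bandpass(rng.standard_normal(speech.size), lo, hi, fs)
        out += envelope * carrier
    return out

def simulate_eas(speech, fs, acoustic_cutoff=600.0):
    """Simulated EAS: vocoded ('CI') signal plus residual low-pass speech."""
    return noise_vocode(speech, fs) + lowpass(speech, acoustic_cutoff, fs)
```

Passing a recorded sentence through noise_vocode alone approximates the simulated CI condition; simulate_eas approximates the EAS condition.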
ERIC Educational Resources Information Center
Eichorn, Naomi; Marton, Klara; Schwartz, Richard G.; Melara, Robert D.; Pirutinsky, Steven
2016-01-01
Purpose: The present study examined whether engaging working memory in a secondary task benefits speech fluency. Effects of dual-task conditions on speech fluency, rate, and errors were examined with respect to predictions derived from three related theoretical accounts of disfluencies. Method: Nineteen adults who stutter and twenty adults who do…
Masking release for words in amplitude-modulated noise as a function of modulation rate and task
Buss, Emily; Whittle, Lisa N.; Grose, John H.; Hall, Joseph W.
2009-01-01
For normal-hearing listeners, masked speech recognition can improve with the introduction of masker amplitude modulation. The present experiments tested the hypothesis that this masking release is due in part to an interaction between the temporal distribution of cues necessary to perform the task and the probability of those cues temporally coinciding with masker modulation minima. Stimuli were monosyllabic words masked by speech-shaped noise, and masker modulation was introduced via multiplication with a raised sinusoid of 2.5–40 Hz. Tasks included detection, three-alternative forced-choice identification, and open-set identification. Overall, there was more masking release associated with the closed-set than the open-set tasks. The best rate of modulation also differed as a function of task; whereas low modulation rates were associated with best performance for the detection and three-alternative identification tasks, performance improved with modulation rate in the open-set task. This task-by-rate interaction was also observed when amplitude-modulated speech was presented in a steady masker, and for low- and high-pass filtered speech presented in modulated noise. These results were interpreted as showing that the optimal rate of amplitude modulation depends on the temporal distribution of speech cues and the information required to perform a particular task. PMID:19603883
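The masker manipulation described here (multiplication with a raised sinusoid) is a one-line signal operation. A minimal sketch, with white noise standing in for speech-shaped noise purely for brevity:

```python
import numpy as np

def raised_sine_modulate(masker, fs, rate_hz, depth=1.0):
    """Amplitude-modulate a masker by a raised sine.

    m(t) = (1 + depth*sin(2*pi*rate*t)) / 2 swings between 0 and 1 at full
    depth, periodically opening "dips" in the masker for glimpsing the target.
    """
    t = np.arange(masker.size) / fs
    modulator = (1.0 + depth * np.sin(2.0 * np.pi * rate_hz * t)) / 2.0
    return masker * modulator

# Illustrative use: 1 s of noise modulated at 10 Hz.
fs = 16000
noise = np.random.default_rng(1).standard_normal(fs)
modulated = raised_sine_modulate(noise, fs, rate_hz=10.0)
```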
Huber, Jessica E.; Darling, Meghan
2012-01-01
Purpose The purpose of the present study was to examine the effects of cognitive-linguistic deficits and respiratory physiologic changes on respiratory support for speech in Parkinson's disease (PD), using two speech tasks, reading and extemporaneous speech. Methods Five women with PD, 9 men with PD, and 14 age- and sex-matched control participants read a passage and spoke extemporaneously on a topic of their choice at comfortable loudness. Sound pressure level, syllables per breath group, speech rate, and lung volume parameters were measured. Number of formulation errors, disfluencies, and filled pauses were counted. Results Individuals with PD produced shorter utterances as compared to control participants. The relationships between utterance length and lung volume initiation and inspiratory duration were weaker in individuals with PD than for control participants, particularly for the extemporaneous speech task. These results suggest less consistent planning for utterance length by individuals with PD in extemporaneous speech. Individuals with PD produced more formulation errors in both tasks and significantly fewer filled pauses in extemporaneous speech. Conclusions Both respiratory physiologic and cognitive-linguistic issues affected speech production by individuals with PD. Overall, individuals with PD had difficulty planning or coordinating language formulation and respiratory support, in particular during extemporaneous speech. PMID:20844256
Speech processing and production in two-year-old children acquiring isiXhosa: A tale of two children
Rossouw, Kate; Fish, Laura; Jansen, Charne; Manley, Natalie; Powell, Michelle; Rosen, Loren
2016-01-01
We investigated the speech processing and production of 2-year-old children acquiring isiXhosa in South Africa. Two children (2 years, 5 months; 2 years, 8 months) are presented as single cases. Speech input processing, stored phonological knowledge and speech output are described, based on data from auditory discrimination, naming, and repetition tasks. Both children were approximating adult levels of accuracy in their speech output, although naming was constrained by vocabulary. Performance across tasks was variable: One child showed a relative strength with repetition, and experienced most difficulties with auditory discrimination. The other performed equally well in naming and repetition, and obtained 100% on the auditory discrimination task. There are limited data regarding typical development of isiXhosa, and the focus has mainly been on speech production. This exploratory study describes typical development of isiXhosa using a variety of tasks understood within a psycholinguistic framework. We describe some ways in which speech and language therapists can devise and carry out assessment with children in situations where few formal assessments exist, and also detail the challenges of such work. PMID:27245131
The impact of phonetic dissimilarity on the perception of foreign accented speech
NASA Astrophysics Data System (ADS)
Weil, Shawn A.
2003-10-01
Non-normative speech (i.e., synthetic speech, pathological speech, foreign accented speech) is more difficult to process for native listeners than is normative speech. Does perceptual dissimilarity affect only intelligibility, or are there other costs to processing? The current series of experiments investigates both the intelligibility and the time course of foreign accented speech (FAS) perception. Native English listeners heard single English words spoken by both native English speakers and non-native speakers (Mandarin or Russian). Words were chosen based on the similarity between the phonetic inventories of the respective languages. Three experimental designs were used: a cross-modal matching task, a word repetition (shadowing) task, and two subjective ratings tasks which measured impressions of accentedness and effortfulness. The results replicate previous investigations that have found that FAS significantly lowers word intelligibility. Furthermore, FAS carries a cost in perceptual effort: in the word repetition task, correct responses to accented words are slower than to nonaccented words. An analysis indicates that both intelligibility and reaction time are, in part, functions of the similarity between the talker's utterance and the listener's representation of the word.
The role of Broca's area in speech perception: evidence from aphasia revisited.
Hickok, Gregory; Costanzo, Maddalena; Capasso, Rita; Miceli, Gabriele
2011-12-01
Motor theories of speech perception have been re-vitalized as a consequence of the discovery of mirror neurons. Some authors have even promoted a strong version of the motor theory, arguing that the motor speech system is critical for perception. Part of the evidence that is cited in favor of this claim is the observation from the early 1980s that individuals with Broca's aphasia, and therefore inferred damage to Broca's area, can have deficits in speech sound discrimination. Here we re-examine this issue in 24 patients with radiologically confirmed lesions to Broca's area and various degrees of associated non-fluent speech production. Patients performed two same-different discrimination tasks involving pairs of CV syllables, one in which both CVs were presented auditorily, and the other in which one syllable was auditorily presented and the other visually presented as an orthographic form; word comprehension was also assessed using word-to-picture matching tasks in both auditory and visual forms. Discrimination performance on the all-auditory task was four standard deviations above chance, as measured using d', and was unrelated to the degree of non-fluency in the patients' speech production. Performance on the auditory-visual task, however, was worse than, and not correlated with, the all-auditory task. The auditory-visual task was related to the degree of speech non-fluency. Word comprehension was at ceiling for the auditory version (97% accuracy) and near ceiling for the orthographic version (90% accuracy). We conclude that the motor speech system is not necessary for speech perception as measured both by discrimination and comprehension paradigms, but may play a role in orthographic decoding or in auditory-visual matching of phonological forms. 2011 Elsevier Inc. All rights reserved.
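The d′ measure used above is computed from hit and false-alarm proportions. The sketch below uses the simple yes/no form z(H) − z(F) with a log-linear correction for extreme proportions; note that stricter decision models for same-different tasks (e.g., the differencing model) map the same proportions to somewhat different d′ values.

```python
from scipy.stats import norm

def dprime(hits, misses, false_alarms, correct_rejections):
    """Yes/no-style d' from counts, treating 'different' trials as signal."""
    # Log-linear correction avoids infinite z-scores at proportions of 0 or 1.
    h = (hits + 0.5) / (hits + misses + 1.0)
    f = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return norm.ppf(h) - norm.ppf(f)

# Example: 90/100 'different' responses to different pairs,
# 15/100 'different' responses to same pairs.
print(round(dprime(90, 10, 15, 85), 2))
```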
Reichenbach, Chagit S.; Braiman, Chananel; Schiff, Nicholas D.; Hudspeth, A. J.; Reichenbach, Tobias
2016-01-01
The auditory-brainstem response (ABR) to short and simple acoustical signals is an important clinical tool used to diagnose the integrity of the brainstem. The ABR is also employed to investigate the auditory brainstem in a multitude of tasks related to hearing, such as processing speech or selectively focusing on one speaker in a noisy environment. Such research measures the response of the brainstem to short speech signals such as vowels or words. Because the voltage signal of the ABR has a tiny amplitude, several hundred to a thousand repetitions of the acoustic signal are needed to obtain a reliable response. The large number of repetitions poses a challenge to assessing cognitive functions due to neural adaptation. Here we show that continuous, non-repetitive speech, lasting several minutes, may be employed to measure the ABR. Because the speech is not repeated during the experiment, the precise temporal form of the ABR cannot be determined. We show, however, that important structural features of the ABR can nevertheless be inferred. In particular, the brainstem responds at the fundamental frequency of the speech signal, and this response is modulated by the envelope of the voiced parts of speech. We accordingly introduce a novel measure that assesses the ABR as modulated by the speech envelope, at the fundamental frequency of speech and at the characteristic latency of the response. This measure has a high signal-to-noise ratio and can hence be employed effectively to measure the ABR to continuous speech. We use this novel measure to show that the ABR is weaker to intelligible speech than to unintelligible, time-reversed speech. The methods presented here can be employed for further research on speech processing in the auditory brainstem and can lead to the development of future clinical diagnosis of brainstem function. PMID:27303286
Revisiting speech rate and utterance length manipulations in stuttering speakers.
Blomgren, Michael; Goberman, Alexander M
2008-01-01
The goal of this study was to evaluate stuttering frequency across a multidimensional (2x2) hierarchy of speech performance tasks. Specifically, this study examined the interaction between changes in length of utterance and levels of speech rate stability. Forty-four adult male speakers participated in the study (22 stuttering speakers and 22 non-stuttering speakers). Participants were audio and video recorded while producing a spontaneous speech task and four different experimental speaking tasks. The four experimental speaking tasks involved reading a list of 45 words and a list 45 phrases two times each. One reading of each list involved speaking at a steady habitual rate (habitual rate tasks) and another reading involved producing each list at a variable speaking rate (variable rate tasks). For the variable rate tasks, participants were directed to produce words or phrases at randomly ordered slow, habitual, and fast rates. The stuttering speakers exhibited significantly more stuttering on the variable rate tasks than on the habitual rate tasks. In addition, the stuttering speakers exhibited significantly more stuttering on the first word of the phrase length tasks compared to the single word tasks. Overall, the results indicated that varying levels of both utterance length and temporal complexity function to modulate stuttering frequency in adult stuttering speakers. Discussion focuses on issues of speech performance according to stuttering severity and possible clinical implications. The reader will learn about and be able to: (1) describe the mediating effects of length of utterance and speech rate on the frequency of stuttering in stuttering speakers; (2) understand the rationale behind multidimensional skill performance matrices; and (3) describe possible applications of motor skill performance matrices to stuttering therapy.
Breath Group Analysis for Reading and Spontaneous Speech in Healthy Adults
Wang, Yu-Tsai; Green, Jordan R.; Nip, Ignatius S.B.; Kent, Ray D.; Kent, Jane Finley
2010-01-01
Aims The breath group can serve as a functional unit to define temporal and fundamental frequency (f0) features in continuous speech. These features of the breath group are determined by the physiologic, linguistic, and cognitive demands of communication. Reading and spontaneous speech are two speaking tasks that vary in these demands and are commonly used to evaluate speech performance for research and clinical applications. The purpose of this study is to examine differences between reading and spontaneous speech in the temporal and f0 aspects of their breath groups. Methods Sixteen participants read two passages and answered six questions while wearing a circumferentially vented mask connected to a pneumotach. The aerodynamic signal was used to identify inspiratory locations. The audio signal was used to analyze task differences in breath group structure, including temporal and f0 components. Results The main findings were that the spontaneous speech task exhibited significantly more grammatically inappropriate breath group locations and longer breath group duration than did the passage reading task. Conclusion The task differences in the percentage of grammatically inadequate breath group locations and in breath group duration for healthy adult speakers partly explain the differences in cognitive-linguistic load between the passage reading and spontaneous speech tasks. PMID:20588052
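Once inspiratory locations have been identified from the aerodynamic signal and each boundary labeled against a transcript, the two headline measures reduce to simple arithmetic. A minimal sketch with invented numbers:

```python
import numpy as np

def breath_group_stats(inspiration_times, appropriate_flags):
    """Mean breath group duration and % grammatically inappropriate boundaries.

    inspiration_times: sorted inspiratory onset times in seconds.
    appropriate_flags: one flag per breath group boundary, True if it fell
        at a grammatically appropriate location in the transcript.
    """
    durations = np.diff(inspiration_times)  # time between successive inspirations
    pct_inappropriate = 100.0 * (1.0 - np.mean(appropriate_flags))
    return durations.mean(), pct_inappropriate

mean_dur, pct_bad = breath_group_stats(
    inspiration_times=[0.0, 3.1, 6.9, 9.8, 14.2],
    appropriate_flags=[True, True, False, True],
)
print(f"mean duration: {mean_dur:.2f} s; inappropriate locations: {pct_bad:.0f}%")
```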
Individual differences in degraded speech perception
NASA Astrophysics Data System (ADS)
Carbonell, Kathy M.
One of the lasting concerns in audiology is the unexplained individual variation in speech perception performance, even among individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due either to examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has three aims: the first is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions; the second is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics, both across tasks and across sessions; and the third is to determine whether performance on degraded speech perception tasks is correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing-impaired listeners.
The normalities and abnormalities associated with speech in psychometrically-defined schizotypy.
Cohen, Alex S; Auster, Tracey L; McGovern, Jessica E; MacAulay, Rebecca K
2014-12-01
Speech deficits are thought to be an important feature of schizotypy--defined as the personality organization reflecting a putative liability for schizophrenia. There is reason to suspect that these deficits manifest as a function of limited cognitive resources. To evaluate this idea, we examined speech from individuals with psychometrically-defined schizotypy during a low cognitively-demanding task versus a relatively high cognitively-demanding task. A range of objective, computer-based measures of speech tapping speech production (silence, number and length of pauses, number and length of utterances), speech variability (global and local intonation and emphasis) and speech content (word fillers, idea density) were employed. Data for control (n=37) and schizotypy (n=39) groups were examined. Results did not confirm our hypotheses. While the cognitive-load task reduced speech expressivity for subjects as a group for most variables, the schizotypy group was not more pathological in speech characteristics compared to the control group. Interestingly, some aspects of speech in schizotypal versus control subjects were healthier under high cognitive load. Moreover, schizotypal subjects performed better, at a trend level, than controls on the cognitively demanding task. These findings hold important implications for our understanding of the neurocognitive architecture associated with the schizophrenia-spectrum. Of particular note concerns the apparent mismatch between self-reported schizotypal traits and objective performance, and the resiliency of speech under cognitive stress in persons with high levels of schizotypy. Copyright © 2014 Elsevier B.V. All rights reserved.
Hashizume, Hiroshi; Taki, Yasuyuki; Sassa, Yuko; Thyreau, Benjamin; Asano, Michiko; Asano, Kohei; Takeuchi, Hikaru; Nouchi, Rui; Kotozaki, Yuka; Jeong, Hyeonjeong; Sugiura, Motoaki; Kawashima, Ryuta
2014-08-01
Older children are more successful at producing unfamiliar, non-native speech sounds than younger children during the initial stages of learning. To reveal the neuronal underpinning of the age-related increase in the accuracy of non-native speech production, we examined the developmental changes in activation involved in the production of novel speech sounds using functional magnetic resonance imaging. Healthy right-handed children (aged 6-18 years) were scanned while performing an overt repetition task and a perceptual task involving aurally presented non-native and native syllables. Productions of non-native speech sounds were recorded and evaluated by native speakers. The mouth regions in the bilateral primary sensorimotor areas were activated more significantly during the repetition task relative to the perceptual task. The hemodynamic response in the left inferior frontal gyrus pars opercularis (IFG pOp) specific to non-native speech sound production (defined by prior hypothesis) increased with age. Additionally, the accuracy of non-native speech sound production increased with age. These results provide the first evidence of developmental changes in the neural processes underlying the production of novel speech sounds. Our data further suggest that the recruitment of the left IFG pOp during the production of novel speech sounds was possibly enhanced due to the maturation of the neuronal circuits needed for speech motor planning. This, in turn, would lead to improvement in the ability to immediately imitate non-native speech. Copyright © 2014 Wiley Periodicals, Inc.
Music and Speech Perception in Children Using Sung Speech
Nie, Yingjiu; Galvin, John J.; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie
2018-01-01
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners. PMID:29609496
Speech and nonspeech: What are we talking about?
Maas, Edwin
2017-08-01
Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.
Higgins, Meaghan C; Penney, Sarah B; Robertson, Erin K
2017-10-01
The roles of phonological short-term memory (pSTM) and speech perception in spoken sentence comprehension were examined in an experimental design. Deficits in pSTM and speech perception were simulated through task demands while typically-developing children (N = 71) completed a sentence-picture matching task. Children performed the control, simulated pSTM deficit, simulated speech perception deficit, or simulated double deficit condition. On long sentences, the double deficit group had lower scores than the control and speech perception deficit groups, and the pSTM deficit group had lower scores than the control group and marginally lower scores than the speech perception deficit group. The pSTM and speech perception groups performed similarly to groups with real deficits in these areas, who completed the control condition. Overall, scores were lowest on noncanonical long sentences. Results show pSTM has a greater effect than speech perception on sentence comprehension, at least in the tasks employed here.
ERIC Educational Resources Information Center
Fraser, Sarah; Gagne, Jean-Pierre; Alepins, Majolaine; Dubois, Pascale
2010-01-01
Purpose: Using a dual-task paradigm, 2 experiments (Experiments 1 and 2) were conducted to assess differences in the amount of listening effort expended to understand speech in noise in audiovisual (AV) and audio-only (A-only) modalities. Experiment 1 had equivalent noise levels in both modalities, and Experiment 2 equated speech recognition…
The role of linguistic experience in the processing of probabilistic information in production.
Gustafson, Erin; Goldrick, Matthew
2018-01-01
Speakers track the probability that a word will occur in a particular context and utilize this information during phonetic processing. For example, content words that have high probability within a discourse tend to be realized with reduced acoustic/articulatory properties. Such probabilistic information may influence L1 and L2 speech processing in distinct ways (reflecting differences in linguistic experience across groups and the overall difficulty of L2 speech processing). To examine this issue, L1 and L2 speakers performed a referential communication task, describing sequences of simple actions. The two groups of speakers showed similar effects of discourse-dependent probabilistic information on production, suggesting that L2 speakers can successfully track discourse-dependent probabilities and use such information to modulate phonetic processing.
Relationship between listeners' nonnative speech recognition and categorization abilities
Atagi, Eriko; Bent, Tessa
2015-01-01
Enhancement of the perceptual encoding of talker characteristics (indexical information) in speech can facilitate listeners' recognition of linguistic content. The present study explored this indexical-linguistic relationship in nonnative speech processing by examining listeners' performance on two tasks: nonnative accent categorization and nonnative speech-in-noise recognition. Results indicated substantial variability across listeners in their performance on both the accent categorization and nonnative speech recognition tasks. Moreover, listeners' accent categorization performance correlated with their nonnative speech-in-noise recognition performance. These results suggest that having more robust indexical representations for nonnative accents may allow listeners to more accurately recognize the linguistic content of nonnative speech. PMID:25618098
Cognitive control and its impact on recovery from aphasic stroke
Warren, Jane E.; Geranmayeh, Fatemeh; Woodhead, Zoe; Leech, Robert; Wise, Richard J. S.
2014-01-01
Aphasic deficits are usually only interpreted in terms of domain-specific language processes. However, effective human communication and tests that probe this complex cognitive skill are also dependent on domain-general processes. In the clinical context, it is a pragmatic observation that impaired attention and executive functions interfere with the rehabilitation of aphasia. One system that is important in cognitive control is the salience network, which includes dorsal anterior cingulate cortex and adjacent cortex in the superior frontal gyrus (midline frontal cortex). This functional imaging study assessed domain-general activity in the midline frontal cortex, which was remote from the infarct, in relation to performance on a standard test of spoken language in 16 chronic aphasic patients both before and after a rehabilitation programme. During scanning, participants heard simple sentences, with each listening trial followed immediately by a trial in which they repeated back the previous sentence. Listening to sentences in the context of a listen–repeat task was expected to activate regions involved in both language-specific processes (speech perception and comprehension, verbal working memory and pre-articulatory rehearsal) and a number of task-specific processes (including attention to utterances and attempts to overcome pre-response conflict and decision uncertainty during impaired speech perception). To visualize the same system in healthy participants, sentences were presented to them as three-channel noise-vocoded speech, thereby impairing speech perception and assessing whether this evokes domain general cognitive systems. As expected, contrasting the more difficult task of perceiving and preparing to repeat noise-vocoded speech with the same task on clear speech demonstrated increased activity in the midline frontal cortex in the healthy participants. The same region was activated in the aphasic patients as they listened to standard (undistorted) sentences. Using a region of interest defined from the data on the healthy participants, data from the midline frontal cortex was obtained from the patients. Across the group and across different scanning sessions, activity correlated significantly with the patients’ communicative abilities. This correlation was not influenced by the sizes of the lesion or the patients’ chronological ages. This is the first study that has directly correlated activity in a domain general system, specifically the salience network, with residual language performance in post-stroke aphasia. It provides direct evidence in support of the clinical intuition that domain-general cognitive control is an essential factor contributing to the potential for recovery from aphasic stroke. PMID:24163248
Social Anxiety, Affect, Cortisol Response and Performance on a Speech Task.
Losiak, Wladyslaw; Blaut, Agata; Klosowska, Joanna; Slowik, Natalia
2016-01-01
Social anxiety is characterized by increased emotional reactivity to social stimuli, but results of studies focusing on affective reactions of socially anxious subjects in the situation of social exposition are inconclusive, especially in the case of endocrinological measures of affect. This study was designed to examine individual differences in endocrinological and affective reactions to social exposure as well as in performance on a speech task in a group of students (n = 44) comprising subjects with either high or low levels of social anxiety. Measures of salivary cortisol and positive and negative affect were taken before and after an impromptu speech. Self-ratings and observer ratings of performance were also obtained. Cortisol levels and negative affect increased in both groups after the speech task, and positive affect decreased; however, group × affect interactions were not significant. Assessments conducted after the speech task revealed that highly socially anxious participants had lower observer ratings of performance while cortisol increase and changes in self-reported affect were not related to performance. Socially anxious individuals do not differ from nonanxious individuals in affective reactions to social exposition, but reveal worse performance at a speech task. © 2015 S. Karger AG, Basel.
Autistic traits and attention to speech: Evidence from typically developing individuals.
Korhonen, Vesa; Werner, Stefan
2017-04-01
Individuals with autism spectrum disorder have a preference for attending to non-speech stimuli over speech stimuli. We are interested in whether non-speech preference is only a feature of diagnosed individuals, and whether we can test implicit preference experimentally. In typically developing individuals, serial recall is disrupted more by speech stimuli than by non-speech stimuli. Since the behaviour of individuals with autistic traits resembles that of individuals with autism, we used serial recall to test whether autistic traits influence task performance during irrelevant speech sounds. The errors made on the serial recall task during speech or non-speech sounds were counted as a measure of speech or non-speech preference relative to a no-sound condition. We replicated the serial order effect and found speech to be more disruptive than non-speech sounds, but were unable to find any associations between autism quotient scores and the non-speech sounds. Our results may indicate a learnt behavioural response to speech sounds.
Preschoolers Benefit From Visually Salient Speech Cues
Lalonde, Kaylah; Holt, Rachael Frush
2015-01-01
Purpose This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. They also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method Twelve adults and 27 typically developing 3- and 4-year-old children completed 3 audiovisual (AV) speech integration tasks: matching, discrimination, and recognition. The authors compared AV benefit for visually salient and less visually salient speech discrimination contrasts and assessed the visual saliency of consonant confusions in auditory-only and AV word recognition. Results Four-year-olds and adults demonstrated visual influence on all measures. Three-year-olds demonstrated visual influence on speech discrimination and recognition measures. All groups demonstrated greater AV benefit for the visually salient discrimination contrasts. AV recognition benefit in 4-year-olds and adults depended on the visual saliency of speech sounds. Conclusions Preschoolers can demonstrate AV speech integration. Their AV benefit results from efficient use of visually salient speech cues. Four-year-olds, but not 3-year-olds, used visual phonological knowledge to take advantage of visually salient speech cues, suggesting possible developmental differences in the mechanisms of AV benefit. PMID:25322336
Speech Perception Deficits in Mandarin-Speaking School-Aged Children with Poor Reading Comprehension
Liu, Huei-Mei; Tsao, Feng-Ming
2017-01-01
Previous studies have shown that children learning alphabetic writing systems who have language impairment or dyslexia exhibit speech perception deficits. However, whether such deficits exist in children learning logographic writing systems who have poor reading comprehension remains uncertain. To further explore this issue, the present study examined speech perception deficits in Mandarin-speaking children with poor reading comprehension. Two self-designed tasks, a consonant categorical perception task and a lexical tone discrimination task, were used to compare speech perception performance in children (n = 31, age range = 7;4–10;2) with poor reading comprehension and an age-matched typically developing group (n = 31, age range = 7;7–9;10). Results showed that the children with poor reading comprehension were less accurate in consonant and lexical tone discrimination tasks and perceived speech contrasts less categorically than the matched group. The correlations between speech perception skills (i.e., consonant and lexical tone discrimination sensitivities and slope of consonant identification curve) and individuals’ oral language and reading comprehension were stronger than the correlations between speech perception ability and word recognition ability. In conclusion, the results revealed that Mandarin-speaking children with poor reading comprehension exhibit less-categorized speech perception, suggesting that imprecise speech perception, especially lexical tone perception, is essential to account for reading learning difficulties in Mandarin-speaking children. PMID:29312031
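The identification-curve slope used above as an index of categorical perception is conventionally estimated by fitting a logistic function to identification proportions along the stimulus continuum. A minimal sketch with invented proportions; the fitting approach is standard practice, not necessarily the authors' exact procedure:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """p('category B') at continuum step x: boundary x0, slope k."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

# Illustrative 7-step continuum with proportions of 'B' responses.
steps = np.arange(1, 8, dtype=float)
p_b = np.array([0.02, 0.05, 0.15, 0.55, 0.88, 0.95, 0.99])

(boundary, slope), _ = curve_fit(logistic, steps, p_b, p0=[4.0, 1.0])
print(f"boundary at step {boundary:.2f}, slope k = {slope:.2f}")
# A shallower slope (smaller k) indicates less categorical perception,
# the pattern reported for the poor-comprehension group.
```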
NASA Technical Reports Server (NTRS)
Simpson, C. A.
1985-01-01
In the present study, pairs of pilots responded to aircraft-warning classification tasks using an isolated-word, speaker-dependent speech recognition system. Induced stress was manipulated by means of different scoring procedures for the classification task and by the inclusion of a competitive manual control task. Both speech patterns and recognition accuracy were analyzed; recognition errors were recorded by type for the isolated-word speaker-dependent system and by an offline technique for a connected-word speaker-dependent system. While errors increased with task loading for the isolated-word system, there was no such effect of task loading for the connected-word system.
Motor speech signature of behavioral variant frontotemporal dementia: Refining the phenotype.
Vogel, Adam P; Poole, Matthew L; Pemberton, Hugh; Caverlé, Marja W J; Boonstra, Frederique M C; Low, Essie; Darby, David; Brodtmann, Amy
2017-08-22
To provide a comprehensive description of motor speech function in behavioral variant frontotemporal dementia (bvFTD). Forty-eight individuals (24 bvFTD and 24 age- and sex-matched healthy controls) provided speech samples. These varied in complexity and thus cognitive demand. Their language was assessed using the Progressive Aphasia Language Scale and verbal fluency tasks. Speech was analyzed perceptually to describe the nature of deficits and acoustically to quantify differences between patients with bvFTD and healthy controls. Cortical thickness and subcortical volume derived from MRI scans were correlated with speech outcomes in patients with bvFTD. Speech of affected individuals was significantly different from that of healthy controls. The speech signature of patients with bvFTD is characterized by a reduced rate (75%) and accuracy (65%) on alternating syllable production tasks, and prosodic deficits including reduced speech rate (45%), prolonged intervals (54%), and use of short phrases (41%). Groups differed on acoustic measures derived from the reading, unprepared monologue, and diadochokinetic tasks but not the days of the week or sustained vowel tasks. Variability of silence length was associated with cortical thickness of the inferior frontal gyrus and insula and speech rate with the precentral gyrus. One in 8 patients presented with moderate speech timing deficits with a further two-thirds rated as mild or subclinical. Subtle but measurable deficits in prosody are common in bvFTD and should be considered during disease management. Language function correlated with speech timing measures derived from the unprepared monologue only. © 2017 American Academy of Neurology.
Task Repetition and Second Language Speech Processing
ERIC Educational Resources Information Center
Lambert, Craig; Kormos, Judit; Minn, Danny
2017-01-01
This study examines the relationship between the repetition of oral monologue tasks and immediate gains in L2 fluency. It considers the effect of aural-oral task repetition on speech rate, frequency of clause-final and midclause filled pauses, and overt self-repairs across different task types and proficiency levels and relates these findings to…
Oral-diadochokinesis rates across languages: English and Hebrew norms.
Icht, Michal; Ben-David, Boaz M
2014-01-01
Oro-facial and speech motor control disorders represent a variety of speech and language pathologies. Early identification of such problems is important and carries clinical implications. A common and simple tool for gauging the presence and severity of speech motor control impairments is oral-diadochokinesis (oral-DDK). Surprisingly, norms for adult performance are missing from the literature. The goals of this study were: (1) to establish a norm for oral-DDK rate for (young to middle-age) adult English speakers, by collecting data from the literature (five studies, N=141); (2) to investigate the possible effect of language (and culture) on oral-DDK performance, by analyzing studies conducted in other languages (five studies, N=140), alongside the English norm; and (3) to find a new norm for adult Hebrew speakers, by testing 115 speakers. We first offer an English norm with a mean of 6.2 syllables/s (SD = 0.8), and a lower boundary of 5.4 syllables/s that can be used to indicate possible abnormality. Next, we found significant differences between four tested languages (English, Portuguese, Farsi and Greek) in oral-DDK rates. Results suggest the need to set language- and culture-sensitive norms for the application of the oral-DDK task world-wide. Finally, we found the oral-DDK performance for adult Hebrew speakers to be 6.4 syllables/s (SD = 0.8), not significantly different from the English norms. This implies possible phonological similarities between English and Hebrew. We further note that no gender effects were found in our study. We recommend using oral-DDK as an important tool in the speech-language pathologist's arsenal. Yet, application of this task should be done carefully, comparing individual performance to a set norm within the specific language. Readers will be able to: (1) identify the Speech-Language Pathologist assessment process using the oral-DDK task, by comparing an individual's performance to the present English norm; (2) describe the impact of language on oral-DDK performance; and (3) accurately assess Hebrew-speaking patients using this tool. Copyright © 2014 Elsevier Inc. All rights reserved.
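Applying such a norm clinically is simple arithmetic: convert a timed syllable count to syllables per second and compare it against the language-specific distribution. A minimal sketch using the English values reported above:

```python
def ddk_rate(syllable_count, elapsed_seconds):
    """Oral-DDK rate in syllables/s (e.g., timed repetitions of /pataka/)."""
    return syllable_count / elapsed_seconds

def flag_ddk(rate, norm_mean=6.2, norm_sd=0.8, floor=5.4):
    """z-score against a language-specific norm plus a lower-boundary flag.

    Defaults are the English values above (mean 6.2, SD 0.8, lower
    boundary 5.4 syllables/s); substitute norms for other languages.
    """
    z = (rate - norm_mean) / norm_sd
    return z, rate < floor

# Example: 28 syllables produced in 5 s -> 5.6 syllables/s.
z, below_floor = flag_ddk(ddk_rate(28, 5.0))
print(f"z = {z:.2f}; below lower boundary: {below_floor}")
```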
Age-Related Differences in Listening Effort During Degraded Speech Recognition.
Ward, Kristina M; Shen, Jing; Souza, Pamela E; Grieco-Calub, Tina M
The purpose of the present study was to quantify age-related differences in executive control as it relates to dual-task performance, which is thought to represent listening effort, during degraded speech recognition. Twenty-five younger adults (YA; 18-24 years) and 21 older adults (OA; 56-82 years) completed a dual-task paradigm that consisted of a primary speech recognition task and a secondary visual monitoring task. Sentence material in the primary task was either unprocessed or spectrally degraded into 8, 6, or 4 spectral channels using noise-band vocoding. Performance on the visual monitoring task was assessed by the accuracy and reaction time of participants' responses. Performance on the primary and secondary task was quantified in isolation (i.e., single task) and during the dual-task paradigm. Participants also completed a standardized psychometric measure of executive control, including attention and inhibition. Statistical analyses were implemented to evaluate changes in listeners' performance on the primary and secondary tasks (1) per condition (unprocessed vs. vocoded conditions); (2) per task (single task vs. dual task); and (3) per group (YA vs. OA). Speech recognition declined with increasing spectral degradation for both YA and OA when they performed the task in isolation or concurrently with the visual monitoring task. OA were slower and less accurate than YA on the visual monitoring task when performed in isolation, which paralleled age-related differences in standardized scores of executive control. When compared with single-task performance, OA experienced greater declines in secondary-task accuracy, but not reaction time, than YA. Furthermore, results revealed that age-related differences in executive control significantly contributed to age-related differences on the visual monitoring task during the dual-task paradigm. OA experienced significantly greater declines in secondary-task accuracy during degraded speech recognition than YA. These findings are interpreted as suggesting that OA expended greater listening effort than YA, which may be partially attributed to age-related differences in executive control.
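Dual-task studies of listening effort commonly summarize the secondary-task decline with a proportional dual-task cost. The sketch below shows one such index; it is a common convention, not necessarily the exact analysis used in this study.

```python
def dual_task_cost(single, dual):
    """Proportional dual-task cost: positive values mean performance
    declined from single- to dual-task conditions. Applies directly to
    accuracy; for reaction time, where larger values mean worse
    performance, the cost is usually computed with the sign flipped."""
    return (single - dual) / single

# Example: secondary-task accuracy falls from 0.95 alone to 0.80 dual-task.
print(f"accuracy cost: {dual_task_cost(0.95, 0.80):.1%}")
```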
Awareness of Rhythm Patterns in Speech and Music in Children with Specific Language Impairments
Cumming, Ruth; Wilson, Angela; Leong, Victoria; Colling, Lincoln J.; Goswami, Usha
2015-01-01
Children with specific language impairments (SLIs) show impaired perception and production of language, and also show impairments in perceiving auditory cues to rhythm [amplitude rise time (ART) and sound duration] and in tapping to a rhythmic beat. Here we explore potential links between language development and rhythm perception in 45 children with SLI and 50 age-matched controls. We administered three rhythmic tasks, a musical beat detection task, a tapping-to-music task, and a novel music/speech task, which varied rhythm and pitch cues independently or together in both speech and music. Via low-pass filtering, the music sounded as though it was played from a low-quality radio and the speech sounded as though it was muffled (heard “behind the door”). We report data for all of the SLI children (N = 45, IQ varying), as well as for two independent subgroupings with intact IQ. One subgroup, “Pure SLI,” had intact phonology and reading (N = 16), the other, “SLI PPR” (N = 15), had impaired phonology and reading. When IQ varied (all SLI children), we found significant group differences in all the rhythmic tasks. For the Pure SLI group, there were rhythmic impairments in the tapping task only. For children with SLI and poor phonology (SLI PPR), group differences were found in all of the filtered speech/music AXB tasks. We conclude that difficulties with rhythmic cues in both speech and music are present in children with SLIs, but that some rhythmic measures are more sensitive than others. The data are interpreted within a “prosodic phrasing” hypothesis, and we discuss the potential utility of rhythmic and musical interventions in remediating speech and language difficulties in children. PMID:26733848
Franco, Ana; Gaillard, Vinciane; Cleeremans, Axel; Destrebecqz, Arnaud
2015-12-01
Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212-223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.
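The segmentation statistic at issue in this paradigm is the transitional probability between adjacent syllables, which is high within nonwords and low at word boundaries. A toy sketch follows; the nonwords are invented for illustration, not the study's materials.

```python
# Transitional probabilities over a syllable stream:
# TP(a -> b) = count(a, b) / count(a), over adjacent pairs.
import random
from collections import Counter

def transitional_probs(syllables):
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

random.seed(0)
words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("da", "ko", "ti")]  # toy nonwords
stream = [syl for _ in range(200) for syl in random.choice(words)]    # random concatenation
tps = transitional_probs(stream)
# Within-word transitions (e.g., tu -> pi) come out at 1.0; between-word
# transitions (e.g., ro -> go) sit near 1/3, the statistical cue to boundaries.
```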
The Effect of Feedback Schedule Manipulation on Speech Priming Patterns and Reaction Time
ERIC Educational Resources Information Center
Slocomb, Dana; Spencer, Kristie A.
2009-01-01
Speech priming tasks are frequently used to delineate stages in the speech process such as lexical retrieval and motor programming. These tasks, often measured in reaction time (RT), require fast and accurate responses, reflecting maximized participant performance, to result in robust priming effects. Encouraging speed and accuracy in responding…
Cognitive Load in Voice Therapy Carry-Over Exercises
ERIC Educational Resources Information Center
Iwarsson, Jenny; Morris, David Jackson; Balling, Laura Winther
2017-01-01
Purpose: The cognitive load generated by online speech production may vary with the nature of the speech task. This article examines 3 speech tasks used in voice therapy carry-over exercises, in which a patient is required to adopt and automatize new voice behaviors, ultimately in daily spontaneous communication. Method: Twelve subjects produced…
Articulatory Control in Childhood Apraxia of Speech in a Novel Word-Learning Task
ERIC Educational Resources Information Center
Case, Julie; Grigos, Maria I.
2016-01-01
Purpose: Articulatory control and speech production accuracy were examined in children with childhood apraxia of speech (CAS) and typically developing (TD) controls within a novel word-learning task to better understand the influence of planning and programming deficits in the production of unfamiliar words. Method: Participants included 16…
Berk, L E; Landau, S
1993-04-01
Learning disabled (LD) children are often targets for cognitive-behavioral interventions designed to train them in effective use of self-directed speech. The purpose of this study was to determine if, indeed, these children display immature private speech in the naturalistic classroom setting. Comparisons were made of the private speech, motor accompaniment to task, and attention of LD and normally achieving classmates during academic seatwork. Setting effects were examined by comparing classroom data with observations during academic seatwork and puzzle solving in the laboratory. Finally, a subgroup of LD children symptomatic of attention-deficit hyperactivity disorder (ADHD) was compared with pure LD and normally achieving controls to determine if the presumed immature private speech is a function of a learning disability or externalizing behavior problems. Results indicated that LD children used more task-relevant private speech than controls, an effect that was especially pronounced for the LD/ADHD subgroup. Use of private speech was setting- and task-specific. Implications for intervention and future research methodology are discussed.
Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.
2015-01-01
In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency, and voice emerge more saliently in conversation than in repetition, reading, or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have revealed that formulaic language is more impaired than novel language. This descriptive study extends these observations to a case of severely dysfluent dysarthria due to a parkinsonian syndrome. Dysfluencies were quantified and compared for conversation, two forms of repetition, reading, recited speech, and singing. Other measures examined phonetic inventories, word forms, and formulaic language. Phonetic, syllabic, and lexical dysfluencies were more abundant in conversation than in other task conditions. Formulaic expressions in conversation were reduced compared to normal speakers. A proposed explanation supports the notion that the basal ganglia contribute to formulation of internal models for execution of speech. PMID:22774929
Latash, Mark L.; Mikaelian, Irina L.
2010-01-01
We explored the relations between task difficulty and speech time in picture description tasks. Six native speakers of Mandarin Chinese (CH group) and six native speakers of Indo-European languages (IE group) produced quick and accurate verbal descriptions of pictures in a self-paced manner. The pictures always involved two objects, a plate and one of three objects (a stick, a fork, or a knife) located and oriented differently with respect to the plate in different trials. An index of difficulty was assigned to each picture. The CH group showed lower reaction times and much lower speech times. Speech time scaled linearly with the log-transformed index of difficulty in all subjects. The results suggest generality of Fitts' law for movement and speech tasks, and possibly for other cognitive tasks as well. The differences between the CH and IE groups may be due to specific task features, differences in the grammatical rules of CH and IE languages, and possible use of tone for information transmission. PMID:21339514
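Fitts' law, referenced above, predicts a linear relation between movement (or, here, speech) time and the log-transformed index of difficulty: MT = a + b * log2(ID). A minimal fit is sketched below; the data values are invented purely for illustration.

```python
# Fitting Fitts' law, MT = a + b * log2(ID); the numbers are made up.
import numpy as np

ID = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # index of difficulty
MT = np.array([1.9, 2.3, 2.8, 3.1, 3.6, 3.9])   # mean speech time (s)
b, a = np.polyfit(np.log2(ID), MT, 1)            # polyfit returns slope first
print(f"speech time = {a:.2f} + {b:.2f} * log2(ID)")
```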
Soli, Sigfrid D; Giguère, Christian; Laroche, Chantal; Vaillancourt, Véronique; Dreschler, Wouter A; Rhebergen, Koenraad S; Harkins, Kevin; Ruckstuhl, Mark; Ramulu, Pradeep; Meyers, Lawrence S
The objectives of this study were to (1) identify essential hearing-critical job tasks for public safety and law enforcement personnel; (2) determine the locations and real-world noise environments where these tasks are performed; (3) characterize each noise environment in terms of its impact on the likelihood of effective speech communication, considering the effects of different levels of vocal effort, communication distances, and repetition; and (4) use this characterization to define an objective normative reference for evaluating the ability of individuals to perform essential hearing-critical job tasks in noisy real-world environments. Data from five occupational hearing studies performed over a 17-year period for various public safety agencies were analyzed. In each study, job task analyses by job content experts identified essential hearing-critical tasks and the real-world noise environments where these tasks are performed. These environments were visited, and calibrated recordings of each noise environment were made. The extended speech intelligibility index (ESII) was calculated for each 4-sec interval in each recording. These data, together with the estimated ESII value required for effective speech communication by individuals with normal hearing, allowed the likelihood of effective speech communication in each noise environment for different levels of vocal effort and communication distances to be determined. These likelihoods provide an objective norm-referenced and standardized means of characterizing the predicted impact of real-world noise on the ability to perform essential hearing-critical tasks. A total of 16 noise environments for law enforcement personnel and eight noise environments for corrections personnel were analyzed. Effective speech communication was essential to hearing-critical tasks performed in these environments. Average noise levels ranged from approximately 70 to 87 dBA in law enforcement environments and 64 to 80 dBA in corrections environments. The likelihood of effective speech communication at communication distances of 0.5 and 1 m was often less than 0.50 for normal vocal effort. Likelihood values often increased to 0.80 or more when raised or loud vocal effort was used. Effective speech communication at and beyond 5 m was often unlikely, regardless of vocal effort. ESII modeling of nonstationary real-world noise environments may prove to be an objective means of characterizing their impact on the likelihood of effective speech communication. The normative reference provided by these measures predicts the extent to which hearing impairments that increase the ESII value required for effective speech communication also decrease the likelihood of effective speech communication. These predictions may provide an objective evidence-based link between the essential hearing-critical job task requirements of public safety and law enforcement personnel and ESII-based hearing assessment of individuals who seek to perform these jobs.
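The per-interval analysis described above can be pictured with the following sketch, which scores each 4-second interval of a noise recording and returns the proportion of intervals meeting a communication criterion. This is a crude stand-in, not the ESII: the real ESII weights short-time speech-to-noise ratios across frequency bands under the SII framework, which is not reproduced here, and the criterion value is invented.

```python
# Crude stand-in for the per-interval ESII analysis: score each 4-s
# interval by its broadband speech-to-noise margin and report the
# proportion of intervals meeting an assumed criterion.
import numpy as np

def likelihood_effective(noise, fs, speech_rms_db, criterion_db=15.0):
    n = 4 * fs                                     # samples per 4-s interval
    margins = []
    for i in range(0, len(noise) - n + 1, n):
        seg = noise[i:i + n]
        noise_db = 20 * np.log10(np.sqrt(np.mean(seg**2)) + 1e-12)
        margins.append(speech_rms_db - noise_db)   # raising vocal effort raises this
    return float(np.mean(np.asarray(margins) >= criterion_db))
```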
The effect of compression and attention allocation on speech intelligibility. II
NASA Astrophysics Data System (ADS)
Choi, Sangsook; Carrell, Thomas
2004-05-01
Previous investigations of the effects of amplitude compression on measures of speech intelligibility have shown inconsistent results. Recently, a novel paradigm was used to investigate the possibility of more consistent findings with a measure of speech perception that is not based entirely on intelligibility (Choi and Carrell, 2003). That study exploited a dual-task paradigm using a pursuit rotor online visual-motor tracking task (Dlhopolsky, 2000) along with a word repetition task. Intensity-compressed words caused reduced performance on the tracking task as compared to uncompressed words when subjects engaged in a simultaneous word repetition task. This suggested an increased cognitive load when listeners processed compressed words. A stronger result might be obtained if a single resource (linguistic) is required rather than two (linguistic and visual-motor) resources. In the present experiment a visual lexical decision task and an auditory word repetition task were used. The visual stimuli for the lexical decision task were blurred and presented in a noise background. The compressed and uncompressed words for repetition were placed in speech-shaped noise. Participants with normal hearing and vision conducted word repetition and lexical decision tasks both independently and simultaneously. The pattern of results is discussed and compared to the previous study.
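As background for the compression manipulation in the study above, a minimal static envelope compressor is sketched below. The ratio and the peak-referenced gain rule are illustrative assumptions; real hearing-aid compressors add attack and release dynamics and operate per frequency band.

```python
# Minimal static amplitude compressor: shrink envelope dynamics relative
# to the signal peak by a fixed ratio. Illustrative only.
import numpy as np
from scipy.signal import hilbert

def compress(x, ratio=3.0, eps=1e-8):
    env = np.abs(hilbert(x)) + eps                # instantaneous envelope
    env_db = 20 * np.log10(env)
    ref_db = env_db.max()                         # reference level: the peak
    target_db = ref_db + (env_db - ref_db) / ratio
    return x * 10 ** ((target_db - env_db) / 20)  # level-dependent gain
```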
Donaldson, Morag L; Cooper, Lynn S M
2013-09-01
Young children's speech is typically more linguistically sophisticated than their writing. However, there are grounds for asking whether production of cohesive devices, such as verb-phrase anaphora (VPA), might represent an exception to this developmental pattern, as cohesive devices are generally more important in writing than in speech and so might be expected to be more frequent in children's writing than in their speech. The study reported herein aims to compare the frequency of children's production of VPA constructions (e.g., Mary is eating an apple and so is John) between a written and a spoken task. Forty-eight children from each of two age groups participated: 7-year-olds and 10-year-olds. All the children received both a spoken and a written sentence completion task designed to elicit production of VPA. Task order was counterbalanced. VPA production was significantly more frequent in speech than in writing and when the spoken task was presented first. Surprisingly, the 7-year-olds produced VPA constructions more frequently than the 10-year-olds. Despite the greater importance of cohesion in writing than in speech, children's production of VPA is similar to their production of most other aspects of language in that more sophisticated constructions are used more frequently in speech than in writing. Children's written production of cohesive devices could probably be enhanced by presenting spoken tasks immediately before written tasks. The lower frequency of VPA production in the older children may reflect syntactic priming effects or a belief that they should produce sentences that are as fully specified as possible. © 2012 The British Psychological Society.
Mefferd, Antje S.
2016-01-01
The degree of speech movement pattern consistency can provide information about speech motor control. Although tongue motor control is particularly important because of the tongue's primary contribution to the speech acoustic signal, capturing tongue movements during speech remains difficult and costly. This study sought to determine if formant movements could be used to estimate tongue movement pattern consistency indirectly. Two age groups (seven young adults and seven older adults) and six speech conditions (typical, slow, loud, clear, fast, bite block speech) were selected to elicit an age- and task-dependent performance range in tongue movement pattern consistency. Kinematic and acoustic spatiotemporal indexes (STI) were calculated based on sentence-length tongue movement and formant movement signals, respectively. Kinematic and acoustic STI values showed strong associations across talkers and moderate to strong associations for each talker across speech tasks, although in cases where task-related tongue motor performance changes were relatively small, the acoustic STI values were poorly associated with kinematic STI values. These findings suggest that, depending on the sensitivity needs, formant movement pattern consistency could be used in lieu of direct kinematic analysis to indirectly examine speech motor control. PMID:27908069
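The spatiotemporal index used here is conventionally computed by amplitude-normalizing (z-scoring) and linearly time-normalizing each repetition of a movement signal, then summing the standard deviations across repetitions at fixed relative time points; 50 points at 2% intervals is the usual choice. A minimal sketch under those conventions:

```python
# Spatiotemporal index (STI): sum of across-repetition SDs after amplitude
# and time normalization. 50 normalized time points is the usual convention.
import numpy as np

def sti(repetitions, n_points=50):
    normalized = []
    for rep in repetitions:                        # one 1-D trajectory per trial
        rep = np.asarray(rep, dtype=float)
        z = (rep - rep.mean()) / rep.std()         # amplitude normalization
        t_old = np.linspace(0.0, 1.0, len(z))
        t_new = np.linspace(0.0, 1.0, n_points)
        normalized.append(np.interp(t_new, t_old, z))  # linear time normalization
    return float(np.vstack(normalized).std(axis=0).sum())
# Lower STI indicates more consistent movement patterning across repetitions.
```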
The Effects of Divided Attention on Speech Motor, Verbal Fluency, and Manual Task Performance
ERIC Educational Resources Information Center
Dromey, Christopher; Shim, Erin
2008-01-01
Purpose: The goal of this study was to evaluate aspects of the "functional distance hypothesis," which predicts that tasks regulated by brain networks in closer anatomic proximity will interfere more with each other than tasks controlled by spatially distant regions. Speech, verbal fluency, and manual motor tasks were examined to ascertain whether…
Song and speech: brain regions involved with perception and covert production.
Callan, Daniel E; Tsytsarev, Vassiliy; Hanakawa, Takashi; Callan, Akiko M; Katsuhara, Maya; Fukuyama, Hidenao; Turner, Robert
2006-07-01
This 3-T fMRI study investigates brain regions similarly and differentially involved with listening and covert production of singing relative to speech. Given the greater use of auditory-motor self-monitoring and imagery with respect to consonance in singing, brain regions involved with these processes are predicted to be differentially active for singing more than for speech. The stimuli consisted of six Japanese songs. A block design was employed in which the tasks for the subject were to listen passively to singing of the song lyrics, passively listen to speaking of the song lyrics, covertly sing the song lyrics visually presented, covertly speak the song lyrics visually presented, and to rest. The conjunction of the passive listening and covert production tasks used in this study allows general neural processes underlying both perception and production to be discerned that are not exclusively a result of stimulus-induced auditory processing or low-level articulatory motor control. Brain regions involved with both perception and production for singing as well as speech were found to include the left planum temporale/superior temporal parietal region, as well as left and right premotor cortex, the lateral aspect of the VI lobule of the posterior cerebellum, anterior superior temporal gyrus, and planum polare. Greater activity for the singing over the speech condition for both the listening and covert production tasks was found in the right planum temporale. Greater activity for singing over speech was also present in brain regions involved with consonance: the orbitofrontal cortex (listening task) and the subcallosal cingulate (covert production task). The results are consistent with the PT mediating representational transformation across auditory and motor domains in response to consonance for singing over that of speech. Hemispheric laterality was assessed by paired t tests between active voxels in the contrast of interest relative to the left-right flipped contrast of interest calculated from images normalized to the left-right reflected template. Consistent with some hypotheses regarding hemispheric specialization, a pattern of differential laterality for speech over singing (both covert production and listening tasks) occurs in the left temporal lobe, whereas singing over speech (listening task only) occurs in the right temporal lobe.
Speech-Action Coordination in Young Children.
ERIC Educational Resources Information Center
Balamore, Usha; Wozniak, Robert H.
1984-01-01
Speech-action coordination in 100 three- and four-year-olds was measured according to a modified version of Wozniak's hammering-board task. Four instructional conditions (instruction, demonstration, vocalization, no vocalization) were presented in a numerical task ("Hit four times") and in two spatial tasks: three-color ("Hit red,…
Potential interactions among linguistic, autonomic, and motor factors in speech.
Kleinow, Jennifer; Smith, Anne
2006-05-01
Though anecdotal reports link certain speech disorders to increases in autonomic arousal, few studies have described the relationship between arousal and speech processes. Additionally, it is unclear how increases in arousal may interact with other cognitive-linguistic processes to affect speech motor control. In this experiment we examine potential interactions between autonomic arousal, linguistic processing, and speech motor coordination in adults and children. Autonomic responses (heart rate, finger pulse volume, tonic skin conductance, and phasic skin conductance) were recorded simultaneously with upper and lower lip movements during speech. The lip aperture variability (LA variability index) across multiple repetitions of sentences that varied in length and syntactic complexity was calculated under low- and high-arousal conditions. High arousal conditions were elicited by performance of the Stroop color word task. Children had significantly higher lip aperture variability index values across all speaking tasks, indicating more variable speech motor coordination. Increases in syntactic complexity and utterance length were associated with increases in speech motor coordination variability in both speaker groups. There was a significant effect of Stroop task, which produced increases in autonomic arousal and increased speech motor variability in both adults and children. These results provide novel evidence that high arousal levels can influence speech motor control in both adults and children. (c) 2006 Wiley Periodicals, Inc.
Schall, Sonja; von Kriegstein, Katharina
2014-01-01
It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers' voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker's face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.
Texting while driving: is speech-based text entry less risky than handheld text entry?
He, J; Chaparro, A; Nguyen, B; Burge, R J; Crandall, J; Chaparro, B; Ni, R; Cao, S
2014-11-01
Research indicates that using a cell phone to talk or text while maneuvering a vehicle impairs driving performance. However, few published studies directly compare the distracting effects of texting using a hands-free (i.e., speech-based interface) versus handheld cell phone, which is an important issue for legislation, automotive interface design and driving safety training. This study compared the effect of speech-based versus handheld text entries on simulated driving performance by asking participants to perform a car following task while controlling the duration of a secondary text-entry task. Results showed that both speech-based and handheld text entries impaired driving performance relative to the drive-only condition by causing more variation in speed and lane position. Handheld text entry also increased the brake response time and increased variation in headway distance. Text entry using a speech-based cell phone was less detrimental to driving performance than handheld text entry. Nevertheless, the speech-based text entry task still significantly impaired driving compared to the drive-only condition. These results suggest that speech-based text entry disrupts driving, but reduces the level of performance interference compared to text entry with a handheld device. In addition, the difference in the distraction effect caused by speech-based and handheld text entry is not simply due to the difference in task duration. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Feenaughty, Lynda
Purpose: The current study sought to investigate the separate effects of dysarthria and cognitive status on global speech timing, speech hesitation, and linguistic complexity characteristics, and how these speech behaviors influence listener impressions, for three connected speech tasks presumed to differ in cognitive-linguistic demand, in four carefully defined speaker groups: (1) MS with cognitive deficits (MSCI), (2) MS with clinically diagnosed dysarthria and intact cognition (MSDYS), (3) MS without dysarthria or cognitive deficits (MS), and (4) healthy talkers (CON). The relationship between neuropsychological test scores and speech-language production and perceptual variables for speakers with cognitive deficits was also explored. Methods: 48 speakers participated, including 36 individuals reporting a neurological diagnosis of MS and 12 healthy talkers. The three MS groups and the control group each contained 12 speakers (8 women and 4 men). Cognitive function was quantified using standard clinical tests of memory, information processing speed, and executive function. A standard z-score of ≤ -1.50 indicated deficits in a given cognitive domain. Three certified speech-language pathologists determined the clinical diagnosis of dysarthria for speakers with MS. Experimental speech tasks of interest included audio-recordings of an oral reading of the Grandfather passage and two spontaneous speech samples in the form of Familiar and Unfamiliar descriptive discourse. Various measures of spoken language were of interest. Suprasegmental acoustic measures included speech and articulatory rate. Linguistic speech hesitation measures included pause frequency (i.e., silent and filled pauses), mean silent pause duration, grammatical appropriateness of pauses, and interjection frequency. For the two discourse samples, three standard measures of language complexity were obtained: subordination index, inter-sentence cohesion adequacy, and lexical diversity. Ten listeners judged each speech sample on the perceptual construct of Speech Severity using a visual analog scale. Additional measures obtained to describe participants included the Sentence Intelligibility Test (SIT), the 10-item Communication Participation Item Bank (CPIB), and standard biopsychosocial measures of depression (Beck Depression Inventory-Fast Screen; BDI-FS), fatigue (Fatigue Severity Scale; FSS), and overall disease severity (Expanded Disability Status Scale; EDSS). Healthy controls completed all measures, with the exception of the CPIB and EDSS. All data were analyzed using standard descriptive and parametric statistics. For the MSCI group, the relationships between neuropsychological test scores and speech-language variables were explored for each speech task using Pearson correlations. The relationship between neuropsychological test scores and Speech Severity was also explored. Results and Discussion: Topic familiarity for descriptive discourse did not strongly influence speech production or perceptual variables; however, results indicated predicted task-related differences for some spoken language measures. With the exception of the MSCI group, all speaker groups produced the same or slower global speech timing (i.e., speech and articulatory rates), more silent and filled pauses, more grammatically appropriate pauses, and longer silent pause durations in spontaneous discourse compared to reading aloud. Results revealed no appreciable task differences for linguistic complexity measures. Results indicated group differences for speech rate.
The MSCI group produced significantly faster speech rates compared to the MSDYS group. Both the MSDYS and the MSCI groups were judged to have significantly poorer perceived Speech Severity compared to typically aging adults. The Task × Group interaction was significant only for the number of silent pauses. The MSDYS group produced fewer silent pauses in spontaneous speech and more silent pauses in the reading task compared to other groups. Finally, correlation analysis revealed moderate relationships between neuropsychological test scores and speech hesitation measures within the MSCI group. Slower information processing and poorer memory were significantly correlated with more silent pauses, and poorer executive function was associated with fewer filled pauses in the Unfamiliar discourse task. Results have both clinical and theoretical implications. Overall, clinicians should demonstrate caution when interpreting global measures of speech timing and perceptual measures in the absence of information about cognitive ability. Results also have implications for a comprehensive model of spoken language incorporating cognitive, linguistic, and motor variables.
ERIC Educational Resources Information Center
Al-Namlah, Abdulrahman S.; Meins, Elizabeth; Fernyhough, Charles
2012-01-01
We investigated relations between 4- and 7-year-olds' (N=58) autobiographical memory and their use of self-regulatory private speech in a non-mnemonic context (a cognitive planning task). Children's use of self-regulatory private speech during the planning task was associated with longer autobiographical narratives which included specific rather…
A Task Analysis for Teaching the Organization of an Informative Speech.
ERIC Educational Resources Information Center
Parks, Arlie Muller
The purpose of this paper is to demonstrate a task analysis of the objectives needed to organize an effective information-giving speech. A hierarchical structure of the behaviors needed to deliver a well-organized extemporaneous information-giving speech is presented, with some behaviors as subtasks for the unit objective and the others as…
Vuolo, Janet; Goffman, Lisa
2017-01-01
This exploratory treatment study used phonetic transcription and speech kinematics to examine changes in segmental and articulatory variability. Nine children, ages 4 to 8 years, served as participants, including two with childhood apraxia of speech (CAS), five with speech sound disorder (SSD), and two who were typically developing (TD). Children practised producing agent + action phrases in an imitation task (low linguistic load) and a retrieval task (high linguistic load) over five sessions. In the imitation task in session one, both participants with CAS showed high degrees of segmental and articulatory variability. After five sessions, imitation practice resulted in increased articulatory variability for five participants. Retrieval practice resulted in decreased articulatory variability in three participants with SSD. These results suggest that short-term speech production practice in rote imitation disrupts articulatory control in children with and without CAS. In contrast, tasks that require linguistic processing may scaffold learning for children with SSD but not CAS. PMID:27960554
Gennari, Silvia P; Millman, Rebecca E; Hymers, Mark; Mattys, Sven L
2018-06-12
Perceiving speech while performing another task is a common challenge in everyday life. How the brain controls resource allocation during speech perception remains poorly understood. Using functional magnetic resonance imaging (fMRI), we investigated the effect of cognitive load on speech perception by examining brain responses of participants performing a phoneme discrimination task and a visual working memory task simultaneously. The visual task involved holding either a single meaningless image in working memory (low cognitive load) or four different images (high cognitive load). Performing the speech task under high load, compared to low load, resulted in decreased activity in pSTG/pMTG and increased activity in visual occipital cortex and in two regions known to contribute to visual attention regulation: the superior parietal lobule (SPL) and the paracingulate and anterior cingulate gyrus (PaCG, ACG). Critically, activity in PaCG/ACG was correlated with performance in the visual task and with activity in pSTG/pMTG: increased activity in PaCG/ACG was observed for individuals with poorer visual performance and with decreased activity in pSTG/pMTG. Moreover, activity in a pSTG/pMTG seed region showed psychophysiological interactions with areas of the PaCG/ACG, with stronger interaction in the high-load than the low-load condition. These findings show that the acoustic analysis of speech is affected by the demands of a concurrent visual task and that the PaCG/ACG plays a role in allocating cognitive resources to concurrent auditory and visual information. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
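The psychophysiological interaction (PPI) analysis mentioned above boils down to a regression in which target-region activity is modeled from the seed time series, the task regressor, and their mean-centered product; the interaction weight indexes load-dependent coupling. A bare-bones sketch follows; real fMRI pipelines deconvolve the seed to the neural level and reconvolve the product with a hemodynamic response function, which is omitted here.

```python
# Core PPI regression: target ~ seed + task + (seed * task), with the
# interaction term built from mean-centered factors.
import numpy as np

def ppi_betas(target, seed, task):
    inter = (seed - seed.mean()) * (task - task.mean())   # PPI term
    X = np.column_stack([np.ones_like(seed), seed, task, inter])
    betas, *_ = np.linalg.lstsq(X, target, rcond=None)
    return betas   # [intercept, physiological, psychological, PPI]
```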
Control of Task Sequences: What Is the Role of Language?
ERIC Educational Resources Information Center
Mayr, Ulrich; Kleffner-Canucci, Killian; Kikumoto, Atsushi; Redford, Melissa A.
2014-01-01
It is almost a truism that language aids serial-order control through self-cuing of upcoming sequential elements. We measured speech onset latencies as subjects performed hierarchically organized task sequences while "thinking aloud" each task label. Surprisingly, speech onset latencies and response times (RTs) were highly synchronized,…
ERIC Educational Resources Information Center
Huber, Jessica E.; Darling, Meghan
2011-01-01
Purpose: To examine the effects of cognitive-linguistic deficits and respiratory physiologic changes on respiratory support for speech in individuals with Parkinson's disease (PD) using two speech tasks: reading and extemporaneous speech. Method: Five women with PD, 9 men with PD, and 14 age- and sex-matched control participants read a passage and…
The Mechanism of Speech Processing in Congenital Amusia: Evidence from Mandarin Speakers
Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren
2012-01-01
Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results. PMID:22347374
Francis, Alexander L
2010-02-01
Perception of speech in competing speech is facilitated by spatial separation of the target and distracting speech, but this benefit may arise at either a perceptual or a cognitive level of processing. Load theory predicts different effects of perceptual and cognitive (working memory) load on selective attention in flanker task contexts, suggesting that this paradigm may be used to distinguish levels of interference. Two experiments examined interference from competing speech during a word recognition task under different perceptual and working memory loads in a dual-task paradigm. Listeners identified words produced by a talker of one gender while ignoring a talker of the other gender. Perceptual load was manipulated using a nonspeech response cue, with response conditional upon either one or two acoustic features (pitch and modulation). Memory load was manipulated with a secondary task consisting of one or six visually presented digits. In the first experiment, the target and distractor were presented at different virtual locations (0° and 90°, respectively), whereas in the second, all the stimuli were presented from the same apparent location. Results suggest that spatial cues improve resistance to distraction in part by reducing working memory demand.
CNV amplitude as a neural correlate for stuttering frequency: A case report of acquired stuttering.
Vanhoutte, Sarah; Van Borsel, John; Cosyns, Marjan; Batens, Katja; van Mierlo, Pieter; Hemelsoet, Dimitri; Van Roost, Dirk; Corthals, Paul; De Letter, Miet; Santens, Patrick
2014-11-01
A neural hallmark of developmental stuttering is abnormal articulatory programming. One of the neurophysiological substrates of articulatory preparation is the contingent negative variation (CNV). Unfortunately, CNV tasks are rarely performed in persons who stutter and mainly focus on the effect of task variation rather than on interindividual variation in stutter related variables. However, variations in motor programming seem to be related to variation in stuttering frequency. The current study presents a case report of acquired stuttering following stroke and stroke related surgery in the left superior temporal gyrus. A speech related CNV task was administered at four points in time with differences in stuttering severity and frequency. Unexpectedly, CNV amplitudes at electrode sites approximating bilateral motor and left inferior frontal gyrus appeared to be inversely proportional to stuttering frequency. The higher the stuttering frequency, the lower the activity for articulatory preparation. Thus, the amount of disturbance in motor programming seems to determine stuttering frequency. At right frontal electrodes, a relative increase in CNV amplitude was seen at the test session with most severe stuttering. Right frontal overactivation is cautiously suggested to be a compensation strategy. In conclusion, late CNV amplitude elicited by a relatively simple speech task seems to be able to provide an objective, neural correlate of stuttering frequency. The present case report supports the hypothesis that motor preparation has an important role in stuttering. Copyright © 2014 Elsevier Ltd. All rights reserved.
Effect of perceptual load on semantic access by speech in children.
Jerger, Susan; Damian, Markus F; Mills, Candice; Bartlett, James; Tye-Murray, Nancy; Abdi, Hervé
2013-04-01
To examine whether semantic access by speech requires attention in children. Children (N = 200) named pictures and ignored distractors on a cross-modal (distractors: auditory-no face) or multimodal (distractors: auditory-static face and audiovisual-dynamic face) picture word task. The cross-modal task had a low load, and the multimodal task had a high load (i.e., respectively naming pictures displayed on a blank screen vs. below the talker's face on his T-shirt). Semantic content of distractors was manipulated to be related vs. unrelated to the picture (e.g., picture "dog" with distractors "bear" vs. "cheese"). If irrelevant semantic content manipulation influences naming times on both tasks despite variations in loads, Lavie's (2005) perceptual load model proposes that semantic access is independent of capacity-limited attentional resources; if, however, irrelevant content influences naming only on the cross-modal task (low load), the perceptual load model proposes that semantic access is dependent on attentional resources exhausted by the higher load task. Irrelevant semantic content affected performance for both tasks in 6- to 9-year-olds but only on the cross-modal task in 4- to 5-year-olds. The addition of visual speech did not influence results on the multimodal task. Younger and older children differ in dependence on attentional resources for semantic access by speech.
Relation between measures of speech-in-noise performance and measures of efferent activity
NASA Astrophysics Data System (ADS)
Smith, Brad; Harkrider, Ashley; Burchfield, Samuel; Nabelek, Anna
2003-04-01
Individual differences in auditory perceptual abilities in noise are well documented but the factors causing such variability are unclear. The purpose of this study was to determine if individual differences in responses measured from the auditory efferent system were correlated to individual variations in speech-in-noise performance. The relation between behavioral performance on three speech-in-noise tasks and two objective measures of the efferent auditory system were examined in thirty normal-hearing, young adults. Two of the speech-in-noise tasks measured an acceptable noise level, the maximum level of speech-babble noise that a subject is willing to accept while listening to a story. For these, the acceptable noise level was evaluated using both an ipsilateral (story and noise in same ear) and a contralateral (story and noise in opposite ears) paradigm. The third speech-in-noise task evaluated speech recognition using monosyllabic words presented in competing speech babble. Auditory efferent activity was assessed by examining the resulting suppression of click-evoked otoacoustic emissions following the introduction of a contralateral, broad-band stimulus and the activity of the ipsilateral and contralateral acoustic reflex arc was evaluated using tones and broad-band noise. Results will be discussed relative to current theories of speech in noise performance and auditory inhibitory processes.
Neural Recruitment for the Production of Native and Novel Speech Sounds
Moser, Dana; Fridriksson, Julius; Bonilha, Leonardo; Healy, Eric W.; Baylis, Gordon; Baker, Julie; Rorden, Chris
2010-01-01
Two primary areas of damage have been implicated in apraxia of speech (AOS) based on the time post-stroke: (1) the left inferior frontal gyrus (IFG) in acute patients, and (2) the left anterior insula (aIns) in chronic patients. While AOS is widely characterized as a disorder in motor speech planning, little is known about the specific contributions of each of these regions in speech. The purpose of this study was to investigate cortical activation during speech production with a specific focus on the aIns and the IFG in normal adults. While undergoing sparse fMRI, 30 normal adults completed a 30-minute speech-repetition task consisting of three-syllable nonwords that contained either (a) English (native) syllables or (b) Non-English (novel) syllables. When the novel syllable productions were compared to the native syllable productions, greater neural activation was observed in the aIns and IFG, particularly during the first 10 minutes of the task when novelty was the greatest. Although activation in the aIns remained high throughout the task for novel productions, greater activation was clearly demonstrated when the initial 10 minutes were compared to the final 10 minutes of the task. These results suggest increased activity within an extensive neural network, including the aIns and IFG, when the motor speech system is taxed, such as during the production of novel speech. We speculate that the amount of left aIns recruitment during speech production may be related to the internal construction of the motor speech unit such that the degree of novelty/automaticity would result in more or less demands respectively. The role of the IFG as a storehouse and integrative processor for previously acquired routines is also discussed. PMID:19385020
Rimvall, M K; Clemmensen, L; Munkholm, A; Rask, C U; Larsen, J T; Skovgaard, A M; Simons, C J P; van Os, J; Jeppesen, P
2016-10-01
Auditory verbal hallucinations (AVH) are common during development and may arise due to dysregulation in top-down processing of sensory input. This study was designed to examine the frequency and correlates of speech illusions measured using the White Noise (WN) task in children from the general population. Associations between speech illusions and putative risk factors for psychotic disorder and negative affect were examined. A total of 1486 children aged 11-12 years of the Copenhagen Child Cohort 2000 were examined with the WN task. Psychotic experiences and negative affect were determined using the Kiddie-SADS-PL. Register data described family history of mental disorders. Exaggerated Theory of Mind functioning (hyper-ToM) was measured by the ToM Storybook Frederik. A total of 145 (10%) children experienced speech illusions (hearing speech in the absence of speech stimuli), of which 102 (70%) experienced illusions perceived by the child as positive or negative (affectively salient). Experiencing hallucinations during the last month was associated with affectively salient speech illusions in the WN task [adjusted for general cognitive ability: odds ratio (aOR) 2.01, 95% confidence interval (CI) 1.03-3.93]. Negative affect, both last month and lifetime, was also associated with affectively salient speech illusions (aOR 2.01, 95% CI 1.05-3.83 and aOR 1.79, 95% CI 1.11-2.89, respectively). Speech illusions were not associated with delusions, hyper-ToM, or family history of mental disorders. Speech illusions were elicited in typically developing children in a WN-test paradigm and point to an affective pathway to AVH mediated by dysregulation in top-down processing of sensory input.
Language familiarity modulates relative attention to the eyes and mouth of a talker.
Barenholtz, Elan; Mavica, Lauren; Lewkowicz, David J
2016-02-01
We investigated whether the audiovisual speech cues available in a talker's mouth elicit greater attention when adults have to process speech in an unfamiliar language vs. a familiar language. Participants performed a speech-encoding task while watching and listening to videos of a talker in a familiar language (English) or an unfamiliar language (Spanish or Icelandic). Attention to the mouth increased in monolingual subjects in response to an unfamiliar language condition but did not in bilingual subjects when the task required speech processing. In the absence of an explicit speech-processing task, subjects attended equally to the eyes and mouth in response to both familiar and unfamiliar languages. Overall, these results demonstrate that language familiarity modulates selective attention to the redundant audiovisual speech cues in a talker's mouth in adults. When our findings are considered together with similar findings from infants, they suggest that this attentional strategy emerges very early in life. Copyright © 2015 Elsevier B.V. All rights reserved.
Fargier, Raphaël; Laganaro, Marina
2017-03-01
Picture naming tasks are largely used to elicit the production of specific words and sentences in psycholinguistic and neuroimaging research. However, the generation of lexical concepts from a visual input is clearly not the exclusive way speech production is triggered. In inferential speech encoding, the concept is not provided from a visual input but is elaborated through semantic and/or episodic associations. It is therefore likely that the cognitive operations leading to lexical selection and word encoding are different in inferential and referential expressive language. In particular, in picture naming lexical selection might ensue from a simple association between a perceptual visual representation and a word with minimal semantic processes, whereas richer semantic associations are involved in lexical retrieval in inferential situations. Here we address this hypothesis by analyzing ERP correlates during word production in a referential and an inferential task. The participants produced the same words elicited from pictures or from short written definitions. The two tasks displayed similar electrophysiological patterns only in the time period preceding the verbal response. In the stimulus-locked ERPs, waveform amplitudes and periods of stable global electrophysiological patterns differed across tasks after the P100 component and until 400-500 ms, suggesting the involvement of different, task-specific neural networks. Based on the analysis of the time windows affected by specific semantic and lexical variables in each task, we conclude that lexical selection is underpinned by a different set of conceptual and brain processes, with semantic processes clearly preceding word retrieval in naming from definition, whereas semantic information is enriched in parallel with word retrieval in picture naming.
Measuring L2 Speakers' Interactional Ability Using Interactive Speech Tasks
ERIC Educational Resources Information Center
van Batenburg, Eline S. L.; Oostdam, Ron J.; van Gelderen, Amos J. S.; de Jong, Nivja H.
2018-01-01
This article explores ways to assess interactional performance, and reports on the use of a test format that standardizes the interlocutor's linguistic and interactional contributions to the exchange. It describes the construction and administration of six scripted speech tasks (instruction, advice, and sales tasks) with pre-vocational learners (n…
Effect of attentional load on audiovisual speech perception: evidence from ERPs.
Alsius, Agnès; Möttönen, Riikka; Sams, Mikko E; Soto-Faraco, Salvador; Tiippana, Kaisa
2014-01-01
Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual, and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e., a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech.
Auditory-Motor Processing of Speech Sounds
Möttönen, Riikka; Dutton, Rebekah; Watkins, Kate E.
2013-01-01
The motor regions that control movements of the articulators activate during listening to speech and contribute to performance in demanding speech recognition and discrimination tasks. Whether the articulatory motor cortex modulates auditory processing of speech sounds is unknown. Here, we aimed to determine whether the articulatory motor cortex affects the auditory mechanisms underlying discrimination of speech sounds in the absence of demanding speech tasks. Using electroencephalography, we recorded responses to changes in sound sequences, while participants watched a silent video. We also disrupted the lip or the hand representation in left motor cortex using transcranial magnetic stimulation. Disruption of the lip representation suppressed responses to changes in speech sounds, but not piano tones. In contrast, disruption of the hand representation had no effect on responses to changes in speech sounds. These findings show that disruptions within, but not outside, the articulatory motor cortex impair automatic auditory discrimination of speech sounds. The findings provide evidence for the importance of auditory-motor processes in efficient neural analysis of speech sounds. PMID:22581846
Howell, Ashley N; Weeks, Justin W
2017-01-01
Psychosocial factors, such as gender role norms, may impact how social anxiety disorder (SAD) is experienced and expressed in different social contexts for women. However, to date, these factors have not been examined via experimental methodology. This was a cross-sectional, quasi-experimental controlled study. The current study included 48 highly socially anxious (HSA) women (70.9% meeting criteria for SAD) and examined the relationships among psychosocial factors (i.e., gender role self-discrepancies and self-perceived physical attractiveness), self-perceived social performance, and state anxiety across two in vivo social tasks (i.e., conversation and opinion speech). On average, participants reported believing that they ought to be less feminine for the speech task and more masculine for both the conversation and speech tasks. Also, for the conversation task, only lower self-rated attractiveness predicted poorer self-perceived performance and greater post-task state anxiety, above gender role self-discrepancies and confederate gender. For the speech task, only greater self-discrepancy in prototypical masculine traits predicted poorer performance ratings, and it was related to greater state anxiety in anticipation of the task. For HSA women, psychosocial factors may play different roles in social anxiety across social contexts.
Roy, Nelson; Mazin, Alqhazo; Awan, Shaheen N
2014-03-01
Distinguishing muscle tension dysphonia (MTD) from adductor spasmodic dysphonia (ADSD) can be difficult. Unlike MTD, ADSD is described as "task-dependent," implying that dysphonia severity varies depending upon the demands of the vocal task, with connected speech thought to be more symptomatic than sustained vowels. This case-control study used an acoustic index of dysphonia severity (i.e., the Cepstral Spectral Index of Dysphonia [CSID]) to (1) assess the value of "task dependency" in distinguishing ADSD from MTD and (2) examine associations between the CSID and listener ratings. CSID estimates of dysphonia severity for connected speech and sustained vowels of patients with ADSD (n = 36) and MTD (n = 45) were compared. The diagnostic precision of task dependency (as evidenced by differences in CSID-estimated dysphonia severity between connected speech and sustained vowels) was examined. In ADSD, CSID-estimated severity for connected speech (M = 39.2, SD = 22.0) was significantly worse than for sustained vowels (M = 29.3, SD = 21.9) [P = .020], whereas in MTD, no significant difference in CSID-estimated severity was observed between connected speech (M = 55.1, SD = 23.8) and sustained vowels (M = 50.0, SD = 27.4) [P = .177]. CSID evidence of task dependency correctly identified 66.7% of ADSD cases (sensitivity) and 64.4% of MTD cases (specificity). CSID and listener ratings were significantly correlated. Task dependency in ADSD, as revealed by differences in acoustically derived estimates of dysphonia severity between connected speech and sustained vowel production, is a potentially valuable diagnostic marker. © 2013 The American Laryngological, Rhinological and Otological Society, Inc.
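The decision rule implied by this abstract can be sketched as follows: flag a case as ADSD when the connected-speech CSID exceeds the sustained-vowel CSID by more than some criterion, then tally sensitivity and specificity against the clinical diagnoses. The criterion value and variable names are hypothetical; the abstract reports only the resulting 66.7% sensitivity and 64.4% specificity.

```python
import numpy as np

def flag_adsd(csid_speech, csid_vowel, criterion=5.0):
    # Task dependency: connected speech worse by more than `criterion`
    # CSID points (the criterion here is a hypothetical placeholder).
    return (np.asarray(csid_speech) - np.asarray(csid_vowel)) > criterion

def sensitivity_specificity(predicted_adsd, is_adsd):
    pred = np.asarray(predicted_adsd, dtype=bool)
    truth = np.asarray(is_adsd, dtype=bool)
    # Proportion of true ADSD flagged, and of true MTD not flagged.
    return pred[truth].mean(), (~pred[~truth]).mean()
```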
ERIC Educational Resources Information Center
Pattamadilok, Chotiga; Nelis, Aubéline; Kolinsky, Régine
2014-01-01
Studies on proficient readers showed that speech processing is affected by knowledge of the orthographic code. Yet, the automaticity of the orthographic influence depends on task demand. Here, we addressed this automaticity issue in normal and dyslexic adult readers by comparing the orthographic effects obtained in two speech processing tasks that…
ERIC Educational Resources Information Center
Higgins, Meaghan C.; Penney, Sarah B.; Robertson, Erin K.
2017-01-01
The roles of phonological short-term memory (pSTM) and speech perception in spoken sentence comprehension were examined in an experimental design. Deficits in pSTM and speech perception were simulated through task demands while typically-developing children (N = 71) completed a sentence-picture matching task. Children performed the control,…
Comparison of different speech tasks among adults who stutter and adults who do not stutter
Ritto, Ana Paula; Costa, Julia Biancalana; Juste, Fabiola Staróbole; de Andrade, Claudia Regina Furquim
2016-01-01
OBJECTIVES: In this study, we compared the performance of both fluent speakers and people who stutter in three different speaking situations: monologue speech, oral reading and choral reading. This study follows the assumption that the neuromotor control of speech can be influenced by external auditory stimuli in both speakers who stutter and speakers who do not stutter. METHOD: Seventeen adults who stutter and seventeen adults who do not stutter were assessed in three speaking tasks: monologue, oral reading (solo reading aloud) and choral reading (reading in unison with the evaluator). Speech fluency and rate were measured for each task. RESULTS: The participants who stuttered had a lower frequency of stuttering during choral reading than during monologue and oral reading. CONCLUSIONS: According to the dual premotor system model, choral speech enhanced fluency by providing external cues for the timing of each syllable, compensating for deficient internal cues. PMID:27074176
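A small sketch of the kind of fluency comparison reported above, using percent syllables stuttered (%SS) per task. The abstract does not name its exact fluency metric, so %SS, and the counts below, are purely illustrative.

```python
# Hypothetical syllable counts per task for one speaker.
def percent_ss(stuttered_syllables, total_syllables):
    """Percent syllables stuttered, a common fluency measure."""
    return 100.0 * stuttered_syllables / total_syllables

for task, (stut, total) in {
    "monologue": (42, 520),
    "oral reading": (38, 500),
    "choral reading": (6, 500),
}.items():
    print(f"{task}: {percent_ss(stut, total):.1f} %SS")
```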
Sleep-Driven Computations in Speech Processing
Frost, Rebecca L. A.; Monaghan, Padraic
2017-01-01
Acquiring language requires segmenting speech into individual words, and abstracting over those words to discover grammatical structure. However, these tasks can be conflicting—on the one hand requiring memorisation of precise sequences that occur in speech, and on the other requiring a flexible reconstruction of these sequences to determine the grammar. Here, we examine whether speech segmentation and generalisation of grammar can occur simultaneously—with the conflicting requirements for these tasks being overcome by sleep-related consolidation. After exposure to an artificial language comprising words containing non-adjacent dependencies, participants underwent periods of consolidation involving either sleep or wake. Participants who slept before testing demonstrated a sustained boost to word learning and a short-term improvement to grammatical generalisation of the non-adjacencies, with improvements after sleep outweighing gains seen after an equal period of wake. Thus, we propose that sleep may facilitate processing for these conflicting tasks in language acquisition, but with enhanced benefits for speech segmentation. PMID:28056104
Speech Prosody Across Stimulus Types for Individuals with Parkinson's Disease.
K-Y Ma, Joan; Schneider, Christine B; Hoffmann, Rüdiger; Storch, Alexander
2015-01-01
Up to 89% of individuals with Parkinson's disease (PD) experience speech problems over the course of the disease. Speech prosody and intelligibility are two of the most affected areas in hypokinetic dysarthria. However, assessment of these areas can be problematic, as speech prosody and intelligibility may be affected by the type of speech materials employed. This study comparatively explored the effects of different types of speech stimuli on speech prosody and intelligibility in PD speakers. Speech prosody and intelligibility of two groups of individuals with varying degrees of dysarthria resulting from PD were compared with those of a group of control speakers using sentence reading, passage reading, and monologue. Acoustic analysis, including measures of fundamental frequency (F0), intensity, and speech rate, was used to form a prosodic profile for each individual. Speech intelligibility was measured for the speakers with dysarthria using direct magnitude estimation. A difference in F0 variability between the speakers with dysarthria and the control speakers was observed only in the sentence reading task. A difference in average intensity level between the speakers with mild dysarthria and the control speakers was also observed. Additionally, there were stimulus effects on both intelligibility and the prosodic profile: the prosodic profile of PD speakers differed from that of the control speakers in the more structured tasks, and lower intelligibility was found in the less structured task. These results highlight the value of both structured and natural stimuli for evaluating speech production in PD speakers.
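The prosodic profile above combines F0, intensity, and rate measures. A minimal sketch of such a summary is given below, assuming per-frame F0 and intensity tracks have already been extracted; the extraction step itself and the field names are assumptions.

```python
import numpy as np

def prosodic_profile(f0_hz, intensity_db, n_syllables, duration_s):
    """Summarize one recording into an F0/intensity/rate profile."""
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]                        # keep voiced frames only
    return {
        "f0_mean_hz": f0.mean(),
        "f0_sd_hz": f0.std(),              # F0 variability
        "intensity_mean_db": float(np.mean(intensity_db)),
        "rate_syll_per_s": n_syllables / duration_s,
    }
```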
BP reactivity to public speaking in stage 1 hypertension: influence of different task scenarios.
Palatini, Paolo; Bratti, Paolo; Palomba, Daniela; Bonso, Elisa; Saladini, Francesca; Benetti, Elisabetta; Casiglia, Edoardo
2011-10-01
To investigate the blood pressure (BP) reaction to public speaking performed according to different emotionally distressing scenarios in stage 1 hypertension, we assessed 64 hypertensive and 30 normotensive subjects. They performed three speech tasks with neutral, anger, and anxiety scenarios. BP was assessed with the Finometer beat-to-beat non-invasive recording system throughout the test procedure. For all types of speech, the systolic BP response was greater in the hypertensive than the normotensive subjects (all p < 0.001). At repeated-measures analysis of covariance (R-M ANCOVA), a significant group-by-time interaction was found for all scenarios (p ≤ 0.001). For the diastolic BP response, the between-group difference was significant for the task with the anxiety scenario (p < 0.05). At R-M ANCOVA, a group-by-time interaction of borderline statistical significance was found for the speech with anxiety content (p = 0.053) but not for the speeches with neutral or anger content. Within the hypertensive group, the diastolic BP increments during the speeches with anxiety and anger scenarios were greater than those during the speech with the neutral scenario (both p < 0.001). These data indicate that reactivity to public speaking is increased in stage 1 hypertension. A speech with an anxiety or anger scenario elicits a greater diastolic BP reaction than tasks with neutral content.
Rosemann, Stephanie; Thiel, Christiane M
2018-07-15
Hearing loss is associated with difficulties in understanding speech, especially under adverse listening conditions. In these situations, seeing the speaker improves speech intelligibility in hearing-impaired participants. On the neuronal level, previous research has shown cross-modal plastic reorganization in the auditory cortex following hearing loss, leading to altered processing of auditory, visual and audio-visual information. However, how reduced auditory input affects audio-visual speech perception in hearing-impaired subjects is largely unknown. We here investigated the impact of mild to moderate age-related hearing loss on processing audio-visual speech using functional magnetic resonance imaging. Normal-hearing and hearing-impaired participants performed two audio-visual speech integration tasks: a sentence detection task inside the scanner and the McGurk illusion outside the scanner. Both tasks consisted of congruent and incongruent audio-visual conditions, as well as auditory-only and visual-only conditions. We found a significantly stronger McGurk illusion in the hearing-impaired participants, which indicates stronger audio-visual integration. Neurally, hearing loss was associated with an increased recruitment of frontal brain areas when processing incongruent audio-visual, auditory and also visual speech stimuli, which may reflect the increased effort to perform the task. Hearing loss modulated both the audio-visual integration strength measured with the McGurk illusion and brain activation in frontal areas in the sentence task, showing stronger integration and higher brain activation with increasing hearing loss. Incongruent compared to congruent audio-visual speech revealed an opposite brain activation pattern in left ventral postcentral gyrus in both groups, with higher activation in hearing-impaired participants in the incongruent condition. Our results indicate that even mild to moderate hearing loss impacts audio-visual speech processing, accompanied by changes in brain activation particularly involving frontal areas. These changes are modulated by the extent of hearing loss. Copyright © 2018 Elsevier Inc. All rights reserved.
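McGurk illusion strength, the behavioral measure above, is typically quantified as the proportion of incongruent audio-visual trials on which the listener reports the illusory fused percept. A minimal sketch, with the response coding as an assumption:

```python
def mcgurk_strength(responses, fusion_label="da"):
    """Proportion of incongruent trials reported as the fused percept
    (e.g., auditory /ba/ + visual /ga/ heard as 'da')."""
    return sum(r == fusion_label for r in responses) / len(responses)

# Hypothetical responses from one participant on four incongruent trials:
print(mcgurk_strength(["da", "ba", "da", "da"]))  # 0.75
```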
Divided attention disrupts perceptual encoding during speech recognition.
Mattys, Sven L; Palmer, Shekeila D
2015-03-01
Performing a secondary task while listening to speech has a detrimental effect on speech processing, but the locus of the disruption within the speech system is poorly understood. Recent research has shown that cognitive load imposed by a concurrent visual task increases dependency on lexical knowledge during speech processing, but it does not affect lexical activation per se. This suggests that "lexical drift" under cognitive load occurs either as a post-lexical bias at the decisional level or as a secondary consequence of reduced perceptual sensitivity. This study aimed to adjudicate between these alternatives using a forced-choice task that required listeners to identify noise-degraded spoken words with or without the addition of a concurrent visual task. Adding cognitive load increased the likelihood that listeners would select a word acoustically similar to the target even though its frequency was lower than that of the target. Thus, there was no evidence that cognitive load led to a high-frequency response bias. Rather, cognitive load seems to disrupt sublexical encoding, possibly by impairing perceptual acuity at the auditory periphery.
Pinto, Serge; Mancini, Laura; Jahanshahi, Marjan; Thornton, John S; Tripoliti, Elina; Yousry, Tarek A; Limousin, Patricia
2011-10-01
Within the repertoire of motor functions, hand movement and speech production tasks have been investigated widely by functional neuroimaging, but paradigms combining both movements have been studied much less. Such paradigms are of particular interest in Parkinson's disease, in which patients have specific difficulties performing two movements simultaneously. In 9 unmedicated patients with Parkinson's disease and 15 healthy control subjects, externally cued tasks (i.e., hand movement, speech production, and combined hand movement and speech production) were performed twice in a random order, and functional magnetic resonance imaging detected cerebral activations relative to rest. F-statistics tested within-group (significant activations at P values < 0.05, familywise error corrected), between-group, and between-task comparisons (regional activations significant at P values < 0.001, uncorrected, with cluster size > 10 voxels). For control subjects, the combined task activations comprised the sum of those obtained during hand movement and speech production performed separately, reflecting the neural correlates of performing movements sharing similar programming modalities. In patients with Parkinson's disease, only activations underlying hand movement were observed during the combined task. We interpreted this phenomenon as patients' potential inability to recruit facilitatory activations while performing two movements simultaneously. This lost capacity could be related to a functional prioritization of one movement (i.e., hand movement) over the other (i.e., speech production). Our observation could also reflect the inability of patients with Parkinson's disease to intrinsically engage the motor coordination necessary to perform a combined task. Copyright © 2011 Movement Disorder Society.
Is the hand to speech what speech is to the hand?
Mildner, V
2000-01-01
Interference between manual and verbal performance on two types of concurrent verbal-manual tasks was studied in a sample of 48 female right-handers. The more complex verbal task (storytelling) affected both hands significantly; the less complex (essentially phonemic) task affected only the right hand, with a nonsignificant negative influence on left-hand performance. No significant reciprocal effects of the motor task on verbalization were found.
ERIC Educational Resources Information Center
Tanaka, Hiroya; Oki, Nanaho
2015-01-01
This practical paper discusses the effect of explicit instruction to raise Japanese EFL learners' pragmatic awareness using online discourse completion tasks. The five-part tasks developed by the authors use American TV drama scenes depicting particular speech acts and include explicit instruction in these speech acts. 46 Japanese EFL college…
Toward a Systematic Evaluation of Vowel Target Events across Speech Tasks
ERIC Educational Resources Information Center
Kuo, Christina
2011-01-01
The core objective of this study was to examine whether acoustic variability of vowel production in American English, across speaking tasks, is systematic. Ten male speakers who spoke a relatively homogeneous Wisconsin dialect produced eight monophthong vowels (in hVd and CVC contexts) in four speaking tasks, including clear-speech, citation form,…
Voice and Fluency Changes as a Function of Speech Task and Deep Brain Stimulation
ERIC Educational Resources Information Center
Van Lancker Sidtis, Diana; Rogers, Tiffany; Godier, Violette; Tagliati, Michele; Sidtis, John J.
2010-01-01
Purpose: Speaking, which naturally occurs in different modes or "tasks" such as conversation and repetition, relies on intact basal ganglia nuclei. Recent studies suggest that voice and fluency parameters are differentially affected by speech task. In this study, the authors examine the effects of subcortical functionality on voice and fluency,…
ERIC Educational Resources Information Center
Robinson, Peter; Cadierno, Teresa; Shirai, Yasuhiro
2009-01-01
The Cognition Hypothesis (Robinson 2005) claims that pedagogic tasks should be sequenced for learners in an order of increasing cognitive complexity, and that along resource-directing dimensions of task demands increasing effort at conceptualization promotes more complex and grammaticized second language (L2) speech production. This article…
NASA Astrophysics Data System (ADS)
Munson, Benjamin; Deboe, Nancy
2003-10-01
A recent study (Pierrehumbert, Bent, Munson, and Bailey, submitted) found differences in vowel production between people who are lesbian, bisexual, or gay (LBG) and people who are not. The specific differences (more fronted /u/ and /a/ in the non-LB women; an overall more-contracted vowel space in the non-gay men) were not amenable to an interpretation based on simple group differences in vocal-tract geometry. Rather, they suggested that differences were either due to group differences in some other skill, such as motor control or phonological encoding, or learned. This paper expands on this research by examining vowel production, speech-motor control (measured by diadochokinetic rates), and phonological encoding (measured by error rates in a tongue-twister task) in people who are LBG and people who are not. Analyses focus on whether the findings of Pierrehumbert et al. (submitted) are replicable, and whether group differences in vowel production are related to group differences in speech-motor control or phonological encoding. To date, 20 LB women, 20 non-LB women, 7 gay men, and 7 non-gay men have participated. Preliminary analyses suggest that there are no group differences in speech motor control or phonological encoding, suggesting that the earlier findings of Pierrehumbert et al. reflected learned behaviors.
Revealing the dual streams of speech processing.
Fridriksson, Julius; Yourganov, Grigori; Bonilha, Leonardo; Basilakos, Alexandra; Den Ouden, Dirk-Bart; Rorden, Christopher
2016-12-27
Several dual route models of human speech processing have been proposed suggesting a large-scale anatomical division between cortical regions that support motor-phonological aspects vs. lexical-semantic aspects of speech processing. However, to date, there is no complete agreement on what areas subserve each route or the nature of interactions across these routes that enable human speech processing. Relying on an extensive behavioral and neuroimaging assessment of a large sample of stroke survivors, we used a data-driven approach based on principal components analysis of lesion-symptom mapping to identify brain regions crucial for performance on clusters of behavioral tasks without a priori separation into task types. Distinct anatomical boundaries were revealed between a dorsal frontoparietal stream and a ventral temporal-frontal stream associated with separate components. Collapsing over the tasks primarily supported by these streams, we characterize the dorsal stream as a form-to-articulation pathway and the ventral stream as a form-to-meaning pathway. This characterization of the division in the data reflects both the overlap between tasks supported by the two streams as well as the observation that there is a bias for phonological production tasks supported by the dorsal stream and lexical-semantic comprehension tasks supported by the ventral stream. As such, our findings show a division between two processing routes that underlie human speech processing and provide an empirical foundation for studying potential computational differences that distinguish between the two routes.
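A simplified stand-in for the component analysis described above: PCA of a patients-by-tasks score matrix, recovering task groupings as component loadings. This is not the study's lesion-symptom pipeline; the matrix and all values are hypothetical.

```python
import numpy as np

def pca_loadings(scores, n_components=2):
    """PCA via SVD of the column-centered patients-by-tasks matrix."""
    x = scores - scores.mean(axis=0)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / (scores.shape[0] - 1)   # variance per component
    return vt[:n_components], explained[:n_components]

# Hypothetical scores: 5 patients x 4 tasks (2 production, 2 comprehension).
scores = np.array([[0.9, 0.8, 0.2, 0.3],
                   [0.7, 0.9, 0.4, 0.3],
                   [0.2, 0.3, 0.9, 0.8],
                   [0.4, 0.2, 0.7, 0.9],
                   [0.8, 0.7, 0.3, 0.2]])
loadings, var = pca_loadings(scores)
print(loadings.round(2))  # tasks loading together suggest a shared route
```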
Speech Characteristics Associated with Three Genotypes of Ataxia
ERIC Educational Resources Information Center
Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana
2011-01-01
Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…
Peng, Shu-Chen; Tomblin, J Bruce; Turner, Christopher W
2008-06-01
Current cochlear implant (CI) devices are limited in providing voice pitch information that is critical for listeners' recognition of prosodic contrasts of speech (e.g., intonation and lexical tones). As a result, mastery of the production and perception of such speech contrasts can be very challenging for prelingually deafened individuals who received a CI in their childhood (i.e., pediatric CI recipients). The purpose of this study was to investigate (a) pediatric CI recipients' mastery of the production and perception of speech intonation contrasts, in comparison with their age-matched peers with normal hearing (NH), and (b) the relationships between intonation production and perception in CI and NH individuals. Twenty-six pediatric CI recipients aged 7.44 to 20.74 years and 17 age-matched individuals with NH participated. All CI users were prelingually deafened, and each of them received a CI between 1.48 and 6.34 years of age. Each participant performed an intonation production task and an intonation perception task. In the production task, 10 questions and 10 statements that were syntactically matched (e.g., "The girl is on the playground." versus "The girl is on the playground?") were elicited from each participant using interactive discourse involving pictures. These utterances were judged by a panel of eight adult listeners with NH in terms of utterance type accuracy (question versus statement) and contour appropriateness (on a five-point scale). In the perception task, each participant identified the speech intonation contrasts of natural utterances in a two-alternative forced-choice task. The results from the production task indicated that CI participants' scores for both utterance type accuracy and contour appropriateness were significantly lower than the scores of NH participants (both p < 0.001). The results from the perception task indicated that CI participants' identification accuracy was significantly lower than that of their NH peers (CI, 70.13% versus NH, 97.11%, p < 0.001). The Pearson correlation coefficients (r) between CI participants' performance levels in the production and perception tasks were approximately 0.65 (p = 0.001). As a group, pediatric CI recipients do not show mastery of speech intonation in their production or perception to the same extent as their NH peers. Pediatric CI recipients' performance levels in the production and perception of speech intonation contrasts are moderately correlated. Intersubject variability exists in pediatric CI recipients' mastery levels in the production and perception of speech intonation contrasts. These findings suggest the importance of addressing both aspects (production and perception) of speech intonation in the aural rehabilitation and speech intervention programs for prelingually deafened children and young adults who use a CI.
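The production-perception relationship above (r ≈ 0.65) is a simple Pearson correlation across participants. A sketch with entirely hypothetical per-participant accuracy scores:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical proportion-correct scores for eight CI participants.
production = np.array([0.55, 0.70, 0.62, 0.81, 0.48, 0.90, 0.66, 0.73])
perception = np.array([0.60, 0.72, 0.58, 0.85, 0.52, 0.88, 0.61, 0.70])

r, p = pearsonr(production, perception)
print(f"r = {r:.2f}, p = {p:.3f}")
```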
ERIC Educational Resources Information Center
De Felice, Rachele; Deane, Paul
2012-01-01
This study proposes an approach to automatically score the "TOEIC"® Writing e-mail task. We focus on one component of the scoring rubric, which notes whether the test-takers have used particular speech acts such as requests, orders, or commitments. We developed a computational model for automated speech act identification and tested it…
Mapping the Speech Code: Cortical Responses Linking the Perception and Production of Vowels
Schuerman, William L.; Meyer, Antje S.; McQueen, James M.
2017-01-01
The acoustic realization of speech is constrained by the physical mechanisms by which it is produced. Yet for speech perception, the degree to which listeners utilize experience derived from speech production has long been debated. In the present study, we examined how sensorimotor adaptation during production may affect perception, and how this relationship may be reflected in early vs. late electrophysiological responses. Participants first performed a baseline speech production task, followed by a vowel categorization task during which EEG responses were recorded. In a subsequent speech production task, half the participants received shifted auditory feedback, leading most to alter their articulations. This was followed by a second, post-training vowel categorization task. We compared changes in vowel production to both behavioral and electrophysiological changes in vowel perception. No differences in phonetic categorization were observed between groups receiving altered or unaltered feedback. However, exploratory analyses revealed correlations between vocal motor behavior and phonetic categorization. EEG analyses revealed correlations between vocal motor behavior and cortical responses in both early and late time windows. These results suggest that participants' recent production behavior influenced subsequent vowel perception. We suggest that the change in perception can be best characterized as a mapping of acoustics onto articulation. PMID:28439232
Is talking to an automated teller machine natural and fun?
Chan, F Y; Khalid, H M
Usability and affective issues of using automatic speech recognition technology to interact with an automated teller machine (ATM) were investigated in two experiments. The first uncovered dialogue patterns of ATM users for the purpose of designing the user interface for a simulated speech ATM system. Applying the Wizard-of-Oz methodology together with multiple mapping and word-spotting techniques, the speech-driven ATM accommodates bilingual users of Bahasa Melayu and English. The second experiment evaluated the usability of a hybrid speech ATM, comparing it with a simulated manual ATM. The aim was to investigate how natural and fun talking to a speech ATM can be for first-time users. Subjects performed withdrawal and balance-enquiry tasks. An ANOVA was performed on the usability and affective data. The results showed significant differences between systems in the ability to complete the tasks as well as in transaction errors. Performance was measured as the time taken by subjects to complete the task and the number of speech recognition errors that occurred. On the basis of user emotions, it can be said that the hybrid speech system enabled pleasurable interaction. Despite the limitations of speech recognition technology, users are set to talk to the ATM when it becomes available for public use.
Development and preliminary evaluation of a pediatric Spanish-English speech perception task.
Calandruccio, Lauren; Gomez, Bianca; Buss, Emily; Leibold, Lori J
2014-06-01
The purpose of this study was to develop a task to evaluate children's English and Spanish speech perception abilities in either noise or competing speech maskers. Eight bilingual Spanish-English and 8 age-matched monolingual English children (ages 4.9-16.4 years) were tested. A forced-choice, picture-pointing paradigm was selected for adaptively estimating masked speech reception thresholds. Speech stimuli were spoken by simultaneous bilingual Spanish-English talkers. The target stimuli were 30 disyllabic English and Spanish words, familiar to 5-year-olds and easily illustrated. Competing stimuli included either 2-talker English or 2-talker Spanish speech (corresponding to target language) and spectrally matched noise. For both groups of children, regardless of test language, performance was significantly worse for the 2-talker than for the noise masker condition. No difference in performance was found between bilingual and monolingual children. Bilingual children performed significantly better in English than in Spanish in competing speech. For all listening conditions, performance improved with increasing age. Results indicated that the stimuli and task were appropriate for speech recognition testing in both languages, providing a more conventional measure of speech-in-noise perception as well as a measure of complex listening. Further research is needed to determine performance for Spanish-dominant listeners and to evaluate the feasibility of implementation into routine clinical use.
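The adaptive threshold estimation described above can be sketched with a simple one-down/one-up track, which converges on 50% correct. The step size, track length, and threshold rule below are assumptions, not the study's parameters.

```python
import random

def update_snr(snr_db, correct, step_db=4.0):
    """One-down/one-up: make the task harder after each correct response."""
    return snr_db - step_db if correct else snr_db + step_db

def run_track(respond, start_snr=10.0, n_trials=30, step_db=4.0):
    snr, history = start_snr, []
    for _ in range(n_trials):
        history.append(snr)
        snr = update_snr(snr, respond(snr), step_db)
    return sum(history[-10:]) / 10.0  # assumed rule: mean of last 10 trials

# Example: a simulated child whose accuracy rises with SNR (logistic guess).
listener = lambda snr: random.random() < 1.0 / (1.0 + 10 ** (-(snr + 2.0) / 6.0))
print(f"estimated SRT: {run_track(listener):.1f} dB SNR")
```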
Robust relationship between reading span and speech recognition in noise
Souza, Pamela; Arehart, Kathryn
2015-01-01
Objective Working memory refers to a cognitive system that manages information processing and temporary storage. Recent work has demonstrated that individual differences in working memory capacity measured using a reading span task are related to ability to recognize speech in noise. In this project, we investigated whether the specific implementation of the reading span task influenced the strength of the relationship between working memory capacity and speech recognition. Design The relationship between speech recognition and working memory capacity was examined for two different working memory tests that varied in approach, using a within-subject design. Data consisted of audiometric results along with the two different working memory tests; one speech-in-noise test; and a reading comprehension test. Study sample The test group included 94 older adults with varying hearing loss and 30 younger adults with normal hearing. Results Listeners with poorer working memory capacity had more difficulty understanding speech in noise after accounting for age and degree of hearing loss. That relationship did not differ significantly between the two different implementations of reading span. Conclusions Our findings suggest that different implementations of a verbal reading span task do not affect the strength of the relationship between working memory capacity and speech recognition. PMID:25975360
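The key analysis above, span predicting speech-in-noise performance after accounting for age and hearing loss, can be sketched as an ordinary least-squares fit. All data and variable names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 124                                   # matches the combined sample size
age = rng.uniform(20, 80, n)
pta = rng.uniform(0, 50, n)               # pure-tone average, dB HL
span = rng.uniform(0.2, 0.9, n)           # proportion correct, reading span
# Simulated speech-in-noise threshold: worse with age/hearing loss,
# better with higher span (coefficients invented for illustration).
srt = 0.05 * age + 0.08 * pta - 6.0 * span + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), age, pta, span])
beta, *_ = np.linalg.lstsq(X, srt, rcond=None)
print(dict(zip(["intercept", "age", "pta", "span"], beta.round(2))))
```

A negative span coefficient after the age and hearing-loss terms are included is the pattern the study reports.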
Baylis, Adriane L.; Munson, Benjamin; Moller, Karlind T.
2010-01-01
Objective To examine the influence of speech perception, cognition, and implicit phonological learning on articulation skills of children with Velocardiofacial syndrome (VCFS) and children with cleft palate or velopharyngeal dysfunction (VPD). Design Cross-sectional group experimental design. Participants 8 children with VCFS and 5 children with non-syndromic cleft palate or VPD. Methods and Measures All children participated in a phonetic inventory task, speech perception task, implicit priming nonword repetition task, conversational sample, nonverbal intelligence test, and hearing screening. Speech tasks were scored for percentage of phonemes correctly produced. Group differences and relations among measures were examined using nonparametric statistics. Results Children in the VCFS group demonstrated significantly poorer articulation skills and lower standard scores of nonverbal intelligence compared to the children with cleft palate or VPD. There were no significant group differences in speech perception skills. For the implicit priming task, both groups of children were more accurate in producing primed nonwords than unprimed nonwords. Nonverbal intelligence and severity of velopharyngeal inadequacy for speech were correlated with articulation skills. Conclusions In this study, children with VCFS had poorer articulation skills compared to children with cleft palate or VPD. Articulation difficulties seen in the children with VCFS did not appear to be associated with speech perception skills or the ability to learn new phonological representations. Future research should continue to examine relationships between articulation, cognition, and velopharyngeal dysfunction in a larger sample of children with cleft palate and VCFS. PMID:18333642
Effect of attentional load on audiovisual speech perception: evidence from ERPs
Alsius, Agnès; Möttönen, Riikka; Sams, Mikko E.; Soto-Faraco, Salvador; Tiippana, Kaisa
2014-01-01
Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual, and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e., a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech. PMID:25076922
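The latency decrement reported above rests on measuring N1 and P2 peak latencies. A minimal sketch of peak-latency extraction from an averaged ERP follows; the search windows are assumed values, not the study's settings.

```python
import numpy as np

def peak_latency_ms(erp, times_ms, lo, hi, polarity=-1):
    """Latency of the most negative (polarity=-1) or positive (+1)
    point within [lo, hi] ms of an averaged ERP waveform."""
    erp, times_ms = np.asarray(erp), np.asarray(times_ms)
    win = (times_ms >= lo) & (times_ms <= hi)
    idx = np.argmax(polarity * erp[win])
    return times_ms[win][idx]

# e.g., N1 in an assumed 80-150 ms window, P2 in 150-250 ms:
# n1 = peak_latency_ms(erp, t, 80, 150, polarity=-1)
# p2 = peak_latency_ms(erp, t, 150, 250, polarity=+1)
```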
Soskey, Laura N; Allen, Paul D; Bennetto, Loisa
2017-08-01
One of the earliest observable impairments in autism spectrum disorder (ASD) is a failure to orient to speech and other social stimuli. Auditory spatial attention, a key component of orienting to sounds in the environment, has been shown to be impaired in adults with ASD. Additionally, specific deficits in orienting to social sounds could be related to increased acoustic complexity of speech. We aimed to characterize auditory spatial attention in children with ASD and neurotypical controls, and to determine the effect of auditory stimulus complexity on spatial attention. In a spatial attention task, target and distractor sounds were played randomly in rapid succession from speakers in a free-field array. Participants attended to a central or peripheral location, and were instructed to respond to target sounds at the attended location while ignoring nearby sounds. Stimulus-specific blocks evaluated spatial attention for simple non-speech tones, speech sounds (vowels), and complex non-speech sounds matched to vowels on key acoustic properties. Children with ASD had significantly more diffuse auditory spatial attention than neurotypical children when attending front, indicated by increased responding to sounds at adjacent non-target locations. No significant differences in spatial attention emerged based on stimulus complexity. Additionally, in the ASD group, more diffuse spatial attention was associated with more severe ASD symptoms but not with general inattention symptoms. Spatial attention deficits have important implications for understanding social orienting deficits and atypical attentional processes that contribute to core deficits of ASD. Autism Res 2017, 10: 1405-1416. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.
Yunusova, Yana; Graham, Naida L.; Shellikeri, Sanjana; Phuong, Kent; Kulkarni, Madhura; Rochon, Elizabeth; Tang-Wai, David F.; Chow, Tiffany W.; Black, Sandra E.; Zinman, Lorne H.; Green, Jordan R.
2016-01-01
Objective This study examines reading aloud in patients with amyotrophic lateral sclerosis (ALS) and those with frontotemporal dementia (FTD) in order to determine whether differences in patterns of speaking and pausing exist between patients with primary motor vs. primary cognitive-linguistic deficits, and in contrast to healthy controls. Design 136 participants were included in the study: 33 controls, 85 patients with ALS, and 18 patients with either the behavioural variant of FTD (FTD-BV) or progressive nonfluent aphasia (FTD-PNFA). Participants with ALS were further divided into 4 non-overlapping subgroups—mild, respiratory, bulbar (with oral-motor deficit) and bulbar-respiratory—based on the presence and severity of motor bulbar or respiratory signs. All participants read a passage aloud. Custom-made software was used to perform speech and pause analyses, and this provided measures of speaking and articulatory rates, duration of speech, and number and duration of pauses. These measures were statistically compared in different subgroups of patients. Results The results revealed clear differences between patient groups and healthy controls on the passage reading task. A speech-based motor function measure (i.e., articulatory rate) was able to distinguish patients with bulbar ALS or FTD-PNFA from those with respiratory ALS or FTD-BV. Distinguishing the disordered groups proved challenging based on the pausing measures. Conclusions and Relevance This study demonstrated the use of speech measures in the identification of those with an oral-motor deficit, and showed the usefulness of performing a relatively simple reading test to assess speech versus pause behaviors across the ALS—FTD disease continuum. The findings also suggest that motor speech assessment should be performed as part of the diagnostic workup for patients with FTD. PMID:26789001
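The speaking-rate, articulatory-rate, and pause measures above can be computed from a speech/pause segmentation of the read passage. In the sketch below, segment boundaries are assumed to come from software like the custom tool the abstract mentions, and the field names are illustrative.

```python
def reading_measures(n_syllables, speech_segments, pause_segments):
    """Rate and pause statistics from (start, end) segments in seconds."""
    speech_time = sum(e - s for s, e in speech_segments)
    pause_time = sum(e - s for s, e in pause_segments)
    total = speech_time + pause_time
    return {
        "speaking_rate": n_syllables / total,            # syll/s incl. pauses
        "articulatory_rate": n_syllables / speech_time,  # syll/s excl. pauses
        "n_pauses": len(pause_segments),
        "mean_pause_s": pause_time / max(len(pause_segments), 1),
    }

# Hypothetical 20-syllable passage with two pauses:
print(reading_measures(20, [(0.0, 3.2), (3.8, 7.0), (7.5, 9.0)],
                       [(3.2, 3.8), (7.0, 7.5)]))
```

Articulatory rate excludes pause time, which is why it isolates the oral-motor deficit the study targets.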
Speech and gesture in spatial language and cognition among the Yucatec Mayas.
Le Guen, Olivier
2011-07-01
In previous analyses of the influence of language on cognition, speech has been the main channel examined. In studies conducted among Yucatec Mayas, efforts to determine the preferred frame of reference in use in this community have failed to reach an agreement (Bohnemeyer & Stolz, 2006; Levinson, 2003 vs. Le Guen, 2006, 2009). This paper argues for a multimodal analysis of language that encompasses gesture as well as speech, and shows that the preferred frame of reference in Yucatec Maya is only detectable through the analysis of co-speech gesture and not through speech alone. A series of experiments compares knowledge of the semantics of spatial terms, performance on nonlinguistic tasks and gestures produced by men and women. The results show a striking gender difference in the knowledge of the semantics of spatial terms, but an equal preference for a geocentric frame of reference in nonverbal tasks. In a localization task, participants used a variety of strategies in their speech, but they all exhibited a systematic preference for a geocentric frame of reference in their gestures. Copyright © 2011 Cognitive Science Society, Inc.
Van der Haegen, Lise; Acke, Frederic; Vingerhoets, Guy; Dhooge, Ingeborg; De Leenheer, Els; Cai, Qing; Brysbaert, Marc
2016-12-01
Auditory speech perception, speech production and reading lateralize to the left hemisphere in the majority of healthy right-handers. In this study, we investigated to what extent sensory input underlies the side of language dominance. We measured the lateralization of the three core subprocesses of language in patients who had profound hearing loss in the right ear from birth and in matched control subjects. They took part in a semantic decision listening task involving speech and sound stimuli (auditory perception), a word generation task (speech production) and a passive reading task (reading). The results show that a lack of sensory auditory input on the right side, which is strongly connected to the contralateral left hemisphere, does not lead to atypical lateralization of speech perception. Speech production and reading were also typically left lateralized in all but one patient, contradicting previous small scale studies. Other factors such as genetic constraints presumably overrule the role of sensory input in the development of (a)typical language lateralization. Copyright © 2015 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Huber, Jessica E.
2007-01-01
Purpose: This study examined the response of the respiratory system to 3 cues used to elicit increased vocal loudness to determine whether the effects of cueing, shown previously in sentence tasks, were present in connected speech tasks and to describe differences among tasks. Method: Fifteen young men and 15 young women produced a 2-paragraph…
ERIC Educational Resources Information Center
Oi, Misato; Saito, Hirofumi; Li, Zongfeng; Zhao, Wenjun
2013-01-01
To examine the neural mechanism of co-speech gesture production, we measured brain activity of bilinguals during an animation-narration task using near-infrared spectroscopy. The task of the participants was to watch two stories via an animated cartoon, and then narrate the contents in their first language (L1) and second language (L2),…
Chatterjee, Monita; Peng, Shu-Chen
2008-01-01
Fundamental frequency (F0) processing by cochlear implant (CI) listeners was measured using a psychophysical task and a speech intonation recognition task. Listeners’ Weber fractions for modulation frequency discrimination were measured using an adaptive, 3-interval, forced-choice paradigm: stimuli were presented through a custom research interface. In the speech intonation recognition task, listeners were asked to indicate whether resynthesized bisyllabic words, when presented in the free field through the listeners’ everyday speech processor, were question-like or statement-like. The resynthesized tokens were systematically manipulated to have different initial F0s to represent male vs. female voices, and different F0 contours (i.e., falling, flat, and rising). Although the CI listeners showed considerable variation in performance on both tasks, significant correlations were observed between the CI listeners’ sensitivity to modulation frequency in the psychophysical task and their performance in intonation recognition. Consistent with their greater reliance on temporal cues, the CI listeners’ performance in the intonation recognition task was significantly poorer with the higher initial-F0 stimuli than with the lower initial-F0 stimuli. Similar results were obtained with normal hearing listeners attending to noiseband-vocoded CI simulations with reduced spectral resolution. PMID:18093766
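A Weber fraction, the discrimination measure above, expresses the just-noticeable difference relative to the reference value. A trivial sketch with illustrative numbers:

```python
def weber_fraction(jnd_hz, reference_hz):
    """Just-noticeable difference divided by the reference frequency."""
    return jnd_hz / reference_hz

# e.g., a listener who just detects 120 Hz vs. a 100-Hz modulation reference:
print(weber_fraction(20.0, 100.0))  # 0.2
```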
ERIC Educational Resources Information Center
Harris, Karen R.
To investigate task performance and the use of private speech and to examine the effects of a cognitive training approach, 30 learning disabled (LD) and 30 nonLD Ss (7 to 8 years old) were given a 17-piece wooden puzzle rigged so that it could not be completed correctly. Six variables were measured: (1) proportion of private speech that was task…
De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A
2016-01-01
As children learn, they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. Twelve children (6 male, 6 female) from kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while the balance leans more toward speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech become more synchronized in time as children get older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and more asymmetry between gestures and speech, only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between gestures and speech. The picture that emerges from our analyses suggests that the relation between gestures, speech and cognition is more complex than previously thought. We suggest that temporal differences and asymmetry in influence between gestures and speech arise from simultaneous coordination of synergies.
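The CRQA applied above starts from a cross recurrence plot of the two coded time series. The sketch below builds such a plot for two skill-level sequences, defining recurrence as an exact level match; it is a simplified stand-in for the study's full analysis, and the sequences are hypothetical.

```python
import numpy as np

def cross_recurrence(gesture_levels, speech_levels):
    """Cross recurrence plot: 1 where the two series share a skill level."""
    g = np.asarray(gesture_levels)[:, None]   # column vector
    s = np.asarray(speech_levels)[None, :]    # row vector
    return (g == s).astype(int)

def recurrence_rate(cr_plot):
    """Proportion of recurrent points, the most basic CRQA measure."""
    return cr_plot.mean()

gest = [1, 1, 2, 2, 3, 3, 3, 4]   # hypothetical gesture skill levels
spch = [1, 2, 2, 3, 3, 3, 4, 4]   # hypothetical speech skill levels
print(recurrence_rate(cross_recurrence(gest, spch)))
```

Asymmetry measures of the kind the study reports come from comparing recurrence above versus below the main diagonal, i.e., gestures leading speech versus speech leading gestures.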
Martens, Heidi; Van Nuffelen, Gwen; Dekens, Tomas; Hernández-Díaz Huici, Maria; Kairuz Hernández-Díaz, Hector Arturo; De Letter, Miet; De Bodt, Marc
2015-01-01
Most studies on treatment of prosody in individuals with dysarthria due to Parkinson's disease are based on intensive treatment of loudness. The present study investigates the effect of intensive treatment of speech rate and intonation on the intelligibility of individuals with dysarthria due to Parkinson's disease. A one-group pretest-posttest design was used to compare intelligibility, speech rate, and intonation before and after treatment. Participants included eleven Dutch-speaking individuals with predominantly moderate dysarthria due to Parkinson's disease, who received five one-hour treatment sessions per week for three weeks. Treatment focused on lowering speech rate and magnifying the phrase-final intonation contrast between statements and questions. Intelligibility was perceptually assessed using a standardized sentence intelligibility test. Speech rate was automatically assessed during the sentence intelligibility test as well as during a passage reading task and a storytelling task. Intonation was perceptually assessed using a sentence reading task and a sentence repetition task, and also acoustically analyzed in terms of maximum fundamental frequency. After treatment, there was a significant improvement of sentence intelligibility (effect size .83), a significant increase of pause frequency during the passage reading task, a significant improvement of correct listener identification of statements and questions, and a significant increase of the maximum fundamental frequency in the final syllable of questions during both intonation tasks. The findings suggest that participants were more intelligible and more able to manipulate pause frequency and statement-question intonation after treatment. However, the relationship between the change in intelligibility on the one hand and the changes in speech rate and intonation on the other hand is not yet fully understood. Results should be interpreted in light of the research design employed. The reader will be able to: (1) describe the effect of intensive speech rate and intonation treatment on intelligibility of speakers with dysarthria due to PD, (2) describe the effect of intensive speech rate treatment on rate manipulation by speakers with dysarthria due to PD, and (3) describe the effect of intensive intonation treatment on manipulation of the phrase-final intonation contrast between statements and questions by speakers with dysarthria due to PD. Copyright © 2015 Elsevier Inc. All rights reserved.
Age and measurement time-of-day effects on speech recognition in noise.
Veneman, Carrie E; Gordon-Salant, Sandra; Matthews, Lois J; Dubno, Judy R
2013-01-01
The purpose of this study was to determine the effect of measurement time of day on speech recognition in noise and the extent to which time-of-day effects differ with age. Older adults tend to have more difficulty understanding speech in noise than younger adults, even when hearing is normal. Two possible contributors to this age difference in speech recognition may be measurement time of day and inhibition. Most younger adults are "evening-type," showing peak circadian arousal in the evening, whereas most older adults are "morning-type," with circadian arousal peaking in the morning. Tasks that require inhibition of irrelevant information have been shown to be affected by measurement time of day, with maximum performance attained at one's peak time of day. The authors hypothesized that a change in inhibition would be associated with measurement time of day and therefore affect speech recognition in noise, with better performance in the morning for older adults and in the evening for younger adults. Fifteen younger evening-type adults (20-28 years) and fifteen older morning-type adults with normal hearing (66-78 years) listened to the Hearing in Noise Test (HINT) and the Quick Speech in Noise (QuickSIN) test in the morning and evening (peak and off-peak times). Time-of-day preference was assessed using the Morningness-Eveningness Questionnaire. Sentences and noise were presented binaurally through insert earphones. During morning and evening sessions, participants solved word-association problems within the visual-distraction task (VDT), which was used as an estimate of inhibition. After each session, participants rated perceived mental demand of the tasks using a revised version of the NASA Task Load Index. Younger adults performed significantly better on the speech-in-noise tasks and rated themselves as requiring significantly less mental demand when tested at their peak (evening) than off-peak (morning) time of day. In contrast, time-of-day effects were not observed for the older adults on the speech recognition or rating tasks. Although older adults required significantly more advantageous signal-to-noise ratios than younger adults for equivalent speech-recognition performance, a significantly larger younger versus older age difference in speech recognition was observed in the evening than in the morning. Older adults performed significantly poorer than younger adults on the VDT, but performance was not affected by measurement time of day. VDT performance for misleading distracter items was significantly correlated with HINT and QuickSIN test performance at the peak measurement time of day. Although all participants had normal hearing, speech recognition in noise was significantly poorer for older than younger adults, with larger age-related differences in the evening (an off-peak time for older adults) than in the morning. The significant effect of measurement time of day suggests that this factor may impact the clinical assessment of speech recognition in noise for all individuals. It appears that inhibition, as estimated by a visual distraction task for misleading visual items, is a cognitive mechanism that is related to speech-recognition performance in noise, at least at a listener's peak time of day.
A Comparison of Five FMRI Protocols for Mapping Speech Comprehension Systems
Binder, Jeffrey R.; Swanson, Sara J.; Hammeke, Thomas A.; Sabsevitz, David S.
2008-01-01
Aims: Many fMRI protocols for localizing speech comprehension have been described, but there has been little quantitative comparison of these methods. We compared five such protocols in terms of areas activated, extent of activation, and lateralization. Methods: FMRI BOLD signals were measured in 26 healthy adults during passive listening and active tasks using words and tones. Contrasts were designed to identify speech perception and semantic processing systems. Activation extent and lateralization were quantified by counting activated voxels in each hemisphere for each participant. Results: Passive listening to words produced bilateral superior temporal activation. After controlling for pre-linguistic auditory processing, only a small area in the left superior temporal sulcus responded selectively to speech. Active tasks engaged an extensive, bilateral attention and executive processing network. Optimal results (consistent activation and a strongly lateralized pattern) were obtained by contrasting an active semantic decision task with a tone decision task. There was striking similarity between the network of brain regions activated by the semantic task and the network of brain regions that showed task-induced deactivation, suggesting that semantic processing occurs during the resting state. Conclusions: FMRI protocols for mapping speech comprehension systems differ dramatically in pattern, extent, and lateralization of activation. Brain regions involved in semantic processing were identified only when an active, non-linguistic task was used as a baseline, supporting the notion that semantic processing occurs whenever attentional resources are not controlled. Identification of these lexical-semantic regions is particularly important for predicting language outcome in patients undergoing temporal lobe surgery. PMID:18513352
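The voxel-counting approach to lateralization used above reduces to a simple ratio. A minimal Python sketch, assuming a thresholded activation map and hemisphere masks are already available as boolean arrays (both assumptions; the study's preprocessing is not detailed here):

    import numpy as np

    def lateralization_index(active, left_mask, right_mask):
        # active: boolean map of suprathreshold voxels; masks select hemispheres.
        n_left = np.count_nonzero(active & left_mask)
        n_right = np.count_nonzero(active & right_mask)
        if n_left + n_right == 0:
            return 0.0  # no activated voxels; index undefined, return neutral
        # LI in [-1, 1]; positive values indicate left lateralization.
        return (n_left - n_right) / (n_left + n_right)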
Nonword repetition and nonword reading abilities in adults who do and do not stutter.
Sasisekaran, Jayanthi
2013-09-01
In the present study, a nonword repetition and a nonword reading task were used to investigate the behavioral (speech accuracy) and speech kinematic (movement variability, measured as a lip aperture variability index, and speech duration) profiles of groups of young adults who do (AWS) and do not stutter (control). Participants were 9 AWS (8 males, mean age = 32.2, SD = 14.7) and 9 age- and sex-matched control participants (mean age = 31.8, SD = 14.6). For the nonword repetition task, participants were administered the Nonword Repetition Test (Dollaghan & Campbell, 1998). For the reading task, participants were required to read out target nonwords varying in length (6 vs. 11 syllables). Repeated measures analyses of variance were conducted to compare the groups in percent speech accuracy for both tasks; for the nonword reading task only, the groups were also compared in movement variability and speech duration. The groups were comparable in percent accuracy in nonword repetition. Findings from nonword reading revealed a trend for the AWS to show a lower percentage of accurate productions compared to the control group. AWS also showed significantly higher movement variability and longer speech durations compared to the control group in nonword reading. Some preliminary evidence for group differences in practice effects (seen as differences between the early vs. later 5 trials) emerged in speech duration. Findings suggest differences between AWS and control groups in phonemic encoding and/or speech motor planning and production. Findings from nonword repetition vs. reading highlight the need for careful consideration of nonword properties. At the end of this activity the reader will be able to: (a) summarize the literature on nonword repetition skills in adults who stutter, (b) describe processes underlying nonword repetition and nonword reading, (c) summarize whether or not adults who stutter differ from those who do not in the behavioral and kinematic markers of nonword reading performance, (d) discuss future directions for research. Copyright © 2013 Elsevier Inc. All rights reserved.
Morin, Alain; Hamper, Breanne
2012-01-01
Inner speech involvement in self-reflection was examined by reviewing 130 studies assessing brain activation during self-referential processing in key self-domains: agency, self-recognition, emotions, personality traits, autobiographical memory, and miscellaneous (e.g., prospection, judgments). The left inferior frontal gyrus (LIFG) has been shown to be reliably recruited during inner speech production. The percentage of studies reporting LIFG activity for each self-dimension was calculated. Fifty-five percent of all studies reviewed indicated LIFG (and presumably inner speech) activity during self-reflection tasks; on average, LIFG activation is observed 16% of the time during completion of non-self tasks (e.g., attention, perception). The highest LIFG activation rate was observed during retrieval of autobiographical information. The LIFG was significantly more recruited during conceptual tasks (e.g., prospection, traits) than during perceptual tasks (agency and self-recognition). This constitutes additional evidence supporting the idea that inner speech participates in self-related thinking. PMID:23049653
Westenberg, P Michiel; Bokhorst, Caroline L; Miers, Anne C; Sumter, Sindy R; Kallen, Victor L; van Pelt, Johannes; Blöte, Anke W
2009-10-01
This study describes a new public speaking protocol for youth. The main question asked whether a speech prepared at home and given in front of a pre-recorded audience creates a condition of social-evaluative threat. Findings showed that, on average, this task elicits a moderate stress response in a community sample of 83 12- to 15-year-old adolescents. During the speech, participants reported feeling more nervous and having higher heart rate and sweatiness of the hands than at baseline or recovery. Likewise, physiological (heart rate and skin conductance) and neuroendocrine (cortisol) activity were higher during the speech than at baseline or recovery. Additionally, an anticipation effect was observed: baseline levels were higher than recovery levels for most variables. Taking the anticipation and speech response together, a substantial cortisol response was observed for 55% of participants. The findings indicate that the Leiden Public Speaking Task might be particularly suited to investigate individual differences in sensitivity to social-evaluative situations.
Baldwin, Carryl L; Struckman-Johnson, David
2002-01-15
Speech displays and verbal response technologies are increasingly being used in complex, high-workload environments that require the simultaneous performance of visual and manual tasks. Examples of such environments include the flight decks of modern aircraft, advanced transport telematics systems providing in-vehicle route guidance and navigational information, and mobile communication equipment in emergency and public safety vehicles. Previous research has established an optimum range for speech intelligibility. However, the potential for variations in presentation levels within this range to affect attentional resources and cognitive processing of speech material has not been examined previously. Results of the current experimental investigation demonstrate that as presentation level increases within this 'optimum' range, participants in high-workload situations make fewer sentence-processing errors and generally respond faster. Processing errors were more sensitive to changes in presentation level than were measures of reaction time. Implications of these findings are discussed in terms of their application for the design of speech communication displays in complex multi-task environments.
Speech processing using maximum likelihood continuity mapping
Hogden, John E.
2000-01-01
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
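The continuity-mapping idea can be made concrete with a toy sketch. Assuming, purely for illustration, that the learned probabilistic mapping is a set of Gaussians over pseudo-articulator positions (one per quantized sound), a smooth trajectory can be found by gradient ascent on the log-likelihood with a continuity penalty; this is a simplification of the idea, not the published algorithm:

    import numpy as np

    def sounds_to_smooth_path(codes, means, inv_covs, smooth=1.0,
                              n_iters=500, lr=0.01):
        # codes: (T,) quantized sound label per frame.
        # means: (K, D) Gaussian means of P(sound k | position);
        # inv_covs: (K, D, D) inverse covariances. All illustrative.
        path = means[codes].copy()  # start at per-frame likelihood peaks
        for _ in range(n_iters):
            diff = path - means[codes]
            # Gradient of the Gaussian log-likelihood w.r.t. the path.
            grad_ll = -np.einsum('tij,tj->ti', inv_covs[codes], diff)
            # Gradient of the continuity penalty sum_t ||x_{t+1} - x_t||^2.
            d = path[1:] - path[:-1]
            grad_pen = np.zeros_like(path)
            grad_pen[1:] += 2 * d
            grad_pen[:-1] -= 2 * d
            path += lr * (grad_ll - smooth * grad_pen)
        return path  # (T, D) smooth pseudo-articulator trajectory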
High-frequency energy in singing and speech
NASA Astrophysics Data System (ADS)
Monson, Brian Bruce
While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.
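The core acoustic measure in such characterizations is how much energy lies above roughly 5 kHz. A minimal Python sketch of one plausible measure, the fraction of spectral power above a cutoff (the function name and parameter values are illustrative, not taken from the study):

    import numpy as np
    from scipy.signal import welch

    def high_frequency_ratio(x, fs, cutoff_hz=5000.0):
        # x: mono audio samples; fs must exceed 2 * cutoff_hz.
        freqs, psd = welch(x, fs=fs, nperseg=2048)
        # Fraction of total spectral power lying above the cutoff.
        return psd[freqs >= cutoff_hz].sum() / psd.sum()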
A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields.
Carlin, Michael A; Elhilali, Mounya
2015-12-01
One of the hallmarks of sound processing in the brain is the ability of the nervous system to adapt to changing behavioral demands and surrounding soundscapes. It can dynamically shift sensory and cognitive resources to focus on relevant sounds. Neurophysiological studies indicate that this ability is supported by adaptively retuning the shapes of cortical spectro-temporal receptive fields (STRFs) to enhance features of target sounds while suppressing those of task-irrelevant distractors. Because an important component of human communication is the ability of a listener to dynamically track speech in noisy environments, the solution obtained by auditory neurophysiology implies a useful adaptation strategy for speech activity detection (SAD). SAD is an important first step in a number of automated speech processing systems, and performance is often reduced in highly noisy environments. In this paper, we describe how task-driven adaptation is induced in an ensemble of neurophysiological STRFs, and show how speech-adapted STRFs reorient themselves to enhance spectro-temporal modulations of speech while suppressing those associated with a variety of nonspeech sounds. We then show how an adapted ensemble of STRFs can better detect speech in unseen noisy environments compared to an unadapted ensemble and a noise-robust baseline. Finally, we use a stimulus reconstruction task to demonstrate how the adapted STRF ensemble better captures the spectrotemporal modulations of attended speech in clean and noisy conditions. Our results suggest that a biologically plausible adaptation framework can be applied to speech processing systems to dynamically adapt feature representations for improving noise robustness.
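To give a rough sense of the front end in code: the sketch below builds a fixed Gabor-shaped spectro-temporal filter as a crude stand-in for a cortical STRF and scores spectrogram frames for speech activity. It deliberately omits the task-driven adaptive retuning that is the paper's contribution, and all shapes and modulation periods are illustrative assumptions:

    import numpy as np
    from scipy.signal import fftconvolve

    def gabor_strf(n_f=15, n_t=25, spec_period=8.0, temp_period=12.0):
        # Gabor-shaped filter over (frequency bins, time frames).
        f = np.arange(n_f) - n_f // 2
        t = np.arange(n_t) - n_t // 2
        F, T = np.meshgrid(f, t, indexing='ij')
        envelope = np.exp(-(F / (n_f / 4.0)) ** 2 - (T / (n_t / 4.0)) ** 2)
        carrier = np.cos(2 * np.pi * (F / spec_period + T / temp_period))
        g = envelope * carrier
        return g - g.mean()  # zero mean: flat spectrogram regions score ~0

    def speech_activity_score(log_spec, filters):
        # log_spec: (freq_bins, frames); higher scores suggest speech frames.
        score = np.zeros(log_spec.shape[1])
        for g in filters:
            resp = fftconvolve(log_spec, g, mode='same')
            score += np.maximum(resp, 0.0).sum(axis=0)
        return score  # threshold against a noise-only segment for SAD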
Lockart, Rebekah; McLeod, Sharynne
2013-08-01
To investigate speech-language pathology students' ability to identify errors and transcribe typical and atypical speech in Cantonese, a nonnative language. Thirty-three English-speaking speech-language pathology students completed 3 tasks in an experimental within-subjects design. Task 1 (baseline) involved transcribing English words. In Task 2, students transcribed 25 words spoken by a Cantonese adult. An average of 59.1% consonants was transcribed correctly (72.9% when Cantonese-English transfer patterns were allowed). There was higher accuracy on shared English and Cantonese syllable-initial consonants /m,n,f,s,h,j,w,l/ and syllable-final consonants. In Task 3, students identified consonant errors and transcribed 100 words spoken by Cantonese-speaking children under 4 additive conditions: (1) baseline, (2) +adult model, (3) +information about Cantonese phonology, and (4) all variables (2 and 3 were counterbalanced). There was a significant improvement in the students' identification and transcription scores for conditions 2, 3, and 4, with a moderate effect size. Increased skill was not based on listeners' proficiency in speaking another language, perceived transcription skill, musicality, or confidence with multilingual clients. Speech-language pathology students, with no exposure to or specific training in Cantonese, have some skills to identify errors and transcribe Cantonese. Provision of a Cantonese-adult model and information about Cantonese phonology increased students' accuracy in transcribing Cantonese speech.
Cohesive and coherent connected speech deficits in mild stroke.
Barker, Megan S; Young, Breanne; Robinson, Gail A
2017-05-01
Spoken language production theories and lesion studies highlight several important prelinguistic conceptual preparation processes involved in the production of cohesive and coherent connected speech. Cohesion and coherence broadly connect sentences with preceding ideas and the overall topic. Broader cognitive mechanisms may mediate these processes. This study aims to investigate (1) whether stroke patients without aphasia exhibit impairments in cohesion and coherence in connected speech, and (2) the role of attention and executive functions in the production of connected speech. Eighteen stroke patients (8 right hemisphere stroke [RHS]; 6 left [LHS]) and 21 healthy controls completed two self-generated narrative tasks to elicit connected speech. A multi-level analysis of within and between-sentence processing ability was conducted. Cohesion and coherence impairments were found in the stroke group, particularly RHS patients, relative to controls. In the whole stroke group, better performance on the Hayling Test of executive function, which taps verbal initiation/suppression, was related to fewer propositional repetitions and global coherence errors. Better performance on attention tasks was related to fewer propositional repetitions, and decreased global coherence errors. In the RHS group, aspects of cohesive and coherent speech were associated with better performance on attention tasks. Better Hayling Test scores were related to more cohesive and coherent speech in RHS patients, and more coherent speech in LHS patients. Thus, we documented connected speech deficits in a heterogeneous stroke group without prominent aphasia. Our results suggest that broader cognitive processes may play a role in producing connected speech at the early conceptual preparation stage. Copyright © 2017 Elsevier Inc. All rights reserved.
Children's Acoustic and Linguistic Adaptations to Peers With Hearing Impairment.
Granlund, Sonia; Hazan, Valerie; Mahon, Merle
2018-05-17
This study aims to examine the clear speaking strategies used by older children when interacting with a peer with hearing loss, focusing on both acoustic and linguistic adaptations in speech. The Grid task, a problem-solving task developed to elicit spontaneous interactive speech, was used to obtain a range of global acoustic and linguistic measures. Eighteen 9- to 14-year-old children with normal hearing (NH) performed the task in pairs, once with a friend with NH and once with a friend with a hearing impairment (HI). In HI-directed speech, children increased their fundamental frequency range and midfrequency intensity, decreased the number of words per phrase, and expanded their vowel space area by increasing F1 and F2 range, relative to NH-directed speech. However, participants did not appear to make changes to their articulation rate, the lexical frequency of content words, or lexical diversity when talking to their friend with HI compared with their friend with NH. Older children show evidence of listener-oriented adaptations to their speech production; although their speech production systems are still developing, they are able to make speech adaptations to benefit the needs of a peer with HI, even without being given a specific instruction to do so. https://doi.org/10.23641/asha.6118817.
Ding, Nai; Pan, Xunyi; Luo, Cheng; Su, Naifei; Zhang, Wen; Zhang, Jianfeng
2018-01-31
How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention. SIGNIFICANCE STATEMENT Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention. Copyright © 2018 the authors 0270-6474/18/381178-11$15.00/0.
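Neural tracking of this kind is typically quantified as spectral peaks at the stimulus rates in the stimulus-locked, trial-averaged EEG. A minimal sketch, assuming single-channel epoched data; the example rates are illustrative, since syllable and word rates depend on the materials:

    import numpy as np

    def tagged_power(epochs, fs, rate_hz):
        # epochs: (n_trials, n_samples), stimulus-locked single-channel EEG.
        # Averaging first preserves only phase-locked (tracking) activity.
        evoked = epochs.mean(axis=0)
        spec = np.abs(np.fft.rfft(evoked)) ** 2
        freqs = np.fft.rfftfreq(evoked.size, d=1.0 / fs)
        return spec[np.argmin(np.abs(freqs - rate_hz))]

    # e.g., syllable_power = tagged_power(epochs, fs, 4.0)
    #       word_power     = tagged_power(epochs, fs, 2.0)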
Oral Articulatory Control in Childhood Apraxia of Speech
ERIC Educational Resources Information Center
Grigos, Maria I.; Moss, Aviva; Lu, Ying
2015-01-01
Purpose: The purpose of this research was to examine spatial and temporal aspects of articulatory control in children with childhood apraxia of speech (CAS), children with speech delay characterized by an articulation/phonological impairment (SD), and controls with typical development (TD) during speech tasks that increased in word length. Method:…
Collaborative Signaling of Informational Structures by Dynamic Speech Rate.
ERIC Educational Resources Information Center
Koiso, Hanae; Shimojima, Atsushi; Katagiri, Yasuhiro
1998-01-01
Investigated the functions of dynamic speech rates as contextualization cues in conversational Japanese, examining five spontaneous task-oriented dialogs and analyzing the potential of speech-rate changes in signaling the structure of the information being exchanged. Results found a correlation between speech decelerations and the openings of new…
Inferring Speaker Affect in Spoken Natural Language Communication
ERIC Educational Resources Information Center
Pon-Barry, Heather Roberta
2013-01-01
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…
Visual Speech Primes Open-Set Recognition of Spoken Words
ERIC Educational Resources Information Center
Buchwald, Adam B.; Winters, Stephen J.; Pisoni, David B.
2009-01-01
Visual speech perception has become a topic of considerable interest to speech researchers. Previous research has demonstrated that perceivers neurally encode and use speech information from the visual modality, and this information has been found to facilitate spoken word recognition in tasks such as lexical decision (Kim, Davis, & Krins,…
The right hemisphere is highlighted in connected natural speech production and perception.
Alexandrou, Anna Maria; Saarinen, Timo; Mäkelä, Sasu; Kujala, Jan; Salmelin, Riitta
2017-05-15
Current understanding of the cortical mechanisms of speech perception and production stems mostly from studies that focus on single words or sentences. However, it has been suggested that processing of real-life connected speech may rely on additional cortical mechanisms. In the present study, we examined the neural substrates of natural speech production and perception with magnetoencephalography by modulating three central features related to speech: amount of linguistic content, speaking rate and social relevance. The amount of linguistic content was modulated by contrasting natural speech production and perception to speech-like non-linguistic tasks. Meaningful speech was produced and perceived at three speaking rates: normal, slow and fast. Social relevance was probed by having participants attend to speech produced by themselves and an unknown person. These speech-related features were each associated with distinct spatiospectral modulation patterns that involved cortical regions in both hemispheres. Natural speech processing markedly engaged the right hemisphere in addition to the left. In particular, the right temporo-parietal junction, previously linked to attentional processes and social cognition, was highlighted in the task modulations. The present findings suggest that its functional role extends to active generation and perception of meaningful, socially relevant speech. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
The Use of Electroencephalography in Language Production Research: A Review
Ganushchak, Lesya Y.; Christoffels, Ingrid K.; Schiller, Niels O.
2011-01-01
Speech production research long avoided electrophysiological experiments due to the suspicion that artifacts caused by muscle activity during overt speech may lead to a poor signal-to-noise ratio in the measurements. Therefore, researchers have sought to assess speech production by using indirect speech production tasks, such as tacit or implicit naming, delayed naming, or meta-linguistic tasks, such as phoneme monitoring. Covert speech may, however, involve different processes than overt speech production. Recently, overt speech has been investigated using electroencephalography (EEG). The steadily rising number of published papers clearly indicates the increasing interest in and demand for overt speech research within the field of cognitive neuroscience of language. Our main goal here is to review all currently available results of overt speech production involving EEG measurements, such as picture naming, Stroop naming, and reading aloud. We conclude that overt speech production can be successfully studied using electrophysiological measures, for instance, event-related brain potentials (ERPs). We will discuss possible relevant components in the ERP waveform of speech production and aim to address the issue of how to interpret the results of ERP research using overt speech, and whether the ERP components in language production are comparable to results from other fields. PMID:21909333
Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging.
Hagedorn, Christina; Proctor, Michael; Goldstein, Louis; Wilson, Stephen M; Miller, Bruce; Gorno-Tempini, Maria Luisa; Narayanan, Shrikanth S
2017-04-14
Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided and the nature of pathomechanisms of apraxic speech discussed. One adult male speaker with apraxia of speech was imaged using real-time MRI while producing spontaneous speech, repeated naming tasks, and self-paced repetition of word pairs designed to elicit speech errors. Articulatory data were analyzed, and speech errors were detected using time series reflecting articulatory activity in regions of interest. Real-time MRI captured two types of apraxic gestural intrusion errors in a word pair repetition task. Gestural intrusion errors in nonrepetitive speech, multiple silent initiation gestures at the onset of speech, and covert (unphonated) articulation of entire monosyllabic words were also captured. Real-time MRI and accompanying analytical methods capture and quantify many features of apraxic speech that have been previously observed using other modalities while offering high spatial resolution. This patient's apraxia of speech affected the ability to select only the appropriate vocal tract gestures for a target utterance, suppressing others, and to coordinate them in time.
ERIC Educational Resources Information Center
Tierney, Joseph; Mack, Molly
1987-01-01
Stimuli used in research on the perception of the speech signal have often been obtained from simple filtering and distortion of the speech waveform, sometimes accompanied by noise. However, for more complex stimulus generation, the parameters of speech can be manipulated, after analysis and before synthesis, using various types of algorithms to…
Effects of Speaking Task on Intelligibility in Parkinson's Disease
ERIC Educational Resources Information Center
Tjaden, Kris; Wilding, Greg
2011-01-01
Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare…
Ventilation and Speech Characteristics during Submaximal Aerobic Exercise
ERIC Educational Resources Information Center
Baker, Susan E.; Hipp, Jenny; Alessio, Helaine
2008-01-01
Purpose: This study examined alterations in ventilation and speech characteristics as well as perceived dyspnea during submaximal aerobic exercise tasks. Method: Twelve healthy participants completed aerobic exercise-only and simultaneous speaking and aerobic exercise tasks at 50% and 75% of their maximum oxygen consumption (VO₂ max).…
Obermeier, Christian; Holle, Henning; Gunter, Thomas C
2011-07-01
The present series of experiments explores several issues related to gesture-speech integration and synchrony during sentence processing. To be able to more precisely manipulate gesture-speech synchrony, we used gesture fragments instead of complete gestures, thereby avoiding the usual long temporal overlap of gestures with their coexpressive speech. In a pretest, the minimal duration of an iconic gesture fragment needed to disambiguate a homonym (i.e., disambiguation point) was therefore identified. In three subsequent ERP experiments, we then investigated whether the gesture information available at the disambiguation point has immediate as well as delayed consequences on the processing of a temporarily ambiguous spoken sentence, and whether these gesture-speech integration processes are susceptible to temporal synchrony. Experiment 1, which used asynchronous stimuli as well as an explicit task, showed clear N400 effects at the homonym as well as at the target word presented further downstream, suggesting that asynchrony does not prevent integration under explicit task conditions. No such effects were found when asynchronous stimuli were presented using a more shallow task (Experiment 2). Finally, when gesture fragment and homonym were synchronous, similar results as in Experiment 1 were found, even under shallow task conditions (Experiment 3). We conclude that when iconic gesture fragments and speech are in synchrony, their interaction is more or less automatic. When they are not, more controlled, active memory processes are necessary to be able to combine the gesture fragment and speech context in such a way that the homonym is disambiguated correctly.
Influence of auditory attention on sentence recognition captured by the neural phase.
Müller, Jana Annina; Kollmeier, Birger; Debener, Stefan; Brand, Thomas
2018-03-07
The aim of this study was to investigate whether attentional influences on speech recognition are reflected in the neural phase entrained by an external modulator. Sentences were presented in 7 Hz sinusoidally modulated noise while the neural response to that modulation frequency was monitored by electroencephalogram (EEG) recordings in 21 participants. We implemented a selective attention paradigm including three different attention conditions while keeping physical stimulus parameters constant. The participants' task was either to repeat the sentence as accurately as possible (speech recognition task), to count the number of decrements implemented in modulated noise (decrement detection task), or to do both (dual task), while the EEG was recorded. Behavioural analysis revealed reduced performance in the dual task condition for decrement detection, possibly reflecting limited cognitive resources. EEG analysis revealed no significant differences in power for the 7 Hz modulation frequency, but an attention-dependent phase difference between tasks. Further phase analysis revealed a significant difference 500 ms after sentence onset between trials with correct and incorrect responses for speech recognition, indicating that speech recognition performance and the neural phase are linked via selective attention mechanisms, at least shortly after sentence onset. However, the neural phase effects identified were small and await further investigation. © 2018 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
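The phase analysis can be sketched as: band-pass the EEG around the 7 Hz modulation rate, extract instantaneous phase with the Hilbert transform, and compare circular means between correct and incorrect trials, for example around 500 ms after sentence onset. Filter settings and function names below are assumptions, not the study's exact pipeline:

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def onset_phases(epochs, fs, f0=7.0, bw=2.0, t_s=0.5):
        # epochs: (n_trials, n_samples) locked to sentence onset.
        b, a = butter(4, [(f0 - bw) / (fs / 2), (f0 + bw) / (fs / 2)],
                      btype='bandpass')
        filt = filtfilt(b, a, epochs, axis=1)
        # Instantaneous phase at t_s seconds after onset, one per trial.
        return np.angle(hilbert(filt, axis=1))[:, int(t_s * fs)]

    def circ_mean(p):
        return np.angle(np.mean(np.exp(1j * p)))

    # phase_diff = np.angle(np.exp(1j * (circ_mean(ph_correct)
    #                                    - circ_mean(ph_incorrect))))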
Processing melodic contour and speech intonation in congenital amusics with Mandarin Chinese.
Jiang, Cunmei; Hamm, Jeff P; Lim, Vanessa K; Kirk, Ian J; Yang, Yufang
2010-07-01
Congenital amusia is a disorder in the perception and production of musical pitch. It has been suggested that early exposure to a tonal language may compensate for the pitch disorder (Peretz, 2008). If so, it is reasonable to expect that there would be different characterizations of pitch perception in music and speech in congenital amusics who speak a tonal language, such as Mandarin. In this study, a group of 11 adults with amusia whose first language was Mandarin were tested with melodic contour and speech intonation discrimination and identification tasks. The participants with amusia were impaired in discriminating and identifying melodic contour. These abnormalities were also detected in identifying both speech and non-linguistic analogue derived patterns for the Mandarin intonation tasks. In addition, there was an overall trend for the participants with amusia to show deficits with respect to controls in the intonation discrimination tasks for both speech and non-linguistic analogues. These findings suggest that the amusics' melodic pitch deficits may extend to the perception of speech, and could potentially result in some language deficits in those who speak a tonal language. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Stanley, Nicholas; Davis, Tara; Estis, Julie
2017-03-01
Aging effects on speech understanding in noise have primarily been assessed through speech recognition tasks. Recognition tasks, which focus on bottom-up, perceptual aspects of speech understanding, intentionally limit linguistic and cognitive factors by asking participants to only repeat what they have heard. On the other hand, linguistic processing tasks require bottom-up and top-down (linguistic, cognitive) processing skills and are, therefore, more reflective of speech understanding abilities used in everyday communication. The effect of signal-to-noise ratio (SNR) on linguistic processing ability is relatively unknown for either young (YAs) or older adults (OAs). To determine if reduced SNRs would be more deleterious to the linguistic processing of OAs than YAs, as measured by accuracy and reaction time in a semantic judgment task in competing speech. In the semantic judgment task, participants indicated via button press whether word pairs were a semantic Match or No Match. This task was performed in quiet, as well as, +3, 0, -3, and -6 dB SNR with two-talker speech competition. Seventeen YAs (20-30 yr) with normal hearing sensitivity and 17 OAs (60-68 yr) with normal hearing sensitivity or mild-to-moderate sensorineural hearing loss within age-appropriate norms. Accuracy, reaction time, and false alarm rate were measured and analyzed using a mixed design analysis of variance. A decrease in SNR level significantly reduced accuracy and increased reaction time in both YAs and OAs. However, poor SNRs affected accuracy and reaction time of Match and No Match word pairs differently. Accuracy for Match pairs declined at a steeper rate than No Match pairs in both groups as SNR decreased. In addition, reaction time for No Match pairs increased at a greater rate than Match pairs in more difficult SNRs, particularly at -3 and -6 dB SNR. False-alarm rates indicated that participants had a response bias to No Match pairs as the SNR decreased. Age-related differences were limited to No Match pair accuracies at -6 dB SNR. The ability to correctly identify semantically matched word pairs was more susceptible to disruption by a poor SNR than semantically unrelated words in both YAs and OAs. The effect of SNR on this semantic judgment task implies that speech competition differentially affected the facilitation of semantically related words and the inhibition of semantically incompatible words, although processing speed, as measured by reaction time, remained faster for semantically matched pairs. Overall, the semantic judgment task in competing speech elucidated the effect of a poor listening environment on the higher order processing of words. American Academy of Audiology
Speed-Accuracy Tradeoffs in Speech Production
2017-06-01
…imaging data of speech production. A theoretical framework for considering Fitts' law in the domain of speech production is elucidated. One goal is to determine whether articulatory kinematics conform to Fitts' law; a second, associated goal is to address the methodological challenges inherent in performing Fitts-style analysis on rtMRI data of speech production. These methodological challenges include segmenting continuous speech into specific motor tasks and defining key…
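For reference, Fitts' law in its common form predicts movement time MT from movement amplitude D and target width W via an index of difficulty ID:

    \[
      MT = a + b\,\log_2\!\left(\frac{2D}{W}\right),
      \qquad ID = \log_2\!\left(\frac{2D}{W}\right)
    \]

where a and b are empirically fitted constants. Applying this to articulatory data requires choosing speech analogues of D and W (for example, distance to a constriction target and the target's spatial tolerance), which is exactly the kind of methodological challenge the report describes.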
Best, Virginia; Mason, Christine R.; Swaminathan, Jayaganesh; Roverud, Elin; Kidd, Gerald
2017-01-01
In many situations, listeners with sensorineural hearing loss demonstrate reduced spatial release from masking compared to listeners with normal hearing. This deficit is particularly evident in the “symmetric masker” paradigm in which competing talkers are located to either side of a central target talker. However, there is some evidence that reduced target audibility (rather than a spatial deficit per se) under conditions of spatial separation may contribute to the observed deficit. In this study a simple “glimpsing” model (applied separately to each ear) was used to isolate the target information that is potentially available in binaural speech mixtures. Intelligibility of these glimpsed stimuli was then measured directly. Differences between normally hearing and hearing-impaired listeners observed in the natural binaural condition persisted for the glimpsed condition, despite the fact that the task no longer required segregation or spatial processing. This result is consistent with the idea that the performance of listeners with hearing loss in the spatialized mixture was limited by their ability to identify the target speech based on sparse glimpses, possibly as a result of some of those glimpses being inaudible. PMID:28147587
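The glimpsing analysis can be approximated by an ideal time-frequency mask: retain only the units where the target locally exceeds the combined maskers by some criterion. A minimal sketch with an assumed 3 dB criterion (the study's exact rule and its ear-by-ear handling may differ):

    import numpy as np

    def glimpse_mask(target, masker, criterion_db=3.0):
        # target, masker: magnitude spectrograms of equal shape (one ear).
        eps = np.finfo(float).tiny
        local_snr_db = 20.0 * np.log10((target + eps) / (masker + eps))
        return local_snr_db > criterion_db  # True = audible 'glimpse'

    # glimpsed = mixture * glimpse_mask(target, masker)  # applied per ear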
Onojima, Takayuki; Kitajo, Keiichi; Mizuhara, Hiroaki
2017-01-01
Neural oscillation is attracting attention as an underlying mechanism for speech recognition. Speech intelligibility is enhanced by the synchronization of speech rhythms and slow neural oscillation, which is typically observed as human scalp electroencephalography (EEG). In addition to the effect of neural oscillation, it has been proposed that speech recognition is enhanced by the identification of a speaker's motor signals, which are used for speech production. To verify the relationship between the effect of neural oscillation and motor cortical activity, we measured scalp EEG, and simultaneous EEG and functional magnetic resonance imaging (fMRI) during a speech recognition task in which participants were required to recognize spoken words embedded in noise sound. We proposed an index to quantitatively evaluate the EEG phase effect on behavioral performance. The results showed that the delta and theta EEG phase before speech inputs modulated the participant's response time when conducting speech recognition tasks. The simultaneous EEG-fMRI experiment showed that slow EEG activity was correlated with motor cortical activity. These results suggested that the effect of the slow oscillatory phase was associated with the activity of the motor cortex during speech recognition.
Application of speech recognition and synthesis in the general aviation cockpit
NASA Technical Reports Server (NTRS)
North, R. A.; Mountford, S. J.; Bergeron, H.
1984-01-01
Interactive speech recognition/synthesis technology is assessed as a method for the aleviation of single-pilot IFR flight workloads. Attention was given during this series of evaluations to the conditions typical of general aviation twin-engine aircrft cockpits, covering several commonly encountered IFR flight condition scenarios. The most beneficial speech command tasks are noted to be in the data retrieval domain, which would allow the pilot access to uplinked data, checklists, and performance charts. Data entry tasks also appear to benefit from this technology.
Visual Distractors Disrupt Audiovisual Integration Regardless of Stimulus Complexity
Gibney, Kyla D.; Aligbe, Enimielen; Eggleston, Brady A.; Nunes, Sarah R.; Kerkhoff, Willa G.; Dean, Cassandra L.; Kwakye, Leslie D.
2017-01-01
The intricate relationship between multisensory integration and attention has been extensively researched in the multisensory field; however, the necessity of attention for the binding of multisensory stimuli remains contested. In the current study, we investigated whether diverting attention from well-known multisensory tasks would disrupt integration and whether the complexity of the stimulus and task modulated this interaction. A secondary objective of this study was to investigate individual differences in the interaction of attention and multisensory integration. Participants completed a simple audiovisual speeded detection task and McGurk task under various perceptual load conditions: no load (multisensory task while visual distractors present), low load (multisensory task while detecting the presence of a yellow letter in the visual distractors), and high load (multisensory task while detecting the presence of a number in the visual distractors). Consistent with prior studies, we found that increased perceptual load led to decreased reports of the McGurk illusion, thus confirming the necessity of attention for the integration of speech stimuli. Although increased perceptual load led to longer response times for all stimuli in the speeded detection task, participants responded faster on multisensory trials than unisensory trials. However, the increase in multisensory response times violated the race model for no and low perceptual load conditions only. Additionally, a geometric measure of Miller’s inequality showed a decrease in multisensory integration for the speeded detection task with increasing perceptual load. Surprisingly, we found diverging changes in multisensory integration with increasing load for participants who did not show integration for the no load condition: no changes in integration for the McGurk task with increasing load but increases in integration for the detection task. The results of this study indicate that attention plays a crucial role in multisensory integration for both highly complex and simple multisensory tasks and that attention may interact differently with multisensory processing in individuals who do not strongly integrate multisensory information. PMID:28163675
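The race model test referred to above is commonly evaluated with Miller's inequality, F_AV(t) <= F_A(t) + F_V(t): if the multisensory response-time distribution exceeds the summed unisensory distributions at any t, responses are faster than any race between independent channels allows. A minimal sketch using empirical CDFs (the study's geometric measure of the inequality is a related but distinct quantity):

    import numpy as np

    def race_violation(rt_av, rt_a, rt_v):
        # Empirical CDFs evaluated on a common grid of pooled quantiles.
        grid = np.percentile(np.concatenate([rt_av, rt_a, rt_v]),
                             np.arange(5, 100, 5))
        ecdf = lambda x: np.mean(x[:, None] <= grid[None, :], axis=0)
        bound = np.minimum(ecdf(rt_a) + ecdf(rt_v), 1.0)
        # Positive return values indicate a race model violation.
        return float(np.max(ecdf(rt_av) - bound))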
Wagner, Valentin; Jescheniak, Jörg D; Schriefers, Herbert
2010-03-01
Three picture-word interference experiments addressed the question of whether the scope of grammatical advance planning in sentence production corresponds to some fixed unit or rather is flexible. Subjects produced sentences of different formats under varying amounts of cognitive load. When speakers described 2-object displays with simple sentences of the form "the frog is next to the mug," the 2 nouns were found to be lexically-semantically activated to similar degrees at speech onset, as indexed by similarly sized interference effects from semantic distractors related to either the first or the second noun. When speakers used more complex sentences (including prenominal color adjectives; e.g., "the blue frog is next to the blue mug"), much larger interference effects were observed for the first than for the second noun, suggesting that the second noun was lexically-semantically activated before speech onset on only a subset of trials. With increased cognitive load, introduced by an additional conceptual decision task and variable utterance formats, the interference effect for the first noun increased and the interference effect for the second noun disappeared, suggesting that the scope of advance planning had been narrowed. By contrast, if cognitive load was induced by a secondary working memory task to be performed during speech planning, the interference effect for both nouns increased, suggesting that the scope of advance planning had not been affected. In all, the data suggest that the scope of advance planning during grammatical encoding in sentence production is flexible rather than structurally fixed.
Hodgson, Jessica C; Hudson, John M
2017-03-01
Research using clinical populations to explore the relationship between hemispheric speech lateralization and handedness has focused on individuals with speech and language disorders, such as dyslexia or specific language impairment (SLI). Such work reveals atypical patterns of cerebral lateralization and handedness in these groups compared to controls. There are few studies that examine this relationship in people with motor coordination impairments but without speech or reading deficits, which is a surprising omission given the prevalence of theories suggesting a common neural network underlying both functions. We use an emerging imaging technique in cognitive neuroscience, functional transcranial Doppler (fTCD) ultrasound, to assess whether individuals with developmental coordination disorder (DCD) display reduced left-hemisphere lateralization for speech production compared to control participants. Twelve adult control participants and 12 adults with DCD, but no other developmental/cognitive impairments, performed a word-generation task whilst undergoing fTCD imaging to establish a hemispheric lateralization index for speech production. All participants also completed an electronic peg-moving task to determine hand skill. As predicted, the DCD group showed a significantly reduced left-lateralization pattern for the speech production task compared to controls. Performance on the motor skill task showed a clear preference for the dominant hand across both groups; however, the DCD group's mean movement times were significantly higher for the non-dominant hand. This is the first study of its kind to assess hand skill and speech lateralization in DCD. The results reveal a reduced leftwards asymmetry for speech and slower motor performance. This fits alongside previous work showing atypical cerebral lateralization in DCD for other cognitive processes (e.g., executive function and short-term memory) and thus speaks to debates on theories of the links between motor control and language production. © 2016 The Authors. Journal of Neuropsychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.
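At its core, an fTCD lateralization index is a left-minus-right contrast of blood-flow-velocity change during the task period. The sketch below is deliberately simplified (published fTCD pipelines add heart-cycle integration, epoching, artifact rejection, and peak-centered averaging windows); all names and parameters are illustrative:

    import numpy as np

    def ftcd_li(v_left, v_right, fs, window_s):
        # v_left, v_right: blood-flow-velocity envelopes of equal length.
        pct = lambda v: 100.0 * (v - v.mean()) / v.mean()  # crude % change
        i0, i1 = (int(t * fs) for t in window_s)  # task window in seconds
        diff = pct(v_left) - pct(v_right)
        return diff[i0:i1].mean()  # positive = left-lateralized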
Dynamic action units slip in speech production errors
Goldstein, Louis; Pouplier, Marianne; Chen, Larissa; Saltzman, Elliot; Byrd, Dani
2008-01-01
In the past, the nature of the compositional units proposed for spoken language has largely diverged from the types of control units pursued in the domains of other skilled motor tasks. A classic source of evidence as to the units structuring speech has been patterns observed in speech errors – “slips of the tongue”. The present study reports, for the first time, on kinematic data from tongue and lip movements during speech errors elicited in the laboratory using a repetition task. Our data are consistent with the hypothesis that speech production results from the assembly of dynamically defined action units – gestures – in a linguistically structured environment. The experimental results support both the presence of gestural units and the dynamical properties of these units and their coordination. This study of speech articulation shows that it is possible to develop a principled account of spoken language within a more general theory of action. PMID:16822494
Central timing deficits in subtypes of primary speech disorders.
Peter, Beate; Stoel-Gammon, Carol
2008-03-01
Childhood apraxia of speech (CAS) is a proposed speech disorder subtype that interferes with motor planning and/or programming, affecting prosody in many cases. Pilot data (Peter & Stoel-Gammon, 2005) were consistent with the notion that deficits in timing accuracy in speech and music-related tasks may be associated with CAS. This study replicated and expanded earlier findings. Eleven children with speech disorders and age- and gender-matched controls participated in non-word imitation, clapped rhythm imitation, and paced repetitive tapping tasks. Results suggest a central timing deficit, expressed in both the oral and the limb modality, and observable in two different types of timing measures: overall rhythmic structures and small-scale durations. Associations among timing measures were strongest in the participants with speech disorders, who also showed lower timing accuracy than the controls in all measures. The number of observed CAS characteristics was associated with timing deficits.
Inner speech impairments in autism.
Whitehouse, Andrew J O; Maybery, Murray T; Durkin, Kevin
2006-08-01
Three experiments investigated the role of inner speech deficits in the cognitive performance of children with autism. Experiment 1 compared children with autism with ability-matched controls on a verbal recall task presenting pictures and words. Experiment 2 used pictures for which the typical names were either monosyllabic or multisyllabic. Two encoding conditions manipulated the use of verbal encoding. Experiment 3 employed a task-switching paradigm for which performance has been shown to be contingent upon inner speech. In Experiment 1, children with autism demonstrated a lower picture-superiority effect compared to controls. In Experiment 2, the children with autism showed a smaller word-length effect when pictures were presented alone, but a more substantial word-length effect in a condition requiring overt labelling. In Experiment 3, articulatory suppression affected the task-switching performance of the control participants only. Individuals with autism have limitations in their use of inner speech.
PETER, BEATE; BUTTON, LE; STOEL-GAMMON, CAROL; CHAPMAN, KATHY; RASKIND, WENDY H.
2013-01-01
The purpose of this study was to evaluate a global deficit in sequential processing as a candidate endophenotype in a family with familial childhood apraxia of speech (CAS). Of 10 adults and 13 children in a three-generational family with speech sound disorder (SSD) consistent with CAS, 3 adults and 6 children had past or present SSD diagnoses. Two preschoolers with unremediated CAS showed a high number of sequencing errors during single-word production. Performance on tasks with high sequential processing loads differentiated between the affected and unaffected family members, whereas there were no group differences in tasks with low processing loads. Adults with a history of SSD produced more sequencing errors during nonword and multisyllabic real word imitation, compared to those without such a history. Results are consistent with a global deficit in sequential processing that influences speech development as well as cognitive and linguistic processing. PMID:23339324
Joint Spatial-Spectral Feature Space Clustering for Speech Activity Detection from ECoG Signals
Kanas, Vasileios G.; Mporas, Iosif; Benz, Heather L.; Sgarbas, Kyriakos N.; Bezerianos, Anastasios; Crone, Nathan E.
2014-01-01
Brain machine interfaces for speech restoration have been extensively studied for more than two decades. The success of such a system will depend in part on selecting the best brain recording sites and signal features corresponding to speech production. The purpose of this study was to detect speech activity automatically from electrocorticographic signals based on joint spatial-frequency clustering of the ECoG feature space. For this study, the ECoG signals were recorded while a subject performed two different syllable repetition tasks. We found that the optimal frequency resolution to detect speech activity from ECoG signals was 8 Hz, achieving 98.8% accuracy by employing support vector machines (SVM) as a classifier. We also defined the cortical areas that held the most information about the discrimination of speech and non-speech time intervals. Additionally, the results shed light on the distinct cortical areas associated with the two syllable repetition tasks and may contribute to the development of portable ECoG-based communication. PMID:24658248
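The pipeline described, spectral features at 8 Hz resolution fed to a support vector machine, can be sketched directly. Only the 8 Hz resolution and the SVM classifier come from the abstract; the feature layout, windowing, and hyperparameters below are assumptions:

    import numpy as np
    from scipy.signal import welch
    from sklearn.svm import SVC

    def spectral_features(window, fs, resolution_hz=8.0, fmax=200.0):
        # window: (channels, samples) of ECoG for one analysis interval.
        nperseg = int(fs / resolution_hz)  # sets ~8 Hz frequency resolution
        feats = []
        for ch in window:
            freqs, psd = welch(ch, fs=fs, nperseg=nperseg)
            feats.append(np.log(psd[freqs <= fmax] + 1e-12))
        return np.concatenate(feats)

    # X: stacked feature vectors, y: 1 = speech, 0 = non-speech intervals
    # clf = SVC(kernel='rbf').fit(X_train, y_train)
    # accuracy = clf.score(X_test, y_test)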
Asaridou, Salomi S.; Hagoort, Peter; McQueen, James M.
2015-01-01
We investigated music and language processing in a group of early bilinguals who spoke a tone language and a non-tone language (Cantonese and Dutch). We assessed online speech-music processing interactions, that is, interactions that occur when speech and music are processed simultaneously in songs, with a speeded classification task. In this task, participants judged sung pseudowords either musically (based on the direction of the musical interval) or phonologically (based on the identity of the sung vowel). We also assessed longer-term effects of linguistic experience on musical ability, that is, the influence of extensive prior experience with language when processing music. These effects were assessed with a task in which participants had to learn to identify musical intervals and with four pitch-perception tasks. Our hypothesis was that due to their experience in two different languages using lexical versus intonational tone, the early Cantonese-Dutch bilinguals would outperform the Dutch control participants. In online processing, the Cantonese-Dutch bilinguals processed speech and music more holistically than controls. This effect seems to be driven by experience with a tone language, in which integration of segmental and pitch information is fundamental. Regarding longer-term effects of linguistic experience, we found no evidence for a bilingual advantage in either the music-interval learning task or the pitch-perception tasks. Together, these results suggest that being a Cantonese-Dutch bilingual does not have any measurable longer-term effects on pitch and music processing, but does have consequences for how speech and music are processed jointly. PMID:26659377
Lavigne, Katie M; Woodward, Todd S
2018-04-01
Hypercoupling of activity in speech-perception-specific brain networks has been proposed to play a role in the generation of auditory-verbal hallucinations (AVHs) in schizophrenia; however, it is unclear whether this hypercoupling extends to nonverbal auditory perception. We investigated this by comparing schizophrenia patients with and without AVHs, and healthy controls, on task-based functional magnetic resonance imaging (fMRI) data combining verbal speech perception (SP), inner verbal thought generation (VTG), and nonverbal auditory oddball detection (AO). Data from two previously published fMRI studies were simultaneously analyzed using group constrained principal component analysis for fMRI (group fMRI-CPCA), which allowed for comparison of task-related functional brain networks across groups and tasks while holding the brain networks under study constant, leading to determination of the degree to which networks are common to verbal and nonverbal perception conditions, and which show coordinated hyperactivity in hallucinations. Three functional brain networks emerged: (a) auditory-motor, (b) language processing, and (c) default-mode (DMN) networks. Combining the AO and sentence tasks allowed the auditory-motor and language networks to separately emerge, whereas they were aggregated when individual tasks were analyzed. AVH patients showed greater coordinated activity (deactivity for DMN regions) than non-AVH patients during SP in all networks, but this did not extend to VTG or AO. This suggests that the hypercoupling in AVH patients in speech-perception-related brain networks is specific to perceived speech, and does not extend to perceived nonspeech or inner verbal thought generation. © 2017 Wiley Periodicals, Inc.
Speech Synthesis Applied to Language Teaching.
ERIC Educational Resources Information Center
Sherwood, Bruce
1981-01-01
The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…
Speech Entrainment Compensates for Broca's Area Damage
Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris
2015-01-01
Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed that damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443
Lau, Johnny King L; Humphreys, Glyn W; Douis, Hassan; Balani, Alex; Bickerton, Wai-Ling; Rotshtein, Pia
2015-01-01
We report a lesion-symptom mapping analysis of visual speech production deficits in a large group (280) of stroke patients at the sub-acute stage (<120 days post-stroke). Performance on object naming was evaluated alongside three other tests of visual speech production, namely sentence production to a picture, sentence reading and nonword reading. A principal component analysis was performed on all these tests' scores and revealed a 'shared' component that loaded across all the visual speech production tasks and a 'unique' component that isolated object naming from the other three tasks. Regions for the shared component were observed in the left fronto-temporal cortices, fusiform gyrus and bilateral visual cortices. Lesions in these regions linked to both poor object naming and impairment in general visual-speech production. On the other hand, the unique naming component was potentially associated with the bilateral anterior temporal poles, hippocampus and cerebellar areas. This is in line with the models proposing that object naming relies on a left-lateralised language dominant system that interacts with a bilateral anterior temporal network. Neuropsychological deficits in object naming can reflect both the increased demands specific to the task and the more general difficulties in language processing.
Rayes, Hanin; Sheft, Stanley; Shafiro, Valeriy
2014-01-01
Past work has shown a relationship between the ability to discriminate spectral patterns and measures of speech intelligibility. The purpose of this study was to investigate the ability of both children and young adults to discriminate static and dynamic spectral patterns, comparing performance between the two groups and evaluating within-group results in terms of their relationship to speech-in-noise perception. Data were collected from normal-hearing children (age range: 5.4-12.8 yrs) and young adults (mean age: 22.8 yrs) on two spectral discrimination tasks and speech-in-noise perception. The first discrimination task, involving static spectral profiles, measured the ability to detect a change in the phase of a low-density sinusoidal spectral ripple of wideband noise. Using dynamic spectral patterns, the second task determined the signal-to-noise ratio needed to discriminate the temporal pattern of frequency fluctuation imposed by stochastic low-rate frequency modulation (FM). Children performed significantly poorer than young adults on both discrimination tasks. For children, a significant correlation between speech-in-noise perception and spectral-pattern discrimination was obtained only with the dynamic patterns of the FM condition, with partial correlation suggesting that factors related to the children's age mediated the relationship.
Motor laterality as an indicator of speech laterality.
Flowers, Kenneth A; Hudson, John M
2013-03-01
The determination of speech laterality, especially where it is anomalous, is both a theoretical issue and a practical problem for brain surgery. Handedness is commonly thought to be related to speech representation, but exactly how is not clearly understood. This investigation analyzed handedness by preference rating and performance on a reliable task of motor laterality in 34 patients undergoing a Wada test, to see if they could provide an indicator of speech laterality. Hand usage preference ratings divided patients into left, right, and mixed in preference. Between-hand differences in movement time on a pegboard task determined motor laterality. Results were correlated (χ2) with speech representation as determined by a standard Wada test. It was found that patients whose between-hand difference in speed on the motor task was small or inconsistent were the ones whose Wada test speech representation was likely to be ambiguous or anomalous, whereas all those with a consistently large between-hand difference showed clear unilateral speech representation in the hemisphere controlling the better hand (χ2 = 10.45, df = 1, p < .01, η2 = 0.55). This relationship prevailed across hand preference and level of skill in the hands themselves. We propose that motor and speech laterality are related where they both involve a central control of motor output sequencing and that a measure of that aspect of the former will indicate the likely representation of the latter. A between-hand measure of motor laterality based on such a measure may indicate the possibility of anomalous speech representation. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Central Timing Deficits in Subtypes of Primary Speech Disorders
ERIC Educational Resources Information Center
Peter, Beate; Stoel-Gammon, Carol
2008-01-01
Childhood apraxia of speech (CAS) is a proposed speech disorder subtype that interferes with motor planning and/or programming, affecting prosody in many cases. Pilot data (Peter & Stoel-Gammon, 2005) were consistent with the notion that deficits in timing accuracy in speech and music-related tasks may be associated with CAS. This study…
Coppens-Hofman, Marjolein C.; Terband, Hayo; Snik, Ad F.M.; Maassen, Ben A.M.
2017-01-01
Purpose Adults with intellectual disabilities (ID) often show reduced speech intelligibility, which affects their social interaction skills. This study aims to establish the main predictors of this reduced intelligibility in order to ultimately optimise management. Method Spontaneous speech and picture naming tasks were recorded in 36 adults with mild or moderate ID. Twenty-five naïve listeners rated the intelligibility of the spontaneous speech samples. Performance on the picture-naming task was analysed by means of a phonological error analysis based on expert transcriptions. Results The transcription analyses showed that the phonemic and syllabic inventories of the speakers were complete. However, multiple errors at the phonemic and syllabic level were found. The frequencies of specific types of errors were related to intelligibility and quality ratings. Conclusions The development of the phonemic and syllabic repertoire appears to be completed in adults with mild-to-moderate ID. The charted speech difficulties can be interpreted to indicate speech motor control and planning difficulties. These findings may aid the development of diagnostic tests and speech therapies aimed at improving speech intelligibility in this specific group. PMID:28118637
Vocal Control: Is It Susceptible to the Negative Effects of Self-Regulatory Depletion?
Vinney, Lisa A; van Mersbergen, Miriam; Connor, Nadine P; Turkstra, Lyn S
2016-09-01
Self-regulation (SR) relies on the capacity to modify behavior. This capacity may diminish with use and result in self-regulatory depletion (SRD), or the reduced ability to engage in future SR efforts. If the SRD effect applies to vocal behavior, it may hinder success during behavioral voice treatment. Thus, this proof-of-concept study sought to determine whether SRD affects vocal behavior change and if so, whether it can be repaired by an intervention meant to replete SR resources. One hundred four women without voice disorders were randomized into groups that performed either (1) a high-SR writing task followed by a high-SR voice task; (2) a low-SR writing task followed by a high-SR voice task; or (3) a high-SR writing task followed by a relaxation intervention and a high-SR voice task. The high-SR voice tasks in all groups involved suppression of the Lombard effect during reading and free speech. The low-SR group suppressed the Lombard effect to a greater extent than the high-SR group and high-SR-plus-relaxation group on the free speech task. There were no significant group differences on the reading task. Findings suggest that SRD may present challenges to vocal behavior modification during free speech but not reading. Furthermore, relaxation did not significantly replete self-regulatory resources for vocal modification during free speech. Findings may highlight potential considerations for voice treatment and assessment and support the need for future research focusing on effective methods to test self-regulatory capacity and replete self-regulatory resources in voice patients. Published by Elsevier Inc.
Does the cost function matter in Bayes decision rule?
Schlüter, Ralf; Nussbaum-Thom, Markus; Ney, Hermann
2012-02-01
In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.
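Stated compactly in standard notation (a sketch of the setting, not the paper's exact derivation), the Bayes decision rule with a general cost function L chooses the class with minimal expected cost, and the choice of L is exactly where the string/symbol inconsistency enters:

\hat{c}(x) \;=\; \operatorname*{argmin}_{c}\; \sum_{c'} \Pr(c' \mid x)\, L(c, c')

With the 0-1 cost $L(c, c') = 1 - \delta(c, c')$ this reduces to the MAP rule $\hat{c}(x) = \operatorname*{argmax}_{c} \Pr(c \mid x)$, which minimizes string error. For word sequences $W$, an integer-valued metric cost such as the Levenshtein distance gives

\hat{W}(x) \;=\; \operatorname*{argmin}_{W}\; \sum_{W'} \Pr(W' \mid x)\, \mathrm{Lev}(W, W')

which minimizes expected word (symbol) error and need not select the MAP string; the conditions derived in the paper describe when these two rules agree or differ by a bounded cost.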
The relation between categorical perception of speech stimuli and reading skills in children
NASA Astrophysics Data System (ADS)
Breier, Joshua; Fletcher, Jack; Klaas, Patricia; Gray, Lincoln
2005-09-01
Children ages 7 to 14 years listened to seven tokens, /ga/ to /ka/, synthesized in equal steps from 0 to 60 ms along the voice onset time (VOT) continuum, played in continuous rhythm. All possible changes (21) between the seven tokens were presented seven times at random intervals, maintaining the rhythm. Children were asked to press a button as soon as they detected a change. Maps of the seven tokens, constructed from multidimensional scaling of reaction times, indicated two salient dimensions: one phonological and the other acoustic/phonetic. Better reading, spelling, and phonological processing skills were associated with greater relative weighting of the phonological as compared to the acoustic dimension, suggesting that children with reading difficulty and associated deficits may underweight the phonological and/or overweight the acoustic information in speech signals. This task required no training and only momentary memory of the tokens. That an analysis of a simple task coincides with more complex reading tests suggests a low-level deficit (or shift in listening strategy). Compared to control children, children with reading disabilities may pay more attention to subtle details in these signals and less attention to the global pattern or attribute. [Supported by NIH Grant 1 RO1 HD35938 to JIB.]
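The mapping step described here (reaction times for all 21 token pairs scaled into a low-dimensional perceptual space) can be sketched with off-the-shelf multidimensional scaling. The conversion from RT to dissimilarity, the toy data, and the two-dimensional solution below are illustrative assumptions, not the study's actual procedure:

# Illustrative sketch: change-detection reaction times between token pairs,
# treated as dissimilarities (faster detection = more dissimilar), mapped to
# two dimensions with MDS. Data and the RT-to-dissimilarity rule are invented.
import numpy as np
from sklearn.manifold import MDS

n_tokens = 7  # /ga/-/ka/ continuum, 0-60 ms VOT in equal steps

rng = np.random.default_rng(1)
rt = rng.uniform(0.3, 0.9, size=(n_tokens, n_tokens))  # toy mean RTs in seconds
rt = (rt + rt.T) / 2                                   # symmetrize over the 21 pairs

dissim = 1.0 / rt                 # faster responses imply larger dissimilarity
np.fill_diagonal(dissim, 0.0)     # a token is identical to itself

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)  # one 2-D point per token
print(coords)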
Gygi, Brian; Shafiro, Valeriy
2014-04-01
Speech perception in multitalker environments often requires listeners to divide attention among several concurrent talkers before focusing on one talker with pertinent information. Such attentionally demanding tasks are particularly difficult for older adults due both to age-related hearing loss (presbycusis) and general declines in attentional processing and associated cognitive abilities. This study investigated two signal-processing techniques that have been suggested as a means of improving speech perception accuracy of older adults: time stretching and spatial separation of target talkers. Stimuli in each experiment comprised 2-4 fixed-form utterances in which listeners were asked to consecutively 1) detect concurrently spoken keywords in the beginning of the utterance (divided attention) and 2) identify additional keywords from only one talker at the end of the utterance (selective attention). In Experiment 1, the overall tempo of each utterance was unaltered or slowed down by 25%; in Experiment 2 the concurrent utterances were spatially coincident or separated across a 180-degree hemifield. Both manipulations improved performance for elderly adults with age-appropriate hearing on both tasks. Increasing the divided attention load by attending to more concurrent keywords had a marked negative effect on performance of the selective attention task only when the target talker was identified by a keyword, but not by spatial location. These findings suggest that the temporal and spatial modifications of multitalker speech improved perception primarily by reducing competition among cognitive resources required to perform attentionally demanding tasks. Published by Elsevier B.V.
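Both manipulations are simple to prototype in signal-processing code. The sketch below, assuming a hypothetical input file and using the librosa library, slows the tempo and applies a crude level-based pan; actual studies would typically use calibrated spatialization (e.g., HRTFs or loudspeaker arrays) rather than this shortcut:

# Hypothetical sketch using librosa; the file name and the level-based pan
# are assumptions, not the study's stimulus pipeline.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None, mono=True)  # hypothetical file

# Tempo reduced to 75% of the original, one reading of "slowed down by 25%".
y_slow = librosa.effects.time_stretch(y, rate=0.75)

def pan(signal, azimuth):
    # Constant-power pan; azimuth in [-1 (hard left), +1 (hard right)].
    left = signal * np.sqrt((1 - azimuth) / 2)
    right = signal * np.sqrt((1 + azimuth) / 2)
    return np.stack([left, right])

stereo = pan(y_slow, azimuth=-1.0)  # place the target talker to the left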
Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Pinheiro da Silva, Joyce; Wertzner, Haydée Fiszbein
2017-05-22
The purpose of the study was to determine the sensitivity and specificity of the severity index Percentage of Consonants Correct - Revised (PCC-R), and to establish cutoff points for it, in Brazilian Portuguese-speaking children with and without speech sound disorders. Participants were 72 children between 5:00 and 7:11 years of age: 36 without speech and language complaints and 36 with speech sound disorders. The PCC-R was applied to the figure naming and word imitation tasks that are part of the ABFW Child Language Test, and results were statistically analyzed. An ROC curve analysis was performed, and sensitivity and specificity values of the index were verified. The group of children without speech sound disorders presented greater PCC-R values in both tasks, regardless of the gender of the participants. The cutoff value observed for the picture naming task was 93.4%, with a sensitivity of 0.89 and specificity of 0.94 (age independent). For the word imitation task, results were age-dependent: for the age group ≤6:5 years old, the cutoff value was 91.0% (sensitivity of 0.77 and specificity of 0.94), and for the age group >6:5 years old, the cutoff value was 93.9% (sensitivity of 0.93 and specificity of 0.94). Given the high sensitivity and specificity of the PCC-R, we conclude that the index was effective in discriminating and identifying children with and without speech sound disorders.
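A hedged sketch of how such cutoff values can be derived: given an index score per child and a diagnostic label, an ROC analysis yields the sensitivity/specificity trade-off at every candidate cutoff. The synthetic scores, the use of scikit-learn, and Youden's J as the selection criterion are assumptions for illustration; the paper's exact statistical procedure is not reproduced here:

# Synthetic sketch of cutoff selection from an ROC curve; scores, labels,
# and Youden's J as the criterion are assumptions for illustration only.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(2)
scores = np.concatenate([rng.normal(96, 2, 36),   # typical children: high PCC-R
                         rng.normal(88, 4, 36)])  # SSD children: lower PCC-R
has_ssd = np.concatenate([np.zeros(36), np.ones(36)])

# Lower PCC-R should indicate disorder, so negate scores for roc_curve,
# which treats larger values as more "positive".
fpr, tpr, thresholds = roc_curve(has_ssd, -scores)

j = tpr - fpr                      # Youden's J = sensitivity + specificity - 1
best = int(np.argmax(j))
cutoff = -thresholds[best]
print(f"cutoff={cutoff:.1f} sensitivity={tpr[best]:.2f} specificity={1 - fpr[best]:.2f}")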
Effect of Perceptual Load on Semantic Access by Speech in Children
ERIC Educational Resources Information Center
Jerger, Susan; Damian, Markus F.; Mills, Candice; Bartlett, James; Tye-Murray, Nancy; Abdi, Herve
2013-01-01
Purpose: To examine whether semantic access by speech requires attention in children. Method: Children ("N" = 200) named pictures and ignored distractors on a cross-modal (distractors: auditory-no face) or multimodal (distractors: auditory-static face and audiovisual-dynamic face) picture word task. The cross-modal task had a low load,…
Response Generalization in Apraxia of Speech Treatments: Taking Another Look.
ERIC Educational Resources Information Center
Ballard, Kirrie J.
2001-01-01
This article presents a critical review and reanalysis of response generalization effects in studies of treatment efficacy in apraxia of speech. The discussion focuses on the influence of the theoretical basis used to develop hypotheses and select behavior to test predictions, the complexity of the treatment task/s, and patient characteristics.…
Mental Load in Listening, Speech Shadowing and Simultaneous Interpreting: A Pupillometric Study.
ERIC Educational Resources Information Center
Tommola, Jorma; Hyona, Jukka
This study investigated the sensitivity of the pupillary response as an indicator of average mental load during three language processing tasks of varying complexity. The tasks included: (1) listening (without any subsequent comprehension testing); (2) speech shadowing (repeating a message in the same language while listening to it); and (3)…
ERIC Educational Resources Information Center
Caviness, John N.; Liss, Julie M.; Adler, Charles; Evidente, Virgilio
2006-01-01
Purpose: Corticomuscular electroencephalographic-electromyographic (EEG-EMG) coherence elicited by speech and nonspeech oromotor tasks in healthy participants and those with Parkinson's disease (PD) was examined. Hypotheses were the following: (a) corticomuscular coherence is demonstrable between orbicularis oris (OO) muscles' EMG and scalp EEG…
Effects of redundancy in the comparison of speech and pictorial displays in the cockpit environment.
Byblow, W D
1990-06-01
Synthesised speech and pictorial displays were compared in a spatially compatible simulated cockpit environment. Messages of high or low levels of redundancy were presented to subjects in both modality conditions. Subjects responded to warnings presented in a warning-only condition and in a dual-task condition, in which a simulated flight task was performed with visual and manual input/output modalities. Because the amount of information presented in most real-world applications and experimental paradigms is quantifiably large with respect to present guidelines for the use of synthesised speech warnings, the low-redundancy condition was hypothesised to allow for better performance. Results showed that subjects responded more quickly to messages of low redundancy in both modalities. It is suggested that speech messages with low redundancy levels were effective in minimising message length and ensuring that messages did not overload the short-term memory required to process and maintain speech in memory. Manipulation of phrase structure was used to optimise message redundancy and enhance the conceptual compatibility of the message without increasing message length or imposing a perceptual cost or memory overload. The results also suggest that system response times were quicker when synthesised speech warnings were used. This result is consistent with predictions from multiple resource theory, which states that the resources required for the perception of verbal warnings are different from those for the flight task. It is also suggested that the perception of a pictorial display requires the same resources used for the perception of the primary flight task. An alternative explanation is that pictorial displays impose a visual scanning cost which is responsible for decreased performance. Based on the findings reported here, it is suggested that speech displays be incorporated in a spatially compatible cockpit environment because they allow equal or better performance when compared with pictorial displays. More importantly, the amount of time that the operator must direct his vision away from information vital to the flight task is decreased.
Inhibitory Control as a Moderator of Threat-related Interference Biases in Social Anxiety
Gorlin, Eugenia I.; Teachman, Bethany A.
2014-01-01
Prior findings are mixed regarding the presence and direction of threat-related interference biases in social anxiety. The current study examined general inhibitory control (IC), measured by the classic color-word Stroop, as a moderator of the relationship between threat interference biases (indexed by the emotional Stroop) and several social anxiety indicators. High socially anxious undergraduate students (N=159) completed the emotional and color-word Stroop tasks, followed by an anxiety-inducing speech task. Participants completed measures of trait social anxiety, state anxiety before and during the speech, negative task-interfering cognitions during the speech, and overall self-evaluation of speech performance. Speech duration was used to measure behavioral avoidance. In line with hypotheses, IC moderated the relationship between emotional Stroop bias and every anxiety indicator (with the exception of behavioral avoidance), such that greater social-threat interference was associated with higher anxiety among those with weak IC, whereas lesser social-threat interference was associated with higher anxiety among those with strong IC. Implications for the theory and treatment of threat interference biases in socially anxious individuals are discussed. PMID:24967719
Moberly, Aaron C; Patel, Tirth R; Castellanos, Irina
2018-02-01
This study tested the hypotheses that, as a result of their hearing loss, adults with cochlear implants (CIs) would self-report poorer executive functioning (EF) skills than normal-hearing (NH) peers, and that these EF skills would be associated with performance on speech recognition tasks. EF refers to a group of higher-order neurocognitive skills responsible for behavioral and emotional regulation during goal-directed activity, and EF has been found to be poorer in children with CIs than in their NH age-matched peers. Moreover, there is increasing evidence that neurocognitive skills, including some EF skills, contribute to the ability to recognize speech through a CI. Thirty postlingually deafened adults with CIs and 42 age-matched NH adults were enrolled. Participants and their spouses or significant others (informants) completed well-validated self-reports or informant-reports of EF, the Behavior Rating Inventory of Executive Function - Adult (BRIEF-A). CI users' speech recognition skills were assessed in quiet using several measures of sentence recognition. NH peers were tested for recognition of noise-vocoded versions of the same speech stimuli. CI users self-reported difficulty on EF tasks of shifting and task monitoring. In CI users, measures of speech recognition correlated with several self-reported EF skills. The present findings provide further evidence that neurocognitive factors, including specific EF skills, may decline in association with hearing loss, and that some of these EF skills contribute to speech processing under degraded listening conditions.
Smith, Anne; Goffman, Lisa; Sasisekaran, Jayanthi; Weber-Fox, Christine
2012-01-01
Stuttering is a disorder of speech production that typically arises in the preschool years, and many accounts of its onset and development implicate language and motor processes as critical underlying factors. There have, however, been very few studies of speech motor control processes in preschool children who stutter. Hearing novel nonwords and reproducing them engages multiple neural networks, including those involved in phonological analysis and storage and speech motor programming and execution. We used this task to explore speech motor and language abilities of 31 children aged 4–5 years who were diagnosed as stuttering. We also used sensitive and specific standardized tests of speech and language abilities to determine which of the children who stutter had concomitant language and/or phonological disorders. Approximately half of our sample of stuttering children had language and/or phonological disorders. As previous investigations would suggest, the stuttering children with concomitant language or speech sound disorders produced significantly more errors on the nonword repetition task compared to typically developing children. In contrast, the children who were diagnosed as stuttering, but who had normal speech sound and language abilities, performed the nonword repetition task with equal accuracy compared to their normally fluent peers. Analyses of interarticulator motions during accurate and fluent productions of the nonwords revealed that the children who stutter (without concomitant disorders) showed higher variability in oral motor coordination indices. These results provide new evidence that preschool children diagnosed as stuttering lag their typically developing peers in maturation of speech motor control processes. Educational objectives: The reader will be able to: (a) discuss why performance on nonword repetition tasks has been investigated in children who stutter; (b) discuss why children who stutter in the current study had a higher incidence of concomitant language deficits compared to several other studies; (c) describe how performance differed on a nonword repetition test between children who stutter who do and do not have concomitant speech or language deficits; (d) make a general statement about speech motor control for nonword production in children who stutter compared to controls. PMID:23218217
Özçalışkan, Şeyda; Levine, Susan C.; Goldin-Meadow, Susan
2013-01-01
Children with pre/perinatal unilateral brain lesions (PL) show remarkable plasticity for language development. Is this plasticity characterized by the same developmental trajectory that characterizes typically developing (TD) children, with gesture leading the way into speech? We explored this question, comparing 11 children with PL—matched to 30 TD children on expressive vocabulary—in the second year of life. Children with PL showed similarities to TD children for simple but not complex sentence types. Children with PL produced simple sentences across gesture and speech several months before producing them entirely in speech, exhibiting parallel delays in both gesture+speech and speech-alone. However, unlike TD children, children with PL produced complex sentence types first in speech-alone. Overall, the gesture-speech system appears to be a robust feature of language-learning for simple—but not complex—sentence constructions, acting as a harbinger of change in language development even when that language is developing in an injured brain. PMID:23217292
Effects of Hearing Loss and Cognitive Load on Speech Recognition with Competing Talkers.
Meister, Hartmut; Schreitmüller, Stefan; Ortmann, Magdalene; Rählmann, Sebastian; Walger, Martin
2016-01-01
Everyday communication frequently comprises situations with more than one talker speaking at a time. These situations are challenging since they pose high attentional and memory demands, placing cognitive load on the listener. Hearing impairment additionally exacerbates communication problems under these circumstances. We examined the effects of hearing loss and attention tasks on speech recognition with competing talkers in older adults with and without hearing impairment. We hypothesized that hearing loss would affect word identification, talker separation and word recall and that the difficulties experienced by the hearing impaired listeners would be especially pronounced in a task with high attentional and memory demands. Two listener groups closely matched for their age and neuropsychological profile but differing in hearing acuity were examined regarding their speech recognition with competing talkers in two different tasks. One task required repeating back words from one target talker (1TT) while ignoring the competing talker, whereas the other required repeating back words from both talkers (2TT). The competing talkers differed with respect to their voice characteristics. Moreover, sentences either with low or high context were used in order to consider linguistic properties. Compared to their normal-hearing peers, listeners with hearing loss revealed limited speech recognition in both tasks. Their difficulties were especially pronounced in the more demanding 2TT task. In order to shed light on the underlying mechanisms, different error sources, namely having misunderstood, confused, or omitted words, were investigated. Misunderstanding and omitting words were more frequently observed in the hearing impaired than in the normal-hearing listeners. In line with common speech perception models, it is suggested that these effects are related to impaired object formation and taxed working memory capacity (WMC). In a post-hoc analysis, the listeners were further separated with respect to their WMC. It appeared that higher capacity served as a compensatory mechanism against the adverse effects of hearing loss, especially with low-context speech.
Effect(s) of Language Tasks on Severity of Disfluencies in Preschool Children with Stuttering.
Zamani, Peyman; Ravanbakhsh, Majid; Weisi, Farzad; Rashedi, Vahid; Naderi, Sara; Hosseinzadeh, Ayub; Rezaei, Mohammad
2017-04-01
Speech disfluency in children can increase or decrease depending on the type of linguistic task presented to them. In this study, the effect of sentence imitation and sentence modeling on the severity of speech disfluencies in preschool children with stuttering is investigated. In this cross-sectional descriptive-analytical study, 58 children with stuttering (29 with mild stuttering and 29 with moderate stuttering) and 58 typical children aged between 4 and 6 years old participated. The severity of speech disfluencies was assessed by SSI-3 and TOCS before and after each task. In boys with mild stuttering, the mean stuttering severity scores in the two tasks of sentence imitation and sentence modeling were [Formula: see text] and [Formula: see text], respectively ([Formula: see text]). In boys with moderate stuttering, the stuttering severity scores in the two tasks were [Formula: see text] and [Formula: see text], respectively ([Formula: see text]). In girls with mild stuttering, the stuttering severity scores in the two tasks of sentence imitation and sentence modeling were [Formula: see text] and [Formula: see text], respectively ([Formula: see text]). In girls with moderate stuttering, the mean stuttering severity scores in the two tasks were [Formula: see text] and [Formula: see text], respectively ([Formula: see text]). In typical children of both genders, the score of speech disfluencies showed no significant difference between the two tasks ([Formula: see text]). In preschool children with mild stuttering and their nonstuttering peers, performing the tasks of sentence imitation and sentence modeling did not increase the severity of stuttering. However, in preschool children with moderate stuttering, the sentence modeling task increased the stuttering severity score.
Söderlund, Göran B. W.; Jobs, Elisabeth Nilsson
2016-01-01
The most common neuropsychiatric condition in children is attention deficit hyperactivity disorder (ADHD), affecting ∼6–9% of the population. ADHD is distinguished by inattention and hyperactive, impulsive behaviors as well as poor performance in various cognitive tasks, often leading to failures at school. Sensory and perceptual dysfunctions have also been noticed. Prior research has mainly focused on limitations in executive functioning, where differences are often explained by deficits in pre-frontal cortex activation. Less notice has been given to sensory perception and subcortical functioning in ADHD. Recent research has shown that children with an ADHD diagnosis have a deviant auditory brain stem response compared to healthy controls. The aim of the present study was to investigate whether the speech recognition threshold differs between attentive children and children with ADHD symptoms in two environmental sound conditions, with and without external noise. Previous research has shown that children with attention deficits can benefit from white noise exposure during cognitive tasks, and here we investigate whether this noise benefit is present during an auditory perceptual task. For this purpose we used a modified Hagerman's speech recognition test in which children with and without attention deficits performed a binaural speech recognition task to assess the speech recognition threshold in no-noise and noise conditions (65 dB). Results showed that the inattentive group displayed a higher speech recognition threshold than typically developing children and that the difference in speech recognition threshold disappeared when the children were exposed to noise at supra-threshold level. From this we conclude that inattention can partly be explained by sensory perceptual limitations that can possibly be ameliorated through noise exposure. PMID:26858679
Objective measures of listening effort: effects of background noise and noise reduction.
Sarampalis, Anastasios; Kalluri, Sridhar; Edwards, Brent; Hafter, Ervin
2009-10-01
This work is aimed at addressing a seeming contradiction related to the use of noise-reduction (NR) algorithms in hearing aids. The problem is that although some listeners claim a subjective improvement from NR, it has not been shown to improve speech intelligibility, often even making it worse. To address this, the hypothesis tested here is that the positive effects of NR might be to reduce cognitive effort directed toward speech reception, making it available for other tasks. Normal-hearing individuals participated in 2 dual-task experiments, in which 1 task was to report sentences or words in noise set to various signal-to-noise ratios. Secondary tasks involved either holding words in short-term memory or responding in a complex visual reaction-time task. At low values of signal-to-noise ratio, although NR had no positive effect on speech reception thresholds, it led to better performance on the word-memory task and quicker responses in visual reaction times. Results from both dual tasks support the hypothesis that NR reduces listening effort and frees up cognitive resources for other tasks. Future hearing aid research should incorporate objective measurements of cognitive benefits.
Eichorn, Naomi; Marton, Klara; Schwartz, Richard G; Melara, Robert D; Pirutinsky, Steven
2016-06-01
The present study examined whether engaging working memory in a secondary task benefits speech fluency. Effects of dual-task conditions on speech fluency, rate, and errors were examined with respect to predictions derived from three related theoretical accounts of disfluencies. Nineteen adults who stutter and twenty adults who do not stutter participated in the study. All participants completed 2 baseline tasks: a continuous-speaking task and a working-memory (WM) task involving manipulations of domain, load, and interstimulus interval. In the dual-task portion of the experiment, participants simultaneously performed the speaking task with each unique combination of WM conditions. All speakers showed similar fluency benefits and decrements in WM accuracy as a result of dual-task conditions. Fluency effects were specific to atypical forms of disfluency and were comparable across WM-task manipulations. Changes in fluency were accompanied by reductions in speaking rate but not by corresponding changes in overt errors. Findings suggest that WM contributes to disfluencies regardless of stuttering status and that engaging WM resources while speaking enhances fluency. Further research is needed to verify the cognitive mechanism involved in this effect and to determine how these findings can best inform clinical intervention.
McMurray, Bob; Jongman, Allard
2012-01-01
Most theories of categorization emphasize how continuous perceptual information is mapped to categories. However, equally important are the informational assumptions of a model, that is, the type of information subserving this mapping. This is crucial in speech perception, where the signal is variable and context-dependent. This study assessed the informational assumptions of several models of speech categorization, in particular, the number of cues that are the basis of categorization and whether these cues represent the input veridically or have undergone compensation. We collected a corpus of 2880 fricative productions (Jongman, Wayland & Wong, 2000) spanning many talker- and vowel-contexts and measured 24 cues for each. A subset was also presented to listeners in an 8AFC phoneme categorization task. We then trained a common classification model based on logistic regression to categorize the fricative from the cue values, and manipulated the information in the training set to contrast 1) models based on a small number of invariant cues; 2) models using all cues without compensation; and 3) models in which cues underwent compensation for contextual factors. Compensation was modeled by Computing Cues Relative to Expectations (C-CuRE), a new approach to compensation that preserves fine-grained detail in the signal. Only the compensation model achieved a similar accuracy to listeners and showed the same effects of context. Thus, even simple categorization metrics can overcome the variability in speech when sufficient information is available and compensation schemes like C-CuRE are employed. PMID:21417542
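The abstract's modeling contrast can be made concrete. Below is a minimal sketch, assuming invented data and column names, of the third model class: each cue is regressed on contextual factors (talker, vowel), the residual (the cue relative to expectation, in the spirit of C-CuRE) replaces the raw cue, and a logistic-regression classifier categorizes the fricative. This is an illustration of the idea, not the authors' code:

# Invented-data sketch: regress each cue on context (talker, vowel), keep the
# residual as "cue relative to expectation", then classify with logistic
# regression. Column names and data are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(3)
n = 400
df = pd.DataFrame({
    "talker": rng.integers(0, 20, n).astype(str),  # hypothetical talker IDs
    "vowel": rng.integers(0, 6, n).astype(str),    # hypothetical vowel contexts
    "fricative": rng.integers(0, 8, n),            # 8 response categories
})
cues = rng.standard_normal((n, 24))                # 24 cues, as in the corpus

context = pd.get_dummies(df[["talker", "vowel"]]).to_numpy(dtype=float)
residuals = np.empty_like(cues)
for k in range(cues.shape[1]):
    expected = LinearRegression().fit(context, cues[:, k]).predict(context)
    residuals[:, k] = cues[:, k] - expected        # compensated cue values

clf = LogisticRegression(max_iter=1000).fit(residuals, df["fricative"])
print(clf.score(residuals, df["fricative"]))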
ERIC Educational Resources Information Center
Mailend, Marja-Liisa; Maas, Edwin
2013-01-01
Purpose: Apraxia of speech (AOS) is considered a speech motor programming impairment, but the specific nature of the impairment remains a matter of debate. This study investigated 2 hypotheses about the underlying impairment in AOS framed within the Directions Into Velocities of Articulators (DIVA; Guenther, Ghosh, & Tourville, 2006) model: The…
ERIC Educational Resources Information Center
Vandana, V. P.; Manjula, R.
2006-01-01
Cerebellum plays an important role in speech motor control. Various tasks like sustained phonation, diadochokinesis and conversation have been used to tap the speech timing abilities of dysarthric clients with cerebellar lesion. It has recently been proposed that not all areas of the cerebellum may be involved in speech motor control; especially…
So, Wing-Chee; Yi-Feng, Alvan Low; Yap, De-Fu; Kheng, Eugene; Yap, Ju-Min Melvin
2013-01-01
Previous studies have shown that iconic gestures presented in an isolated manner prime visually presented semantically related words. Since gestures and speech are almost always produced together, this study examined whether iconic gestures accompanying speech would prime words and compared the priming effect of iconic gestures with speech to that of iconic gestures presented alone. Adult participants (N = 180) were randomly assigned to one of three conditions in a lexical decision task: Gestures-Only (the primes were iconic gestures presented alone); Speech-Only (the primes were auditory tokens conveying the same meaning as the iconic gestures); Gestures-Accompanying-Speech (the primes were the simultaneous coupling of iconic gestures and their corresponding auditory tokens). Our findings revealed significant priming effects in all three conditions. However, the priming effect in the Gestures-Accompanying-Speech condition was comparable to that in the Speech-Only condition and was significantly weaker than that in the Gestures-Only condition, suggesting that the facilitatory effect of iconic gestures accompanying speech may be constrained by the level of language processing required in the lexical decision task, where linguistic processing of word forms is more dominant than semantic processing. Hence, the priming effect afforded by the co-speech iconic gestures was weakened. PMID:24155738
Rith-Najarian, Leslie R.; McLaughlin, Katie A.; Sheridan, Margaret A.; Nock, Matthew K.
2014-01-01
Extensive research among adults supports the biopsychosocial (BPS) model of challenge and threat, which describes relationships among stress appraisals, physiological stress reactivity, and performance; however, no previous studies have examined these relationships in adolescents. Perceptions of stressors as well as physiological reactivity to stress increase during adolescence, highlighting the importance of understanding the relationships among stress appraisals, physiological reactivity, and performance during this developmental period. In this study, 79 adolescent participants reported on stress appraisals before and after a Trier Social Stress Test in which they performed a speech task. Physiological stress reactivity was defined by changes in cardiac output and total peripheral resistance from a baseline rest period to the speech task, and performance on the speech was coded using an objective rating system. We observed in adolescents only two relationships found in past adult research on the BPS model variables: (1) pre-task stress appraisal predicted post-task stress appraisal and (2) performance predicted post-task stress appraisal. Physiological reactivity during the speech was unrelated to pre- and post-task stress appraisals and to performance. We conclude that the lack of association between post-task stress appraisal and physiological stress reactivity suggests that adolescents might have low self-awareness of physiological emotional arousal. Our findings further suggest that adolescent stress appraisals are based largely on their performance during stressful situations. Developmental implications of this potential lack of awareness of one’s physiological and emotional state during adolescence are discussed. PMID:24491123
Co-speech iconic gestures and visuo-spatial working memory.
Wu, Ying Choon; Coulson, Seana
2014-11-01
Three experiments tested the role of verbal versus visuo-spatial working memory in the comprehension of co-speech iconic gestures. In Experiment 1, participants viewed congruent discourse primes in which the speaker's gestures matched the information conveyed by his speech, and incongruent ones in which the semantic content of the speaker's gestures diverged from that in his speech. Discourse primes were followed by picture probes that participants judged as being either related or unrelated to the preceding clip. Performance on this picture probe classification task was faster and more accurate after congruent than incongruent discourse primes. The effect of discourse congruency on response times was linearly related to measures of visuo-spatial, but not verbal, working memory capacity, as participants with greater visuo-spatial WM capacity benefited more from congruent gestures. In Experiments 2 and 3, participants performed the same picture probe classification task under conditions of high and low loads on concurrent visuo-spatial (Experiment 2) and verbal (Experiment 3) memory tasks. Effects of discourse congruency and verbal WM load were additive, while effects of discourse congruency and visuo-spatial WM load were interactive. Results suggest that congruent co-speech gestures facilitate multi-modal language comprehension, and indicate an important role for visuo-spatial WM in these speech-gesture integration processes. Copyright © 2014 Elsevier B.V. All rights reserved.
Children's Acoustic and Linguistic Adaptations to Peers with Hearing Impairment
ERIC Educational Resources Information Center
Granlund, Sonia; Hazan, Valerie; Mahon, Merle
2018-01-01
Purpose: This study aims to examine the clear speaking strategies used by older children when interacting with a peer with hearing loss, focusing on both acoustic and linguistic adaptations in speech. Method: The Grid task, a problem-solving task developed to elicit spontaneous interactive speech, was used to obtain a range of global acoustic and…
Effect(s) of Language Tasks on Severity of Disfluencies in Preschool Children with Stuttering
ERIC Educational Resources Information Center
Zamani, Peyman; Ravanbakhsh, Majid; Weisi, Farzad; Rashedi, Vahid; Naderi, Sara; Hosseinzadeh, Ayub; Rezaei, Mohammad
2017-01-01
Speech disfluency in children can be increased or decreased depending on the type of linguistic task presented to them. In this study, the effect of sentence imitation and sentence modeling on severity of speech disfluencies in preschool children with stuttering is investigated. In this cross-sectional descriptive analytical study, 58 children…
Impact of Noise and Working Memory on Speech Processing in Adults with and without ADHD
ERIC Educational Resources Information Center
Michalek, Anne M. P.
2012-01-01
Auditory processing of speech is influenced by internal (i.e., attention, working memory) and external factors (i.e., background noise, visual information). This study examined the interplay among these factors in individuals with and without ADHD. All participants completed a listening in noise task, two working memory capacity tasks, and two…
ERIC Educational Resources Information Center
Sandgren, Olof; Andersson, Richard; van de Weijer, Joost; Hansson, Kristina; Sahlén, Birgitta
2014-01-01
Purpose: To investigate gaze behavior during communication between children with hearing impairment (HI) and normal-hearing (NH) peers. Method: Ten HI-NH and 10 NH-NH dyads performed a referential communication task requiring description of faces. During task performance, eye movements and speech were tracked. Using verbal event (questions,…
ERIC Educational Resources Information Center
Kormos, Judit; Préfontaine, Yvonne
2017-01-01
The present mixed-methods study examined the role of learner appraisals of speech tasks in second language (L2) French fluency. Forty adult learners in a Canadian immersion program participated in the study that compared four sources of data: (1) objectively measured utterance fluency in participants' performances of three narrative tasks…
Alternating Motion Rate as an Index of Speech Motor Disorder in Traumatic Brain Injury
ERIC Educational Resources Information Center
Wang, Yu-Tsai; Kent, Ray D.; Duffy, Joseph R.; Thomas, Jack E.; Weismer, Gary
2004-01-01
The task of syllable alternating motion rate (AMR) (also called diadochokinesis) is suitable for examining speech disorders of varying degrees of severity and in individuals with varying levels of linguistic and cognitive ability. However, very limited information on this task has been published for subjects with traumatic brain injury (TBI). This…
Developmental Shifts in Children's Sensitivity to Visual Speech: A New Multimodal Picture-Word Task
ERIC Educational Resources Information Center
Jerger, Susan; Damian, Markus F.; Spence, Melanie J.; Tye-Murray, Nancy; Abdi, Herve
2009-01-01
This research developed a multimodal picture-word task for assessing the influence of visual speech on phonological processing by 100 children between 4 and 14 years of age. We assessed how manipulation of seemingly to-be-ignored auditory (A) and audiovisual (AV) phonological distractors affected picture naming without participants consciously…
Cingulo-opercular activity affects incidental memory encoding for speech in noise.
Vaden, Kenneth I; Teubner-Rhodes, Susan; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A
2017-08-15
Correctly understood speech in difficult listening conditions is often difficult to remember. A long-standing hypothesis for this observation is that the engagement of cognitive resources to aid speech understanding can limit resources available for memory encoding. This hypothesis is consistent with evidence that speech presented in difficult conditions typically elicits greater activity throughout cingulo-opercular regions of frontal cortex that are proposed to optimize task performance through adaptive control of behavior and tonic attention. However, successful memory encoding of items for delayed recognition memory tasks is consistently associated with increased cingulo-opercular activity when perceptual difficulty is minimized. The current study used a delayed recognition memory task to test competing predictions that memory encoding for words is enhanced or limited by the engagement of cingulo-opercular activity during challenging listening conditions. An fMRI experiment was conducted with twenty healthy adult participants who performed a word identification in noise task that was immediately followed by a delayed recognition memory task. Consistent with previous findings, word identification trials in the poorer signal-to-noise ratio condition were associated with increased cingulo-opercular activity and poorer recognition memory scores on average. However, cingulo-opercular activity decreased for correctly identified words in noise that were not recognized in the delayed memory test. These results suggest that memory encoding in difficult listening conditions is poorer when elevated cingulo-opercular activity is not sustained. Although increased attention to speech when presented in difficult conditions may detract from more active forms of memory maintenance (e.g., sub-vocal rehearsal), we conclude that task performance monitoring and/or elevated tonic attention supports incidental memory encoding in challenging listening conditions. Copyright © 2017 Elsevier Inc. All rights reserved.
Perceived Conventionality in Co-speech Gestures Involves the Fronto-Temporal Language Network
Wolf, Dhana; Rekittke, Linn-Marlen; Mittelberg, Irene; Klasen, Martin; Mathiak, Klaus
2017-01-01
Face-to-face communication is multimodal; it encompasses spoken words, facial expressions, gaze, and co-speech gestures. In contrast to linguistic symbols (e.g., spoken words or signs in sign language), which rely on mostly explicit conventions, gestures vary in their degree of conventionality. Bodily signs may have a generally accepted or conventionalized meaning (e.g., a head shake) or less so (e.g., self-grooming). We hypothesized that the subjective perception of conventionality in co-speech gestures relies on the classical language network, i.e., the left-hemispheric inferior frontal gyrus (IFG, Broca's area) and the posterior superior temporal gyrus (pSTG, Wernicke's area), and studied 36 subjects watching video-recorded story retellings during a behavioral and a functional magnetic resonance imaging (fMRI) experiment. It is well documented that neural correlates of such naturalistic videos emerge as intersubject covariance (ISC) in fMRI even without a model of the stimulus (model-free analysis). The subjects attended either to perceived conventionality or to a control condition (any hand movements or gesture-speech relations). Such tasks modulate ISC in the contributing neural structures, and we therefore studied how ISC in language networks changed with task demands. Indeed, the conventionality task significantly increased covariance of the button-press time series and neuronal synchronization in the left IFG relative to the other tasks. In the left IFG, synchronous activity was observed during the conventionality task only. In contrast, the left pSTG exhibited correlated activation patterns during all conditions, with an increase in the conventionality task at the trend level only. Conceivably, the left IFG can be considered a core region for the processing of perceived conventionality in co-speech gestures, similar to spoken language. In general, the interpretation of conventionalized signs may rely on neural mechanisms that engage during language comprehension. PMID:29249945
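The ISC measure used here is model-free in that each subject's response is referenced to the other subjects' responses rather than to a stimulus model. A minimal sketch of the common leave-one-out formulation (an illustration under stated assumptions, not necessarily the authors' exact pipeline; `ts` is a hypothetical array of regional fMRI time courses, one row per subject):

```python
import numpy as np

def leave_one_out_isc(ts):
    # ts: (n_subjects, n_timepoints) array of time courses for one region.
    ts = (ts - ts.mean(axis=1, keepdims=True)) / ts.std(axis=1, keepdims=True)
    isc = np.empty(ts.shape[0])
    for i in range(ts.shape[0]):
        # Correlate subject i with the average of all remaining subjects.
        others = np.delete(ts, i, axis=0).mean(axis=0)
        isc[i] = np.corrcoef(ts[i], others)[0, 1]
    return isc  # one value per subject; compare across task conditions
```

A task effect such as the one reported for the left IFG would then appear as a difference in these per-subject ISC values between the conventionality and control conditions.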
Emotional reactivity and regulation in preschool-age children who stutter.
Ntourou, Katerina; Conture, Edward G; Walden, Tedra A
2013-09-01
This study experimentally investigated behavioral correlates of emotional reactivity and emotion regulation and their relation to speech (dis)fluency in preschool-age children who do (CWS) and do not (CWNS) stutter during emotion-eliciting conditions. Participants (18 CWS, 14 boys; 18 CWNS, 14 boys) completed two experimental tasks: (1) a neutral task ("apples and leaves in a transparent box," ALTB) and (2) a frustrating task ("attractive toy in a transparent box," ATTB), both of which were followed by a narrative task. Dependent measures were emotional reactivity (positive affect, negative affect) and emotion regulation (self-speech, distraction) exhibited during the ALTB and ATTB tasks, and the percentages of stuttered disfluencies (SDs) and non-stuttered disfluencies (NSDs) produced during the narratives. Results indicated that preschool-age CWS exhibited significantly more negative emotion and more self-speech than preschool-age CWNS. For CWS only, emotion regulation behaviors (i.e., distraction, self-speech) during the experimental tasks were predictive of stuttered disfluencies during the subsequent narrative tasks. Furthermore, for CWS there was no relation between emotional processes and non-stuttered disfluencies, whereas CWNS's negative affect was significantly related to non-stuttered disfluencies. In general, the present findings support the notion that emotional processes are associated with childhood stuttering. Specifically, the findings are consistent with the notion that preschool-age CWS are more emotionally reactive than CWNS and that their self-speech regulatory attempts may be relatively ineffective in modulating their emotions. The reader will be able to: (a) communicate the relevance of studying the role of emotion in developmental stuttering close to the onset of stuttering and (b) describe the main findings of the present study in relation to previous studies that have used different methodologies to investigate the role of emotion in developmental stuttering of young children who stutter. Copyright © 2013 Elsevier Inc. All rights reserved.
Temporal event structure and timing in schizophrenia: preserved binding in a longer "now".
Martin, Brice; Giersch, Anne; Huron, Caroline; van Wassenhove, Virginie
2013-01-01
Patients with schizophrenia experience a loss of temporal continuity or subjective fragmentation along the temporal dimension. Here, we develop the hypothesis that impaired temporal awareness results from a perturbed structuring of events in time, i.e., in canonical neural dynamics. To address this, 26 patients and their matched controls took part in two psychophysical studies using desynchronized audiovisual speech. Two tasks were used and compared: first, an identification task testing for multisensory binding impairments, in which participants reported what they heard while looking at a speaker's face; second, a task testing the perceived simultaneity of the same audiovisual speech stimuli. In both tasks, we used McGurk fusion and combination, which are classic, ecologically valid multisensory illusions. First, and contrary to previous reports, our results show that patients do not significantly differ from controls in their rate of illusory reports. Second, the illusory reports of patients in the identification task were more sensitive to audiovisual speech desynchronies than those of controls. Third, and surprisingly, patients considered audiovisual speech to be synchronized for longer delays than controls. As such, the temporal tolerance profile observed in a temporal judgement task was less of a predictor for sensory binding in schizophrenia than for that obtained in controls. We interpret our results as an impairment of temporal event structuring in schizophrenia that does not specifically affect sensory binding operations but rather the explicit access to timing information associated here with audiovisual speech processing. Our findings are discussed in the context of current neurophysiological frameworks for the binding and the structuring of sensory events in time. Copyright © 2012 Elsevier Ltd. All rights reserved.
Law, Jeremy M.; Vandermosten, Maaike; Ghesquiere, Pol; Wouters, Jan
2014-01-01
This study investigated whether auditory, speech perception, and phonological skills are tightly interrelated or contribute independently to reading. We assessed each of these three skills in 36 adults with a past diagnosis of dyslexia and 54 matched normal-reading adults. Phonological skills were tested by the typical threefold set of tasks, i.e., rapid automatic naming, verbal short-term memory, and phonological awareness. Dynamic auditory processing skills were assessed by means of a frequency modulation (FM) detection task and an amplitude rise time (RT) discrimination task; an intensity discrimination (ID) task was included as a non-dynamic control task. Speech perception was assessed by means of sentences- and words-in-noise tasks. Group analyses revealed significant group differences on the auditory tasks (i.e., RT and ID) and on the phonological processing measures, yet no differences were found for speech perception. In addition, performance on RT discrimination correlated with reading, but this relation was mediated by phonological processing and not by speech-in-noise perception. Finally, inspection of the individual scores revealed that the dyslexic readers showed an increased proportion of subjects with deviant performance on the slow-dynamic auditory and phonological tasks, yet individual dyslexic readers do not display a consistent pattern of deficits across the processing skills. Although our results support phonological and slow-rate dynamic auditory deficits that relate to literacy, they suggest that at the individual level, problems in reading and writing cannot be explained by a cascading auditory theory. Instead, dyslexic adults seem to vary considerably in the extent to which each of the auditory and phonological factors is expressed and interacts with environmental and higher-order cognitive influences. PMID:25071512
Effect of perceptual load on semantic access by speech in children
Jerger, Susan; Damian, Markus F.; Mills, Candice; Bartlett, James; Tye-Murray, Nancy; Abdi, Hervé
2013-01-01
Purpose To examine whether semantic access by speech requires attention in children. Method Children (N = 200) named pictures and ignored distractors on a cross-modal (distractors: auditory, no face) or multimodal (distractors: auditory with a static face and audiovisual with a dynamic face) picture-word task. The cross-modal task imposed a low perceptual load and the multimodal task a high load (i.e., naming pictures displayed on a blank screen vs. below the talker's face on his T-shirt, respectively). The semantic content of the distractors was manipulated to be related vs. unrelated to the picture (e.g., the picture dog with the distractors bear vs. cheese). Lavie's (2005) perceptual load model proposes that semantic access is independent of capacity-limited attentional resources if the irrelevant semantic-content manipulation influences naming times on both tasks despite the difference in load, but dependent on those resources (which the higher-load task exhausts) if irrelevant content influences naming only on the low-load cross-modal task. Results Irrelevant semantic content affected performance on both tasks in 6- to 9-year-olds, but only on the cross-modal task in 4- to 5-year-olds. The addition of visual speech did not influence results on the multimodal task. Conclusion Younger and older children differ in their dependence on attentional resources for semantic access by speech. PMID:22896045
Winn, Matthew B.; Won, Jong Ho; Moon, Il Joon
2016-01-01
Objectives This study was conducted to measure auditory perception by cochlear implant users in the spectral and temporal domains, using tests of either categorization (using speech-based cues) or discrimination (using conventional psychoacoustic tests). We hypothesized that traditional nonlinguistic tests assessing spectral and temporal auditory resolution would correspond to speech-based measures assessing specific aspects of phonetic categorization assumed to depend on spectral and temporal auditory resolution. We further hypothesized that speech-based categorization performance would ultimately be a superior predictor of speech recognition performance, because of the fundamental nature of speech recognition as categorization. Design Nineteen CI listeners and 10 listeners with normal hearing (NH) participated in a suite of tasks that included spectral ripple discrimination (SRD), temporal modulation detection (TMD), and syllable categorization, which was split into a spectral-cue-based task (targeting the /ba/-/da/ contrast) and a timing-cue-based task (targeting the /b/-/p/ and /d/-/t/ contrasts). Speech sounds were manipulated in order to contain specific spectral or temporal modulations (formant transitions or voice onset time, respectively) that could be categorized. Categorization responses were quantified using logistic regression in order to assess perceptual sensitivity to acoustic phonetic cues. Word recognition testing was also conducted for CI listeners. Results CI users were generally less successful at utilizing both spectral and temporal cues for categorization compared to listeners with normal hearing. For the CI listener group, SRD was significantly correlated with the categorization of formant transitions; both were correlated with better word recognition. TMD using 100 Hz and 10 Hz modulated noise was not correlated with the CI subjects’ categorization of VOT, nor with word recognition. Word recognition was correlated more closely with categorization of the controlled speech cues than with performance on the psychophysical discrimination tasks. Conclusions When evaluating people with cochlear implants, controlled speech-based stimuli are feasible to use in tests of auditory cue categorization, to complement traditional measures of auditory discrimination. Stimuli based on specific speech cues correspond to counterpart non-linguistic measures of discrimination, but potentially show better correspondence with speech perception more generally. The ubiquity of the spectral (formant transition) and temporal (VOT) stimulus dimensions across languages highlights the potential to use this testing approach even in cases where English is not the native language. PMID:27438871
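Quantifying categorization with logistic regression, as described, amounts to fitting a psychometric function over the stimulus continuum and reading perceptual sensitivity off its slope. A hedged sketch with invented data (the 7-step continuum and simulated responses are illustrative, not the study's stimuli):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
steps = np.repeat(np.arange(1, 8), 10).astype(float)  # /ba/-/da/ continuum
resp = (steps + rng.normal(size=steps.size) > 4).astype(int)  # 1 = "da"

# Logistic regression of responses on the spectral cue; the slope coefficient
# indexes how sharply the listener categorizes along the formant continuum.
fit = sm.GLM(resp, sm.add_constant(steps), family=sm.families.Binomial()).fit()
print(fit.params)  # [intercept, slope]; shallower slopes = poorer cue use
```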
Effects of vocal training and phonatory task on voice onset time.
McCrea, Christopher R; Morris, Richard J
2007-01-01
The purpose of this study was to examine the temporal-acoustic differences between trained singers and nonsingers during speech and singing tasks. Thirty male participants were separated into two groups of 15 according to level of vocal training (i.e., trained or untrained). The participants spoke and sang carrier phrases containing English voiced and voiceless bilabial stops, and voice onset time (VOT) was measured for the stop consonant productions. Mixed analyses of variance revealed a significant main effect of phonatory task (speech vs. singing) for /p/ and /b/, with VOT durations longer during speech than singing for /p/ and the opposite true for /b/. Furthermore, a significant phonatory task by vocal training interaction was observed for /p/ productions. The results indicate that the type of phonatory task influences VOT and that these influences are most obvious in trained singers, secondary to the articulatory and phonatory adjustments learned during vocal training.
Control of Task Sequences: What is the Role of Language?
Mayr, Ulrich; Kleffner, Killian; Kikumoto, Atsushi; Redford, Melissa A.
2015-01-01
It is almost a truism that language aids serial-order control through self-cuing of upcoming sequential elements. We measured speech onset latencies as subjects performed hierarchically organized task sequences while "thinking aloud" each task label. Surprisingly, speech onset latencies and response times (RTs) were highly synchronized, a pattern that is not consistent with the hypothesis that speaking aids proactive retrieval of upcoming sequential elements during serial-order control. We also found that when instructed to do so, participants were able to speak task labels prior to presentation of response-relevant stimuli and that this substantially reduced RT signatures of retrieval—however at the cost of more sequencing errors. Thus, while proactive retrieval is possible in principle, in natural situations it seems to be prevented through a strong, "gestalt-like" tendency to synchronize speech and action. We suggest that this tendency may support context updating rather than proactive control. PMID:24274386
Automatic detection of Parkinson's disease in running speech spoken in three different languages.
Orozco-Arroyave, J R; Hönig, F; Arias-Londoño, J D; Vargas-Bonilla, J F; Daqrouq, K; Skodda, S; Rusz, J; Nöth, E
2016-01-01
The aim of this study is the analysis of continuous speech signals of people with Parkinson's disease (PD) considering recordings in different languages (Spanish, German, and Czech). A method for the characterization of the speech signals, based on the automatic segmentation of utterances into voiced and unvoiced frames, is addressed here. The energy content of the unvoiced sounds is modeled using 12 Mel-frequency cepstral coefficients and 25 bands scaled according to the Bark scale. Four speech tasks comprising isolated words, rapid repetition of the syllables /pa/-/ta/-/ka/, sentences, and read texts are evaluated. The method proves to be more accurate than classical approaches in the automatic classification of speech of people with PD and healthy controls. The accuracies range from 85% to 99% depending on the language and the speech task. Cross-language experiments are also performed confirming the robustness and generalization capability of the method, with accuracies ranging from 60% to 99%. This work comprises a step forward for the development of computer aided tools for the automatic assessment of dysarthric speech signals in multiple languages.
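As a rough illustration of the unvoiced-frame characterization described above, the sketch below segments an utterance into voiced and unvoiced frames and computes 12 MFCCs on the unvoiced ones. It is a simplified approximation, not the authors' method: the file name, the pyin-based voicing decision, and the mean/standard-deviation pooling are all assumptions, and the Bark-band energies are omitted.

```python
import numpy as np
import librosa

def unvoiced_mfcc_features(path="speech.wav", sr=16000, n_mfcc=12):
    y, _ = librosa.load(path, sr=sr)
    # Frame-level voicing decision: pyin flags frames with a detectable pitch.
    _, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr,
                                     frame_length=2048, hop_length=512)
    # 12 MFCCs per frame, aligned with the pyin frames via the same hop length.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=2048, hop_length=512)
    n = min(mfcc.shape[1], voiced_flag.shape[0])
    unvoiced = mfcc[:, :n][:, ~voiced_flag[:n]]
    # Pool unvoiced frames into one fixed-length vector per utterance,
    # suitable as input to a PD-vs-control classifier.
    return np.concatenate([unvoiced.mean(axis=1), unvoiced.std(axis=1)])
```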
Flanagan, Sheila; Goswami, Usha
2018-03-01
Recent models of the neural encoding of speech suggest a core role for amplitude modulation (AM) structure, particularly regarding AM phase alignment. Accordingly, speech tasks that measure linguistic development in children may exhibit systematic properties regarding AM structure. Here, the acoustic structure of spoken items in two child phonological and morphological tasks, phoneme deletion and plural elicitation, was investigated. The phase synchronisation index (PSI), reflecting the degree of phase alignment between pairs of AMs, was computed for three AM bands (delta, theta, and beta/low gamma; 0.9-2.5 Hz, 2.5-12 Hz, and 12-40 Hz, respectively) in five spectral bands covering 100-7250 Hz. For phoneme deletion, data from 94 child participants with and without dyslexia were used to relate AM structure to behavioural performance. Results revealed that a significant change in the magnitude of the phase synchronisation index (ΔPSI) of the slower AMs (delta-theta) systematically accompanied both phoneme deletion and plural elicitation. Further, children with dyslexia made more linguistic errors as the delta-theta ΔPSI increased. Accordingly, the ΔPSI between slower temporal modulations in the speech signal systematically distinguished test items from accurate responses and predicted task performance. This may suggest that sensitivity to slower AM information in speech is a core aspect of phonological and morphological development.
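The phase synchronisation index for a pair of AM bands is the length of the mean resultant vector of their instantaneous phase difference, PSI = |mean(exp(i(n·φ1 − m·φ2)))|, which is 1 for perfect phase alignment and near 0 for none. A hedged sketch for an already-extracted speech envelope (the filter design is illustrative; the band edges follow the delta and theta ranges quoted above):

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(env, lo, hi, fs):
    # Instantaneous phase of one AM band of the envelope.
    b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, env)))

def psi(env, fs, band1=(0.9, 2.5), band2=(2.5, 12.0), n=1, m=1):
    # n:m phase synchronisation index between two AM bands.
    p1, p2 = band_phase(env, *band1, fs), band_phase(env, *band2, fs)
    return np.abs(np.mean(np.exp(1j * (n * p1 - m * p2))))
```

The ΔPSI reported above would then be the difference between this index computed for a test item and for the corresponding accurate response.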
ERIC Educational Resources Information Center
Preston, Jonathan L.; Edwards, Mary Louise
2009-01-01
Children with residual speech sound errors are often underserved clinically, yet there has been a lack of recent research elucidating the specific deficits in this population. Adolescents aged 10-14 with residual speech sound errors (RE) that included rhotics were compared to normally speaking peers on tasks assessing speed and accuracy of speech…
ERIC Educational Resources Information Center
Code, Chris; Tree, Jeremy; Ball, Martin
2011-01-01
We describe an analysis of speech errors on a confrontation naming task in a man with progressive speech degeneration of 10-year duration from Pick's disease. C.S. had a progressive non-fluent aphasia together with a motor speech impairment and early assessment indicated some naming impairments. There was also an absence of significant…
Sixteen-Month-Old Infants Segment Words from Infant- and Adult-Directed Speech
ERIC Educational Resources Information Center
Mani, Nivedita; Pätzold, Wiebke
2016-01-01
One of the first challenges facing the young language learner is the task of segmenting words from a natural language speech stream, without prior knowledge of how these words sound. Studies with younger children find that children find it easier to segment words from fluent speech when the words are presented in infant-directed speech, i.e., the…
Loudness perception and speech intensity control in Parkinson's disease.
Clark, Jenna P; Adams, Scott G; Dykstra, Allyson D; Moodie, Shane; Jog, Mandar
2014-01-01
The aim of this study was to examine loudness perception in individuals with hypophonia and Parkinson's disease. The participants included 17 individuals with hypophonia related to Parkinson's disease (PD) and 25 age-equivalent controls. The three loudness perception tasks included a magnitude estimation procedure involving a sentence spoken at 60, 65, 70, 75 and 80 dB SPL, an imitation task involving a sentence spoken at 60, 65, 70, 75 and 80 dB SPL, and a magnitude production procedure involving the production of a sentence at five different loudness levels (habitual, two and four times louder and two and four times quieter). The participants with PD produced a significantly different pattern and used a more restricted range than the controls in their perception of speech loudness, imitation of speech intensity, and self-generated estimates of speech loudness. The results support a speech loudness perception deficit in PD involving an abnormal perception of externally generated and self-generated speech intensity. Readers will recognize that individuals with hypophonia related to Parkinson's disease may demonstrate a speech loudness perception deficit involving the abnormal perception of externally generated and self-generated speech intensity. Copyright © 2014 Elsevier Inc. All rights reserved.
The brain dynamics of rapid perceptual adaptation to adverse listening conditions.
Erb, Julia; Henry, Molly J; Eisner, Frank; Obleser, Jonas
2013-06-26
Listeners show a remarkable ability to quickly adjust to degraded speech input. Here, we aimed to identify the neural mechanisms of such short-term perceptual adaptation. In a sparse-sampling, cardiac-gated functional magnetic resonance imaging (fMRI) acquisition, human listeners heard and repeated back 4-band-vocoded sentences (in which the temporal envelope of the acoustic signal is preserved, while spectral information is highly degraded). Clear-speech trials were included as baseline. An additional fMRI experiment on amplitude modulation rate discrimination quantified the convergence of neural mechanisms that subserve coping with challenging listening conditions for speech and non-speech. First, the degraded speech task revealed an "executive" network (comprising the anterior insula and anterior cingulate cortex), parts of which were also activated in the non-speech discrimination task. Second, trial-by-trial fluctuations in successful comprehension of degraded speech drove hemodynamic signal change in classic "language" areas (bilateral temporal cortices). Third, as listeners perceptually adapted to degraded speech, downregulation in a cortico-striato-thalamo-cortical circuit was observable. The present data highlight differential upregulation and downregulation in auditory-language and executive networks, respectively, with important subcortical contributions when successfully adapting to a challenging listening situation.
Gutierrez-Sigut, Eva; Daws, Richard; Payne, Heather; Blott, Jonathan; Marshall, Chloë; MacSweeney, Mairéad
2016-01-01
Neuroimaging studies suggest greater involvement of the left parietal lobe in sign language compared to speech production. This stronger activation might be linked to the specific demands of sign encoding and proprioceptive monitoring. In Experiment 1 we investigated hemispheric lateralization during sign and speech generation in hearing native users of English and British Sign Language (BSL). Participants exhibited stronger lateralization during BSL than English production. In Experiment 2 we investigated whether this increased lateralization could be due exclusively to the higher motoric demands of sign production. Sign-naïve participants performed a phonological fluency task in English and a non-sign repetition task. Participants were left-lateralized in the phonological fluency task, but there was no consistent pattern of lateralization for non-sign repetition in these hearing non-signers. The current data demonstrate stronger left-hemisphere lateralization for producing signs than speech, which was not primarily driven by motoric articulatory demands. PMID:26605960
Rong, Panying; Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.
2016-01-01
Purpose To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967
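The modelling strategy described here (dimension reduction followed by regression on subsystem predictors) can be sketched as below. This is a deliberately simplified, cross-sectional stand-in: the study itself used longitudinal data with mixed-effects models and stepwise predictor selection, and the arrays here are randomly generated placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(66, 10))   # 66 speakers x 10 instrumental measures
y = rng.normal(size=66)         # speech intelligibility scores

# Collapse correlated subsystem measures into a few component scores,
# then regress intelligibility on those subsystem predictors.
scores = PCA(n_components=5).fit_transform(X)
model = LinearRegression().fit(scores, y)
print(model.score(scores, y))   # R^2, analogous to "variance accounted for"
```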
van den Tillaart-Haverkate, Maj; de Ronde-Brons, Inge; Dreschler, Wouter A; Houben, Rolph
2017-01-01
Single-microphone noise reduction leads to subjective benefit, but not to objective improvements in speech intelligibility. We investigated whether response times (RTs) provide an objective measure of the benefit of noise reduction and whether the effect of noise reduction is reflected in rated listening effort. Twelve normal-hearing participants listened to digit triplets that were either unprocessed or processed with one of two noise-reduction algorithms: an ideal binary mask (IBM) and a more realistic minimum mean square error estimator (MMSE). For each of these three processing conditions, we measured (a) speech intelligibility, (b) RTs on two different tasks (identification of the last digit and arithmetic summation of the first and last digit), and (c) subjective listening effort ratings. All measurements were performed at four signal-to-noise ratios (SNRs): -5, 0, +5, and +∞ dB. Speech intelligibility was high (>97% correct) for all conditions. A significant decrease in response time, relative to the unprocessed condition, was found for both IBM and MMSE for the arithmetic but not the identification task. Listening effort ratings were significantly lower for IBM than for MMSE and unprocessed speech in noise. We conclude that RT for an arithmetic task can provide an objective measure of the benefit of noise reduction. For young normal-hearing listeners, both ideal and realistic noise reduction can reduce RTs at SNRs where speech intelligibility is close to 100%. Ideal noise reduction can also reduce perceived listening effort.
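An ideal binary mask of the kind used here keeps the time-frequency cells where the target dominates the noise and zeroes the rest; it is "ideal" because it requires separate access to the clean speech and the noise. A hedged sketch (the 0 dB local-SNR criterion and STFT settings are common choices, not necessarily the study's parameters):

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(speech, noise, fs, lc_db=0.0):
    # Local SNR in each time-frequency cell, from the separated signals.
    _, _, S = stft(speech, fs, nperseg=512)
    _, _, N = stft(noise, fs, nperseg=512)
    local_snr = 20 * np.log10(np.abs(S) / (np.abs(N) + 1e-12))
    mask = local_snr > lc_db            # keep cells where speech dominates
    _, _, X = stft(speech + noise, fs, nperseg=512)   # the noisy mixture
    _, enhanced = istft(mask * X, fs, nperseg=512)
    return enhanced
```

The MMSE estimator used as the realistic comparison works from the mixture alone, estimating the noise statistically rather than from a separate recording.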
Preservation of propositional speech in a pure anomic: the importance of an abstract vocabulary.
Crutch, Sebastian J; Warrington, Elizabeth K
2003-12-01
We describe a detailed quantitative analysis of the propositional speech of a patient, FAV, who became severely anomic following a left occipito-temporal infarction. FAV showed a selective noun retrieval deficit in naming to confrontation and from verbal description. Nonetheless, his propositional speech was fluent and content-rich. To quantify this observation, three picture description-based tasks were designed to elicit spontaneous speech: pictures of professional occupations, real-world scenes, and stylised object scenes. FAV's performance was compared and contrasted with that of 5 age- and sex-matched control subjects on a number of variables, including speech production rate, volume of output, pause frequency and duration, word frequency, word concreteness, and diversity of vocabulary used. FAV's propositional speech fell within the range of normal control performance on the majority of measurements of quality, quantity, and fluency. Only in the narrative tasks, which relied more heavily upon a concrete vocabulary, did FAV become less voluble and resort to summarising the scenes in an abstract manner. This dissociation between virtually intact propositional speech and a severe naming deficit represents the purest case of anomia currently on record. We attribute this dissociation in part to the preservation of his ability to retrieve his abstract word vocabulary. Our account demonstrates that poor performance on standard naming tasks may be indicative of only a narrowly defined word retrieval deficit. However, we also propose the existence of a feedback circuit which guides sentence construction by providing information regarding lexical availability.
ERIC Educational Resources Information Center
Osnes, Berge; Hugdahl, Kenneth; Hjelmervik, Helene; Specht, Karsten
2012-01-01
In studies on auditory speech perception, participants are often asked to perform active tasks, e.g. decide whether the perceived sound is a speech sound or not. However, information about the stimulus, inherent in such tasks, may induce expectations that cause altered activations not only in the auditory cortex, but also in frontal areas such as…
Neural Processing Associated with Comprehension of an Indirect Reply during a Scenario Reading Task
ERIC Educational Resources Information Center
Shibata, Midori; Abe, Jun-ichi; Itoh, Hiroaki; Shimada, Koji; Umeda, Satoshi
2011-01-01
In daily communication, we often use indirect speech to convey our intention. However, little is known about the brain mechanisms that underlie the comprehension of indirect speech. In this study, we conducted a functional MRI experiment using a scenario reading task to compare the neural activity induced by an indirect reply (a type of indirect…
Phonological and Executive Working Memory in L2 Task-Based Speech Planning and Performance
ERIC Educational Resources Information Center
Wen, Zhisheng
2016-01-01
The present study sets out to explore the distinctive roles played by two working memory (WM) components in various aspects of L2 task-based speech planning and performance. A group of 40 post-intermediate proficiency level Chinese EFL learners took part in the empirical study. Following the tenets and basic principles of the…
Constructing a Scale to Assess L2 Written Speech Act Performance: WDCT and E-Mail Tasks
ERIC Educational Resources Information Center
Chen, Yuan-shan; Liu, Jianda
2016-01-01
This study reports the development of a scale to evaluate the speech act performance by intermediate-level Chinese learners of English. A qualitative analysis of the American raters' comments was conducted on learner scripts in response to a total of 16 apology and request written discourse completion task (WDCT) situations. The results showed…
When cognition kicks in: working memory and speech understanding in noise.
Rönnberg, Jerker; Rudner, Mary; Lunner, Thomas; Zekveld, Adriana A
2010-01-01
Perceptual load and cognitive load can be separately manipulated and dissociated in their effects on speech understanding in noise. The Ease of Language Understanding model assumes a theoretical position where perceptual task characteristics interact with the individual's implicit capacities to extract the phonological elements of speech. Phonological precision and speed of lexical access are important determinants for listening in adverse conditions. If there are mismatches between the phonological elements perceived and the phonological representations in long-term memory, explicit working memory (WM)-related capacities will be continually invoked to reconstruct and infer the contents of the ongoing discourse. Whether this induces a high cognitive load or not will in turn depend on the individual's storage and processing capacities in WM. Data suggest that modulated noise maskers may, like speech maskers, serve as triggers for an explicit, WM-based mode of processing. Individuals with high WM capacity benefit more than low-WM-capacity individuals from fast amplitude compression at low or negative input speech-to-noise ratios. The general conclusion is that there is an overarching interaction between the focal purpose of processing in the primary listening task and the extent to which a secondary, distracting task taps into these processes.
The Impact of Feedback Frequency on Performance in a Novel Speech Motor Learning Task.
Lowe, Mara Steinberg; Buchwald, Adam
2017-06-22
This study investigated whether whole nonword accuracy, phoneme accuracy, and acoustic duration measures were influenced by the amount of feedback speakers without impairment received during a novel speech motor learning task. Thirty-two native English speakers completed a nonword production task across 3 time points: practice, short-term retention, and long-term retention. During practice, participants received knowledge of results feedback according to a randomly assigned schedule (100%, 50%, 20%, or 0%). Changes in nonword accuracy, phoneme accuracy, nonword duration, and initial-cluster duration were compared among feedback groups, sessions, and stimulus properties. All participants improved phoneme and whole nonword accuracy at short-term and long-term retention time points. Participants also refined productions of nonwords, as indicated by a decrease in nonword duration across sessions. The 50% group exhibited the largest reduction in duration between practice and long-term retention for nonwords with native and nonnative clusters. All speakers, regardless of feedback schedule, learned new speech motor behaviors quickly with a high degree of accuracy and refined their speech motor skills for perceptually accurate productions. Acoustic measurements may capture more subtle, subperceptual changes that may occur during speech motor learning. https://doi.org/10.23641/asha.5116324.
Speech Restoration: An Interactive Process
ERIC Educational Resources Information Center
Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny
2009-01-01
Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…
Spoken Language Processing in the Clarissa Procedure Browser
NASA Technical Reports Server (NTRS)
Rayner, M.; Hockey, B. A.; Renders, J.-M.; Chatzichrisafis, N.; Farrell, K.
2005-01-01
Clarissa, an experimental voice-enabled procedure browser that has recently been deployed on the International Space Station, is as far as we know the first spoken dialogue system in space. We describe the objectives of the Clarissa project and the system's architecture. In particular, we focus on three key problems: grammar-based speech recognition using the Regulus toolkit; methods for open-mic speech recognition; and robust, side-effect-free dialogue management for handling undos, corrections, and confirmations. We first describe the grammar-based recogniser we have built using Regulus and report experiments in which we compare it against a class N-gram recogniser trained on the same 3297-utterance dataset. We obtained a 15% relative improvement in WER and a 37% improvement in semantic error rate. The grammar-based recogniser moreover outperforms the class N-gram version for utterances of all lengths from 1 to 9 words inclusive. The central problem in building an open-mic speech recognition system is being able to distinguish between commands directed at the system and other material (cross-talk), which should be rejected. Most spoken dialogue systems make the accept/reject decision by applying a threshold to the recognition confidence score. We show how a simple and general method, based on standard approaches to document classification using Support Vector Machines, can give substantially better performance, and report experiments showing a relative reduction in the task-level error rate of about 25% compared to the baseline confidence-threshold method. Finally, we describe a general side-effect-free dialogue management architecture that we have implemented in Clarissa, which extends the "update semantics" framework by including task as well as dialogue information in the information state. We show that this enables elegant treatments of several dialogue management problems, including corrections, confirmations, querying of the environment, and regression testing.
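The cross-talk rejection approach described above treats each recognised utterance as a short document to classify. A minimal sketch of that idea with a TF-IDF representation and a linear SVM (the training phrases are invented, and the paper's actual features and kernel may differ):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

commands = ["next step", "go to step three", "set alarm off", "read note one"]
cross_talk = ["did you see the game last night", "hand me that wrench please"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(commands + cross_talk, [1] * len(commands) + [0] * len(cross_talk))

print(clf.predict(["go to the next step"]))  # 1 = accept as a system command
```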
Loebach, Jeremy L.; Pisoni, David B.; Svirsky, Mario A.
2009-01-01
Objective The objective of this study was to assess whether training on speech processed with an 8-channel noise vocoder to simulate the output of a cochlear implant would produce transfer of auditory perceptual learning to the recognition of non-speech environmental sounds, the identification of speaker gender, and the discrimination of talkers by voice. Design Twenty-four normal hearing subjects were trained to transcribe meaningful English sentences processed with a noise vocoder simulation of a cochlear implant. An additional twenty-four subjects served as an untrained control group and transcribed the same sentences in their unprocessed form. All subjects completed pre- and posttest sessions in which they transcribed vocoded sentences to provide an assessment of training efficacy. Transfer of perceptual learning was assessed using a series of closed-set, nonlinguistic tasks: subjects identified talker gender, discriminated the identity of pairs of talkers, and identified ecologically significant environmental sounds from a closed set of alternatives. Results Although both groups of subjects showed significant pre- to posttest improvements, subjects who transcribed vocoded sentences during training performed significantly better at posttest than subjects in the control group. Both groups performed equally well on gender identification and talker discrimination. Subjects who received explicit training on the vocoded sentences, however, performed significantly better on environmental sound identification than the untrained subjects. Moreover, across both groups, pretest speech performance, and to a higher degree posttest speech performance, were significantly correlated with environmental sound identification. For both groups, environmental sounds that were characterized as having more salient temporal information were identified more often than environmental sounds that were characterized as having more salient spectral information. Conclusions Listeners trained to identify noise-vocoded sentences showed evidence of transfer of perceptual learning to the identification of environmental sounds. In addition, the correlation between environmental sound identification and sentence transcription indicates that subjects who were better able to utilize the degraded acoustic information to identify the environmental sounds were also better able to transcribe the linguistic content of novel sentences. Both trained and untrained groups performed equally well (~75% correct) on the gender identification task, indicating that training did not have an effect on the ability to identify the gender of talkers. Although better than chance, performance on the talker discrimination task was poor overall (~55%), suggesting that either explicit training is required to reliably discriminate talkers’ voices, or that additional information (perhaps spectral in nature) not present in the vocoded speech is required to excel in such tasks. Taken together, the results suggest that while transfer of auditory perceptual learning with spectrally degraded speech does occur, explicit task-specific training may be necessary for tasks that cannot rely on temporal information alone. PMID:19773659
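A noise vocoder of the kind used for the training materials divides speech into frequency channels, extracts each channel's temporal envelope, and uses the envelopes to modulate band-limited noise, preserving temporal information while degrading spectral detail. A hedged sketch (channel edges, filter orders, and the 30 Hz envelope cutoff are illustrative, not the study's exact parameters):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def noise_vocode(x, fs, n_channels=8, lo=100.0, hi=7000.0, env_cut=30.0):
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced channel edges
    be, ae = butter(2, env_cut / (fs / 2))         # envelope smoothing filter
    out = np.zeros_like(x, dtype=float)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        b, a = butter(3, [f1 / (fs / 2), f2 / (fs / 2)], btype="band")
        band = filtfilt(b, a, x)
        env = filtfilt(be, ae, np.abs(band))       # rectified, smoothed envelope
        carrier = filtfilt(b, a, np.random.randn(x.size))  # band-limited noise
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)     # normalize to avoid clipping
```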
Theoretical Aspects of Speech Production.
ERIC Educational Resources Information Center
Stevens, Kenneth N.
1992-01-01
This paper on speech production in children and youth with hearing impairments summarizes theoretical aspects, including the speech production process, sound sources in the vocal tract, vowel production, and consonant production. Examples of spectra for several classes of vowel and consonant sounds in simple syllables are given. (DB)
Getting the cocktail party started: masking effects in speech perception
Evans, S; McGettigan, C; Agnew, ZK; Rosen, S; Scott, SK
2016-01-01
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous functional magnetic resonance imaging (fMRI) whilst they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioural task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream, and that individuals who perform better on speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; this activity was found within right-lateralised frontal regions consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise. PMID:26696297
Development of a Low-Cost, Noninvasive, Portable Visual Speech Recognition Program.
Kohlberg, Gavriel D; Gal, Ya'akov Kobi; Lalwani, Anil K
2016-09-01
Loss of speech following tracheostomy and laryngectomy severely limits communication to simple gestures and facial expressions that are largely ineffective. To facilitate communication in these patients, we seek to develop a low-cost, noninvasive, portable, and simple visual speech recognition program (VSRP) to convert articulatory facial movements into speech. A Microsoft Kinect-based VSRP was developed to capture spatial coordinates of lip movements and translate them into speech. The articulatory speech movements associated with 12 sentences were used to train an artificial neural network classifier. The accuracy of the classifier was then evaluated on a separate, previously unseen set of articulatory speech movements. The VSRP was successfully implemented and tested in 5 subjects. It achieved an accuracy rate of 77.2% (65.0%-87.6% for the 5 speakers) on a 12-sentence data set. The mean time to classify an individual sentence was 2.03 milliseconds (1.91-2.16). We have demonstrated the feasibility of a low-cost, noninvasive, portable VSRP based on Kinect to accurately predict speech from articulation movements in clinically trivial time. This VSRP could be used as a novel communication device for aphonic patients. © The Author(s) 2016.
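The classification stage described above can be sketched as a small feed-forward network over flattened lip-trajectory features. Everything below is an assumption for illustration (the feature layout, network size, and the 5-tokens-per-sentence training set), not the authors' implementation:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_frames, n_landmarks = 30, 8           # hypothetical Kinect lip landmarks
X = rng.normal(size=(60, n_frames * n_landmarks * 3))  # x, y, z per landmark
y = np.repeat(np.arange(12), 5)         # 12 sentences x 5 training tokens

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000).fit(X, y)
print(clf.predict(X[:1]))               # predicted sentence label
```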
Speech deficits in serious mental illness: a cognitive resource issue?
Cohen, Alex S; McGovern, Jessica E; Dinzeo, Thomas J; Covington, Michael A
2014-12-01
Speech deficits, notably those involved in psychomotor retardation, blunted affect, alogia and poverty of content of speech, are pronounced in a wide range of serious mental illnesses (e.g., schizophrenia, unipolar depression, bipolar disorders). The present project evaluated the degree to which these deficits manifest as a function of cognitive resource limitations. We examined natural speech from 52 patients meeting criteria for serious mental illnesses (i.e., severe functional deficits with a concomitant diagnosis of schizophrenia, unipolar and/or bipolar affective disorders) and 30 non-psychiatric controls using a range of objective, computer-based measures tapping speech production ("alogia"), variability ("blunted vocal affect") and content ("poverty of content of speech"). Subjects produced natural speech during a baseline condition and while engaging in an experimentally-manipulated cognitively-effortful task. For correlational analysis, cognitive ability was measured using a standardized battery. Generally speaking, speech deficits did not differ as a function of SMI diagnosis. However, every speech production and content measure was significantly abnormal in SMI versus control groups. Speech variability measures generally did not differ between groups. For both patients and controls as a group, speech during the cognitively-effortful task was sparser and less rich in content. Relative to controls, patients were abnormal under cognitive load with respect only to average pause length. Correlations between the speech variables and cognitive ability were only significant for this same variable: average pause length. Results suggest that certain speech deficits, notably involving pause length, may manifest as a function of cognitive resource limitations. Implications for treatment, research and assessment are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.
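Of the objective measures above, average pause length is straightforward to approximate from a recording by measuring the silent gaps between detected speech intervals. A hedged sketch (the -30 dB silence threshold and the energy-based segmentation are illustrative choices, not the study's computer-based measures):

```python
import librosa

def average_pause_length(path, top_db=30):
    y, sr = librosa.load(path, sr=None)
    intervals = librosa.effects.split(y, top_db=top_db)  # non-silent spans
    gaps = (intervals[1:, 0] - intervals[:-1, 1]) / sr   # silences, in seconds
    return float(gaps.mean()) if gaps.size else 0.0
```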
Sound frequency affects speech emotion perception: results from congenital amusia
Lolli, Sydney L.; Lewenstein, Ari D.; Basurto, Julian; Winnik, Sean; Loui, Psyche
2015-01-01
Congenital amusics, or “tone-deaf” individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech. PMID:26441718
Smith, Sherri L.; Pichora-Fuller, M. Kathleen
2015-01-01
Listeners with hearing loss commonly report having difficulty understanding speech, particularly in noisy environments. Their difficulties could be due to auditory and cognitive processing problems. Performance on speech-in-noise tests has been correlated with reading working memory span (RWMS), a measure often chosen to avoid the effects of hearing loss. If the goal is to assess the cognitive consequences of listeners' auditory processing abilities, however, then listening working memory span (LWMS) could be a more informative measure. Some studies have examined the effects of different degrees and types of masking on working memory, but less is known about the demands placed on working memory depending on the linguistic complexity of the target speech or the task used to measure speech understanding in listeners with hearing loss. Compared to RWMS, LWMS measures using different speech targets and maskers may provide a more ecologically valid approach. To examine the contributions of RWMS and LWMS to speech understanding, we administered two working memory measures (a traditional RWMS measure and a new LWMS measure) and a battery of tests varying in the linguistic complexity of the speech materials, the presence of babble masking, and the task. Participants were a group of younger listeners with normal hearing and two groups of older listeners with hearing loss (n = 24 per group). There was a significant group difference and a wider range in performance on the LWMS than on the RWMS measure. There was a significant correlation between the two working memory measures only for the oldest listeners with hearing loss. Notably, there were only a few significant correlations among the working memory and speech understanding measures. These findings suggest that working memory measures reflect individual differences that are distinct from those tapped by these measures of speech understanding. PMID:26441769
Koenig, Laura L.; Lucero, Jorge C.; Perlman, Elizabeth
2008-01-01
This study investigates token-to-token variability in fricative production of 5-year-olds, 10-year-olds, and adults. Previous studies have reported higher intrasubject variability in children than adults, in speech as well as nonspeech tasks, but authors have disagreed on the causes and implications of this finding. The current work assessed the characteristics of age-related variability across articulators (larynx and tongue) as well as in temporal versus spatial domains. Oral airflow signals, which reflect changes in both laryngeal and supralaryngeal apertures, were obtained for multiple productions of /h s z/. The data were processed using functional data analysis, which provides a means of obtaining relatively independent indices of amplitude and temporal (phasing) variability. Consistent with past work, both temporal and amplitude variabilities were higher in children than adults, but the temporal indices were generally less adultlike than the amplitude indices for both groups of children. Quantitative and qualitative analyses showed considerable speaker- and consonant-specific patterns of variability. The data indicate that variability in /s/ may represent laryngeal as well as supralaryngeal control and, further, that a simple random-noise factor, higher in children than in adults, is insufficient to explain developmental differences in speech production variability. PMID:19045800
Natural user interface as a supplement of the holographic Raman tweezers
NASA Astrophysics Data System (ADS)
Tomori, Zoltan; Kanka, Jan; Kesa, Peter; Jakl, Petr; Sery, Mojmir; Bernatova, Silvie; Antalik, Marian; Zemánek, Pavel
2014-09-01
Holographic Raman tweezers (HRT) manipulate microobjects by controlling the positions of multiple optical traps via the mouse or joystick. Several attempts have appeared recently to exploit touch tablets, 2D cameras, or the Kinect game console instead. We proposed a multimodal "Natural User Interface" (NUI) approach integrating hand tracking, gesture recognition, eye tracking, and speech recognition. For this purpose we exploited the low-cost "Leap Motion" and "MyGaze" sensors and a simple speech recognition program, "Tazti". We developed our own NUI software which processes signals from the sensors and sends control commands to the HRT, which subsequently controls the positions of the trapping beams, the micropositioning stage, and the Raman spectra acquisition system. The system allows various modes of operation suited to specific tasks. Virtual tools (called "pin" and "tweezers") serving for the manipulation of particles are displayed in a transparent "overlay" window above the live camera image. The eye tracker identifies the position of the observed particle and uses it for autofocus. Laser trap manipulation navigated by the dominant hand can be combined with gesture recognition of the secondary hand. Speech command recognition is useful when both hands are busy. The proposed methods make manual control of HRT more efficient, and they also provide a good platform for future semi-automated and fully automated operation.
NASA Technical Reports Server (NTRS)
Tsang, Pamela S.; Hart, Sandra G.; Vidulich, Michael A.
1987-01-01
The utility of speech technology was evaluated in terms of three dual-task principles: resource competition between the time-shared tasks, stimulus-central-processing-response compatibility, and task integrality. Empirical support for these principles was reviewed. Two studies investigating the interactive effects of the three principles were described. Objective performance and subjective workload ratings for both single and dual tasks were examined. It was found that the single-task measures were not necessarily good predictors of the dual-task measures. It was shown that all three principles played an important role in determining an optimal task configuration. This was reflected in both the performance measures and the subjective measures. Therefore, consideration of all three principles is required to ensure proper use of speech technology in a complex environment.
ERIC Educational Resources Information Center
Munson, Benjamin; Johnson, Julie M.; Edwards, Jan
2012-01-01
Purpose: This study examined whether experienced speech-language pathologists (SLPs) differ from inexperienced people in their perception of phonetic detail in children's speech. Method: Twenty-one experienced SLPs and 21 inexperienced listeners participated in a series of tasks in which they used a visual-analog scale (VAS) to rate children's…
Age and Function Differences in Shared Task Performance: Walking and Talking
ERIC Educational Resources Information Center
Williams, Kathleen; Hinton, Virginia A.; Bories, Tamara; Kovacs, Christopher R.
2006-01-01
Less is known about the effects of normal aging on speech output than other motor actions, because studies of communication integrity have focused on voice production and linguistic parameters rather than speech production characteristics. Studies investigating speech production in older adults have reported increased syllable duration (Slawinski,…
Influence of Visual Information on the Intelligibility of Dysarthric Speech
ERIC Educational Resources Information Center
Keintz, Connie K.; Bunton, Kate; Hoit, Jeannette D.
2007-01-01
Purpose: To examine the influence of visual information on speech intelligibility for a group of speakers with dysarthria associated with Parkinson's disease. Method: Eight speakers with Parkinson's disease and dysarthria were recorded while they read sentences. Speakers performed a concurrent manual task to facilitate typical speech production.…
Music and speech prosody: a common rhythm
Hausen, Maija; Torppa, Ritva; Salmela, Viljami R.; Vainio, Martti; Särkämö, Teppo
2013-01-01
Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress). PMID:24032022
Cullington, Helen E; Zeng, Fan-Gang
2011-02-01
Despite excellent performance in speech recognition in quiet, most cochlear implant users have great difficulty with speech recognition in noise, music perception, identifying tone of voice, and discriminating different talkers. This may be partly due to the pitch coding in cochlear implant speech processing. Most current speech processing strategies use only the envelope information; the temporal fine structure is discarded. One way to improve electric pitch perception is to use residual acoustic hearing via a hearing aid on the nonimplanted ear (bimodal hearing). This study aimed to test the hypothesis that bimodal users would perform better than bilateral cochlear implant users on tasks requiring good pitch perception. Four pitch-related tasks were used. 1. Hearing in Noise Test (HINT) sentences spoken by a male talker with a competing female, male, or child talker. 2. Montreal Battery of Evaluation of Amusia. This is a music test with six subtests examining pitch, rhythm and timing perception, and musical memory. 3. Aprosodia Battery. This has five subtests evaluating aspects of affective prosody and recognition of sarcasm. 4. Talker identification using vowels spoken by 10 different talkers (three men, three women, two boys, and two girls). Bilateral cochlear implant users were chosen as the comparison group. Thirteen bimodal and 13 bilateral adult cochlear implant users were recruited; all had good speech perception in quiet. There were no significant differences between the mean scores of the bimodal and bilateral groups on any of the tests, although the bimodal group did perform better than the bilateral group on almost all tests. Performance on the different pitch-related tasks was not correlated, meaning that if a subject performed one task well they would not necessarily perform well on another. The correlation between the bimodal users' hearing threshold levels in the aided ear and their performance on these tasks was weak. Although the bimodal cochlear implant group performed better than the bilateral group on most parts of the four pitch-related tests, the differences were not statistically significant. The lack of correlation between test results shows that the tasks used are not simply providing a measure of pitch ability. Even if the bimodal users have better pitch perception, the real-world tasks used are reflecting more diverse skills than pitch. This research adds to the existing speech perception, language, and localization studies that show no significant difference between bimodal and bilateral cochlear implant users.
Speech entrainment compensates for Broca's area damage.
Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris
2015-08-01
Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to SE. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during SE versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed that damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of SE to improve speech production and may help select patients for SE treatment. Copyright © 2015 Elsevier Ltd. All rights reserved.
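The core of a VLSM analysis like the one described can be sketched as a per-voxel comparison of a behavioral score between patients whose lesions include versus spare that voxel. The sketch below is a simplified stand-in: lesion maps and fluency scores are synthetic, and the multiple-comparison correction a published analysis would require is only noted in a comment.

```python
# Hedged sketch of the core VLSM computation: at each voxel, compare the
# behavioural score between patients whose lesion includes that voxel and
# those whose lesion spares it. Data below are synthetic placeholders.
import numpy as np
from scipy import stats

n_patients, n_voxels = 44, 5000
rng = np.random.default_rng(1)
lesions = rng.random((n_patients, n_voxels)) < 0.2   # binary lesion maps
fluency_gain = rng.standard_normal(n_patients)       # SE minus spontaneous wpm

t_map = np.full(n_voxels, np.nan)
for v in range(n_voxels):
    hit, spared = fluency_gain[lesions[:, v]], fluency_gain[~lesions[:, v]]
    if hit.size >= 5 and spared.size >= 5:           # minimum-lesion criterion
        t_map[v] = stats.ttest_ind(hit, spared, equal_var=False).statistic
# A real analysis would threshold the t-map with a correction for multiple
# comparisons (e.g., permutation testing) before interpreting lesion sites.
```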
Reality Monitoring and Feedback Control of Speech Production Are Related Through Self-Agency.
Subramaniam, Karuna; Kothare, Hardik; Mizuiri, Danielle; Nagarajan, Srikantan S; Houde, John F
2018-01-01
Self-agency is the experience of being the agent of one's own thoughts and motor actions. The intact experience of self-agency is necessary for successful interactions with the outside world (i.e., reality monitoring) and for responding to sensory feedback of our motor actions (e.g., speech feedback control). Reality monitoring is the ability to distinguish internally self-generated information from outside reality (externally-derived information). In the present study, we examined the relationship of self-agency between lower-level speech feedback monitoring (i.e., monitoring what we hear ourselves say) and a higher-level cognitive reality monitoring task. In particular, we examined whether speech feedback monitoring and reality monitoring were driven by the capacity to experience self-agency: the ability to make reliable predictions about the outcomes of self-generated actions. During the reality monitoring task, subjects made judgments as to whether information was previously self-generated (self-agency judgments) or externally derived (external-agency judgments). During speech feedback monitoring, we assessed self-agency by altering environmental auditory feedback so that subjects listened to a perturbed version of their own speech. When subjects heard minimal perturbations in their auditory feedback while speaking, they made corrective responses, indicating that they judged the perturbations as errors in their speech output. We found that self-agency judgments in the reality-monitoring task were higher in people who had smaller corrective responses (p = 0.05) and smaller inter-trial variability (p = 0.03) during minimal pitch perturbations of their auditory feedback. These results provide support for a unitary process for the experience of self-agency governing low-level speech control and higher level reality monitoring.
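A minimal sketch of how such corrective responses might be quantified, assuming F0 traces time-locked to the perturbation; the analysis window and the use of cents are illustrative choices, not the authors' exact parameters.

```python
# Minimal sketch (assumed data layout, not the authors' pipeline): quantify the
# corrective response to a pitch perturbation as the mean post-perturbation
# deviation of produced F0 from its pre-perturbation baseline, in cents.
import numpy as np

def corrective_response(f0_trials, fs, pert_onset_s, window_s=(0.2, 0.4)):
    """f0_trials: (n_trials, n_samples) array of produced F0 in Hz."""
    onset = int(pert_onset_s * fs)
    w0, w1 = (int((pert_onset_s + s) * fs) for s in window_s)
    baseline = f0_trials[:, :onset].mean(axis=1, keepdims=True)
    cents = 1200.0 * np.log2(f0_trials / baseline)   # deviation from baseline
    per_trial = cents[:, w0:w1].mean(axis=1)         # response per trial
    return per_trial.mean(), per_trial.std()         # magnitude, variability

# Toy usage: 20 trials of a roughly flat 200 Hz voice at a 100 Hz frame rate.
rng = np.random.default_rng(2)
f0 = 200.0 + rng.standard_normal((20, 200))
print(corrective_response(f0, fs=100, pert_onset_s=1.0))
```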
ERIC Educational Resources Information Center
Hantsch, Ansgar; Jescheniak, Jorg D.; Schriefers, Herbert
2009-01-01
A number of recent studies have questioned the idea that lexical selection during speech production is a competitive process. One type of evidence against selection by competition is the observation that in the picture-word interference task semantically related distractors may facilitate the naming of a picture, whereas the selection by…
ERIC Educational Resources Information Center
Vandana, V. P.
2007-01-01
There are very few acoustic studies reflecting on the localization of speech function within the different loci of the cerebellum. Task-based performance profiles of subjects with lesions in different cerebellar loci have not been reported. Also, findings on nonfocal cerebellar lesions cannot be generalized to lesions restricted to the cerebellum.…
Cortical oscillations and entrainment in speech processing during working memory load.
Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten
2018-02-02
Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity (< 13 Hz). We found that increases in both types of WM load (background noise and n-back level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
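A minimal sketch of the envelope-entrainment measure described above: low-pass (<13 Hz) multichannel activity is mapped linearly onto the speech envelope, and the reconstruction correlation serves as the entrainment index. The ridge-regression decoder and all signals below are illustrative; the study's actual analysis may differ (e.g., time-lagged features and cross-validation, as in mTRF-style decoders).

```python
# Hedged sketch of envelope "entrainment" as a linear stimulus-reconstruction
# mapping on synthetic data.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 64                                   # Hz, downsampled rate
rng = np.random.default_rng(3)
speech = rng.standard_normal(fs * 60)     # stand-in for a speech waveform
envelope = np.abs(hilbert(speech))        # broadband envelope
b, a = butter(4, 13 / (fs / 2), btype="low")
envelope = filtfilt(b, a, envelope)       # keep fluctuations below 13 Hz

eeg = 0.3 * envelope + rng.standard_normal((32, envelope.size))  # 32 channels
eeg = filtfilt(b, a, eeg, axis=1)

# Ridge regression decoder: envelope ~ EEG channels.
X = eeg.T
lam = 1e2
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
reconstruction = X @ w
entrainment = np.corrcoef(reconstruction, envelope)[0, 1]
print(f"envelope reconstruction accuracy r = {entrainment:.2f}")
```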
Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C
1999-01-01
The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure, and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks was devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.
Speech processing in children with functional articulation disorders.
Gósy, Mária; Horváth, Viktória
2015-03-01
This study explored auditory speech processing and comprehension abilities in 5-8-year-old monolingual Hungarian children with functional articulation disorders (FADs) and their typically developing peers. Our main hypothesis was that children with FAD would show co-existing auditory speech processing disorders, with different levels of these skills depending on the nature of the receptive processes. The tasks included (i) sentence and non-word repetitions, (ii) non-word discrimination and (iii) sentence and story comprehension. Results suggest that the auditory speech processing of children with FAD is underdeveloped compared with that of typically developing children, and largely varies across task types. In addition, there are differences between children with FAD and controls in all age groups from 5 to 8 years. Our results have several clinical implications.
Psychoacoustical Measures in Individuals with Congenital Visual Impairment.
Kumar, Kaushlendra; Thomas, Teenu; Bhat, Jayashree S; Ranjan, Rajesh
2017-12-01
In individuals with congenital visual impairment, one sensory modality (vision) is impaired, and this impairment is compensated for by the other sensory modalities. There is evidence that visually impaired individuals perform better than normally sighted individuals on various auditory tasks such as localization, auditory memory, verbal memory, auditory attention, and other behavioural tasks. The current study aimed to compare temporal resolution, frequency resolution, and speech perception in noise between individuals with congenital visual impairment and normally sighted individuals. Temporal resolution, frequency resolution, and speech perception in noise were measured using MDT, GDT, DDT, SRDT, and SNR50, respectively. Twelve congenitally visually impaired participants aged 18 to 40 years were recruited, along with an equal number of normally sighted participants. All participants had normal hearing sensitivity and normal middle ear functioning. Individuals with visual impairment showed superior thresholds on MDT, SRDT, and SNR50 compared to normally sighted individuals. This may be due to the complexity of the tasks: MDT, SRDT, and SNR50 are more complex tasks than GDT and DDT. Individuals with visual impairment thus showed superior performance in auditory processing and speech perception on complex auditory perceptual tasks.
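Thresholds such as GDT or SNR50 are commonly estimated with adaptive transformed up-down procedures. The sketch below shows a generic 2-down/1-up staircase (converging near 70.7% correct) run against a simulated listener; it is not the specific procedure used in this study.

```python
# Hedged sketch of a transformed up-down staircase for threshold estimation.
import numpy as np

def staircase(run_trial, start=10.0, step=2.0, n_reversals=8):
    """2-down/1-up: two correct in a row -> harder; any error -> easier."""
    level, correct_streak, direction, reversals = start, 0, -1, []
    while len(reversals) < n_reversals:
        if run_trial(level):                  # correct response
            correct_streak += 1
            if correct_streak == 2:           # 2-down rule
                if direction == +1:
                    reversals.append(level)   # turn-around point
                level, direction, correct_streak = level - step, -1, 0
        else:                                 # 1-up rule
            if direction == -1:
                reversals.append(level)
            level, direction, correct_streak = level + step, +1, 0
    return np.mean(reversals[2:])             # mean of the later reversals

# Simulated listener: probability correct rises with stimulus level.
rng = np.random.default_rng(4)
simulated = lambda level: rng.random() < 1 / (1 + np.exp(-(level - 4.0)))
print(f"estimated threshold ~ {staircase(simulated):.1f}")
```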
Hearing history influences voice gender perceptual performance in cochlear implant users.
Kovačić, Damir; Balaban, Evan
2010-12-01
The study was carried out to assess the role that five hearing history variables (chronological age, age at onset of deafness, age of first cochlear implant [CI] activation, duration of CI use, and duration of known deafness) play in the ability of CI users to identify speaker gender. Forty-one juvenile CI users participated in two voice gender identification tasks. In a fixed, single-interval task, subjects listened to a single speech item from one of 20 adult male or 20 adult female speakers and had to identify speaker gender. In an adaptive speech-based voice gender discrimination task with the fundamental frequency difference between the voices as the adaptive parameter, subjects listened to a pair of speech items presented in sequential order, one of which was always spoken by an adult female and the other by an adult male. Subjects had to identify the speech item spoken by the female voice. Correlation and regression analyses between perceptual scores in the two tasks and the hearing history variables were performed. Subjects fell into three performance groups: (1) those who could distinguish voice gender in both tasks, (2) those who could distinguish voice gender in the adaptive but not the fixed task, and (3) those who could not distinguish voice gender in either task. Gender identification performance for single voices in the fixed task was significantly and negatively related to the duration of deafness before cochlear implantation (shorter deafness yielded better performance), whereas performance in the adaptive task was weakly but significantly related to age at first activation of the CI device, with earlier activations yielding better scores. The existence of a group of subjects able to perform adaptive discrimination but unable to identify the gender of singly presented voices demonstrates the potential dissociability of the skills required for these two tasks, suggesting that duration of deafness and age of cochlear implantation could have dissociable effects on the development of different skills required by CI users to identify speaker gender.
Speech Auditory Alerts Promote Memory for Alerted Events in a Video-Simulated Self-Driving Car Ride.
Nees, Michael A; Helbein, Benji; Porter, Anna
2016-05-01
Auditory displays could be essential to helping drivers maintain situation awareness in autonomous vehicles, but to date, few or no studies have examined the effectiveness of different types of auditory displays for this application scenario. Recent advances in the development of autonomous vehicles (i.e., self-driving cars) have suggested that widespread automation of driving may be tenable in the near future. Drivers may be required to monitor the status of automation programs and vehicle conditions as they engage in secondary leisure or work tasks (entertainment, communication, etc.) in autonomous vehicles. An experiment compared memory for alerted events (a component of Level 1 situation awareness) using speech alerts, auditory icons, and a visual control condition during a video-simulated self-driving car ride with a visual secondary task. The alerts gave information about the vehicle's operating status and the driving scenario. Speech alerts resulted in better memory for alerted events. Both auditory display types resulted in less perceived effort devoted toward the study tasks but also greater perceived annoyance with the alerts. Speech auditory displays promoted Level 1 situation awareness during a simulation of a ride in a self-driving vehicle under routine conditions, but annoyance remains a concern with auditory displays. Speech auditory displays showed promise as a means of increasing Level 1 situation awareness of routine scenarios during an autonomous vehicle ride with an unrelated secondary task. © 2016, Human Factors and Ergonomics Society.
NASA Astrophysics Data System (ADS)
Izdebski, Krzysztof; Jarosz, Paweł; Usydus, Ireneusz
2017-02-01
Ventilation, speech, and singing all engage the facial musculature, and these motor tasks are fueled by the air we inhale. This motor process requires an increase in blood flow as the muscles contract and relax; skin surface temperature changes are therefore expected. Hence, we used thermography to image these effects. The system used was the FLIR X6580sc thermography camera with a chilled detector (FLIR Systems Advanced Thermal Solutions, 27700 SW Parkway Ave, Wilsonville, OR 97070, USA). To improve imaging, the room was air-conditioned to +18 °C. All images were recorded at 30 f/s. Acquired data were analyzed with FLIR Research IR Max Version 4 software and software filters. In this preliminary study, a male subject was imaged from frontal and lateral views simultaneously while he performed normal resting ventilation, speech, and song. The lateral image was captured in a stainless steel mirror. Results showed different levels of heat flow in the facial musculature as a function of the three tasks. We were also able to capture the directionality of the exhaled air jet. The breathing jet was discharged horizontally, the speaking-voice jet was discharged downwards, and the singing jet went upward. We interpret these jet directions as reflecting different gas content of the air expired during these different tasks, with speech having less oxygen than singing. Further studies examining gas exchange during various forms of speech and song and during different emotional states are warranted.
Zekveld, Adriana A.; Kramer, Sophia E.; Kessens, Judith M.; Vlaming, Marcel S. M. G.; Houtgast, Tammo
2009-01-01
This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone either with or without captions that were generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word accuracy (WA) level of the captions was 60% or 70% and the text was presented delayed to the speech. After each test, the hearing impaired participants (n = 20) completed the NASA-Task Load Index and several rating scales evaluating the support from the captions. Participants indicated that using the erroneous text in speech comprehension was difficult and the reported task load did not differ between the audio + text and audio-only conditions. In a follow-up experiment (n = 10), the perceived benefit of presenting captions increased with an increase of WA levels to 80% and 90%, and elimination of the text delay. However, in general, the task load did not decrease when captions were presented. These results suggest that the extra effort required to process the text could have been compensated for by less effort required to comprehend the speech. Future research should aim at reducing the complexity of the task to increase the willingness of hearing impaired persons to use an assistive communication system automatically providing captions. The current results underline the need for obtaining both objective and subjective measures of benefit when evaluating assistive communication systems. PMID:19126551
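Word accuracy (WA) of captions like those described above is conventionally derived from the word-level edit distance between the ASR output and a reference transcript (WA = 100% - word error rate). A minimal, self-contained sketch:

```python
# Minimal sketch of caption word-accuracy scoring via word-level Levenshtein
# distance. The example strings are illustrative.
def word_errors(reference: str, hypothesis: str) -> int:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1]

ref = "the quick brown fox jumps over the lazy dog"
hyp = "the quick brown box jumps over lazy dog"
wa = 100.0 * (1 - word_errors(ref, hyp) / len(ref.split()))
print(f"word accuracy = {wa:.0f}%")   # 1 substitution + 1 deletion -> ~78%
```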
Speech Deficits in Serious mental Illness: A Cognitive Resource Issue?
Cohen, Alex S.; McGovern, Jessica E.; Dinzeo, Thomas J.; Covington, Michael A.
2014-01-01
Speech deficits, notably those involved in psychomotor retardation, blunted affect, alogia and poverty of content of speech, are pronounced in a wide range of serious mental illnesses (e.g., schizophrenia, unipolar depression, bipolar disorders). The present project evaluated the degree to which these deficits manifest as a function of cognitive resource limitations. We examined natural speech from 52 patients meeting criteria for serious mental illnesses (i.e., severe functional deficits with a concomitant diagnosis of schizophrenia, unipolar and/or bipolar affective disorders) and 30 non-psychiatric controls using a range of objective, computer-based measures tapping speech production (“alogia”), variability (“blunted vocal affect”) and content (“poverty of content of speech”). Subjects produced natural speech during a baseline condition and while engaging in an experimentally-manipulated cognitively-effortful task. For correlational analysis, cognitive ability was measured using a standardized battery. Generally speaking, speech deficits did not differ as a function of SMI diagnosis. However, every speech production and content measure was significantly abnormal in SMI versus control groups. Speech variability measures generally did not differ between groups. For both patients and controls as a group, speech during the cognitively-effortful task was sparser and less rich in content. Relative to controls, patients were abnormal under cognitive load with respect only to average pause length. Correlations between the speech variables and cognitive ability were only significant for this same variable: average pause length. Results suggest that certain speech deficits, notably involving pause length, may manifest as a function of cognitive resource limitations. Implications for treatment, research and assessment are discussed. PMID:25464920
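One of the objective measures named above, average pause length, can be approximated by thresholding an intensity envelope and averaging the durations of silent runs. The sketch below uses illustrative frame sizes and thresholds and synthetic audio; it is not the study's computer-based measure.

```python
# Hedged sketch: average pause length from a framed RMS-intensity envelope.
import numpy as np

def average_pause_length(x, fs, frame_ms=20, silence_db=-35.0, min_pause_ms=150):
    frame = int(fs * frame_ms / 1000)
    n_frames = len(x) // frame
    frames = x[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1)) + 1e-12
    db = 20 * np.log10(rms / rms.max())       # dB relative to loudest frame
    silent = db < silence_db
    # Collect runs of consecutive silent frames; a sentinel closes a final run.
    pauses, run = [], 0
    for s in np.append(silent, False):
        if s:
            run += 1
        elif run:
            pauses.append(run)
            run = 0
    durations = [r * frame_ms for r in pauses if r * frame_ms >= min_pause_ms]
    return float(np.mean(durations)) if durations else 0.0

rng = np.random.default_rng(11)
fs = 16000
speech = rng.standard_normal(fs * 2)          # 2 s of toy "speech"
speech[fs // 2 : fs] *= 0.001                 # insert a 500 ms pause
print(f"average pause length = {average_pause_length(speech, fs):.0f} ms")
```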
Borghini, Giulia; Hazan, Valerie
2018-01-01
Current evidence demonstrates that even though some non-native listeners can achieve native-like performance for speech perception tasks in quiet, the presence of a background noise is much more detrimental to speech intelligibility for non-native compared to native listeners. Even when performance is equated across groups, it is likely that greater listening effort is required for non-native listeners. Importantly, the added listening effort might result in increased fatigue and a reduced ability to successfully perform multiple tasks simultaneously. Task-evoked pupil responses have been demonstrated to be a reliable measure of cognitive effort and can be useful in clarifying those aspects. In this study we compared the pupil response for 23 native English speakers and 27 Italian speakers of English as a second language. Speech intelligibility was tested for sentences presented in quiet and in background noise at two performance levels that were matched across groups. Signal-to-noise levels corresponding to these sentence intelligibility levels were pre-determined using an adaptive intelligibility task. Pupil response was significantly greater in non-native compared to native participants across both intelligibility levels. Therefore, for a given intelligibility level, a greater listening effort is required when listening in a second language in order to understand speech in noise. Results also confirmed that pupil response is sensitive to speech intelligibility during language comprehension, in line with previous research. However, contrary to our predictions, pupil response was not differentially modulated by intelligibility levels for native and non-native listeners. The present study corroborates that pupillometry can be deemed as a valid measure to be used in speech perception investigation, because it is sensitive to differences both across participants, such as listener type, and across conditions, such as variations in the level of speech intelligibility. Importantly, pupillometry offers us the possibility to uncover differences in listening effort even when those do not emerge in the performance level of individuals. PMID:29593489
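Task-evoked pupil responses of the kind reported here are typically summarized by baseline-correcting each trial against a pre-stimulus window and averaging the dilation over the listening interval. The sketch below uses synthetic traces and assumed array shapes, not the study's data or preprocessing.

```python
# Hedged sketch of a baseline-corrected task-evoked pupil response summary.
import numpy as np

rng = np.random.default_rng(5)

def pupil_response(trials, fs, baseline_s=1.0):
    """trials: (n_trials, n_samples) pupil diameter, locked to sentence onset."""
    n_base = int(baseline_s * fs)
    baseline = trials[:, :n_base].mean(axis=1, keepdims=True)
    corrected = trials - baseline              # dilation re: pre-stimulus level
    return corrected[:, n_base:].mean(axis=1)  # mean dilation per trial

def toy_group(peak_mm, n_trials=40, n_samples=300, fs=60):
    """Synthetic trials: flat baseline, then an exponential rise toward peak_mm."""
    t = np.arange(n_samples) / fs
    trace = np.where(t > 1.0, peak_mm * (1 - np.exp(-2 * (t - 1.0))), 0.0)
    return trace + 0.02 * rng.standard_normal((n_trials, n_samples))

for name, peak in [("native", 0.10), ("non-native", 0.18)]:
    resp = pupil_response(toy_group(peak), fs=60)
    print(f"{name}: mean task-evoked dilation = {resp.mean():.3f} mm")
```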
Hoover, Eric C; Souza, Pamela E; Gallun, Frederick J
2017-04-01
Auditory complaints following mild traumatic brain injury (MTBI) are common, but few studies have addressed the role of auditory temporal processing in speech recognition complaints. In this study, deficits understanding speech in a background of speech noise following MTBI were evaluated with the goal of comparing the relative contributions of auditory and nonauditory factors. A matched-groups design was used in which a group of listeners with a history of MTBI were compared to a group matched in age and pure-tone thresholds, as well as a control group of young listeners with normal hearing (YNH). Of the 33 listeners who participated in the study, 13 were included in the MTBI group (mean age = 46.7 yr), 11 in the Matched group (mean age = 49 yr), and 9 in the YNH group (mean age = 20.8 yr). Speech-in-noise deficits were evaluated using subjective measures as well as monaural word (Words-in-Noise test) and sentence (Quick Speech-in-Noise test) tasks, and a binaural spatial release task. Performance on these measures was compared to psychophysical tasks that evaluate monaural and binaural temporal fine-structure tasks and spectral resolution. Cognitive measures of attention, processing speed, and working memory were evaluated as possible causes of differences between MTBI and Matched groups that might contribute to speech-in-noise perception deficits. A high proportion of listeners in the MTBI group reported difficulty understanding speech in noise (84%) compared to the Matched group (9.1%), and listeners who reported difficulty were more likely to have abnormal results on objective measures of speech in noise. No significant group differences were found between the MTBI and Matched listeners on any of the measures reported, but the number of abnormal tests differed across groups. Regression analysis revealed that a combination of auditory and auditory processing factors contributed to monaural speech-in-noise scores, but the benefit of spatial separation was related to a combination of working memory and peripheral auditory factors across all listeners in the study. The results of this study are consistent with previous findings that a subset of listeners with MTBI has objective auditory deficits. Speech-in-noise performance was related to a combination of auditory and nonauditory factors, confirming the important role of audiology in MTBI rehabilitation. Further research is needed to evaluate the prevalence and causal relationship of auditory deficits following MTBI. American Academy of Audiology
Working Memory Training and Speech in Noise Comprehension in Older Adults
Wayne, Rachel V.; Hamilton, Cheryl; Jones Huyck, Julia; Johnsrude, Ingrid S.
2016-01-01
Understanding speech in the presence of background sound can be challenging for older adults. Speech comprehension in noise appears to depend on working memory and executive-control processes (e.g., Heald and Nusbaum, 2014), and their augmentation through training may have rehabilitative potential for age-related hearing loss. We examined the efficacy of adaptive working-memory training (Cogmed; Klingberg et al., 2002) in 24 older adults, assessing generalization to other working-memory tasks (near-transfer) and to other cognitive domains (far-transfer) using a cognitive test battery, including the Reading Span test, sensitive to working memory (e.g., Daneman and Carpenter, 1980). We also assessed far transfer to speech-in-noise performance, including a closed-set sentence task (Kidd et al., 2008). To examine the effect of cognitive training on benefit obtained from semantic context, we also assessed transfer to open-set sentences; half were semantically coherent (high-context) and half were semantically anomalous (low-context). Subjects completed 25 sessions (0.5–1 h each; 5 sessions/week) of both adaptive working memory training and placebo training over 10 weeks in a crossover design. Subjects' scores on the adaptive working-memory training tasks improved as a result of training. However, training did not transfer to other working memory tasks, nor to tasks recruiting other cognitive domains. We did not observe any training-related improvement in speech-in-noise performance. Measures of working memory correlated with the intelligibility of low-context, but not high-context, sentences, suggesting that sentence context may reduce the load on working memory. The Reading Span test significantly correlated only with a test of visual episodic memory, suggesting that the Reading Span test is not a pure-test of working memory, as is commonly assumed. PMID:27047370
Anxiety and speaking in people who stutter: an investigation using the emotional Stroop task.
Hennessey, Neville W; Dourado, Esther; Beilby, Janet M
2014-06-01
People with anxiety disorders show an attentional bias towards threat or negative emotion words. This exploratory study examined whether people who stutter (PWS), who can be anxious when speaking, show similar bias and whether reactions to threat words also influence speech motor planning and execution. Comparisons were made between 31 PWS and 31 fluent controls in a modified emotional Stroop task where, depending on a visual cue, participants named the colour of threat and neutral words at either a normal or fast articulation rate. In a manual version of the same task participants pressed the corresponding colour button with either a long or short duration. PWS but not controls were slower to respond to threat words than neutral words, however, this emotionality effect was only evident for verbal responding. Emotionality did not interact with speech rate, but the size of the emotionality effect among PWS did correlate with frequency of stuttering. Results suggest PWS show an attentional bias to threat words similar to that found in people with anxiety disorder. In addition, this bias appears to be contingent on engaging the speech production system as a response modality. No evidence was found to indicate that emotional reactivity during the Stroop task constrains or destabilises, perhaps via arousal mechanisms, speech motor adjustment or execution for PWS. The reader will be able to: (1) explain the importance of cognitive aspects of anxiety, such as attentional biases, in the possible cause and/or maintenance of anxiety in people who stutter, (2) explain how the emotional Stroop task can be used as a measure of attentional bias to threat information, and (3) evaluate the findings with respect to the relationship between attentional bias to threat information and speech production in people who stutter. Copyright © 2013 Elsevier Inc. All rights reserved.
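The emotionality effect in such studies reduces to a per-participant difference in mean color-naming latency between threat and neutral words, which can then be correlated with stuttering frequency across participants. A minimal sketch with hypothetical column names and synthetic data:

```python
# Hedged sketch of the emotional Stroop "emotionality effect" and its
# correlation with stuttering frequency. All values are synthetic.
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
df = pd.DataFrame({
    "participant": np.repeat(np.arange(20), 40),          # 20 PWS, 40 trials
    "word_type": np.tile(["threat", "neutral"], 400),
    "rt_ms": rng.normal(650, 80, 800),                    # colour-naming RTs
})
stutter_freq = rng.uniform(2, 15, 20)                     # toy %SS per person

means = df.groupby(["participant", "word_type"])["rt_ms"].mean().unstack()
emotionality = means["threat"] - means["neutral"]         # effect per person
r = np.corrcoef(emotionality, stutter_freq)[0, 1]
print(f"emotionality effect vs stuttering frequency: r = {r:.2f}")
```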
Address entry while driving: speech recognition versus a touch-screen keyboard.
Tsimhoni, Omer; Smith, Daniel; Green, Paul
2004-01-01
A driving simulator experiment was conducted to determine the effects of entering addresses into a navigation system during driving. Participants drove on roads of varying visual demand while entering addresses. Three address entry methods were explored: word-based speech recognition, character-based speech recognition, and typing on a touch-screen keyboard. For each method, vehicle control and task measures, glance timing, and subjective ratings were examined. During driving, word-based speech recognition yielded the shortest total task time (15.3 s), followed by character-based speech recognition (41.0 s) and touch-screen keyboard (86.0 s). The standard deviation of lateral position when performing keyboard entry (0.21 m) was 60% higher than that for all other address entry methods (0.13 m). Degradation of vehicle control associated with address entry using a touch screen suggests that the use of speech recognition is favorable. Speech recognition systems with visual feedback, however, even with excellent accuracy, are not without performance consequences. Applications of this research include the design of in-vehicle navigation systems as well as other systems requiring significant driver input, such as E-mail, the Internet, and text messaging.
Buchan, Julie N; Munhall, Kevin G
2011-01-01
Conflicting visual speech information can influence the perception of acoustic speech, causing an illusory percept of a sound not present in the actual acoustic speech (the McGurk effect). We examined whether participants can voluntarily selectively attend to either the auditory or visual modality by instructing participants to pay attention to the information in one modality and to ignore competing information from the other modality. We also examined how performance under these instructions was affected by weakening the influence of the visual information by manipulating the temporal offset between the audio and video channels (experiment 1), and the spatial frequency information present in the video (experiment 2). Gaze behaviour was also monitored to examine whether attentional instructions influenced the gathering of visual information. While task instructions did have an influence on the observed integration of auditory and visual speech information, participants were unable to completely ignore conflicting information, particularly information from the visual stream. Manipulating temporal offset had a more pronounced interaction with task instructions than manipulating the amount of visual information. Participants' gaze behaviour suggests that the attended modality influences the gathering of visual information in audiovisual speech perception.
The neural processing of masked speech
Scott, Sophie K; McGettigan, Carolyn
2014-01-01
Spoken language is rarely heard in silence, and a great deal of interest in psychoacoustics has focused on the ways that the perception of speech is affected by properties of masking noise. In this review we first briefly outline the neuroanatomy of speech perception. We then summarise the neurobiological aspects of the perception of masked speech, and investigate this as a function of masker type, masker level and task. PMID:23685149
Hemispheric speech lateralisation in the developing brain is related to motor praxis ability.
Hodgson, Jessica C; Hirst, Rebecca J; Hudson, John M
2016-12-01
Commonly displayed functional asymmetries such as hand dominance and hemispheric speech lateralisation are well researched in adults. However, there is debate about when such functions become lateralised in the typically developing brain. This study examined whether patterns of speech laterality and hand dominance were related and whether they varied with age in typically developing children. 148 children aged 3-10 years performed an electronic pegboard task to determine hand dominance; a subset of 38 of these children also underwent functional Transcranial Doppler (fTCD) imaging to derive a lateralisation index (LI) for hemispheric activation during speech production using an animation description paradigm. There was no main effect of age in the speech laterality scores; however, younger children showed a greater difference in performance between their hands on the motor task. Furthermore, this between-hand performance difference significantly interacted with direction of speech laterality, with a smaller between-hand difference relating to increased left hemisphere activation. These data show that both handedness and speech lateralisation appear to be established by age 3, but that atypical cerebral lateralisation is linked to greater performance differences in hand skill, irrespective of age. Results are discussed in terms of the common neural systems underpinning handedness and speech lateralisation. Copyright © 2016. Published by Elsevier Ltd.
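An fTCD lateralisation index of the kind used here is typically computed from the percentage change in cerebral blood-flow velocity in each hemisphere relative to a rest baseline, averaged over a period of interest. The sketch below is illustrative only (synthetic signals, assumed epoch timing), not the study's exact LI computation.

```python
# Hedged sketch of a functional Transcranial Doppler lateralisation index.
import numpy as np

def lateralisation_index(left, right, fs, base_s=5.0, poi_s=(8.0, 18.0)):
    n_base = int(base_s * fs)
    dL = 100.0 * (left - left[:n_base].mean()) / left[:n_base].mean()
    dR = 100.0 * (right - right[:n_base].mean()) / right[:n_base].mean()
    i0, i1 = int(poi_s[0] * fs), int(poi_s[1] * fs)
    return (dL - dR)[i0:i1].mean()   # positive LI -> left-hemisphere dominance

rng = np.random.default_rng(7)
t = np.arange(0, 25, 1 / 25)         # one 25 s epoch sampled at 25 Hz
left = 60 + 2.0 * (t > 8) + 0.5 * rng.standard_normal(t.size)
right = 60 + 0.5 * (t > 8) + 0.5 * rng.standard_normal(t.size)
print(f"LI = {lateralisation_index(left, right, fs=25):.2f}")
```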
Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.
Van Engen, Kristin J; McLaughlin, Drew J
2018-05-04
Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g., noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition. Copyright © 2018. Published by Elsevier B.V.
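As a minimal illustration of the eye-tracking side, a visual-world-style summary computes the proportion of trials fixating the target object in each time bin after word onset. Everything below (bin size, trial counts, the fixation data) is synthetic.

```python
# Hedged sketch of a fixation-proportion curve over time, as in
# visual-world-style listening analyses. Data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(10)
n_trials, n_bins = 60, 50                  # 50 bins of 20 ms = 1 s of speech
# fixations[t, b] == True if the target was fixated in bin b of trial t.
p_look = 1 / (1 + np.exp(-(np.arange(n_bins) - 25) / 4))   # rising curve
fixations = rng.random((n_trials, n_bins)) < p_look

prop_target = fixations.mean(axis=0)       # proportion of looks per time bin
latency_bin = np.argmax(prop_target > 0.5) # first bin above 50% looks
print(f"target crosses 50% looks at ~{latency_bin * 20} ms after word onset")
```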
Davies-Venn, Evelyn; Nelson, Peggy; Souza, Pamela
2015-01-01
Some listeners with hearing loss show poor speech recognition scores in spite of using amplification that optimizes audibility. Beyond audibility, studies have suggested that suprathreshold abilities such as spectral and temporal processing may explain differences in amplified speech recognition scores. A variety of different methods has been used to measure spectral processing. However, the relationship between spectral processing and speech recognition is still inconclusive. This study evaluated the relationship between spectral processing and speech recognition in listeners with normal hearing and with hearing loss. Narrowband spectral resolution was assessed using auditory filter bandwidths estimated from simultaneous notched-noise masking. Broadband spectral processing was measured using the spectral ripple discrimination (SRD) task and the spectral ripple depth detection (SMD) task. Three different measures were used to assess unamplified and amplified speech recognition in quiet and noise. Stepwise multiple linear regression revealed that SMD at 2.0 cycles per octave (cpo) significantly predicted speech scores for amplified and unamplified speech in quiet and noise. Commonality analyses revealed that SMD at 2.0 cpo combined with SRD and equivalent rectangular bandwidth measures to explain most of the variance captured by the regression model. Results suggest that SMD and SRD may be promising clinical tools for diagnostic evaluation and predicting amplification outcomes. PMID:26233047
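Stimuli for spectral ripple tasks like SRD and SMD are built by sinusoidally modulating a noise's log-spectrum at a given density (cycles per octave) and depth (dB). The sketch below generates one such stimulus with illustrative parameters; it is not the study's exact stimulus recipe.

```python
# Hedged sketch of a spectrally rippled noise stimulus.
import numpy as np

def rippled_noise(fs=22050, dur=0.5, cpo=2.0, depth_db=20.0, f0=100.0):
    n = int(fs * dur)
    spec = np.fft.rfft(np.random.default_rng(8).standard_normal(n))
    freqs = np.fft.rfftfreq(n, 1 / fs)
    octaves = np.log2(np.maximum(freqs, f0) / f0)       # octaves above f0
    ripple_db = (depth_db / 2) * np.sin(2 * np.pi * cpo * octaves)
    spec *= 10 ** (ripple_db / 20)                      # impose the ripple
    x = np.fft.irfft(spec, n)
    return x / np.abs(x).max()                          # peak-normalize

stim = rippled_noise(cpo=2.0, depth_db=20.0)   # 2 cycles/octave, 20 dB depth
print(stim.shape)
```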
Network dysfunction predicts speech production after left hemisphere stroke
Geranmayeh, Fatemeh; Leech, Robert; Wise, Richard J.S.
2016-01-01
Objective: To investigate the role of multiple distributed brain networks, including the default mode, fronto-temporo-parietal, and cingulo-opercular networks, which mediate domain-general and task-specific processes during speech production after aphasic stroke. Methods: We conducted an observational functional MRI study to investigate the effects of a previous left hemisphere stroke on functional connectivity within and between distributed networks as patients described pictures. Study design included various baseline tasks, and we compared results to those of age-matched healthy participants performing the same tasks. We used independent component and psychophysiological interaction analyses. Results: Although activity within individual networks was not predictive of speech production, relative activity between networks was a predictor of both within-scanner and out-of-scanner language performance, over and above that predicted from lesion volume, age, sex, and years of education. Specifically, robust functional imaging predictors were the differential activity between the default mode network and both the left and right fronto-temporo-parietal networks, respectively activated and deactivated during speech. We also observed altered between-network functional connectivity of these networks in patients during speech production. Conclusions: Speech production is dependent on complex interactions among widely distributed brain networks, indicating that residual speech production after stroke depends on more than the restoration of local domain-specific functions. Our understanding of the recovery of function following focal lesions is not adequately captured by consideration of ipsilesional or contralesional brain regions taking over lost domain-specific functions, but is perhaps best considered as the interaction between what remains of domain-specific networks and domain-general systems that regulate behavior. PMID:26962070
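The "over and above" claim in such studies corresponds to a hierarchical regression: the variance in language performance explained by the network measure after lesion volume and demographics are already in the model. A schematic sketch with synthetic stand-in variables (sex is omitted for brevity):

```python
# Hedged sketch of hierarchical regression: added variance explained by a
# between-network activity measure beyond lesion volume and demographics.
import numpy as np

rng = np.random.default_rng(9)
n = 53
lesion_vol = rng.standard_normal(n)
age, educ = rng.standard_normal(n), rng.standard_normal(n)
net_diff = rng.standard_normal(n)                 # e.g., DMN minus FTP activity
speech = 0.5 * net_diff - 0.3 * lesion_vol + rng.standard_normal(n)

def r_squared(X, y):
    X = np.column_stack([np.ones(len(y)), X])     # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

base = r_squared(np.column_stack([lesion_vol, age, educ]), speech)
full = r_squared(np.column_stack([lesion_vol, age, educ, net_diff]), speech)
print(f"added variance explained by network measure: {full - base:.2f}")
```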
Internal and external attention in speech anxiety.
Deiters, Désirée D; Stevens, Stephan; Hermann, Christiane; Gerlach, Alexander L
2013-06-01
Cognitive models of social phobia propose that socially anxious individuals engage in heightened self-focused attention. Evidence for this assumption was provided by dot probe and feedback tasks measuring attention and reactions to internal cues. However, it is unclear whether similar patterns of attentional processing can be revealed while participants actually engage in a social situation. The current study used a novel paradigm, simultaneously measuring attention to internal and external stimuli in anticipation of and during a speech task. Participants with speech anxiety and non-anxious controls were asked to press a button in response to external or internal probes, while giving a speech on a controversial topic in front of an audience. The external probe consisted of a LED attached to the head of one spectator and the internal probe was a light vibration, which ostensibly signaled changes in participants' pulse or skin conductance. The results indicate that during speech anticipation, high speech anxious participants responded significantly faster to internal probes than low speech anxious participants, while during the speech no differences were revealed between internal and external probes. Generalization of our results is restricted to speech anxious individuals. Our results provide support for the pivotal role of self-focused attention in anticipatory social anxiety. Furthermore, they provide a new framework for understanding interaction effects of internal and external attention in anticipation of and during actual social situations. Copyright © 2012 Elsevier Ltd. All rights reserved.
McCaig, Cassandra M; Adams, Scott G; Dykstra, Allyson D; Jog, Mandar
2016-01-01
Previous studies have demonstrated a negative effect of concurrent walking and talking on gait in Parkinson's disease (PD) but there is limited information about the effect of concurrent walking on speech production. The present study examined the effect of sitting, standing, and three concurrent walking tasks (slow, normal, fast) on conversational speech intensity and speech rate in fifteen individuals with hypophonia related to idiopathic Parkinson's disease (PD) and fourteen age-equivalent controls. Interlocuter (talker-to-talker) distance effects and walking speed were also examined. Concurrent walking was found to produce a significant increase in speech intensity, relative to standing and sitting, in both the control and PD groups. Faster walking produced significantly greater speech intensity than slower walking. Concurrent walking had no effect on speech rate. Concurrent walking and talking produced significant reductions in walking speed in both the control and PD groups. In general, the results of the present study indicate that concurrent walking tasks and the speed of concurrent walking can have a significant positive effect on conversational speech intensity. These positive, "energizing" effects need to be given consideration in future attempts to develop a comprehensive model of speech intensity regulation and they may have important implications for the development of new evaluation and treatment procedures for individuals with hypophonia related to PD. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Data-Driven Subclassification of Speech Sound Disorders in Preschool Children
Vick, Jennell C.; Campbell, Thomas F.; Shriberg, Lawrence D.; Green, Jordan R.; Truemper, Klaus; Rusiewicz, Heather Leavy; Moore, Christopher A.
2015-01-01
Purpose The purpose of the study was to determine whether distinct subgroups of preschool children with speech sound disorders (SSD) could be identified using a subgroup discovery algorithm (SUBgroup discovery via Alternate Random Processes, or SUBARP). Of specific interest was finding evidence of a subgroup of SSD exhibiting performance consistent with atypical speech motor control. Method Ninety-seven preschool children with SSD completed speech and nonspeech tasks. Fifty-three kinematic, acoustic, and behavioral measures from these tasks were input to SUBARP. Results Two distinct subgroups were identified from the larger sample. The 1st subgroup (76%; population prevalence estimate = 67.8%–84.8%) did not have characteristics that would suggest atypical speech motor control. The 2nd subgroup (10.3%; population prevalence estimate = 4.3%–16.5%) exhibited significantly higher variability in measures of articulatory kinematics and poor ability to imitate iambic lexical stress, suggesting atypical speech motor control. Both subgroups were consistent with classes of SSD in the Speech Disorders Classification System (SDCS; Shriberg et al., 2010a). Conclusion Characteristics of children in the larger subgroup were consistent with the proportionally large SDCS class termed speech delay; characteristics of children in the smaller subgroup were consistent with the SDCS subtype termed motor speech disorder—not otherwise specified. The authors identified candidate measures to identify children in each of these groups. PMID:25076005
Brain Activity Varies with Modulation of Dynamic Pitch Variance in Sentence Melody
ERIC Educational Resources Information Center
Meyer, Martin; Steinhauer, Karsten; Alter, Kai; Friederici, Angela D.; von Cramon, D. Yves
2004-01-01
Fourteen native speakers of German heard normal sentences, sentences lacking dynamic pitch variation (flattened speech), or sentences comprising the intonation contour exclusively (degraded speech). Participants were to listen carefully to the sentences and to perform a rehearsal task. Passive listening to flattened speech compared to normal…
Influence of Sound Immersion and Communicative Interaction on the Lombard Effect
ERIC Educational Resources Information Center
Garnier, Maeva; Henrich, Nathalie; Dubois, Daniele
2010-01-01
Purpose: To examine the influence of sound immersion techniques and speech production tasks on speech adaptation in noise. Method: In Experiment 1, we compared the modification of speakers' perception and speech production in noise when noise is played into headphones (with and without additional self-monitoring feedback) or over loudspeakers. We…
Confusability of Consonant Phonemes in Sound Discrimination Tasks.
ERIC Educational Resources Information Center
Rudegeair, Robert E.
The findings of Marsh and Sherman's 1970 investigation of the speech sound discrimination ability of kindergarten subjects are discussed in this paper. In the study, a comparison was made between performance when speech sounds were presented in isolation and when speech sounds were presented in a word context, using minimal sound contrasts.…
Private Speech Moderates the Effects of Effortful Control on Emotionality
ERIC Educational Resources Information Center
Day, Kimberly L.; Smith, Cynthia L.; Neal, Amy; Dunsmore, Julie C.
2018-01-01
Research Findings: In addition to being a regulatory strategy, children's private speech may enhance or interfere with their effortful control used to regulate emotion. The goal of the current study was to investigate whether children's private speech during a selective attention task moderated the relations of their effortful control to their…
Dynamic Assessment of Phonological Awareness for Children with Speech Sound Disorders
ERIC Educational Resources Information Center
Gillam, Sandra Laing; Ford, Mikenzi Bentley
2012-01-01
The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…
ERIC Educational Resources Information Center
Howard, Sara
2004-01-01
A combination of perceptual and electropalatographic (EPG) analysis is used to investigate speech production in three adolescent speakers with a history of cleft palate. All the subjects still sound markedly atypical. Their speech output is analysed in three conditions: diadochokinetic tasks; single word production; connected speech. Comparison of…
Children Perceive Speech Onsets by Ear and Eye
ERIC Educational Resources Information Center
Jerger, Susan; Damian, Markus F.; Tye-Murrey, Nancy; Abdi, Herve
2017-01-01
Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize…
Precategorical Acoustic Storage and the Perception of Speech
ERIC Educational Resources Information Center
Frankish, Clive
2008-01-01
Theoretical accounts of both speech perception and of short term memory must consider the extent to which perceptual representations of speech sounds might survive in relatively unprocessed form. This paper describes a novel version of the serial recall task that can be used to explore this area of shared interest. In immediate recall of digit…
Scheperle, Rachel A; Abbas, Paul J
2015-01-01
The ability to perceive speech is related to the listener's ability to differentiate among frequencies (i.e., spectral resolution). Cochlear implant (CI) users exhibit variable speech-perception and spectral-resolution abilities, which can be attributed in part to the extent of electrode interactions at the periphery (i.e., spatial selectivity). However, electrophysiological measures of peripheral spatial selectivity have not been found to correlate with speech perception. The purpose of this study was to evaluate auditory processing at the periphery and cortex using both simple and spectrally complex stimuli to better understand the stages of neural processing underlying speech perception. The hypotheses were that (1) by more completely characterizing peripheral excitation patterns than in previous studies, significant correlations with measures of spectral selectivity and speech perception would be observed, (2) adding information about processing at a level central to the auditory nerve would account for additional variability in speech perception, and (3) responses elicited with spectrally complex stimuli would be more strongly correlated with speech perception than responses elicited with spectrally simple stimuli. Eleven adult CI users participated. Three experimental processor programs (MAPs) were created to vary the likelihood of electrode interactions within each participant. For each MAP, a subset of 7 of 22 intracochlear electrodes was activated: adjacent (MAP 1), every other (MAP 2), or every third (MAP 3). Peripheral spatial selectivity was assessed using the electrically evoked compound action potential (ECAP) to obtain channel-interaction functions for all activated electrodes (13 functions total). Central processing was assessed by eliciting the auditory change complex with both spatial (electrode pairs) and spectral (rippled noise) stimulus changes. Speech-perception measures included vowel discrimination and the Bamford-Kowal-Bench Speech-in-Noise test. Spatial and spectral selectivity and speech perception were expected to be poorest with MAP 1 (closest electrode spacing) and best with MAP 3 (widest electrode spacing). Relationships among the electrophysiological and speech-perception measures were evaluated using mixed-model and simple linear regression analyses. All electrophysiological measures were significantly correlated with each other and with speech scores for the mixed-model analysis, which takes into account multiple measures per person (i.e., experimental MAPs). The ECAP measures were the best predictor. In the simple linear regression analysis on MAP 3 data, only the cortical measures were significantly correlated with speech scores; spectral auditory change complex amplitude was the strongest predictor. The results suggest that both peripheral and central electrophysiological measures of spatial and spectral selectivity provide valuable information about speech perception. Clinically, it is often desirable to optimize performance for individual CI users. These results suggest that ECAP measures may be most useful for within-subject applications when multiple measures are performed to make decisions about processor options. They also suggest that if the goal is to compare performance across individuals based on a single measure, then processing central to the auditory nerve (specifically, cortical measures of discriminability) should be considered.
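Because each participant contributed one observation per experimental MAP, the key analysis is a repeated-measures (mixed-model) regression. The following is a minimal sketch of that kind of analysis, not the study's code; the data frame, the ECAP channel-interaction index, and all values are hypothetical stand-ins for the study's measures.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: 4 CI users x 3 experimental MAPs. "ecap_interaction" stands
# in for a summary of the ECAP channel-interaction functions; "speech_score"
# stands in for a vowel-discrimination or speech-in-noise score.
df = pd.DataFrame({
    "subject": ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 3 + ["s4"] * 3,
    "map": [1, 2, 3] * 4,
    "ecap_interaction": [0.62, 0.41, 0.30, 0.75, 0.52, 0.35,
                         0.58, 0.44, 0.28, 0.70, 0.47, 0.33],
    "speech_score": [55.0, 63.0, 71.0, 48.0, 59.0, 66.0,
                     60.0, 65.0, 74.0, 50.0, 61.0, 69.0],
})

# Subject enters as a random effect, so the multiple measures per person
# (one per MAP) are not treated as independent observations.
model = smf.mixedlm("speech_score ~ ecap_interaction", df, groups=df["subject"])
result = model.fit()
print(result.params)  # fixed-effect slope: ECAP index vs. speech score
```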
NASA Technical Reports Server (NTRS)
Olorenshaw, Lex; Trawick, David
1991-01-01
The purpose was to develop a speech recognition system able to detect incorrectly pronounced speech, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was established. In continuous speech, the system achieved above 80 percent correct acceptance of correctly pronounced words while correctly rejecting over 80 percent of incorrectly pronounced words.
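The scoring scheme described above can be pictured with a small, purely illustrative sketch: the forced-alignment ("cheater-mode") acoustic score for each word is normalized against an unconstrained recognition score, and a threshold separates accepted from rejected pronunciations. The function name, the normalization, and the threshold below are assumptions, not the reported system's actual values.

```python
# Illustrative accept/reject decision based on normalized acoustic scores.
# Per-frame normalization removes the effect of word length; the threshold
# value is a made-up placeholder.
def accept_word(forced_loglik: float, free_loglik: float,
                n_frames: int, threshold: float = -0.5) -> bool:
    """Return True if the word is accepted as correctly pronounced."""
    score = (forced_loglik - free_loglik) / max(n_frames, 1)
    return score >= threshold

# A word whose forced-alignment score is close to the free recognizer's score
# is accepted; a large gap suggests mispronunciation.
print(accept_word(-310.0, -300.0, 40))   # True  (gap of 0.25 per frame)
print(accept_word(-420.0, -300.0, 40))   # False (gap of 3.0 per frame)
```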
Real-Time Performance Feedback for the Manual Control of Spacecraft
NASA Astrophysics Data System (ADS)
Karasinski, John Austin
Real-time performance metrics were developed to quantify workload, situational awareness, and manual task performance for use as visual feedback to pilots of aerospace vehicles. Results from prior lunar lander experiments with variable levels of automation were replicated and extended to provide insights for the development of real-time metrics. Increased levels of automation resulted in increased flight performance, lower workload, and increased situational awareness. Automated Speech Recognition (ASR) was employed to detect verbal callouts as a limited measure of subjects' situational awareness. A one-dimensional manual tracking task and a simple instructor-model visual feedback scheme were developed. This feedback was indicated to the operator by changing the color of a guidance element on the primary flight display, similar to how a flight instructor points out elements of a display to a student pilot. Experiments showed that for this low-complexity task, visual feedback did not change subject performance, but did increase the subjects' measured workload. Insights gained from these experiments were applied to a Simplified Aid for EVA Rescue (SAFER) inspection task. The effects of variations of an instructor-model performance-feedback strategy on human performance in a novel SAFER inspection task were investigated. Real-time feedback was found to have a statistically significant effect of improving subject performance and decreasing workload in this complicated four degree of freedom manual control task with two secondary tasks.
Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification
NASA Astrophysics Data System (ADS)
Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato
We investigated whether audiovisual synchrony perception for speech could change after observation of the audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is re-calibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of the audiovisual synchrony perception) in terms of the speech signal after observation of the speech stimuli which had a constant audiovisual lag. The results showed that the “simultaneous” responses (i.e., proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of the audiovisual synchrony perception) to exclude the possibility that this modulation of synchrony perception was solely attributable to the response strategy using stimuli identical to those of Experiment 1. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag both in direct and indirect measurement. Our results suggest that temporal recalibration occurs not only in non-speech signals but also in monosyllabic speech at the perceptual level.
Reed, Amanda C.; Centanni, Tracy M.; Borland, Michael S.; Matney, Chanel J.; Engineer, Crystal T.; Kilgard, Michael P.
2015-01-01
Objectives Hearing loss is a commonly experienced disability in a variety of populations including veterans and the elderly and can often cause significant impairment in the ability to understand spoken language. In this study, we tested the hypothesis that neural and behavioral responses to speech will be differentially impaired in an animal model after two forms of hearing loss. Design Sixteen female Sprague–Dawley rats were exposed to one of two types of broadband noise which was either moderate or intense. In nine of these rats, auditory cortex recordings were taken 4 weeks after noise exposure (NE). The other seven were pretrained on a speech sound discrimination task prior to NE and were then tested on the same task after hearing loss. Results Following intense NE, rats had few neural responses to speech stimuli. These rats were able to detect speech sounds but were no longer able to discriminate between speech sounds. Following moderate NE, rats had reorganized cortical maps and altered neural responses to speech stimuli but were still able to accurately discriminate between similar speech sounds during behavioral testing. Conclusions These results suggest that rats are able to adjust to the neural changes after moderate NE and discriminate speech sounds, but they are not able to recover behavioral abilities after intense NE. Animal models could help clarify the adaptive and pathological neural changes that contribute to speech processing in hearing-impaired populations and could be used to test potential behavioral and pharmacological therapies. PMID:25072238
Two different phenomena in basic motor speech performance in premanifest Huntington disease.
Skodda, Sabine; Grönheit, Wenke; Lukas, Carsten; Bellenberg, Barbara; von Hein, Sarah M; Hoffmann, Rainer; Saft, Carsten
2016-03-09
Dysarthria is a common feature in Huntington disease (HD). The aim of this cross-sectional pilot study was the description and objective analysis of different speech parameters, with special emphasis on the timing of connected speech and nonspeech verbal utterances, in premanifest HD (preHD). A total of 28 preHD mutation carriers and 28 age- and sex-matched healthy speakers performed a reading task and several syllable repetition tasks. Results of computerized acoustic analysis of different variables for the measurement of speech rate and regularity were correlated with clinical measures and MRI-based brain atrophy assessment by voxel-based morphometry. PreHD mutation carriers showed an impaired capacity to repeat single syllables steadily, with higher timing variability than healthy controls (variance 1: Cohen d = 1.46). Notably, speech rate was increased compared to controls and correlated with the volume of certain brain areas known to be involved in sensory-motor speech networks (net speech rate: Cohen d = 1.19). Furthermore, speech rate correlated with disease burden score, probability of disease onset, estimated years to onset, and clinical measures such as the cognitive score. Measurement of speech rate and regularity might be a helpful additional tool for monitoring subclinical functional disability in preHD. As one possible cause of the higher performance in preHD, we discuss huntingtin-dependent, temporarily advantageous developmental processes of the brain. © 2016 American Academy of Neurology.
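The two reported effect sizes rest on simple timing measures. Below is a minimal sketch, assuming syllable onset times have already been extracted acoustically, of a rate and regularity measure plus Cohen's d; it is illustrative, not the study's analysis code.

```python
import numpy as np

def rate_and_variability(onset_times_s):
    """Syllable rate (per second) and a coefficient-of-variation regularity index."""
    onsets = np.asarray(onset_times_s)
    intervals = np.diff(onsets)                       # inter-syllable intervals
    rate = len(intervals) / (onsets[-1] - onsets[0])  # repetitions per second
    variability = intervals.std() / intervals.mean()  # higher = less steady
    return rate, variability

def cohens_d(a, b):
    """Standardized mean difference between two groups of measures."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_sd = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                        / (len(a) + len(b) - 2))
    return (a.mean() - b.mean()) / pooled_sd

# Irregular repetition (preHD-like) vs. steady repetition (control-like).
print(rate_and_variability([0.0, 0.21, 0.45, 0.62, 0.91]))
print(rate_and_variability([0.0, 0.20, 0.40, 0.60, 0.80]))
```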
Oxytocin selectively moderates negative cognitive appraisals in high trait anxious males.
Alvares, Gail A; Chen, Nigel T M; Balleine, Bernard W; Hickie, Ian B; Guastella, Adam J
2012-12-01
The mammalian neuropeptide oxytocin has well-characterized effects in facilitating prosocial and affiliative behavior. Additionally, oxytocin decreases physiological and behavioral responses to social stress. In the present study we investigated the effects of oxytocin on cognitive appraisals after a naturalistic social stress task in healthy male students. In a randomized, double-blind, placebo-controlled trial, 48 participants self-administered either an oxytocin or placebo nasal spray and, following a wait period, completed an impromptu speech task. Eye gaze to a pre-recorded video of an audience displayed during the task was simultaneously collected. After the speech, participants completed questionnaires assessing negative cognitive beliefs about speech performance. Whilst there was no overall effect of oxytocin compared to placebo on either eye gaze or questionnaire measures, there were significant positive correlations between trait levels of anxiety and negative self-appraisals following the speech. Exploratory analyses revealed that whilst higher trait anxiety was associated with increasingly poorer perceptions of speech performance in the placebo group, this relationship was not found in participants administered oxytocin. These results provide preliminary evidence to suggest that oxytocin may reduce negative cognitive self-appraisals in high trait anxious males. It adds to a growing body of evidence that oxytocin seems to attenuate negative cognitive responses to stress in anxious individuals. Copyright © 2012 Elsevier Ltd. All rights reserved.
Button, Le; Peter, Beate; Stoel-Gammon, Carol; Raskind, Wendy H
2013-03-01
The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin.
Gutierrez-Sigut, Eva; Daws, Richard; Payne, Heather; Blott, Jonathan; Marshall, Chloë; MacSweeney, Mairéad
2015-12-01
Neuroimaging studies suggest greater involvement of the left parietal lobe in sign language compared to speech production. This stronger activation might be linked to the specific demands of sign encoding and proprioceptive monitoring. In Experiment 1 we investigate hemispheric lateralization during sign and speech generation in hearing native users of English and British Sign Language (BSL). Participants exhibited stronger lateralization during BSL than English production. In Experiment 2 we investigated whether this increased lateralization index could be due exclusively to the higher motoric demands of sign production. Sign naïve participants performed a phonological fluency task in English and a non-sign repetition task. Participants were left lateralized in the phonological fluency task but there was no consistent pattern of lateralization for the non-sign repetition in these hearing non-signers. The current data demonstrate stronger left hemisphere lateralization for producing signs than speech, which was not primarily driven by motoric articulatory demands. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
RSA Reactivity in Current and Remitted Major Depressive Disorder
Bylsma, Lauren M.; Salomon, Kristen; Taylor-Clift, April; Morris, Bethany H.; Rottenberg, Jonathan
2014-01-01
Objective Low resting respiratory sinus arrhythmia (RSA) levels and blunted RSA reactivity are thought to index impaired emotion regulation capacity. Major Depressive Disorder (MDD) has been associated with aberrant RSA reactivity and recovery to a speech stressor task relative to healthy controls. Whether impaired RSA functioning reflects aspects of the depressed mood state or a stable vulnerability marker for depression is unknown. Methods We compared resting RSA and RSA reactivity between individuals with MDD (n=49), remitted depression (RMD, n=24), and healthy controls (n=45). ECG data were collected during a resting baseline, a paced-breathing baseline, and two reactivity tasks (speech stressor, cold exposure). Results A group by time quadratic effect emerged (F(2, 109) = 4.36, p = .015) for RSA across phases of the speech stressor (baseline, instruction, preparation, speech, recovery). Follow-up analyses revealed that those with MDD uniquely exhibited blunted RSA reactivity, whereas RMD and controls both exhibited normal task-related vagal withdrawal and post-task recovery. The group by time interaction remained after covariation for age, sex, waist circumference, physical activity, and respiration, but not sleep quality. Conclusions These results provide new evidence that aberrant RSA reactivity marks features that track the depressed state, such as poor sleep, rather than a stable trait evident among asymptomatic persons. PMID:24367127
Beal, Deryk S; Cheyne, Douglas O; Gracco, Vincent L; Quraan, Maher A; Taylor, Margot J; De Nil, Luc F
2010-10-01
We used magnetoencephalography to investigate auditory evoked responses to speech vocalizations and non-speech tones in adults who do and do not stutter. Neuromagnetic field patterns were recorded as participants listened to a 1 kHz tone, playback of their own productions of the vowel /i/ and vowel-initial words, and actively generated the vowel /i/ and vowel-initial words. Activation of the auditory cortex at approximately 50 and 100 ms was observed during all tasks. A reduction in the peak amplitudes of the M50 and M100 components was observed during the active generation versus passive listening tasks dependent on the stimuli. Adults who stutter did not differ in the amount of speech-induced auditory suppression relative to fluent speakers. Adults who stutter had shorter M100 latencies for the actively generated speaking tasks in the right hemisphere relative to the left hemisphere but the fluent speakers showed similar latencies across hemispheres. During passive listening tasks, adults who stutter had longer M50 and M100 latencies than fluent speakers. The results suggest that there are timing, rather than amplitude, differences in auditory processing during speech in adults who stutter and are discussed in relation to hypotheses of auditory-motor integration breakdown in stuttering. Copyright 2010 Elsevier Inc. All rights reserved.
Excitability of the motor system: A transcranial magnetic stimulation study on singing and speaking.
Royal, Isabelle; Lidji, Pascale; Théoret, Hugo; Russo, Frank A; Peretz, Isabelle
2015-08-01
The perception of movements is associated with increased activity in the human motor cortex, which in turn may underlie our ability to understand actions, as it may be implicated in the recognition, understanding and imitation of actions. Here, we investigated the involvement and lateralization of the primary motor cortex (M1) in the perception of singing and speech. Transcranial magnetic stimulation (TMS) was applied independently for both hemispheres over the mouth representation of the motor cortex in healthy participants while they watched 4-s audiovisual excerpts of singers producing a 2-note ascending interval (singing condition) or 4-s audiovisual excerpts of a person explaining a proverb (speech condition). Subjects were instructed to determine whether a sung interval/written proverb matched a written interval/proverb. During both tasks, motor evoked potentials (MEPs) were recorded from the contralateral mouth muscle (orbicularis oris) of the stimulated motor cortex compared to a control task. Moreover, to investigate the time course of motor activation, TMS pulses were randomly delivered at 7 different time points (ranging from 500 to 3500 ms after stimulus onset). Results show that stimulation of the right hemisphere had a similar effect on the MEPs for both the singing and speech perception tasks, whereas stimulation of the left hemisphere significantly differed in the speech perception task compared to the singing perception task. Furthermore, analysis of the MEPs in the singing task revealed that they decreased for small musical intervals, but increased for large musical intervals, regardless of which hemisphere was stimulated. Overall, these results suggest a dissociation between the lateralization of M1 activity for speech perception and for singing perception, and that in the latter case its activity can be modulated by musical parameters such as the size of a musical interval. Copyright © 2015 Elsevier Ltd. All rights reserved.
Könönen, Mervi; Tamsi, Niko; Säisänen, Laura; Kemppainen, Samuli; Määttä, Sara; Julkunen, Petro; Jutila, Leena; Äikiä, Marja; Kälviäinen, Reetta; Niskanen, Eini; Vanninen, Ritva; Karjalainen, Pasi; Mervaala, Esa
2015-06-15
Navigated transcranial magnetic stimulation (nTMS) is a modern, precise method to activate and study cortical functions noninvasively. We hypothesized that a combination of nTMS and functional magnetic resonance imaging (fMRI) could clarify the localization of functional areas involved with motor control and production of speech. Navigated repetitive TMS (rTMS) with short bursts was used to map speech areas on both hemispheres by inducing speech disruption during number recitation tasks in healthy volunteers. Two experienced video reviewers, blinded to the stimulated area, graded each trial offline according to possible speech disruption. The locations of speech-disrupting nTMS trials were overlaid with fMRI activations from a word generation task. Speech disruptions were produced on both hemispheres by nTMS, though there were more disruptive stimulation sites on the left hemisphere. The grade of the disruptions varied from subjective sensation to mild, objectively recognizable disruption up to total speech arrest. The distribution of locations in which speech disruptions could be elicited varied among individuals. On the left hemisphere, the locations of disturbing rTMS bursts with reviewers' verification followed the areas of fMRI activation. A similar pattern was not observed on the right hemisphere. The reviewer-verified speech disruptions induced by nTMS provided clinically relevant information, and fMRI might explain further the function of the cortical area. nTMS and fMRI complement each other, and their combination should be advocated when assessing individual localization of the speech network. Copyright © 2015 Elsevier B.V. All rights reserved.
Synthesized speech rate and pitch effects on intelligibility of warning messages for pilots
NASA Technical Reports Server (NTRS)
Simpson, C. A.; Marchionda-Frost, K.
1984-01-01
In civilian and military operations, a future threat-warning system with a voice display could warn pilots of other traffic, obstacles in the flight path, and/or terrain during low-altitude helicopter flights. The present study was conducted to learn whether speech rate and voice pitch of phoneme-synthesized speech affect pilot accuracy and response time to typical threat-warning messages. Helicopter pilots engaged in an attention-demanding flying task and listened for voice threat warnings presented in a background of simulated helicopter cockpit noise. Performance was measured by flying-task performance, threat-warning intelligibility, and response time. Pilot ratings were elicited for the different voice pitches and speech rates. Significant effects were obtained only for response time and for pilot ratings, both as a function of speech rate. For the few cases when pilots forgot to respond to a voice message, they remembered 90 percent of the messages accurately when queried for their response 8 to 10 sec later.
Effects of speaking task on intelligibility in Parkinson’s disease
TJADEN, KRIS; WILDING, GREG
2017-01-01
Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare intelligibility estimates obtained for a reading passage and an extemporaneous monologue produced by 12 speakers with Parkinson’s disease (PD). The relationship between structural characteristics of utterances and scaled intelligibility was explored within speakers. Speakers were audio-recorded while reading a paragraph and producing a monologue. Speech samples were separated into individual utterances for presentation to 70 listeners who judged intelligibility using orthographic transcription and direct magnitude estimation (DME). Results suggest that scaled estimates of intelligibility for reading show potential for indexing intelligibility of an extemporaneous monologue. Within-speaker variation in scaled intelligibility also was related to the number of words per speech run for extemporaneous speech. PMID:20887216
ERIC Educational Resources Information Center
O'Brien, Nancy, Ed.
The articles in this paper explore the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical research applications. Titles of the papers and their authors are as follows: (1) "Task Dynamic Coordination of the Speech Articulators: A Preliminary Model" (Elliot Saltzman); (2) "Some Observations…
Alternating motion rate as an index of speech motor disorder in traumatic brain injury.
Wang, Yu-Tsai; Kent, Ray D; Duffy, Joseph R; Thomas, Jack E; Weismer, Gary
2004-01-01
The task of syllable alternating motion rate (AMR) (also called diadochokinesis) is suitable for examining speech disorders of varying degrees of severity and in individuals with varying levels of linguistic and cognitive ability. However, very limited information on this task has been published for subjects with traumatic brain injury (TBI). This study is a quantitative and qualitative acoustic analysis of AMR in seven subjects with TBI. The primary goal was to use acoustic analyses to assess speech motor control disturbances for the group as a whole and for individual patients. Quantitative analyses included measures of syllable rate, syllable and intersyllable gap durations, energy maxima, and voice onset time (VOT). Qualitative analyses included classification of features evident in spectrograms and waveforms to provide a more detailed description. The TBI group had (1) a slowed syllable rate due mostly to lengthened syllables and, to a lesser degree, lengthened intersyllable gaps, (2) highly correlated syllable rates between AMR and conversation, (3) temporal and energy maxima irregularities within repetition sequences, (4) normal median VOT values but with large variation, and (5) a number of speech production abnormalities revealed by qualitative analysis, including explosive speech quality, breathy voice quality, phonatory instability, multiple or missing stop bursts, continuous voicing, and spirantization. The relationships between these findings and TBI speakers' neurological status and dysarthria types are also discussed. It was concluded that acoustic analyses of the AMR task provide specific information on motor speech limitations in individuals with TBI.
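As an illustration of the quantitative measures listed above, the sketch below computes syllable rate, syllable and intersyllable gap durations, and an onset-timing irregularity index from already-segmented syllables; the segmentation itself (and VOT measurement) would come from acoustic analysis and is assumed here.

```python
import numpy as np

def amr_measures(segments):
    """Timing measures from (onset, offset) times in seconds, one per syllable."""
    seg = np.asarray(segments)                 # shape (n_syllables, 2)
    syllable_durs = seg[:, 1] - seg[:, 0]
    gaps = seg[1:, 0] - seg[:-1, 1]            # intersyllable gap durations
    total_time = seg[-1, 1] - seg[0, 0]
    return {
        "syllable_rate": len(seg) / total_time,          # syllables per second
        "mean_syllable_dur": syllable_durs.mean(),
        "mean_gap_dur": gaps.mean(),
        "onset_irregularity": np.diff(seg[:, 0]).std(),  # timing jitter
    }

# Five /pa/ repetitions: a slowed rate here comes mostly from lengthened
# syllables rather than lengthened gaps, as reported for the TBI group.
print(amr_measures([(0.00, 0.28), (0.35, 0.63), (0.70, 0.99),
                    (1.05, 1.33), (1.40, 1.70)]))
```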
Children with bilateral cochlear implants identify emotion in speech and music.
Volkova, Anna; Trehub, Sandra E; Schellenberg, E Glenn; Papsin, Blake C; Gordon, Karen A
2013-03-01
This study examined the ability of prelingually deaf children with bilateral implants to identify emotion (i.e. happiness or sadness) in speech and music. Participants in Experiment 1 were 14 prelingually deaf children from 5-7 years of age who had bilateral implants and 18 normally hearing children from 4-6 years of age. They judged whether linguistically neutral utterances produced by a man and woman sounded happy or sad. Participants in Experiment 2 were 14 bilateral implant users from 4-6 years of age and the same normally hearing children as in Experiment 1. They judged whether synthesized piano excerpts sounded happy or sad. Child implant users' accuracy of identifying happiness and sadness in speech was well above chance levels but significantly below the accuracy achieved by children with normal hearing. Similarly, their accuracy of identifying happiness and sadness in music was well above chance levels but significantly below that of children with normal hearing, who performed at ceiling. For the 12 implant users who participated in both experiments, performance on the speech task correlated significantly with performance on the music task and implant experience was correlated with performance on both tasks. Child implant users' accurate identification of emotion in speech exceeded performance in previous studies, which may be attributable to fewer response alternatives and the use of child-directed speech. Moreover, child implant users' successful identification of emotion in music indicates that the relevant cues are accessible at a relatively young age.
Technology and Speech Training: An Affair to Remember.
ERIC Educational Resources Information Center
Levitt, Harry
1989-01-01
A history of speech training technology is presented, from the simple hand-held mirror to complicated computer-based systems and tactile devices, and subsequent papers in this theme issue are introduced. Both the advantages and problems of technological aids are addressed. Simplicity in the application and use of speech training aids is stressed.…
An intelligent multi-media human-computer dialogue system
NASA Technical Reports Server (NTRS)
Neal, J. G.; Bettinger, K. E.; Byoun, J. S.; Dobes, Z.; Thielman, C. Y.
1988-01-01
Sophisticated computer systems are being developed to assist in the human decision-making process for very complex tasks performed under stressful conditions. The human-computer interface is a critical factor in these systems. The human-computer interface should be simple and natural to use, require a minimal learning period, assist the user in accomplishing his task(s) with a minimum of distraction, present output in a form that best conveys information to the user, and reduce cognitive load for the user. In pursuit of this ideal, the Intelligent Multi-Media Interfaces project is devoted to the development of interface technology that integrates speech, natural language text, graphics, and pointing gestures for human-computer dialogues. The objective of the project is to develop interface technology that uses the media/modalities intelligently in a flexible, context-sensitive, and highly integrated manner modelled after the manner in which humans converse in simultaneous coordinated multiple modalities. As part of the project, a knowledge-based interface system, called CUBRICON (CUBRC Intelligent CONversationalist) is being developed as a research prototype. The application domain being used to drive the research is that of military tactical air control.
ERIC Educational Resources Information Center
Sperbeck, Mieko
2010-01-01
The primary aim of this dissertation was to investigate the relationship between speech perception and speech production difficulties among Japanese second language (L2) learners of English, in their learning complex syllable structures. Japanese L2 learners and American English controls were tested in a categorical ABX discrimination task of…
The Use of Reported Speech in Children's Narratives: A Priming Study
ERIC Educational Resources Information Center
Serratrice, Ludovica; Hesketh, Anne; Ashworth, Rachel
2015-01-01
This study investigated the long-term effects of structural priming on children's use of indirect speech clauses in a narrative context. Forty-two monolingual English-speaking 5-year-olds in two primary classrooms took part in a story-retelling task including reported speech. Testing took place in three individual sessions (pre-test, post-test 1,…
An Acquired Deficit of Audiovisual Speech Processing
ERIC Educational Resources Information Center
Hamilton, Roy H.; Shenton, Jeffrey T.; Coslett, H. Branch
2006-01-01
We report a 53-year-old patient (AWF) who has an acquired deficit of audiovisual speech integration, characterized by a perceived temporal mismatch between speech sounds and the sight of moving lips. AWF was less accurate on an auditory digit span task with vision of a speaker's face as compared to a condition in which no visual information from…
ERIC Educational Resources Information Center
Rasanen, Okko
2011-01-01
Word segmentation from continuous speech is a difficult task that is faced by human infants when they start to learn their native language. Several studies indicate that infants might use several different cues to solve this problem, including intonation, linguistic stress, and transitional probabilities between subsequent speech sounds. In this…
Philosophy of Research in Motor Speech Disorders
ERIC Educational Resources Information Center
Weismer, Gary
2006-01-01
The primary objective of this position paper is to assess the theoretical and empirical support that exists for the Mayo Clinic view of motor speech disorders in general, and for oromotor, nonverbal tasks as a window to speech production processes in particular. Literature both in support of and against the Mayo clinic view and the associated use…
The Development of the Mealings, Demuth, Dillon, and Buchholz Classroom Speech Perception Test
ERIC Educational Resources Information Center
Mealings, Kiri T.; Demuth, Katherine; Buchholz, Jörg; Dillon, Harvey
2015-01-01
Purpose: Open-plan classroom styles are increasingly being adopted in Australia despite evidence that their high intrusive noise levels adversely affect learning. The aim of this study was to develop a new Australian speech perception task (the Mealings, Demuth, Dillon, and Buchholz Classroom Speech Perception Test) and use it in an open-plan…
Revisiting Speech Rate and Utterance Length Manipulations in Stuttering Speakers
ERIC Educational Resources Information Center
Blomgren, Michael; Goberman, Alexander M.
2008-01-01
The goal of this study was to evaluate stuttering frequency across a multidimensional (2 x 2) hierarchy of speech performance tasks. Specifically, this study examined the interaction between changes in length of utterance and levels of speech rate stability. Forty-four adult male speakers participated in the study (22 stuttering speakers and 22…
Speech and Voice Response to a Levodopa Challenge in Late-Stage Parkinson's Disease.
Fabbri, Margherita; Guimarães, Isabel; Cardoso, Rita; Coelho, Miguel; Guedes, Leonor Correia; Rosa, Mario M; Godinho, Catarina; Abreu, Daisy; Gonçalves, Nilza; Antonini, Angelo; Ferreira, Joaquim J
2017-01-01
Parkinson's disease (PD) patients are affected by hypokinetic dysarthria, characterized by hypophonia and dysprosody, which worsens with disease progression. Levodopa's (l-dopa) effect on quality of speech is inconclusive; no data are currently available for late-stage PD (LSPD). To assess the modifications of speech and voice in LSPD following an acute l-dopa challenge. LSPD patients [Schwab and England score <50/Hoehn and Yahr stage >3 (MED ON)] performed several vocal tasks before and after an acute l-dopa challenge. The following was assessed: respiratory support for speech, voice quality, stability and variability, speech rate, and motor performance (MDS-UPDRS-III). All voice samples were recorded and analyzed by a speech and language therapist blinded to patients' therapeutic condition using Praat 5.1 software. 24/27 (14 men) LSPD patients succeeded in performing voice tasks. Median age and disease duration of patients were 79 [IQR: 71.5-81.7] and 14.5 [IQR: 11-15.7] years, respectively. In MED OFF, respiratory breath support and pitch break time of LSPD patients were worse than normative values for non-parkinsonian speakers. A correlation was found between disease duration and voice quality (R = 0.51; p = 0.013) and speech rate (R = -0.55; p = 0.008). l-Dopa significantly improved MDS-UPDRS-III score (20%), with no effect on speech as assessed by clinical rating scales and automated analysis. Speech is severely affected in LSPD. Although l-dopa had some effect on motor performance, including axial signs, speech and voice did not improve. The applicability and efficacy of non-pharmacological treatment for speech impairment should be considered for speech disorder management in PD.
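The Praat measures named above (voice quality, stability) can be scripted; the study used Praat 5.1 directly, but the same commands are reachable from Python through the parselmouth package. The sketch below is a hedged illustration with a placeholder file name, not the authors' protocol, and assumes a recent parselmouth version.

```python
import parselmouth                      # Python interface to Praat
from parselmouth.praat import call

# "speech_sample.wav" is a placeholder; parameter values are Praat defaults.
snd = parselmouth.Sound("speech_sample.wav")
pulses = call(snd, "To PointProcess (periodic, cc)", 75, 500)

# Jitter and shimmer as voice-quality indices; F0 standard deviation as a
# crude stability index.
jitter = call(pulses, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, pulses], "Get shimmer (local)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
pitch = snd.to_pitch()
f0_sd = call(pitch, "Get standard deviation", 0, 0, "Hertz")

print(f"jitter={jitter:.4f}  shimmer={shimmer:.4f}  F0 SD={f0_sd:.1f} Hz")
```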
Loebach, Jeremy L; Pisoni, David B; Svirsky, Mario A
2009-12-01
The objective of this study was to assess whether training on speech processed with an eight-channel noise vocoder to simulate the output of a cochlear implant would produce transfer of auditory perceptual learning to the recognition of nonspeech environmental sounds, the identification of speaker gender, and the discrimination of talkers by voice. Twenty-four normal-hearing subjects were trained to transcribe meaningful English sentences processed with a noise vocoder simulation of a cochlear implant. An additional 24 subjects served as an untrained control group and transcribed the same sentences in their unprocessed form. All subjects completed pre- and post-test sessions in which they transcribed vocoded sentences to provide an assessment of training efficacy. Transfer of perceptual learning was assessed using a series of closed set, nonlinguistic tasks: subjects identified talker gender, discriminated the identity of pairs of talkers, and identified ecologically significant environmental sounds from a closed set of alternatives. Although both groups of subjects showed significant pre- to post-test improvements, subjects who transcribed vocoded sentences during training performed significantly better at post-test than those in the control group. Both groups performed equally well on gender identification and talker discrimination. Subjects who received explicit training on the vocoded sentences, however, performed significantly better on environmental sound identification than the untrained subjects. Moreover, across both groups, pre-test speech performance and, to a higher degree, post-test speech performance were significantly correlated with environmental sound identification. For both groups, environmental sounds that were characterized as having more salient temporal information were identified more often than environmental sounds that were characterized as having more salient spectral information. Listeners trained to identify noise-vocoded sentences showed evidence of transfer of perceptual learning to the identification of environmental sounds. In addition, the correlation between environmental sound identification and sentence transcription indicates that subjects who were better able to use the degraded acoustic information to identify the environmental sounds were also better able to transcribe the linguistic content of novel sentences. Both trained and untrained groups performed equally well (approximately 75% correct) on the gender-identification task, indicating that training did not have an effect on the ability to identify the gender of talkers. Although better than chance, performance on the talker discrimination task was poor overall (approximately 55%), suggesting that either explicit training is required to discriminate talkers' voices reliably or that additional information (perhaps spectral in nature) not present in the vocoded speech is required to excel in such tasks. Taken together, the results suggest that although transfer of auditory perceptual learning with spectrally degraded speech does occur, explicit task-specific training may be necessary for tasks that cannot rely on temporal information alone.
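An eight-channel noise vocoder of the kind used here has a standard structure: split the signal into bands, extract each band's temporal envelope, and use the envelopes to modulate band-limited noise. The following is a minimal sketch under assumed parameter choices (band edges, filter order), not the study's exact processor.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=8, lo=200.0, hi=7000.0):
    """Replace spectral detail with channel envelopes imposed on noise bands."""
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges
    noise = np.random.randn(len(x))
    out = np.zeros(len(x))
    for i in range(n_channels):
        sos = butter(4, [edges[i], edges[i + 1]], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        envelope = np.abs(hilbert(band))           # temporal envelope of the band
        carrier = sosfiltfilt(sos, noise)          # noise limited to the same band
        out += envelope * carrier
    return out / (np.max(np.abs(out)) + 1e-9)      # normalize to avoid clipping

fs = 16000
t = np.arange(fs) / fs
speechlike = np.sin(2 * np.pi * 220 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(speechlike, fs)
```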
Two-voice fundamental frequency estimation
NASA Astrophysics Data System (ADS)
de Cheveigné, Alain
2002-05-01
An algorithm is presented that estimates the fundamental frequencies of two concurrent voices or instruments. The algorithm models each voice as a periodic function of time, and jointly estimates both periods by cancellation according to a previously proposed method [de Cheveigné and Kawahara, Speech Commun. 27, 175-185 (1999)]. The new algorithm improves on the old in several respects: it allows an unrestricted search range, effectively avoids harmonic and subharmonic errors, is more accurate (it uses two-dimensional parabolic interpolation), and is computationally less costly. It remains subject to unavoidable errors when periods are in certain simple ratios and the task is inherently ambiguous. The algorithm is evaluated on a small database including speech, singing voice, and instrumental sounds. It can be extended in several ways: to decide the number of voices, to handle amplitude variations, and to estimate more than two voices (at the expense of increased processing cost and decreased reliability). It makes no use of instrument models, learned or otherwise, although it could usefully be combined with such models. [Work supported by the Cognitique programme of the French Ministry of Research and Technology.]
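The core idea, joint estimation of two periods by cancellation, can be shown in toy form: a cascade of two delay-and-subtract filters removes both periodicities, and the period pair leaving the least residual power wins. This sketch uses integer-sample periods only; the published algorithm interpolates, guards against harmonic and subharmonic errors, and is far more efficient.

```python
import numpy as np

def two_voice_f0(x, fs, fmin=80.0, fmax=400.0):
    """Brute-force joint cancellation over integer-sample period pairs."""
    periods = range(int(fs / fmax), int(fs / fmin) + 1)
    best_power, best_pair = np.inf, None
    for t1 in periods:
        y = x[t1:] - x[:-t1]                 # cancel the first voice
        for t2 in periods:
            if t2 < t1:
                continue                     # period pairs are unordered
            z = y[t2:] - y[:-t2]             # cancel the second voice
            power = np.mean(z ** 2)
            if power < best_power:
                best_power, best_pair = power, (t1, t2)
    return sorted(fs / p for p in best_pair)

fs = 8000
t = np.arange(2048) / fs
mix = np.sin(2 * np.pi * 125 * t) + 0.8 * np.sin(2 * np.pi * 200 * t)
print(two_voice_f0(mix, fs))                 # -> [125.0, 200.0]
```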
Game-Based Augmented Visual Feedback for Enlarging Speech Movements in Parkinson's Disease.
Yunusova, Yana; Kearney, Elaine; Kulkarni, Madhura; Haworth, Brandon; Baljko, Melanie; Faloutsos, Petros
2017-06-22
The purpose of this pilot study was to demonstrate the effect of augmented visual feedback on acquisition and short-term retention of a relatively simple instruction to increase movement amplitude during speaking tasks in patients with dysarthria due to Parkinson's disease (PD). Nine patients diagnosed with PD, hypokinetic dysarthria, and impaired speech intelligibility participated in a training program aimed at increasing the size of their articulatory (tongue) movements during sentences. Two sessions were conducted: a baseline and training session, followed by a retention session 48 hr later. At baseline, sentences were produced at normal, loud, and clear speaking conditions. Game-based visual feedback regarding the size of the articulatory working space (AWS) was presented during training. Eight of nine participants benefited from training, increasing their sentence AWS to a greater degree following feedback as compared with the baseline loud and clear conditions. The majority of participants were able to demonstrate the learned skill at the retention session. This study demonstrated the feasibility of augmented visual feedback via articulatory kinematics for training movement enlargement in patients with hypokinesia due to PD. https://doi.org/10.23641/asha.5116840.
Francis, Alexander L.; MacPherson, Megan K.; Chandrasekaran, Bharath; Alvar, Ann M.
2016-01-01
Typically, understanding speech seems effortless and automatic. However, a variety of factors may, independently or interactively, make listening more effortful. Physiological measures may help to distinguish between the application of different cognitive mechanisms whose operation is perceived as effortful. In the present study, physiological and behavioral measures associated with task demand were collected along with behavioral measures of performance while participants listened to and repeated sentences. The goal was to measure psychophysiological reactivity associated with three degraded listening conditions, each of which differed in the source of the difficulty (distortion, energetic masking, and informational masking) and was therefore expected to engage different cognitive mechanisms. These conditions were chosen to be matched for overall performance (keywords correct), and were compared to listening to unmasked speech produced by a natural voice. The three degraded conditions were: (1) unmasked speech produced by a computer speech synthesizer, (2) speech produced by a natural voice and masked by speech-shaped noise, and (3) speech produced by a natural voice and masked by two-talker babble. Masked conditions were both presented at a -8 dB signal-to-noise ratio (SNR), a level shown in previous research to result in comparable levels of performance for these stimuli and maskers. Performance was measured in terms of proportion of key words identified correctly, and task demand or effort was quantified subjectively by self-report. Measures of psychophysiological reactivity included electrodermal (skin conductance) response frequency and amplitude, blood pulse amplitude and pulse rate. Results suggest that the two masked conditions evoked stronger psychophysiological reactivity than did the two unmasked conditions even when behavioral measures of listening performance and listeners’ subjective perception of task demand were comparable across the three degraded conditions. PMID:26973564
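Fixing the masked conditions at -8 dB SNR amounts to scaling the masker relative to the speech power. Here is a small sketch of that scaling step (illustrative, not the study's stimulus-preparation code; the signals are random stand-ins).

```python
import numpy as np

def mix_at_snr(speech, masker, snr_db):
    """Scale the masker so that 10*log10(P_speech / P_masker) equals snr_db."""
    masker = masker[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_masker = np.mean(masker ** 2)
    gain = np.sqrt(p_speech / (p_masker * 10 ** (snr_db / 10)))
    return speech + gain * masker

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)          # stand-ins for speech and babble
babble = rng.standard_normal(16000)
mixture = mix_at_snr(speech, babble, -8.0)   # masker 8 dB above the speech
```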
Audio Classification in Speech and Music: A Comparison between a Statistical and a Neural Approach
NASA Astrophysics Data System (ADS)
Bugatti, Alessandro; Flammini, Alessandra; Migliorati, Pierangelo
2002-12-01
We focus attention on the problem of audio classification in speech and music for multimedia applications. In particular, we present a comparison between two different techniques for speech/music discrimination. The first method is based on zero-crossing rate and Bayesian classification. It is very simple from a computational point of view, and gives good results in the case of pure music or speech. The simulation results show that some performance degradation arises when the music segment also contains speech superimposed on music, or strong rhythmic components. To overcome these problems, we propose a second method that uses more features and is based on neural networks (specifically a multi-layer perceptron). In this case we obtain better performance, at the expense of a limited growth in computational complexity. In practice, the proposed neural network is simple to implement if a suitable polynomial is used as the activation function, and a real-time implementation is possible even if low-cost embedded systems are used.
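The first approach maps onto a few lines of code: compute frame-level zero-crossing-rate statistics and feed them to a Bayesian classifier. The sketch below uses Gaussian naive Bayes on synthetic noise-vs.-tone data as a stand-in for speech-vs.-music training material; it illustrates the pipeline, not the paper's exact features or classifier.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def zcr_features(x, frame=512):
    """Mean and spread of the per-frame zero-crossing rate."""
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return [zcr.mean(), zcr.std()]

rng = np.random.default_rng(1)
X, y = [], []
for i in range(20):
    noise = rng.standard_normal(8192)  # broadband, consonant-like frames
    tone = np.sin(2 * np.pi * (200 + 10 * i) * np.arange(8192) / 16000)
    X.append(zcr_features(noise)); y.append(0)   # class 0: "speech-like"
    X.append(zcr_features(tone));  y.append(1)   # class 1: "music-like"

clf = GaussianNB().fit(X, y)
print(clf.predict([zcr_features(rng.standard_normal(8192))]))   # -> [0]
```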
Tóth, László; Hoffmann, Ildikó; Gosztolya, Gábor; Vincze, Veronika; Szatlóczki, Gréta; Bánréti, Zoltán; Pákáski, Magdolna; Kálmán, János
2018-01-01
Background: Even today the reliable diagnosis of the prodromal stages of Alzheimer’s disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method based on the analysis of spontaneous speech production during the performance of a memory task. In the future, this can form the basis of an Internet-based interactive screening software for the recognition of MCI. Methods: Participants were 38 healthy controls and 48 clinically diagnosed MCI patients. We provoked spontaneous speech by asking the patients to recall the content of 2 short black and white films (one direct, one delayed), and to answer one question. Acoustic parameters (hesitation ratio, speech tempo, length and number of silent and filled pauses, length of utterance) were extracted from the recorded speech signals, first manually (using the Praat software), and then automatically, with an automatic speech recognition (ASR) based tool. First, the extracted parameters were statistically analyzed. Then we applied machine learning algorithms to see whether the MCI and the control group can be discriminated automatically based on the acoustic features. Results: The statistical analysis showed significant differences for most of the acoustic parameters (speech tempo, articulation rate, silent pause, hesitation ratio, length of utterance, pause-per-utterance ratio). The most significant differences between the two groups were found in the speech tempo in the delayed recall task, and in the number of pauses for the question-answering task. The fully automated version of the analysis process, that is, using the ASR-based features in combination with machine learning, was able to separate the two classes with an F1-score of 78.8%. Conclusion: The temporal analysis of spontaneous speech can be exploited in implementing a new, automatic detection-based tool for screening MCI for the community. PMID:29165085
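The pause-based parameters above can be approximated from an energy-based silence segmentation. The sketch below computes a hesitation ratio and pause count under assumed thresholds; the study itself used Praat and an ASR-based tool, and speech tempo would additionally require a syllable count (e.g., from the ASR output).

```python
import numpy as np

def temporal_params(x, fs, frame_s=0.02, rel_silence_db=-35.0, min_pause_s=0.25):
    """Hesitation ratio and pause count from energy-based silence detection."""
    n = int(frame_s * fs)
    frames = x[: len(x) // n * n].reshape(-1, n)
    db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    silent = db < db.max() + rel_silence_db       # frames far below the peak
    pauses, run = [], 0
    for s in list(silent) + [False]:              # sentinel flushes the last run
        if s:
            run += 1
        else:
            if run * frame_s >= min_pause_s:
                pauses.append(run * frame_s)
            run = 0
    total_s = len(frames) * frame_s
    return {"hesitation_ratio": sum(pauses) / total_s,
            "n_pauses": len(pauses),
            "speech_time_s": total_s - sum(pauses)}

fs = 16000
t = np.arange(fs) / fs
utterance = np.sin(2 * np.pi * 150 * t)           # 1 s of voiced signal
utterance[int(0.4 * fs):int(0.7 * fs)] = 0.001    # insert a 300-ms pause
print(temporal_params(utterance, fs))
```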
Effects of context and word class on lexical retrieval in Chinese speakers with anomic aphasia.
Law, Sam-Po; Kong, Anthony Pak-Hin; Lai, Loretta Wing-Shan; Lai, Christy
2015-01-01
Differences in processing nouns and verbs have been investigated intensely in psycholinguistics and neuropsychology in past decades. However, the majority of studies examining retrieval of these word classes have involved tasks of single word stimuli or responses. While the results have provided rich information for addressing issues about grammatical class distinctions, it is unclear whether they have adequate ecological validity for understanding lexical retrieval in connected speech which characterizes daily verbal communication. Previous investigations comparing retrieval of nouns and verbs in single word production and connected speech have reported either discrepant performance between the two contexts with presence of word class dissociation in picture naming but absence in connected speech, or null effects of word class. In addition, word finding difficulties have been found to be less severe in connected speech than picture naming. However, these studies have failed to match target stimuli of the two word classes and between tasks on psycholinguistic variables known to affect performance in response latency and/or accuracy. The present study compared lexical retrieval of nouns and verbs in picture naming and connected speech from picture description, procedural description, and story-telling among 19 Chinese speakers with anomic aphasia and their age-, gender-, and education-matched healthy controls, to understand the influence of grammatical class on word production across speech contexts when target items were balanced for confounding variables between word classes and tasks. Elicitation of responses followed the protocol of the AphasiaBank consortium (http://talkbank.org/AphasiaBank/). Target words for confrontation naming were based on well-established naming tests, while those for narrative were drawn from a large database of normal speakers. Selected nouns and verbs in the two contexts were matched for age-of-acquisition (AoA) and familiarity. Influence of imageability was removed through statistical control. When AoA and familiarity were balanced, nouns were retrieved better than verbs, and performance was higher in picture naming than connected speech. When imageability was further controlled for, only the effect of task remained significant. The absence of word class effects when confounding variables are controlled for is similar to many previous reports; however, the pattern of better word retrieval in naming is rare but compatible with the account that processing demands are higher in narrative than naming. The overall findings strongly suggest the importance of including connected speech tasks in any language assessment and evaluation of language rehabilitation of individuals with aphasia.
Effects of context and word class on lexical retrieval in Chinese speakers with anomic aphasia
Law, Sam-Po; Kong, Anthony Pak-Hin; Lai, Loretta Wing-Shan; Lai, Christy
2014-01-01
Background Differences in processing nouns and verbs have been investigated intensely in psycholinguistics and neuropsychology in past decades. However, the majority of studies examining retrieval of these word classes have involved tasks of single word stimuli or responses. While the results have provided rich information for addressing issues about grammatical class distinctions, it is unclear whether they have adequate ecological validity for understanding lexical retrieval in connected speech which characterizes daily verbal communication. Previous investigations comparing retrieval of nouns and verbs in single word production and connected speech have reported either discrepant performance between the two contexts with presence of word class dissociation in picture naming but absence in connected speech, or null effects of word class. In addition, word finding difficulties have been found to be less severe in connected speech than picture naming. However, these studies have failed to match target stimuli of the two word classes and between tasks on psycholinguistic variables known to affect performance in response latency and/or accuracy. Aims The present study compared lexical retrieval of nouns and verbs in picture naming and connected speech from picture description, procedural description, and story-telling among 19 Chinese speakers with anomic aphasia and their age-, gender-, and education-matched healthy controls, to understand the influence of grammatical class on word production across speech contexts when target items were balanced for confounding variables between word classes and tasks. Methods & Procedures Elicitation of responses followed the protocol of the AphasiaBank consortium (http://talkbank.org/AphasiaBank/). Target words for confrontation naming were based on well-established naming tests, while those for narrative were drawn from a large database of normal speakers. Selected nouns and verbs in the two contexts were matched for age-of-acquisition (AoA) and familiarity. Influence of imageability was removed through statistical control. Outcomes & Results When AoA and familiarity were balanced, nouns were retrieved better than verbs, and performance was higher in picture naming than connected speech. When imageability was further controlled for, only the effect of task remained significant. Conclusions The absence of word class effects when confounding variables are controlled for is similar to many previous reports; however, the pattern of better word retrieval in naming is rare but compatible with the account that processing demands are higher in narrative than naming. The overall findings strongly suggest the importance of including connected speech tasks in any language assessment and evaluation of language rehabilitation of individuals with aphasia. PMID:25505810
A speech-controlled environmental control system for people with severe dysarthria.
Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca
2007-06-01
Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.
The minimal unit of phonological encoding: prosodic or lexical word.
Wheeldon, Linda R; Lahiri, Aditi
2002-09-01
Wheeldon and Lahiri (Journal of Memory and Language 37 (1997) 356) used a prepared speech production task (Sternberg, S., Monsell, S., Knoll, R. L., & Wright, C. E. (1978). The latency and duration of rapid movement sequences: comparisons of speech and typewriting. In G. E. Stelmach (Ed.), Information processing in motor control and learning (pp. 117-152). New York: Academic Press; Sternberg, S., Wright, C. E., Knoll, R. L., & Monsell, S. (1980). Motor programs in rapid speech: additional evidence. In R. A. Cole (Ed.), The perception and production of fluent speech (pp. 507-534). Hillsdale, NJ: Erlbaum) to demonstrate that the latency to articulate a sentence is a function of the number of phonological words it comprises. Latencies for the sentence [Ik zoek het] [water] 'I seek the water' were shorter than latencies for sentences like [Ik zoek] [vers] [water] 'I seek fresh water'. We extend this research by examining the prepared production of utterances containing phonological words that are less than a lexical word in length. Dutch compounds (e.g. ooglid 'eyelid') form a single morphosyntactic word and a phonological word, which in turn includes two phonological words. We compare their prepared production latencies to those of syntactic phrases consisting of an adjective and a noun (e.g. oud lid 'old member') which comprise two morphosyntactic and two phonological words, and to morphologically simple words (e.g. orgel 'organ') which comprise one morphosyntactic and one phonological word. Our findings demonstrate that the effect is limited to phrasal level phonological words, suggesting that production models need to make a distinction between lexical and phrasal phonology.
Dimension-Based Statistical Learning Affects Both Speech Perception and Production.
Lehet, Matthew; Holt, Lori L
2017-04-01
Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more "perceptual weight" and more effectively signal category membership to native listeners. Yet perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners' perceptual weights in response to speech that deviates from the norms also affects listeners' own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, and then shifted to an "artificial accent" that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 × VOT correlation, F0 was a less robust cue to voicing in listeners' own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted. Copyright © 2016 Cognitive Science Society, Inc.
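A common way to operationalize "perceptual weight" is to regress listeners' category responses on the standardized acoustic dimensions and compare coefficient magnitudes. The sketch below is not the authors' analysis: it simulates a listener who relies mainly on VOT, then recovers the relative F0 and VOT weights; all stimulus ranges and the simulated listener are invented for illustration.

```python
# Estimating relative perceptual weights of two cues from categorization
# responses via logistic regression. Simulated data, illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
vot = rng.uniform(0, 60, n)        # voice onset time in ms
f0 = rng.uniform(90, 240, n)       # fundamental frequency in Hz

# Simulated listener: heavy reliance on VOT, weak reliance on F0
logit = 0.15 * (vot - 30) + 0.01 * (f0 - 165)
resp = rng.random(n) < 1 / (1 + np.exp(-logit))   # True = "voiceless"

# Standardize cues so coefficient magnitudes are comparable
Xz = StandardScaler().fit_transform(np.column_stack([vot, f0]))
coef = LogisticRegression().fit(Xz, resp).coef_[0]
w = np.abs(coef) / np.abs(coef).sum()
print(f"relative weight  VOT: {w[0]:.2f}  F0: {w[1]:.2f}")
```

A down-weighting of F0 after exposure to the reversed correlation would show up as a drop in the second weight when the model is refit on that block's responses.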
Improving Speech Perception in Noise with Current Focusing in Cochlear Implant Users
Srinivasan, Arthi G.; Padilla, Monica; Shannon, Robert V.; Landsberger, David M.
2013-01-01
Cochlear implant (CI) users typically have excellent speech recognition in quiet but struggle with understanding speech in noise. It is thought that broad current spread from stimulating electrodes causes adjacent electrodes to activate overlapping populations of neurons which results in interactions across adjacent channels. Current focusing has been studied as a way to reduce spread of excitation, and therefore, reduce channel interactions. In particular, partial tripolar stimulation has been shown to reduce spread of excitation relative to monopolar stimulation. However, the crucial question is whether this benefit translates to improvements in speech perception. In this study, we compared speech perception in noise with experimental monopolar and partial tripolar speech processing strategies. The two strategies were matched in terms of number of active electrodes, microphone, filterbanks, stimulation rate and loudness (although both strategies used a lower stimulation rate than typical clinical strategies). The results of this study showed a significant improvement in speech perception in noise with partial tripolar stimulation. All subjects benefited from the current focused speech processing strategy. There was a mean improvement in speech recognition threshold of 2.7 dB in a digits in noise task and a mean improvement of 3 dB in a sentences in noise task with partial tripolar stimulation relative to monopolar stimulation. Although the experimental monopolar strategy was worse than the clinical strategy, presumably due to different microphones, frequency allocations and stimulation rates, the experimental partial-tripolar strategy, which had the same changes, showed no acute deficit relative to the clinical strategy. PMID:23467170
Dimension-based statistical learning affects both speech perception and production
Lehet, Matthew; Holt, Lori L.
2016-01-01
Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more “perceptual weight” and more effectively signal category membership to native listeners. Yet, perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners’ perceptual weights in response to speech that deviates from the norms also affects listeners’ own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, then shifted to an “artificial accent” that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 x VOT correlation, F0 was a less robust cue to voicing in listeners’ own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted. PMID:27666146
Some components of the ``cocktail-party effect,'' as revealed when it fails
NASA Astrophysics Data System (ADS)
Divenyi, Pierre L.; Gygi, Brian
2003-04-01
The precise way listeners cope with cocktail-party situations, i.e., understand speech in the midst of other, simultaneously ongoing conversations, has by-and-large remained a puzzle, despite research committed to studying the problem over the past half century. In contrast, it is widely acknowledged that the cocktail-party effect (CPE) deteriorates in aging. Our investigations during the last decade have assessed the deterioration of the CPE in elderly listeners and attempted to uncover specific auditory tasks, on which the performance of the same listeners will also exhibit a deficit. Correlated performance on CPE and such auditory tasks arguably signifies that the tasks in question are necessary for perceptual segregation of the target speech and the background babble. We will present results on three tasks correlated with CPE performance. All three tasks require temporal processing-based perceptual segregation of specific non-speech stimuli (amplitude- and/or frequency-modulated sinusoidal complexes): discrimination of formant transition patterns, segregation of streams with different syllabic rhythms, and selective attention to AM or FM features in the designated stream. [Work supported by a grant from the National Institute on Aging and by the V.A. Medical Research.]
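For readers who want to see what non-speech stimuli of this general kind look like, the sketch below synthesizes a single amplitude- and frequency-modulated sinusoid; the carrier frequency, modulation rates, and depths are arbitrary choices, not the values used in the study.

```python
# Synthesizing an AM/FM sinusoid; all parameter values are illustrative.
import numpy as np

sr = 16000
t = np.arange(0, 1.0, 1 / sr)
fc, am_rate, fm_rate = 1000.0, 4.0, 6.0   # carrier Hz, AM rate Hz, FM rate Hz
am_depth, fm_dev = 0.8, 50.0              # AM depth (0-1), FM deviation in Hz

envelope = 1.0 + am_depth * np.sin(2 * np.pi * am_rate * t)
# Integrated phase: instantaneous frequency = fc + fm_dev*cos(2*pi*fm_rate*t)
phase = 2 * np.pi * fc * t + (fm_dev / fm_rate) * np.sin(2 * np.pi * fm_rate * t)
stimulus = envelope * np.sin(phase)
stimulus /= np.abs(stimulus).max()        # normalize to +/-1 for playback
```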
Effects of intelligibility on working memory demand for speech perception.
Francis, Alexander L; Nusbaum, Howard C
2009-08-01
Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibility was varied either by training listeners to use available acoustic cues in a more diagnostic manner (as in Experiment 1) or by providing listeners with more informative acoustic cues (i.e., better speech quality, as in Experiments 2 and 3). In the first experiment, training significantly improved intelligibility and recognition speed; increasing WM load significantly slowed recognition. A significant interaction between training and load indicated that the benefit of training on recognition speed was observed only under low memory load. In subsequent experiments, listeners received no training; intelligibility was manipulated by changing synthesizers. Improving intelligibility without training improved recognition accuracy, and increasing memory load still decreased it, but more intelligible speech did not produce more efficient use of available WM capacity. This suggests that perceptual learning modifies the way available capacity is used, perhaps by increasing the use of more phonetically informative features and/or by decreasing use of less informative ones.
Dietrich, Susanne; Hertrich, Ingo; Müller-Dahlhaus, Florian; Ackermann, Hermann; Belardinelli, Paolo; Desideri, Debora; Seibold, Verena C; Ziemann, Ulf
2018-01-01
The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of pre-SMA was tested for accelerated speech comprehension by inducing a transient "virtual lesion" using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while related words decreased. Thus, the transient impairment of pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in case of perception, prior to encoding into a semantically/pragmatically meaningful message.
Dietrich, Susanne; Hertrich, Ingo; Müller-Dahlhaus, Florian; Ackermann, Hermann; Belardinelli, Paolo; Desideri, Debora; Seibold, Verena C.; Ziemann, Ulf
2018-01-01
The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of pre-SMA was tested for accelerated speech comprehension by inducing a transient “virtual lesion” using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while related words decreased. Thus, the transient impairment of pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in case of perception, prior to encoding into a semantically/pragmatically meaningful message. PMID:29896086
The influence of target-masker similarity on across-ear interference in dichotic listening
NASA Astrophysics Data System (ADS)
Brungart, Douglas; Simpson, Brian
2004-05-01
In most dichotic listening tasks, the comprehension of a target speech signal presented in one ear is unaffected by the presence of irrelevant speech in the opposite ear. However, recent results have shown that contralaterally presented interfering speech signals do influence performance when a second interfering speech signal is present in the same ear as the target speech. In this experiment, we examined the influence of target-masker similarity on this effect by presenting ipsilateral and contralateral masking phrases spoken by the same talker, a different same-sex talker, or a different-sex talker than the one used to generate the target speech. The results show that contralateral target-masker similarity has the greatest influence on performance when an easily segregated different-sex masker is presented in the target ear, and the least influence when a difficult-to-segregate same-talker masker is presented in the target ear. These results indicate that across-ear interference in dichotic listening is not directly related to the difficulty of the segregation task in the target ear, and suggest that contralateral maskers are least likely to interfere with dichotic speech perception when the same general strategy could be used to segregate the target from the masking voices in the ipsilateral and contralateral ears.
Negative blood oxygen level dependent signals during speech comprehension.
Rodriguez Moreno, Diana; Schiff, Nicholas D; Hirsch, Joy
2015-05-01
Speech comprehension studies have generally focused on the isolation and function of regions with positive blood oxygen level dependent (BOLD) signals with respect to a resting baseline. Although regions with negative BOLD signals in comparison to a resting baseline have been reported in language-related tasks, their relationship to regions of positive signals is not fully appreciated. Based on the emerging notion that the negative signals may represent an active function in language tasks, the authors test the hypothesis that negative BOLD signals during receptive language are more associated with comprehension than content-free versions of the same stimuli. Regions associated with comprehension of speech were isolated by comparing responses to passive listening to natural speech to two incomprehensible versions of the same speech: one that was digitally time reversed and one that was muffled by removal of high frequencies. The signal polarity was determined by comparing the BOLD signal during each speech condition to the BOLD signal during a resting baseline. As expected, stimulation-induced positive signals relative to resting baseline were observed in the canonical language areas with varying signal amplitudes for each condition. Negative BOLD responses relative to resting baseline were observed primarily in frontoparietal regions and were specific to the natural speech condition. However, the BOLD signal remained indistinguishable from baseline for the unintelligible speech conditions. Variations in connectivity between brain regions with positive and negative signals were also specifically related to the comprehension of natural speech. These observations of anticorrelated signals related to speech comprehension are consistent with emerging models of cooperative roles represented by BOLD signals of opposite polarity.
Subthalamic nucleus neurons differentially encode early and late aspects of speech production.
Lipski, W J; Alhourani, A; Pirnia, T; Jones, P W; Dastolfo-Hromack, C; Helou, L B; Crammond, D J; Shaiman, S; Dickey, M W; Holt, L L; Turner, R S; Fiez, J A; Richardson, R M
2018-05-22
Basal ganglia-thalamocortical loops mediate all motor behavior, yet little detail is known about the role of basal ganglia nuclei in speech production. Using intracranial recording during deep brain stimulation surgery in humans with Parkinson's disease, we tested the hypothesis that the firing rate of subthalamic nucleus neurons is modulated in sync with motor execution aspects of speech. Nearly half of seventy-nine unit recordings exhibited firing rate modulation during a syllable reading task, across twelve subjects (male and female). Trial-to-trial timing of changes in subthalamic neuronal activity, relative to cue onset versus production onset, revealed that locking to cue presentation was associated more with units that decreased firing rate, while locking to speech onset was associated more with units that increased firing rate. These unique data indicate that subthalamic activity is dynamic during the production of speech, reflecting temporally-dependent inhibition and excitation of separate populations of subthalamic neurons. SIGNIFICANCE STATEMENT The basal ganglia are widely assumed to participate in speech production, yet no prior studies have reported detailed examination of speech-related activity in basal ganglia nuclei. Using microelectrode recordings from the subthalamic nucleus during a single syllable reading task, in awake humans undergoing deep brain stimulation implantation surgery, we show that the firing rate of subthalamic nucleus neurons is modulated in response to motor execution aspects of speech. These results are the first to establish a role for subthalamic nucleus neurons in encoding of aspects of speech production, and they lay the groundwork for launching a modern subfield to explore basal ganglia function in human speech. Copyright © 2018 the authors.
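The cue-locked versus speech-locked comparison reported here rests on peri-event averaging of spike times. A schematic version, not the authors' code, is sketched below; `spikes`, `cue_onsets`, and `speech_onsets` are hypothetical arrays of times in seconds.

```python
# Peri-event time histogram (PSTH) for comparing two alignment events.
# Schematic sketch; input arrays are hypothetical.
import numpy as np

def psth(spike_times, event_times, window=(-0.5, 1.0), bin_s=0.05):
    """Mean firing rate (spikes/s) in bins around each event time."""
    edges = np.arange(window[0], window[1] + bin_s, bin_s)
    counts = np.zeros(len(edges) - 1)
    for ev in event_times:
        counts += np.histogram(spike_times - ev, bins=edges)[0]
    return counts / (len(event_times) * bin_s)

rate_cue = psth(spikes, cue_onsets)         # aligned to cue presentation
rate_speech = psth(spikes, speech_onsets)   # aligned to production onset
# A unit would be labeled cue-locked or speech-locked according to which
# alignment yields the sharper modulation relative to a pre-event baseline.
```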
Negative Blood Oxygen Level Dependent Signals During Speech Comprehension
Rodriguez Moreno, Diana; Schiff, Nicholas D.
2015-01-01
Speech comprehension studies have generally focused on the isolation and function of regions with positive blood oxygen level dependent (BOLD) signals with respect to a resting baseline. Although regions with negative BOLD signals in comparison to a resting baseline have been reported in language-related tasks, their relationship to regions of positive signals is not fully appreciated. Based on the emerging notion that the negative signals may represent an active function in language tasks, the authors test the hypothesis that negative BOLD signals during receptive language are more associated with comprehension than content-free versions of the same stimuli. Regions associated with comprehension of speech were isolated by comparing responses to passive listening to natural speech to two incomprehensible versions of the same speech: one that was digitally time reversed and one that was muffled by removal of high frequencies. The signal polarity was determined by comparing the BOLD signal during each speech condition to the BOLD signal during a resting baseline. As expected, stimulation-induced positive signals relative to resting baseline were observed in the canonical language areas with varying signal amplitudes for each condition. Negative BOLD responses relative to resting baseline were observed primarily in frontoparietal regions and were specific to the natural speech condition. However, the BOLD signal remained indistinguishable from baseline for the unintelligible speech conditions. Variations in connectivity between brain regions with positive and negative signals were also specifically related to the comprehension of natural speech. These observations of anticorrelated signals related to speech comprehension are consistent with emerging models of cooperative roles represented by BOLD signals of opposite polarity. PMID:25412406
Scheperle, Rachel A.; Abbas, Paul J.
2014-01-01
Objectives The ability to perceive speech is related to the listener’s ability to differentiate among frequencies (i.e., spectral resolution). Cochlear implant (CI) users exhibit variable speech-perception and spectral-resolution abilities, which can be attributed in part to the extent of electrode interactions at the periphery (i.e., spatial selectivity). However, electrophysiological measures of peripheral spatial selectivity have not been found to correlate with speech perception. The purpose of this study was to evaluate auditory processing at the periphery and cortex using both simple and spectrally complex stimuli to better understand the stages of neural processing underlying speech perception. The hypotheses were that (1) by more completely characterizing peripheral excitation patterns than in previous studies, significant correlations with measures of spectral selectivity and speech perception would be observed, (2) adding information about processing at a level central to the auditory nerve would account for additional variability in speech perception, and (3) responses elicited with spectrally complex stimuli would be more strongly correlated with speech perception than responses elicited with spectrally simple stimuli. Design Eleven adult CI users participated. Three experimental processor programs (MAPs) were created to vary the likelihood of electrode interactions within each participant. For each MAP, a subset of 7 of 22 intracochlear electrodes was activated: adjacent (MAP 1), every-other (MAP 2), or every third (MAP 3). Peripheral spatial selectivity was assessed using the electrically evoked compound action potential (ECAP) to obtain channel-interaction functions for all activated electrodes (13 functions total). Central processing was assessed by eliciting the auditory change complex (ACC) with both spatial (electrode pairs) and spectral (rippled noise) stimulus changes. Speech-perception measures included vowel discrimination and the Bamford-Kowal-Bench Sentence-in-Noise (BKB-SIN) test. Spatial and spectral selectivity and speech perception were expected to be poorest with MAP 1 (closest electrode spacing) and best with MAP 3 (widest electrode spacing). Relationships among the electrophysiological and speech-perception measures were evaluated using mixed-model and simple linear regression analyses. Results All electrophysiological measures were significantly correlated with each other and with speech perception for the mixed-model analysis, which takes into account multiple measures per person (i.e. experimental MAPs). The ECAP measures were the best predictor of speech perception. In the simple linear regression analysis on MAP 3 data, only the cortical measures were significantly correlated with speech; spectral ACC amplitude was the strongest predictor. Conclusions The results suggest that both peripheral and central electrophysiological measures of spatial and spectral selectivity provide valuable information about speech perception. Clinically, it is often desirable to optimize performance for individual CI users. These results suggest that ECAP measures may be the most useful for within-subject applications, when multiple measures are performed to make decisions about processor options. They also suggest that if the goal is to compare performance across individuals based on a single measure, then processing central to the auditory nerve (specifically, cortical measures of discriminability) should be considered. PMID:25658746
A Human Machine Interface for EVA
NASA Astrophysics Data System (ADS)
Hartmann, L.
EVA astronauts work in a challenging environment that includes high rate of muscle fatigue, haptic and proprioception impairment, lack of dexterity and interaction with robotic equipment. Currently they are heavily dependent on support from on-board crew and ground station staff for information and robotics operation. They are limited to the operation of simple controls on the suit exterior and external robot controls that are difficult to operate because of the heavy gloves that are part of the EVA suit. A wearable human machine interface (HMI) inside the suit provides a powerful alternative for robot teleoperation, procedure checklist access, generic equipment operation via virtual control panels and general information retrieval and presentation. The HMI proposed here includes speech input and output, a simple 6-degree-of-freedom (dof) pointing device and a heads-up display (HUD). The essential characteristic of this interface is that it offers an alternative to the standard keyboard and mouse interface of a desktop computer. The astronaut's speech is used as input to command mode changes, execute arbitrary computer commands and generate text. The HMI can respond with speech also in order to confirm selections, provide status and feedback and present text output. A candidate 6-dof pointing device is Measurand's Shapetape, a flexible "tape" substrate to which is attached an optic fiber with embedded sensors. Measurement of the modulation of the light passing through the fiber can be used to compute the shape of the tape and, in particular, the position and orientation of the end of the Shapetape. It can be used to provide any kind of 3D geometric information including robot teleoperation control. The HUD can overlay graphical information onto the astronaut's visual field including robot joint torques, end effector configuration, procedure checklists and virtual control panels. With suitable tracking information about the position and orientation of the EVA suit, the overlaid graphical information can be registered with the external world. For example, information about an object can be positioned on or beside the object. This wearable HMI supports many applications during EVA including robot teleoperation, procedure checklist usage, operation of virtual control panels and general information or documentation retrieval and presentation. Whether the robot end effector is a mobile platform for the EVA astronaut or is an assistant to the astronaut in an assembly or repair task, the astronaut can control the robot via a direct manipulation interface. Embedded in the suit or the astronaut's clothing, Shapetape can measure the user's arm/hand position and orientation which can be directly mapped into the workspace coordinate system of the robot. Motion of the user's hand can generate corresponding motion of the robot end effector in order to reposition the EVA platform or to manipulate objects in the robot's grasp. Speech input can be used to execute commands and mode changes without the astronaut having to withdraw from the teleoperation task. Speech output from the system can provide feedback without affecting the user's visual attention. The procedure checklist guiding the astronaut's detailed activities can be presented on the HUD and manipulated (e.g., move, scale, annotate, mark tasks as done, consult prerequisite tasks) by spoken command.
Virtual control panels for suit equipment, equipment being repaired or arbitrary equipment on the space station can be displayed on the HUD and can be operated by speech commands or by hand gestures. For example, an antenna being repaired could be pointed under the control of the EVA astronaut. Additionally arbitrary computer activities such as information retrieval and presentation can be carried out using similar interface techniques. Considering the risks, expense and physical challenges of EVA work, it is appropriate that EVA astronauts have considerable support from station crew and ground station staff. Reducing their dependence on such personnel may, under many circumstances, improve performance and reduce risk. For example, the EVA astronaut is likely to have the best viewpoint at a robotic worksite. Direct access to the procedure checklist can help provide temporal context and continuity throughout an EVA. Access to station facilities through an HMI such as the one described here could be invaluable during an emergency or in a situation in which a fault occurs. The full paper will describe the HMI operation and applications in the EVA context in more detail and will describe current laboratory prototyping activities.
ERIC Educational Resources Information Center
Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina
2015-01-01
Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…
ERIC Educational Resources Information Center
Sorqvist, Patrik; Ronnberg, Jerker
2012-01-01
Purpose: To investigate whether working memory capacity (WMC) modulates the effects of to-be-ignored speech on the memory of materials conveyed by to-be-attended speech. Method: Two tasks (reading span, Daneman & Carpenter, 1980; Ronnberg et al., 2008; and size-comparison span, Sorqvist, Ljungberg, & Ljung, 2010) were used to measure individual…
ERIC Educational Resources Information Center
Lavie, Limor; Banai, Karen; Karni, Avi; Attias, Joseph
2015-01-01
Purpose: We tested whether using hearing aids can improve unaided performance in speech perception tasks in older adults with hearing impairment. Method: Unaided performance was evaluated in dichotic listening and speech-in-noise tests in 47 older adults with hearing impairment; 36 participants in 3 study groups were tested before hearing aid…
Teaching the Tyrants: Perspectives on Freedom of Speech and Undergraduates.
ERIC Educational Resources Information Center
Herbeck, Dale A.
Teaching freedom of speech to undergraduates is a difficult task, in part as a result of the challenging history of free expression in the United States. The difficulty is compounded by the need to teach the topic, in contrast to indoctrinating the students in an ideology of free speech. The Bill of Rights, and specifically the First Amendment,…
Hurkmans, Joost; Jonkers, Roel; Boonstra, Anne M; Stewart, Roy E; Reinders-Messelink, Heleen A
2012-01-01
The number of reliable and valid instruments to measure the effects of therapy in apraxia of speech (AoS) is limited. This study evaluated the newly developed Modified Diadochokinesis Test (MDT), a task to assess the effects of rate and rhythm therapies for AoS, in a multiple baseline across behaviours design. The consistency, accuracy and fluency of speech of 24 adults with AoS and 12 unaffected speakers matched for age, gender and educational level were assessed using the MDT. The reliability and validity of the instrument were considered and outcomes compared with those obtained with existing tests. The results revealed that the MDT had a strong internal consistency. Scores were influenced by syllable structure complexity, while distinctive features of articulation had no measurable effect. The test-retest and intra- and inter-rater reliabilities were shown to be adequate, and the discriminant validity was good. For convergent validity different outcomes were found: apart from one correlation, the scores on tests assessing functional communication and AoS correlated significantly with the MDT outcome measures. The spontaneous speech phonology measure of the Aachen Aphasia Test (AAT) correlated significantly with the MDT outcome measures, but no correlations were found for the repetition subtest and the spontaneous speech articulation/prosody measure of the AAT. The study shows that the MDT has adequate psychometric properties, implying that it can be used to measure changes in speech motor control during treatment for apraxia of speech. The results demonstrate the validity and utility of the instrument as a supplement to speech tasks in assessing speech improvement aimed at the level of planning and programming of speech. © 2012 Royal College of Speech and Language Therapists.
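The internal-consistency claim in this abstract refers to a statistic such as Cronbach's alpha, which can be computed directly from a participants-by-items score matrix. The sketch below uses simulated item scores; it illustrates the formula, not the MDT data.

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances)/total variance)
import numpy as np

def cronbach_alpha(scores):
    """scores: participants x items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

# Simulated data: 30 speakers, 10 items driven by one latent ability
rng = np.random.default_rng(1)
ability = rng.normal(size=(30, 1))
items = ability + 0.5 * rng.normal(size=(30, 10))
print(f"alpha = {cronbach_alpha(items):.2f}")   # high alpha expected here
```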
Pathways of the inferior frontal occipital fasciculus in overt speech and reading.
Rollans, Claire; Cheema, Kulpreet; Georgiou, George K; Cummine, Jacqueline
2017-11-19
In this study, we examined the relationship between tractography-based measures of white matter integrity (e.g., fractional anisotropy [FA]) from diffusion tensor imaging (DTI) and five reading-related tasks, including rapid automatized naming (RAN) of letters, digits, and objects, and reading of real words and nonwords. Twenty university students with no reported history of reading difficulties were tested on all five tasks and their performance was correlated with diffusion measures extracted through DTI tractography. A secondary analysis using whole-brain Tract-Based Spatial Statistics (TBSS) was also used to find clusters showing significant negative correlations between reaction time and FA. Results showed a significant relationship between the left inferior fronto-occipital fasciculus FA and performance on the RAN of objects task, as well as a strong relationship to nonword reading, which suggests a role for this tract in slower, non-automatic and/or resource-demanding speech tasks. There were no significant relationships between FA and the faster, more automatic speech tasks (RAN of letters and digits, and real word reading). These findings provide evidence for the role of the inferior fronto-occipital fasciculus in tasks that are highly demanding of orthography-phonology translation (e.g., nonword reading) and semantic processing (e.g., RAN object). This demonstrates the importance of the inferior fronto-occipital fasciculus in basic naming and suggests that this tract may be a sensitive predictor of rapid naming performance within the typical population. We discuss the findings in the context of current models of reading and speech production to further characterize the white matter pathways associated with basic reading processes. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.
Abe, Kazuhiro; Takahashi, Toshimitsu; Takikawa, Yoriko; Arai, Hajime; Kitazawa, Shigeru
2011-10-01
Independent component analysis (ICA) can be usefully applied to functional imaging studies to evaluate the spatial extent and temporal profile of task-related brain activity. It requires no a priori assumptions about the anatomical areas that are activated or the temporal profile of the activity. We applied spatial ICA to detect a voluntary but hidden response of silent speech. To validate the method against a standard model-based approach, we used the silent speech of a tongue twister as a 'Yes' response to single questions that were delivered at given times. In the first task, we attempted to estimate one number that was chosen by a participant from 10 possibilities. In the second task, we increased the possibilities to 1000. In both tasks, spatial ICA was as effective as the model-based method for determining the number in the subject's mind (80-90% correct per digit), but spatial ICA outperformed the model-based method in terms of time, especially in the 1000-possibility task. In the model-based method, calculation time increased by 30-fold, to 15 h, because of the necessity of testing 1000 possibilities. In contrast, the calculation time for spatial ICA remained as short as 30 min. In addition, spatial ICA detected an unexpected response that occurred by mistake. This advantage was validated in a third task, with 13 500 possibilities, in which participants had the freedom to choose when to make one of four responses. We conclude that spatial ICA is effective for detecting the onset of silent speech, especially when it occurs unexpectedly. © 2011 The Authors. European Journal of Neuroscience © 2011 Federation of European Neuroscience Societies and Blackwell Publishing Ltd.
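Spatial ICA of the kind used here treats voxels as samples and time points as mixed channels, so the decomposition yields spatially independent maps with associated time courses. The sketch below runs FastICA on synthetic fMRI-like data; it is not the authors' pipeline, and the dimensions, sources, and noise level are arbitrary.

```python
# Spatial ICA sketch on synthetic fMRI-like data (time x voxels).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
n_time, n_vox = 120, 2000
maps = rng.laplace(size=(3, n_vox))        # 3 sparse spatial sources
tcs = rng.normal(size=(n_time, 3))         # their (unknown) time courses
data = tcs @ maps + 0.1 * rng.normal(size=(n_time, n_vox))

# Spatial ICA: voxels are samples, time points are mixed channels,
# so we decompose data.T into independent spatial maps.
ica = FastICA(n_components=3, random_state=0)
est_maps = ica.fit_transform(data.T).T     # components x voxels
est_tcs = ica.mixing_                      # time x components
# A silent-speech event would appear as a task-shaped estimated time
# course for the component whose map covers speech-motor regions.
```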
ERIC Educational Resources Information Center
Brady, Lois Jean; Gonzalez, America X.; Zawadzki, Maciej; Presley, Corinda
2012-01-01
This practical resource is brimming with ideas and guidance for using simple ideas from speech and language pathology and occupational therapy to boost communication, sensory integration, and coordination skills in children on the autism spectrum. Suitable for use in the classroom, at home, and in community settings, it is packed with…
Becker, Johannes; Barbe, Michael T; Hartinger, Mariam; Dembek, Till A; Pochmann, Jil; Wirths, Jochen; Allert, Niels; Mücke, Doris; Hermes, Anne; Meister, Ingo G; Visser-Vandewalle, Veerle; Grice, Martine; Timmermann, Lars
2017-04-01
Deep brain stimulation (DBS) of the ventral intermediate nucleus (VIM) is performed to suppress medically-resistant essential tremor (ET). However, stimulation induced dysarthria (SID) is a common side effect, limiting the extent to which tremor can be suppressed. To date, the exact pathogenesis of SID in VIM-DBS treated ET patients is unknown. We investigate the effect of inactivated, uni- and bilateral VIM-DBS on speech production in patients with ET. We employ acoustic measures, tempo and intelligibility ratings, and patients' self-estimated speech to quantify SID, with a focus on comparing bilateral to unilateral stimulation effects and the effect of electrode position on speech. Sixteen German ET patients participated in this study. Each patient was acoustically recorded with DBS-off, unilateral-right-hemispheric-DBS-on, unilateral-left-hemispheric-DBS-on, and bilateral-DBS-on during an oral diadochokinesis task and the reading of a standard German text. To capture the extent of speech impairment, we measured syllable duration and intensity ratio during the DDK task. Naïve listeners rated speech tempo and speech intelligibility of the read text on a 5-point-scale. Patients had to rate their "ability to speak". We found an effect of bilateral compared to unilateral and inactivated stimulation on syllable durations and intensity ratio, as well as on external intelligibility ratings and patients' VAS scores. Additionally, VAS scores are associated with more laterally located active contacts. For speech ratings, we found an effect of syllable duration such that tempo and intelligibility were rated worse for speakers exhibiting greater syllable durations. Our data confirm that SID is more pronounced under bilateral compared to unilateral stimulation. Laterally located electrodes are associated with more severe SID according to patients' self-ratings. We can confirm the relation between diadochokinetic rate and SID in that listeners' tempo and intelligibility ratings can be predicted by measured syllable durations from DDK tasks. © 2017 International Neuromodulation Society.
Lavigne, Katie M.; Rapin, Lucile A.; Metzak, Paul D.; Whitman, Jennifer C.; Jung, Kwanghee; Dohen, Marion; Lœvenbruck, Hélène; Woodward, Todd S.
2015-01-01
Background: Task-based functional neuroimaging studies of schizophrenia have not yet replicated the increased coordinated hyperactivity in speech-related brain regions that is reported with symptom-capture and resting-state studies of hallucinations. This may be due to suboptimal selection of cognitive tasks. Methods: In the current study, we used a task that allowed experimental manipulation of control over verbal material and compared brain activity between 23 schizophrenia patients (10 hallucinators, 13 nonhallucinators), 22 psychiatric (bipolar), and 27 healthy controls. Two conditions were presented, one involving inner verbal thought (in which control over verbal material was required) and another involving speech perception (SP; in which control over verbal material was not required). Results: A functional connectivity analysis resulted in a left-dominant temporal-frontal network that included speech-related auditory and motor regions and showed hypercoupling in past-week hallucinating schizophrenia patients (relative to nonhallucinating patients) during SP only. Conclusions: These findings replicate our previous work showing generalized speech-related functional network hypercoupling in schizophrenia during inner verbal thought and SP, but extend them by suggesting that hypercoupling is related to past-week hallucination severity scores during SP only, when control over verbal material is not required. This result opens the possibility that practicing control over inner verbal thought processes may decrease the likelihood or severity of hallucinations. PMID:24553150
Gaze transfer in remote cooperation: is it always helpful to see what your partner is attending to?
Müller, Romy; Helmert, Jens R; Pannasch, Sebastian; Velichkovsky, Boris M
2013-01-01
Establishing common ground in remote cooperation is challenging because nonverbal means of ambiguity resolution are limited. In such settings, information about a partner's gaze can support cooperative performance, but it is not yet clear whether and to what extent the abundance of information reflected in gaze comes at a cost. Specifically, in tasks that mainly rely on spatial referencing, gaze transfer might be distracting and leave the partner uncertain about the meaning of the gaze cursor. To examine this question, we let pairs of participants perform a joint puzzle task. One partner knew the solution and instructed the other partner's actions by (1) gaze, (2) speech, (3) gaze and speech, or (4) mouse and speech. Based on these instructions, the acting partner moved the pieces under conditions of high or low autonomy. Performance was better when using either gaze or mouse transfer compared to speech alone. However, in contrast to the mouse, gaze transfer induced uncertainty, evidenced in delayed responses to the cursor. Also, participants tried to resolve ambiguities by engaging in more verbal effort, formulating more explicit object descriptions and fewer deictic references. Thus, gaze transfer seems to increase uncertainty and ambiguity, thereby complicating grounding in this spatial referencing task. The results highlight the importance of closely examining task characteristics when considering gaze transfer as a means of support.
Some Effects of Training on the Perception of Synthetic Speech
Schwab, Eileen C.; Nusbaum, Howard C.; Pisoni, David B.
2012-01-01
The present study was conducted to determine the effects of training on the perception of synthetic speech. Three groups of subjects were tested with synthetic speech using the same tasks before and after training. One group was trained with synthetic speech. A second group went through the identical training procedures using natural speech. The third group received no training. Although performance of the three groups was the same prior to training, significant differences on the post-test measures of word recognition were observed: the group trained with synthetic speech performed much better than the other two groups. A six-month follow-up indicated that the group trained with synthetic speech displayed long-term retention of the knowledge and experience gained with prior exposure to synthetic speech generated by a text-to-speech system. PMID:2936671
Asynchronous sampling of speech with some vocoder experimental results
NASA Technical Reports Server (NTRS)
Babcock, M. L.
1972-01-01
The method of asynchronously sampling speech is based upon the derivatives of the acoustical speech signal. The following results are apparent from experiments to date: (1) It is possible to represent speech by a string of pulses of uniform amplitude, where the only information contained in the string is the spacing of the pulses in time; (2) the string of pulses may be produced in a simple analog manner; (3) the first derivative of the original speech waveform is the most important for the encoding process; (4) the resulting pulse train can be utilized to control an acoustical signal production system to regenerate the intelligence of the original speech.
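One plausible reading of this derivative-based encoding, sketched speculatively below since the abstract does not specify the circuit, is to emit a uniform-amplitude pulse wherever the first derivative of the waveform changes sign, i.e., at each local extremum, so that only pulse timing carries information.

```python
# Speculative sketch of derivative-based asynchronous sampling:
# one uniform-amplitude pulse per local extremum of the waveform.
import numpy as np

def pulse_train(signal):
    d = np.diff(signal)                           # discrete first derivative
    extrema = np.where(np.diff(np.sign(d)) != 0)[0] + 1
    pulses = np.zeros_like(signal)
    pulses[extrema] = 1.0                         # only spacing is informative
    return pulses

# Example: pulse spacing tracks the shrinking local period of a chirp
sr = 8000
t = np.arange(0, 0.05, 1 / sr)
tone = np.sin(2 * np.pi * (200 + 2000 * t) * t)
print(pulse_train(tone).sum(), "pulses in", len(t), "samples")
```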
Separation of trait and state in stuttering.
Connally, Emily L; Ward, David; Pliatsikas, Christos; Finnegan, Sarah; Jenkinson, Mark; Boyles, Rowan; Watkins, Kate E
2018-04-06
Stuttering is a disorder in which the smooth flow of speech is interrupted. People who stutter show structural and functional abnormalities in the speech and motor system. It is unclear whether functional differences reflect general traits of the disorder or are specifically related to the dysfluent speech state. We used a hierarchical approach to separate state and trait effects within stuttering. We collected sparse-sampled functional MRI during two overt speech tasks (sentence reading and picture description) in 17 people who stutter and 16 fluent controls. Separate analyses identified indicators of: (1) general traits of people who stutter; (2) frequency of dysfluent speech states in subgroups of people who stutter; and (3) the differences between fluent and dysfluent states in people who stutter. We found that reduced activation of left auditory cortex, inferior frontal cortex bilaterally, and medial cerebellum were general traits that distinguished fluent speech in people who stutter from that of controls. The stuttering subgroup with higher frequency of dysfluent states during scanning (n = 9) had reduced activation in the right subcortical grey matter, left temporo-occipital cortex, the cingulate cortex, and medial parieto-occipital cortex relative to the subgroup who were more fluent (n = 8). Finally, during dysfluent states relative to fluent ones, there was greater activation of inferior frontal and premotor cortex extending into the frontal operculum, bilaterally. The above differences were seen across both tasks. Subcortical state effects differed according to the task. Overall, our data emphasise the independence of trait and state effects in stuttering. © 2018 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
Higgins, Paul; Searchfield, Grant; Coad, Gavin
2012-06-01
The aim of this study was to determine which level-dependent hearing aid digital signal-processing strategy (DSP) participants preferred when listening to music and/or performing a speech-in-noise task. Two receiver-in-the-ear hearing aids were compared: one using 32-channel adaptive dynamic range optimization (ADRO) and the other wide dynamic range compression (WDRC) incorporating dual fast (4 channel) and slow (15 channel) processing. The manufacturers' first-fit settings based on participants' audiograms were used in both cases. Results were obtained from 18 participants on a quick speech-in-noise (QuickSIN; Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) task and for 3 music listening conditions (classical, jazz, and rock). Participants preferred the quality of music and performed better at the QuickSIN task using the hearing aids with ADRO processing. A potential reason for the better performance of the ADRO hearing aids was less fluctuation in output with change in sound dynamics. ADRO processing has advantages for both music quality and speech recognition in noise over the multichannel WDRC processing that was used in the study. Further evaluations of which DSP aspects contribute to listener preference are required.
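For orientation, the WDRC side of this comparison reduces to a static compression curve: above a threshold, output level grows by only 1/ratio dB per input dB. The toy function below illustrates that idea with arbitrary threshold and ratio values; real aids, including those in this study, apply multichannel, time-varying gains.

```python
# Toy static wide dynamic range compression (WDRC) curve; schematic only.
import numpy as np

def wdrc_gain_db(level_db, threshold_db=45.0, ratio=3.0):
    """Above threshold, output rises 1/ratio dB per input dB (gain in dB)."""
    level_db = np.asarray(level_db, dtype=float)
    over = np.maximum(level_db - threshold_db, 0.0)
    return -over * (1.0 - 1.0 / ratio)

for lvl in (30, 50, 70, 90):
    print(lvl, "dB in ->", lvl + wdrc_gain_db(lvl), "dB out")
```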
Patterns of lung volume use during an extemporaneous speech task in persons with Parkinson disease.
Bunton, Kate
2005-01-01
This study examined patterns of lung volume use in speakers with Parkinson disease (PD) during an extemporaneous speaking task. The performance of a control group was also examined. Behaviors described are based on acoustic, kinematic and linguistic measures. Group differences were found in breath group duration, lung volume initiation, and lung volume termination measures. Speakers in the control group alternated between longer and shorter breath groups, with starting lung volumes higher for the longer breath groups and lower for the shorter ones. Speech production was terminated before reaching tidal end expiratory level (EEL). This pattern was also seen in 4 of 7 speakers with PD. The remaining 3 PD speakers initiated speech at low starting lung volumes and continued speaking below EEL. This subgroup of PD speakers ended breath groups at agrammatical boundaries, whereas control speakers ended at appropriate grammatical boundaries. As a result of participating in this exercise, the reader will (1) be able to describe the patterns of lung volume use in speakers with Parkinson disease and compare them with those employed by control speakers; and (2) obtain information about the influence of speaking task on speech breathing.
NASA Astrophysics Data System (ADS)
Saweikis, Meghan; Surprenant, Aimée M.; Davies, Patricia; Gallant, Don
2003-10-01
While young and old subjects with comparable audiograms tend to perform comparably on speech recognition tasks in quiet environments, the older subjects have more difficulty than the younger subjects with recognition tasks in degraded listening conditions. This suggests that factors other than an absolute threshold may account for some of the difficulty older listeners have on recognition tasks in noisy environments. Many metrics used to measure speech intelligibility, including the Speech Intelligibility Index (SII), consider only an absolute threshold when accounting for age-related hearing loss. Therefore, these metrics tend to overestimate performance for elderly listeners in noisy environments [Tobias et al., J. Acoust. Soc. Am. 83, 859-895 (1988)]. The present studies examine the predictive capabilities of the SII in an environment with automobile noise present. This is of interest because people's evaluation of the automobile interior sound is closely linked to their ability to carry on conversations with their fellow passengers. The four studies examine whether, for subjects with age-related hearing loss, the accuracy of the SII can be improved by incorporating factors other than an absolute threshold into the model. [Work supported by Ford Motor Company.]
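Since these studies turn on what the SII does and does not model, a minimal sketch of its band-audibility form may help. The (speech − disturbance + 15)/30 audibility rule follows the general shape of the standardized index, but the band-importance weights and levels passed in are illustrative assumptions, not values from ANSI S3.5 or from these studies.

```python
import numpy as np

def sii(speech_db, noise_db, threshold_db, importance):
    """Simplified band form of the Speech Intelligibility Index:
    SII = sum_i I_i * A_i, where I_i are band-importance weights
    (summing to 1) and A_i is band audibility, approximated from the
    speech level margin over the larger of noise and hearing threshold."""
    disturbance = np.maximum(noise_db, threshold_db)   # masking or hearing loss
    audibility = np.clip(
        (np.asarray(speech_db, float) - disturbance + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(np.asarray(importance, float) * audibility))

# Three illustrative bands: speech levels, car-noise levels, thresholds, weights.
print(sii([55, 50, 45], [50, 40, 25], [15, 20, 40], [0.3, 0.4, 0.3]))
```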
Leong, Victoria; Goswami, Usha
2014-02-01
Developmental dyslexia is associated with rhythmic difficulties, including impaired perception of beat patterns in music and prosodic stress patterns in speech. Spoken prosodic rhythm is cued by slow (<10 Hz) fluctuations in speech signal amplitude. Impaired neural oscillatory tracking of these slow amplitude modulation (AM) patterns is one plausible source of impaired rhythm tracking in dyslexia. Here, we characterise the temporal profile of the dyslexic rhythm deficit by examining rhythmic entrainment at multiple speech timescales. Adult dyslexic participants completed two experiments aimed at testing the perception and production of speech rhythm. In the perception task, participants tapped along to the beat of 4 metrically-regular nursery rhyme sentences. In the production task, participants produced the same 4 sentences in time to a metronome beat. Rhythmic entrainment was assessed using both traditional rhythmic indices and a novel AM-based measure, which utilised 3 dominant AM timescales in the speech signal each associated with a different phonological grain-sized unit (0.9-2.5 Hz, prosodic stress; 2.5-12 Hz, syllables; 12-40 Hz, phonemes). The AM-based measure revealed atypical rhythmic entrainment by dyslexic participants to syllable patterns in speech, in perception and production. In the perception task, both groups showed equally strong phase-locking to Syllable AM patterns, but dyslexic responses were entrained to a significantly earlier oscillatory phase angle than controls. In the production task, dyslexic utterances showed shorter syllable intervals, and differences in Syllable:Phoneme AM cross-frequency synchronisation. Our data support the view that rhythmic entrainment at slow (∼5 Hz, Syllable) rates is atypical in dyslexia, suggesting that neural mechanisms for syllable perception and production may also be atypical. These syllable timing deficits could contribute to the atypical development of phonological representations for spoken words, the central cognitive characteristic of developmental dyslexia across languages. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
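One way to approximate the AM-based entrainment measure described above: extract a wideband amplitude envelope, bandpass it into the three named AM timescales, and compute phase-locking between a stimulus and a response phase series. This is a hedged sketch; the Hilbert-envelope front end, sampling rates, and filter order are assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

# Band edges follow the abstract; everything else is assumed.
AM_BANDS = {"stress": (0.9, 2.5), "syllable": (2.5, 12.0), "phoneme": (12.0, 40.0)}

def am_envelope(signal, fs, env_fs=200):
    """Wideband amplitude envelope, decimated for slow-AM analysis."""
    env = np.abs(hilbert(signal))
    return env[:: fs // env_fs], env_fs

def band_phase(env, env_fs, lo, hi):
    """Instantaneous phase of the envelope within one AM band."""
    b, a = butter(2, [lo / (env_fs / 2), hi / (env_fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, env)))

def plv(phase_a, phase_b):
    """Phase-locking value: 1 = perfect phase locking, 0 = none."""
    return float(np.abs(np.mean(np.exp(1j * (phase_a - phase_b)))))

# e.g. plv(band_phase(env, env_fs, *AM_BANDS["syllable"]), tap_phase)
```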
Marchina, Sarah; Norton, Andrea; Kumar, Sandeep; Schlaug, Gottfried
2018-01-01
Functional imaging studies have provided insight into the effect of rate on production of syllables, pseudowords, and naturalistic speech, but the influence of rate on repetition of commonly-used words/phrases suitable for therapeutic use merits closer examination. Aim: To identify speech-motor regions responsive to rate and test the hypothesis that those regions would provide greater support as rates increase, we used an overt speech repetition task and functional magnetic resonance imaging (fMRI) to capture rate-modulated activation within speech-motor regions and determine whether modulations occur linearly and/or show hemispheric preference. Methods: Twelve healthy, right-handed adults participated in an fMRI task requiring overt repetition of commonly-used words/phrases at rates of 1, 2, and 3 syllables/second (syll./sec.). Results: Across all rates, bilateral activation was found both in ventral portions of primary sensorimotor cortex and middle and superior temporal regions. A repeated measures analysis of variance with pairwise comparisons revealed an overall difference between rates in temporal lobe regions of interest (ROIs) bilaterally (p < 0.001); all six comparisons reached significance (p < 0.05). Five of the six were highly significant (p < 0.008), while the left-hemisphere 2- vs. 3-syll./sec. comparison, though still significant, was less robust (p = 0.037). Temporal ROI mean beta-values increased linearly across the three rates bilaterally. Significant rate effects observed in the temporal lobes were slightly more pronounced in the right hemisphere. No significant overall rate differences were seen in sensorimotor ROIs, nor was there a clear hemispheric effect. Conclusion: Linear effects in superior temporal ROIs suggest that sensory feedback corresponds directly to task demands. The lesser degree of significance in left-hemisphere activation at the faster, closer-to-normal rate may represent an increase in neural efficiency (and therefore, decreased demand) when the task so closely approximates a highly-practiced function. The presence of significant bilateral activation during overt repetition of words/phrases at all three rates suggests that repetition-based speech production may draw support from either or both hemispheres. This bihemispheric redundancy in regions associated with speech-motor control and their sensitivity to changes in rate may play an important role in interventions for nonfluent aphasia and other fluency disorders, particularly when right-hemisphere structures are the sole remaining pathway for production of meaningful speech.
Reliability of functional MR imaging with word-generation tasks for mapping Broca's area.
Brannen, J H; Badie, B; Moritz, C H; Quigley, M; Meyerand, M E; Haughton, V M
2001-10-01
Functional MR (fMR) imaging of word generation has been used to map Broca's area in some patients selected for craniotomy. The purpose of this study was to measure the reliability, precision, and accuracy of word-generation tasks to identify Broca's area. The Brodmann areas activated during performance of word-generation tasks were tabulated in 34 consecutive patients referred for fMR imaging mapping of language areas. In patients performing two iterations of the letter word-generation tasks, test-retest reliability was quantified by using the concurrence ratio (CR), or the number of voxels activated by each iteration in proportion to the average number of voxels activated from both iterations of the task. Among patients who also underwent category or antonym word generation or both, the similarity of the activation from each task was assessed with the CR. In patients who underwent electrocortical stimulation (ECS) mapping of speech function during craniotomy while awake, the sites with speech function were compared with the locations of activation found during fMR imaging of word generation. In 31 of 34 patients, activation was identified in the inferior frontal gyri or middle frontal gyri or both in Brodmann areas 9, 44, 45, or 46, unilaterally or bilaterally, with one or more of the tasks. Activation was noted in the same gyri when the patient performed a second iteration of the letter word-generation task or second task. The CR for pixel precision in a single section averaged 49%. In patients who underwent craniotomy while awake, speech areas located with ECS coincided with areas of the brain activated during a word-generation task. fMR imaging with word-generation tasks produces technically satisfactory maps of Broca's area, which localize the area accurately and reliably.
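The concurrence ratio admits a direct computation. The sketch below encodes one plausible reading of the definition given above (voxels activated in both iterations, divided by the average activated-voxel count of the two iterations); the study's exact operationalization may differ.

```python
import numpy as np

def concurrence_ratio(map1, map2):
    """Test-retest concurrence ratio for two binary activation maps:
    voxels active in BOTH iterations, divided by the average number of
    active voxels across the two iterations (one reading of the text)."""
    m1, m2 = np.asarray(map1, bool), np.asarray(map2, bool)
    average_active = (m1.sum() + m2.sum()) / 2.0
    return float((m1 & m2).sum() / average_active) if average_active else 0.0
```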
Decoding spectrotemporal features of overt and covert speech from the human cortex
Martin, Stéphanie; Brunner, Peter; Holdgraf, Chris; Heinze, Hans-Jochen; Crone, Nathan E.; Rieger, Jochem; Schalk, Gerwin; Knight, Robert T.; Pasley, Brian N.
2014-01-01
Auditory perception and auditory imagery have been shown to activate overlapping brain regions. We hypothesized that these phenomena also share a common underlying neural representation. To assess this, we used intracranial electrocorticographic recordings from epileptic patients performing an out-loud or a silent reading task. In these tasks, short stories scrolled across a video screen in two conditions: subjects read the same stories both aloud (overt) and silently (covert). In a control condition the subject remained in a resting state. We first built a high gamma (70–150 Hz) neural decoding model to reconstruct spectrotemporal auditory features of self-generated overt speech. We then evaluated whether this same model could reconstruct auditory speech features in the covert speech condition. Two speech models were tested: a spectrogram and a modulation-based feature space. For the overt condition, reconstruction accuracy was evaluated as the correlation between original and predicted speech features, and was significant in each subject (p < 10⁻⁵; paired two-sample t-test). For the covert speech condition, dynamic time warping was first used to realign the covert speech reconstruction with the corresponding original speech from the overt condition. Reconstruction accuracy was then evaluated as the correlation between original and reconstructed speech features. Covert reconstruction accuracy was compared to the accuracy obtained from reconstructions in the baseline control condition. Reconstruction accuracy for the covert condition was significantly better than for the control condition (p < 0.005; paired two-sample t-test). The superior temporal gyrus and pre- and post-central gyri provided the highest reconstruction information. The relationship between overt and covert speech reconstruction depended on anatomy. These results provide evidence that auditory representations of covert speech can be reconstructed from models that are built from an overt speech data set, supporting a partially shared neural substrate. PMID:24904404
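Reconstruction accuracy as described here, the correlation between original and predicted spectrotemporal features, can be sketched as follows. The covert reconstructions are assumed to have been realigned with dynamic time warping beforehand, as in the abstract; the time-by-frequency array layout and averaging across bins are assumptions.

```python
import numpy as np

def reconstruction_accuracy(original, predicted):
    """Mean Pearson correlation over frequency bins between original and
    predicted features (arrays of shape time x frequency). For the covert
    condition the inputs are assumed already DTW-aligned to the overt data."""
    orig, pred = np.asarray(original, float), np.asarray(predicted, float)
    rs = [np.corrcoef(orig[:, f], pred[:, f])[0, 1] for f in range(orig.shape[1])]
    return float(np.nanmean(rs))
```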
Speech intelligibility after glossectomy and speech rehabilitation.
Furia, C L; Kowalski, L P; Latorre, M R; Angelis, E C; Martins, N M; Barros, A P; Ribeiro, K C
2001-07-01
Oral tumor resections cause articulation deficiencies, depending on the site, extent of resection, type of reconstruction, and tongue stump mobility. To evaluate the speech intelligibility of patients undergoing total, subtotal, or partial glossectomy, before and after speech therapy. Twenty-seven patients (24 men and 3 women), aged 34 to 77 years (mean age, 56.5 years), underwent glossectomy. Tumor stages were T1 in 3 patients, T2 in 4, T3 in 8, T4 in 11, and TX in 1; node stages, N0 in 15 patients, N1 in 5, N2a-c in 6, and N3 in 1. No patient had metastases (M0). Patients were divided into 3 groups by extent of tongue resection, ie, total (group 1; n = 6), subtotal (group 2; n = 9), and partial (group 3; n = 12). Different phonological tasks (the 7 sustained oral vowels, a vowel in a syllable, and the vowel-consonant-vowel [VCV] sequence) were recorded and analyzed by 3 experienced judges. The intelligibility of spontaneous speech (sequence story) was scored from 1 to 4 by consensus. All patients underwent a therapeutic program to activate articulatory adaptations, compensations, and maximization of the remaining structures for 3 to 6 months. The tasks were recorded after speech therapy. To compare mean changes, analyses of variance and Wilcoxon tests were used. Patients of groups 1 and 2 significantly improved their speech intelligibility (P<.05). Group 1 improved vowels, VCV, and spontaneous speech; group 2, syllable, VCV, and spontaneous speech. Group 3 demonstrated better intelligibility in the pretherapy phase, but the improvement after therapy was not significant. Speech therapy was effective in improving speech intelligibility of patients undergoing glossectomy, even after major resection. Different pretherapy ability between groups was seen, with improvement of speech intelligibility in groups 1 and 2. The improvement of speech intelligibility in group 3 was not statistically significant, possibly because of the small and heterogeneous sample.
Stager, Sheila V; Jeffries, Keith J; Braun, Allen R
2003-01-01
We used H₂¹⁵O PET to characterize the common features of two successful but markedly different fluency-evoking conditions -- paced speech and singing -- in order to identify brain mechanisms that enable fluent speech in people who stutter (PWS). To do so, we compared responses under fluency-evoking conditions with responses elicited by tasks that typically elicit dysfluent speech (quantifying the degree of stuttering and using this measure as a confounding covariate in our analyses). We evaluated task-related activations in both stuttering subjects and age- and gender-matched controls. Areas that were either uniquely activated during fluency-evoking conditions, or in which the magnitude of activation was significantly greater during fluency-evoking than dysfluency-evoking tasks, included auditory association areas that process speech and voice and motor regions related to control of the larynx and oral articulators. This suggests that a common fluency-evoking mechanism might relate to more effective coupling of auditory and motor systems -- that is, more efficient self-monitoring, allowing motor areas to more effectively modify speech. These effects were seen in both PWS and controls, suggesting that they are due to the sensorimotor or cognitive demands of the fluency-evoking tasks themselves. While responses seen in both groups were bilateral, however, the fluency-evoking tasks elicited more robust activation of auditory and motor regions within the left hemisphere of stuttering subjects, suggesting a role for the left hemisphere in compensatory processes that enable fluency. The reader will learn about and be able to: (1) compare brain activation patterns under fluency- and dysfluency-evoking conditions in stuttering and control subjects; (2) appraise the common features, both central and peripheral, of fluency-evoking conditions; and (3) discuss ways in which neuroimaging methods can be used to understand the pathophysiology of stuttering.
Masson-Carro, Ingrid; Goudbeek, Martijn; Krahmer, Emiel
2016-10-01
Past research has sought to elucidate how speakers and addressees establish common ground in conversation, yet few studies have focused on how visual cues such as co-speech gestures contribute to this process. Likewise, the effect of cognitive constraints on multimodal grounding remains to be established. This study addresses the relationship between the verbal and gestural modalities during grounding in referential communication. We report data from a collaborative task where repeated references were elicited, and a time constraint was imposed to increase cognitive load. Our results reveal no differential effects of repetition or cognitive load on the semantic-based gesture rate, suggesting that representational gestures and speech are closely coordinated during grounding. However, gestures and speech differed in their execution, especially under time pressure. We argue that speech and gesture are two complementary streams that might be planned in conjunction but that unfold independently in later stages of language production, with speakers emphasizing the form of their gestures, but not of their words, to better meet the goals of the collaborative task. Copyright © 2016 Cognitive Science Society, Inc.
1996-01-01
These guidelines are an official statement of the American Speech-Language-Hearing Association. They provide guidance on the training, credentialing, use, and supervision of one category of support personnel in speech-language pathology: speech-language pathology assistants. Guidelines are not official standards of the Association. They were developed by the Task Force on Support Personnel: Dennis J. Arnst, Kenneth D. Barker, Ann Olsen Bird, Sheila Bridges, Linda S. DeYoung, Katherine Formichella, Nena M. Germany, Gilbert C. Hanke, Ann M. Horton, DeAnne M. Owre, Sidney L. Ramsey, Cathy A. Runnels, Brenda Terrell, Gerry W. Werven, Denise West, Patricia A. Mercaitis (consultant), Lisa C. O'Connor (consultant), Frederick T. Spahr (coordinator), Diane Paul-Brown (associate coordinator), Ann L. Carey (Executive Board liaison). The 1994 guidelines supersede the 1981 guidelines entitled, "Guidelines for the Employment and Utilization of Supportive Personnel" (Asha, March 1981, 165-169). Refer to the 1995 position statement on the "Training, Credentialing, Use, and Supervision of Support Personnel in Speech-Language Pathology" (Asha, 37 [Suppl. 14], 21).
Sowman, Paul F; Flavel, Stanley C; McShane, Christie L; Sakuma, Shigemitsu; Miles, Timothy S; Nordstrom, Michael A
2009-07-01
Like most of the cranial muscles involved in speech, the trigeminally innervated anterior digastric muscles are controlled by descending corticobulbar projections from the primary motor cortex (M1) of each hemisphere. We hypothesized that changes in corticobulbar M1 excitability during speech production would show a hemispheric asymmetry favoring the left side, which is the dominant hemisphere for language processing in most strongly right-handed subjects. Fifteen volunteers aged 24.5 ± 5.3 (SD) yr participated. All subjects were strongly right-handed as reported by questionnaire. A surface electromyogram (EMG) was recorded bilaterally from the digastrics, and jaw movement was detected by an accelerometer attached to a lower incisor. Focal transcranial magnetic stimulation (TMS) was used to assess corticomotor excitability of the digastric representation in M1 of both hemispheres during four tasks: 1) static isometric contraction of digastrics; 2) speaking a single word; 3) visually guided, nonspeech jaw movement that matched the jaw kinematics recorded during task 2; and 4) reciting a sentence. Background EMG was well matched in all tasks and jaw kinematics were similar around the time of the TMS pulse for tasks 2-4. TMS resting thresholds and digastric motor-evoked potential (MEP) size during isometric contraction did not differ for TMS over left versus right M1. MEPs elicited by TMS over the left, but not the right, M1 increased in size during speech and nonspeech jaw movement compared with isometric contraction. We conclude that left corticobulbar M1 is preferentially engaged for descending control of digastric muscles during speech and the performance of a rapid jaw movement to match a target kinematic profile.
Integrated Speech and Language Technology for Intelligence, Surveillance, and Reconnaissance (ISR)
2017-07-01
applying submodularity techniques to address computing challenges posed by large datasets in speech and language processing. MT and speech tools were...aforementioned research-oriented activities, the IT system administration team provided necessary support to laboratory computing and network operations...operations of SCREAM Lab computer systems and networks. Other miscellaneous activities in relation to Task Order 29 are presented in an additional fourth
Intelligibility and Acceptability Testing for Speech Technology
1992-05-22
information in memory (Luce, Feustel, and Pisoni, 1983). In high workload or multiple task situations, the added effort of listening to degraded speech can lead...the DRT provides diagnostic feature scores on six phonemic features: voicing, nasality, sustention, sibilation, graveness, and compactness, and on a...of other speech materials (e.g., polysyllabic words, paragraphs) and methods (memory, comprehension, reaction time) have been used to evaluate the
ERIC Educational Resources Information Center
Snellings, Patrick; van der Leij, Aryan; Blok, Henk; de Jong, Peter F.
2010-01-01
This study investigated the role of speech perception accuracy and speed in fluent word decoding of reading disabled (RD) children. A same-different phoneme discrimination task with natural speech tested the perception of single consonants and consonant clusters by young but persistent RD children. RD children were slower than chronological age…
Improving speech perception in noise with current focusing in cochlear implant users.
Srinivasan, Arthi G; Padilla, Monica; Shannon, Robert V; Landsberger, David M
2013-05-01
Cochlear implant (CI) users typically have excellent speech recognition in quiet but struggle with understanding speech in noise. It is thought that broad current spread from stimulating electrodes causes adjacent electrodes to activate overlapping populations of neurons which results in interactions across adjacent channels. Current focusing has been studied as a way to reduce spread of excitation, and therefore, reduce channel interactions. In particular, partial tripolar stimulation has been shown to reduce spread of excitation relative to monopolar stimulation. However, the crucial question is whether this benefit translates to improvements in speech perception. In this study, we compared speech perception in noise with experimental monopolar and partial tripolar speech processing strategies. The two strategies were matched in terms of number of active electrodes, microphone, filterbanks, stimulation rate and loudness (although both strategies used a lower stimulation rate than typical clinical strategies). The results of this study showed a significant improvement in speech perception in noise with partial tripolar stimulation. All subjects benefited from the current focused speech processing strategy. There was a mean improvement in speech recognition threshold of 2.7 dB in a digits in noise task and a mean improvement of 3 dB in a sentences in noise task with partial tripolar stimulation relative to monopolar stimulation. Although the experimental monopolar strategy was worse than the clinical, presumably due to different microphones, frequency allocations and stimulation rates, the experimental partial-tripolar strategy, which had the same changes, showed no acute deficit relative to the clinical. Copyright © 2013 Elsevier B.V. All rights reserved.
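The partial tripolar geometry referred to above has a standard form: the active electrode carries the full current, and each flanking electrode carries a compensating current scaled by a focusing coefficient sigma, with the remainder returning extracochlearly. The sketch below illustrates that pattern; the electrode count, sigma value, and units are illustrative, and the active electrode is assumed to be an interior one.

```python
import numpy as np

def partial_tripolar_pattern(n_electrodes, active, amplitude, sigma):
    """Per-electrode intracochlear current for partial tripolar stimulation.
    sigma = 0 reduces to monopolar; sigma = 1 to full tripolar; in between,
    a fraction (1 - sigma) of the return current flows to an extracochlear
    ground. `active` must index an interior electrode in this sketch."""
    currents = np.zeros(n_electrodes)
    currents[active] = amplitude
    currents[active - 1] = currents[active + 1] = -sigma * amplitude / 2.0
    return currents

print(partial_tripolar_pattern(8, active=4, amplitude=1.0, sigma=0.75))
```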
2016-03-01
manual rather than verbal responses. The coordinate response measure (CRM) task and speech corpus is a highly simplified form of the command and...in multi-talker speech experiments. The CRM corpus is a collection of recorded command utterances in the form of Ready <Callsign> go to <Color...In the two-talker CRM listening task, participants respond to commands by pointing to the appropriate Color/Digit pair on a computer display. A
Alm, Magnus; Behne, Dawn
2015-01-01
Gender and age have been found to affect adults' audio-visual (AV) speech perception. However, research on adult aging focuses on adults over 60 years, who have an increasing likelihood for cognitive and sensory decline, which may confound positive effects of age-related AV-experience and its interaction with gender. Observed age and gender differences in AV speech perception may also depend on measurement sensitivity and AV task difficulty. Consequently, both AV benefit and visual influence were used to measure visual contribution for gender-balanced groups of young (20–30 years) and middle-aged adults (50–60 years), with task difficulty varied using AV syllables from different talkers in alternative auditory backgrounds. Females had better speech-reading performance than males. Whereas no gender differences in AV benefit or visual influence were observed for young adults, visually influenced responses were significantly greater for middle-aged females than middle-aged males. That speech-reading performance did not influence AV benefit may be explained by visual speech extraction and AV integration constituting independent abilities. Contrastingly, the gender difference in visually influenced responses in middle adulthood may reflect an experience-related shift in females' general AV perceptual strategy. Although young females' speech-reading proficiency may not readily contribute to greater visual influence, between young and middle adulthood, recurrent confirmation of the contribution of visual cues induced by speech-reading proficiency may gradually shift females' AV perceptual strategy toward more visually dominated responses. PMID:26236274
Cohen-Mimran, Ravit; Sapir, Shimon
2008-01-01
To assess the relationships between central auditory processing (CAP) of sinusoidally modulated speech-like and non-speech acoustic signals and reading skills in shallow (pointed) and deep (unpointed) Hebrew orthographies. Twenty unselected fifth-grade Hebrew speakers performed a rate change detection (RCD) task using the aforementioned acoustic signals. They also performed reading and general ability (IQ) tests. After controlling for general ability, RCD tasks contributed a significant unique variance to the decoding skills. In addition, there was a fairly strong correlation between the score on the RCD with the speech-like stimuli and the unpointed text reading score. CAP abilities may affect reading skills, depending on the nature of orthography (deep vs shallow), at least in the Hebrew language.
Studer-Eichenberger, Esther; Studer-Eichenberger, Felix; Koenig, Thomas
2016-01-01
The objectives of the present study were to investigate temporal/spectral sound-feature processing in preschool children (4 to 7 years old) with peripheral hearing loss compared with age-matched controls. The results verified the presence of statistical learning, which was diminished in children with hearing impairments (HIs), and elucidated possible perceptual mediators of speech production. Perception and production of the syllables /ba/, /da/, /ta/, and /na/ were recorded in 13 children with normal hearing and 13 children with HI. Perception was assessed physiologically through event-related potentials (ERPs) recorded by EEG in a multifeature mismatch negativity paradigm and behaviorally through a discrimination task. Temporal and spectral features of the ERPs during speech perception were analyzed, and speech production was quantitatively evaluated using speech motor maximum performance tasks. Proximal to stimulus onset, children with HI displayed a difference in map topography, indicating diminished statistical learning. In later ERP components, children with HI exhibited reduced amplitudes in the N2 and early parts of the late discriminative negativity components specifically, which are associated with temporal and spectral control mechanisms. Abnormalities of speech perception were only subtly reflected in speech production, as the lone difference found in the speech production measures was a mild delay in regulating speech intensity. In addition to previously reported deficits of sound-feature discriminations, the present study results reflect diminished statistical learning in children with HI, which plays an early and important, but so far neglected, role in phonological processing. Furthermore, the lack of corresponding behavioral abnormalities in speech production implies that impaired perceptual capacities do not necessarily translate into productive deficits.
Speech perception and production in severe environments
NASA Astrophysics Data System (ADS)
Pisoni, David B.
1990-09-01
The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes were examined in speech production under these adverse conditions through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.
Clear speech and lexical competition in younger and older adult listeners.
Van Engen, Kristin J
2017-08-01
This study investigated whether clear speech reduces the cognitive demands of lexical competition by crossing speaking style with lexical difficulty. Younger and older adults identified more words in clear versus conversational speech and more easy words than hard words. An initial analysis suggested that the effect of lexical difficulty was reduced in clear speech, but more detailed analyses within each age group showed this interaction was significant only for older adults. The results also showed that both groups improved over the course of the task and that clear speech was particularly helpful for individuals with poorer hearing: for younger adults, clear speech eliminated hearing-related differences that affected performance on conversational speech. For older adults, clear speech was generally more helpful to listeners with poorer hearing. These results suggest that clear speech affords perceptual benefits to all listeners and, for older adults, mitigates the cognitive challenge associated with identifying words with many phonological neighbors.
An integrated approach to improving noisy speech perception
NASA Astrophysics Data System (ADS)
Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail
2002-05-01
For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve the intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques removing and/or reducing various kinds of distortions and interference (primarily unmasking and normalization in the time and frequency fields), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality ultimate results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to decode noisy speech recordings for courts, law enforcement and emergency services, accident investigation bodies, etc.
Task-related fMRI in hemiplegic cerebral palsy-A systematic review.
Gaberova, Katerina; Pacheva, Iliyana; Ivanov, Ivan
2018-04-27
Functional magnetic resonance imaging (fMRI) is used widely to study reorganization after early brain injuries. Unilateral cerebral palsy (UCP) is an appealing model for studying brain plasticity by fMRI. To summarize the results of task-related fMRI studies in UCP in order to gain a better understanding of the mechanisms of neuroplasticity of the developing brain and its reorganization potential, and to better translate this knowledge to clinical practice. A systematic search was conducted on the PubMed database using the keywords "cerebral palsy", "congenital hemiparesis", "unilateral", "magnetic resonance imaging", "fMRI", "reorganization", and "plasticity". The exclusion criteria were as follows: case reports; reviews; studies exploring non-UCP patients; and studies with results of rehabilitation. We found 7 articles investigating sensory tasks, 9 investigating motor tasks, and 12 investigating speech tasks. Ipsilesional reorganization is dominant in sensory tasks (in 74/77 patients), with contralesional reorganization in only 3/77. In motor tasks, activation is bilateral in 64/83, only contralesional in 11/83, and only ipsilesional in 8/83. Speech perception is bilateral in 35/51, only or dominantly ipsilesional (left-sided) in 8/51, and dominantly contralesional (right-sided) in 8/51. Speech production is only or dominantly contralesional (right-sided) in 88/130, bilateral in 26/130, and only or dominantly ipsilesional (left-sided) in 16/130. The sensory system is the most "rigid" to reorganization, probably due to the absence of ipsilateral (contralesional) primary somatosensory representation. The motor system is more "flexible" due to ipsilateral (contralesional) motor pathways. Speech perception and production show greater flexibility, resulting in more bilateral or contralateral activation. The models of reorganization are variable, depending on the development and function of each neural system and the extent and timing of the damage. The plasticity patterns may guide therapeutic intervention and prognostics, thus proving the fruitfulness of the translational approach in neurosciences. © 2018 John Wiley & Sons, Ltd.
How visual timing and form information affect speech and non-speech processing.
Kim, Jeesun; Davis, Chris
2014-10-01
Auditory speech processing is facilitated when the talker's face/head movements are seen. This effect is typically explained in terms of visual speech providing form and/or timing information. We determined the effect of both types of information on a speech/non-speech task (non-speech stimuli were spectrally rotated speech). All stimuli were presented paired with the talker's static or moving face. Two types of moving face stimuli were used: full-face versions (both spoken form and timing information available) and modified face versions (only timing information provided by peri-oral motion available). The results showed that the peri-oral timing information facilitated response time for speech and non-speech stimuli compared to a static face. An additional facilitatory effect was found for full-face versions compared to the timing condition; this effect only occurred for speech stimuli. We propose the timing effect was due to cross-modal phase resetting; the form effect to cross-modal priming. Copyright © 2014 Elsevier Inc. All rights reserved.
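Spectrally rotated speech, the non-speech control used above, inverts the spectrum so that the amplitude envelope and spectral complexity of speech are preserved while intelligibility is destroyed. A crude frequency-domain stand-in is sketched below; published work typically rotates by amplitude modulation against a carrier followed by low-pass filtering, and the 4-kHz cutoff here is an assumption.

```python
import numpy as np

def spectrally_rotate(signal, fs, cutoff_hz=4000.0):
    """Crudely rotate the spectrum about cutoff_hz / 2 by mirroring the
    FFT bins below cutoff_hz; output stays real via the rfft/irfft pair."""
    spec = np.fft.rfft(signal)
    k = int(cutoff_hz * len(signal) / fs)   # bin index at the cutoff
    spec[:k] = spec[:k][::-1].copy()        # mirror the low-frequency bins
    return np.fft.irfft(spec, n=len(signal))
```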
Children perceive speech onsets by ear and eye*
JERGER, SUSAN; DAMIAN, MARKUS F.; TYE-MURRAY, NANCY; ABDI, HERVÉ
2016-01-01
Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see dynamic face articulate consonant/rhyme b/ag; hear non-intact onset/rhyme: −b/ag) vs. auditorily (see still face; hear exactly same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children – like adults – perceive speech onsets multimodally. Findings are critical for incorporating visual speech into developmental theories of speech perception. PMID:26752548
Walking the talk--speech activates the leg motor cortex.
Liuzzi, Gianpiero; Ellger, Tanja; Flöel, Agnes; Breitenstein, Caterina; Jansen, Andreas; Knecht, Stefan
2008-09-01
Speech may have evolved from earlier modes of communication based on gestures. Consistent with such a motor theory of speech, cortical orofacial and hand motor areas are activated by both speech production and speech perception. However, the extent of speech-related activation of the motor cortex remains unclear. Therefore, we examined if reading and listening to continuous prose also activates non-brachiofacial motor representations like the leg motor cortex. We found corticospinal excitability of bilateral leg muscle representations to be enhanced by speech production and silent reading. Control experiments showed that speech production yielded stronger facilitation of the leg motor system than non-verbal tongue-mouth mobilization and silent reading more than a visuo-attentional task thus indicating speech-specificity of the effect. In the frame of the motor theory of speech this finding suggests that the system of gestural communication, from which speech may have evolved, is not confined to the hand but includes gestural movements of other body parts as well.
Loudness and pitch of Kunqu opera.
Dong, Li; Sundberg, Johan; Kong, Jiangping
2014-01-01
Equivalent sound level (Leq), sound pressure level (SPL), and fundamental frequency (F0) are analyzed in each of five Kunqu Opera roles: Young girl, Young woman, Young man, Old man, and Colorful face. Their pitch ranges are similar to those of some western opera singers (alto, alto, tenor, baritone, and baritone, respectively). Differences among tasks, conditions (stage speech, singing, and reading lyrics), singers, and roles are examined. For all singers, the Leq of stage speech and singing was considerably higher than that of conversational speech. Interrole differences of Leq among tasks and singers were larger than the intrarole differences. For most roles, the time-domain variation of SPL differed from that of other roles, in both singing and stage speech. In singing, as compared with stage speech, the SPL distribution was more concentrated and the variation of SPL with time was smaller. With regard to gender and age, male roles had a higher mean Leq and a lower average F0 (MF0) than female roles. Female singers showed a wider F0 distribution for singing than for stage speech, whereas the opposite was true for male singers. The Leq of stage speech was higher than that of singing for young personages. Among female roles, the younger personages showed higher Leq, whereas among male roles, the older personages did. The roles performed with higher Leq tended to be sung at a lower MF0. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
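Leq, the central level measure here, is the level of a steady sound carrying the same energy as the measured signal over its duration. A minimal computation from a calibrated pressure waveform (the 20 µPa reference is the standard for airborne sound):

```python
import numpy as np

P_REF = 20e-6  # reference pressure in pascals (standard for air)

def leq_db(pressure):
    """Equivalent continuous sound level over the whole waveform:
    Leq = 10 * log10( mean(p^2) / p_ref^2 )."""
    pressure = np.asarray(pressure, float)
    return float(10.0 * np.log10(np.mean(pressure ** 2) / P_REF ** 2))
```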
Chen, Junwen; McLean, Jordan E; Kemps, Eva
2018-03-01
This study investigated the effects of combined audience feedback with video feedback plus cognitive preparation, and cognitive review (enabling deeper processing of feedback) on state anxiety and self-perceptions including perception of performance and perceived probability of negative evaluation in socially anxious individuals during a speech performance. One hundred and forty socially anxious students were randomly assigned to four conditions: Cognitive Preparation + Video Feedback + Audience Feedback + Cognitive Review (CP+VF+AF+CR), Cognitive Preparation + Video Feedback + Cognitive Review (CP+VF+CR), Cognitive Preparation + Video Feedback only (CP+VF), and Control. They were asked to deliver two impromptu speeches that were evaluated by confederates. Participants' levels of anxiety and self-perceptions pertaining to the speech task were assessed before and after feedback, and after the second speech. Compared to participants in the other conditions, participants in the CP+VF+AF+CR condition reported a significant decrease in their state anxiety and perceived probability of negative evaluation scores, and a significant increase in their positive perception of speech performance from before to after the feedback. These effects generalized to the second speech. Our results suggest that adding audience feedback to video feedback plus cognitive preparation and cognitive review may improve the effects of existing video feedback procedures in reducing anxiety symptoms and distorted self-representations in socially anxious individuals. Copyright © 2017. Published by Elsevier Ltd.
Learner Attention to Form in ACCESS Task-Based Interaction
ERIC Educational Resources Information Center
Dao, Phung; Iwashita, Noriko; Gatbonton, Elizabeth
2017-01-01
This study explored the potential effects on learner attention to form in task-based interaction of communicative tasks developed using Automatization in Communicative Contexts of Essential Speech Sequences (ACCESS), a reformulation of task-based language teaching that includes automatization of language elements among its goals. The…
The Timing and Effort of Lexical Access in Natural and Degraded Speech
Wagner, Anita E.; Toffanin, Paolo; Başkent, Deniz
2016-01-01
Understanding speech is effortless in ideal situations, and although adverse conditions, such as those caused by hearing impairment, often render it an effortful task, they do not necessarily suspend speech comprehension. A prime example of this is speech perception by cochlear implant users, whose hearing prostheses transmit speech as a significantly degraded signal. It is yet unknown how mechanisms of speech processing deal with such degraded signals, and whether they are affected by effortful processing of speech. This paper compares the automatic process of lexical competition between natural and degraded speech, and combines gaze fixations, which capture the course of lexical disambiguation, with pupillometry, which quantifies the mental effort involved in processing speech. Listeners' ocular responses were recorded during disambiguation of lexical embeddings with matching and mismatching durational cues. Durational cues were selected due to their substantial role in listeners' quick limitation of the number of lexical candidates for lexical access in natural speech. Results showed that lexical competition increased mental effort in processing natural stimuli, particularly in the presence of mismatching cues. Signal degradation reduced listeners' ability to quickly integrate durational cues in lexical selection, and delayed and prolonged lexical competition. The effort of processing degraded speech was increased overall, and because it had its sources at the pre-lexical level, this effect can be attributed to listening to degraded speech rather than to lexical disambiguation. In sum, the course of lexical competition was largely comparable for natural and degraded speech, but showed crucial shifts in timing, and different sources of increased mental effort. We argue that well-timed progress of information from sensory to pre-lexical and lexical stages of processing, which is the result of perceptual adaptation during speech development, is the reason why, in ideal situations, speech is perceived as an undemanding task. Degradation of the signal or the receiver channel can quickly bring this well-adjusted timing out of balance and lead to an increase in mental effort. Incomplete and effortful processing at the early pre-lexical stages has consequences for lexical processing, as it adds uncertainty to the forming and revising of lexical hypotheses. PMID:27065901
Oi, Misato; Saito, Hirofumi; Li, Zongfeng; Zhao, Wenjun
2013-04-01
To examine the neural mechanism of co-speech gesture production, we measured brain activity of bilinguals during an animation-narration task using near-infrared spectroscopy. The task of the participants was to watch two stories via an animated cartoon, and then narrate the contents in their first language (L1) and second language (L2), respectively. The participants showed significantly more gestures in L2 than in L1. The number of gestures decreased at the ending part of the narration in L1, but not in L2. Analyses of concentration changes of oxygenated hemoglobin revealed that activation of the left inferior frontal gyrus (IFG) significantly increased during gesture production, while activation of the left posterior superior temporal sulcus (pSTS) significantly decreased in line with an increase in the left IFG. These brain activation patterns suggest that the left IFG is involved in gesture production, and the left pSTS is modulated by the speech load. Copyright © 2013 Elsevier Inc. All rights reserved.
Reynolds, Michael G.; Schlöffel, Sophie; Peressotti, Francesca
2016-01-01
One approach used to gain insight into the processes underlying bilingual language comprehension and production examines the costs that arise from switching languages. For unbalanced bilinguals, asymmetric switch costs are reported in speech production, where the switch cost for L1 is larger than the switch cost for L2, whereas, symmetric switch costs are reported in language comprehension tasks, where the cost of switching is the same for L1 and L2. Presently, it is unclear why asymmetric switch costs are observed in speech production, but not in language comprehension. Three experiments are reported that simultaneously examine methodological explanations of task related differences in the switch cost asymmetry and the predictions of three accounts of the switch cost asymmetry in speech production. The results of these experiments suggest that (1) the type of language task (comprehension vs. production) determines whether an asymmetric switch cost is observed and (2) at least some of the switch cost asymmetry arises within the language system. PMID:26834659
Goozée, Justine V; Murdoch, Bruce E; Theodoros, Deborah G
2002-01-01
A miniature pressure transducer was used to assess the interlabial contact pressures produced by a group of 19 adults (mean age 30.6 years) with dysarthria following severe traumatic brain injury (TBI) during a set of speech and nonspeech tasks. Ten parameters relating to lip strength, endurance, rate of movement, and lip pressure accuracy and stability were measured from the nonspeech tasks. The results attained by the TBI group were compared against those of a group of 19 age- and sex-matched control subjects. Significant differences between the groups were found for maximum interlabial contact pressure, maximum rate of repetition of maximum pressure, and lip pressure accuracy at 50 and 10% levels of maximum pressure. With regard to speech, the interlabial contact pressures generated by the TBI group and control group did not differ significantly. When expressed as percentages of maximum pressure, however, the TBI group's interlabial pressures appeared to have been generated with greater physiological effort. Copyright 2002 S. Karger AG, Basel
Riecker, A; Ackermann, H; Wildgruber, D; Dogil, G; Grodd, W
2000-06-26
Aside from spoken language, singing represents a second mode of acoustic (auditory-vocal) communication in humans. As a new aspect of brain lateralization, functional magnetic resonance imaging (fMRI) revealed two complementary cerebral networks subserving singing and speaking. Reproduction of a non-lyrical tune elicited activation predominantly in the right motor cortex, the right anterior insula, and the left cerebellum whereas the opposite response pattern emerged during a speech task. In contrast to the hemodynamic responses within motor cortex and cerebellum, activation of the intrasylvian cortex turned out to be bound to overt task performance. These findings corroborate the assumption that the left insula supports the coordination of speech articulation. Similarly, the right insula might mediate temporo-spatial control of vocal tract musculature during overt singing. Both speech and melody production require the integration of sound structure or tonal patterns, respectively, with a speaker's emotions and attitudes. Considering the widespread interconnections with premotor cortex and limbic structures, the insula is especially suited for this task.
Science 101: How Does Speech-Recognition Software Work?
ERIC Educational Resources Information Center
Robertson, Bill
2016-01-01
This column provides background science information for elementary teachers. Many innovations with computer software begin with analysis of how humans do a task. This article takes a look at how humans recognize spoken words and explains the origins of speech-recognition software.
ERIC Educational Resources Information Center
Colletta, Jean-Marc; Guidetti, Michele; Capirci, Olga; Cristilli, Carla; Demir, Ozlem Ece; Kunene-Nicolas, Ramona N.; Levine, Susan
2015-01-01
The aim of this paper is to compare speech and co-speech gestures observed during a narrative retelling task in five- and ten-year-old children from three different linguistic groups, French, American, and Italian, in order to better understand the role of age and language in the development of multimodal monologue discourse abilities. We asked 98…
ERIC Educational Resources Information Center
Peter, Beate; Button, Le; Stoel-Gammon, Carol; Chapman, Kathy; Raskind, Wendy H.
2013-01-01
The purpose of this study was to evaluate a global deficit in sequential processing as a candidate endophenotype in a family with familial childhood apraxia of speech (CAS). Of 10 adults and 13 children in a three-generational family with speech sound disorder (SSD) consistent with CAS, 3 adults and 6 children had past or present SSD diagnoses. Two…
Non-native Listeners’ Recognition of High-Variability Speech Using PRESTO
Tamati, Terrin N.; Pisoni, David B.
2015-01-01
Background Natural variability in speech is a significant challenge to robust successful spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and listening environments. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of sociolinguistic variation of the L2. Purpose The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in speech recognition abilities of non-native listeners. Research Design Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. Study Sample Native speakers of Mandarin (n = 25) living in the United States recruited from the Indiana University community participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community taken from an earlier study. Data Collection and Analysis Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Vocabulary knowledge was assessed with the WordFam word familiarity test, and executive functioning was assessed with the BRIEF-A (Behavioral Rating Inventory of Executive Function – Adult Version) self-report questionnaire. Scores from the non-native listeners on behavioral tasks and self-report questionnaires were compared with scores obtained from native listeners tested in a previous study and were examined for individual differences. Results Non-native keyword recognition scores were significantly lower on PRESTO sentences than on HINT sentences. Non-native listeners’ keyword recognition scores were also lower than native listeners’ scores on both sentence recognition tasks. Differences in performance on the sentence recognition tasks between non-native and native listeners were larger on PRESTO than on HINT, although group differences varied by signal-to-noise ratio. The non-native and native groups also differed in the ability to categorize talkers by region of origin and in vocabulary knowledge. Individual non-native word recognition accuracy on PRESTO sentences in multitalker babble at more favorable signal-to-noise ratios was found to be related to several BRIEF-A subscales and composite scores. However, non-native performance on PRESTO was not related to regional dialect categorization, talker and gender discrimination, or vocabulary knowledge. Conclusions High-variability sentences in multitalker babble were particularly challenging for non-native listeners. 
Difficulty under high-variability testing conditions was related to lack of experience with the L2, especially L2 sociolinguistic information, compared with native listeners. Individual differences among the non-native listeners were related to weaknesses in core neurocognitive abilities affecting behavioral control in everyday life. PMID:25405842
Hazan, Valerie; Tuomainen, Outi; Pettinato, Michèle
2016-12-01
This study investigated the acoustic characteristics of spontaneous speech by talkers aged 9-14 years and their ability to adapt these characteristics to maintain effective communication when intelligibility was artificially degraded for their interlocutor. Recordings were made for 96 children (50 female participants, 46 male participants) engaged in a problem-solving task with a same-sex friend; recordings for 20 adults were used as reference. The task was carried out in good listening conditions (normal transmission) and in degraded transmission conditions. Articulation rate, median fundamental frequency (f0), f0 range, and relative energy in the 1- to 3-kHz range were analyzed. With increasing age, children significantly reduced their median f0 and f0 range, became faster talkers, and reduced their mid-frequency energy in spontaneous speech. Children produced similar clear speech adaptations (in degraded transmission conditions) as adults, but only children aged 11-14 years increased their f0 range, an unhelpful strategy not transmitted via the vocoder. Changes made by children were consistent with a general increase in vocal effort. Further developments in speech production take place during later childhood. Children use clear speech strategies to benefit an interlocutor facing intelligibility problems but may not be able to attune these strategies to the same degree as adults.
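The relative 1- to 3-kHz energy measure reported above can be estimated from a power spectral density along these lines; the Welch parameters are illustrative, not the study's analysis settings.

```python
import numpy as np
from scipy.signal import welch

def relative_mid_band_energy(signal, fs, lo=1000.0, hi=3000.0):
    """Energy in the 1-3 kHz region as a proportion of total spectral energy,
    estimated from a Welch power spectral density."""
    freqs, psd = welch(signal, fs=fs, nperseg=1024)
    band = (freqs >= lo) & (freqs <= hi)
    return float(psd[band].sum() / psd.sum())
```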
Comprehension of Co-Speech Gestures in Aphasic Patients: An Eye Movement Study.
Eggenberger, Noëmi; Preisig, Basil C; Schumacher, Rahel; Hopfner, Simone; Vanbellingen, Tim; Nyffeler, Thomas; Gutbrod, Klemens; Annoni, Jean-Marie; Bohlhalter, Stephan; Cazzoli, Dario; Müri, René M
2016-01-01
Co-speech gestures are omnipresent in human interaction and facilitate language comprehension. However, it is unclear whether gestures also support language comprehension in aphasic patients. Using visual exploration behavior analysis, the present study investigated the influence of congruence between speech and co-speech gestures on comprehension, measured as accuracy in a decision task. Twenty aphasic patients and 30 healthy controls watched videos in which speech was combined with meaningless (baseline condition), congruent, or incongruent gestures. Comprehension was assessed with a decision task, while remote eye-tracking allowed analysis of visual exploration. In aphasic patients, the incongruent condition resulted in a significant decrease in accuracy, while the congruent condition led to a significant increase in accuracy compared with baseline. In the control group, the incongruent condition resulted in a decrease in accuracy, while the congruent condition did not significantly increase accuracy. Visual exploration analysis showed that patients fixated significantly less on the face and tended to fixate more on the gesturing hands than controls did. Co-speech gestures thus play an important role for aphasic patients, as they modulate comprehension: incongruent gestures evoke significant interference and deteriorate comprehension, whereas congruent gestures enhance it, which might be valuable for clinical and therapeutic purposes.
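Fixation results like these are typically quantified as the share of fixation time landing in areas of interest (AOIs) such as the face or the hands. A minimal sketch, assuming rectangular AOIs and fixation data already parsed into positions and durations (all names illustrative):

```python
import numpy as np

def aoi_fixation_share(fix_x, fix_y, fix_dur, aoi) -> float:
    """Share of total fixation time inside a rectangular AOI.

    aoi = (x_min, y_min, x_max, y_max) in the same screen coordinates
    as the fixation samples; inputs are illustrative assumptions.
    """
    fix_x, fix_y, fix_dur = map(np.asarray, (fix_x, fix_y, fix_dur))
    inside = ((fix_x >= aoi[0]) & (fix_x <= aoi[2]) &
              (fix_y >= aoi[1]) & (fix_y <= aoi[3]))
    return fix_dur[inside].sum() / fix_dur.sum()
```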
[Group therapy of aphasia patients--a professional report].
Königsbüscher, S; Meyer-Königsbüscher, J; Ostermann, F
1987-08-01
Theoretical considerations in, and an exemplary instance of, group speech therapy with aphasic patients are submitted for discussion. Contrary to current practice, individual and group therapy are considered of equal rank, as all aspects of speech and language can be realized in the latter, with both approaches having similar goals. The content of group work is characterized as verbal coping with guided verbal tasks. For group composition, it is recommended to emphasize speech and language criteria over social and psychosocial criteria. The role of the speech therapist is also addressed.
Altered time course of amygdala activation during speech anticipation in social anxiety disorder.
Davies, Carolyn D; Young, Katherine; Torre, Jared B; Burklund, Lisa J; Goldin, Philippe R; Brown, Lily A; Niles, Andrea N; Lieberman, Matthew D; Craske, Michelle G
2017-02-01
Exaggerated anticipatory anxiety is common in social anxiety disorder (SAD). Neuroimaging studies have revealed altered neural activity in response to social stimuli in SAD, but fewer studies have examined neural activity during anticipation of feared social stimuli in SAD. The current study examined the time course and magnitude of activity in threat processing brain regions during speech anticipation in socially anxious individuals and healthy controls (HC). Participants (SAD n=58; HC n=16) underwent functional magnetic resonance imaging (fMRI) during which they completed a 90s control anticipation task and 90s speech anticipation task. Repeated measures multi-level modeling analyses were used to examine group differences in time course activity during speech vs. control anticipation for regions of interest, including bilateral amygdala, insula, ventral striatum, and dorsal anterior cingulate cortex. The time course of amygdala activity was more prolonged and less variable throughout speech anticipation in SAD participants compared to HCs, whereas the overall magnitude of amygdala response did not differ between groups. Magnitude and time course of activity was largely similar between groups across other regions of interest. Analyses were restricted to regions of interest and task order was the same across participants due to the nature of deception instructions. Sustained amygdala time course during anticipation may uniquely reflect heightened detection of threat or deficits in emotion regulation in socially anxious individuals. Findings highlight the importance of examining temporal dynamics of amygdala responding. Copyright © 2016 Elsevier B.V. All rights reserved.
Auditory-motor interaction revealed by fMRI: speech, music, and working memory in area Spt.
Hickok, Gregory; Buchsbaum, Bradley; Humphries, Colin; Muftuler, Tugan
2003-07-01
The concept of auditory-motor interaction pervades speech science research, yet the cortical systems supporting this interface have not been elucidated. Drawing on experimental designs used in recent work in sensory-motor integration in the cortical visual system, we used fMRI in an effort to identify human auditory regions with both sensory and motor response properties, analogous to single-unit responses in known visuomotor integration areas. The sensory phase of the task involved listening to speech (nonsense sentences) or music (novel piano melodies); the "motor" phase of the task involved covert rehearsal/humming of the auditory stimuli. A small set of areas in the superior temporal and temporal-parietal cortex responded both during the listening phase and the rehearsal/humming phase. A left lateralized region in the posterior Sylvian fissure at the parietal-temporal boundary, area Spt, showed particularly robust responses to both phases of the task. Frontal areas also showed combined auditory + rehearsal responsivity consistent with the claim that the posterior activations are part of a larger auditory-motor integration circuit. We hypothesize that this circuit plays an important role in speech development as part of the network that enables acoustic-phonetic input to guide the acquisition of language-specific articulatory-phonetic gestures; this circuit may play a role in analogous musical abilities. In the adult, this system continues to support aspects of speech production, and, we suggest, supports verbal working memory.
Speech and gesture interfaces for squad-level human-robot teaming
NASA Astrophysics Data System (ADS)
Harris, Jonathan; Barber, Daniel
2014-06-01
As the military increasingly adopts semi-autonomous unmanned systems for military operations, utilizing redundant and intuitive interfaces for communication between Soldiers and robots is vital to mission success. Currently, Soldiers use a common lexicon to verbally and visually communicate maneuvers to teammates. For robots to be seamlessly integrated within mixed-initiative teams, they must be able to understand this lexicon. Recent innovations in gaming platforms have led to advancements in speech and gesture recognition technologies, but the reliability of these technologies for enabling communication in human-robot teaming is unclear. The purpose of the present study is to investigate the performance of Commercial-Off-The-Shelf (COTS) speech and gesture recognition tools in classifying a Squad Level Vocabulary (SLV) for a spatial navigation reconnaissance and surveillance task. The SLV for this study was based on findings from a survey conducted with Soldiers at Fort Benning, GA. The items of the survey focused on communication between the Soldier and the robot, specifically on verbally instructing the robot to execute reconnaissance and surveillance tasks. Commands identified from the survey were then converted to equivalent arm and hand gestures, leveraging existing visual signals (e.g., the U.S. Army Field Manual for Visual Signaling). A study was then run to test the ability of commercially available automated speech recognition technologies and a gesture recognition glove to classify these commands in a simulated intelligence, surveillance, and reconnaissance task. This paper presents the classification accuracy of these devices for the speech and gesture modalities independently.
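Classification accuracy of the kind reported here is straightforward to compute once each spoken or gestured command has a ground-truth label. A sketch with illustrative names (the study's scoring procedure is not described in the abstract):

```python
import numpy as np

def per_command_accuracy(true_labels, predicted):
    """Overall and per-command classification accuracy for an SLV test set."""
    true_labels = np.asarray(true_labels)
    predicted = np.asarray(predicted)
    overall = float(np.mean(true_labels == predicted))
    per_cmd = {c: float(np.mean(predicted[true_labels == c] == c))
               for c in np.unique(true_labels)}
    return overall, per_cmd
```

Reporting per-command accuracy alongside the overall rate exposes which vocabulary items each recognizer confuses.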
Callahan, Sarah M.; Walenski, Matthew; Love, Tracy
2013-01-01
Purpose To examine children’s comprehension of verb phrase (VP) ellipsis constructions in light of their automatic, online structural processing abilities and conscious, metalinguistic reflective skill. Method Forty-two children ages 5 through 12 years listened to VP ellipsis constructions involving the strict/sloppy ambiguity (e.g., “The janitor untangled himself from the rope and the fireman in the elementary school did too after the accident.”) in which the ellipsis phrase (“did too”) had 2 interpretations: (a) strict (“untangled the janitor”) and (b) sloppy (“untangled the fireman”). We examined these sentences at a normal speech rate with an online cross-modal picture priming task (n = 14) and an offline sentence–picture matching task (n = 11). Both tasks were also given with slowed speech input (n = 17). Results Children showed priming for both the strict and sloppy interpretations at a normal speech rate but only for the strict interpretation with slowed input. Offline, children displayed an adultlike preference for the sloppy interpretation with normal-rate input but a divergent pattern with slowed speech. Conclusions Our results suggest that children and adults rely on a hybrid syntax-discourse model for the online comprehension and offline interpretation of VP ellipsis constructions. This model incorporates a temporally sensitive syntactic process of VP reconstruction (disrupted with slow input) and a temporally protracted discourse effect attributed to parallelism (preserved with slow input). PMID:22223886
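The priming measure in such cross-modal tasks is simply a reaction-time difference. A sketch, with illustrative names, of how the effect for one interpretation would be computed:

```python
import numpy as np

def priming_effect_ms(rt_related, rt_control) -> float:
    """Mean RT advantage (ms) for pictures matching an interpretation
    (strict or sloppy) over unrelated control pictures; a positive
    value indicates that the interpretation was primed."""
    return float(np.mean(rt_control) - np.mean(rt_related))
```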
Nowakowski, Matilda E; Antony, Martin M; Koerner, Naomi
2015-12-01
The present study investigated the effects of computerized interpretation training and cognitive restructuring on symptomatology, behavior, and physiological reactivity in an analogue social anxiety sample. Seventy-two participants with elevated social anxiety scores were randomized to one session of computerized interpretation training (n = 24), cognitive restructuring (n = 24), or an active placebo control condition (n = 24). Participants completed self-report questionnaires focused on interpretation biases and social anxiety symptomatology at pre- and posttraining, and a speech task at posttraining during which subjective, behavioral, and physiological measures of anxiety were assessed. Only participants in the interpretation training condition endorsed significantly more positive than negative interpretations of ambiguous social situations at posttraining. There was no evidence that interpretation training effects generalized to self-report measures of interpretation biases and symptomatology or to the anxiety response during the posttraining speech task. Participants in the cognitive restructuring condition were rated as giving higher quality speeches and showing fewer signs of anxiety during the posttraining speech task than participants in the interpretation training condition. The present study did not include baseline measures of speech performance or computer-assessed interpretation biases. The results bring into question the generalizability of computerized interpretation training as well as the effectiveness of a single session of cognitive restructuring in modifying the full anxiety response. Clinical and theoretical implications are discussed. Copyright © 2015 Elsevier Ltd. All rights reserved.
Yeend, Ingrid; Beach, Elizabeth Francis; Sharma, Mridula; Dillon, Harvey
2017-09-01
Recent animal research has shown that exposure to single episodes of intense noise causes cochlear synaptopathy without affecting hearing thresholds. It has been suggested that the same may occur in humans. If so, it is hypothesized that this would result in impaired encoding of sound and lead to difficulties hearing at suprathreshold levels, particularly in challenging listening environments. The primary aim of this study was to investigate the effect of noise exposure on auditory processing, including the perception of speech in noise, in adult humans. A secondary aim was to explore whether musical training might improve some aspects of auditory processing and thus counteract or ameliorate any negative impacts of noise exposure. In a sample of 122 participants (63 female) aged 30-57 years with normal or near-normal hearing thresholds, we conducted audiometric tests, including tympanometry, audiometry, acoustic reflexes, otoacoustic emissions, and medial olivocochlear responses. We also assessed temporal and spectral processing by determining thresholds for detection of amplitude modulation and temporal fine structure. We assessed speech-in-noise perception and conducted tests of attention, memory, and sentence closure. We also calculated participants' accumulated lifetime noise exposure and administered questionnaires to assess self-reported listening difficulty and musical training. The results showed no clear link between participants' lifetime noise exposure and performance on any of the auditory processing or speech-in-noise tasks. Musical training was associated with better performance on the auditory processing tasks, but not on the speech-in-noise perception tasks. The results indicate that sentence closure skills, working memory, attention, extended high frequency hearing thresholds, and medial olivocochlear suppression strength are important factors related to the ability to process speech in noise. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
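The abstract does not specify how the amplitude modulation detection thresholds were tracked, but a common choice in this literature is a 2-down/1-up adaptive staircase, which converges on the roughly 70.7%-correct point (Levitt, 1971). A minimal sketch of the update rule, with illustrative parameter values:

```python
class TwoDownOneUp:
    """2-down/1-up staircase on AM depth in dB (0 dB = 100% modulation)."""

    def __init__(self, start_db: float = -6.0, step_db: float = 2.0):
        self.level = start_db   # current AM depth; more negative = harder
        self.step = step_db
        self._run = 0           # consecutive correct responses

    def update(self, correct: bool) -> float:
        if correct:
            self._run += 1
            if self._run == 2:  # two correct in a row: make the task harder
                self.level -= self.step
                self._run = 0
        else:                   # any error: make the task easier
            self.level += self.step
            self._run = 0
        return self.level
```

Real procedures usually also shrink the step size after the first few reversals and average the last several reversal levels to estimate the threshold.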
Cumming, Ruth; Wilson, Angela; Goswami, Usha
2015-01-01
Children with specific language impairments (SLIs) show impaired perception and production of spoken language, and can also present with motor, auditory, and phonological difficulties. Recent auditory studies have shown impaired sensitivity to amplitude rise time (ART) in children with SLIs, along with non-speech rhythmic timing difficulties. Linguistically, these perceptual impairments should affect sensitivity to speech prosody and syllable stress. Here we used two tasks requiring sensitivity to prosodic structure, the DeeDee task and a stress misperception task, to investigate this hypothesis. We also measured auditory processing of ART, rising pitch and sound duration, in both speech (“ba”) and non-speech (tone) stimuli. Participants were 45 children with SLI aged on average 9 years and 50 age-matched controls. We report data for all the SLI children (N = 45, IQ varying), as well as for two independent SLI subgroupings with intact IQ. One subgroup, “Pure SLI,” had intact phonology and reading (N = 16), the other, “SLI PPR” (N = 15), had impaired phonology and reading. Problems with syllable stress and prosodic structure were found for all the group comparisons. Both sub-groups with intact IQ showed reduced sensitivity to ART in speech stimuli, but the PPR subgroup also showed reduced sensitivity to sound duration in speech stimuli. Individual differences in processing syllable stress were associated with auditory processing. These data support a new hypothesis, the “prosodic phrasing” hypothesis, which proposes that grammatical difficulties in SLI may reflect perceptual difficulties with global prosodic structure related to auditory impairments in processing amplitude rise time and duration. PMID:26217286
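Amplitude rise time (ART) stimuli of the kind used in such studies are simply sounds whose onset envelope is stretched or compressed. A sketch of generating one continuum step for a tone stimulus (all parameter values illustrative, not the study's):

```python
import numpy as np

def tone_with_rise(f0: float = 440.0, fs: int = 44100,
                   dur: float = 0.3, rise_ms: float = 15.0) -> np.ndarray:
    """Pure tone whose onset amplitude envelope rises linearly over rise_ms."""
    t = np.arange(int(dur * fs)) / fs
    tone = np.sin(2 * np.pi * f0 * t)
    n_rise = int(rise_ms / 1000 * fs)
    env = np.ones_like(tone)
    env[:n_rise] = np.linspace(0.0, 1.0, n_rise)  # linear onset ramp
    return tone * env
```

Sweeping rise_ms from short (e.g., 15 ms) to long (e.g., 300 ms) yields a continuum over which listeners' sensitivity to rise time can be measured.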
van Nispen, Karin; van de Sandt-Koenderman, Mieke; Mol, Lisette; Krahmer, Emiel
2014-01-01
Gesticulation (gestures accompanying speech) and pantomime (gestures in the absence of speech) can each be comprehensible, but little is known about the differences between these two gesture modes in people with aphasia. This study aimed to discover whether there are differences in the communicative use of gesticulation and pantomime in QH, a person with severe fluent aphasia. QH performed two tasks, naming objects and retelling a story, once in a verbal condition (enabling gesticulation) and once in a pantomime condition. For both conditions, the comprehensibility of the gestures was analysed in a forced-choice task by naïve judges. Second, QH was compared with healthy controls on the representation techniques used. Pantomimes produced by QH for naming objects were significantly more comprehensible than chance, whereas his gesticulation was not; for retelling a story, the opposite pattern was found. When naming objects, QH gesticulated much more than healthy controls did, and his pantomimes for this task were simpler than those of the control group. For retelling a story, no differences were found. Although QH did not make full use of each gesture mode's potential, both modes contributed to his comprehensibility, and, crucially, the benefits of each mode differed across tasks. This implies that the two gesture modes should be considered separately in models of speech and gesture production and in clinical practice for different communicative settings. © 2013 Royal College of Speech and Language Therapists.
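"More comprehensible than chance" in a forced-choice judging task is naturally tested with a binomial test against the guessing rate. A sketch with invented numbers, since the abstract reports neither trial counts nor the number of response alternatives:

```python
from scipy.stats import binomtest

# Hypothetical data: judges pick the right object for 40 of 60 pantomimes
# in a four-alternative forced choice, so chance is p = 0.25.
result = binomtest(k=40, n=60, p=0.25, alternative="greater")
print(f"p = {result.pvalue:.2g}")  # small p: comprehensibility exceeds chance
```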
Villena-González, Mario; López, Vladimir; Rodríguez, Eugenio
2016-05-15
When attention is oriented toward inner thoughts, as spontaneously occurs during mind wandering, the processing of external information is attenuated. However, the potential effects of a thought's content on sensory attenuation are still unknown. The present study assessed whether the representational format of thoughts, such as visual imagery or inner speech, differentially affects the sensory processing of external stimuli. We recorded the brain activity of 20 participants (12 women) while they were exposed to a probe visual stimulus in three different conditions: executing a task on the visual probe (externally oriented attention) and two conditions involving inward-turned attention, i.e., generating inner speech and performing visual imagery. Event-related potential results showed that the P1 amplitude, related to the sensory response, was significantly attenuated during both tasks involving inward attention compared with the external task. When the two representational formats were compared, the visual imagery condition showed stronger attenuation of sensory processing than the inner speech condition. Alpha power in visual areas was measured as an index of cortical inhibition. Larger alpha amplitude was found when participants engaged in an internal thought than during the external task, with visual imagery showing even more alpha power than inner speech. Our results show, for the first time to our knowledge, that visual attentional processing of external stimuli during self-generated thoughts is differentially affected by the representational format of the ongoing train of thought. Copyright © 2016 Elsevier Inc. All rights reserved.
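Alpha power of the kind used here as an inhibition index is conventionally the mean spectral power in roughly the 8-12 Hz band over posterior channels; the study's exact band and estimator are not given, so the following is only a sketch with illustrative defaults:

```python
import numpy as np
from scipy.signal import welch

def alpha_power(eeg: np.ndarray, fs: float, band=(8.0, 12.0)) -> float:
    """Mean PSD in the alpha band for one channel's samples."""
    f, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))  # 2-s windows -> 0.5 Hz bins
    sel = (f >= band[0]) & (f <= band[1])
    return float(psd[sel].mean())
```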
Culture and the cognitive and neuroendocrine responses to speech.
Kim, Heejung S
2008-01-01
The present research investigated cultural differences in the psychological and biological effects of verbalization of thoughts. Three studies tested how verbalization of thoughts requires a different amount of effort for people from cultures with different assumptions about speech and examined implications for the cognitive performance and stress hormone response to the task. The results showed that verbalization impaired East Asians/East Asian Americans' performance when the task was difficult but not when the task was easy, whereas the effect of verbalization on European Americans' performance was neutral or positive regardless of task difficulty. Moreover, verbalization decreased the level of cortisol response to the task among European Americans but not among East Asian Americans. The results demonstrate how the same act that is intended to create the same psychological experience could inadvertently lead to systematically different psychological experiences for people from different cultures. Copyright 2008 APA, all rights reserved.
Kreitewolf, Jens; Friederici, Angela D; von Kriegstein, Katharina
2014-11-15
Hemispheric specialization for linguistic prosody is a controversial issue. While it is commonly assumed that linguistic prosody and emotional prosody are preferentially processed in the right hemisphere, neuropsychological work directly comparing processes of linguistic prosody and emotional prosody suggests a predominant role of the left hemisphere for linguistic prosody processing. Here, we used two functional magnetic resonance imaging (fMRI) experiments to clarify the role of left and right hemispheres in the neural processing of linguistic prosody. In the first experiment, we sought to confirm previous findings showing that linguistic prosody processing compared to other speech-related processes predominantly involves the right hemisphere. Unlike previous studies, we controlled for stimulus influences by employing a prosody and speech task using the same speech material. The second experiment was designed to investigate whether a left-hemispheric involvement in linguistic prosody processing is specific to contrasts between linguistic prosody and emotional prosody or whether it also occurs when linguistic prosody is contrasted against other non-linguistic processes (i.e., speaker recognition). Prosody and speaker tasks were performed on the same stimulus material. In both experiments, linguistic prosody processing was associated with activity in temporal, frontal, parietal and cerebellar regions. Activation in temporo-frontal regions showed differential lateralization depending on whether the control task required recognition of speech or speaker: recognition of linguistic prosody predominantly involved right temporo-frontal areas when it was contrasted against speech recognition; when contrasted against speaker recognition, recognition of linguistic prosody predominantly involved left temporo-frontal areas. The results show that linguistic prosody processing involves functions of both hemispheres and suggest that recognition of linguistic prosody is based on an inter-hemispheric mechanism which exploits both a right-hemispheric sensitivity to pitch information and a left-hemispheric dominance in speech processing. Copyright © 2014 Elsevier Inc. All rights reserved.
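Lateralization findings like these are often summarized with a laterality index computed from activation in homologous left and right regions. The study's own metric is not specified in the abstract, but the standard formula is:

```python
def laterality_index(left: float, right: float) -> float:
    """Standard fMRI laterality index: (L - R) / (L + R).

    +1 means fully left-lateralized, -1 fully right-lateralized; inputs
    are, e.g., suprathreshold voxel counts or summed t-values per ROI.
    """
    return (left - right) / (left + right)
```

By this convention, the prosody-versus-speech contrast above would yield a negative index and the prosody-versus-speaker contrast a positive one.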
Ethnography of Communication: Cultural Codes and Norms.
ERIC Educational Resources Information Center
Carbaugh, Donal
The primary tasks of the ethnographic researcher are to discover, describe, and comparatively analyze different speech communities' ways of speaking. Two general abstractions occurring in ethnographic analyses are normative and cultural. Communicative norms are formulated in analyzing and explaining the "patterned use of speech."…
Brain Oscillations during Semantic Evaluation of Speech
ERIC Educational Resources Information Center
Shahin, Antoine J.; Picton, Terence W.; Miller, Lee M.
2009-01-01
Changes in oscillatory brain activity have been related to perceptual and cognitive processes such as selective attention and memory matching. Here we examined brain oscillations, measured with electroencephalography (EEG), during a semantic speech processing task that required both lexically mediated memory matching and selective attention.…
The Influence of Child-Directed Speech on Word Learning and Comprehension.
Foursha-Stevenson, Cassandra; Schembri, Taylor; Nicoladis, Elena; Eriksen, Cody
2017-04-01
This paper describes an investigation into the function of child-directed speech (CDS) across development. In the first experiment, 10-21-month-olds were presented with familiar words in CDS and trained on novel words in CDS or adult-directed speech (ADS). All children preferred the matching display for familiar words. However, only older toddlers in the CDS condition preferred the matching display for novel words. In Experiment 2, children 3-6 years of age were presented with a sentence comprehension task in CDS or ADS. Older children performed better overall than younger children with 5- and 6-year-olds performing above chance regardless of speech condition, while 3- and 4-year-olds only performed above chance when the sentences were presented in CDS. These findings provide support for the theory that CDS is most effective at the beginning of acquisition for particular constructions (e.g. vocabulary acquisition, syntactic comprehension) rather than at a particular age or for a particular task.
Yoo, Sejin; Chung, Jun-Young; Jeon, Hyeon-Ae; Lee, Kyoung-Min; Kim, Young-Bo; Cho, Zang-Hee
2012-07-01
Speech production is inextricably linked to speech perception, yet the two are usually investigated in isolation. In this study, we employed a verbal-repetition task to identify the neural substrates of speech processing with both perception and production engaged simultaneously, using functional MRI. Subjects verbally repeated auditory stimuli containing an ambiguous vowel sound that could be perceived as either a word or a pseudoword depending on the interpretation of the vowel. We found that verbal repetition commonly activated the audition-articulation interface bilaterally at the Sylvian fissures and superior temporal sulci. Contrasting word versus pseudoword trials revealed neural activity unique to word repetition in the left posterior middle temporal areas and activity unique to pseudoword repetition in the left inferior frontal gyrus. These findings imply that the two tasks are carried out using different speech codes: an articulation-based code for pseudowords and an acoustic-phonetic code for words. They also support the dual-stream model and imitative learning of vocabulary. Copyright © 2012 Elsevier Inc. All rights reserved.
Morrison, Amanda S.; Brozovich, Faith A.; Lee, Ihno A.; Jazaieri, Hooria; Goldin, Philippe R.; Heimberg, Richard G.; Gross, James J.
2016-01-01
The subjective experience of anxiety plays a central role in cognitive behavioral models of social anxiety disorder (SAD). However, much remains to be learned about the temporal dynamics of anxiety elicited by feared social situations. The aims of the current study were: 1) to compare anxiety trajectories during a speech task in individuals with SAD (n = 135) versus healthy controls (HCs; n = 47), and 2) to compare the effects of cognitive behavioral therapy (CBT) on anxiety trajectories against a waitlist control condition. SAD was associated with higher levels of anxiety and greater increases in anticipatory anxiety compared with HCs, but not with differential change in anxiety from pre- to post-speech. CBT was associated with decreases in anxiety from pre- to post-speech but not with changes in absolute levels of anticipatory anxiety or in the rate of change in anxiety during anticipation. The findings suggest that anticipatory experiences should be further incorporated into exposures. PMID:26760456