Abnormal laughter-like vocalisations replacing speech in primary progressive aphasia
Rohrer, Jonathan D.; Warren, Jason D.; Rossor, Martin N.
2009-01-01
We describe ten patients with a clinical diagnosis of primary progressive aphasia (PPA) (pathologically confirmed in three cases) who developed abnormal laughter-like vocalisations in the context of progressive speech output impairment leading to mutism. Failure of speech output was accompanied by increasing frequency of the abnormal vocalisations until ultimately they constituted the patient's only extended utterance. The laughter-like vocalisations did not show contextual sensitivity but occurred as an automatic vocal output that replaced speech. Acoustic analysis of the vocalisations in two patients revealed abnormal motor features including variable note duration and inter-note interval, loss of temporal symmetry of laugh notes and loss of the normal decrescendo. Abnormal laughter-like vocalisations may be a hallmark of a subgroup in the PPA spectrum with impaired control and production of nonverbal vocal behaviour due to disruption of fronto-temporal networks mediating vocalisation. PMID:19435636
NASA Astrophysics Data System (ADS)
Sauter, Disa
This PhD is an investigation of vocal expressions of emotions, mainly focusing on non-verbal sounds such as laughter, cries and sighs. The research examines the roles of categorical and dimensional factors, the contributions of a number of acoustic cues, and the influence of culture. A series of studies established that naive listeners can reliably identify non-verbal vocalisations of positive and negative emotions in forced-choice and rating tasks. Some evidence for underlying dimensions of arousal and valence is found, although each emotion had a discrete expression. The role of acoustic characteristics of the sounds is investigated experimentally and analytically. This work shows that the cues used to identify different emotions vary, although pitch and pitch variation play a central role. The cues used to identify emotions in non-verbal vocalisations differ from the cues used when comprehending speech. An additional set of studies using stimuli consisting of emotional speech demonstrates that these sounds can also be reliably identified, and rely on similar acoustic cues. A series of studies with a pre-literate Namibian tribe shows that non-verbal vocalisations can be recognized across cultures. An fMRI study carried out to investigate the neural processing of non-verbal vocalisations of emotions is presented. The results show activation in pre-motor regions arising from passive listening to non-verbal emotional vocalisations, suggesting neural auditory-motor interactions in the perception of these sounds. In sum, this thesis demonstrates that non-verbal vocalisations of emotions are reliably identifiable tokens of information that belong to discrete categories. These vocalisations are recognisable across vastly different cultures and thus, like facial expressions of emotions, seem to comprise human universals. Listeners rely mainly on pitch and pitch variation to identify emotions in non-verbal vocalisations, a pattern that differs from the cues used to comprehend speech. When listening to others' emotional vocalisations, a neural system of preparatory motor activation is engaged.
Observing conversational laughter in frontotemporal dementia
Pressman, Peter S; Simpson, Michaela; Gola, Kelly; Shdo, Suzanne M; Spinelli, Edoardo G; Miller, Bruce L; Gorno-Tempini, Maria Luisa; Rankin, Katherine; Levenson, Robert W
2017-01-01
Background We performed an observational study of laughter during seminaturalistic conversations between patients with dementia and familial caregivers. Patients were diagnosed with (1) behavioural variant fronto-temporal dementia (bvFTD), (2) right temporal variant frontotemporal dementia (rtFTD), (3) semantic variant of primary progressive aphasia (svPPA), (4) non-fluent variant primary progressive aphasia (nfvPPA) or (5) early onset Alzheimer’s disease (eoAD). We hypothesised that those with bvFTD would laugh less in response to their own speech than other dementia groups or controls, while those with rtFTD would laugh less regardless of who was speaking. Methods Patients with bvFTD (n=39), svPPA (n=19), rtFTD (n=14), nfvPPA (n=16), eoAD (n=17) and healthy controls (n=156) were recorded (video and audio) while discussing a problem in their relationship with a healthy control companion. Using the audio track only, laughs were identified by trained coders and then further classed by an automated algorithm as occurring during or shortly after the participant’s own vocalisation (‘self’ context) or during or shortly after the partner’s vocalisation (‘partner’ context). Results Individuals with bvFTD, eoAD or rtFTD laughed less across both contexts of self and partner than the other groups. Those with bvFTD laughed less relative to their own speech compared with healthy controls. Those with nfvPPA laughed more in the partner context compared with healthy controls. Conclusions Laughter in response to one’s own vocalisations or those of a conversational partner may be a clinically useful measure in dementia diagnosis. PMID:28235777
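The 'self'/'partner' laugh coding described above is essentially a timing rule applied to the audio track. Below is a minimal Python sketch of such a rule; the 2-second "shortly after" window, the data format and the function names are illustrative assumptions, not the authors' implementation.

```python
# Sketch: assign each laugh to a 'self' or 'partner' context from vocalisation timelines.
# The 2.0 s "shortly after" window and all names are illustrative assumptions.

def overlaps_or_follows(laugh_onset, intervals, window=2.0):
    """True if the laugh starts inside an interval or within `window` seconds after one."""
    return any(start <= laugh_onset <= end + window for start, end in intervals)

def classify_laugh(laugh_onset, self_speech, partner_speech, window=2.0):
    """self_speech / partner_speech are lists of (start, end) vocalisation times in seconds."""
    if overlaps_or_follows(laugh_onset, self_speech, window):
        return "self"
    if overlaps_or_follows(laugh_onset, partner_speech, window):
        return "partner"
    return "unclassified"

# Example usage with made-up timestamps
self_speech = [(0.0, 3.5), (10.0, 12.0)]
partner_speech = [(4.0, 9.0)]
print(classify_laugh(4.2, self_speech, partner_speech))   # 'self' (within 2 s of own speech ending)
print(classify_laugh(8.5, self_speech, partner_speech))   # 'partner'
```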
ERIC Educational Resources Information Center
Hallewell, Madeline J.; Lackovic, Natasa
2017-01-01
This article explores how 145 photographs collected from 20 PowerPoint lectures in undergraduate psychology at 16 UK universities were integrated with lecturers' speech. Little is currently known about how lecturers refer to the distinct types of photographs included in their presentations. Findings show that only 48 photographs (33%) included in…
Scheerer, Nichole E; Jones, Jeffery A
2014-12-01
Speech production requires the combined effort of a feedback control system driven by sensory feedback, and a feedforward control system driven by internal models. However, the factors that dictate the relative weighting of these feedback and feedforward control systems are unclear. In this event-related potential (ERP) study, participants produced vocalisations while being exposed to blocks of frequency-altered feedback (FAF) perturbations that were either predictable in magnitude (consistently either 50 or 100 cents) or unpredictable in magnitude (50- and 100-cent perturbations varying randomly within each vocalisation). Vocal and P1-N1-P2 ERP responses revealed decreases in the magnitude and trial-to-trial variability of vocal responses, smaller N1 amplitudes, and shorter vocal, P1 and N1 response latencies following predictable FAF perturbation magnitudes. In addition, vocal response magnitudes correlated with N1 amplitudes, vocal response latencies, and P2 latencies. This pattern of results suggests that after repeated exposure to predictable FAF perturbations, the contribution of the feedforward control system increases. Examination of the presentation order of the FAF perturbations revealed smaller compensatory responses, smaller P1 and P2 amplitudes, and shorter N1 latencies when the block of predictable 100-cent perturbations occurred prior to the block of predictable 50-cent perturbations. These results suggest that exposure to large perturbations modulates responses to subsequent perturbations of equal or smaller size. Similarly, exposure to a 100-cent perturbation prior to a 50-cent perturbation within a vocalisation decreased the magnitude of vocal and N1 responses, but increased P1 and P2 latencies. Thus, exposure to a single perturbation can affect responses to subsequent perturbations. © 2014 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
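Frequency-altered feedback perturbations are specified in cents (hundredths of a semitone). The following short sketch shows the standard cents-to-ratio conversion used to shift a fundamental frequency; the baseline frequency is an illustrative assumption, not a value from the study.

```python
# Sketch: converting a pitch perturbation in cents to a frequency ratio.
# 1200 cents = 1 octave, so ratio = 2 ** (cents / 1200).

def cents_to_ratio(cents: float) -> float:
    return 2.0 ** (cents / 1200.0)

f0 = 220.0                       # illustrative fundamental frequency (Hz)
for cents in (50, 100):          # the two perturbation magnitudes used in the study
    shifted = f0 * cents_to_ratio(cents)
    print(f"{cents}-cent shift: {f0:.1f} Hz -> {shifted:.2f} Hz")
# 50 cents ≈ 226.45 Hz; 100 cents (one semitone) ≈ 233.08 Hz
```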
Sowden, Hannah; Clegg, Judy; Perkins, Michael
2013-12-01
Co-speech gestures have a close semantic relationship to speech in adult conversation. In typically developing children co-speech gestures which give additional information to speech facilitate the emergence of multi-word speech. A difficulty with integrating audio-visual information is known to exist for individuals with Autism Spectrum Disorder (ASD), which may affect development of the speech-gesture system. A longitudinal observational study was conducted with four children with ASD, aged 2;4 to 3;5 years. Participants were video-recorded for 20 min every 2 weeks during their attendance on an intervention programme. Recording continued for up to 8 months, thus affording a rich analysis of gestural practices from pre-verbal to multi-word speech across the group. All participants combined gesture with either speech or vocalisations. Co-speech gestures providing additional information to speech were observed to be either absent or rare. Findings suggest that children with ASD do not make use of the facilitating communicative effects of gesture in the same way as typically developing children.
A speech-controlled environmental control system for people with severe dysarthria.
Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca
2007-06-01
Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.
Parent-infant vocalisations at 12 months predict psychopathology at 7 years.
Allely, C S; Purves, D; McConnachie, A; Marwick, H; Johnson, P; Doolin, O; Puckering, C; Golding, J; Gillberg, C; Wilson, P
2013-03-01
This study investigated the utility of adult and infant vocalisation in the prediction of child psychopathology. Families were sampled from the Avon Longitudinal Study of Parents and Children (ALSPAC) birth cohort. Vocalisation patterns were obtained from 180 videos (60 cases and 120 randomly selected sex-matched controls) of parent-infant interactions when infants were one year old. Cases were infants who were subsequently diagnosed, at seven years of age, with at least one psychiatric diagnostic categorisation using the Development and Wellbeing Assessment. Psychopathologies included in the case group were disruptive behaviour disorders, oppositional-conduct disorders, Attention Deficit Hyperactivity Disorder, pervasive developmental disorder, and emotional disorders. Associations between infant and parent vocalisations and later psychiatric diagnoses were investigated. Low frequencies of maternal vocalisation predicted later development of infant psychopathology. A reduction of five vocalisations per minute predicted a 44% (95%CI: 11-94%; p-value=0.006) increase in the odds of an infant being a case. No association was observed between infant vocalisations and overall case status. In sum, altered vocalisation frequency in mother-infant interactions at one year is a potential risk marker for later diagnosis of a range of child psychopathologies. Copyright © 2012 Elsevier Ltd. All rights reserved.
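The reported effect is an odds ratio expressed per five-vocalisation-per-minute reduction. A small sketch of the arithmetic for rescaling such an effect to a single-unit change, assuming a standard logistic-regression interpretation; only the headline 1.44 estimate comes from the abstract.

```python
# Sketch: rescaling an odds ratio reported per 5-unit change in exposure.
# OR_per_5 = exp(5 * beta)  =>  beta = ln(OR_per_5) / 5, and OR_per_1 = exp(beta).
import math

or_per_5 = 1.44                     # 44% increase in odds per 5 fewer vocalisations/min
beta = math.log(or_per_5) / 5.0     # log-odds change per single vocalisation/min reduction
or_per_1 = math.exp(beta)

print(f"log-odds per vocalisation/min reduction: {beta:.3f}")
print(f"odds ratio per single vocalisation/min reduction: {or_per_1:.3f}")  # ≈ 1.076
```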
Polar bear mother-offspring interactions in maternity dens in captivity.
van Gessel, Chad
2015-01-01
Two female polar bears at Dierenrijk Zoo in the Netherlands were monitored at their maternity den one day before the birth of their cubs and three days postpartum. Each bear was monitored for 96 hr to document behaviour and vocalisations. The goal was to obtain insight into the differences between the mother that lost her litter and the other that successfully reared her cubs. Six groups of cub vocalisations were identified: Comfort, Discomfort, Distress, Nursing Attempts, Nursing, and No Vocalisation. Maternal vocalisations were split into three groups: Calm, Grooming, and Stress. Maternal behaviours were also split into three groups: Active, Rest, and Stress. The unsuccessful mother produced more stress vocalisations before and during the birth of her cub, whereas the successful mother appeared less stressed. Vocalisations indicate that the cub that died tried to nurse but was unsuccessful. The unsuccessful mother showed less stress as her cub got weaker and vocalised less. From this I suggest that maternal stress was a factor in cub mortality. © 2015 Wiley Periodicals, Inc.
Dorph, Annalie
2017-01-01
Defining an acoustic repertoire is essential to understanding vocal signalling and communicative interactions within a species. Currently, quantitative and statistical definition is lacking for the vocalisations of many dasyurids, an important group of small to medium-sized marsupials from Australasia that includes the eastern quoll (Dasyurus viverrinus), a species of conservation concern. Beyond generating a better understanding of this species' social interactions, determining an acoustic repertoire will further improve detection rates and inference of vocalisations gathered by automated bioacoustic recorders. Hence, this study investigated eastern quoll vocalisations using objective signal processing techniques to quantitatively analyse spectrograms recorded from 15 different individuals. Recordings were collected in conjunction with observations of the behaviours associated with each vocalisation to develop an acoustic-based behavioural repertoire for the species. Analysis of recordings produced a putative classification of five vocalisation types: Bark, Growl, Hiss, Cp-cp, and Chuck. These were most frequently observed during agonistic encounters between conspecifics, most likely as a graded sequence from Hisses occurring in a warning context through to Growls and finally Barks being given prior to, or during, physical confrontations between individuals. Quantitative and statistical methods were used to objectively establish the accuracy of these five putative call types. A multinomial logistic regression indicated a 97.27% correlation with the perceptual classification, demonstrating support for the five different vocalisation types. This putative classification was further supported by hierarchical cluster analysis and silhouette information that determined the optimal number of clusters to be five. Minor disparity between the objective and perceptual classifications was potentially the result of gradation between vocalisations, or subtle differences present within vocalisations not discernible to the human ear. The implication of these different vocalisations and their given context is discussed in relation to the ecology of the species and the potential application of passive acoustic monitoring techniques. PMID:28686679
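The cluster-number check described above (hierarchical clustering with silhouette information) corresponds to a standard model-selection step. A hedged sketch follows; the feature matrix, Ward linkage and scikit-learn calls are illustrative assumptions, not the study's pipeline.

```python
# Sketch: choosing the number of vocalisation clusters via silhouette scores
# on acoustic features (rows = calls, columns = spectro-temporal measurements).
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Illustrative stand-in for measured call features (e.g. duration, peak frequency, bandwidth)
features = rng.normal(size=(120, 6))

X = StandardScaler().fit_transform(features)
for k in range(2, 9):
    labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))
# The k with the highest average silhouette width would be taken as the optimal
# number of call types (five, in the eastern quoll study).
```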
van Oosterom, L; Montgomery, J C; Jeffs, A G; Radford, C A
2016-01-11
Soundscapes provide a new tool for the study of fish communities. Bigeyes (Pempheris adspersa) are nocturnal planktivorous reef fish, feed in loose shoals and are soniferous. These vocalisations have been suggested to be contact calls to maintain group cohesion, however direct evidence for this is absent, despite the fact that contact calls are well documented for many other vertebrates, including marine mammals. For fish, direct evidence for group cohesion signals is restricted to the use of visual and hydrodynamic cues. In support of adding vocalisation as a contributing cue, our laboratory experiments show that bigeyes significantly increased group cohesion when exposed to recordings of ambient reef sound at higher sound levels while also decreasing vocalisations. These patterns of behaviour are consistent with acoustic masking. When exposed to playback of conspecific vocalisations, the group cohesion and vocalisation rates of bigeyes both significantly increased. These results provide the first direct experimental support for the hypotheses that vocalisations are used as contact calls to maintain group cohesion in fishes, making fish the evolutionarily oldest vertebrate group in which this phenomenon has been observed, and adding a new dimension to the interpretation of nocturnal reef soundscapes.
Bright, A
2008-05-01
1. In this study, the calling rates of vocalisations known to indicate distress and aversive events (Alarm calls, Squawks, Total vocalisations) and acoustic parameters of flock noise were quantified from feather and non-feather pecking laying flocks. 2. One hour of flock noise (background machinery and hen vocalisations) was recorded from 21 commercial free-range laying hen flocks aged ≥35 weeks. Ten of the flocks were classified as feather pecking (based on a plumage condition score) and 11 as non-feather pecking. 3. Recordings were made using a Sony DAT recorder and an Audio-Technica omni-directional microphone placed in the centre of the house, 1.5 m from the ground. Avisoft-SASlab Pro was used to create and analyse audio spectrograms. 4. There was no effect of flock size or farm on call/s or acoustic parameters of flock noise. However, strain had an effect on the number of Total vocalisation/s; the Hebden Black flock made more calls than Lohmann flocks. Feather pecking flocks gave more Squawk/s and more Total vocalisation/s than non-feather pecking flocks. Feather pecking did not explain variation in alarm call rate, or in intensity (dB) and frequency (Hz) measures of flock noise. 5. The differences between Squawk and Total vocalisation call rates of feather and non-feather pecking flocks are a new finding. An increase or change in flock calling rate may be evident before other conventional measures of laying hen welfare, such as a drop in egg production or an increase in plumage damage, thus enabling farmers to make management or husbandry changes to prevent an outbreak of feather pecking.
Children's Spontaneous Vocalisations during Play: Aesthetic Dimensions
ERIC Educational Resources Information Center
Countryman, June; Gabriel, Martha; Thompson, Katherine
2016-01-01
This paper explores the phenomenon of spontaneous vocalisations in the self-chosen, unstructured outdoor play of children aged 3-12. Spontaneous vocalisations encompass the whole range of children's unprompted, natural, expressive vocal soundings beyond spoken language. Non-participant observations at childcare centres and on elementary school…
Rupniak, N M; Carlson, E C; Harrison, T; Oates, B; Seward, E; Owen, S; de Felipe, C; Hunt, S; Wheeldon, A
2000-06-08
The regulation of stress-induced vocalisations by central NK(1) receptors was investigated using pharmacological antagonists in guinea-pigs, a species with human-like NK(1) receptors, and transgenic NK1R-/- mice. In guinea-pigs, i.c.v. infusion of the selective substance P agonist GR73632 (0.1 nmol) elicited a pronounced vocalisation response that was blocked enantioselectively by the NK(1) receptor antagonists CP-99,994 and L-733,060 (0.1-10 mg/kg). GR73632-induced vocalisations were also markedly attenuated by the antidepressant drugs imipramine and fluoxetine (30 mg/kg), but not by the benzodiazepine anxiolytic diazepam (3 mg/kg) or the 5-HT(1A) agonist buspirone (10 mg/kg). Similarly, vocalisations in guinea-pig pups separated from their mothers were blocked enantioselectively by the highly brain-penetrant NK(1) receptor antagonists L-733,060 and GR205171 (ID(50) 3 mg/kg), but not by the poorly brain-penetrant compounds LY303870 and CGP49823 (30 mg/kg). Separation-induced vocalisations were also blocked by the anxiolytic drugs diazepam, chlordiazepoxide and buspirone (ID(50) 0.5-1 mg/kg), and by the antidepressant drugs phenelzine, imipramine, fluoxetine and venlafaxine (ID(50) 3-8 mg/kg). In normal mouse pups, GR205171 attenuated neonatal vocalisations when administered at a high dose (30 mg/kg) only, consistent with its lower affinity for the rat than the guinea-pig NK(1) receptor. Ultrasound calls in NK1R-/- mouse pups were markedly reduced compared with those in WT pups, confirming the specific involvement of NK(1) receptors in the regulation of vocalisation. These observations suggest that centrally-acting NK(1) receptor antagonists may have clinical utility in the treatment of a range of anxiety and mood disorders.
The effects of emotion on memory for music and vocalisations.
Aubé, William; Peretz, Isabelle; Armony, Jorge L
2013-01-01
Music is a powerful tool for communicating emotions which can elicit memories through associative mechanisms. However, it is currently unknown whether emotion can modulate memory for music without reference to a context or personal event. We conducted three experiments to investigate the effect of basic emotions (fear, happiness, and sadness) on recognition memory for music, using short, novel stimuli explicitly created for research purposes, and compared them with nonlinguistic vocalisations. Results showed better memory accuracy for musical clips expressing fear and, to some extent, happiness. In the case of nonlinguistic vocalisations we confirmed a memory advantage for all emotions tested. A correlation between memory accuracy for music and vocalisations was also found, particularly in the case of fearful expressions. These results confirm that emotional expressions, particularly fearful ones, conveyed by music can influence memory as has been previously shown for other forms of expressions, such as faces and vocalisations.
Radford, Craig A; Ghazali, Shahriman M; Montgomery, John C; Jeffs, Andrew G
2016-01-01
Fish vocalisation is often a major component of underwater soundscapes. Therefore, interpretation of these soundscapes requires an understanding of the vocalisation characteristics of common soniferous fish species. This study of captive female bluefin gurnard, Chelidonichthys kumu, aims to formally characterise their vocalisation sounds and daily pattern of sound production. Four types of sound were produced and characterised, twice as many as previously reported in this species. These sounds fit two aural categories, grunt and growl, the mean peak frequencies for which ranged from 129 to 215 Hz. This species vocalised throughout the 24-hour period at an average rate of 18.5 ± 2.0 sounds fish⁻¹ h⁻¹, with an increase in vocalisation rate at dawn and dusk. Competitive feeding did not elevate vocalisation as has been found in other gurnard species. Bluefin gurnard are common in coastal waters of New Zealand, Australia and Japan and, given their vocalisation rate, are likely to be significant contributors to the ambient underwater soundscape in these areas.
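Peak frequency, as reported for the gurnard grunts and growls (129 to 215 Hz), is typically read off a power spectrum. A minimal sketch with scipy follows, using a synthetic low-frequency pulse in place of a real recording; the sampling rate and signal are assumptions.

```python
# Sketch: estimating the peak frequency of a fish sound from its power spectrum.
import numpy as np
from scipy.signal import welch

fs = 4000                                   # assumed sampling rate (Hz)
t = np.arange(0, 0.5, 1 / fs)
# Synthetic stand-in for a recorded grunt: a 180 Hz pulse with added noise
sound = np.sin(2 * np.pi * 180 * t) * np.hanning(t.size) + 0.1 * np.random.randn(t.size)

freqs, psd = welch(sound, fs=fs, nperseg=1024)
peak_freq = freqs[np.argmax(psd)]
print(f"peak frequency ≈ {peak_freq:.0f} Hz")   # should land near 180 Hz
```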
Profiles of vocal development in Korean children with and without cleft palate.
Ha, Seunghee
2018-01-01
This study longitudinally investigated vocal development in Korean children from 9 to 18 months of age with and without cleft palate (CP). Utterance samples were collected from 24 children with and without CP at 9, 12, 15 and 18 months of age. Each utterance was categorised into levels of vocalisation using the Korean-translated version of the Stark Assessment of Early Vocal Development-Revised (SAEVD-R). The results showed children with CP produced a significantly higher rate of precanonical vocalisations (the combination of Levels 1, 2, and 3) and a lower rate of Level 4 and 5 vocalisations than children without CP. Both groups showed decreases in Levels 1 and 2 and increases in Level 5 from 9 to 18 months of age. A significant increase in the proportion of Level 4 vocalisations across age was observed only in children without CP. Young Korean children with CP showed lower proportions of advanced vocalisation levels characterised by canonical and complex syllable structures across 9 and 18 months of age compared to children without CP.
Roche, Laura; Zhang, Dajie; Bartl-Pokorny, Katrin D; Pokorny, Florian B; Schuller, Björn W; Esposito, Gianluca; Bölte, Sven; Roeyers, Herbert; Poustka, Luise; Gugatschka, Markus; Waddington, Hannah; Vollmann, Ralf; Einspieler, Christa; Marschik, Peter B
2018-03-01
This article provides an overview of studies assessing the early vocalisations of children with autism spectrum disorder (ASD), Rett syndrome (RTT), and fragile X syndrome (FXS) using retrospective video analysis (RVA) during the first two years of life. Electronic databases were systematically searched and a total of 23 studies were selected. These studies were then categorised according to whether children were later diagnosed with ASD (13 studies), RTT (8 studies), or FXS (2 studies), and then described in terms of (a) participant characteristics, (b) control group characteristics, (c) video footage, (d) behaviours analysed, and (e) main findings. This overview supports the use of RVA in analysing the early development of vocalisations in children later diagnosed with ASD, RTT or FXS, and provides an in-depth analysis of vocalisation presentation, complex vocalisation production, and the rate and/or frequency of vocalisation production across the three disorders. Implications are discussed in terms of extending crude vocal analyses to more precise methods that might provide more powerful means by which to discriminate between disorders during early development. A greater understanding of the early manifestation of these disorders may then lead to improvements in earlier detection.
Carey, Daniel; McGettigan, Carolyn
2017-04-01
The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic "talent". In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI: specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map, for the first time, directly between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning, and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Paroxysmal myoclonic dystonia with vocalisations: new entity or variant of preexisting syndromes?
Feinberg, T E; Shapiro, A K; Shapiro, E
1986-01-01
From among 1377 patients with movement disorders, four patients had an unusual movement disorder characterised by paroxysmal bursts of involuntary, regular, repetitive, rhythmic, bilateral, coordinated, simultaneous, stereotypic myoclonus and vocalisations, often associated with tonic symptoms, interference with voluntary functioning, presence of hyperactivity, attention and learning disabilities, and resistance to treatment with haloperidol and other drugs. This symptom complex may represent a new disease entity, referred to here as paroxysmal myoclonic dystonia with vocalisations or a variant or combination of other movement disorders such as Gilles de la Tourette, myoclonic, or dystonic syndromes. PMID:3457101
The vocal repertoire of the African Penguin (Spheniscus demersus): structure and function of calls.
Favaro, Livio; Ozella, Laura; Pessani, Daniela
2014-01-01
The African Penguin (Spheniscus demersus) is a highly social and vocal seabird. However, currently available descriptions of the vocal repertoire of African Penguin are mostly limited to basic descriptions of calls. Here we provide, for the first time, a detailed description of the vocal behaviour of this species by collecting audio and video recordings from a large captive colony. We combine visual examinations of spectrograms with spectral and temporal acoustic analyses to determine vocal categories. Moreover, we used a principal component analysis, followed by signal classification with a discriminant function analysis, for statistical validation of the vocalisation types. In addition, we identified the behavioural contexts in which calls were uttered. The results show that four basic vocalisations can be found in the vocal repertoire of adult African Penguin, namely a contact call emitted by isolated birds, an agonistic call used in aggressive interactions, an ecstatic display song uttered by single birds, and a mutual display song vocalised by pairs, at their nests. Moreover, we identified two distinct vocalisations interpreted as begging calls by nesting chicks (begging peep) and unweaned juveniles (begging moan). Finally, we discussed the importance of specific acoustic parameters in classifying calls and the possible use of the source-filter theory of vocal production to study penguin vocalisations.
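The statistical validation step above (principal component analysis followed by discriminant function analysis) corresponds to a standard dimensionality-reduction-plus-classification pipeline. A hedged scikit-learn sketch follows; the feature matrix, labels and component count are placeholders, not the penguin data.

```python
# Sketch: validating call categories with PCA followed by a linear discriminant analysis.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Placeholder acoustic measurements (rows = calls) and perceptual labels (4 call types)
X = rng.normal(size=(200, 12))
y = rng.integers(0, 4, size=200)

pipeline = make_pipeline(StandardScaler(), PCA(n_components=5), LinearDiscriminantAnalysis())
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"cross-validated classification accuracy: {scores.mean():.2f}")
# Accuracy well above chance (here, 1/4) would support the perceptually defined call categories.
```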
Lavan, Nadine; Lima, César F; Harvey, Hannah; Scott, Sophie K; McGettigan, Carolyn
2015-01-01
It is well established that categorising the emotional content of facial expressions may differ depending on contextual information. Whether this malleability is observed in the auditory domain and in genuine emotion expressions is poorly explored. We examined the perception of authentic laughter and crying in the context of happy, neutral and sad facial expressions. Participants rated the vocalisations on separate unipolar scales of happiness and sadness and on arousal. Although they were instructed to focus exclusively on the vocalisations, consistent context effects were found: For both laughter and crying, emotion judgements were shifted towards the information expressed by the face. These modulations were independent of response latencies and were larger for more emotionally ambiguous vocalisations. No effects of context were found for arousal ratings. These findings suggest that the automatic encoding of contextual information during emotion perception generalises across modalities, to purely non-verbal vocalisations, and is not confined to acted expressions.
Children's Improvised Vocalisations: Learning, Communication and Technology of the Self
ERIC Educational Resources Information Center
Knudsen, Jan Sverre
2008-01-01
The intention of this article is to explore, challenge and expand our understandings of children's improvised vocalisations, a fundamentally human form of expression. Based on selected examples from observation and recording in non-institutional settings, the article outlines how this phenomenon can be understood as learning and as communication.…
A Research Focused on Improving Vocalisation Level on Violin Education
ERIC Educational Resources Information Center
Parasiz, Gökalp
2018-01-01
The research aimed to improve vocalisation levels of music teacher's candidates on performance works for violin education moving from difficulties faced by prospective teachers. At the same time, it was aimed to provide new perspectives to violin educators. Study group was composed of six 3rd grade students studying violin education in a State…
Vocalisation sound pattern identification in young broiler chickens.
Fontana, I; Tullo, E; Scrase, A; Butterworth, A
2016-09-01
In this study, we describe the monitoring of young broiler chicken vocalisation, with sound recorded and assessed at regular intervals throughout the life of the birds from day 1 to day 38, with a focus on the first week of life. We assess whether there are recognisable, and even predictable, vocalisation patterns based on frequency and sound spectrum analysis, which can be observed in birds at different ages and stages of growth within the relatively short life of the birds in commercial broiler production cycles. The experimental trials were carried out on a farm where the broilers were reared indoors, with audio recording procedures carried out over 38 days. The recordings were made using two microphones connected to a digital recorder, and the sonic data were collected in situations without disturbance of the animals beyond that created by the routine activities of the farmer. Digital files of 1 h duration were cut into short files of 10 min duration, and these sound recordings were analysed and labelled using audio analysis software. Analysis of these short sound files showed that the key vocalisation frequency and patterns changed in relation to increasing age and the weight of the broilers. Statistical analysis showed a significant correlation (P<0.001) between the frequency of vocalisation and the age of the birds. Based on the identification of specific frequencies of the sounds emitted, in relation to age and weight, it is proposed that there is potential for audio monitoring and comparison with 'anticipated' sound patterns to be used to evaluate the status of farmed broiler chickens.
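The reported age-frequency relationship is a straightforward correlation across recordings. A minimal sketch of that test using scipy; the example values are illustrative stand-ins, not the study's data.

```python
# Sketch: testing whether the key frequency of chick calls declines with age.
from scipy.stats import pearsonr

ages_days = [1, 5, 10, 15, 20, 25, 30, 38]                        # illustrative recording days
peak_freq_hz = [4100, 3800, 3500, 3300, 3000, 2800, 2600, 2400]   # illustrative values

r, p = pearsonr(ages_days, peak_freq_hz)
print(f"r = {r:.2f}, p = {p:.4f}")   # a strong negative r would mirror the reported trend
```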
Pasco, Greg; Gordon, Rosanna K; Howlin, Patricia; Charman, Tony
2008-11-01
The Classroom Observation Schedule to Measure Intentional Communication (COSMIC) was devised to provide ecologically valid outcome measures for a communication-focused intervention trial. Ninety-one children with autism spectrum disorder aged 6 years 10 months (SD 16 months) were videoed during their everyday snack, teaching and free play activities. Inter-rater reliability was high and relevant items showed significant associations with comparable items from concurrent Autism Diagnostic Observation Schedule-Generic (Lord et al. 2000, J Autism Dev Disord 30(3):205-223) assessments. In a subsample of 28 children initial differences in rates of initiations, initiated speech/vocalisation and commenting were predictive of language and communication competence 15 months later. Results suggest that the use of observational measures of intentional communication in natural settings is a valuable assessment strategy for research and clinical practice.
Replacing maladaptive speech with verbal labeling responses: an analysis of generalized responding.
Foxx, R M; Faw, G D; McMorrow, M J; Kyle, M S; Bittle, R G
1988-01-01
We taught three mentally handicapped students to answer questions with verbal labels and evaluated the generalized effects of this training on their maladaptive speech (e.g., echolalia) and correct responding to untrained questions. The students received cues-pause-point training on an initial question set followed by generalization assessments on a different set in another setting. Probes were conducted on novel questions in three other settings to determine the strength and spread of the generalization effect. A multiple baseline across subjects design revealed that maladaptive speech was replaced with correct labels (answers) to questions in the training and all generalization settings. These results replicate and extend previous research that suggested that cues-pause-point procedures may be useful in replacing maladaptive speech patterns by teaching students to use their verbal labeling repertoires. PMID:3225258
The Frame Constraint on Experimentally Elicited Speech Errors in Japanese.
Saito, Akie; Inoue, Tomoyoshi
2017-06-01
The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which is separately operating from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the same position within a syllable or word. Most of the evidence for the effect comes from analyses of naturally occurring speech errors in Indo-European languages, and there are few studies examining the effect in experimentally elicited speech errors and in other languages. This study examined whether experimentally elicited sound errors in Japanese exhibits the syllable position effect. In Japanese, the sub-syllabic unit known as "mora" is considered to be a basic sound unit in production. Results showed that the syllable position effect occurred in mora errors, suggesting that the frame constrains the ordering of sounds during speech production.
Communication Supports for People with Motor Speech Disorders
ERIC Educational Resources Information Center
Hanson, Elizabeth K.; Fager, Susan K.
2017-01-01
Communication supports for people with motor speech disorders can include strategies and technologies to supplement natural speech efforts, resolve communication breakdowns, and replace natural speech when necessary to enhance participation in all communicative contexts. This article emphasizes communication supports that can enhance…
Laker, Caroline; Callard, Felicity; Flach, Clare; Williams, Paul; Sayer, Jane; Wykes, Til
2014-02-20
Health services are subject to frequent changes, yet there has been insufficient research to address how staff working within these services perceive the climate for implementation. Staff perceptions, particularly of barriers to change, may affect successful implementation and the resultant quality of care. This study measures staff perceptions of barriers to change in acute mental healthcare. We identify whether occupational status and job satisfaction are related to these perceptions, as this might indicate a target for intervention that could aid successful implementation. As there were no available instruments capturing staff perceptions of barriers to change, we created a new measure (VOCALISE) to assess this construct. All nursing staff from acute in-patient settings in one large London mental health trust were eligible. Using a participatory method, a nurse researcher interviewed 32 staff to explore perceptions of barriers to change. This generated a measure through thematic analyses and staff feedback (N = 6). Psychometric testing was undertaken according to standard guidelines for measure development (N = 40, 42, 275). Random effects models were used to explore the associations between VOCALISE, occupational status, and job satisfaction (N = 125). VOCALISE was easy to understand and complete, and showed acceptable reliability and validity. The factor analysis revealed three underlying constructs: 'confidence,' 'de-motivation' and 'powerlessness.' Staff with negative perceptions of barriers to change held more junior positions, and had poorer job satisfaction. Qualitatively, nursing assistants expressed a greater sense of organisational unfairness in response to change. VOCALISE can be used to explore staff perceptions of implementation climate and to assess how staff attitudes shape the successful outcomes of planned changes. Negative perceptions were linked with poor job satisfaction and to those occupying more junior roles, indicating a negative climate for implementation in those groups. Staff from these groups may therefore need special attention prior to implementing changes in mental health settings.
Concurrent Processing of Words and Their Replacements during Speech
ERIC Educational Resources Information Center
Hartsuiker, Robert J.; Catchpole, Ciara M.; de Jong, Nivja H.; Pickering, Martin J.
2008-01-01
Two picture naming experiments, in which an initial picture was occasionally replaced with another (target) picture, were conducted to study the temporal coordination of abandoning one word and resuming with another word in speech production. In Experiment 1, participants abandoned saying the initial name, and resumed with the name of the target…
Tonal synchrony in mother-infant interaction based on harmonic and pentatonic series.
Van Puyvelde, Martine; Vanfleteren, Pol; Loots, Gerrit; Deschuyffeleer, Sara; Vinck, Bart; Jacquet, Wolfgang; Verhelst, Werner
2010-12-01
This study reports the occurrence of 'tonal synchrony' as a new dimension of early mother-infant interaction synchrony. The findings are based on a tonal and temporal analysis of vocal interactions between 15 mothers and their 3-month-old infants during 5 min of free-play in a laboratory setting. In total, 558 vocal exchanges were identified and analysed, of which 84% reflected harmonic or pentatonic series. Another 10% of the exchanges contained absolute and/or relative pitch and/or interval imitations. The total durations of dyads being in tonal synchrony were normally distributed (M=3.71, SD=2.44). Vocalisations based on harmonic series appeared organised around the major triad, containing significantly more simple frequency ratios (octave, fifth and third) than complex ones (non-major triad tones). Tonal synchrony and its characteristics are discussed in relation to infant-directed speech, communicative musicality, pre-reflective communication and its impact on the quality of early mother-infant interaction and child's development. Copyright © 2010 Elsevier Inc. All rights reserved.
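The distinction drawn above between simple frequency ratios (octave 2:1, fifth 3:2, major third 5:4) and more complex ones can be made algorithmically from the two pitches of an exchange. A minimal sketch follows; the 50-cent tolerance and the example pitches are assumptions, not the study's analysis settings.

```python
# Sketch: labelling the interval between a mother's and an infant's pitch as a
# simple frequency ratio (octave, fifth, major third) or a complex one.
import math

SIMPLE_RATIOS = {"octave": 2 / 1, "fifth": 3 / 2, "major third": 5 / 4, "unison": 1 / 1}

def classify_interval(f_a, f_b, tolerance_cents=50):
    """Fold the ratio of two pitches into one octave and compare it with simple ratios."""
    ratio = max(f_a, f_b) / min(f_a, f_b)
    while ratio >= 2.0:                      # compound intervals: peel off whole octaves
        if abs(1200 * math.log2(ratio / 2.0)) <= tolerance_cents:
            return "octave"
        ratio /= 2.0
    for name, target in SIMPLE_RATIOS.items():
        if abs(1200 * math.log2(ratio / target)) <= tolerance_cents:
            return name
    return "complex"

print(classify_interval(220.0, 330.0))   # 3:2 -> 'fifth'
print(classify_interval(220.0, 440.0))   # 2:1 -> 'octave'
print(classify_interval(220.0, 311.1))   # near a tritone -> 'complex'
```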
Functional flexibility in wild bonobo vocal behaviour
Archbold, Jahmaira; Zuberbühler, Klaus
2015-01-01
A shared principle in the evolution of language and the development of speech is the emergence of functional flexibility, the capacity of vocal signals to express a range of emotional states independently of context and biological function. Functional flexibility has recently been demonstrated in the vocalisations of pre-linguistic human infants, which has been contrasted to the functionally fixed vocal behaviour of non-human primates. Here, we revisited the presumed chasm in functional flexibility between human and non-human primate vocal behaviour, with a study on our closest living primate relatives, the bonobo (Pan paniscus). We found that wild bonobos use a specific call type (the “peep”) across a range of contexts that cover the full valence range (positive-neutral-negative) in much of their daily activities, including feeding, travel, rest, aggression, alarm, nesting and grooming. Peeps were produced in functionally flexible ways in some contexts, but not others. Crucially, calls did not vary acoustically between neutral and positive contexts, suggesting that recipients take pragmatic information into account to make inferences about call meaning. In comparison, peeps during negative contexts were acoustically distinct. Our data suggest that the capacity for functional flexibility has evolutionary roots that predate the evolution of human speech. We interpret this evidence as an example of an evolutionary early transition away from fixed vocal signalling towards functional flexibility. PMID:26290789
Acoustic correlates of body size and individual identity in banded penguins
Gamba, Marco; Gili, Claudia; Pessani, Daniela
2017-01-01
Animal vocalisations play a role in individual recognition and mate choice. In nesting penguins, acoustic variation in vocalisations originates from distinctiveness in the morphology of the vocal apparatus. Using the source-filter theory approach, we investigated vocal individuality cues and correlates of body size and mass in the ecstatic display songs of the Humboldt and Magellanic penguins. We demonstrate that both fundamental frequency (f0) and formants (F1-F4) are essential vocal features to discriminate among individuals. However, we show that only duration and f0 are honest indicators of body size and mass, respectively. We did not find any effect of body dimension on formants, formant dispersion, or estimated vocal tract length of the emitters. Overall, our findings provide the first evidence that the resonant frequencies of the vocal tract do not correlate with body size in penguins. Our results add important information to a growing body of literature on the role of the different vocal parameters in conveying biologically meaningful information in bird vocalisations. PMID:28199318
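Formant dispersion and the estimated vocal tract length mentioned above follow directly from source-filter theory: dispersion is the mean spacing between adjacent formants, and estimated vocal tract length is the speed of sound divided by twice the dispersion. A minimal sketch; the formant values are placeholders, not penguin measurements.

```python
# Sketch: formant dispersion and estimated vocal tract length (VTL) from formants F1-F4,
# following a source-filter approach. Formant values here are placeholders.
SPEED_OF_SOUND = 350.0   # m/s, approximate value for air in the warm, humid vocal tract

def formant_dispersion(formants_hz):
    spacings = [f2 - f1 for f1, f2 in zip(formants_hz, formants_hz[1:])]
    return sum(spacings) / len(spacings)

def estimated_vtl(formants_hz):
    return SPEED_OF_SOUND / (2.0 * formant_dispersion(formants_hz))

formants = [900.0, 2100.0, 3300.0, 4500.0]   # illustrative F1-F4 (Hz)
print(f"formant dispersion: {formant_dispersion(formants):.0f} Hz")   # 1200 Hz
print(f"estimated VTL: {estimated_vtl(formants) * 100:.1f} cm")       # ≈ 14.6 cm
```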
Male songbird indicates body size with low-pitched advertising songs.
Hall, Michelle L; Kingma, Sjouke A; Peters, Anne
2013-01-01
Body size is a key sexually selected trait in many animal species. If size imposes a physical limit on the production of loud low-frequency sounds, then low-pitched vocalisations could act as reliable signals of body size. However, the central prediction of this hypothesis--that the pitch of vocalisations decreases with size among competing individuals--has limited support in songbirds. One reason could be that only the lowest-frequency components of vocalisations are constrained, and this may go unnoticed when vocal ranges are large. Additionally, the constraint may only be apparent in contexts when individuals are indeed advertising their size. Here we explicitly consider signal diversity and performance limits to demonstrate that body size limits song frequency in an advertising context in a songbird. We show that in purple-crowned fairy-wrens, Malurus coronatus coronatus, larger males sing lower-pitched low-frequency advertising songs. The lower frequency bound of all advertising song types also has a significant negative relationship with body size. However, the average frequency of all their advertising songs is unrelated to body size. This comparison of different approaches to the analysis demonstrates how a negative relationship between body size and song frequency can be obscured by failing to consider signal design and the concept of performance limits. Since these considerations will be important in any complex communication system, our results imply that body size constraints on low-frequency vocalisations could be more widespread than is currently recognised.
STI: An objective measure for the performance of voice communication systems
NASA Astrophysics Data System (ADS)
Houtgast, T.; Steeneken, H. J. M.
1981-06-01
A measuring device was developed for determining the quality of speech communication systems. It comprises two parts, a signal source which replaces the talker, producing an artificial speech-like signal, and an analysis part which replaces the listener, by which the signal at the receiving end of the system under test is evaluated. Each single measurement results in an index (ranging from 0-100%) which indicates the effect of that communication system on speech intelligibility. The index is called STI (Speech Transmission Index). A careful design of the characteristics of the test signal and of the type of signal analysis makes the present approach widely applicable. It was verified experimentally that a given STI implies a given effect on speech intelligibility, irrespective of the nature of the actual disturbance (noise interference, band-pass limiting, peak clipping, etc.).
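For readers unfamiliar with how such an index is computed, the sketch below shows a heavily simplified STI calculation in the style of the later standardised procedure: modulation transfer values are converted to apparent signal-to-noise ratios, clipped, rescaled, and averaged. The band/modulation-frequency grid and MTF values are illustrative assumptions, and the band weights and corrections of the full standard are omitted.

```python
import numpy as np

def sti_from_mtf(m):
    """Simplified STI from a matrix of modulation transfer values m
    (octave bands x modulation frequencies, values in 0..1)."""
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1 - 1e-6)
    snr_app = 10.0 * np.log10(m / (1.0 - m))   # apparent SNR per cell
    snr_app = np.clip(snr_app, -15.0, 15.0)    # limit to +/-15 dB
    ti = (snr_app + 15.0) / 30.0               # transmission index, 0..1
    return ti.mean()

# Hypothetical MTF: 7 octave bands x 14 modulation frequencies,
# moderately degraded channel
mtf = np.full((7, 14), 0.6)
print(f"STI = {sti_from_mtf(mtf):.2f}")
```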
McCracken, K.G.; Fullagar, P.J.; Slater, E.C.; Paton, D.C.; Afton, A.D.
2002-01-01
Acoustic advertising displays (n=75) of male Musk Ducks Biziura lobata were analysed at ten widely spaced geographic localities in South Australia, Victoria, and Western Australia. Vocalisations differed in a fixed, non-overlapping pattern between allopatric Musk Duck populations in southeastern and southwestern Australia. These findings suggest that Musk Duck populations are subdivided by the Nullarbor Plain, the arid treeless desert at the head of the Great Australian Bight. Three vocalisations performed by male Musk Ducks that had not previously been reported in the literature were also documented. Vocalisations of captive Musk Ducks collected from different geographic regions (southeast and southwest) differed between regions from which captives originally were collected and were unlike those performed by wild birds. Based on calls of immature Musk Ducks, acoustic variation within regional populations and the apparent inability of captive Musk Ducks reared in isolation to develop the wild-type adult call, regional dialects are seemingly acquired in a social context by repeated observation of adult males and some combination of social imprinting, learning, or practice.
The acoustic structure of male giant panda bleats varies according to intersexual context.
Charlton, Benjamin D; Keating, Jennifer L; Rengui, Li; Huang, Yan; Swaisgood, Ronald R
2015-09-01
Although the acoustic structure of mammal vocal signals often varies according to the social context of emission, relatively few mammal studies have examined acoustic variation during intersexual advertisement. In the current study male giant panda bleats were recorded during the breeding season in three behavioural contexts: vocalising alone, during vocal interactions with females outside of peak oestrus, and during vocal interactions with peak-oestrous females. Male bleats produced during vocal interactions with peak-oestrous females were longer in duration and had higher mean fundamental frequency than those produced when males were either involved in a vocal interaction with a female outside of peak oestrus or vocalising alone. In addition, males produced bleats with higher rates of fundamental frequency modulation when they were vocalising alone than when they were interacting with females. These results show that acoustic features of male giant panda bleats have the potential to signal the caller's motivational state, and suggest that males increase the rate of fundamental frequency modulation in bleats when they are alone to maximally broadcast their quality and promote close-range contact with receptive females during the breeding season.
Vocalisations of Killer Whales (Orcinus orca) in the Bremer Canyon, Western Australia.
Wellard, Rebecca; Erbe, Christine; Fouda, Leila; Blewitt, Michelle
2015-01-01
To date, there has been no dedicated study in Australian waters on the acoustics of killer whales. Hence no information has been published on the sounds produced by killer whales from this region. Here we present the first acoustical analysis of recordings collected off the Western Australian coast. Underwater sounds produced by Australian killer whales were recorded during the months of February and March 2014 and 2015 in the Bremer Canyon in Western Australia. Vocalisations recorded included echolocation clicks, burst-pulse sounds and whistles. A total of 28 hours and 29 minutes were recorded and analysed, with 2376 killer whale calls (whistles and burst-pulse sounds) detected. Recordings of poor quality or signal-to-noise ratio were excluded from analysis, resulting in 142 whistles and burst-pulse vocalisations suitable for analysis and categorisation. These were grouped based on their spectrographic features into nine Bremer Canyon (BC) "call types". The frequency of the fundamental contours of all call types ranged from 600 Hz to 29 kHz. Calls ranged from 0.05 to 11.3 seconds in duration. Biosonar clicks were also recorded, but not studied further. Surface behaviours noted during acoustic recordings were categorised as either travelling or social behaviour. A detailed description of the acoustic characteristics is necessary for species acoustic identification and for the development of passive acoustic tools for population monitoring, including assessments of population status, habitat usage, migration patterns, behaviour and acoustic ecology. This study provides the first quantitative assessment and report on the acoustic features of killer whale vocalisations in Australian waters, and presents an opportunity to further investigate this little-known population.
Vocalisations of Killer Whales (Orcinus orca) in the Bremer Canyon, Western Australia
Wellard, Rebecca; Erbe, Christine; Fouda, Leila; Blewitt, Michelle
2015-01-01
To date, there has been no dedicated study in Australian waters on the acoustics of killer whales. Hence no information has been published on the sounds produced by killer whales from this region. Here we present the first acoustical analysis of recordings collected off the Western Australian coast. Underwater sounds produced by Australian killer whales were recorded during the months of February and March 2014 and 2015 in the Bremer Canyon in Western Australia. Vocalisations recorded included echolocation clicks, burst-pulse sounds and whistles. A total of 28 hours and 29 minutes were recorded and analysed, with 2376 killer whale calls (whistles and burst-pulse sounds) detected. Recordings of poor quality or signal-to-noise ratio were excluded from analysis, resulting in 142 whistles and burst-pulse vocalisations suitable for analysis and categorisation. These were grouped based on their spectrographic features into nine Bremer Canyon (BC) “call types”. The frequency of the fundamental contours of all call types ranged from 600 Hz to 29 kHz. Calls ranged from 0.05 to 11.3 seconds in duration. Biosonar clicks were also recorded, but not studied further. Surface behaviours noted during acoustic recordings were categorised as either travelling or social behaviour. A detailed description of the acoustic characteristics is necessary for species acoustic identification and for the development of passive acoustic tools for population monitoring, including assessments of population status, habitat usage, migration patterns, behaviour and acoustic ecology. This study provides the first quantitative assessment and report on the acoustic features of killer whale vocalisations in Australian waters, and presents an opportunity to further investigate this little-known population. PMID:26352429
Branchi, I; Santucci, D; Alleva, E
2001-11-01
Ultrasonic vocalisations (USVs) emitted by altricial rodent pups are whistle-like sounds with frequencies between 30 and 90 kHz. These signals play an important communicative role in mother-offspring interaction since they elicit in the dam a prompt response concerning caregiving behaviours. Both physical and social parameters modulate the USV emission in the infant rodent. Recently, a more detailed analysis of the ultrasonic vocalisation pattern, considering the spectrographic structure of sounds, has allowed a deeper investigation of this behaviour. In order to investigate neurobehavioural development, the analysis of USVs presents several advantages, mainly: (i) USVs are one of the few responses produced by very young mice that can be quantitatively analysed and elicited by quantifiable stimuli; (ii) USV production follows a clear ontogenetic profile from birth to PND 14-15, thus allowing longitudinal neurobehavioural analysis during very early postnatal ontogeny. The study of this ethologically-ecologically relevant behaviour represents a valid model to evaluate possible alterations in the neurobehavioural development of perinatally treated or genetically modified infant rodents. Furthermore, the role played by several receptor agonists and antagonists in modulating USV rate makes this measure particularly important when investigating the effects of anxiogenic and anxiolytic compounds, and emotional behaviour in general.
Kokkinakis, Kostas; Loizou, Philipos C
2011-09-01
The purpose of this study is to determine the relative impact of reverberant self-masking and overlap-masking effects on speech intelligibility by cochlear implant listeners. Sentences were presented in one condition wherein reverberant consonant segments were replaced with clean consonants, and in another condition wherein reverberant vowel segments were replaced with clean vowels. The underlying assumption is that self-masking effects would dominate in the first condition, whereas overlap-masking effects would dominate in the second condition. Results indicated that the degradation of speech intelligibility in reverberant conditions is caused primarily by self-masking effects that give rise to flattened formant transitions. © 2011 Acoustical Society of America
Lombard effect onset times reveal the speed of vocal plasticity in a songbird.
Hardman, Samuel I; Zollinger, Sue Anne; Koselj, Klemen; Leitner, Stefan; Marshall, Rupert C; Brumm, Henrik
2017-03-15
Animals that use vocal signals to communicate often compensate for interference and masking from background noise by raising the amplitude of their vocalisations. This response has been termed the Lombard effect. However, despite more than a century of research, little is known about how quickly animals can adjust the amplitude of their vocalisations after the onset of noise. The ability to respond quickly to increases in noise levels would allow animals to avoid signal masking and ensure their calls continue to be heard, even if they are interrupted by sudden bursts of high-amplitude noise. We tested how quickly singing male canaries (Serinus canaria) exhibit the Lombard effect by exposing them to short playbacks of white noise and measuring the speed of their responses. We show that canaries exhibit the Lombard effect in as little as 300 ms after the onset of noise and are also able to increase the amplitude of their songs mid-song and mid-phrase without pausing. Our results demonstrate high vocal plasticity in this species and suggest that birds are able to adjust the amplitude of their vocalisations very rapidly to ensure they can still be heard even during sudden changes in background noise levels. © 2017. Published by The Company of Biologists Ltd.
Favaro, Livio; Gamba, Marco; Alfieri, Chiara; Pessani, Daniela; McElligott, Alan G
2015-11-25
The African penguin is a nesting seabird endemic to southern Africa. In penguins of the genus Spheniscus, vocalisations are important for social recognition. However, it is not clear which acoustic features of calls can encode individual identity information. We recorded contact calls and ecstatic display songs of 12 adult birds from a captive colony. For each vocalisation, we measured 31 spectral and temporal acoustic parameters related to both source and filter components of calls. For each parameter, we calculated the Potential of Individual Coding (PIC). The acoustic parameters showing PIC ≥ 1.1 were used to perform a stepwise cross-validated discriminant function analysis (DFA). The DFA correctly classified 66.1% of the contact calls and 62.5% of display songs to the correct individual. The DFA also resulted in the further selection of 10 acoustic features for contact calls and 9 for display songs that were important for vocal individuality. Our results suggest that studying the anatomical constraints that influence nesting penguin vocalisations from a source-filter perspective can lead to a much better understanding of the acoustic cues of individuality contained in their calls. This approach could be further extended to study and understand vocal communication in other bird species.
Favaro, Livio; Gamba, Marco; Alfieri, Chiara; Pessani, Daniela; McElligott, Alan G.
2015-01-01
The African penguin is a nesting seabird endemic to southern Africa. In penguins of the genus Spheniscus, vocalisations are important for social recognition. However, it is not clear which acoustic features of calls can encode individual identity information. We recorded contact calls and ecstatic display songs of 12 adult birds from a captive colony. For each vocalisation, we measured 31 spectral and temporal acoustic parameters related to both source and filter components of calls. For each parameter, we calculated the Potential of Individual Coding (PIC). The acoustic parameters showing PIC ≥ 1.1 were used to perform a stepwise cross-validated discriminant function analysis (DFA). The DFA correctly classified 66.1% of the contact calls and 62.5% of display songs to the correct individual. The DFA also resulted in the further selection of 10 acoustic features for contact calls and 9 for display songs that were important for vocal individuality. Our results suggest that studying the anatomical constraints that influence nesting penguin vocalisations from a source-filter perspective can lead to a much better understanding of the acoustic cues of individuality contained in their calls. This approach could be further extended to study and understand vocal communication in other bird species. PMID:26602001
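A minimal sketch of the parameter-screening-plus-DFA workflow described above, assuming the call measurements sit in a table with one row per call and a bird_id column (file and column names are hypothetical). The small-sample CV correction and the plain cross-validated LDA stand in for the study's exact stepwise, cross-validated procedure.

```python
import numpy as np
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score

def cv_corrected(x):
    """Coefficient of variation (%) with a small-sample correction."""
    x = np.asarray(x, dtype=float)
    return (1 + 1 / (4 * len(x))) * 100 * x.std(ddof=1) / x.mean()

def pic(df, param, individual="bird_id"):
    """Potential of Individual Coding for one acoustic parameter:
    between-individual CV divided by the mean within-individual CV.
    Values >= 1 suggest the parameter varies more between than within
    individuals and may carry identity information."""
    cv_within = df.groupby(individual)[param].apply(cv_corrected).mean()
    cv_between = cv_corrected(df[param])
    return cv_between / cv_within

# Hypothetical table: one row per call, acoustic parameters plus 'bird_id'
calls = pd.read_csv("penguin_calls.csv")
params = [c for c in calls.columns if c != "bird_id"]
selected = [p for p in params if pic(calls, p) >= 1.1]

# Cross-validated discriminant function analysis on the retained parameters
lda = LinearDiscriminantAnalysis()
scores = cross_val_score(lda, calls[selected], calls["bird_id"],
                         cv=StratifiedKFold(n_splits=5))
print(f"Mean correct classification: {scores.mean():.1%}")
```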
Lina, Xu; Feng, Li; Yanyun, Zhang; Nan, Gao; Mingfang, Hu
2016-12-01
To explore the phonological characteristics and rehabilitation training of abnormal velar articulation in patients with functional articulation disorders (FAD). The phonological characteristics of velar sounds were observed in 87 patients with FAD. Seventy-two patients with abnormal velar articulation received speech training. Correlation and simple linear regression analyses were carried out on abnormal velar articulation and age. The articulation disorder of /g/ mainly showed replacement by /d/ or /b/, or omission. /k/ mainly showed replacement by /d/, /t/, /g/, /p/ or /b/. /h/ mainly showed replacement by /g/, /f/, /p/ or /b/, or omission. The common erroneous articulation forms of /g/, /k/ and /h/ were fronting of the tongue and replacement by bilabial consonants. When velars were combined with vowels containing /a/ and /e/, the main error was fronting of the tongue; when combined with vowels containing /u/, the errors tended to be replacement by bilabial consonants. After 3 to 10 sessions of speech training, the number of erroneous words decreased from 40.28±6.08 before training to 6.24±2.61; the difference was statistically significant (Z=-7.379, P=0.000). The number of erroneous words was negatively correlated with age (r=-0.691, P=0.000). Simple linear regression analysis showed a coefficient of determination of 0.472. The articulation disorder of velars mainly involves replacement and varies with the accompanying vowels. The targeted rehabilitation training established here is significantly effective. Age plays an important role in the outcome of velar articulation training.
21 CFR 874.3730 - Laryngeal prosthesis (Taub design).
Code of Federal Regulations, 2010 CFR
2010-04-01
... pulmonary air flow to the pharynx in the absence of the larynx, thereby permitting esophageal speech. The device is interposed between openings in the trachea and the esophagus and may be removed and replaced... and over the esophageal mucosa to provide a sound source that is articulated as speech. (b...
Rusz, Jan; Tykalová, Tereza; Klempíř, Jiří; Čmejla, Roman; Růžička, Evžen
2016-04-01
Although speech disorders represent an early and common manifestation of Parkinson's disease (PD), little is known about their progression and relationship to dopaminergic replacement therapy. The aim of the current study was to examine longitudinal motor speech changes after the initiation of pharmacotherapy in PD. Fifteen newly-diagnosed, untreated PD patients and ten healthy controls of comparable age were investigated. PD patients were tested before the introduction of antiparkinsonian therapy and then twice within the following 6 years. Quantitative acoustic analyses of seven key speech dimensions of hypokinetic dysarthria were performed. At baseline, PD patients showed significantly altered speech including imprecise consonants, monopitch, inappropriate silences, decreased quality of voice, slow alternating motion rates, imprecise vowels and monoloudness. At follow-up assessment, preservation or slight improvement of speech performance was objectively observed in two-thirds of PD patients within the first 3-6 years of dopaminergic treatment, primarily associated with the improvement of stop consonant articulation. The extent of speech improvement correlated with L-dopa equivalent dose (r = 0.66, p = 0.008) as well as with reduction in principal motor manifestations based on the Unified Parkinson's Disease Rating Scale (r = -0.61, p = 0.02), particularly reflecting treatment-related changes in bradykinesia but not in rigidity, tremor, or axial motor manifestations. While speech disorders are frequently present in drug-naive PD patients, they tend to improve or remain relatively stable after the initiation of dopaminergic treatment and appear to be related to the dopaminergic responsiveness of bradykinesia.
First insights into the vocal repertoire of infant and juvenile Southern white rhinoceros.
Linn, Sabrina N; Boeer, Michael; Scheumann, Marina
2018-01-01
Describing vocal repertoires represents an essential step towards gaining an overview about the complexity of acoustic communication in a given species. The analysis of infant vocalisations is essential for understanding the development and usage of species-specific vocalisations, but is often underrepresented, especially in species with long inter-birth intervals such as the white rhinoceros. Thus, this study aimed for the first time to characterise the infant and juvenile vocal repertoire of the Southern white rhinoceros and to relate these findings to the adult vocal repertoire. The behaviour of seven mother-reared white rhinoceros calves (two males, five females) and one hand-reared calf (male), ranging from one month to four years, was simultaneously audio- and video-taped at three zoos. Normally reared infants and juveniles uttered four discriminable call types (Whine, Snort, Threat, and Pant) that were produced in different behavioural contexts. All call types were also uttered by the hand-reared calf. Call rates of Whines, but not of the other call types, decreased with age. These findings provide the first evidence that infant and juvenile rhinoceros utter specific call types in distinct contexts, even if they grow up with limited social interaction with conspecifics. By comparing our findings with the current literature on vocalisations of adult white rhinoceros and other solitary rhinoceros species, we discuss to what extent differences in the social lifestyle across species affect acoustic communication in mammals.
Frequency of Use Leads to Automaticity of Production: Evidence from Repair in Conversation
ERIC Educational Resources Information Center
Kapatsinski, Vsevolod
2010-01-01
In spontaneous speech, speakers sometimes replace a word they have just produced or started producing by another word. The present study reports that in these replacement repairs, low-frequency replaced words are more likely to be interrupted prior to completion than high-frequency words, providing support to the hypothesis that the production of…
Speech Recognition Technology for Disabilities Education
ERIC Educational Resources Information Center
Tang, K. Wendy; Kamoua, Ridha; Sutan, Victor; Farooq, Omer; Eng, Gilbert; Chu, Wei Chern; Hou, Guofeng
2005-01-01
Speech recognition is an alternative to traditional methods of interacting with a computer, such as textual input through a keyboard. An effective system can replace or reduce reliance on standard keyboard and mouse input. This can especially assist dyslexic students who have problems with character or word use and manipulation in a textual…
Will Microfilm and Computers Replace Clippings?
ERIC Educational Resources Information Center
Oppendahl, Alison; And Others
Four speeches are presented, each of which deals with the use of computers to organize and retrieve news stories. The first speech relates in detail the step-by-step process devised by the "Free Press" in Detroit to analyze, categorize, code, film, process, and retrieve news stories through the use of the electronic film retrieval…
Dr Samuel Johnson's movement disorder.
Murray, T J
1979-01-01
Dr Samuel Johnson was noted by his friends to have almost constant tics and gesticulations, which startled those who met him for the first time. He also made noises and whistling sounds; he made repeated sounds and words and irregular or blowing respiratory noises. Further, he often carried out pronounced compulsive acts, such as touching posts, measuring his footsteps on leaving a room, and performing peculiar complex gestures and steps before crossing a threshold. His symptoms of (a) involuntary muscle jerking movements and complex motor acts, (b) involuntary vocalisation, and (c) compulsive actions constitute the symptom complex of Gilles de la Tourette syndrome (Tourette's syndrome), from which Johnson suffered most of his life. This syndrome is of increasing interest recently because it responds to haloperidol, and because there are new insights into a possible biochemical basis for the tics, vocalisations, and compulsions. PMID:380753
First insights into the vocal repertoire of infant and juvenile Southern white rhinoceros
Boeer, Michael; Scheumann, Marina
2018-01-01
Describing vocal repertoires represents an essential step towards gaining an overview about the complexity of acoustic communication in a given species. The analysis of infant vocalisations is essential for understanding the development and usage of species-specific vocalisations, but is often underrepresented, especially in species with long inter-birth intervals such as the white rhinoceros. Thus, this study aimed for the first time to characterise the infant and juvenile vocal repertoire of the Southern white rhinoceros and to relate these findings to the adult vocal repertoire. The behaviour of seven mother-reared white rhinoceros calves (two males, five females) and one hand-reared calf (male), ranging from one month to four years, was simultaneously audio- and video-taped at three zoos. Normally reared infants and juveniles uttered four discriminable call types (Whine, Snort, Threat, and Pant) that were produced in different behavioural contexts. All call types were also uttered by the hand-reared calf. Call rates of Whines, but not of the other call types, decreased with age. These findings provide the first evidence that infant and juvenile rhinoceros utter specific call types in distinct contexts, even if they grow up with limited social interaction with conspecifics. By comparing our findings with the current literature on vocalisations of adult white rhinoceros and other solitary rhinoceros species, we discuss to what extent differences in the social lifestyle across species affect acoustic communication in mammals. PMID:29513670
Learning to Comprehend Foreign-Accented Speech by Means of Production and Listening Training
ERIC Educational Resources Information Center
Grohe, Ann-Kathrin; Weber, Andrea
2016-01-01
The effects of production and listening training on the subsequent comprehension of foreign-accented speech were investigated in a training-test paradigm. During training, German nonnative (L2) and English native (L1) participants listened to a story spoken by a German speaker who replaced all English /θ/s with /t/ (e.g., *"teft" for…
Loss tolerant speech decoder for telecommunications
NASA Technical Reports Server (NTRS)
Prieto, Jr., Jaime L. (Inventor)
1999-01-01
A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
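A minimal sketch of the frame-concealment idea described above: a small feed-forward network maps a buffer of past codec-parameter frames to a one-step prediction of the next frame, which replaces a lost frame. Layer sizes, history length, and parameter count are illustrative assumptions, training on past-history/next-frame pairs is omitted, and this is not the patented implementation.

```python
import numpy as np
import torch
import torch.nn as nn

N_HISTORY = 4    # past frames kept in the history buffer (assumed)
N_PARAMS = 10    # codec (SCA) parameters per frame (assumed)

class FrameExtrapolator(nn.Module):
    """Feed-forward net mapping the last N_HISTORY frames of parameters
    to a one-step prediction of the next frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_HISTORY * N_PARAMS, 64),
            nn.Tanh(),
            nn.Linear(64, N_PARAMS),
        )

    def forward(self, history):          # history: (batch, N_HISTORY * N_PARAMS)
        return self.net(history)

def conceal(frames, lost_mask, model):
    """Replace lost frames with extrapolated parameters.
    frames: (T, N_PARAMS) float array of decoded codec parameters;
    lost_mask: (T,) bool array marking frames lost or in error."""
    frames = frames.astype(np.float32).copy()
    for t in range(N_HISTORY, len(frames)):
        if lost_mask[t]:
            hist = torch.from_numpy(frames[t - N_HISTORY:t].ravel()).unsqueeze(0)
            with torch.no_grad():
                frames[t] = model(hist).squeeze(0).numpy()
    return frames
```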
Rapid onset of maternal vocal recognition in a colonially breeding mammal, the Australian sea lion.
Pitcher, Benjamin J; Harcourt, Robert G; Charrier, Isabelle
2010-08-13
In many gregarious mammals, mothers and offspring have developed the ability to recognise each other using acoustic signals. Such capacity may develop at different rates after birth/parturition, varying between species and between the participants, i.e., mothers and young. Differences in selective pressures between species, and between mothers and offspring, are likely to drive the timing of the onset of mother-young recognition. We tested the ability of Australian sea lion mothers to identify their offspring by vocalisation, and examined the onset of this behaviour in these females. We hypothesise that a rapid onset of recognition may reflect an adaptation to a colonial lifestyle. In a playback study, maternal responses to own-pup and non-filial vocalisations were compared at 12 hours, 24 hours, and every subsequent 24 hours until the females' first departure post-partum. Mothers showed a clear ability to recognise their pup's voice by 48 hours of age. At 24 hours mothers called more, at 48 hours they called sooner and at 72 hours they looked sooner in response to their own pup's vocalisations compared to those of non-filial pups. We demonstrate that Australian sea lion females can vocally identify offspring within two days of birth and before mothers leave to forage post-partum. We suggest that this rapid onset is a result of selection pressures imposed by a colonial lifestyle and may be seen in other colonial vertebrates. This is the first demonstration of the timing of the onset of maternal vocal recognition in a pinniped species.
Comparing speech and nonspeech context effects across timescales in coarticulatory contexts.
Viswanathan, Navin; Kelty-Stephen, Damian G
2018-02-01
Context effects are ubiquitous in speech perception and reflect the ability of human listeners to successfully perceive highly variable speech signals. In the study of how listeners compensate for coarticulatory variability, past studies have used similar effects for speech and for tone analogues of speech as strong support for speech-neutral, general auditory mechanisms for compensation for coarticulation. In this manuscript, we revisit compensation for coarticulation by replacing standard button-press responses with mouse-tracking responses and examining both standard geometric measures of uncertainty as well as newer information-theoretic measures that separate fast from slow mouse movements. We found that when our analyses were restricted to end-state responses, tones and speech contexts appeared to produce similar effects. However, a more detailed time-course analysis revealed systematic differences between speech and tone contexts such that listeners' responses to speech contexts, but not to tone contexts, changed across the experimental session. Analyses of the time course of effects within trials using mouse tracking indicated that speech contexts elicited fewer x-position flips but more area under the curve (AUC) and maximum deviation (MD), and they did so in the slower portions of mouse-tracking movements. Our results indicate critical differences between the time course of speech and nonspeech context effects and suggest that general auditory explanations, motivated by their apparent similarity, should be reexamined.
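The geometric mouse-tracking measures mentioned above (x-position flips, area under the curve, maximum deviation) are typically computed relative to the straight line from a trajectory's start to its end point. A sketch of one common way to do this follows; sign conventions, remapping, and time-normalisation differ between studies.

```python
import numpy as np

def mouse_measures(x, y):
    """Common mouse-tracking measures for one trial.
    x, y: cursor coordinates from trial start to response.
    Deviation is measured perpendicular to the straight start-to-end line;
    MD is its maximum, AUC integrates its absolute value over samples, and
    x-flips count direction reversals along the x axis."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    p0 = np.array([x[0], y[0]])
    line = np.array([x[-1], y[-1]]) - p0
    line_len = np.hypot(*line)
    dx, dy = x - p0[0], y - p0[1]
    # Signed perpendicular distance of each sample from the start-end line
    dev = (line[0] * dy - line[1] * dx) / line_len
    md = np.abs(dev).max()                                   # maximum deviation
    auc = np.trapz(np.abs(dev))                              # area under the curve
    xflips = int(np.sum(np.diff(np.sign(np.diff(x))) != 0))  # x reversals
    return {"MD": md, "AUC": auc, "x_flips": xflips}

# Hypothetical trajectory that bows away from the direct path
t = np.linspace(0.0, 1.0, 50)
print(mouse_measures(t, 0.3 * np.sin(np.pi * t)))
```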
Stilp, Christian E.; Goupell, Matthew J.
2015-01-01
Short-time spectral changes in the speech signal are important for understanding noise-vocoded sentences. These information-bearing acoustic changes, measured using cochlea-scaled entropy in cochlear implant simulations [CSECI; Stilp et al. (2013). J. Acoust. Soc. Am. 133(2), EL136–EL141; Stilp (2014). J. Acoust. Soc. Am. 135(3), 1518–1529], may offer better understanding of speech perception by cochlear implant (CI) users. However, perceptual importance of CSECI for normal-hearing listeners was tested at only one spectral resolution and one temporal resolution, limiting generalizability of results to CI users. Here, experiments investigated the importance of these informational changes for understanding noise-vocoded sentences at different spectral resolutions (4–24 spectral channels; Experiment 1), temporal resolutions (4–64 Hz cutoff for low-pass filters that extracted amplitude envelopes; Experiment 2), or when both parameters varied (6–12 channels, 8–32 Hz; Experiment 3). Sentence intelligibility was reduced more by replacing high-CSECI intervals with noise than replacing low-CSECI intervals, but only when sentences had sufficient spectral and/or temporal resolution. High-CSECI intervals were more important for speech understanding as spectral resolution worsened and temporal resolution improved. Trade-offs between CSECI and intermediate spectral and temporal resolutions were minimal. These results suggest that signal processing strategies that emphasize information-bearing acoustic changes in speech may improve speech perception for CI users. PMID:25698018
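A rough sketch, in the spirit of the CSECI measure above, of quantifying information-bearing acoustic change: band-pass the signal into roughly cochlea-scaled channels, take short-slice band energies, and measure how much the spectral pattern changes between adjacent slices. The filter design, band edges, and slice length here are assumptions, not the published procedure.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def spectral_change(signal, fs, band_edges, slice_ms=16):
    """Euclidean distance between band-RMS vectors of adjacent short slices,
    one value per slice boundary (larger = more information-bearing change)."""
    slice_len = int(fs * slice_ms / 1000)
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands.append(sosfilt(sos, signal))
    bands = np.array(bands)                                   # (n_bands, n_samples)
    n_slices = bands.shape[1] // slice_len
    rms = np.sqrt(np.mean(
        bands[:, :n_slices * slice_len].reshape(len(bands), n_slices, slice_len) ** 2,
        axis=2))                                              # (n_bands, n_slices)
    return np.linalg.norm(np.diff(rms, axis=1), axis=0)       # (n_slices - 1,)

# Hypothetical use: 1 s of noise at 16 kHz, illustrative band edges (Hz)
fs = 16000
x = np.random.randn(fs)
edges = [100, 300, 600, 1000, 1600, 2500, 4000, 6000]
print(spectral_change(x, fs, edges)[:5])
```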
Zeng, Yin-Ting; Hwu, Wuh-Liang; Torng, Pao-Chuan; Lee, Ni-Chung; Shieh, Jeng-Yi; Lu, Lu; Chien, Yin-Hsiu
2017-05-01
Patients with infantile-onset Pompe disease (IOPD) can be treated by recombinant human acid alpha glucosidase (rhGAA) replacement beginning at birth with excellent survival rates, but they still commonly present with speech disorders. This study investigated the progress of speech disorders in these early-treated patients and ascertained the relationship with treatments. Speech disorders, including hypernasal resonance, articulation disorders, and speech intelligibility, were scored by speech-language pathologists using auditory perception in seven early-treated patients over a period of 6 years. Statistical analysis of the first and last evaluations of the patients was performed with the Wilcoxon signed-rank test. A total of 29 speech samples were analyzed. All the patients suffered from hypernasality, articulation disorder, and impairment in speech intelligibility at the age of 3 years. The conditions were stable, and 2 patients developed normal or near normal speech during follow-up. Speech therapy and a high dose of rhGAA appeared to improve articulation in 6 of the 7 patients (86%, p = 0.028) by decreasing the omission of consonants, which consequently increased speech intelligibility (p = 0.041). Severity of hypernasality greatly reduced only in 2 patients (29%, p = 0.131). Speech disorders were common even in early and successfully treated patients with IOPD; however, aggressive speech therapy and high-dose rhGAA could improve their speech disorders. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
Development of speech prostheses: current status and recent advances
Brumberg, Jonathan S; Guenther, Frank H
2010-01-01
Brain–computer interfaces (BCIs) have been developed over the past decade to restore communication to persons with severe paralysis. In the most severe cases of paralysis, known as locked-in syndrome, patients retain cognition and sensation, but are capable of only slight voluntary eye movements. For these patients, no standard communication method is available, although some can use BCIs to communicate by selecting letters or words on a computer. Recent research has sought to improve on existing techniques by using BCIs to create a direct prediction of speech utterances rather than to simply control a spelling device. Such methods are the first steps towards speech prostheses as they are intended to entirely replace the vocal apparatus of paralyzed users. This article outlines many well known methods for restoration of communication by BCI and illustrates the difference between spelling devices and direct speech prediction or speech prosthesis. PMID:20822389
Dissecting choral speech: properties of the accompanist critical to stuttering reduction.
Kiefte, Michael; Armson, Joy
2008-01-01
The effects of choral speech and altered auditory feedback (AAF) on stuttering frequency were compared to identify those properties of choral speech that make it a more effective condition for stuttering reduction. Seventeen adults who stutter (AWS) participated in an experiment consisting of special choral speech conditions that were manipulated to selectively eliminate specific differences between choral speech and AAF. Consistent with previous findings, results showed that both choral speech and AAF reduced stuttering compared to solo reading. Although reductions under AAF were substantial, they were less dramatic than those for choral speech. Stuttering reduction for choral speech was highly robust even when the accompanist's voice temporally lagged that of the AWS, when there was no opportunity for dynamic interplay between the AWS and accompanist, and when the accompanist was replaced by the AWS's own voice, all of which approximate specific features of AAF. Choral speech was also highly effective in reducing stuttering across changes in speech rate and for both familiar and unfamiliar passages. We concluded that differences in properties between choral speech and AAF other than those that were manipulated in this experiment must account for differences in stuttering reduction. The reader will be able to (1) describe differences in stuttering reduction associated with altered auditory feedback compared to choral speech conditions and (2) describe differences between delivery of a second voice signal as an altered rendition of the speaker's own voice (altered auditory feedback) and alterations in the voice of an accompanist (choral speech).
The role of vocal individuality in conservation
Terry, Andrew MR; Peake, Tom M; McGregor, Peter K
2005-01-01
Identifying the individuals within a population can generate information on life history parameters, generate input data for conservation models, and highlight behavioural traits that may affect management decisions and error or bias within census methods. Individual animals can be discriminated by features of their vocalisations. This vocal individuality can be utilised as an alternative marking technique in situations where the marks are difficult to detect or animals are sensitive to disturbance. Vocal individuality can also be used in cases where the capture and handling of an animal is either logistically or ethically problematic. Many studies have suggested that vocal individuality can be used to count and monitor populations over time; however, few have explicitly tested the method in this role. In this review we discuss methods for extracting individuality information from vocalisations and techniques for using this to count and monitor populations over time. We present case studies in birds where vocal individuality has been applied to conservation and we discuss its role in mammals. PMID:15960848
How do we use language? Shared patterns in the frequency of word use across 17 world languages
Calude, Andreea S.; Pagel, Mark
2011-01-01
We present data from 17 languages on the frequency with which a common set of words is used in everyday language. The languages are drawn from six language families representing 65 per cent of the world's 7000 languages. Our data were collected from linguistic corpora that record frequencies of use for the 200 meanings in the widely used Swadesh fundamental vocabulary. Our interest is to assess evidence for shared patterns of language use around the world, and for the relationship of language use to rates of lexical replacement, defined as the replacement of a word by a new unrelated or non-cognate word. Frequencies of use for words in the Swadesh list range from just a few per million words of speech to 191 000 or more. The average inter-correlation among languages in the frequency of use across the 200 words is 0.73 (p < 0.0001). The first principal component of these data accounts for 70 per cent of the variance in frequency of use. Elsewhere, we have shown that frequently used words in the Indo-European languages tend to be more conserved, and that this relationship holds separately for different parts of speech. A regression model combining the principal factor loadings derived from the worldwide sample along with their part of speech predicts 46 per cent of the variance in the rates of lexical replacement in the Indo-European languages. This suggests that Indo-European lexical replacement rates might be broadly representative of worldwide rates of change. Evidence for this speculation comes from using the same factor loadings and part-of-speech categories to predict a word's position in a list of 110 words ranked from slowest to most rapidly evolving among 14 of the world's language families. This regression model accounts for 30 per cent of the variance. Our results point to a remarkable regularity in the way that human speakers use language, and hint that, among the words for a shared set of meanings, some have been evolving slowly and others more rapidly throughout human history. PMID:21357232
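A hedged sketch of the kind of analysis described above: the first principal component of frequency-of-use across languages is entered, together with part of speech, into a regression on lexical replacement rate. It assumes a hypothetical table with one row per Swadesh meaning; the file and column names are invented for illustration.

```python
import pandas as pd
from sklearn.decomposition import PCA
import statsmodels.formula.api as smf

# Hypothetical data: one row per Swadesh-list meaning, one log-frequency
# column per language, plus part of speech and an estimated lexical
# replacement rate (all file/column names assumed).
words = pd.read_csv("swadesh_frequencies.csv")
freq_cols = [c for c in words.columns if c.startswith("logfreq_")]

# First principal component of frequency of use across languages
pca = PCA(n_components=1)
words["freq_pc1"] = pca.fit_transform(words[freq_cols])[:, 0]
print("variance explained:", pca.explained_variance_ratio_[0])

# Regress lexical replacement rate on the shared frequency component
# and part of speech (categorical)
model = smf.ols("replacement_rate ~ freq_pc1 + C(part_of_speech)",
                data=words).fit()
print(model.summary())
print("R^2:", model.rsquared)
```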
Zappella, Michele; Einspieler, Christa; Bartl-Pokorny, Katrin D; Krieber, Magdalena; Coleman, Mary; Bölte, Sven; Marschik, Peter B
2015-10-01
Little is known about the first half year of life of individuals later diagnosed with autism spectrum disorders (ASD). There is even a complete lack of observations on the first 6 months of life of individuals with transient autistic behaviours who improved in their socio-communicative functions in the pre-school age. To compare early development of individuals with transient autistic behaviours and those later diagnosed with ASD. Exploratory study; retrospective home video analysis. 18 males, videoed between birth and the age of 6 months (ten individuals later diagnosed with ASD; eight individuals who lost their autistic behaviours after the age of 3 and achieved age-adequate communicative abilities, albeit often accompanied by tics and attention deficit). The detailed video analysis focused on general movements (GMs), the concurrent motor repertoire, eye contact, responsive smiling, and pre-speech vocalisations. Abnormal GMs were observed more frequently in infants later diagnosed with ASD, whereas all but one infant with transient autistic behaviours had normal GMs (p<0.05). Eye contact and responsive smiling were inconspicuous for all individuals. Cooing was not observable in six individuals across both groups. GMs might be one of the markers which could assist the earlier identification of ASD. We recommend implementing the GM assessment in prospective studies on ASD. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Speech, stone tool-making and the evolution of language.
Cataldo, Dana Michelle; Migliano, Andrea Bamberg; Vinicius, Lucio
2018-01-01
The 'technological hypothesis' proposes that gestural language evolved in early hominins to enable the cultural transmission of stone tool-making skills, with speech appearing later in response to the complex lithic industries of more recent hominins. However, no flintknapping study has assessed the efficiency of speech alone (unassisted by gesture) as a tool-making transmission aid. Here we show that subjects instructed by speech alone underperform in stone tool-making experiments in comparison to subjects instructed through either gesture alone or 'full language' (gesture plus speech), and also report lower satisfaction with their received instruction. The results provide evidence that gesture was likely to be selected over speech as a teaching aid in the earliest hominin tool-makers; that speech could not have replaced gesturing as a tool-making teaching aid in later hominins, possibly explaining the functional retention of gesturing in the full language of modern humans; and that speech may have evolved for reasons unrelated to tool-making. We conclude that speech is unlikely to have evolved as tool-making teaching aid superior to gesture, as claimed by the technological hypothesis, and therefore alternative views should be considered. For example, gestural language may have evolved to enable tool-making in earlier hominins, while speech may have later emerged as a response to increased trade and more complex inter- and intra-group interactions in Middle Pleistocene ancestors of Neanderthals and Homo sapiens; or gesture and speech may have evolved in parallel rather than in sequence.
Ozker, Muge; Schepers, Inga M; Magnotti, John F; Yoshor, Daniel; Beauchamp, Michael S
2017-06-01
Human speech can be comprehended using only auditory information from the talker's voice. However, comprehension is improved if the talker's face is visible, especially if the auditory information is degraded as occurs in noisy environments or with hearing loss. We explored the neural substrates of audiovisual speech perception using electrocorticography, direct recording of neural activity using electrodes implanted on the cortical surface. We observed a double dissociation in the responses to audiovisual speech with clear and noisy auditory component within the superior temporal gyrus (STG), a region long known to be important for speech perception. Anterior STG showed greater neural activity to audiovisual speech with clear auditory component, whereas posterior STG showed similar or greater neural activity to audiovisual speech in which the speech was replaced with speech-like noise. A distinct border between the two response patterns was observed, demarcated by a landmark corresponding to the posterior margin of Heschl's gyrus. To further investigate the computational roles of both regions, we considered Bayesian models of multisensory integration, which predict that combining the independent sources of information available from different modalities should reduce variability in the neural responses. We tested this prediction by measuring the variability of the neural responses to single audiovisual words. Posterior STG showed smaller variability than anterior STG during presentation of audiovisual speech with noisy auditory component. Taken together, these results suggest that posterior STG but not anterior STG is important for multisensory integration of noisy auditory and visual speech.
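The Bayesian prediction tested above follows from standard maximum-likelihood cue combination: when independent auditory and visual estimates are combined with reliability weights, the combined estimate's variance is smaller than either cue's alone. A minimal worked example (the numbers are illustrative only):

```python
# Reliability-weighted combination of two independent Gaussian estimates.
def combine(mu_a, var_a, mu_v, var_v):
    w_a = (1 / var_a) / (1 / var_a + 1 / var_v)   # auditory reliability weight
    mu_av = w_a * mu_a + (1 - w_a) * mu_v
    var_av = (var_a * var_v) / (var_a + var_v)    # always <= min(var_a, var_v)
    return mu_av, var_av

# Noisy auditory cue (high variance) plus a clearer visual cue:
print(combine(mu_a=0.2, var_a=4.0, mu_v=0.0, var_v=1.0))  # -> (0.04, 0.8)
```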
Call Combinations in Monkeys: Compositional or Idiomatic Expressions?
ERIC Educational Resources Information Center
Arnold, Kate; Zuberbuhler, Klaus
2012-01-01
Syntax is widely considered the feature that most decisively sets human language apart from other natural communication systems. Animal vocalisations are generally considered to be holistic with few examples of utterances meaning something other than the sum of their parts. Previously, we have shown that male putty-nosed monkeys produce call…
Role of working memory and lexical knowledge in perceptual restoration of interrupted speech.
Nagaraj, Naveen K; Magimairaj, Beula M
2017-12-01
The role of working memory (WM) capacity and lexical knowledge in perceptual restoration (PR) of missing speech was investigated using the interrupted speech perception paradigm. Speech identification ability, which indexed PR, was measured using low-context sentences periodically interrupted at 1.5 Hz. PR was measured for silent gated, low-frequency speech noise filled, and low-frequency fine-structure and envelope filled interrupted conditions. WM capacity was measured using verbal and visuospatial span tasks. Lexical knowledge was assessed using both receptive vocabulary and meaning from context tests. Results showed that PR was better for speech noise filled condition than other conditions tested. Both receptive vocabulary and verbal WM capacity explained unique variance in PR for the speech noise filled condition, but were unrelated to performance in the silent gated condition. It was only receptive vocabulary that uniquely predicted PR for fine-structure and envelope filled conditions. These findings suggest that the contribution of lexical knowledge and verbal WM during PR depends crucially on the information content that replaced the silent intervals. When perceptual continuity was partially restored by filler speech noise, both lexical knowledge and verbal WM capacity facilitated PR. Importantly, for fine-structure and envelope filled interrupted conditions, lexical knowledge was crucial for PR.
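A minimal sketch of generating periodically interrupted speech of the kind used above: a 1.5 Hz square-wave gate deletes portions of the signal, and the deleted portions are either left silent or filled with low-pass noise. The duty cycle, filler bandwidth, and level matching are assumptions rather than the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def interrupt(signal, fs, rate_hz=1.5, duty=0.5, filler=None):
    """Periodically gate a speech signal on/off at rate_hz.
    duty is the fraction of each cycle kept; deleted portions are left
    silent or replaced sample-by-sample with `filler` (same length)."""
    t = np.arange(len(signal)) / fs
    keep = (t * rate_hz) % 1.0 < duty            # square-wave gate
    if filler is None:
        filler = np.zeros_like(signal)
    return np.where(keep, signal, filler)

# Hypothetical filler: low-pass-filtered noise matched in overall RMS
fs = 16000
speech = np.random.randn(3 * fs)                 # stand-in for a sentence
sos = butter(4, 1000, btype="lowpass", fs=fs, output="sos")
noise = sosfilt(sos, np.random.randn(len(speech)))
noise *= np.sqrt(np.mean(speech**2) / np.mean(noise**2))
stimulus = interrupt(speech, fs, filler=noise)
```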
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.
The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
Ozker, Muge; Schepers, Inga M.; Magnotti, John F.; Yoshor, Daniel; Beauchamp, Michael S.
2017-01-01
Human speech can be comprehended using only auditory information from the talker’s voice. However, comprehension is improved if the talker’s face is visible, especially if the auditory information is degraded as occurs in noisy environments or with hearing loss. We explored the neural substrates of audiovisual speech perception using electrocorticography, direct recording of neural activity using electrodes implanted on the cortical surface. We observed a double dissociation in the responses to audiovisual speech with clear and noisy auditory component within the superior temporal gyrus (STG), a region long known to be important for speech perception. Anterior STG showed greater neural activity to audiovisual speech with clear auditory component, whereas posterior STG showed similar or greater neural activity to audiovisual speech in which the speech was replaced with speech-like noise. A distinct border between the two response patterns was observed, demarcated by a landmark corresponding to the posterior margin of Heschl’s gyrus. To further investigate the computational roles of both regions, we considered Bayesian models of multisensory integration, which predict that combining the independent sources of information available from different modalities should reduce variability in the neural responses. We tested this prediction by measuring the variability of the neural responses to single audiovisual words. Posterior STG showed smaller variability than anterior STG during presentation of audiovisual speech with noisy auditory component. Taken together, these results suggest that posterior STG but not anterior STG is important for multisensory integration of noisy auditory and visual speech. PMID:28253074
Complex patterns of signalling to convey different social goals of sex in bonobos, Pan paniscus.
Genty, Emilie; Neumann, Christof; Zuberbühler, Klaus
2015-11-05
Sexual behaviour in bonobos (Pan paniscus) functions beyond mere reproduction to mediate social interactions and relationships. In this study, we assessed the signalling behaviour in relation to four social goals of sex in this species: appeasement after conflict, tension reduction, social bonding and reproduction. Overall, sexual behaviour was strongly decoupled from its ancestral reproductive function with habitual use in the social domain, which was accompanied by a corresponding complexity in communication behaviour. We found that signalling behaviour varied systematically depending on the initiator's goals and gender. Although all gestures and vocalisations were part of the species-typical communication repertoire, they were often combined and produced flexibly. Generally, gestures and multi-modal combinations were more flexibly used to communicate a goal than vocalisations. There was no clear relation between signalling behaviour and success of sexual initiations, suggesting that communication was primarily used to indicate the signaller's intention, and not to influence a recipient's willingness to interact sexually. We discuss these findings in light of the larger question of what may have caused, in humans, the evolutionary transition from primate-like communication to language.
The mimetic repertoire of the spotted bowerbird Ptilonorhynchus maculatus
NASA Astrophysics Data System (ADS)
Kelley, Laura A.; Healy, Susan D.
2011-06-01
Although vocal mimicry in songbirds is well documented, little is known about the function of such mimicry. One possibility is that the mimic produces the vocalisations of predatory or aggressive species to deter potential predators or competitors. Alternatively, these sounds may be learned in error as a result of their acoustic properties such as structural simplicity. We determined the mimetic repertoires of a population of male spotted bowerbirds Ptilonorhynchus maculatus, a species that mimics predatory and aggressive species. Although male mimetic repertoires contained an overabundance of vocalisations produced by species that were generally aggressive, there was also a marked prevalence of mimicry of sounds that are associated with alarm such as predator calls, alarm calls and mobbing calls, irrespective of whether the species being mimicked was aggressive or not. We propose that it may be the alarming context in which these sounds are first heard that may lead both to their acquisition and to their later reproduction. We suggest that enhanced learning capability during acute stress may explain vocal mimicry in many species that mimic sounds associated with alarm.
Korean speech sound development in children from bilingual Japanese-Korean environments
Kim, Jeoung Suk; Lee, Jun Ho; Choi, Yoon Mi; Kim, Hyun Gi; Kim, Sung Hwan; Lee, Min Kyung
2010-01-01
Purpose This study investigates Korean speech sound development, including articulatory error patterns, among the Japanese-Korean children whose mothers are Japanese immigrants to Korea. Methods The subjects were 28 Japanese-Korean children with normal development born to Japanese women immigrants who lived in Jeonbuk province, Korea. They were assessed through Computerized Speech Lab 4500. The control group consisted of 15 Korean children who lived in the same area. Results The values of the voice onset time of consonants /ph/, /t/, /th/, and /k*/ among the children were prolonged. The children replaced the lenis sounds with aspirated or fortis sounds rather than replacing the fortis sounds with lenis or aspirated sounds, which are typical among Japanese immigrants. The children showed numerous articulatory errors for /c/ and /l/ sounds (similar to Koreans) rather than errors on /p/ sounds, which are more frequent among Japanese immigrants. The vowel formants of the children showed a significantly prolonged vowel /o/ as compared to that of Korean children (P<0.05). The Japanese immigrants and their children showed a similar substitution of /n/ for /ɧ/ [Japanese immigrants (62.5%) vs Japanese-Korean children (14.3%)], which is rarely seen among Koreans. Conclusion The findings suggest that Korean speech sound development among Japanese-Korean children is influenced not only by the Korean language environment but also by their maternal language. Therefore, appropriate language education programs may be warranted not only for immigrant women but also for their children. PMID:21189968
An Efficient Acoustic Density Estimation Method with Human Detectors Applied to Gibbons in Cambodia.
Kidney, Darren; Rawson, Benjamin M; Borchers, David L; Stevenson, Ben C; Marques, Tiago A; Thomas, Len
2016-01-01
Some animal species are hard to see but easy to hear. Standard visual methods for estimating population density for such species are often ineffective or inefficient, but methods based on passive acoustics show more promise. We develop spatially explicit capture-recapture (SECR) methods for territorial vocalising species, in which humans act as an acoustic detector array. We use SECR and estimated bearing data from a single-occasion acoustic survey of a gibbon population in northeastern Cambodia to estimate the density of calling groups. The properties of the estimator are assessed using a simulation study, in which a variety of survey designs are also investigated. We then present a new form of the SECR likelihood for multi-occasion data which accounts for the stochastic availability of animals. In the context of gibbon surveys this allows model-based estimation of the proportion of groups that produce territorial vocalisations on a given day, thereby enabling the density of groups, instead of the density of calling groups, to be estimated. We illustrate the performance of this new estimator by simulation. We show that it is possible to estimate density reliably from human acoustic detections of visually cryptic species using SECR methods. For gibbon surveys we also show that incorporating observers' estimates of bearings to detected groups substantially improves estimator performance. Using the new form of the SECR likelihood we demonstrate that estimates of availability, in addition to population density and detection function parameters, can be obtained from multi-occasion data, and that the detection function parameters are not confounded with the availability parameter. This acoustic SECR method provides a means of obtaining reliable density estimates for territorial vocalising species. It is also efficient in terms of data requirements since it only requires routine survey data. We anticipate that the low-tech field requirements will make this method an attractive option in many situations where populations can be surveyed acoustically by humans.
Kim, Min-Beom; Chung, Won-Ho; Choi, Jeesun; Hong, Sung Hwa; Cho, Yang-Sun; Park, Gyuseok; Lee, Sangmin
2014-06-01
The objective was to evaluate speech perception improvement through Bluetooth-implemented hearing aids in hearing-impaired adults. Thirty subjects with bilateral symmetric moderate sensorineural hearing loss participated in this study. A Bluetooth-implemented hearing aid was fitted unilaterally in all study subjects. Objective speech recognition scores and subjective satisfaction were measured with a Bluetooth-implemented hearing aid replacing the acoustic connection from either a cellular phone or a loudspeaker system. In each system, participants were assigned to 4 conditions: wireless speech signal transmission into the hearing aid (wireless mode) in a quiet or noisy environment, and conventional speech signal transmission using the external microphone of the hearing aid (conventional mode) in a quiet or noisy environment. Participants also completed questionnaires to assess subjective satisfaction. In both the cellular phone and loudspeaker system situations, participants showed improvements in sentence and word recognition scores with the wireless mode compared to the conventional mode in both quiet and noise conditions (P < .001). Participants also reported subjective improvements, including better sound quality, less noise interference, and better accuracy and naturalness, when using the wireless mode (P < .001). Bluetooth-implemented hearing aids helped to improve subjective and objective speech recognition performance in quiet and noisy environments during the use of electronic audio devices.
Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline
2018-04-27
In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.
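The lax and strict counting criteria quoted above amount to a simple tally rule. The snippet below is a minimal sketch of that rule applied to a hypothetical per-child tally of phonological processes; the data structure, names, and counts are illustrative, not taken from the study.

```python
# Hypothetical tally: {child_id: {process_name: number of occurrences}}.
from collections import defaultdict

def processes_meeting_criterion(tallies, min_count, min_children=2):
    """Return processes produced at least `min_count` times by at least `min_children` children."""
    children_meeting = defaultdict(int)
    for counts in tallies.values():
        for process, n in counts.items():
            if n >= min_count:
                children_meeting[process] += 1
    return {p for p, k in children_meeting.items() if k >= min_children}

tallies = {
    "child_01": {"final consonant deletion": 6, "backing": 2},
    "child_02": {"final consonant deletion": 5, "glottal replacement": 1},
    "child_03": {"backing": 3},
}
print("lax criterion:", processes_meeting_criterion(tallies, min_count=2))
print("strict criterion:", processes_meeting_criterion(tallies, min_count=5))
```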
ERIC Educational Resources Information Center
Mayer, Peter; Crowley, Kevin; Kaminska, Zofia
2007-01-01
Theories of literacy acquisition, developed mostly with reference to English, have characterised this process as passing through a series of stages. The culmination of this process is a strategy which takes account of the complex relationship between graphemes and phonemes within a deep orthography (Frith (1985). In K. Patterson, & M. Coltheart,…
Directionality in Linguistic Change and Acquisition.
ERIC Educational Resources Information Center
Gomes, Christina Abreu
1999-01-01
Focuses on the directionality observed in the processes of change and acquisition of the prepositions that replaced Latin cases in the speech of Rio de Janeiro and in the contact Portuguese spoken by Brazilian Indians in the region of Xingu. (Author/VWL)
Perception of steady-state vowels and vowelless syllables by adults and children
NASA Astrophysics Data System (ADS)
Nittrouer, Susan
2005-04-01
Vowels can be produced as long, isolated, and steady-state, but that is not how they are found in natural speech. Instead, natural speech consists of almost continuously changing (i.e., dynamic) acoustic forms from which mature listeners recover underlying phonetic form. Some theories suggest that children need steady-state information to recognize vowels (and so learn vowel systems), even though that information is sparse in natural speech. The current study examined whether young children can recover vowel targets from dynamic forms, or whether they need steady-state information. Vowel recognition was measured for adults and children (3, 5, and 7 years) for natural productions of /dæd/, /dUd/, /æ/, and /U/, edited to make six stimulus sets: three dynamic (whole syllables; syllables with the middle 50 percent replaced by cough; syllables with all but the first and last three pitch periods replaced by cough), and three steady-state (natural, isolated vowels; reiterated pitch periods from those vowels; reiterated pitch periods from the syllables). Adults scored nearly perfectly on all but the first/last three pitch period stimuli. Children performed nearly perfectly only when the entire syllable was heard, and performed similarly (near 80%) for all other stimuli. Consequently, children need dynamic forms to perceive vowels; steady-state forms are not preferred.
Zappella, Michele; Einspieler, Christa; Bartl-Pokorny, Katrin D.; Krieber, Magdalena; Coleman, Mary; Bölte, Sven; Marschik, Peter B.
2018-01-01
Background Little is known about the first half year of life of individuals later diagnosed with autism spectrum disorders (ASD). There is even a complete lack of observations on the first 6 months of life of individuals with transient autistic behaviours who improved in their socio-communicative functions at pre-school age. Aim To compare early development of individuals with transient autistic behaviours and those later diagnosed with ASD. Study design Exploratory study; retrospective home video analysis. Subjects 18 males, videoed between birth and the age of 6 months (ten individuals later diagnosed with ASD; eight individuals who lost their autistic behaviours after the age of 3 and achieved age-adequate communicative abilities, albeit often accompanied by tics and attention deficit). Method The detailed video analysis focused on general movements (GMs), the concurrent motor repertoire, eye contact, responsive smiling, and pre-speech vocalisations. Results Abnormal GMs were observed more frequently in infants later diagnosed with ASD, whereas all but one infant with transient autistic behaviours had normal GMs (p < 0.05). Eye contact and responsive smiling were inconspicuous for all individuals. Cooing was not observable in six individuals across both groups. Conclusions GMs might be one of the markers that could assist the earlier identification of ASD. We recommend implementing the GM assessment in prospective studies on ASD. PMID:26246137
Approach to the Lighting Energy Savings in Japan for Global Climate Change Prevention
NASA Astrophysics Data System (ADS)
Hanada, Teizo
This report was presented as an invited speech at “the First Lighting Symposium of China, Japan and Korea” held in Beijing on October 24, 2008. The author introduces JELMA's proposal for energy saving in lighting and explains the purpose of its activities. Recent activities for replacing incandescent lamps with CFLi lamps, and their results, are also reported. Japan's next big target for lighting energy saving is to replace conventional fluorescent lamps with Hf-FLs.
Restoring speech perception with cochlear implants by spanning defective electrode contacts.
Frijns, Johan H M; Snel-Bongers, Jorien; Vellinga, Dirk; Schrage, Erik; Vanpoucke, Filiep J; Briaire, Jeroen J
2013-04-01
Even with six defective contacts, spanning can largely restore speech perception with the HiRes 120 speech processing strategy to the level supported by an intact electrode array. Moreover, the sound quality is not degraded. Previous studies have demonstrated reduced speech perception scores (SPS) with defective contacts in HiRes 120. This study investigated whether replacing defective contacts by spanning, i.e. current steering on non-adjacent contacts, is able to restore speech recognition to the level supported by an intact electrode array. Ten adult cochlear implant recipients (HiRes90K, HiFocus1J) with experience with HiRes 120 participated in this study. Three different defective electrode arrays were simulated (six separate defective contacts, three pairs or two triplets). The participants received three take-home strategies and were asked to evaluate the sound quality in five predefined listening conditions. After 3 weeks, SPS were evaluated with monosyllabic words in quiet and in speech-shaped background noise. The participants rated the sound quality equal for all take-home strategies. SPS with background noise were equal for all conditions tested. However, SPS in quiet (85% phonemes correct on average with the full array) decreased significantly with increasing spanning distance, with a 3% decrease for each spanned contact.
ERIC Educational Resources Information Center
Charlton, Jenna J. V.; Law, James
2014-01-01
There is evidence for co-occurrence of social, emotional and behavioural difficulties (SEBD) and communication/language difficulties in children. Our research investigated the feasibility of vocalisation technology, its combination with observational software and the efficacy of a novel coding scheme and assessment technique. It aimed to…
Couchoux, Charline; Aubert, Maxime; Garant, Dany; Réale, Denis
2015-05-06
Technological advances can greatly benefit the scientific community by making new areas of research accessible. The study of animal vocal communication, in particular, can gain new insights and knowledge from technological improvements in recording equipment. Our comprehension of the acoustic signals emitted by animals would be greatly improved if we could continuously track the daily natural emissions of individuals in the wild, especially in the context of integrating individual variation into evolutionary ecology research questions. We show here how this can be accomplished using an operational tiny audio recorder that can easily be fitted as an on-board acoustic data-logger on small free-ranging animals. The high-quality 24 h acoustic recording logged on the spy microphone device allowed us to very efficiently collect daylong chipmunk vocalisations, giving us much more detailed data than the classical use of a directional microphone over an entire field season. The recordings also allowed us to monitor individual activity patterns and record incredibly long resting heart rates, and to identify self-scratching events and even whining from pre-emerging pups in their maternal burrow.
Couchoux, Charline; Aubert, Maxime; Garant, Dany; Réale, Denis
2015-01-01
Technological advances can greatly benefit the scientific community by making new areas of research accessible. The study of animal vocal communication, in particular, can gain new insights and knowledge from technological improvements in recording equipment. Our comprehension of the acoustic signals emitted by animals would be greatly improved if we could continuously track the daily natural emissions of individuals in the wild, especially in the context of integrating individual variation into evolutionary ecology research questions. We show here how this can be accomplished using an operational tiny audio recorder that can easily be fitted as an on-board acoustic data-logger on small free-ranging animals. The high-quality 24 h acoustic recording logged on the spy microphone device allowed us to very efficiently collect daylong chipmunk vocalisations, giving us much more detailed data than the classical use of a directional microphone over an entire field season. The recordings also allowed us to monitor individual activity patterns and record incredibly long resting heart rates, and to identify self-scratching events and even whining from pre-emerging pups in their maternal burrow. PMID:25944509
Complex patterns of signalling to convey different social goals of sex in bonobos, Pan paniscus
Genty, Emilie; Neumann, Christof; Zuberbühler, Klaus
2015-01-01
Sexual behaviour in bonobos (Pan paniscus) functions beyond mere reproduction to mediate social interactions and relationships. In this study, we assessed the signalling behaviour in relation to four social goals of sex in this species: appeasement after conflict, tension reduction, social bonding and reproduction. Overall, sexual behaviour was strongly decoupled from its ancestral reproductive function with habitual use in the social domain, which was accompanied by a corresponding complexity in communication behaviour. We found that signalling behaviour varied systematically depending on the initiator’s goals and gender. Although all gestures and vocalisations were part of the species-typical communication repertoire, they were often combined and produced flexibly. Generally, gestures and multi-modal combinations were more flexibly used to communicate a goal than vocalisations. There was no clear relation between signalling behaviour and success of sexual initiations, suggesting that communication was primarily used to indicate the signaller’s intention, and not to influence a recipient’s willingness to interact sexually. We discuss these findings in light of the larger question of what may have caused, in humans, the evolutionary transition from primate-like communication to language. PMID:26538281
Bergemann, Niels; Parzer, Peter; Jaggy, Susanne; Auler, Beatrice; Mundt, Christoph; Maier-Braunleder, Sabine
2008-01-01
Objective: The effects of estrogen on comprehension of metaphoric speech, word fluency, and verbal ability were investigated in women suffering from schizophrenia. The issue of estrogen-dependent neuropsychological performance could be highly relevant because women with schizophrenia frequently suffer from hypoestrogenism. Method: A placebo-controlled, double-blind, crossover study using 17β-estradiol for replacement therapy and as an adjunct to a naturalistic maintenance antipsychotic treatment was carried out over a period of 8 months. Nineteen women (mean age = 38.0 years, SD = 9.9 years) with schizophrenia were included in the study. Comprehension of metaphoric speech was measured by a lexical decision paradigm, word fluency, and verbal ability by a paper-and-pencil test. Results: Significant improvement was seen for the activation of metaphoric meaning during estrogen treatment (P = .013); in contrast, no difference was found for the activation of concrete meaning under this condition. Verbal ability and word fluency did not improve under estrogen replacement therapy either. Conclusions: This is the very first study based on estrogen intervention instead of the physiological hormone changes to examine the estrogen effects on neuropsychological performance in women with schizophrenia. In addition, it is the first time that the effect of estrogen on metaphoric speech comprehension was investigated in this context. While in a previous study estrogen therapy as adjunct to a naturalistic maintenance treatment with antipsychotics did not show an effect on psychopathology measured by a rating scale, a significant effect of estrogen on the comprehension of metaphoric speech and/or concretism, a main feature of schizophrenic thought and language disturbance, was found in the present study. Because the improvement of formal thought disorders and language disturbances is crucial for social integration of patients with schizophrenia, the results may have implications for the treatment of these individuals. PMID:18156639
Self-Desensitization and Meditation in the Reduction of Public Speaking Anxiety.
ERIC Educational Resources Information Center
Kirsch, Irving; Henry, David
1979-01-01
Speech-anxious students were assigned to self-administered treatment conditions: (1) systematic desensitization, (2) desensitization with meditation replacing progressive relaxation, and (3) meditation only. Treatment manuals included coping-skill instructions. Treatments were equally effective in reducing anxiety and produced a greater reduction…
Characterizing resonant component in speech: A different view of tracking fundamental frequency
NASA Astrophysics Data System (ADS)
Dong, Bin
2017-05-01
Motivated by the nonlinearity, nonstationarity, and modulations present in speech, the Hilbert-Huang transform and cyclostationarity analysis are employed in sequence to investigate speech resonance in vowels. The cyclostationarity analysis is not applied directly to the target vowel but to its intrinsic mode functions one by one. Thanks to the equivalence between the fundamental frequency in speech and the cyclic frequency in cyclostationarity analysis, the modulation intensity distributions of the intrinsic mode functions provide much information for the estimation of the fundamental frequency. To highlight the relationship between frequency and time, the pseudo-Hilbert spectrum is proposed here to replace the Hilbert spectrum. Contrasting the pseudo-Hilbert spectra and the modulation intensity distributions of the intrinsic mode functions shows that there is usually one intrinsic mode function that acts as the fundamental component of the vowel. Furthermore, the fundamental frequency of the vowel can be determined by tracing the pseudo-Hilbert spectrum of its fundamental component along the time axis. The latter method is more robust for estimating the fundamental frequency when nonlinear components are present. Two vowels, [a] and [i], taken from the FAU Aibo Emotion Corpus speech database, are used to validate the above findings.
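For readers unfamiliar with the decomposition step, the following sketch shows the general idea of extracting intrinsic mode functions and tracing the instantaneous frequency of the one that behaves as the fundamental component. It assumes the third-party PyEMD package and a synthetic vowel-like signal, and it does not reproduce the paper's pseudo-Hilbert spectrum or cyclostationarity analysis.

```python
import numpy as np
from scipy.signal import hilbert
from PyEMD import EMD  # assumed dependency: pip install EMD-signal

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
# Synthetic "vowel": 120 Hz fundamental plus two harmonics and a little noise.
x = (np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)
     + 0.3 * np.sin(2 * np.pi * 360 * t) + 0.02 * np.random.randn(t.size))

imfs = EMD()(x)  # rows are intrinsic mode functions, roughly ordered high to low frequency

def inst_freq(imf):
    phase = np.unwrap(np.angle(hilbert(imf)))
    return np.diff(phase) * fs / (2 * np.pi)

# Pick the IMF whose median instantaneous frequency falls in a plausible f0 range.
candidates = [(i, np.median(inst_freq(imf))) for i, imf in enumerate(imfs)]
fund_idx, f0_est = min(((i, f) for i, f in candidates if 60 <= f <= 400),
                       key=lambda c: c[1], default=(None, np.nan))
print(f"IMF {fund_idx} taken as fundamental component, median f0 estimate {f0_est:.1f} Hz")
```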
NASA Astrophysics Data System (ADS)
Whang, Tom; Ratib, Osman M.; Umamoto, Kathleen; Grant, Edward G.; McCoy, Michael J.
2002-05-01
The goal of this study is to determine the financial value and workflow improvements achievable by replacing traditional transcription services with a speech recognition system in a large, university hospital setting. Workflow metrics were measured at two hospitals, one of which exclusively uses a transcription service (UCLA Medical Center), and the other which exclusively uses speech recognition (West Los Angeles VA Hospital). Workflow metrics include time spent per report (the sum of time spent interpreting, dictating, reviewing, and editing), transcription turnaround, and total report turnaround. Compared to traditional transcription, speech recognition resulted in radiologists spending 13-32% more time per report, but it also resulted in reduction of report turnaround time by 22-62% and reduction of marginal cost per report by 94%. The model developed here helps justify the introduction of a speech recognition system by showing that the benefits of reduced operating costs and decreased turnaround time outweigh the cost of increased time spent per report. Whether the ultimate goal is to achieve a financial objective or to improve operational efficiency, it is important to conduct a thorough analysis of workflow before implementation.
Automated speech understanding: the next generation
NASA Astrophysics Data System (ADS)
Picone, J.; Ebel, W. J.; Deshmukh, N.
1995-04-01
Modern speech understanding systems merge interdisciplinary technologies from Signal Processing, Pattern Recognition, Natural Language, and Linguistics into a unified statistical framework. These systems, which have applications in a wide range of signal processing problems, represent a revolution in Digital Signal Processing (DSP). Once a field dominated by vector-oriented processors and linear algebra-based mathematics, the current generation of DSP-based systems rely on sophisticated statistical models implemented using a complex software paradigm. Such systems are now capable of understanding continuous speech input for vocabularies of several thousand words in operational environments. The current generation of deployed systems, based on small vocabularies of isolated words, will soon be replaced by a new technology offering natural language access to vast information resources such as the Internet, and provide completely automated voice interfaces for mundane tasks such as travel planning and directory assistance.
ERIC Educational Resources Information Center
Partridge, Mary
2011-01-01
Mary Partridge wanted her pupils not only to become more aware of competing and contrasting voices in the past, but to understand how historians orchestrate those voices. Using Edward Grim's eye-witness account of Thomas Becket's murder, her Year 7 pupils explored nuances in the word "shocking" as a way of distinguishing the responses of…
Post-glossectomy in lingual carcinomas: a scope for sign language in rehabilitation
Cumberbatch, Keren; Jones, Thaon
2017-01-01
The treatment option for cancers of the tongue is glossectomy, which may be partial, sub-total, or total, depending on the size of the tumour. Glossectomies result in speech deficits for these patients, and rehabilitative therapy involving communication modalities is highly recommended. Sign language is a possible therapeutic solution for post-glossectomy oral cancer patients. Patients with tongue cancers who have undergone total glossectomy as a surgical treatment can utilise sign language to replace their loss of speech production and maintain their engagement in life. This manuscript emphasises the importance of sign language in rehabilitation strategies in post-glossectomy patients. PMID:28947881
Post-glossectomy in lingual carcinomas: a scope for sign language in rehabilitation.
Rajendra Santosh, Arvind Babu; Cumberbatch, Keren; Jones, Thaon
2017-01-01
The treatment option for cancers of the tongue is glossectomy, which may be partial, sub-total, or total, depending on the size of the tumour. Glossectomies result in speech deficits for these patients, and rehabilitative therapy involving communication modalities is highly recommended. Sign language is a possible therapeutic solution for post-glossectomy oral cancer patients. Patients with tongue cancers who have undergone total glossectomy as a surgical treatment can utilise sign language to replace their loss of speech production and maintain their engagement in life. This manuscript emphasises the importance of sign language in rehabilitation strategies in post-glossectomy patients.
Wirtzfeld, Michael R; Ibrahim, Rasha A; Bruce, Ian C
2017-10-01
Perceptual studies of speech intelligibility have shown that slow variations of acoustic envelope (ENV) in a small set of frequency bands provide adequate information for good perceptual performance in quiet, whereas acoustic temporal fine-structure (TFS) cues play a supporting role in background noise. However, the implications for neural coding are prone to misinterpretation because the mean-rate neural representation can contain recovered ENV cues from cochlear filtering of TFS. We investigated ENV recovery and spike-time TFS coding using objective measures of simulated mean-rate and spike-timing neural representations of chimaeric speech, in which either the ENV or the TFS is replaced by another signal. We (a) evaluated the levels of mean-rate and spike-timing neural information for two categories of chimaeric speech, one retaining ENV cues and the other TFS; (b) examined the level of recovered ENV from cochlear filtering of TFS speech; (c) examined and quantified the contribution to recovered ENV from spike-timing cues using a lateral inhibition network (LIN); and (d) constructed linear regression models with objective measures of mean-rate and spike-timing neural cues and subjective phoneme perception scores from normal-hearing listeners. The mean-rate neural cues from the original ENV and recovered ENV partially accounted for perceptual score variability, with additional variability explained by the recovered ENV from the LIN-processed TFS speech. The best model predictions of chimaeric speech intelligibility were found when both the mean-rate and spike-timing neural cues were included, providing further evidence that spike-time coding of TFS cues is important for intelligibility when the speech envelope is degraded.
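The chimaeric-speech manipulation described above (ENV from one signal, TFS from another) can be sketched with band-pass filtering and the Hilbert transform, as below. This is a simplified illustration with stand-in signals and assumed band edges, not the authors' exact processing chain.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
sig_a = np.sin(2 * np.pi * 150 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))  # stand-in "speech"
sig_b = np.random.randn(t.size)                                              # stand-in "noise"

band_edges = [80, 300, 800, 2000, 5000]  # a small, assumed set of analysis bands (Hz)

def chimaera(env_source, tfs_source):
    """Combine the band envelopes of one signal with the band fine structure of another."""
    out = np.zeros_like(env_source)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, env_source)))             # ENV of first signal
        tfs = np.cos(np.angle(hilbert(sosfiltfilt(sos, tfs_source))))   # TFS of second signal
        out += env * tfs
    return out

speech_env_chimaera = chimaera(sig_a, sig_b)   # keeps ENV of A, TFS of B
speech_tfs_chimaera = chimaera(sig_b, sig_a)   # keeps ENV of B, TFS of A
print(speech_env_chimaera.shape, speech_tfs_chimaera.shape)
```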
van Gelder, C M; van Capelle, C I; Ebbink, B J; Moor-van Nugteren, I; van den Hout, J M P; Hakkesteegt, M M; van Doorn, P A; de Coo, I F M; Reuser, A J J; de Gier, H H W; van der Ploeg, A T
2012-05-01
Classic infantile Pompe disease is an inherited generalized glycogen storage disorder caused by deficiency of lysosomal acid α-glucosidase. If left untreated, patients die before one year of age. Although enzyme-replacement therapy (ERT) has significantly prolonged lifespan, it has also revealed new aspects of the disease. For up to 11 years, we investigated the frequency and consequences of facial-muscle weakness, speech disorders and dysphagia in long-term survivors. Sequential photographs were used to determine the timing and severity of facial-muscle weakness. Using standardized articulation tests and fibreoptic endoscopic evaluation of swallowing, we investigated speech and swallowing function in a subset of patients. This study included 11 patients with classic infantile Pompe disease. Median age at the start of ERT was 2.4 months (range 0.1-8.3 months), and median age at the end of the study was 4.3 years (range 7.7 months-12.2 years). All patients developed facial-muscle weakness before the age of 15 months. Speech was studied in four patients. Articulation was disordered, with hypernasal resonance and reduced speech intelligibility in all four. Swallowing function was studied in six patients, the most important findings being ineffective swallowing with residues of food (5/6), penetration or aspiration (3/6), and reduced pharyngeal and/or laryngeal sensibility (2/6). We conclude that facial-muscle weakness, speech disorders and dysphagia are common in long-term survivors receiving ERT for classic infantile Pompe disease. To improve speech and reduce the risk for aspiration, early treatment by a speech therapist and regular swallowing assessments are recommended.
Rehn, Nicola; Filatova, Olga A; Durban, John W; Foote, Andrew D
2011-01-01
Facial and vocal expressions of emotion have been found in a number of social mammal species and are thought to have evolved to aid social communication. There has been much debate about whether such signals are culturally inherited or are truly biologically innate. Evidence for the innateness of such signals can come from cross-cultural studies. Previous studies have identified a vocalisation (the V4 or 'excitement' call) associated with high arousal behaviours in a population of killer whales in British Columbia, Canada. In this study, we compared recordings from three different socially and reproductively isolated ecotypes of killer whales, including five vocal clans of one ecotype, each clan having discrete culturally transmitted vocal traditions. The V4 call was found in recordings of each ecotype and each vocal clan. Nine independent observers reproduced our classification of the V4 call from each population with high inter-observer agreement. Our results suggest the V4 call may be universal in Pacific killer whale populations and that transmission of this call is independent of cultural tradition or ecotype. We argue that such universality is more consistent with an innate vocalisation than one acquired through social learning and may be linked to its apparent function of motivational expression.
NASA Astrophysics Data System (ADS)
Rehn, Nicola; Filatova, Olga A.; Durban, John W.; Foote, Andrew D.
2011-01-01
Facial and vocal expressions of emotion have been found in a number of social mammal species and are thought to have evolved to aid social communication. There has been much debate about whether such signals are culturally inherited or are truly biologically innate. Evidence for the innateness of such signals can come from cross-cultural studies. Previous studies have identified a vocalisation (the V4 or `excitement' call) associated with high arousal behaviours in a population of killer whales in British Columbia, Canada. In this study, we compared recordings from three different socially and reproductively isolated ecotypes of killer whales, including five vocal clans of one ecotype, each clan having discrete culturally transmitted vocal traditions. The V4 call was found in recordings of each ecotype and each vocal clan. Nine independent observers reproduced our classification of the V4 call from each population with high inter-observer agreement. Our results suggest the V4 call may be universal in Pacific killer whale populations and that transmission of this call is independent of cultural tradition or ecotype. We argue that such universality is more consistent with an innate vocalisation than one acquired through social learning and may be linked to its apparent function of motivational expression.
Sex-biased sound symbolism in english-language first names.
Pitcher, Benjamin J; Mesoudi, Alex; McElligott, Alan G
2013-01-01
Sexual selection has resulted in sex-based size dimorphism in many mammals, including humans. In Western societies, average to taller stature men and comparatively shorter, slimmer women have higher reproductive success and are typically considered more attractive. This size dimorphism also extends to vocalisations in many species, again including humans, with larger individuals exhibiting lower formant frequencies than smaller individuals. Further, across many languages there are associations between phonemes and the expression of size (e.g. large /a, o/, small /i, e/), consistent with the frequency-size relationship in vocalisations. We suggest that naming preferences are a product of this frequency-size relationship, driving male names to sound larger and female names smaller, through sound symbolism. In a 10-year dataset of the most popular British, Australian and American names we show that male names are significantly more likely to contain larger sounding phonemes (e.g. "Thomas"), while female names are significantly more likely to contain smaller phonemes (e.g. "Emily"). The desire of parents to have comparatively larger, more masculine sons, and smaller, more feminine daughters, and the increased social success that accompanies more sex-stereotyped names, is likely to be driving English-language first names to exploit sound symbolism of size in line with sexual body size dimorphism.
Sex-Biased Sound Symbolism in English-Language First Names
Pitcher, Benjamin J.; Mesoudi, Alex; McElligott, Alan G.
2013-01-01
Sexual selection has resulted in sex-based size dimorphism in many mammals, including humans. In Western societies, average to taller stature men and comparatively shorter, slimmer women have higher reproductive success and are typically considered more attractive. This size dimorphism also extends to vocalisations in many species, again including humans, with larger individuals exhibiting lower formant frequencies than smaller individuals. Further, across many languages there are associations between phonemes and the expression of size (e.g. large /a, o/, small /i, e/), consistent with the frequency-size relationship in vocalisations. We suggest that naming preferences are a product of this frequency-size relationship, driving male names to sound larger and female names smaller, through sound symbolism. In a 10-year dataset of the most popular British, Australian and American names we show that male names are significantly more likely to contain larger sounding phonemes (e.g. “Thomas”), while female names are significantly more likely to contain smaller phonemes (e.g. “Emily”). The desire of parents to have comparatively larger, more masculine sons, and smaller, more feminine daughters, and the increased social success that accompanies more sex-stereotyped names, is likely to be driving English-language first names to exploit sound symbolism of size in line with sexual body size dimorphism. PMID:23755148
ERIC Educational Resources Information Center
Samuels, Christina A.
2008-01-01
A few decades ago, Braille was on the wane. Technology was seen as likely to replace the tactile communication method, as text-to-speech readers and recorded books, for example, offered access to classroom materials. Students at special schools for the blind moved into regular classrooms, which are rich in text, but not text that is accessible to…
The Bounds on Flexibility in Speech Perception
ERIC Educational Resources Information Center
Sjerps, Matthias J.; McQueen, James M.
2010-01-01
Dutch listeners were exposed to the English theta sound (as in "bath"), which replaced [f] in /f/-final Dutch words or, for another group, [s] in /s/-final words. A subsequent identity-priming task showed that participants had learned to interpret theta as, respectively, /f/ or /s/. Priming effects were equally strong when the exposure…
A glimpsing account of the role of temporal fine structure information in speech recognition.
Apoux, Frédéric; Healy, Eric W
2013-01-01
Many behavioral studies have reported a significant decrease in intelligibility when the temporal fine structure (TFS) of a sound mixture is replaced with noise or tones (i.e., vocoder processing). This finding has led to the conclusion that TFS information is critical for speech recognition in noise. How the normal auditory system takes advantage of the original TFS, however, remains unclear. Three experiments on the role of TFS in noise are described. All three experiments measured speech recognition in various backgrounds while manipulating the envelope, TFS, or both. One experiment tested the hypothesis that vocoder processing may artificially increase the apparent importance of TFS cues. Another experiment evaluated the relative contribution of the target and masker TFS by disturbing only the TFS of the target or that of the masker. Finally, a last experiment evaluated the relative contribution of envelope and TFS information. In contrast to previous studies, however, the original envelope and TFS were both preserved - to some extent - in all conditions. Overall, the experiments indicate a limited influence of TFS and suggest that little speech information is extracted from the TFS. Concomitantly, these experiments confirm that most speech information is carried by the temporal envelope in real-world conditions. When interpreted within the framework of the glimpsing model, the results of these experiments suggest that TFS is primarily used as a grouping cue to select the time-frequency regions corresponding to the target speech signal.
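The glimpsing model invoked here can be illustrated by counting spectro-temporal regions whose local target-to-masker ratio exceeds a criterion. The sketch below uses stand-in signals, an assumed window length, and an assumed 3 dB criterion purely for illustration.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
target = np.sin(2 * np.pi * 200 * t) * (1 + np.sin(2 * np.pi * 3 * t))  # stand-in target speech
masker = 0.8 * np.random.randn(t.size)                                   # stand-in masker

def glimpse_proportion(target, masker, fs, criterion_db=3.0):
    """Fraction of time-frequency cells where the local target-to-masker ratio exceeds the criterion."""
    _, _, P_t = spectrogram(target, fs=fs, nperseg=320, noverlap=160)
    _, _, P_m = spectrogram(masker, fs=fs, nperseg=320, noverlap=160)
    local_snr_db = 10 * np.log10((P_t + 1e-12) / (P_m + 1e-12))
    return np.mean(local_snr_db > criterion_db)

print(f"Proportion of glimpsed time-frequency regions: {glimpse_proportion(target, masker, fs):.2f}")
```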
Hansen, J H; Nandkumar, S
1995-01-01
The formulation of reliable signal processing algorithms for speech coding and synthesis requires the selection of a prior criterion of performance. Though coding efficiency (bits/second) or computational requirements can be used, a final performance measure must always include speech quality. In this paper, three objective speech quality measures are considered with respect to quality assessment for American English, noisy American English, and noise-free versions of seven languages. The purpose is to determine whether objective quality measures can be used to quantify changes in quality for a given voice coding method, with a known subjective performance level, as background noise or language conditions are changed. The speech coding algorithm chosen is regular-pulse excitation with long-term prediction (RPE-LTP), which has been chosen as the standard voice compression algorithm for the European Digital Mobile Radio system. Three areas are considered for objective quality assessment, which include: (i) vocoder performance for American English in a noise-free environment, (ii) speech quality variation for three additive background noise sources, and (iii) noise-free performance for seven languages which include English, Japanese, Finnish, German, Hindi, Spanish, and French. It is suggested that although existing objective quality measures will never replace subjective testing, they can be a useful means of assessing changes in performance, identifying areas for improvement in algorithm design, and augmenting subjective quality tests for voice coding/compression algorithms in noise-free, noisy, and/or non-English applications.
A First Look at Sandra Day O'Connor and the First Amendment.
ERIC Educational Resources Information Center
Schwartz, Thomas A.
First Amendment students were unhappy to see Supreme Court Justice Potter Stewart retire because his voting record demonstrated a favorable attitude toward freedom of speech and press. His replacement, Sandra Day O'Connor, was predicted to be a conservative or moderate who probably would vote consistently with Stewart in other areas, but her…
Exploring super-Gaussianity toward robust information-theoretical time delay estimation.
Petsatodis, Theodoros; Talantzis, Fotios; Boukis, Christos; Tan, Zheng-Hua; Prasad, Ramjee
2013-03-01
Time delay estimation (TDE) is a fundamental component of speaker localization and tracking algorithms. Most of the existing systems are based on the generalized cross-correlation method assuming gaussianity of the source. It has been shown that the distribution of speech, captured with far-field microphones, is highly varying, depending on the noise and reverberation conditions. Thus the performance of TDE is expected to fluctuate depending on the underlying assumption for the speech distribution, being also subject to multi-path reflections and competitive background noise. This paper investigates the effect upon TDE when modeling the source signal with different speech-based distributions. An information theoretical TDE method indirectly encapsulating higher order statistics (HOS) formed the basis of this work. The underlying assumption of Gaussian distributed source has been replaced by that of generalized Gaussian distribution that allows evaluating the problem under a larger set of speech-shaped distributions, ranging from Gaussian to Laplacian and Gamma. Closed forms of the univariate and multivariate entropy expressions of the generalized Gaussian distribution are derived to evaluate the TDE. The results indicate that TDE based on the specific criterion is independent of the underlying assumption for the distribution of the source, for the same covariance matrix.
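The closed-form univariate entropy referred to in this abstract has a compact expression. The sketch below uses the usual scale/shape parameterisation of the generalized Gaussian (which may differ from the paper's notation) and checks it against the Gaussian and Laplacian special cases; the multivariate expressions derived in the paper are not reproduced.

```python
# Generalized Gaussian pdf assumed here:
#   f(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)**beta)
# Differential entropy (nats): h = 1/beta + ln(2 * alpha * Gamma(1/beta) / beta).
import numpy as np
from scipy.special import gammaln

def ggd_entropy(alpha, beta):
    return 1.0 / beta + np.log(2.0 * alpha) + gammaln(1.0 / beta) - np.log(beta)

# Sanity checks: beta = 2 recovers the Gaussian, beta = 1 the Laplacian.
sigma = 1.5
print(ggd_entropy(np.sqrt(2) * sigma, 2.0), 0.5 * np.log(2 * np.pi * np.e * sigma**2))
b = 0.7
print(ggd_entropy(b, 1.0), 1.0 + np.log(2 * b))
```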
Monogenic and chromosomal causes of isolated speech and language impairment.
Barnett, C P; van Bon, B W M
2015-11-01
The importance of a precise molecular diagnosis for children with intellectual disability, autism spectrum disorder and epilepsy has become widely accepted and genetic testing is an integral part of the diagnostic evaluation of these children. In contrast, children with an isolated speech or language disorder are not often genetically evaluated, despite recent evidence supporting a role for genetic factors in the aetiology of these disorders. Several chromosomal copy number variants and single gene disorders associated with abnormalities of speech and language have been identified. Individuals without a precise genetic diagnosis will not receive optimal management including interventions such as early testosterone replacement in Klinefelter syndrome, otorhinolaryngological and audiometric evaluation in 22q11.2 deletion syndrome, cardiovascular surveillance in 7q11.23 duplications and early dietary management to prevent obesity in proximal 16p11.2 deletions. This review summarises the clinical features, aetiology and management options of known chromosomal and single gene disorders that are associated with speech and language pathology in the setting of normal or only mildly impaired cognitive function. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Mooij, Anne H; Huiskamp, Geertjan J M; Gosselaar, Peter H; Ferrier, Cyrille H
2016-02-01
Electrocorticographic (ECoG) mapping of high gamma activity induced by language tasks has been proposed as a more patient friendly alternative for electrocortical stimulation mapping (ESM), the gold standard in pre-surgical language mapping of epilepsy patients. However, ECoG mapping often reveals more language areas than considered critical with ESM. We investigated if critical language areas can be identified with a listening task consisting of speech and music phrases. Nine patients with implanted subdural grid electrodes listened to an audio fragment in which music and speech alternated. We analysed ECoG power in the 65-95 Hz band and obtained task-related activity patterns in electrodes over language areas. We compared the spatial distribution of sites that discriminated between listening to speech and music to ESM results using sensitivity and specificity calculations. Our listening task of alternating speech and music phrases had a low sensitivity (0.32) but a high specificity (0.95). The high specificity indicates that this test does indeed point to areas that are critical to language processing. Our test cannot replace ESM, but this short and simple task can give a reliable indication where to find critical language areas, better than ECoG mapping using language tasks alone. Copyright © 2015 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Nouns slow down speech across structurally and culturally diverse languages
Danielsen, Swintha; Hartmann, Iren; Pakendorf, Brigitte; Witzlack-Makarevich, Alena; de Jong, Nivja H.
2018-01-01
By force of nature, every bit of spoken language is produced at a particular speed. However, this speed is not constant—speakers regularly speed up and slow down. Variation in speech rate is influenced by a complex combination of factors, including the frequency and predictability of words, their information status, and their position within an utterance. Here, we use speech rate as an index of word-planning effort and focus on the time window during which speakers prepare the production of words from the two major lexical classes, nouns and verbs. We show that, when naturalistic speech is sampled from languages all over the world, there is a robust cross-linguistic tendency for slower speech before nouns compared with verbs, both in terms of slower articulation and more pauses. We attribute this slowdown effect to the increased amount of planning that nouns require compared with verbs. Unlike verbs, nouns can typically only be used when they represent new or unexpected information; otherwise, they have to be replaced by pronouns or be omitted. These conditions on noun use appear to outweigh potential advantages stemming from differences in internal complexity between nouns and verbs. Our findings suggest that, beneath the staggering diversity of grammatical structures and cultural settings, there are robust universals of language processing that are intimately tied to how speakers manage referential information when they communicate with one another. PMID:29760059
A simple and effective treatment for stuttering: speech practice without audience.
Yamada, Jun; Homma, Takanobu
2007-01-01
On the assumption that stuttering is essentially acquired behavior, it has been concluded that speech-related anticipatory anxiety as a major cause of stuttering accounts for virtually all apparently-different aspects of stuttering on the behavioral level. Stutterers' linguistic competence is unimpaired, although their speech production is characterized as "disfluent". Yet, such disfluency is dramatically reduced when such people speak in anxiety-free no-audience conditions. Furthermore, our pilot study of oral reading in Japanese indicates that a stutterer can easily replace stuttering events with a common interjection, "eh", and make oral reading sound natural and fluent. Given these facts, we propose the Overlearning Fluency when Alone (OFA) treatment, consisting of two distinct but overlapping steps: (1) Overlearning of fluency in a no-audience condition, and (2) Use of an interjection, "eh", as a starter when a stuttering event is anticipated. It remains to be demonstrated that this is a truly simple and effective treatment for "one of mankind's most baffling afflictions".
Using speech for mode selection in control of multifunctional myoelectric prostheses.
Fang, Peng; Wei, Zheng; Geng, Yanjuan; Yao, Fuan; Li, Guanglin
2013-01-01
Electromyogram (EMG) recorded from residual muscles of limbs is considered as suitable control information for motorized prostheses. However, in case of high-level amputations, the residual muscles are usually limited, which may not provide enough EMG for flexible control of myoelectric prostheses with multiple degrees of freedom of movements. Here, we proposed a control strategy, where the speech signals were used as additional information and combined with the EMG signals to realize more flexible control of multifunctional prostheses. By replacing the traditional "sequential mode-switching (joint-switching)", the speech signals were used to select a mode (joint) of the prosthetic arm, and then the EMG signals were applied to determine a motion class involved in the selected joint and to execute the motion. Preliminary results from three able-bodied subjects and one transhumeral amputee demonstrated the proposed strategy could achieve a high mode-selection rate and enhance the operation efficiency, suggesting the strategy may improve the control performance of commercial myoelectric prostheses.
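The hybrid control strategy described above (a speech command selects the joint, an EMG motion class then drives that joint) can be sketched as a small state machine. The command names, motion classes, and classifier outputs below are hypothetical placeholders, not taken from the study.

```python
from dataclasses import dataclass

SPEECH_TO_JOINT = {"hand": "hand", "wrist": "wrist", "elbow": "elbow"}
JOINT_MOTIONS = {
    "hand": {"open", "close"},
    "wrist": {"pronate", "supinate"},
    "elbow": {"flex", "extend"},
}

@dataclass
class ProsthesisController:
    active_joint: str = "hand"

    def on_speech_command(self, word):
        # Speech-based mode (joint) selection replaces sequential mode switching.
        if word in SPEECH_TO_JOINT:
            self.active_joint = SPEECH_TO_JOINT[word]

    def on_emg_class(self, motion):
        # EMG pattern classification determines the motion within the selected joint.
        if motion in JOINT_MOTIONS[self.active_joint]:
            return f"{self.active_joint}:{motion}"
        return None  # ignore motions that do not belong to the selected joint

ctrl = ProsthesisController()
ctrl.on_speech_command("wrist")
print(ctrl.on_emg_class("pronate"))   # -> "wrist:pronate"
print(ctrl.on_emg_class("open"))      # -> None (belongs to the hand joint)
```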
von Merten, Sophie; Hoier, Svenja
2014-01-01
It has long been known that rodents emit signals in the ultrasonic range, but their role in social communication and mating is still under active exploration. While inbred strains of house mice have emerged as a favourite model to study ultrasonic vocalisation (USV) patterns, studies in wild animals and natural situations are still rare. We focus here on two wild derived mouse populations. We recorded them in dyadic encounters for extended periods of time to assess possible roles of USVs and their divergence between allopatric populations. We have analysed song frequency and duration, as well as spectral features of songs and syllables. We show that the populations have indeed diverged in several of these aspects and that USV patterns emitted in a mating context differ from those emitted in same sex encounters. We find that females vocalize not less, in encounters with another female even more than males. This implies that the current focus of USVs being emitted mainly by males within the mating context needs to be reconsidered. Using a statistical syntax analysis we find complex temporal sequencing patterns that could suggest that the syntax conveys meaningful information to the receivers. We conclude that wild mice use USV for complex social interactions and that USV patterns can diverge fast between populations. PMID:24816836
ERIC Educational Resources Information Center
Fogerty, Daniel; Ahlstrom, Jayne B.; Bologna, William J.; Dubno, Judy R.
2016-01-01
Purpose: This study investigated how listeners process acoustic cues preserved during sentences interrupted by nonsimultaneous noise that was amplitude modulated by a competing talker. Method: Younger adults with normal hearing and older adults with normal or impaired hearing listened to sentences with consonants or vowels replaced with noise…
Teki, Sundeep; Barnes, Gareth R; Penny, William D; Iverson, Paul; Woodhead, Zoe V J; Griffiths, Timothy D; Leff, Alexander P
2013-06-01
In this study, we used magnetoencephalography and a mismatch paradigm to investigate speech processing in stroke patients with auditory comprehension deficits and age-matched control subjects. We probed connectivity within and between the two temporal lobes in response to phonemic (different word) and acoustic (same word) oddballs using dynamic causal modelling. We found stronger modulation of self-connections as a function of phonemic differences for control subjects versus aphasics in left primary auditory cortex and bilateral superior temporal gyrus. The patients showed stronger modulation of connections from right primary auditory cortex to right superior temporal gyrus (feed-forward) and from left primary auditory cortex to right primary auditory cortex (interhemispheric). This differential connectivity can be explained on the basis of a predictive coding theory which suggests increased prediction error and decreased sensitivity to phonemic boundaries in the aphasics' speech network in both hemispheres. Within the aphasics, we also found behavioural correlates with connection strengths: a negative correlation between phonemic perception and an inter-hemispheric connection (left superior temporal gyrus to right superior temporal gyrus), and positive correlation between semantic performance and a feedback connection (right superior temporal gyrus to right primary auditory cortex). Our results suggest that aphasics with impaired speech comprehension have less veridical speech representations in both temporal lobes, and rely more on the right hemisphere auditory regions, particularly right superior temporal gyrus, for processing speech. Despite this presumed compensatory shift in network connectivity, the patients remain significantly impaired.
Viscous Flow Structures Downstream of a Model Tracheoesophageal Prosthesis
NASA Astrophysics Data System (ADS)
Hemsing, Frank; Erath, Byron
2013-11-01
In tracheoesophageal speech (TES), the glottis is replaced by the tissue of the pharyngeoesophageal segment (PES) as the vibrating element of speech production. During TES air is forced from the lungs into the esophagus via a prosthetic tube that connects the trachea with the esophagus. Air moving up the esophagus incites self-sustained oscillations of the surgically created PES, generating sound analogous to voiced speech. Despite the ubiquity with which TES is employed as a method for restoring speech to laryngectomees, the effect of viscous flow structures on voice production in TES is not well understood. Of particular interest is the flow exiting the prosthetic connection between the trachea and esophagus, because of its influence on the total pressure loss (i.e. effort required to produce speech), and the fluid-structure energy exchange that drives the PES. Understanding this flow behavior can inform prosthesis design to enhance beneficial flow structures and mitigate the need for adjustment of prosthesis placement. This study employs a physical model of the tracheoesophageal geometry to investigate the flow structures that arise in TES. The geometry of this region is modeled at three times physiological scale using water as the working fluid to obtain nondimensional numbers matching flow in TES. Modulation of the flow is achieved with a computer controlled gate valve at a scaled frequency of 0.22 Hz to mimic the oscillations of the PES. Particle image velocimetry is used to resolve flow characteristics at the tracheoesophageal prosthesis. Data are acquired for three cases of prosthesis insertion angle.
Barnes, Gareth R.; Penny, William D.; Iverson, Paul; Woodhead, Zoe V. J.; Griffiths, Timothy D.; Leff, Alexander P.
2013-01-01
In this study, we used magnetoencephalography and a mismatch paradigm to investigate speech processing in stroke patients with auditory comprehension deficits and age-matched control subjects. We probed connectivity within and between the two temporal lobes in response to phonemic (different word) and acoustic (same word) oddballs using dynamic causal modelling. We found stronger modulation of self-connections as a function of phonemic differences for control subjects versus aphasics in left primary auditory cortex and bilateral superior temporal gyrus. The patients showed stronger modulation of connections from right primary auditory cortex to right superior temporal gyrus (feed-forward) and from left primary auditory cortex to right primary auditory cortex (interhemispheric). This differential connectivity can be explained on the basis of a predictive coding theory which suggests increased prediction error and decreased sensitivity to phonemic boundaries in the aphasics’ speech network in both hemispheres. Within the aphasics, we also found behavioural correlates with connection strengths: a negative correlation between phonemic perception and an inter-hemispheric connection (left superior temporal gyrus to right superior temporal gyrus), and positive correlation between semantic performance and a feedback connection (right superior temporal gyrus to right primary auditory cortex). Our results suggest that aphasics with impaired speech comprehension have less veridical speech representations in both temporal lobes, and rely more on the right hemisphere auditory regions, particularly right superior temporal gyrus, for processing speech. Despite this presumed compensatory shift in network connectivity, the patients remain significantly impaired. PMID:23715097
ERIC Educational Resources Information Center
Jones, Elaine
2008-01-01
Over the past few decades, school teachers have been embracing a number of electronic technologies for use in the classroom. Computers are now prevalent; overhead projectors are being replaced with dynamic teaching tools such as data projection, electronic whiteboards, and video media. One key technology is just beginning to catch up to the…
Role of Binaural Temporal Fine Structure and Envelope Cues in Cocktail-Party Listening.
Swaminathan, Jayaganesh; Mason, Christine R; Streeter, Timothy M; Best, Virginia; Roverud, Elin; Kidd, Gerald
2016-08-03
While conversing in a crowded social setting, a listener is often required to follow a target speech signal amid multiple competing speech signals (the so-called "cocktail party" problem). In such situations, separation of the target speech signal in azimuth from the interfering masker signals can lead to an improvement in target intelligibility, an effect known as spatial release from masking (SRM). This study assessed the contributions of two stimulus properties that vary with separation of sound sources, binaural envelope (ENV) and temporal fine structure (TFS), to SRM in normal-hearing (NH) human listeners. Target speech was presented from the front and speech maskers were either colocated with or symmetrically separated from the target in azimuth. The target and maskers were presented either as natural speech or as "noise-vocoded" speech in which the intelligibility was conveyed only by the speech ENVs from several frequency bands; the speech TFS within each band was replaced with noise carriers. The experiments were designed to preserve the spatial cues in the speech ENVs while retaining/eliminating them from the TFS. This was achieved by using the same/different noise carriers in the two ears. A phenomenological auditory-nerve model was used to verify that the interaural correlations in TFS differed across conditions, whereas the ENVs retained a high degree of correlation, as intended. Overall, the results from this study revealed that binaural TFS cues, especially for frequency regions below 1500 Hz, are critical for achieving SRM in NH listeners. Potential implications for studying SRM in hearing-impaired listeners are discussed. Acoustic signals received by the auditory system pass first through an array of physiologically based band-pass filters. Conceptually, at the output of each filter, there are two principal forms of temporal information: slowly varying fluctuations in the envelope (ENV) and rapidly varying fluctuations in the temporal fine structure (TFS). The importance of these two types of information in everyday listening (e.g., conversing in a noisy social situation; the "cocktail-party" problem) has not been established. This study assessed the contributions of binaural ENV and TFS cues for understanding speech in multiple-talker situations. Results suggest that, whereas the ENV cues are important for speech intelligibility, binaural TFS cues are critical for perceptually segregating the different talkers and thus for solving the cocktail party problem. Copyright © 2016 the authors 0270-6474/16/368250-08$15.00/0.
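The carrier manipulation described above (the same versus independent noise carriers across ears, with a shared envelope) can be illustrated as follows. Filter settings and signals are assumptions for illustration only; the snippet simply compares interaural correlations of the TFS-bearing waveform and of the low-pass envelope under the two carrier conditions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
env = 1 + 0.8 * np.sin(2 * np.pi * 4 * t)                     # shared slow "speech" envelope
band = butter(4, [500, 700], btype="bandpass", fs=fs, output="sos")
lowpass = butter(4, 50, btype="lowpass", fs=fs, output="sos")

carrier_a = sosfiltfilt(band, np.random.randn(t.size))        # noise carrier, ear 1
carrier_b = sosfiltfilt(band, np.random.randn(t.size))        # independent carrier, ear 2

def interaural_corr(left, right):
    return np.corrcoef(left, right)[0, 1]

def slow_env(x):
    return sosfiltfilt(lowpass, np.abs(hilbert(x)))            # low-pass Hilbert envelope

same = (env * carrier_a, env * carrier_a)                      # same carrier in both ears
diff = (env * carrier_a, env * carrier_b)                      # independent carriers

print("waveform correlation, same vs independent carriers:",
      round(interaural_corr(*same), 2), round(interaural_corr(*diff), 2))
print("slow-envelope correlation, same vs independent carriers:",
      round(interaural_corr(slow_env(same[0]), slow_env(same[1])), 2),
      round(interaural_corr(slow_env(diff[0]), slow_env(diff[1])), 2))
```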
Comparison of Fluoroplastic Causse Loop Piston and Titanium Soft-Clip in Stapedotomy
Faramarzi, Mohammad; Gilanifar, Nafiseh; Roosta, Sareh
2017-01-01
Introduction: Different types of prosthesis are available for stapes replacement. Because there has been no published report on the efficacy of the titanium soft-clip vs the fluoroplastic Causse loop Teflon piston, we compared short-term hearing results of both types of prosthesis in patients who underwent stapedotomy due to otosclerosis. Materials and Methods: A total of 57 ears were included in the soft-clip group and 63 ears were included in the Teflon-piston group. Pre-operative and post-operative air conduction, bone conduction, air-bone gaps, speech discrimination score, and speech reception thresholds were analyzed. Results: Post-operative speech reception threshold gains did not differ significantly between the two groups (P=0.919). However, better post-operative air-bone gap improvement at low frequencies was observed in the Teflon-piston group over the short-term follow-up (at frequencies of 0.25 and 0.50 kHz; P=0.007 and P=0.001, respectively). Conclusion: Similar post-operative hearing results were observed in the two groups in the short-term. PMID:28229059
de Cates, Angharad N; Morlet, Julien; Antoun Reyad, Ayman; Tadros, George
2017-10-25
This is a case report of a man in his 60s who presented to an English hospital following a significant lithium overdose. He was monitored for 24 hours, and then renal replacement therapy was initiated after assessment by the renal team. As soon as the lithium level returned to normal therapeutic levels (from 4.7 mEq/L to 0.67 mEq/L), lithium was restarted by the medical team. At this point, the patient developed new slurred speech and later catatonia. In this case report, we discuss the factors that could determine which patients are at risk of neurotoxicity following lithium overdose and the appropriate decision regarding when and how to consider initiation of renal replacement therapy and restarting of lithium. © BMJ Publishing Group Ltd (unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Howard, Monique L; Palmer, Stephen J; Taylor, Kylie M; Arthurson, Geoffrey J; Spitzer, Matthew W; Du, Xin; Pang, Terence Y C; Renoir, Thibault; Hardeman, Edna C; Hannan, Anthony J
2012-03-01
Insufficiency of the transcriptional regulator GTF2IRD1 has become a strong potential explanation for some of the major characteristic features of the neurodevelopmental disorder Williams-Beuren syndrome (WBS). Genotype/phenotype correlations in humans indicate that the hemizygous loss of the GTF2IRD1 gene and an adjacent paralogue, GTF2I, play crucial roles in the neurocognitive and craniofacial aspects of the disease. In order to explore this genetic relationship in greater detail, we have generated a targeted Gtf2ird1 mutation in mice that blocks normal GTF2IRD1 protein production. Detailed analyses of homozygous null Gtf2ird1 mice have revealed a series of phenotypes that share some intriguing parallels with WBS. These include reduced body weight, a facial deformity resulting from localised epidermal hyperplasia, a motor coordination deficit, alterations in exploratory activity and, in response to specific stress-inducing stimuli, a novel audible vocalisation and increased serum corticosterone. Analysis of Gtf2ird1 expression patterns in the brain using a knock-in LacZ reporter and c-fos activity mapping illustrates the regions where these neurological abnormalities may originate. These data provide new mechanistic insight into the clinical genetic findings in WBS patients and indicate that insufficiency of GTF2IRD1 protein contributes to abnormalities of facial development, motor function and specific behavioural disorders that accompany this disease. Copyright © 2011 Elsevier Inc. All rights reserved.
Leliveld, Lisette M C; Düpjan, Sandra; Tuchscherer, Armin; Puppe, Birger
2016-04-01
In the study of animal emotions, emotional valence has been found to be difficult to measure. Many studies of farm animals' emotions have therefore focussed on the identification of indicators of strong, mainly negative, emotions. However, subtle variations in emotional valence, such as those caused by rather moderate differences in husbandry conditions, may also affect animals' mood and welfare when such variations occur consistently. In this study, we investigated whether repeated moderate aversive or rewarding events could lead to measurable differences in emotional valence in young, weaned pigs. We conditioned 105 female pigs in a test arena to either a repeated startling procedure (sudden noises or appearances of objects) or a repeated rewarding procedure (applesauce, toy and straw) over 11 sessions. Control pigs were also regularly exposed to the same test arena but without conditioning. Before and after conditioning, we measured heart rate and its variability as well as the behavioural reactions of the subjects in the test arena, with a special focus on detailed acoustic analyses of their vocalisations. The behavioural and heart rate measures were analysed as changes compared to the baseline values before conditioning. A limited number of the putative indicators of emotional valence were affected by the conditioning. We found that the negatively conditioned pigs showed changes that were significantly different from those in control pigs, namely a decrease in locomotion and an increase in standing. The positively conditioned pigs, however, showed a stronger increase in heart rate and a smaller decrease in SDNN (a heart rate variability parameter indicating changes in autonomic regulation) compared to the controls. Compared to the negatively conditioned pigs, the positively conditioned pigs produced fewer vocalisations overall as well as fewer low-frequency grunts but more high-frequency grunts. The low-frequency grunts of the negatively conditioned pigs also showed lower frequency parameters (bandwidth, maximum frequency, 25% and 50% quartiles) compared to those of the positively conditioned pigs. In any of the statistically significant results, the conditioning accounted for 1.5-11.9% of variability in the outcome variable. Hence, we conclude that repeated moderate aversive and rewarding events have weak but measurable effects on some aspects of behaviour and physiology in young pigs, possibly indicating changes in emotional valence, which could ultimately affect their welfare. The combination of ethophysiological indicators, i.e., the concurrent examination of heart rate measures, behavioural responses and especially vocalisation patterns, as used in the current study, might be a useful way of examining subtle effects on emotional valence in further studies. Copyright © 2016 Elsevier Inc. All rights reserved.
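SDNN, the heart-rate-variability measure mentioned above, is simply the standard deviation of the intervals between successive normal heartbeats, so a smaller post-conditioning decrease in SDNN implies relatively preserved autonomic variability. A minimal sketch follows; the beat times are synthetic and the function name is ours, not taken from the study.

```python
# Hedged sketch: SDNN = standard deviation of normal-to-normal inter-beat intervals.
# The beat times below are synthetic illustrations, not data from the pig study.
import numpy as np

def sdnn(beat_times_s):
    """Standard deviation of inter-beat (NN) intervals, returned in milliseconds."""
    nn_intervals_ms = np.diff(np.asarray(beat_times_s)) * 1000.0
    return np.std(nn_intervals_ms, ddof=1)

# Example: beats roughly every 0.5 s (about 120 bpm) with small variability.
rng = np.random.default_rng(2)
beats = np.cumsum(0.5 + rng.normal(0, 0.02, 240))
print(f"SDNN = {sdnn(beats):.1f} ms")
```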
Bolt, Sarah L; Boyland, Natasha K; Mlynski, David T; James, Richard; Croft, Darren P
2017-01-01
The early social environment can influence the health and behaviour of animals, with effects lasting into adulthood. In Europe, around 60% of dairy calves are reared individually during their first eight weeks of life, while others may be housed in pairs or small groups. This study assessed the effects of varying degrees of social contact on weaning stress, health and production during pen rearing, and on the social networks that calves later formed when grouped. Forty female Holstein-Friesian calves were allocated to one of three treatments: individually housed (I, n = 8), pair-housed from day five (P5, n = 8 pairs), and pair-housed from day 28 (P28, n = 8 pairs). From day 48, calves were weaned by gradual reduction of milk over three days, and vocalisations were recorded as a measure of stress for three days before, during and after weaning. Health and production (growth rate and concentrate intakes) were not affected by treatment during the weaning period or over the whole study. Vocalisations were highest post-weaning, and were significantly higher in I calves than pair-reared calves. Furthermore, P28 calves vocalised significantly more than P5 calves. The social network of calves was measured for one month after all calves were grouped in a barn, using association data from spatial proximity loggers. We tested for week-week stability, social differentiation and assortment in the calf network. Additionally, we tested for treatment differences in: coefficient of variation (CV) in association strength, percentage of time spent with ex-penmate (P5 and P28 calves only) and weighted degree centrality (the sum of the strength of an individual's associations). The network was relatively stable from weeks one to four and was significantly differentiated, with individuals assorting based on prior familiarity. P5 calves had significantly higher CV in association strength than I calves in week one (indicating more heterogeneous social associations) but there were no significant treatment differences in week four. The mean percentage of time that individuals spent with their ex-penmate after regrouping decreased from weeks 1-4, though treatment did not affect this. There were also no significant differences in weighted degree centrality between calves in each rearing treatment. These results suggest that early pair-rearing can allow calves the stress buffering benefits of social support (and that this is more effective when calves are paired earlier) without compromising health or production, and sheds light on the early development of social behaviour in cattle.
Mlynski, David T.; James, Richard; Croft, Darren P.
2017-01-01
The early social environment can influence the health and behaviour of animals, with effects lasting into adulthood. In Europe, around 60% of dairy calves are reared individually during their first eight weeks of life, while others may be housed in pairs or small groups. This study assessed the effects of varying degrees of social contact on weaning stress, health and production during pen rearing, and on the social networks that calves later formed when grouped. Forty female Holstein-Friesian calves were allocated to one of three treatments: individually housed (I, n = 8), pair-housed from day five (P5, n = 8 pairs), and pair-housed from day 28 (P28, n = 8 pairs). From day 48, calves were weaned by gradual reduction of milk over three days, and vocalisations were recorded as a measure of stress for three days before, during and after weaning. Health and production (growth rate and concentrate intakes) were not affected by treatment during the weaning period or over the whole study. Vocalisations were highest post-weaning, and were significantly higher in I calves than pair-reared calves. Furthermore, P28 calves vocalised significantly more than P5 calves. The social network of calves was measured for one month after all calves were grouped in a barn, using association data from spatial proximity loggers. We tested for week-week stability, social differentiation and assortment in the calf network. Additionally, we tested for treatment differences in: coefficient of variation (CV) in association strength, percentage of time spent with ex-penmate (P5 and P28 calves only) and weighted degree centrality (the sum of the strength of an individual’s associations). The network was relatively stable from weeks one to four and was significantly differentiated, with individuals assorting based on prior familiarity. P5 calves had significantly higher CV in association strength than I calves in week one (indicating more heterogeneous social associations) but there were no significant treatment differences in week four. The mean percentage of time that individuals spent with their ex-penmate after regrouping decreased from weeks 1–4, though treatment did not affect this. There were also no significant differences in weighted degree centrality between calves in each rearing treatment. These results suggest that early pair-rearing can allow calves the stress buffering benefits of social support (and that this is more effective when calves are paired earlier) without compromising health or production, and sheds light on the early development of social behaviour in cattle. PMID:28052122
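Two of the network measures used in the study above, weighted degree centrality and the coefficient of variation (CV) in association strength, can be computed directly from a symmetric association matrix. The sketch below uses a hypothetical matrix of proximity-based association indices; it illustrates the measures and is not a re-analysis of the calf data.

```python
# Hedged sketch of two network measures from a symmetric association matrix
# (proportion of time each pair of calves spent in proximity; values are hypothetical).
import numpy as np

rng = np.random.default_rng(3)
n = 8
assoc = rng.uniform(0, 0.3, (n, n))
assoc = (assoc + assoc.T) / 2          # associations are symmetric
np.fill_diagonal(assoc, 0.0)

# Weighted degree centrality: the sum of an individual's association strengths.
weighted_degree = assoc.sum(axis=1)

# CV in association strength: heterogeneity of an individual's associations
# (higher CV = more differentiated, less even social ties).
off_diag = assoc[~np.eye(n, dtype=bool)].reshape(n, n - 1)
cv = off_diag.std(axis=1) / off_diag.mean(axis=1)

for i in range(n):
    print(f"calf {i}: weighted degree = {weighted_degree[i]:.2f}, CV = {cv[i]:.2f}")
```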
Understanding environmental sounds in sentence context.
Uddin, Sophia; Heald, Shannon L M; Van Hedger, Stephen C; Klos, Serena; Nusbaum, Howard C
2018-03-01
There is debate about how individuals use context to successfully predict and recognize words. One view argues that context supports neural predictions that make use of the speech motor system, whereas other views argue for a sensory or conceptual level of prediction. While environmental sounds can convey clear referential meaning, they are not linguistic signals, and are thus neither produced with the vocal tract nor typically encountered in sentence context. We compared the effect of spoken sentence context on recognition and comprehension of spoken words versus nonspeech, environmental sounds. In Experiment 1, sentence context decreased the amount of signal needed for recognition of spoken words and environmental sounds in similar fashion. In Experiment 2, listeners judged sentence meaning in both high and low contextually constraining sentence frames, when the final word was present or replaced with a matching environmental sound. Results showed that sentence constraint affected decision time similarly for speech and nonspeech, such that high constraint sentences (i.e., frame plus completion) were processed faster than low constraint sentences for speech and nonspeech. Linguistic context facilitates the recognition and understanding of nonspeech sounds in much the same way as for spoken words. This argues against a simple form of a speech-motor explanation of predictive coding in spoken language understanding, and suggests support for conceptual-level predictions. Copyright © 2017 Elsevier B.V. All rights reserved.
2000-01-01
for flight test data, and both generic and specialized tools of data filtering, data calibration, modeling, system identification, and simulation… Contents include: A Grammatical Model and Parser for Air Traffic Controller's Commands; A Speech-Controlled Interactive Virtual Environment for Ship Familiarization; Modeling and Simulation in the 21st Century; New COTS Hardware and Software Reduce the Cost and Effort in Replacing Aging Flight Simulators Subsystems.
Group decision-making in chacma baboons: leadership, order and communication during movement
2011-01-01
Background Group coordination is one of the greatest challenges facing animals living in groups. Obligatory trade-offs faced by group members can potentially lead to phenomena at the group level such as the emergence of a leader, consistent structure in the organization of individuals when moving, and the use of visual or acoustic communication. This paper describes the study of collective decision-making at the time of departure (i.e. initiation) for movements of two groups of wild chacma baboons (Papio ursinus). One group was composed of 11 individuals, whilst the other consisted of about 100 individuals. Results Results for both groups showed that adult males initiated more movements even if the leadership was also distributed to adult females and young individuals. Baboons then joined a movement according to a specific order: adult males and adult females were at the front and the back of the group, sub-adults were at the back and juveniles were located in the central part of the progression. In the two groups, vocalisations, especially loud calls, were more frequently emitted just before the initiation of a group movement, but the frequency of these vocalisations did not influence the success of an initiation in any way. Conclusion The emergence of a leadership biased towards male group members might be related to their dominance rank and to the fact that they have the highest nutrient requirements in the group. Loud calls are probably not used as recruitment signals but more as a cue concerning the motivation to move, therefore enhancing coordination between group members. PMID:22014356
Group decision-making in chacma baboons: leadership, order and communication during movement.
Sueur, Cédric
2011-10-20
Group coordination is one of the greatest challenges facing animals living in groups. Obligatory trade-offs faced by group members can potentially lead to phenomena at the group level such as the emergence of a leader, consistent structure in the organization of individuals when moving, and the use of visual or acoustic communication. This paper describes the study of collective decision-making at the time of departure (i.e. initiation) for movements of two groups of wild chacma baboons (Papio ursinus). One group was composed of 11 individuals, whilst the other consisted of about 100 individuals. Results for both groups showed that adult males initiated more movements even if the leadership was also distributed to adult females and young individuals. Baboons then joined a movement according to a specific order: adult males and adult females were at the front and the back of the group, sub-adults were at the back and juveniles were located in the central part of the progression. In the two groups, vocalisations, especially loud calls, were more frequently emitted just before the initiation of a group movement, but the frequency of these vocalisations did not influence the success of an initiation in any way. The emergence of a leadership biased towards male group members might be related to their dominance rank and to the fact that they have the highest nutrient requirements in the group. Loud calls are probably not used as recruitment signals but more as a cue concerning the motivation to move, therefore enhancing coordination between group members.
Iacobucci, Paolo; Colonnello, Valentina; Fuchs, Thomas; D'Antuono, Laura; Panksepp, Jaak
2013-10-01
Preclinical models of human mood disorders commonly focus on the study of negative affectivity, without comparably stressing the role of positive affects and their ability to promote resilient coping styles. We evaluated the role of background constitutional affect of rats by studying the separation and reunion responses of infants from low and high positive affect genetic lines (i.e., differentially selected for High and Low 50 kHz ultrasonic vocalisations (USVs)). Infants from Low and High 50 kHz USV breeding lines were isolated from mothers and exposed to either social (familiar or unfamiliar bedding) or neutral (clean bedding) odour cues between two short isolation periods, and tested in homeothermic and hypothermic ambient temperatures. Negative affect was estimated by monitoring separation distress calls (35-45 kHz USVs). Low Line pups called at higher rates than High Line, and their rates were stable regardless of odour cue. In contrast, High Line pups increased vocalisations during the second compared with the first isolation period and during exposure to both familiar and unfamiliar odour cues, but not to neutral odour. Furthermore, the greatest increase in USV emission was seen in the second isolation period following exposure to the unfamiliar odour. However, both lines showed comparable elevated distress USVs to the thermal stressor. High Line animals, selected for a positive affective phenotype (50 kHz USVs), exhibited reduced separation anxiety responses in infancy, making this a promising animal model for the role of constitutional affective states in emotional responsivity and potential resilience against emotional disorders.
Barbaro, Josephine; Dissanayake, Cheryl
2017-10-01
Autism spectrum disorder diagnoses in toddlers have been established as accurate and stable across time in high-risk siblings and clinic-referred samples. Few studies have investigated diagnostic stability in children prospectively identified in community-based settings. Furthermore, there is a dearth of evidence on the individual behaviours that predict diagnostic change over time. The stability and change of autism spectrum disorder diagnoses were investigated from 24 to 48 months in 77 children drawn from the Social Attention and Communication Study. Diagnostic stability was high, with 88.3% overall stability and 85.5% autism spectrum disorder stability. The behavioural markers at 24 months that contributed to diagnostic shift off the autism spectrum by 48 months included better eye contact, more directed vocalisations, the integration of gaze and directed vocalisations/gestures and higher non-verbal developmental quotient. These four variables correctly predicted 88.7% of children into the autism spectrum disorder-stable and autism spectrum disorder-crossover groups overall, with excellent prediction for the stable group (96.2%) and modest prediction for the crossover group (44.4%). Furthermore, non-verbal developmental quotient at 24 months accounted for the significant improvement across time in 'Social Affect' scores on the Autism Diagnostic Observation Schedule for both groups and was the only unique predictor of diagnostic crossover. These findings contribute to the body of evidence on the feasibility of diagnoses at earlier ages to facilitate children's access to interventions to promote positive developmental outcomes.
Success of capture of toads improved by manipulating acoustic characteristics of lures.
Muller, Benjamin J; Schwarzkopf, Lin
2017-11-01
Management of invasive vertebrates is a crucial component of conservation. Trapping reproductive adults is often effective for control, and modification of traps may greatly increase their attractiveness to such individuals. Cane toads (Rhinella marina) are invasive, and males use advertisement vocalisations to attract reproductive females. In amphibians, including toads, specific structural parameters of calls (e.g. dominant frequency and pulse rate) may be attractive to females. Some cane toad traps use an artificial advertisement vocalisation to attract toads. We determined whether variation of the call's parameters (volume, dominant frequency and pulse rate) could increase the capture rate of gravid females. Overall, traps equipped with loud calls (80 dB at 1 m) caught significantly more toads, and proportionally more gravid females, than traps with quiet calls (60 dB at 1 m), and traps with low dominant frequency calls caught more gravid females than traps with median frequency calls. Traps with high pulse rate calls attracted more females than traps with low pulse rate calls. Approximately 91% of the females trapped using a low frequency and high pulse rate combination call were gravid, whereas in traps using a call with population median parameters only approximately 75% of captured females were gravid. Calls that indicated large-bodied males (low frequency) with high energy reserves (high pulse rate) are often attractive to female anurans and were effective lures for female toads in our study. The design of future trapping regimes should account for behavioural preferences of the target sex. © 2017 Society of Chemical Industry.
Deecke, Volker B; Barrett-Lennard, Lance G; Spong, Paul; Ford, John K B
2010-05-01
A few species of mammals produce group-specific vocalisations that are passed on by learning, but the function of learned vocal variation remains poorly understood. Resident killer whales live in stable matrilineal groups with repertoires of seven to 17 stereotyped call types. Some types are shared among matrilines, but their structure typically shows matriline-specific differences. Our objective was to analyse calls of nine killer whale matrilines in British Columbia to test whether call similarity primarily reflects social or genetic relationships. Recordings were made in 1985-1995 in the presence of focal matrilines that were either alone or with groups with non-overlapping repertoires. We used neural network discrimination performance to measure the similarity of call types produced by different matrilines and determined matriline association rates from 757 encounters with one or more focal matrilines. Relatedness was measured by comparing variation at 11 microsatellite loci for the oldest female in each group. Call similarity was positively correlated with association rates for two of the three call types analysed. Similarity of the N4 call type was also correlated with matriarch relatedness. No relationship between relatedness and association frequency was detected. These results show that call structure reflects relatedness and social affiliation, but not because related groups spend more time together. Instead, call structure appears to play a role in kin recognition and shapes the association behaviour of killer whale groups. Our results therefore support the hypothesis that increasing social complexity plays a role in the evolution of learned vocalisations in some mammalian species.
NASA Astrophysics Data System (ADS)
Deecke, Volker B.; Barrett-Lennard, Lance G.; Spong, Paul; Ford, John K. B.
2010-05-01
A few species of mammals produce group-specific vocalisations that are passed on by learning, but the function of learned vocal variation remains poorly understood. Resident killer whales live in stable matrilineal groups with repertoires of seven to 17 stereotyped call types. Some types are shared among matrilines, but their structure typically shows matriline-specific differences. Our objective was to analyse calls of nine killer whale matrilines in British Columbia to test whether call similarity primarily reflects social or genetic relationships. Recordings were made in 1985-1995 in the presence of focal matrilines that were either alone or with groups with non-overlapping repertoires. We used neural network discrimination performance to measure the similarity of call types produced by different matrilines and determined matriline association rates from 757 encounters with one or more focal matrilines. Relatedness was measured by comparing variation at 11 microsatellite loci for the oldest female in each group. Call similarity was positively correlated with association rates for two of the three call types analysed. Similarity of the N4 call type was also correlated with matriarch relatedness. No relationship between relatedness and association frequency was detected. These results show that call structure reflects relatedness and social affiliation, but not because related groups spend more time together. Instead, call structure appears to play a role in kin recognition and shapes the association behaviour of killer whale groups. Our results therefore support the hypothesis that increasing social complexity plays a role in the evolution of learned vocalisations in some mammalian species.
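The central test above relates pairwise call similarity between matrilines to pairwise association rates. A minimal sketch of that comparison is shown below, with randomly generated symmetric matrices standing in for the neural-network similarity scores and encounter-based association rates; in practice a permutation (Mantel-type) test would be preferred because matrix entries are not independent.

```python
# Hedged sketch: correlate pairwise call similarity with pairwise association rates.
# The matrices are hypothetical placeholders, not the killer whale data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
n = 9                                    # nine matrilines

def random_symmetric(n):
    m = rng.uniform(0, 1, (n, n))
    m = (m + m.T) / 2
    np.fill_diagonal(m, np.nan)          # self-pairs are excluded from the analysis
    return m

similarity = random_symmetric(n)         # call similarity between matriline pairs
association = random_symmetric(n)        # proportion of encounters spent together

# Take the upper triangle so every pair is counted exactly once.
iu = np.triu_indices(n, k=1)
rho, p = spearmanr(similarity[iu], association[iu])
print(f"similarity vs association: rho={rho:.2f}, p={p:.3f}")
# Note: the nominal p-value ignores the non-independence of pairwise entries;
# a Mantel-style permutation of matrix rows/columns addresses this.
```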
A Vocal-Based Analytical Method for Goose Behaviour Recognition
Steen, Kim Arild; Therkildsen, Ole Roland; Karstoft, Henrik; Green, Ole
2012-01-01
Since human-wildlife conflicts are increasing, the development of cost-effective methods for reducing damage or conflict levels is important in wildlife management. A wide range of devices to detect and deter animals causing conflict are used for this purpose, although their effectiveness is often highly variable, due to habituation to disruptive or disturbing stimuli. Automated recognition of behaviours could form a critical component of a system capable of altering the disruptive stimuli to avoid this. In this paper we present a novel method to automatically recognise goose behaviour based on vocalisations from flocks of free-living barnacle geese (Branta leucopsis). The geese were observed and recorded in a natural environment, using a shielded shotgun microphone. The classification used Support Vector Machines (SVMs), which had been trained with labeled data. Greenwood Function Cepstral Coefficients (GFCC) were used as features for the pattern recognition algorithm, as they can be adjusted to the hearing capabilities of different species. Three behaviours are classified based on this approach, and the method achieves a good recognition of foraging behaviour (86–97% sensitivity, 89–98% precision) and a reasonable recognition of flushing (79–86%, 66–80%) and landing behaviour (73–91%, 79–92%). The Support Vector Machine has proven to be a robust classifier for this kind of classification, as generality and non-linear capabilities are important. We conclude that vocalisations can be used to automatically detect behaviour of conflict wildlife species, and as such, may be used as an integrated part of a wildlife management system. PMID:22737037
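The recognition pipeline above (cepstral features from audio feeding a Support Vector Machine) can be sketched with standard tools. In the sketch below, ordinary MFCCs computed with librosa stand in for the Greenwood Function Cepstral Coefficients used in the paper, and the audio clips and behaviour labels are synthetic placeholders, so the printed accuracy means nothing beyond illustrating the workflow.

```python
# Hedged sketch of a cepstral-feature + SVM behaviour classifier.
# Assumption: MFCCs replace the paper's GFCC features; audio and labels are synthetic.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
sr = 22050

def fake_clip(freq):
    """One second of a noisy tone, a crude placeholder for a goose vocalisation."""
    t = np.arange(0, 1.0, 1 / sr)
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.standard_normal(len(t))

clips, labels = [], []
for behaviour, freq in [("foraging", 300), ("flushing", 800), ("landing", 1500)]:
    for _ in range(20):
        clips.append(fake_clip(freq * rng.uniform(0.9, 1.1)))
        labels.append(behaviour)

# Summarise each clip by the mean of its frame-wise cepstral coefficients.
X = np.array([librosa.feature.mfcc(y=c, sr=sr, n_mfcc=13).mean(axis=1) for c in clips])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```

With real recordings, frame-level features and cross-validated hyperparameter selection would replace the simple per-clip means used here.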
Bendixen, Alexandra; Scharinger, Mathias; Strauß, Antje; Obleser, Jonas
2014-04-01
Speech signals are often compromised by disruptions originating from external (e.g., masking noise) or internal (e.g., inaccurate articulation) sources. Speech comprehension thus entails detecting and replacing missing information based on predictive and restorative neural mechanisms. The present study targets predictive mechanisms by investigating the influence of a speech segment's predictability on early, modality-specific electrophysiological responses to this segment's omission. Predictability was manipulated in simple physical terms in a single-word framework (Experiment 1) or in more complex semantic terms in a sentence framework (Experiment 2). In both experiments, final consonants of the German words Lachs ([laks], salmon) or Latz ([lats], bib) were occasionally omitted, resulting in the syllable La ([la], no semantic meaning), while brain responses were measured with multi-channel electroencephalography (EEG). In both experiments, the occasional presentation of the fragment La elicited a larger omission response when the final speech segment had been predictable. The omission response occurred ∼125-165 msec after the expected onset of the final segment and showed characteristics of the omission mismatch negativity (MMN), with generators in auditory cortical areas. Suggestive of a general auditory predictive mechanism at work, this main observation was robust against varying source of predictive information or attentional allocation, differing between the two experiments. Source localization further suggested the omission response enhancement by predictability to emerge from left superior temporal gyrus and left angular gyrus in both experiments, with additional experiment-specific contributions. These results are consistent with the existence of predictive coding mechanisms in the central auditory system, and suggestive of the general predictive properties of the auditory system to support spoken word recognition. Copyright © 2014 Elsevier Ltd. All rights reserved.
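The core contrast above, a larger omission response when the missing segment was predictable, reduces to averaging epochs time-locked to the expected segment onset and comparing amplitudes in the 125-165 ms window. The sketch below simulates single-channel epochs for two conditions; the effect size, sampling rate and channel are invented for illustration.

```python
# Hedged sketch: compare mean omission-response amplitude (125-165 ms) between
# predictable and unpredictable contexts using simulated single-channel epochs.
import numpy as np

fs = 500.0
times = np.arange(-0.1, 0.4, 1 / fs)            # epoch from -100 to +400 ms
rng = np.random.default_rng(8)

def simulate_epochs(n, mmn_amp):
    noise = rng.normal(0, 2.0, (n, len(times)))
    # A negative deflection peaking ~145 ms, larger when the omission was predictable.
    mmn = -mmn_amp * np.exp(-((times - 0.145) ** 2) / (2 * 0.02 ** 2))
    return noise + mmn

predictable = simulate_epochs(120, mmn_amp=3.0)
unpredictable = simulate_epochs(120, mmn_amp=1.0)

win = (times >= 0.125) & (times <= 0.165)
for name, epochs in [("predictable", predictable), ("unpredictable", unpredictable)]:
    erp = epochs.mean(axis=0)                   # average over trials
    print(f"{name:13s} mean amplitude 125-165 ms: {erp[win].mean():.2f} (a.u.)")
```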
Sunami, Kishiko; Ishii, Akira; Takano, Sakurako; Yamamoto, Hidefumi; Sakashita, Tetsushi; Tanaka, Masaaki; Watanabe, Yasuyoshi; Yamane, Hideo
2013-11-06
In daily communication, when spoken words are masked by background noise, we can usually still hear them as if they had not been masked and can comprehend the speech. This phenomenon is known as phonemic restoration. Since little is known about the neural mechanisms underlying phonemic restoration for speech comprehension, we aimed to identify the neural mechanisms using magnetoencephalography (MEG). Twelve healthy male volunteers with normal hearing participated in the study. Participants were requested to carefully listen to and understand recorded spoken Japanese stories, which were either played forward (forward condition) or in reverse (reverse condition), with their eyes closed. Several syllables of spoken words were replaced by 300-ms white-noise stimuli with an inter-stimulus interval of 1.6-20.3 s. We compared MEG responses to white-noise stimuli during the forward condition with those during the reverse condition using time-frequency analyses. Increased 3-5 Hz band power in the forward condition compared with the reverse condition was continuously observed in the left inferior frontal gyrus [Brodmann's areas (BAs) 45, 46, and 47], and decreased 18-22 Hz band powers caused by white-noise stimuli were seen in the left transverse temporal gyrus (BA 42) and superior temporal gyrus (BA 22). These results suggest that the left inferior frontal gyrus and left transverse and superior temporal gyri are involved in phonemic restoration for speech comprehension. Our findings may help clarify the neural mechanisms of phonemic restoration as well as develop innovative treatment methods for individuals suffering from impaired speech comprehension, particularly in noisy environments. © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
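The band-power comparison above can be sketched as follows: estimate power in the 3-5 Hz band for epochs around the noise stimuli and compare the forward and reverse conditions. The signals below are synthetic single-sensor epochs, not MEG data, and Welch's method is used as a generic spectral estimator rather than the study's specific time-frequency analysis.

```python
# Hedged sketch: 3-5 Hz band power in two conditions, using synthetic epochs.
import numpy as np
from scipy.signal import welch

fs = 250.0
rng = np.random.default_rng(6)

def band_power(x, fs, lo, hi):
    f, pxx = welch(x, fs=fs, nperseg=int(fs))   # ~1 s windows -> 1 Hz resolution
    band = (f >= lo) & (f <= hi)
    return pxx[band].sum() * (f[1] - f[0])      # integrate PSD over the band

# Two hypothetical single-trial epochs (2 s each): "forward" vs "reverse" condition.
t = np.arange(0, 2.0, 1 / fs)
forward_epoch = 2.0 * np.sin(2 * np.pi * 4 * t) + rng.standard_normal(len(t))
reverse_epoch = 0.5 * np.sin(2 * np.pi * 4 * t) + rng.standard_normal(len(t))

print("3-5 Hz power, forward :", band_power(forward_epoch, fs, 3, 5))
print("3-5 Hz power, reverse :", band_power(reverse_epoch, fs, 3, 5))
```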
Marsh, John E; Vachon, François; Sörqvist, Patrik
2017-03-01
Individuals with schizophrenia typically show increased levels of distractibility. This has been attributed to impaired working memory capacity (WMC), since lower WMC is typically associated with higher distractibility, and schizophrenia is typically associated with impoverished WMC. Here, participants performed verbal and spatial serial recall tasks that were accompanied by to-be-ignored speech tokens. For the few trials wherein one speech token was replaced with a different token, impairment was produced to task scores (a deviation effect). Participants subsequently completed a schizotypy questionnaire and a WMC measure. Higher schizotypy scores were associated with lower WMC (as measured with operation span, OSPAN), but WMC and schizotypy scores explained unique variance in relation to the mean magnitude of the deviation effect. These results suggest that schizotypy is associated with heightened domain-general distractibility, but that this is independent of its relationship with WMC.
Humanlike Articulate Robotic Headform to Replace Human Volunteers in Respirator Fit Testing
2012-12-01
…Vinci suggested that a wet, finely woven cloth could protect sailors from the particles [6] and, later in the 16th century, Agricola described a… survivors. An improvised response by Canadian troops, using urine-soaked cloths as primitive respirators to dissolve and neutralize the chlorine vapor… speech, and were usually covered with a thin layer of rubber or plastic that made no attempt to mimic the thickness and properties of human facial…
Fogerty, Daniel
2014-01-01
The present study investigated the importance of overall segment amplitude and intrinsic segment amplitude modulation of consonants and vowels to sentence intelligibility. Sentences were processed according to three conditions that replaced consonant or vowel segments with noise matched to the long-term average speech spectrum. Segments were replaced with (1) low-level noise that distorted the overall sentence envelope, (2) segment-level noise that restored the overall syllabic amplitude modulation of the sentence, and (3) segment-modulated noise that further restored faster temporal envelope modulations during the vowel. Results from the first experiment demonstrated an incremental benefit with increasing resolution of the vowel temporal envelope. However, amplitude modulations of replaced consonant segments had a comparatively minimal effect on overall sentence intelligibility scores. A second experiment selectively noise-masked preserved vowel segments in order to equate overall performance of consonant-replaced sentences to that of the vowel-replaced sentences. Results demonstrated no significant effect of restoring consonant modulations during the interrupting noise when existing vowel cues were degraded. A third experiment demonstrated greater perceived sentence continuity with the preservation or addition of vowel envelope modulations. Overall, results support previous investigations demonstrating the importance of vowel envelope modulations to the intelligibility of interrupted sentences. PMID:24606291
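The three replacement conditions above differ only in how much of the original segment's amplitude information the noise retains. The sketch below illustrates that logic on a synthetic signal; for simplicity, white noise stands in for noise matched to the long-term average speech spectrum, and the cutoff frequencies and segment boundaries are arbitrary choices.

```python
# Hedged sketch of three ways to replace a segment with noise:
# (1) flat low-level noise, (2) noise at the segment's level, (3) envelope-modulated noise.
# Synthetic audio and white noise are used as placeholders.
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

fs = 16000
rng = np.random.default_rng(7)
t = np.arange(0, 1.0, 1 / fs)
# Placeholder "speech": a few harmonics under a slow amplitude contour.
speech = (np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 450 * t)) * (0.6 + 0.4 * np.sin(2 * np.pi * 3 * t))

noise = rng.standard_normal(len(speech))        # stand-in for spectrum-matched noise
seg = slice(int(0.30 * fs), int(0.40 * fs))     # the segment to be replaced (300-400 ms)

seg_rms = np.sqrt(np.mean(speech[seg] ** 2))
noise_seg = noise[seg] * seg_rms / np.sqrt(np.mean(noise[seg] ** 2))   # segment-level noise

env_sos = butter(4, 30, btype="lowpass", fs=fs, output="sos")
seg_env = sosfiltfilt(env_sos, np.abs(hilbert(speech[seg])))           # the segment's own envelope
seg_env /= seg_env.max()

low_level = speech.copy()
low_level[seg] = 0.1 * noise_seg                # (1) flat, low-level noise: envelope disrupted
segment_level = speech.copy()
segment_level[seg] = noise_seg                  # (2) segment-level noise: syllabic modulation kept
modulated = speech.copy()
modulated[seg] = noise_seg * seg_env            # (3) envelope-modulated noise: faster modulations restored

for name, x in [("low-level", low_level), ("segment-level", segment_level), ("modulated", modulated)]:
    print(f"{name:14s} RMS in replaced segment: {np.sqrt(np.mean(x[seg] ** 2)):.3f}")
```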
The behaviour and welfare of sows and piglets in farrowing crates or lactation pens.
Singh, C; Verdon, M; Cronin, G M; Hemsworth, P H
2017-07-01
Temporary confinement during parturition and early postpartum may provide an intermediary step preceding loose housing that offers improvement in sow and piglet welfare. Three experiments were conducted to investigate the implications of replacing farrowing crates (FCs) with an alternative housing system from 3 days postpartum until weaning. In each experiment sows farrowed in FCs and were randomly allocated at day 3 of lactation to either a FC or a pen with increased floor space (lactation pen (LP)) until weaning. In experiment 1, piglet growth and sow and piglet skin injuries were recorded for 32 sows and 128 focal piglets in these litters. Behaviour around nursing and piglet behavioural time budgets were also recorded for 24 of these litters (96 focal piglets for time budgets). In experiment 2, measures of skin injury and behavioural time budgets were conducted on 28 sows and 112 focal piglets. The behavioural response of sows to piglet vocalisation (maternal responsiveness test (MRT)) was also assessed. In experiment 3, piglet mortality from day 3 of lactation until weaning was recorded in 672 litters over 12 months. While housing did not affect piglet weight gain in experiment 1, or piglet skin injuries in experiments 1 or 2, sows in both experiments sustained more injuries in LP than FC (experiment 1, 2.9 v. 1.4; experiment 2, 2.5 v. 0.8 lesions/sow; P<0.05). Thus, housing sows and litters in LP from day 3 of lactation minimises piglet mortality while improving maternal behaviour in sows and social behaviour in piglets.
Neural network wavelet technology: A frontier of automation
NASA Technical Reports Server (NTRS)
Szu, Harold
1994-01-01
Neural networks are an outgrowth of interdisciplinary studies concerning the brain. These studies are guiding the field of Artificial Intelligence towards the so-called 6th Generation Computer. Enormous amounts of resources have been poured into R&D. Wavelet Transforms (WT) have replaced Fourier Transforms (FT) in Wideband Transient (WT) cases since the discovery of WT in 1985. The list of successful applications includes the following: earthquake prediction; radar identification; speech recognition; stock market forecasting; FBI fingerprint image compression; and telecommunication ISDN-data compression.
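The contrast drawn above between Fourier and wavelet transforms is easy to see on a transient signal: the Fourier spectrum summarises frequency content globally, while a discrete wavelet decomposition keeps a brief burst localised in time at the appropriate scales. The sketch below uses PyWavelets on a synthetic tone-plus-transient; the wavelet family and decomposition depth are arbitrary choices.

```python
# Hedged sketch contrasting a global FFT with a multi-level discrete wavelet transform
# on a signal containing a brief transient (synthetic illustration).
import numpy as np
import pywt

fs = 1024
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 20 * t)
x[500:520] += 3.0 * np.exp(-np.arange(20) / 5.0)   # a brief transient burst

# Fourier: global frequency content (the transient is spread across many bins).
spectrum = np.abs(np.fft.rfft(x))
print("peak FFT bin (Hz):", np.argmax(spectrum) * fs / len(x))

# Wavelet: multi-level decomposition keeps the transient localised in time/scale.
coeffs = pywt.wavedec(x, "db4", level=4)
for i, c in enumerate(coeffs):
    print(f"level {i}: {len(c)} coefficients, max |coeff| = {np.abs(c).max():.2f}")
```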
Patient monitoring in the operating theatre.
Forrest, A L; Douglas, D M; Rimmer, A R
1976-09-01
Anaesthetised patients are monitored to ensure their safety. Simple clinical observations must not be replaced by electronic instruments--these provide an extension of the clinical senses. The choice of parameters for monitoring is discussed. The design of the Ninewells main operating theatre suite is described. An 8-channel bourne in the base of the theatre table conveys patient signals to a 4-channel recorder in a monitoring laboratory. Outputs are displayed on a wall mounted display in theatre. Two-way speech intercommunication exists with monitoring technician and students.
Plessas, I N; Rusbridge, C; Driver, C J; Chandler, K E; Craig, A; McGonnell, I M; Brodbelt, D C; Volk, H A
2012-11-17
The disease complex Chiari-like malformation (CM) and syringomyelia (SM) has been associated with the development of neuropathic pain (NeP), and commonly affects Cavalier King Charles spaniels (CKCS). This prospective cohort study followed 48 CKCSs with CM and/or SM and clinical signs suggestive of NeP for a period of 39 (±14.3) months from diagnosis. At the end of the study, 36 dogs were still alive; five dogs died of an unrelated or unknown cause, and seven were euthanased due to severe clinical signs suggestive of NeP. During the follow-up period, the clinical signs of scratching, facial rubbing behaviour, vocalisation and exercise ability were evaluated. Nine out of 48 dogs stopped scratching (P<0.001), but there was no statistically significant change in the number of dogs exhibiting exercise intolerance, vocalisation or facial rubbing behaviour. The overall severity of clinical signs based on a visual analogue scale (VAS) (0 mm: no clinical signs; 100 mm: severe clinical signs) increased (from median 75 mm (interquartile range (IQR) 68-84) to 84 mm (IQR 71.5-91), P<0.001). A quarter of the dogs were static or improved. In general, the majority of the owners felt that the quality of life of their dogs was acceptable. Medical treatments received were gabapentin or pregabalin and/or, intermittently, carprofen. The owner's perception of their animal's progress, and progress based on VAS, had strong positive correlation (Spearman's rank correlation rs = 0.74, P<0.001). Overall, this study suggests that clinical signs suggestive of NeP progress in three-quarters of CKCSs with CM and/or SM.
Anthropogenic noise disrupts use of vocal information about predation risk.
Kern, Julie M; Radford, Andrew N
2016-11-01
Anthropogenic noise is rapidly becoming a universal environmental feature. While the impacts of such additional noise on avian sexual signals are well documented, our understanding of its effect in other terrestrial taxa, on other vocalisations, and on receivers is more limited. Little is known, for example, about the influence of anthropogenic noise on responses to vocalisations relating to predation risk, despite the potential fitness consequences. We use playback experiments to investigate the impact of traffic noise on the responses of foraging dwarf mongooses (Helogale parvula) to surveillance calls produced by sentinels, individuals scanning for danger from a raised position whose presence usually results in reduced vigilance by foragers. Foragers exhibited a lessened response to surveillance calls in traffic-noise compared to ambient-sound playback, increasing personal vigilance. A second playback experiment, using noise playbacks without surveillance calls, suggests that the increased vigilance could arise in part from the direct influence of additional noise as there was an increase in response to traffic-noise playback alone. Acoustic masking could also play a role. Foragers maintained the ability to distinguish between sentinels of different dominance class, increasing personal vigilance when presented with subordinate surveillance calls compared to calls of a dominant groupmate in both noise treatments, suggesting complete masking was not occurring. However, an acoustic-transmission experiment showed that while surveillance calls were potentially audible during approaching traffic noise, they were probably inaudible during peak traffic intensity noise. While recent work has demonstrated detrimental effects of anthropogenic noise on defensive responses to actual predatory attacks, which are relatively rare, our results provide evidence of a potentially more widespread influence since animals should constantly assess background risk to optimise the foraging-vigilance trade-off. Copyright © 2016 Elsevier Ltd. All rights reserved.
Springer, Priscilla E; Slogrove, Amy L; Laughton, Barbara; Bettinger, Julie A; Saunders, Henriëtte H; Molteno, Christopher D; Kruger, Mariana
2018-01-01
To compare neurodevelopmental outcomes of HIV-exposed uninfected (HEU) and HIV-unexposed uninfected (HUU) infants in a peri-urban South African population. HEU infants living in Africa face unique biological and environmental risks, but uncertainty remains regarding their neurodevelopmental outcome. This is partly due to lack of well-matched HUU comparison groups needed to adjust for confounding factors. This was a prospective cohort study of infants enrolled at birth from a low-risk midwife obstetric facility. At 12 months of age, HEU and HUU infant growth and neurodevelopmental outcomes were compared. Growth was evaluated as WHO weight-for-age, length-for-age, weight-for-length and head-circumference-for-age Z-scores. Neurodevelopmental outcomes were evaluated using the Bayley scales of Infant Development III (BSID) and Alarm Distress Baby Scale (ADBB). Fifty-eight HEU and 38 HUU infants were evaluated at 11-14 months of age. Performance on the BSID did not differ in any of the domains between HEU and HUU infants. The cognitive, language and motor scores were within the average range (US standardised norms). Seven (12%) HEU and 1 (2.6%) HUU infant showed social withdrawal on the ADBB (P = 0.10), while 15 (26%) HEU and 4 (11%) HUU infants showed decreased vocalisation (P = 0.06). There were no growth differences. Three HEU and one HUU infant had minor neurological signs, while eight HEU and two HUU infants had macrocephaly. Although findings on the early neurodevelopmental outcome of HEU infants are reassuring, minor differences in vocalisation and on neurological examination indicate a need for reassessment at a later age. © 2017 John Wiley & Sons Ltd.
White, Hannah J; Haycraft, Emma; Madden, Sloane; Rhodes, Paul; Miskovic-Wheatley, Jane; Wallis, Andrew; Kohn, Michael; Meyer, Caroline
2015-01-01
To examine the range and frequency of parental mealtime strategies used during the family meal session of Family-Based Treatment (FBT) for adolescent anorexia nervosa, and to explore the relationships between parental mealtime strategies, mealtime emotional tone and parental 'success' at encouraging adolescent food consumption. Participants were 21 families with a child aged between 12 and 18 years receiving FBT for adolescent anorexia nervosa. Video recordings of the family meal session (FBT session two) were coded using the Family Mealtime Coding System adapted in this study for use with adolescents (FMCS-A) to identify frequency of parental strategies, emotional tone of the meal (measured by adolescent positive and negative vocalisations) and frequency of prompted mouthfuls consumed by the adolescent (measured by the number of mouthfuls consumed by the adolescent immediately following parental interactions). A range of parental mealtime strategies were in use. Those used repeatedly included direct eating prompts, non-direct eating prompts, physical prompts, and providing information or food-related choices. Several parental mealtime strategies (direct and non-direct eating prompts) were found to be consistently associated with the tone of adolescents' vocalisations and the number of mouthfuls consumed in response to a parental prompt. Despite associations with negativity from the adolescent, the use of food-related prompts (both verbal and physical) seems to be associated with increased eating. This indicates the potentially important role of parental control of eating. Following replication, these findings might provide a focus for therapists when supporting and coaching parents during the family meal session. © 2014 Wiley Periodicals, Inc.
The emerging phenotype of long-term survivors with infantile Pompe disease
Prater, Sean N.; Banugaria, Suhrad G.; DeArmey, Stephanie M.; Botha, Eleanor G.; Stege, Erin M.; Case, Laura E.; Jones, Harrison N.; Phornphutkul, Chanika; Wang, Raymond Y.; Young, Sarah P.; Kishnani, Priya S.
2013-01-01
Purpose Enzyme replacement therapy with alglucosidase alfa for infantile Pompe disease has improved survival creating new management challenges. We describe an emerging phenotype in a retrospective review of long-term survivors. Methods Inclusion criteria included ventilator-free status and age ≤6 months at treatment initiation, and survival to age ≥5 years. Clinical outcome measures included invasive ventilator-free survival and parameters for cardiac, pulmonary, musculoskeletal, gross motor and ambulatory status; growth; speech, hearing, and swallowing; and gastrointestinal and nutritional status. Results Eleven of 17 patients met study criteria. All were cross-reactive immunologic material-positive, alive, and invasive ventilator-free at most recent assessment, with a median age of 8.0 years (range: 5.4 to 12.0 years). All had marked improvements in cardiac parameters. Commonly present were gross motor weakness, motor speech deficits, sensorineural and/or conductive hearing loss, osteopenia, gastroesophageal reflux disease, and dysphagia with aspiration risk. Seven of 11 patients were independently ambulatory and four required the use of assistive ambulatory devices. All long-term survivors had low or undetectable anti-alglucosidase alfa antibody titers. Conclusions Long-term survivors exhibited sustained improvements in cardiac parameters and gross motor function. Residual muscle weakness, hearing loss, risk for arrhythmias, hypernasal speech, dysphagia with risk for aspiration, and osteopenia were commonly observed findings. PMID:22538254
Benninger, M S
2011-02-01
The human voice is not only the key to human communication but also serves as the primary musical instrument. Many professions rely on the voice, but the most noticeable and visible are singers. Care of the performing voice requires a thorough understanding of the interaction between the anatomy and physiology of voice production, along with an awareness of the interrelationships between vocalisation, acoustic science and non-vocal components of performance. This review gives an overview of the care and prevention of professional voice disorders by describing the unique and integrated anatomy and physiology of singing, the roles of development and training, and the importance of the voice care team.
Avian vocal mimicry: a unified conceptual framework.
Dalziell, Anastasia H; Welbergen, Justin A; Igic, Branislav; Magrath, Robert D
2015-05-01
Mimicry is a classical example of adaptive signal design. Here, we review the current state of research into vocal mimicry in birds. Avian vocal mimicry is a conspicuous and often spectacular form of animal communication, occurring in many distantly related species. However, the proximate and ultimate causes of vocal mimicry are poorly understood. In the first part of this review, we argue that progress has been impeded by conceptual confusion over what constitutes vocal mimicry. We propose a modified version of Vane-Wright's (1980) widely used definition of mimicry. According to our definition, a vocalisation is mimetic if the behaviour of the receiver changes after perceiving the acoustic resemblance between the mimic and the model, and the behavioural change confers a selective advantage on the mimic. Mimicry is therefore specifically a functional concept where the resemblance between heterospecific sounds is a target of selection. It is distinct from other forms of vocal resemblance including those that are the result of chance or common ancestry, and those that have emerged as a by-product of other processes such as ecological convergence and selection for large song-type repertoires. Thus, our definition provides a general and functionally coherent framework for determining what constitutes vocal mimicry, and takes account of the diversity of vocalisations that incorporate heterospecific sounds. In the second part we assess and revise hypotheses for the evolution of avian vocal mimicry in the light of our new definition. Most of the current evidence is anecdotal, but the diverse contexts and acoustic structures of putative vocal mimicry suggest that mimicry has multiple functions across and within species. There is strong experimental evidence that vocal mimicry can be deceptive, and can facilitate parasitic interactions. There is also increasing support for the use of vocal mimicry in predator defence, although the mechanisms are unclear. Less progress has been made in explaining why many birds incorporate heterospecific sounds into their sexual displays, and in determining whether these vocalisations are functionally mimetic or by-products of sexual selection for other traits such as repertoire size. Overall, this discussion reveals a more central role for vocal mimicry in the behavioural ecology of birds than has previously been appreciated. The final part of this review identifies important areas for future research. Detailed empirical data are needed on individual species, including on the structure of mimetic signals, the contexts in which mimicry is produced, how mimicry is acquired, and the ecological relationships between mimic, model and receiver. At present, there is little information and no consensus about the various costs of vocal mimicry for the protagonists in the mimicry complex. The diversity and complexity of vocal mimicry in birds raises important questions for the study of animal communication and challenges our view of the nature of mimicry itself. Therefore, a better understanding of avian vocal mimicry is essential if we are to account fully for the diversity of animal signals. © 2014 The Authors. Biological Reviews © 2014 Cambridge Philosophical Society.
The Responsibility to Protect: Intervention is Not Enough
2013-05-23
…regarding the implementation of R2P.[27] Ban Ki-moon, the UNSG replacing Kofi Annan in October 2006, challenged the UN General Assembly to turn its… commitments to R2P in the 2005 World Summit Outcomes from "words" into "deeds" in a speech given in Berlin in 2008.[28] Ban Ki-moon attempted to reconcile… the Summit Outcome document.[30] As part of his efforts to provide a better framework for implementing R2P, Ban Ki-moon proposed a related, but…
Kharb, Sandeep; Gundgurthi, Abhay; Dutta, Manoj K.; Garg, M. K.
2013-01-01
A 27-year-old male was admitted with diabetic ketoacidosis and altered sensorium with slurring of speech and ataxia. He was managed with intravenous insulin and fluids and later shifted to a basal-bolus insulin regimen, and during further evaluation was diagnosed to be suffering from primary hypothyroidism and adrenal insufficiency. He was started on thyroxin replacement and steroids only during stress. After three months of follow up he was clinically euthyroid. His glycemic control was adequate on oral anti-hyperglycemic drugs and adrenal insufficiency recovered. However, his thyrotropin levels were persistently elevated on adequate replacement doses of thyroxin. His repeat TSH was estimated after precipitating serum with polyethylene glycol, which revealed normal TSH. Here we report reversible adrenal insufficiency with hypothyroidism with falsely raised TSH because of the presence of heterophile antibodies in a case of polyglandular endocrinopathy syndrome. PMID:24910843
Kharb, Sandeep; Gundgurthi, Abhay; Dutta, Manoj K; Garg, M K
2013-12-01
A 27-year-old male was admitted with diabetic ketoacidosis and altered sensorium with slurring of speech and ataxia. He was managed with intravenous insulin and fluids and later shifted to a basal-bolus insulin regimen, and during further evaluation was diagnosed to be suffering from primary hypothyroidism and adrenal insufficiency. He was started on thyroxin replacement and steroids only during stress. After three months of follow up he was clinically euthyroid. His glycemic control was adequate on oral anti-hyperglycemic drugs and adrenal insufficiency recovered. However, his thyrotropin levels were persistently elevated on adequate replacement doses of thyroxin. His repeat TSH was estimated after precipitating serum with polyethylene glycol, which revealed normal TSH. Here we report reversible adrenal insufficiency with hypothyroidism with falsely raised TSH because of the presence of heterophile antibodies in a case of polyglandular endocrinopathy syndrome.
Fairweather, Glenn C; Lincoln, Michelle A; Ramsden, Robyn
2017-01-01
Difficulties in accessing allied health services, especially in rural and remote areas, appear to be driving the use of telehealth services to children in schools. The objectives of this study were to investigate the experiences and views of school executive staff and therapy assistants regarding the feasibility and acceptability of a speech-language pathology telehealth program for children attending schools in rural and remote New South Wales, Australia. The program, called Come N See, provided therapy interventions remotely via low-bandwidth videoconferencing, with email follow-up. Over a 12-week period, children were offered therapy blocks of six fortnightly sessions, each lasting a maximum of 30 minutes. School executives (n=5) and therapy assistants (n=6) described factors that promoted or threatened the program's feasibility and acceptability, during semistructured interviews. Thematic content analysis with constant comparison was applied to the transcribed interviews to identify relationships in the data. Emergent themes related to (a) unmet speech pathology needs, (b) building relationships, (c) telehealth's advantages, (d) telehealth's disadvantages, (e) anxiety replaced by joy and confidence in growing skills, and (f) supports. School executive staff and therapy assistants verified that the delivery of the school-based telehealth service was feasible and acceptable. However, the participants saw significant opportunities to enhance this acceptability through building into the program stronger working relationships and supports for stakeholders. These findings are important for the future development of allied health telehealth programs that are sustainable as well as effective and fit the needs of all crucial stakeholders. The results have significant implications for speech pathology clinical practice relating to technology, program planning and teamwork within telehealth programs.
McGettigan, Carolyn; Rosen, Stuart; Scott, Sophie K.
2014-01-01
Noise-vocoding is a transformation which, when applied to speech, severely reduces spectral resolution and eliminates periodicity, yielding a stimulus that sounds “like a harsh whisper” (Scott et al., 2000, p. 2401). This process simulates a cochlear implant, where the activity of many thousand hair cells in the inner ear is replaced by direct stimulation of the auditory nerve by a small number of tonotopically-arranged electrodes. Although a cochlear implant offers a powerful means of restoring some degree of hearing to profoundly deaf individuals, the outcomes for spoken communication are highly variable (Moore and Shannon, 2009). Some variability may arise from differences in peripheral representation (e.g., the degree of residual nerve survival) but some may reflect differences in higher-order linguistic processing. In order to explore this possibility, we used noise-vocoding to explore speech recognition and perceptual learning in normal-hearing listeners tested across several levels of the linguistic hierarchy: segments (consonants and vowels), single words, and sentences. Listeners improved significantly on all tasks across two test sessions. In the first session, individual differences analyses revealed two independently varying sources of variability: one lexico-semantic in nature and implicating the recognition of words and sentences, and the other an acoustic-phonetic factor associated with words and segments. However, consequent to learning, by the second session there was a more uniform covariance pattern concerning all stimulus types. A further analysis of phonetic feature recognition allowed greater insight into learning-related changes in perception and showed that, surprisingly, participants did not make full use of cues that were preserved in the stimuli (e.g., vowel duration). We discuss these findings in relation to cochlear implantation, and suggest auditory training strategies to maximize speech recognition performance in the absence of typical cues. PMID:24616669
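The noise-vocoding transformation described above is straightforward to approximate in code. The sketch below is a minimal illustration under assumed parameters (16 kHz mono input, six logarithmically spaced bands, illustrative function names), not the stimulus-generation pipeline used in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs=16000, n_bands=6, lo=100.0, hi=7000.0):
    """Replace the fine structure in each frequency band with noise,
    keeping only the band's amplitude envelope (noise-vocoding)."""
    edges = np.geomspace(lo, hi, n_bands + 1)        # log-spaced band edges
    noise = np.random.randn(len(speech))
    out = np.zeros(len(speech), dtype=float)
    for low, high in zip(edges[:-1], edges[1:]):
        sos = butter(4, [low, high], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, speech)              # band-limited speech
        env = np.abs(hilbert(band))                  # amplitude envelope
        carrier = sosfiltfilt(sos, noise)            # band-limited noise carrier
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-9)        # normalise to avoid clipping
```

Reducing n_bands (for example to 2) removes progressively more spectral detail, which is the kind of manipulation exploited in noise-vocoding experiments.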
[Velopharyngeal closure pattern and speech performance among submucous cleft palate patients].
Heng, Yin; Chunli, Guo; Bing, Shi; Yang, Li; Jingtao, Li
2017-06-01
To characterize the velopharyngeal closure patterns and speech performance among submucous cleft palate patients. Patients with submucous cleft palate visiting the Department of Cleft Lip and Palate Surgery, West China Hospital of Stomatology, Sichuan University between 2008 and 2016 were reviewed. Outcomes of subjective speech evaluation, including velopharyngeal function and consonant articulation, and of objective nasopharyngeal endoscopy, including the mobility of the soft palate and pharyngeal walls, were retrospectively analyzed. A total of 353 cases were retrieved in this study, among which 138 (39.09%) demonstrated velopharyngeal competence, 176 (49.86%) velopharyngeal incompetence, and 39 (11.05%) marginal velopharyngeal incompetence. A total of 268 cases were subjected to nasopharyngeal endoscopy examination, where 167 (62.31%) demonstrated a circular closure pattern, 89 (33.21%) a coronal pattern, and 12 (4.48%) a sagittal pattern. Passavant's ridge existed in 45.51% (76/167) of patients with circular closure and 13.48% (12/89) of patients with coronal closure. Among the 353 patients included in this study, 137 (38.81%) presented normal articulation, 124 (35.13%) consonant elimination, 51 (14.45%) compensatory articulation, 36 (10.20%) consonant weakening, 25 (7.08%) consonant replacement, and 36 (10.20%) multiple articulation errors. Circular closure was the most prevalent velopharyngeal closure pattern among patients with submucous cleft palate, and high-pressure consonant deletion was the most common articulation abnormality. Articulation errors occurred more frequently among patients with a low velopharyngeal closure rate.
van Kuijk, Silvy M; García-Suikkanen, Carolina; Tello-Alvarado, Julio C; Vermeer, Jan; Hill, Catherine M
2015-01-01
We calculated the population density of the critically endangered Callicebus oenanthe in the Ojos de Agua Conservation Concession, a dry forest area in the department of San Martin, Peru. Results showed significant differences (p < 0.01) in group densities between forest boundaries (16.5 groups/km2, IQR = 21.1-11.0) and forest interior (4.0 groups/km2, IQR = 5.0-0.0), suggesting the 2,550-ha area harbours roughly 1,150 titi monkeys. This makes Ojos de Agua an important cornerstone in the conservation of the species, because it is one of the largest protected areas where the species occurs. © 2016 S. Karger AG, Basel.
Continuous Speech Recognition for Clinicians
Zafar, Atif; Overhage, J. Marc; McDonald, Clement J.
1999-01-01
The current generation of continuous speech recognition systems claims to offer high accuracy (greater than 95 percent) speech recognition at natural speech rates (150 words per minute) on low-cost (under $2000) platforms. This paper presents a state-of-the-technology summary, along with insights the authors have gained through testing one such product extensively and other products superficially. The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add an additional 400 terms to get recognition rates of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to get a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute “twice a day” for “bid” when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians' offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal “usable” hardware configuration (which keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a “speech quality” sound card (e.g., SoundBlaster, $99). Anything less powerful will result in the system lagging behind the speaking rate. The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary. This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription. PMID:10332653
Emotionality in growing pigs: is the open field a valid test?
Donald, Ramona D; Healy, Susan D; Lawrence, Alistair B; Rutherford, Kenneth M D
2011-10-24
The ability to assess emotionality is important within animal welfare research. Yet, for farm animals, few tests of emotionality have been well validated. Here we investigated the construct validity of behavioural measures of pig emotionality in an open-field test by manipulating the experiences of pigs in three ways. In Experiment One (pharmacological manipulation), pigs pre-treated with Azaperone, a drug used to reduce stress in commercial pigs, were more active, spent more time exploring and vocalised less than control pigs. In Experiment Two (social manipulation), pigs that experienced the open-field arena with a familiar companion were also more exploratory, spent less time behaviourally idle, and were less vocal than controls although to a lesser degree than in Experiment One. In Experiment Three (novelty manipulation), pigs experiencing the open field for a second time were less active, explored less and vocalised less than they had done in the first exposure to the arena. A principal component analysis was conducted on data from all three trials. The first two components could be interpreted as relating to the form (cautious to exploratory) and magnitude (low to high arousal) of the emotional response to open-field testing. Based on these dimensions, in Experiment One, Azaperone pigs appeared to be less fearful than saline-treated controls. However, in Experiment Two, exposure to the arena with a conspecific did not affect the first two dimensions but did affect a third behavioural dimension, relating to oro-nasal exploration of the arena floor. In Experiment Three, repeat exposure altered the form but not the magnitude of emotional response: pigs were less exploratory in the second test. In conclusion, behavioural measures taken from pigs in an open-field test are sensitive to manipulations of their prior experience in a manner that suggests they reflect underlying emotionality. Behavioural measures taken during open-field exposure can be useful for making assessments of both pig emotionality and of their welfare. Copyright © 2011 Elsevier Inc. All rights reserved.
Boumans, Tiny; Gobes, Sharon M. H.; Poirier, Colline; Theunissen, Frederic E.; Vandersmissen, Liesbeth; Pintjens, Wouter; Verhoye, Marleen; Bolhuis, Johan J.; Van der Linden, Annemie
2008-01-01
Background Male songbirds learn their songs from an adult tutor when they are young. A network of brain nuclei known as the ‘song system’ is the likely neural substrate for sensorimotor learning and production of song, but the neural networks involved in processing the auditory feedback signals necessary for song learning and maintenance remain unknown. Determining which regions show preferential responsiveness to the bird's own song (BOS) is of great importance because neurons sensitive to self-generated vocalisations could mediate this auditory feedback process. Neurons in the song nuclei and in a secondary auditory area, the caudal medial mesopallium (CMM), show selective responses to the BOS. The aim of the present study is to investigate the emergence of BOS selectivity within the network of primary auditory sub-regions in the avian pallium. Methods and Findings Using blood oxygen level-dependent (BOLD) fMRI, we investigated neural responsiveness to natural and manipulated self-generated vocalisations and compared the selectivity for BOS and conspecific song in different sub-regions of the thalamo-recipient area Field L. Zebra finch males were exposed to conspecific song, BOS and to synthetic variations on BOS that differed in spectro-temporal and/or modulation phase structure. We found significant differences in the strength of BOLD responses between regions L2a, L2b and CMM, but no inter-stimuli differences within regions. In particular, we have shown that the overall signal strength to song and synthetic variations thereof was different within two sub-regions of Field L2: zone L2a was significantly more activated compared to the adjacent sub-region L2b. Conclusions Based on our results we suggest that unlike nuclei in the song system, sub-regions in the primary auditory pallium do not show selectivity for the BOS, but appear to show different levels of activity with exposure to any sound according to their place in the auditory processing stream. PMID:18781203
Li, Jianwen; Li, Yan; Zhang, Ming; Ma, Weifang; Ma, Xuezong
2014-01-01
The current use of hearing aids and artificial cochleas for deaf-mute individuals depends on their auditory nerve. Skin-hearing technology, a patented system developed by our group, uses a cutaneous sensory nerve to substitute for the auditory nerve to help deaf-mutes to hear sound. This paper introduces a new solution, multi-channel-array skin-hearing technology, to solve the problem of speech discrimination. Based on the filtering principle of hair cells, external voice signals at different frequencies are converted to current signals at corresponding frequencies using electronic multi-channel bandpass filtering technology. Different positions on the skin can be stimulated by the electrode array, allowing the perception and discrimination of external speech signals to be determined by the skin response to the current signals. Through voice frequency analysis, the frequency range of the band-pass filter can also be determined. These findings demonstrate that the sensory nerves in the skin can help to transfer the voice signal and to distinguish the speech signal, suggesting that the skin sensory nerves are good candidates for the replacement of the auditory nerve in addressing deaf-mutes’ hearing problems. Scientific hearing experiments can be more safely performed on the skin. Compared with the artificial cochlea, multi-channel-array skin-hearing aids have lower operation risk in use, are cheaper and are more easily popularized. PMID:25317171
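The multi-channel band-pass principle described here can be illustrated with a short filter-bank sketch. This is a hedged illustration only: the channel edges, filter orders and envelope smoothing are assumptions, not the patented skin-hearing design, and the returned envelopes merely stand in for the per-electrode current signals.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def channel_envelopes(voice, fs, band_edges):
    """Split a voice signal into band-pass channels and return each
    channel's smoothed envelope, which could drive one skin electrode."""
    smooth = butter(2, 50.0, btype='lowpass', fs=fs, output='sos')   # 50 Hz envelope smoother
    envelopes = []
    for low, high in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [low, high], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, voice)
        env = sosfiltfilt(smooth, np.abs(band))      # rectify, then low-pass
        envelopes.append(env)
    return np.vstack(envelopes)                      # shape: (n_channels, n_samples)

# Example (assumed values): eight channels spanning the main speech range
# edges = np.geomspace(200, 5000, 9); stim = channel_envelopes(signal, 16000, edges)
```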
Cochlear hearing loss in patients with Laron syndrome.
Attias, Joseph; Zarchi, Omer; Nageris, Ben I; Laron, Zvi
2012-02-01
The aim of this prospective clinical study was to test auditory function in patients with Laron syndrome, either untreated or treated with insulin-like growth factor I (IGF-I). The study group consisted of 11 patients with Laron syndrome: 5 untreated adults, 5 children and young adults treated with replacement IGF-I starting at bone age <2 years, and 1 adolescent who started replacement therapy at bone age 4.6 years. The auditory evaluation included pure tone and speech audiometry, tympanometry and acoustic reflexes, otoacoustic emissions, loudness dynamics, auditory brain stem responses and a hyperacusis questionnaire. All untreated patients and the patient who started treatment late had various degrees of sensorineural hearing loss and auditory hypersensitivity; acoustic middle ear reflexes were absent in most of them. All treated children had normal hearing and no auditory hypersensitivity; most had recordable middle ear acoustic reflexes. In conclusion, auditory defects seem to be associated with Laron syndrome and may be prevented by starting treatment with IGF-I at an early developmental age.
Aggressive Bimodal Communication in Domestic Dogs, Canis familiaris.
Déaux, Éloïse C; Clarke, Jennifer A; Charrier, Isabelle
2015-01-01
Evidence of animal multimodal signalling is widespread and compelling. Dogs' aggressive vocalisations (growls and barks) have been extensively studied, but without any consideration of the simultaneously produced visual displays. In this study we aimed to categorize dogs' bimodal aggressive signals according to the redundant/non-redundant classification framework. We presented dogs with unimodal (audio or visual) or bimodal (audio-visual) stimuli and measured their gazing and motor behaviours. Responses did not qualitatively differ between the bimodal and two unimodal contexts, indicating that acoustic and visual signals provide redundant information. We could not further classify the signal as 'equivalent' or 'enhancing' as we found evidence for both subcategories. We discuss our findings in relation to the complex signal framework, and propose several hypotheses for this signal's function.
Mochida, Takemi; Gomi, Hiroaki; Kashino, Makio
2010-11-08
There has been plentiful evidence of kinesthetically induced rapid compensation for unanticipated perturbation in speech articulatory movements. However, the role of auditory information in stabilizing articulation has been little studied except for the control of voice fundamental frequency, voice amplitude and vowel formant frequencies. Although the influence of auditory information on the articulatory control process is evident in unintended speech errors caused by delayed auditory feedback, the direct and immediate effect of auditory alteration on the movements of articulators has not been clarified. This work examined whether temporal changes in the auditory feedback of bilabial plosives immediately affects the subsequent lip movement. We conducted experiments with an auditory feedback alteration system that enabled us to replace or block speech sounds in real time. Participants were asked to produce the syllable /pa/ repeatedly at a constant rate. During the repetition, normal auditory feedback was interrupted, and one of three pre-recorded syllables /pa/, /Φa/, or /pi/, spoken by the same participant, was presented once at a different timing from the anticipated production onset, while no feedback was presented for subsequent repetitions. Comparisons of the labial distance trajectories under altered and normal feedback conditions indicated that the movement quickened during the short period immediately after the alteration onset, when /pa/ was presented 50 ms before the expected timing. Such change was not significant under other feedback conditions we tested. The earlier articulation rapidly induced by the progressive auditory input suggests that a compensatory mechanism helps to maintain a constant speech rate by detecting errors between the internally predicted and actually provided auditory information associated with self movement. The timing- and context-dependent effects of feedback alteration suggest that the sensory error detection works in a temporally asymmetric window where acoustic features of the syllable to be produced may be coded.
Blöte, Anke W; Miers, Anne C; Van den Bos, Esther; Westenberg, P Michiel
2018-05-17
Cognitive behavioural therapy (CBT) has relatively poor outcomes for youth with social anxiety, possibly because broad-based CBT is not tailored to their specific needs. Treatment of social anxiety in youth may need to pay more attention to negative social cognitions that are considered a key factor in social anxiety development and maintenance. The aim of the present study was to learn more about the role of performance quality in adolescents' cognitions about their social performance and, in particular, the moderating role social anxiety plays in the relationship between performance quality and self-cognitions. A community sample of 229 participants, aged 11 to 18 years, gave a speech and filled in questionnaires addressing social anxiety, depression, expected and self-evaluated performance, and post-event rumination. Independent observers rated the quality of the speech. The data were analysed using moderated mediation analysis. Performance quality mediated the link between expected and self-evaluated performance in adolescents with low and medium levels of social anxiety. For adolescents with high levels of social anxiety, only a direct link between expected and self-evaluated performance was found. Their self-evaluation was not related to the quality of their performance. Performance quality also mediated the link between expected performance and rumination, but social anxiety did not moderate this mediation effect. Results suggest that a good performance does not help socially anxious adolescents to replace their negative self-evaluations with more realistic ones. Specific cognitive intervention strategies should be tailored to the needs of socially anxious adolescents who perform well.
Cheating Heisenberg: Achieving certainty in wideband spectrography
NASA Astrophysics Data System (ADS)
Fulop, Sean
2003-10-01
The spectrographic analysis of sound has been with us some 58 years, and one of the key properties of the process is the trade-off in resolution between the time and frequency dimensions in the computed graph. While spectrography has greatly advanced the development of phonetics, the uncertainty principle has always been a source of frustration to phoneticians because so many of the interesting features of speech must be observed by computing Fourier spectra over very short time frames, i.e., using a ``wideband'' spectrogram. Since the uncertainty relation between time and frequency is unbreakable, the only option for improvement is to make a new kind of spectrogram that does not graph time and frequency. An algorithm is described and demonstrated which computes a new kind of spectrogram in which Fourier transform frequency is replaced by the channelized instantaneous frequency, and time is adjusted by the local group delay. The theory behind this procedure was clarified in Nelson [J. Acoust. Soc. Am. 110, 2575-2592 (2001)]. The resulting wideband spectrograms show dramatically improved resolution of speech features, which will be demonstrated with sample figures. It is thus suggested that phoneticians should be more interested in the instantaneous frequency spectrum than in the Fourier transform.
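The reassignment idea described here, replacing each bin's nominal frequency with the channelized instantaneous frequency and shifting its time by the local group delay, can be sketched with finite-difference phase estimates. The following is only an illustrative approximation (window length, hop and names are assumptions), not Nelson's or Fulop's exact algorithm.

```python
import numpy as np

def reassigned_points(x, fs, win_len=128, hop=16):
    """For every STFT cell, estimate the channelized instantaneous frequency
    (time derivative of phase, via a one-sample window shift) and the local
    group delay (frequency derivative of phase), and return reassigned
    (time, frequency, magnitude) values frame by frame."""
    win = np.hanning(win_len)
    frames = []
    for start in range(0, len(x) - win_len - 1, hop):
        seg = x[start:start + win_len] * win
        seg_next = x[start + 1:start + 1 + win_len] * win    # window shifted by one sample
        X = np.fft.rfft(seg)
        X1 = np.fft.rfft(seg_next)
        # Channelized instantaneous frequency (Hz) from the per-sample phase advance
        dphi = np.angle(X1 * np.conj(X))
        inst_freq = dphi * fs / (2 * np.pi)
        # Local group delay (s) from the phase slope across frequency bins
        dphase_dbin = np.gradient(np.unwrap(np.angle(X)))
        group_delay = -dphase_dbin * win_len / (2 * np.pi * fs)
        t_center = (start + win_len / 2) / fs
        frames.append((t_center + group_delay, inst_freq, np.abs(X)))
    return frames   # plot the points, weighted by magnitude, instead of a grid
```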
Aggressive Bimodal Communication in Domestic Dogs, Canis familiaris
Déaux, Éloïse C.; Clarke, Jennifer A.; Charrier, Isabelle
2015-01-01
Evidence of animal multimodal signalling is widespread and compelling. Dogs’ aggressive vocalisations (growls and barks) have been extensively studied, but without any consideration of the simultaneously produced visual displays. In this study we aimed to categorize dogs’ bimodal aggressive signals according to the redundant/non-redundant classification framework. We presented dogs with unimodal (audio or visual) or bimodal (audio-visual) stimuli and measured their gazing and motor behaviours. Responses did not qualitatively differ between the bimodal and two unimodal contexts, indicating that acoustic and visual signals provide redundant information. We could not further classify the signal as ‘equivalent’ or ‘enhancing’ as we found evidence for both subcategories. We discuss our findings in relation to the complex signal framework, and propose several hypotheses for this signal’s function. PMID:26571266
The architecture of the middle ear in the small Indian mongoose (Herpestes javanicus).
Kamali, Y; Gholami, S; Ahrari-Khafi, M S; Rasouli, B; Shayegh, H
The small Indian mongoose (Herpestes javanicus) is native to the Middle East, Iran and much of southern Asia. For this study, the middle ears of 6 adult small Indian mongooses, comprising both fresh and museum specimens, were examined using dissection and plain radiography. On the one hand, in at least some mongoose species, vocalisations and hearing play a critical role in coordinating behaviour. On the other hand, the ear region has provided useful characters relevant to mammalian phylogeny. The aim of the present study is therefore a brief discussion of the various anatomical particularities of the middle ear, based on a combination of existing data and the results of the authors' study in the small Indian mongoose.
Zwart, Mieke C; Baker, Andrew; McGowan, Philip J K; Whittingham, Mark J
2014-01-01
To be able to monitor and protect endangered species, we need accurate information on their numbers and where they live. Survey methods using automated bioacoustic recorders offer significant promise, especially for species whose behaviour or ecology reduces their detectability during traditional surveys, such as the European nightjar. In this study we examined the utility of automated bioacoustic recorders and the associated classification software as a way to survey for wildlife, using the nightjar as an example. We compared traditional human surveys with results obtained from bioacoustic recorders. When we compared these two methods using the recordings made at the same time as the human surveys, we found that recorders were better at detecting nightjars. However, in practice fieldworkers are likely to deploy recorders for extended periods to make best use of them. Our comparison of this practical approach with human surveys revealed that recorders were significantly better at detecting nightjars than human surveyors: recorders detected nightjars during 19 of 22 survey periods, while surveyors detected nightjars on only six of these occasions. In addition, there was no correlation between the amount of vocalisation captured by the acoustic recorders and the abundance of nightjars as recorded by human surveyors. The data obtained from the recorders revealed that nightjars were most active just before dawn and just after dusk, and least active during the middle of the night. As a result, we found that recording at both dusk and dawn or only at dawn would give reasonably high levels of detection while significantly reducing recording time, preserving battery life. Our analyses suggest that automated bioacoustic recorders could increase the detection of other species, particularly those that are known to be difficult to detect using traditional survey methods. The accuracy of detection is especially important when the data are used to inform conservation.
Borrelli, Sara E; Locatelli, Anna; Nespoli, Antonella
2013-08-01
To investigate the incidence of the early pushing urge (EPU) in one maternity unit and explore how it is managed by midwives. The relation to some obstetric outcomes was also observed but not analysed in depth. Prospective observational study. Italian maternity hospital. 60 women (44 nulliparous and 16 multiparous) experiencing EPU during labour. The overall EPU incidence was 7.6%. The incidence recorded by individual midwives varied widely, with an inverse relationship between the number of EPU diagnoses and the midwife's waiting time between the urge to push and vaginal examination. Two care policies were adopted in relation to the phenomenon: the stop-pushing technique (n=52/60) and the 'let the woman do what she feels' technique (n=8/60). When the stop-pushing technique was used, midwives proposed several combined techniques (change of maternal position, blowing breath, vocalisation, use of the bath). EPU diagnosis at less than 8cm of cervical dilatation was associated with more medical interventions. Maternal and neonatal outcomes were within the range of normal physiology. An association between the dilatation at EPU diagnosis and obstetric outcomes was observed, in particular for the mode of childbirth and perineal outcomes. This paper contributes new knowledge to the literature on the EPU phenomenon during labour and the midwifery practices adopted in response to it. Overall, it could be argued that EPU is a physiological variation in labour if maternal and fetal conditions are good. Midwives might suggest techniques to help the woman stay with the pain, such as change of position, blowing breath, vocalisation and use of the bath. However, the influence of the specific setting's policies, guidelines and culture on midwifery practice is a limitation of the study, as the findings may not be representative of other similar maternity units. Thus, larger-scale work should be considered, including different units and settings. The optimal response to the phenomenon should be studied, considering EPU at different dilatation ranges. Future investigations could also focus on qualitative analysis of women's and midwives' personal experiences in relation to the phenomenon. Copyright © 2012 Elsevier Ltd. All rights reserved.
Owners’ Perceptions of Their Animal’s Behavioural Response to the Loss of an Animal Companion
Walker, Jessica K.; Waran, Natalie K.; Phillips, Clive J. C.
2016-01-01
Simple Summary The loss of a companion animal is recognised as being associated with experiences of grief by the owner, but it is unclear how other animals in the household may be affected by such a loss. This paper investigates the behavioural responses of dogs and cats to the loss of an animal companion through owner-reported observations. There was consensus that behaviour changed as a result of loss including increased affectionate behaviour, territorial behaviour, and changes in food consumption and vocalisation. Abstract The loss of a companion animal is recognised as being associated with experiences of grief by the owner, but it is unclear how other animals in the household may be affected by such a loss. Our aim was to investigate companion animals’ behavioural responses to the loss of a companion through owner-report. A questionnaire was distributed via, and advertised within, publications produced by the Royal Society for the Prevention of Cruelty to Animals (RSPCA) across Australia and New Zealand, and through a selection of veterinary clinics within New Zealand. A total of 279 viable surveys were returned pertaining to 159 dogs and 152 cats. The two most common classes of behavioural changes reported for both dogs and cats were affectionate behaviours (74% of dogs and 78% of cats) and territorial behaviours (60% of dogs and 63% of cats). Both dogs and cats were reported to demand more attention from their owners and/or display affiliative behaviour, as well as spend time seeking out the deceased’s favourite spot. Dogs were reported to reduce the volume (35%) and speed (31%) of food consumption and increase the amount of time spent sleeping (34%). Cats were reported to increase the frequency (43%) and volume (32%) of vocalisations following the death of a companion. The median duration of reported behavioural changes in both species was less than 6 months. There was consensus that the behaviour of companion animals changed in response to the loss of an animal companion. These behavioural changes suggest the loss had an impact on the remaining animal. PMID:27827879
Facial correlates of emotional behaviour in the domestic cat (Felis catus).
Bennett, Valerie; Gourkow, Nadine; Mills, Daniel S
2017-08-01
Leyhausen's (1979) work on cat behaviour and facial expressions associated with offensive and defensive behaviour is widely embraced as the standard for interpretation of agonistic behaviour in this species. However, it is a largely anecdotal description that can be easily misunderstood. Recently a facial action coding system has been developed for cats (CatFACS), similar to that used for objectively coding human facial expressions. This study reports on the use of this system to describe the relationship between behaviour and facial expressions of cats in confinement contexts without and with human interaction, in order to generate hypotheses about the relationship between these expressions and underlying emotional state. Video recordings taken of 29 cats resident in a Canadian animal shelter were analysed using 1-0 sampling of 275 4-s video clips. Observations under the two conditions were analysed descriptively using hierarchical cluster analysis for binomial data and indicated that in both situations, about half of the data clustered into three groups. An argument is presented that these largely reflect states based on varying degrees of relaxed engagement, fear and frustration. Facial actions associated with fear included blinking and half-blinking and a left head and gaze bias at lower intensities. Facial actions consistently associated with frustration included hissing, nose-licking, dropping of the jaw, the raising of the upper lip, nose wrinkling, lower lip depression, parting of the lips, mouth stretching, vocalisation and showing of the tongue. Relaxed engagement appeared to be associated with a right gaze and head turn bias. The results also indicate potential qualitative changes associated with differences in intensity in emotional expression following human intervention. The results were also compared to the classic description of "offensive and defensive moods" in cats (Leyhausen, 1979) and previous work by Gourkow et al. (2014a) on behavioural styles in cats in order to assess if these observations had replicable features noted by others. This revealed evidence of convergent validity between the methods. However, the use of CatFACS revealed elements relating to vocalisation and response lateralisation, not previously reported in this literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Indicators used in livestock to assess unconsciousness after stunning: a review.
Verhoeven, M T W; Gerritzen, M A; Hellebrekers, L J; Kemp, B
2015-02-01
Assessing unconsciousness is important to safeguard animal welfare shortly after stunning at the slaughter plant. Indicators that can be visually evaluated are most often used when assessing unconsciousness, as they can be easily applied in slaughter plants. These indicators include reflexes originating from the brain stem (e.g. eye reflexes) or from the spinal cord (e.g. pedal reflex) and behavioural indicators such as loss of posture, vocalisations and rhythmic breathing. When an animal is stunned physically, for example with a captive bolt, the most important indicators are posture, the righting reflex, rhythmic breathing and the corneal or palpebral reflex, all of which should be absent if the animal is unconscious. Spinal reflexes are difficult as a measure of unconsciousness with this type of stunning, as they may occur more vigorously. For stunning methods that do not physically destroy the brain, for example electrical and gas stunning, the most important indicators are posture, the righting reflex, the natural blinking response, rhythmic breathing, vocalisations and focused eye movement, all of which should be absent if the animal is unconscious. Brain stem reflexes such as the corneal reflex are difficult as measures of unconsciousness in electrically stunned animals, as they may reflect residual brain stem activity and not necessarily consciousness. Under commercial conditions, none of the indicators mentioned above should be used as a single indicator to determine unconsciousness after stunning. Multiple indicators should be used to determine unconsciousness and sufficient time should be left for the animal to die following exsanguination before starting invasive dressing procedures such as scalding or skinning. The recording and subsequent assessment of brain activity, as presented in an electroencephalogram (EEG), is considered the most objective way to assess unconsciousness compared with reflexes and behavioural indicators, but is only applied in experimental set-ups. Studies performed in an experimental set-up have often looked at either the EEG or reflexes and behavioural indicators and there is a scarcity of studies that correlate these different readout parameters. It is recommended to study these correlations in more detail to investigate the validity of reflexes and behavioural indicators and to accurately determine the point in time at which the animal loses consciousness.
Prospects and features of robotics in russian crop farming
NASA Astrophysics Data System (ADS)
Dokin, B. D.; Aletdinova, A. A.; Kravchenko, M. S.
2017-01-01
The specific nature of agriculture, together with the limited technical, technological, information and communication, human-resource and managerial capacities of small and medium-sized Russian agricultural producers, explains the slow pace of robotics adoption in crop farming. Existing models make limited use of speech-understanding technologies, modern power supplies, bionic systems and micro-robots. Serial production of agricultural robotics will replace human labor in the future and will help to address hunger, reduce environmental damage and cut the consumption of non-renewable resources. The creation and use of robotics should be based on the developed System of machines and technologies for an optimal machine-tractor fleet.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-15
...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...: This is a summary of the Commission's Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...), Internet Protocol Relay (IP Relay), and IP captioned telephone service (IP CTS) as compensable forms of TRS...
Temmel, Andreas F P; Quint, Christian; Schickinger-Fischer, Bettina; Hummel, Thomas
2005-04-01
The feeling of a dry mouth may affect individual dietary habits, nutritional status, oral hygiene, speech, and gustatory sensitivity. The present study aimed to specifically investigate gustatory function before and after saliva replacement therapy. Whole-mouth gustatory function was assessed in 25 patients suffering from xerostomia (6 male, 19 female; age range 42-82 years) before and after 4 to 6 weeks of saliva replacement therapy using a preparation containing carboxymethylcellulose. The results were compared with those from healthy controls matched for age and sex (6 male, 19 female; age range 42-82 years). Using a whole-mouth test, gustatory function was assessed for sucrose, citric acid, sodium chloride, and caffeine. All subjects detected the four taste qualities at the highest concentration. However, the patients with xerostomia had lower scores in the gustatory test compared with the healthy controls (p < .001). No correlation was found between gustatory scores and the duration or severity of the disorder. Therapy had no effect on measured gustatory function (p = .33); however, saliva replacement led to a significant improvement in other xerostomia-related symptoms (p < .001). This study confirms previous work indicating that xerostomia is accompanied by decreased gustatory sensitivity. Lubricants based on carboxymethylcellulose may have a positive effect on some of the symptoms of xerostomia. However, these "simple" lubricants based on carboxymethylcellulose have little or no effect on whole-mouth gustatory function.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-15
...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services... Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay... (IP Relay) and video relay service (VRS), the Commission should bundle national STS outreach efforts...
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, J.F.; Ng, L.C.
1998-03-17
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
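The per-frame deconvolution step in the abstract above (dividing out an excitation estimate to recover the vocal-tract transfer function) can be illustrated generically. This is a hedged spectral-division sketch under assumed parameters: in the patented system the excitation frame would come from the EM sensor, and the cepstral parameterisation shown here is an assumption, not the patent's feature definition.

```python
import numpy as np

def frame_transfer_function(acoustic_frame, excitation_frame, eps=1e-6):
    """Estimate a per-frame transfer function by regularised spectral
    division (deconvolution): H = Y / E, with a floor on the excitation
    spectrum to keep the division stable. Frames must be equal length."""
    win = np.hanning(len(acoustic_frame))
    Y = np.fft.rfft(acoustic_frame * win)
    E = np.fft.rfft(excitation_frame * win)
    return Y * np.conj(E) / (np.abs(E) ** 2 + eps)   # Wiener-style division

def frame_feature_vector(H, n_coeffs=20):
    """Reduce the transfer function to a compact feature vector using the
    real cepstrum of its magnitude response (an illustrative choice)."""
    log_mag = np.log(np.abs(H) + 1e-12)
    cepstrum = np.fft.irfft(log_mag)
    return cepstrum[:n_coeffs]
```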
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
Holzrichter, John F.; Ng, Lawrence C.
1998-01-01
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn
2018-01-29
To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound disorders. Non-speech oral motor exercise use was most frequently reported in the treatment of dysarthria. Non-speech oral motor exercise use when targeting speech sound disorders is not widely endorsed in the literature.
2014-01-01
Background A system providing disabled persons with control of various assistive devices with the tongue has been developed at Aalborg University in Denmark. The system requires an activation unit attached to the tongue with a small piercing. The aim of this study was to establish and evaluate a safe and tolerable procedure for medical tongue piercing and to evaluate the expected and perceived procedural discomfort. Methods Four tetraplegic subjects volunteered for the study. A surgical protocol for a safe insertion of a tongue barbell piercing was presented using sterilized instruments and piercing parts. Moreover, post-procedural observations of participant complications such as bleeding, edema, and infection were recorded. Finally, procedural discomforts were monitored by VAS scores of pain, changes in taste and speech as well as problems related to hitting the teeth. Results The piercings were all successfully inserted in less than 5 min and the pain level was moderate compared with oral injections. No bleeding, infection, embedding of the piercing, or tooth/gingival injuries were encountered; a moderate edema was found in one case without affecting the speech. In two cases the piercing rod later had to be replaced by a shorter rod, because participants complained that the rod hit their teeth. The replacements prevented further problems. Moreover, loosening of balls was encountered, which could be prevented with the addition of dental glue. No cases of swallowing or aspiration of the piercing parts were recorded. Conclusions The procedure proved simple, fast, and safe for insertion of tongue piercings for tetraplegic subjects in a clinical setting. The procedure represented several precautions in order to avoid risks in these susceptible participants with possible co-morbidity. No serious complications were encountered, and the procedure was found tolerable to the participants. The procedure may be used in future studies with tongue piercings being a prerequisite for similar systems, and this may include insertion in an out-patient setting. PMID:24684776
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzrichter, J.F.; Ng, L.C.
The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Sex differences in the use of delayed semantic context when listening to disrupted speech.
Liederman, Jacqueline; Fisher, Janet McGraw; Coty, Alexis; Matthews, Geetha; Frye, Richard E; Lincoln, Alexis; Alexander, Rebecca
2013-02-01
Female as opposed to male listeners were better able to use a delayed informative cue at the end of a long sentence to report an earlier word which was disrupted by noise. Informative (semantically related) or uninformative (semantically unrelated) word cues were presented 2, 6, or 10 words after a target word whose initial phoneme had been replaced with noise. A total of 84 young adults (45 males) listened to each sentence and then repeated it after its offset. The semantic benefit effect (SBE) was the difference in the accuracy of report of the disrupted target word during informative vs. uninformative sentences. Women had significantly higher SBEs than men even though there were no significant sex differences in terms of number of non-target words reported, the effect of distance between the disrupted target word and the informative cue, or kinds of errors generated. We suggest that the superior ability of women to use delayed semantic information to decode an earlier ambiguous speech signal may be linked to women's tendency to engage the hemispheres more bilaterally than men during word processing. Since the maintenance of semantic context under ambiguous conditions demands more right than left hemispheric resources, this may give women an advantage.
Speech processing using maximum likelihood continuity mapping
Hogden, John E.
2000-01-01
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
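One rough way to picture the continuity-mapping idea is a dynamic-programming search for the smoothest high-likelihood path through a grid of candidate pseudo-articulator positions. The sketch below is only a loose illustration under assumed discretisation and penalty terms; it is not the patented maximum likelihood continuity mapping algorithm, which additionally learns the sound-to-position mapping from speech data alone.

```python
import numpy as np

def smooth_ml_path(log_lik, positions, smoothness=5.0):
    """Given per-frame log-likelihoods over a grid of candidate
    pseudo-articulator positions (shape: n_frames x n_positions), find the
    position sequence maximising likelihood minus a quadratic penalty on
    frame-to-frame jumps, via Viterbi-style dynamic programming."""
    n_frames, n_pos = log_lik.shape
    jump_cost = smoothness * (positions[None, :] - positions[:, None]) ** 2
    score = log_lik[0].copy()
    back = np.zeros((n_frames, n_pos), dtype=int)
    for t in range(1, n_frames):
        total = score[:, None] - jump_cost           # previous position x current position
        back[t] = np.argmax(total, axis=0)           # best predecessor per current position
        score = total[back[t], np.arange(n_pos)] + log_lik[t]
    path = np.zeros(n_frames, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n_frames - 1, 0, -1):             # backtrack the optimal path
        path[t - 1] = back[t, path[t]]
    return positions[path]
```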
Speech processing using maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.E.
Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang
2018-04-04
Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed an altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the decline in speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.
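In generic form, the intra-network functional connectivity measure reported here is a mean pairwise correlation of time series within a region, which is then correlated with behaviour across subjects. The sketch below is a standard illustration of that kind of analysis, not the authors' preprocessing or statistical pipeline; array shapes and names are assumptions.

```python
import numpy as np
from scipy.stats import pearsonr

def intra_network_fc(timeseries):
    """Mean pairwise Pearson correlation among the time series of nodes
    (voxels or sub-regions) belonging to one network/ROI.
    timeseries: array of shape (n_nodes, n_timepoints)."""
    r = np.corrcoef(timeseries)                   # node-by-node correlation matrix
    upper = r[np.triu_indices_from(r, k=1)]       # unique off-diagonal pairs
    return float(np.mean(upper))

def brain_behaviour_correlation(fc_per_subject, performance_per_subject):
    """Across-subject correlation between intra-network FC strength and
    target-speech detection performance; returns (r, p)."""
    return pearsonr(fc_per_subject, performance_per_subject)
```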
The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.
De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet
2016-10-01
This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were administered to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated with an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.
Drijvers, Linda; Özyürek, Asli
2017-01-01
This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Twenty participants watched videos of an actress uttering an action verb and completed a free-recall task. The videos were presented in 3 speech conditions (2-band noise-vocoding, 6-band noise-vocoding, clear), 3 multimodal conditions (speech + lips blurred, speech + visible speech, speech + visible speech + gesture), and 2 visual-only conditions (visible speech, visible speech + gesture). Accuracy levels were higher when both visual articulators were present compared with 1 or none. The enhancement effects of (a) visible speech, (b) gestural information on top of visible speech, and (c) both visible speech and iconic gestures were larger in 6-band than 2-band noise-vocoding or visual-only conditions. Gestural enhancement in 2-band noise-vocoding did not differ from gestural enhancement in visual-only conditions. When perceiving degraded speech in a visual context, listeners benefit more from having both visual articulators present compared with 1. This benefit was larger at 6-band than 2-band noise-vocoding, where listeners can benefit from both phonological cues from visible speech and semantic cues from iconic gestures to disambiguate speech.
Comprehension of synthetic speech and digitized natural speech by adults with aphasia.
Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E
2017-09-01
Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
Non-fluent speech following stroke is caused by impaired efference copy.
Feenaughty, Lynda; Basilakos, Alexandra; Bonilha, Leonardo; den Ouden, Dirk-Bart; Rorden, Chris; Stark, Brielle; Fridriksson, Julius
2017-09-01
Efference copy is a cognitive mechanism argued to be critical for initiating and monitoring speech; however, the extent to which breakdown of efference copy mechanisms impacts speech production is unclear. This study examined the best mechanistic predictors of non-fluent speech among 88 stroke survivors. Objective speech fluency measures were subjected to a principal component analysis (PCA). The primary PCA factor was then entered into a multiple stepwise linear regression analysis as the dependent variable, with a set of independent mechanistic variables. Participants' ability to mimic audio-visual speech ("speech entrainment response") was the best independent predictor of non-fluent speech. We suggest that this "speech entrainment" factor reflects integrity of internal monitoring (i.e., efference copy) of speech production, which affects speech initiation and maintenance. Results support models of normal speech production and suggest that therapy focused on speech initiation and maintenance may improve speech fluency for individuals with chronic non-fluent aphasia post stroke.
Autistic traits and attention to speech: Evidence from typically developing individuals.
Korhonen, Vesa; Werner, Stefan
2017-04-01
Individuals with autism spectrum disorder have a preference for attending to non-speech stimuli over speech stimuli. We are interested in whether non-speech preference is only a feature of diagnosed individuals, and whether we can test implicit preference experimentally. In typically developed individuals, serial recall is disrupted more by speech stimuli than by non-speech stimuli. Since behaviour of individuals with autistic traits resembles that of individuals with autism, we have used serial recall to test whether autistic traits influence task performance during irrelevant speech sounds. The errors made on the serial recall task during speech or non-speech sounds were counted as a measure of speech or non-speech preference in relation to a no-sound condition. We replicated the serial order effect and found speech to be more disruptive than the non-speech sounds, but were unable to find any associations between the autism quotient scores and the non-speech sounds. Our results may indicate a learnt behavioural response to speech sounds.
Electrophysiological evidence for speech-specific audiovisual integration.
Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean
2014-01-01
Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech-mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech-mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode. © 2013 Published by Elsevier Ltd.
Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.
Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A
2017-09-18
Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Howard, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p < .01) and with mean length of utterance produced during a written picture description (r = .96, p < .01). Correlations between inner speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.
Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P
2017-01-01
Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.
Dualibi, Ana Paula Fiuza Funicello; Martins, Ana Maria; Moreira, Gustavo Antônio; de Azevedo, Marisa Frasson; Fujita, Reginaldo Raimundo; Pignatari, Shirley Shizue Nagata
2016-01-01
Mucopolysaccharidosis type I (MPS I) is a lysosomal storage disease caused by deficiency of α-l-iduronidase. The otolaryngological findings include hearing loss, otorrhea, recurrent otitis, hypertrophy of the tonsils and adenoid, recurrent rhinosinusitis, speech disorders, snoring, oral breathing and nasal obstruction. The aim was to evaluate the impact of enzymatic replacement therapy with laronidase (Aldurazyme®) in patients with MPS I, regarding sleep and hearing disorders and clinical manifestations in the upper respiratory tract (URT). Nine patients with MPS I (8 with the Hurler-Scheie and 1 with the Scheie phenotype) of both sexes, aged between 3 and 20 years, were included in this study. Patients were evaluated between seven and 11 months before treatment and between 16 and 22 months after the onset of enzymatic replacement. All underwent clinical and otolaryngological evaluation, including nasofibroscopic, polysomnographic and audiologic exams. The results showed a decrease in the frequency of ear, nose and throat infections, with improvement in rhinorrhea and respiratory quality. No remarkable changes were observed regarding macroglossia or tonsil and adenoid hypertrophy. Audiometric and polysomnographic evaluations did not show statistically significant changes. Enzymatic replacement therapy in patients with mucopolysaccharidosis I provides control of recurrent URT infections and improvement in rhinorrhea and respiratory quality; however, it does not seem to improve audiologic and polysomnographic parameters and has no effect on adenoid and tonsil hypertrophy or macroglossia. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Patterns of call communication between group-housed zebra finches change during the breeding cycle.
Gill, Lisa F; Goymann, Wolfgang; Ter Maat, Andries; Gahr, Manfred
2015-10-06
Vocal signals such as calls play a crucial role for survival and successful reproduction, especially in group-living animals. However, call interactions and call dynamics within groups remain largely unexplored because their relation to relevant contexts or life-history stages could not be studied with individual-level resolution. Using on-bird microphone transmitters, we recorded the vocalisations of individual zebra finches (Taeniopygia guttata) behaving freely in social groups, while females and males previously unknown to each other passed through different stages of the breeding cycle. As birds formed pairs and shifted their reproductive status, their call repertoire composition changed. The recordings revealed that calls occurred non-randomly in fine-tuned vocal interactions and decreased within groups while pair-specific patterns emerged. Call-type combinations of vocal interactions changed within pairs and were associated with successful egg-laying, highlighting a potential fitness relevance of calling dynamics in communication systems.
Clinical Significance of an Unusual Variation
Murugan, M. Senthil; Sudha, R.; Bhargavan, Rajesh
2016-01-01
The infrahyoid muscles are involved in vocalisation and swallowing; among these, the sternothyroid muscle is derived from the common primitive sheet. The improper differentiation of this muscle may therefore result in morphological variations. We report an unusual variation found during the dissection of a 65-year-old male cadaver at the Sri Manakula Vinayagar Medical College, Madagadipet, Pondicherry, India, in 2015. An anomalous belly of the right sternothyroid muscle was observed between the internal jugular (IJ) vein and the internal carotid artery with an additional insertion into the tympanic plate and petrous part of the temporal bone and the presence of a levator glandulae thyroideae muscle. The anomalous muscle may compress the IJ vein if it is related to the neurovascular structures of the neck; hence, knowledge of variations of the infrahyoid muscles can aid in the evaluation of IJ vein compression among patients with idiopathic symptoms resulting from venous congestion. PMID:28003898
Chimpanzee Alarm Call Production Meets Key Criteria for Intentionality
Schel, Anne Marijke; Townsend, Simon W.; Machanda, Zarin; Zuberbühler, Klaus; Slocombe, Katie E.
2013-01-01
Determining the intentionality of primate communication is critical to understanding the evolution of human language. Although intentional signalling has been claimed for some great ape gestural signals, comparable evidence is currently lacking for their vocal signals. We presented wild chimpanzees with a python model and found that two of three alarm call types exhibited characteristics previously used to argue for intentionality in gestural communication. These alarm calls were: (i) socially directed and given to the arrival of friends, (ii) associated with visual monitoring of the audience and gaze alternations, and (iii) goal directed, as calling only stopped when recipients were safe from the predator. Our results demonstrate that certain vocalisations of our closest living relatives qualify as intentional signals, in a directly comparable way to many great ape gestures. We conclude that our results undermine a central argument of gestural theories of language evolution and instead support a multimodal origin of human language. PMID:24146908
Acoustic communication at the water's edge: evolutionary insights from a mudskipper.
Polgar, Gianluca; Malavasi, Stefano; Cipolato, Giacomo; Georgalas, Vyron; Clack, Jennifer A; Torricelli, Patrizia
2011-01-01
Coupled behavioural observations and acoustical recordings of aggressive dyadic contests showed that the mudskipper Periophthalmodon septemradiatus communicates acoustically while out of water. An analysis of intraspecific variability showed that specific acoustic components may act as tags for individual recognition, further supporting the sounds' communicative value. A correlative analysis amongst acoustical properties and video-acoustical recordings in slow-motion supported first hypotheses on the emission mechanism. Acoustic transmission through the wet exposed substrate was also discussed. These observations were used to support an "exaptation hypothesis", i.e. the maintenance of key adaptations during the first stages of water-to-land vertebrate eco-evolutionary transitions (based on eco-evolutionary and palaeontological considerations), through a comparative bioacoustic analysis of aquatic and semiterrestrial gobiid taxa. In fact, a remarkable similarity was found between mudskipper vocalisations and those emitted by gobioids and other soniferous benthonic fishes.
Clinical reasoning in feline epilepsy: Which combination of clinical information is useful?
Stanciu, Gabriela-Dumitrita; Packer, Rowena Mary Anne; Pakozdy, Akos; Solcan, Gheorghe; Volk, Holger Andreas
2017-07-01
We sought to identify the association between clinical risk factors and the diagnosis of idiopathic epilepsy (IE) or structural epilepsy (SE) in cats, using statistical models to identify combinations of discrete parameters from the patient signalment, history and neurological examination findings that could suggest the most likely diagnosis. Data for 138 cats with recurrent seizures were reviewed, of which 110 were valid for inclusion. Seizure aetiology was classified as IE in 57% and SE in 43% of cats. Binomial logistic regression analyses demonstrated that pedigree status, older age at seizure onset (particularly >7 years old), abnormal neurological examinations, and ictal vocalisation were associated with a diagnosis of SE compared to IE, and that ictal salivation was more likely to be associated with a diagnosis of IE than SE. These findings support the importance of considering inter-ictal neurological deficits and seizure history in clinical reasoning. Copyright © 2017 Elsevier Ltd. All rights reserved.
Parental perspectives on the communication abilities of their daughters with Rett syndrome.
Urbanowicz, Anna; Leonard, Helen; Girdler, Sonya; Ciccone, Natalie; Downs, Jenny
2016-01-01
This study describes, from the perspective of parents, how females with Rett syndrome communicate in everyday life and the barriers and facilitators to successful communication. Sixteen interviews were conducted with parents of a daughter with Rett syndrome carrying a pathogenic mutation in the methyl-CpG-binding protein 2 gene. Interviews were recorded and transcribed verbatim. Transcripts were analysed using directed content analysis. All parents reported their daughters were able to express discomfort and pleasure, and make requests and choices, using a variety of modalities including vocalisations, body movements and eye gaze. Parents also reported their daughters understood most of what they said, and that the level of functional abilities, such as mobility, and environmental factors, such as characteristics of the communication partner, influenced successful communication. The perspectives of parents are integral to the assessment of communication abilities and have the potential to inform communication interventions for girls and women with Rett syndrome.
Investigating lexical competition and the cost of phonemic restoration.
Balling, Laura Winther; Morris, David Jackson; Tøndering, John
2017-12-01
Due to phonemic restoration, listeners can reliably perceive words when a phoneme is replaced with noise. The cost associated with this process was investigated along with the effect of lexical uniqueness on phonemic restoration, using data from a lexical decision experiment where noise replaced phonemes that were either uniqueness points (the phoneme at which a word deviates from all nonrelated words that share the same onset) or phonemes immediately prior to these. A baseline condition was also included with no noise-interrupted stimuli. Results showed a significant cost of phonemic restoration, with 100 ms longer word identification times and a 14% decrease in word identification accuracy for interrupted stimuli compared to the baseline. Regression analysis of response times from the interrupted conditions showed no effect of whether the interrupted phoneme was a uniqueness point, but significant effects for several temporal attributes of the stimuli, including the duration and position of the interrupted segment. These results indicate that uniqueness points are not distinct breakpoints in the cohort reduction that occurs during lexical processing, but that temporal properties of the interrupted stimuli are central to auditory word recognition. These results are interpreted in the context of models of speech perception.
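The notion of a uniqueness point used here, the position at which a word diverges from all other words sharing its onset, is easy to make concrete. The sketch below computes it for a toy lexicon of orthographic strings; real studies use phonemic transcriptions and exclude morphologically related words, so this is only an illustration with hypothetical items.

```python
def uniqueness_point(word, lexicon):
    """1-based index of the segment at which `word` deviates from every other
    lexicon entry sharing its onset; len(word) if it never becomes unique."""
    competitors = [w for w in lexicon if w != word]
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if not any(w.startswith(prefix) for w in competitors):
            return i
    return len(word)

# Toy lexicon (hypothetical items, for illustration only).
lexicon = ["kandidat", "kandis", "kano", "kanon"]
for w in lexicon:
    print(w, uniqueness_point(w, lexicon))
```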
NASA Astrophysics Data System (ADS)
Kayasith, Prakasith; Theeramunkong, Thanaruk
It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a given word and distinct speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate for an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluation was done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, based on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were performed on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
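The abstract names correlation coefficient and root-mean-square of difference (alongside rank-order inconsistency) as the criteria for judging Ψ as a predictor of recognition rate. One plausible reading of those criteria is sketched below with hypothetical recognition rates; the numbers and the exact definition of rank-order inconsistency are assumptions, not values from the paper.

```python
import numpy as np

def evaluate_predictor(predicted, observed):
    """Compare predicted and observed recognition rates (in %) for a set of speakers."""
    predicted, observed = np.asarray(predicted, float), np.asarray(observed, float)
    r = np.corrcoef(predicted, observed)[0, 1]              # correlation coefficient
    rms = np.sqrt(np.mean((predicted - observed) ** 2))     # root-mean-square of difference
    # Rank-order inconsistency: speaker pairs ordered differently by the two measures.
    n = len(predicted)
    inconsistent = sum(
        (predicted[i] - predicted[j]) * (observed[i] - observed[j]) < 0
        for i in range(n) for j in range(i + 1, n)
    )
    return r, rms, inconsistent

# Hypothetical recognition rates for eight speakers (illustrative numbers only).
r, rms, inc = evaluate_predictor([92, 85, 70, 55, 88, 40, 66, 75],
                                 [90, 80, 72, 50, 91, 45, 60, 78])
print(f"r = {r:.2f}, RMS difference = {rms:.1f}, inconsistent pairs = {inc}")
```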
Poole, Matthew L; Brodtmann, Amy; Darby, David; Vogel, Adam P
2017-04-14
Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Speech and neuroimaging data described in studies of FTD and PPA were systematically reviewed. A meta-analysis was conducted for speech measures that were used consistently in multiple studies. The methods and nomenclature used to describe speech in these disorders varied between studies. Our meta-analysis identified 3 speech measures that differentiate diagnostic variants from one another or from healthy control-group participants (e.g., the nonfluent and logopenic variants of PPA from all other groups, and behavioral-variant FTD from a control group). Deficits within the frontal-lobe speech networks are linked to motor speech profiles of the nonfluent variant of PPA and progressive apraxia of speech. Motor speech impairment is rarely reported in the semantic and logopenic variants of PPA. Limited data are available on motor speech impairment in the behavioral variant of FTD. Our review identified several measures of speech which may assist with diagnosis and classification, and consolidated the brain-behavior associations relating to speech in FTD, PPA, and progressive apraxia of speech.
Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.
2002-01-01
Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.
Inner Speech and Clarity of Self-Concept in Thought Disorder and Auditory-Verbal Hallucinations.
de Sousa, Paulo; Sellwood, William; Spray, Amy; Fernyhough, Charles; Bentall, Richard P
2016-12-01
Eighty patients and thirty controls were interviewed using one interview that promoted personal disclosure and another about everyday topics. Speech was scored using the Thought, Language and Communication scale (TLC). All participants completed the Self-Concept Clarity Scale (SCCS) and the Varieties of Inner Speech Questionnaire (VISQ). Patients scored lower than comparisons on the SCCS. Low scores were associated with the disorganized dimension of TD. Patients also scored significantly higher on condensed and other people in inner speech, but not on dialogical or evaluative inner speech. The poverty of speech dimension of TD was associated with less dialogical inner speech, other people in inner speech, and less evaluative inner speech. Hallucinations were significantly associated with more other people in inner speech and evaluative inner speech. Clarity of self-concept and qualities of inner speech are differentially associated with dimensions of TD. The findings also support inner speech models of hallucinations.
Greene, Beth G; Logan, John S; Pisoni, David B
1986-03-01
We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.
Neural pathways for visual speech perception
Bernstein, Lynne E.; Liebenthal, Einat
2014-01-01
This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611
The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder
Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.
2010-01-01
In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key Words: apraxia, dyspraxia, motor speech disorder, speech sound disorder PMID:20972615
Speech entrainment enables patients with Broca’s aphasia to produce fluent speech
Hubbard, H. Isabel; Hudspeth, Sarah Grace; Holland, Audrey L.; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris
2012-01-01
A distinguishing feature of Broca’s aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect ‘speech entrainment’ and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca’s aphasia. In Experiment 1, 13 patients with Broca’s aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca’s area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. Behavioural and functional magnetic resonance imaging data were collected before and after the treatment phase. Patients were able to produce a greater variety of words with and without speech entrainment at 1 and 6 weeks after training. Treatment-related decrease in cortical activation associated with speech entrainment was found in areas of the left posterior-inferior parietal lobe. We conclude that speech entrainment allows patients with Broca’s aphasia to double their speech output compared with spontaneous speech. Neuroimaging results suggest that speech entrainment allows patients to produce fluent speech by providing an external gating mechanism that yokes a ventral language network that encodes conceptual aspects of speech. Preliminary results suggest that training with speech entrainment improves speech production in Broca’s aphasia providing a potential therapeutic method for a disorder that has been shown to be particularly resistant to treatment. PMID:23250889
[Speech fluency developmental profile in Brazilian Portuguese speakers].
Martins, Vanessa de Oliveira; Andrade, Claudia Regina Furquim de
2008-01-01
Speech fluency varies from one individual to the next, whether fluent or stuttering, depending on several factors. Studies that investigate the influence of age on fluency patterns have been identified; however, these differences were investigated in isolated age groups. Studies about life-span fluency variations were not found. The aim of this study was to verify the speech fluency developmental profile. Speech samples of 594 fluent participants of both genders, with ages between 2:0 and 99:11 years, all speakers of Brazilian Portuguese, were analyzed. Participants were grouped as follows: preschool children, school-aged children, early adolescents, late adolescents, adults and elderly adults. Speech samples were analyzed according to the Speech Fluency Profile variables and were compared regarding typology of speech disruptions (typical and less typical), speech rate (words and syllables per minute) and frequency of speech disruptions (percentage of speech discontinuity). Although isolated variations were identified, overall there was no significant difference between the age groups for the speech disruption indexes (typical and less typical speech disruptions and percentage of speech discontinuity). Significant differences were observed between the groups when considering speech rate. The development of the neurolinguistic system for speech fluency, in terms of speech disruptions, seems to stabilize during the first years of life, presenting no alterations across the life span. Indexes of speech rate present variations across the age groups, indicating patterns of acquisition, development, stabilization and degeneration.
ERIC Educational Resources Information Center
van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.
2007-01-01
Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…
Detection of target phonemes in spontaneous and read speech.
Mehta, G; Cutler, A
1988-01-01
Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalise to the recognition of spontaneous speech. In the present study listeners were presented with both spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Responses were, overall, equally fast in each speech mode. However, analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than in unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claims from previous work that listeners pay great attention to prosodic information in the process of recognising speech.
Relative Salience of Speech Rhythm and Speech Rate on Perceived Foreign Accent in a Second Language.
Polyanskaya, Leona; Ordin, Mikhail; Busa, Maria Grazia
2017-09-01
We investigated the independent contribution of speech rate and speech rhythm to perceived foreign accent. To address this issue we used a resynthesis technique that allows neutralizing segmental and tonal idiosyncrasies between identical sentences produced by French learners of English at different proficiency levels and maintaining the idiosyncrasies pertaining to prosodic timing patterns. We created stimuli that (1) preserved the idiosyncrasies in speech rhythm while controlling for the differences in speech rate between the utterances; (2) preserved the idiosyncrasies in speech rate while controlling for the differences in speech rhythm between the utterances; and (3) preserved the idiosyncrasies both in speech rate and speech rhythm. All the stimuli were created in intoned (with imposed intonational contour) and flat (with monotonized, constant F0) conditions. The original and the resynthesized sentences were rated by native speakers of English for degree of foreign accent. We found that both speech rate and speech rhythm influence the degree of perceived foreign accent, but the effect of speech rhythm is larger than that of speech rate. We also found that intonation enhances the perception of fine differences in rhythmic patterns but reduces the perceptual salience of fine differences in speech rate.
Review of Visual Speech Perception by Hearing and Hearing-Impaired People: Clinical Implications
ERIC Educational Resources Information Center
Woodhouse, Lynn; Hickson, Louise; Dodd, Barbara
2009-01-01
Background: Speech perception is often considered specific to the auditory modality, despite convincing evidence that speech processing is bimodal. The theoretical and clinical roles of speech-reading for speech perception, however, have received little attention in speech-language therapy. Aims: The role of speech-read information for speech…
Electrophysiological Evidence for a Multisensory Speech-Specific Mode of Perception
ERIC Educational Resources Information Center
Stekelenburg, Jeroen J.; Vroomen, Jean
2012-01-01
We investigated whether the interpretation of auditory stimuli as speech or non-speech affects audiovisual (AV) speech integration at the neural level. Perceptually ambiguous sine-wave replicas (SWS) of natural speech were presented to listeners who were either in "speech mode" or "non-speech mode". At the behavioral level, incongruent lipread…
ERIC Educational Resources Information Center
Tedford, Thomas L., Ed.
This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…
ERIC Educational Resources Information Center
Adank, Patti
2012-01-01
The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…
Methods and apparatus for non-acoustic speech characterization and recognition
Holzrichter, John F.
1999-01-01
By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.
NASA Technical Reports Server (NTRS)
Wolf, Jared J.
1977-01-01
The following research was discussed: (1) speech signal processing; (2) automatic speech recognition; (3) continuous speech understanding; (4) speaker recognition; (5) speech compression; (6) subjective and objective evaluation of speech communication systems; (7) measurement of the intelligibility and quality of speech when degraded by noise or other masking stimuli; (8) speech synthesis; (9) instructional aids for second-language learning and for training of the deaf; and (10) investigation of speech correlates of psychological stress. Experimental psychology, control systems, and human factors engineering, which are often relevant to the proper design and operation of speech systems, are also described.
Knowles, J C; Chalian, V A; Shanks, J C
1984-02-01
Surgery for cancer of the floor of the mouth often results in alteration of the muscles of the tongue and floor of the mouth. Both primary and secondary surgical procedures often result in scar formation with reduced mobility of the tongue during speech and deglutition. Speech is often used as a diagnostic tool in the placement of the anterior teeth during fabrication of a prosthesis. Speech can similarly be used to help determine the proper placement of a speech portion of the prosthesis. The prosthetic rehabilitation approach described lowers the palatal vault with a false palate to enable the tongue to function against it during speech (Fig. 15). Group studies have shown that the design and fabrication of speech prostheses for partial glossectomy patients have significantly improved speech and swallowing for these patients. A speech pathologist is helpful during diagnosis, and speech therapy is necessary for significant speech improvement. Prosthetic rehabilitation alone cannot be expected to improve speech.
Speech Rhythms and Multiplexed Oscillatory Sensory Coding in the Human Brain
Gross, Joachim; Hoogenboom, Nienke; Thut, Gregor; Schyns, Philippe; Panzeri, Stefano; Belin, Pascal; Garrod, Simon
2013-01-01
Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations. PMID:24391472
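Phase entrainment of the kind reported here is commonly quantified with a phase-locking measure between the speech envelope and a band-limited neural signal. The following is a minimal sketch of that idea on synthetic data; the actual study used MEG source signals and more elaborate analyses, so the band limits and demo signals below are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def phase_locking(speech_env, neural, sr, band=(4.0, 8.0)):
    """Phase-locking value between a speech amplitude envelope and a neural signal
    within one frequency band (theta by default): 0 = no locking, 1 = perfect."""
    sos = butter(4, band, btype="band", fs=sr, output="sos")
    phase_a = np.angle(hilbert(sosfiltfilt(sos, speech_env)))
    phase_b = np.angle(hilbert(sosfiltfilt(sos, neural)))
    return np.abs(np.mean(np.exp(1j * (phase_a - phase_b))))

# Synthetic demo: a noisy "neural" signal that partially follows a 5 Hz envelope rhythm.
sr = 200
t = np.arange(0, 20, 1 / sr)
envelope = 1 + np.sin(2 * np.pi * 5 * t)
neural = np.sin(2 * np.pi * 5 * t + 0.4) + 0.5 * np.random.randn(len(t))
print(round(phase_locking(envelope, neural, sr), 2))
```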
Specific acoustic models for spontaneous and dictated style in indonesian speech recognition
NASA Astrophysics Data System (ADS)
Vista, C. B.; Satriawan, C. H.; Lestari, D. P.; Widyantoro, D. H.
2018-03-01
The performance of an automatic speech recognition system is affected by differences in speech style between the data the model is originally trained upon and incoming speech to be recognized. In this paper, the usage of GMM-HMM acoustic models for specific speech styles is investigated. We develop two systems for the experiments; the first employs a speech style classifier to predict the speech style of incoming speech, either spontaneous or dictated, then decodes this speech using an acoustic model specifically trained for that speech style. The second system uses both acoustic models to recognise incoming speech and decides upon a final result by calculating a confidence score of decoding. Results show that training specific acoustic models for spontaneous and dictated speech styles confers a slight recognition advantage as compared to a baseline model trained on a mixture of spontaneous and dictated training data. In addition, the speech style classifier approach of the first system produced slightly more accurate results than the confidence scoring employed in the second system.
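The two decoding strategies compared in this paper can be summarised schematically: either route the audio to a style-specific acoustic model chosen by a classifier, or decode with both models and keep the hypothesis with the higher decoding confidence. The decoder and classifier interfaces below are hypothetical stand-ins, not the authors' toolkit or API.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str
    confidence: float   # e.g. a normalised lattice or posterior score (assumed interface)

def decode_by_style_classifier(audio, style_classifier, models):
    """System 1: predict the speech style, then decode with that style's acoustic model."""
    style = style_classifier.predict(audio)      # "spontaneous" or "dictated" (assumed labels)
    return models[style].decode(audio)

def decode_by_confidence(audio, models):
    """System 2: decode with both style-specific models and keep the more confident result."""
    hypotheses = [model.decode(audio) for model in models.values()]
    return max(hypotheses, key=lambda h: h.confidence)
```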
Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers
Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng
2014-01-01
Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution to building a speech acoustic model of impaired speech is to employ adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality impaired-speech data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor to WER in dysarthric speech at all levels of severity. The results show that speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data. PMID:24466004
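Word error rate, together with the insertion, substitution and deletion counts reported here, is computed from a minimum-edit-distance alignment between the reference and recognised word sequences. A self-contained sketch of that standard computation (the example sentences are illustrative only):

```python
import numpy as np

def wer_counts(reference, hypothesis):
    """Word error rate and insertion/substitution/deletion counts via edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1, j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i, j] = min(sub, d[i - 1, j] + 1, d[i, j - 1] + 1)   # substitution/deletion/insertion
    # Backtrace to attribute each edit to a substitution, deletion, or insertion.
    i, j, subs, dels, ins = len(ref), len(hyp), 0, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i, j] == d[i - 1, j - 1] + (ref[i - 1] != hyp[j - 1]):
            subs += ref[i - 1] != hyp[j - 1]; i, j = i - 1, j - 1
        elif i > 0 and d[i, j] == d[i - 1, j] + 1:
            dels += 1; i -= 1
        else:
            ins += 1; j -= 1
    wer = (subs + dels + ins) / max(len(ref), 1)
    return wer, subs, dels, ins

print(wer_counts("the cat sat on the mat", "the cat sat mat on"))
```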
Speech outcomes in Cantonese patients after glossectomy.
Wong, Ripley Kit; Poon, Esther Sok-Man; Woo, Cynthia Yuen-Man; Chan, Sabina Ching-Shun; Wong, Elsa Siu-Ping; Chu, Ada Wai-Sze
2007-08-01
We sought to determine the major factors affecting speech production of Cantonese-speaking glossectomized patients. Error patterns were analyzed. Forty-one Cantonese-speaking subjects who had undergone glossectomy ≥6 months previously were recruited. Speech production evaluation included (1) phonetic error analysis in nonsense syllables; (2) speech intelligibility in sentences evaluated by naive listeners; (3) overall speech intelligibility in conversation evaluated by experienced speech therapists. Patients receiving adjuvant radiotherapy had significantly poorer segmental and connected speech production. Total or subtotal glossectomy also resulted in poor speech outcomes. Patients having free flap reconstruction showed the best speech outcomes. Patients without lymph node metastasis had significantly better speech scores when compared with patients with lymph node metastasis. Initial consonant production had the worst scores, while vowel production was the least affected. Speech outcomes of Cantonese-speaking glossectomized patients depended on the severity of the disease. Initial consonants had the greatest effect on speech intelligibility.
Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age
Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik
2015-01-01
Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker’s age. Here, we report two experiments on age estimation by “naïve” listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers’ natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60–65 years) speakers in comparison with younger (20–25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40–45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed. PMID:26236259
Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang
2018-05-01
Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
Giving speech a hand: gesture modulates activity in auditory cortex during speech perception.
Hubbard, Amy L; Wilson, Stephen M; Callan, Daniel E; Dapretto, Mirella
2009-03-01
Viewing hand gestures during face-to-face communication affects speech perception and comprehension. Despite the visible role played by gesture in social interactions, relatively little is known about how the brain integrates hand gestures with co-occurring speech. Here we used functional magnetic resonance imaging (fMRI) and an ecologically valid paradigm to investigate how beat gesture-a fundamental type of hand gesture that marks speech prosody-might impact speech perception at the neural level. Subjects underwent fMRI while listening to spontaneously-produced speech accompanied by beat gesture, nonsense hand movement, or a still body; as additional control conditions, subjects also viewed beat gesture, nonsense hand movement, or a still body all presented without speech. Validating behavioral evidence that gesture affects speech perception, bilateral nonprimary auditory cortex showed greater activity when speech was accompanied by beat gesture than when speech was presented alone. Further, the left superior temporal gyrus/sulcus showed stronger activity when speech was accompanied by beat gesture than when speech was accompanied by nonsense hand movement. Finally, the right planum temporale was identified as a putative multisensory integration site for beat gesture and speech (i.e., here activity in response to speech accompanied by beat gesture was greater than the summed responses to speech alone and beat gesture alone), indicating that this area may be pivotally involved in synthesizing the rhythmic aspects of both speech and gesture. Taken together, these findings suggest a common neural substrate for processing speech and gesture, likely reflecting their joint communicative role in social interactions.
ERIC Educational Resources Information Center
Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve
2018-01-01
To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
Multilevel Analysis in Analyzing Speech Data
ERIC Educational Resources Information Center
Guddattu, Vasudeva; Krishna, Y.
2011-01-01
The speech produced by the human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
Speech communications in noise
NASA Technical Reports Server (NTRS)
1984-01-01
The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
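Since the articulation index is mentioned here, a hedged sketch of a textbook-style simplification may help: band signal-to-noise ratios are clipped to a 30 dB range and combined with band-importance weights. The band levels and weights below are illustrative assumptions, not the ANSI band-importance functions.

import numpy as np

def articulation_index(speech_db, noise_db, weights):
    snr = np.asarray(speech_db, dtype=float) - np.asarray(noise_db, dtype=float)
    audibility = np.clip(snr, 0.0, 30.0) / 30.0          # each band contributes 0..1
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w / w.sum() * audibility))

# Illustrative five-band example (made-up levels in dB and importance weights).
speech  = [62, 60, 58, 52, 45]
noise   = [55, 50, 52, 50, 44]
weights = [0.10, 0.25, 0.30, 0.25, 0.10]
print(f"AI ~ {articulation_index(speech, noise, weights):.2f}")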
Schmid, Gabriele; Thielmann, Anke; Ziegler, Wolfram
2009-03-01
Patients with lesions of the left hemisphere often suffer from oral-facial apraxia, apraxia of speech, and aphasia. In these patients, visual features often play a critical role in speech and language therapy, when pictured lip shapes or the therapist's visible mouth movements are used to facilitate speech production and articulation. This demands audiovisual processing both in speech and language treatment and in the diagnosis of oral-facial apraxia. The purpose of this study was to investigate differences in audiovisual perception of speech as compared to non-speech oral gestures. Bimodal and unimodal speech and non-speech items were used and additionally discordant stimuli constructed, which were presented for imitation. This study examined a group of healthy volunteers and a group of patients with lesions of the left hemisphere. Patients made substantially more errors than controls, but the factors influencing imitation accuracy were more or less the same in both groups. Error analyses in both groups suggested different types of representations for speech as compared to the non-speech domain, with speech having a stronger weight on the auditory modality and non-speech processing on the visual modality. Additionally, this study was able to show that the McGurk effect is not limited to speech.
ERIC Educational Resources Information Center
Gray, Christina; Baylor, Carolyn; Eadie, Tanya; Kendall, Diane; Yorkston, Kathryn
2012-01-01
Background: The term "speech usage" refers to what people want or need to do with their speech to fulfil the communication demands in their life roles. Speech-language pathologists (SLPs) need to know about clients' speech usage to plan appropriate interventions to meet their life participation goals. The Levels of Speech Usage is a…
Speech Anxiety: The Importance of Identification in the Basic Speech Course.
ERIC Educational Resources Information Center
Mandeville, Mary Y.
A study investigated speech anxiety in the basic speech course by means of pre and post essays. Subjects, 73 students in 3 classes in the basic speech course at a southwestern multiuniversity, wrote a two-page essay on their perceptions of their speech anxiety before the first speaking project. Students discussed speech anxiety in class and were…
NASA Technical Reports Server (NTRS)
Begault, Durand R.; Bittner, Rachel M.; Anderson, Mark R.
2012-01-01
Auditory communication displays within the NextGen data link system may use multiple synthetic speech messages replacing traditional ATC and company communications. The design of an interface for selecting amongst multiple incoming messages can impact both performance (time to select, audit and release a message) and preference. Two design factors were evaluated: physical pressure-sensitive switches versus flat panel "virtual switches", and the presence or absence of auditory feedback from switch contact. Performance with stimuli using physical switches was 1.2 s faster than virtual switches (2.0 s vs. 3.2 s); auditory feedback provided a 0.54 s performance advantage (2.33 s vs. 2.87 s). There was no interaction between these variables. Preference data were highly correlated with performance.
Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review
NASA Astrophysics Data System (ADS)
Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH
2017-09-01
This paper reviews state-of-the-art automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcript text by matching it against the system's library. This is particularly useful in speech rehabilitation therapy because it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR depends on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as the depth of our knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.
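To make the feedback loop concrete, here is a minimal, hedged sketch: an ASR hypothesis (however it is obtained) is compared with the prompted target and the patient receives immediate feedback. The similarity measure and threshold are assumptions for illustration; no particular ASR engine is implied.

import difflib

def therapy_feedback(hypothesis: str, target: str, threshold: float = 0.8) -> str:
    """Compare an ASR transcript of the patient's attempt with the prompted target."""
    score = difflib.SequenceMatcher(None, hypothesis.lower(), target.lower()).ratio()
    if score >= threshold:
        return f"Good: heard '{hypothesis}', which matches the target '{target}'."
    return f"Try again: heard '{hypothesis}', target was '{target}' (similarity {score:.2f})."

# In a real system the hypothesis would come from the ASR engine's output.
print(therapy_feedback("wadder", "water"))
print(therapy_feedback("water", "water"))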
Obstructive sleep apnea, seizures, and childhood apraxia of speech.
Caspari, Susan S; Strand, Edythe A; Kotagal, Suresh; Bergqvist, Christina
2008-06-01
Associations between obstructive sleep apnea and motor speech disorders in adults have been suggested, though little has been written about possible effects of sleep apnea on speech acquisition in children with motor speech disorders. This report details the medical and speech history of a nonverbal child with seizures and severe apraxia of speech. For 6 years, he made no functional gains in speech production, despite intensive speech therapy. After tonsillectomy for obstructive sleep apnea at age 6 years, he experienced a reduction in seizures and rapid growth in speech production. The findings support a relationship between obstructive sleep apnea and childhood apraxia of speech. The rather late diagnosis and treatment of obstructive sleep apnea, especially in light of what was such a life-altering outcome (gaining functional speech), has significant implications. Most speech sounds develop during ages 2-5 years, which is also the peak time of occurrence of adenotonsillar hypertrophy and childhood obstructive sleep apnea. Hence it is important to establish definitive diagnoses, and to consider early and more aggressive treatments for obstructive sleep apnea, in children with motor speech disorders.
Bernstein, Lynne E.; Jiang, Jintao; Pantazis, Dimitrios; Lu, Zhong-Lin; Joshi, Anand
2011-01-01
The talking face affords multiple types of information. To isolate cortical sites with responsibility for integrating linguistically relevant visual speech cues, speech and non-speech face gestures were presented in natural video and point-light displays during fMRI scanning at 3.0T. Participants with normal hearing viewed the stimuli and also viewed localizers for the fusiform face area (FFA), the lateral occipital complex (LOC), and the visual motion (V5/MT) regions of interest (ROIs). The FFA, the LOC, and V5/MT were significantly less activated for speech relative to non-speech and control stimuli. Distinct activation of the posterior superior temporal sulcus and the adjacent middle temporal gyrus to speech, independent of media, was obtained in group analyses. Individual analyses showed that speech and non-speech stimuli were associated with adjacent but different activations, with the speech activations more anterior. We suggest that the speech activation area is the temporal visual speech area (TVSA), and that it can be localized with the combination of stimuli used in this study. PMID:20853377
Namasivayam, Aravind Kumar; Pukonen, Margit; Goshulak, Debra; Yu, Vickie Y; Kadis, Darren S; Kroll, Robert; Pang, Elizabeth W; De Nil, Luc F
2013-01-01
The current study was undertaken to investigate the impact of speech motor issues on the speech intelligibility of children with moderate to severe speech sound disorders (SSD) within the context of the PROMPT intervention approach. The word-level Children's Speech Intelligibility Measure (CSIM), the sentence-level Beginner's Intelligibility Test (BIT) and tests of speech motor control and articulation proficiency were administered to 12 children (3:11 to 6:7 years) before and after PROMPT therapy. PROMPT treatment was provided for 45 min twice a week for 8 weeks. Twenty-four naïve adult listeners aged 22-46 years judged the intelligibility of the words and sentences. For CSIM, each time a recorded word was played, the listeners were asked to look at a list of 12 words (multiple-choice format) and circle the word they heard; for BIT sentences, the listeners were asked to write down everything they heard. Words correctly circled (CSIM) or transcribed (BIT) were averaged across three naïve judges to calculate percentage speech intelligibility. Speech intelligibility at both the word and sentence level was significantly correlated with speech motor control, but not articulatory proficiency. Further, the severity of speech motor planning and sequencing issues may be a limiting factor in connected-speech intelligibility, which highlights the need to target these issues early and directly in treatment. The reader will be able to: (1) outline the advantages and disadvantages of using word- and sentence-level speech intelligibility tests; (2) describe the impact of speech motor control and articulatory proficiency on speech intelligibility; and (3) describe how speech motor control and speech intelligibility data may provide critical information to aid treatment planning. Copyright © 2013 Elsevier Inc. All rights reserved.
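A minimal sketch of the percentage-intelligibility calculation described above, in which words scored correct are averaged across judges; the judge scores and word totals below are hypothetical.

def percent_intelligibility(correct_per_judge, total_words):
    """correct_per_judge: number of words each judge marked correct."""
    per_judge = [100.0 * c / total_words for c in correct_per_judge]
    return sum(per_judge) / len(per_judge)

# Hypothetical child: three judges scored 30, 34 and 32 of 50 words correct.
print(round(percent_intelligibility([30, 34, 32], 50), 1))   # -> 64.0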
A common functional neural network for overt production of speech and gesture.
Marstaller, L; Burianová, H
2015-01-22
The perception of co-speech gestures, i.e., hand movements that co-occur with speech, has been investigated by several studies. The results show that the perception of co-speech gestures engages a core set of frontal, temporal, and parietal areas. However, no study has yet investigated the neural processes underlying the production of co-speech gestures. Specifically, it remains an open question whether Broca's area is central to the coordination of speech and gestures as has been suggested previously. The objective of this study was to use functional magnetic resonance imaging to (i) investigate the regional activations underlying overt production of speech, gestures, and co-speech gestures, and (ii) examine functional connectivity with Broca's area. We hypothesized that co-speech gesture production would activate frontal, temporal, and parietal regions that are similar to areas previously found during co-speech gesture perception and that both speech and gesture as well as co-speech gesture production would engage a neural network connected to Broca's area. Whole-brain analysis confirmed our hypothesis and showed that co-speech gesturing did engage brain areas that form part of networks known to subserve language and gesture. Functional connectivity analysis further revealed a functional network connected to Broca's area that is common to speech, gesture, and co-speech gesture production. This network consists of brain areas that play essential roles in motor control, suggesting that the coordination of speech and gesture is mediated by a shared motor control network. Our findings thus lend support to the idea that speech can influence co-speech gesture production on a motoric level. Copyright © 2014 IBRO. Published by Elsevier Ltd. All rights reserved.
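A hedged sketch of seed-based functional connectivity of the kind described above: a seed time series (e.g., Broca's area) is correlated with every other region's time series. The data shapes, z-scoring, and Fisher transform are standard practice shown for illustration, not details taken from the paper.

import numpy as np

def seed_connectivity(seed_ts, region_ts):
    """seed_ts: (T,) time series; region_ts: (T, R) array of R regions.
    Returns Fisher z-transformed correlations, shape (R,)."""
    seed = (seed_ts - seed_ts.mean()) / seed_ts.std()
    regs = (region_ts - region_ts.mean(axis=0)) / region_ts.std(axis=0)
    r = (regs * seed[:, None]).mean(axis=0)        # Pearson r per region
    return np.arctanh(np.clip(r, -0.999, 0.999))   # Fisher z

rng = np.random.default_rng(0)
seed = rng.standard_normal(200)
regions = rng.standard_normal((200, 10)) + 0.5 * seed[:, None]   # toy correlated data
print(seed_connectivity(seed, regions).round(2))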
Nixon, C; Anderson, T; Morris, L; McCavitt, A; McKinley, R; Yeager, D; McDaniel, M
1998-11-01
The intelligibility of female and male speech is equivalent under most ordinary living conditions. However, due to small differences between their acoustic speech signals, called speech spectra, one can be more or less intelligible than the other in certain situations such as high levels of noise. Anecdotal information, supported by some empirical observations, suggests that some of the high intensity noise spectra of military aircraft cockpits may degrade the intelligibility of female speech more than that of male speech. In an applied research study, the intelligibility of female and male speech was measured in several high level aircraft cockpit noise conditions experienced in military aviation. In Part I, (Nixon CW, et al. Aviat Space Environ Med 1998; 69:675-83) female speech intelligibility measured in the spectra and levels of aircraft cockpit noises and with noise-canceling microphones was lower than that of the male speech in all conditions. However, the differences were small and only those at some of the highest noise levels were significant. Although speech intelligibility of both genders was acceptable during normal cruise noises, improvements are required in most of the highest levels of noise created during maximum aircraft operating conditions. These results are discussed in a Part I technical report. This Part II report examines the intelligibility in the same aircraft cockpit noises of vocoded female and male speech and the accuracy with which female and male speech in some of the cockpit noises were understood by automatic speech recognition systems. The intelligibility of vocoded female speech was generally the same as that of vocoded male speech. No significant differences were measured between the recognition accuracy of male and female speech by the automatic speech recognition systems. The intelligibility of female and male speech was equivalent for these conditions.
Kim, Heejung; Hahm, Jarang; Lee, Hyekyoung; Kang, Eunjoo; Kang, Hyejin; Lee, Dong Soo
2015-05-01
The human brain naturally integrates audiovisual information to improve speech perception. However, in noisy environments, understanding speech is difficult and may require much effort. Although a brain network is presumed to be engaged in speech perception, it is unclear how speech-related brain regions are connected during natural bimodal audiovisual or unimodal speech perception with counterpart irrelevant noise. To investigate the topological changes of speech-related brain networks at all possible thresholds, we used a persistent homological framework based on hierarchical clustering with single linkage distance to analyze the connected components of the functional network during speech perception using functional magnetic resonance imaging. For speech perception, bimodal (audio-visual speech cue) or unimodal speech cues with counterpart irrelevant noise (auditory white noise or visual gum-chewing) were delivered to 15 subjects. In terms of positive relationships, similar connected components were observed in the bimodal and unimodal speech conditions during filtration. However, during speech perception with congruent audiovisual stimuli, tighter coupling of a left anterior temporal gyrus-anterior insula component and of right premotor-visual components was observed than in the auditory-only or visual-only speech cue conditions, respectively. Interestingly, visual speech under white noise was perceived through tight negative coupling among the left inferior frontal region, right anterior cingulate, left anterior insula, and bilateral visual regions, including right middle temporal gyrus and right fusiform components. In conclusion, the speech brain network is tightly positively or negatively connected, and can reflect efficient or effortful processes during natural audiovisual integration or lip-reading, respectively, in speech perception.
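The single-linkage filtration idea can be illustrated with a small sketch; the toy correlation matrix, the 1 - correlation distance, and the thresholds are assumptions for illustration, not the study's data or pipeline.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

corr = np.array([[1.0, 0.8, 0.2, 0.1],
                 [0.8, 1.0, 0.3, 0.2],
                 [0.2, 0.3, 1.0, 0.7],
                 [0.1, 0.2, 0.7, 1.0]])
dist = 1.0 - corr                                  # small distance = tightly coupled regions
Z = linkage(squareform(dist, checks=False), method='single')

# Connected components (clusters) at a few filtration thresholds.
for t in (0.25, 0.5, 0.9):
    print(t, fcluster(Z, t=t, criterion='distance'))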
Sensorimotor Oscillations Prior to Speech Onset Reflect Altered Motor Networks in Adults Who Stutter
Mersov, Anna-Maria; Jobst, Cecilia; Cheyne, Douglas O.; De Nil, Luc
2016-01-01
Adults who stutter (AWS) have demonstrated atypical coordination of motor and sensory regions during speech production. Yet little is known of the speech-motor network in AWS in the brief time window preceding audible speech onset. The purpose of the current study was to characterize neural oscillations in the speech-motor network during preparation for and execution of overt speech production in AWS using magnetoencephalography (MEG). Twelve AWS and 12 age-matched controls were presented with 220 words, each word embedded in a carrier phrase. Controls were presented with the same word list as their matched AWS participant. Neural oscillatory activity was localized using minimum-variance beamforming during two time periods of interest: speech preparation (prior to speech onset) and speech execution (following speech onset). Compared to controls, AWS showed stronger beta (15–25 Hz) suppression in the speech preparation stage, followed by stronger beta synchronization in the bilateral mouth motor cortex. AWS also recruited the right mouth motor cortex significantly earlier in the speech preparation stage compared to controls. Exaggerated motor preparation is discussed in the context of reduced coordination in the speech-motor network of AWS. It is further proposed that exaggerated beta synchronization may reflect a more strongly inhibited motor system that requires a stronger beta suppression to disengage prior to speech initiation. These novel findings highlight critical differences in the speech-motor network of AWS that occur prior to speech onset and emphasize the need to investigate further the speech-motor assembly in the stuttering population. PMID:27642279
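A hedged sketch of the kind of beta-band (15-25 Hz) power analysis described above: band-pass a sensor or source time series, take the Hilbert envelope, and express power change relative to a baseline window (negative values indicate suppression/desynchronization). The sampling rate, window choices, and synthetic signal are illustrative, not the study's parameters.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def beta_power_change(x, fs, baseline, window, band=(15.0, 25.0)):
    sos = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype='band', output='sos')
    power = np.abs(hilbert(sosfiltfilt(sos, x))) ** 2
    t = np.arange(len(x)) / fs
    base = power[(t >= baseline[0]) & (t < baseline[1])].mean()
    win = power[(t >= window[0]) & (t < window[1])].mean()
    return 100.0 * (win - base) / base             # percent change from baseline

fs = 600.0
t = np.arange(0, 3.0, 1.0 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 20 * t) * (1.0 - 0.4 * (t > 1.5)) + 0.1 * rng.standard_normal(t.size)
print(round(beta_power_change(x, fs, baseline=(0.0, 1.0), window=(2.0, 3.0)), 1))  # roughly -60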
Are Current Insulin Pumps Accessible to Blind and Visually Impaired People?
Burton, Darren M.; Uslan, Mark M.; Blubaugh, Morgan V.; Clements, Charles W.
2009-01-01
Background In 2004, Uslan and colleagues determined that insulin pumps (IPs) on the market were largely inaccessible to blind and visually impaired persons. The objective of this study is to determine if accessibility status changed in the ensuing 4 years. Methods Five IPs on the market in 2008 were acquired and analyzed for key accessibility traits such as speech and other audio output, tactual nature of control buttons, and the quality of visual displays. It was also determined whether or not a blind or visually impaired person could independently complete tasks such as programming the IP for insulin delivery, replacing batteries, and reading manuals and other documentation. Results It was found that IPs have not improved in accessibility since 2004. None have speech output, and with the exception of the Animas IR 2020, no significantly improved visual display characteristics were found. Documentation is still not completely accessible. Conclusion Insulin pumps are relatively complex devices, with serious health consequences resulting from improper use. For IPs to be used safely and independently by blind and visually impaired patients, they must include voice output to communicate all the information presented on their display screens. Enhancing display contrast and the size of the displayed information would also improve accessibility for visually impaired users. The IPs must also come with accessible user documentation in alternate formats. PMID:20144301
Are current insulin pumps accessible to blind and visually impaired people?
Burton, Darren M; Uslan, Mark M; Blubaugh, Morgan V; Clements, Charles W
2009-05-01
In 2004, Uslan and colleagues determined that insulin pumps (IPs) on the market were largely inaccessible to blind and visually impaired persons. The objective of this study is to determine if accessibility status changed in the ensuing 4 years. Five IPs on the market in 2008 were acquired and analyzed for key accessibility traits such as speech and other audio output, tactual nature of control buttons, and the quality of visual displays. It was also determined whether or not a blind or visually impaired person could independently complete tasks such as programming the IP for insulin delivery, replacing batteries, and reading manuals and other documentation. It was found that IPs have not improved in accessibility since 2004. None have speech output, and with the exception of the Animas IR 2020, no significantly improved visual display characteristics were found. Documentation is still not completely accessible. Insulin pumps are relatively complex devices, with serious health consequences resulting from improper use. For IPs to be used safely and independently by blind and visually impaired patients, they must include voice output to communicate all the information presented on their display screens. Enhancing display contrast and the size of the displayed information would also improve accessibility for visually impaired users. The IPs must also come with accessible user documentation in alternate formats. 2009 Diabetes Technology Society.
The Speech multi features fusion perceptual hash algorithm based on tensor decomposition
NASA Astrophysics Data System (ADS)
Huang, Y. B.; Fan, M. H.; Zhang, Q. Y.
2018-03-01
With constant progress in modern speech communication technologies, speech data are prone to being corrupted by noise or maliciously tampered with. In order to give a speech perceptual hash algorithm strong robustness and high efficiency, this paper puts forward a speech perceptual hash algorithm based on tensor decomposition and multiple features. The algorithm analyses the perceptual features of speech by applying a wavelet packet decomposition to each speech component. LPCC, LSP and ISP features of each speech component are extracted to constitute a speech feature tensor. Speech authentication is performed by generating hash values through quantification of the feature matrix against its mid-value. Experimental results show that the proposed algorithm is robust to content-preserving operations compared with similar algorithms, and it is able to resist the attack of common background noise. The algorithm is also computationally efficient, and is able to meet the real-time requirements of speech communication and complete speech authentication quickly.
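A heavily simplified, hedged sketch of the hash-generation flow outlined above (using the PyWavelets package): the signal is decomposed with a wavelet packet, a small feature vector is built (here per-node log energies stand in for the LPCC/LSP/ISP features), and the features are binarized against their median to give hash bits compared by bit error rate. The wavelet, level, and quantization rule are illustrative assumptions, not the paper's algorithm.

import numpy as np
import pywt

def perceptual_hash(x, wavelet='db4', level=3):
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode='symmetric', maxlevel=level)
    nodes = wp.get_level(level, order='natural')
    feats = np.array([np.log1p(np.sum(np.square(node.data))) for node in nodes])
    return (feats > np.median(feats)).astype(np.uint8)   # one bit per node

def bit_error_rate(h1, h2):
    return float(np.mean(h1 != h2))

rng = np.random.default_rng(1)
speech = rng.standard_normal(8000)
noisy = speech + 0.05 * rng.standard_normal(8000)        # content-preserving perturbation
print(bit_error_rate(perceptual_hash(speech), perceptual_hash(noisy)))   # often near 0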
Associations between speech features and phenotypic severity in Treacher Collins syndrome
2014-01-01
Background Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Methods Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5–74 years, median 34 years) divided into three groups comprising children 5–10 years (n = 4), adolescents 11–18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0–6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Results Children and adolescents presented with significantly higher speech composite scores (median 4, range 1–6) than adults (median 1, range 0–5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31–99) than in adults (98%, range 93–100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Conclusions Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services. PMID:24775909
Associations between speech features and phenotypic severity in Treacher Collins syndrome.
Asten, Pamela; Akre, Harriet; Persson, Christina
2014-04-28
Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in children, adolescents and a subgroup of adults with TCS. Only children displayed markedly reduced intelligibility. Speech was significantly correlated with phenotypic severity of TCS and orofacial dysfunction. Follow-up and treatment of speech should still be focused on young patients, but some adults with TCS seem to require continuing speech and language pathology services.
Gauvin, Hanna S; De Baene, Wouter; Brass, Marcel; Hartsuiker, Robert J
2016-02-01
To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated whether internal verbal monitoring takes place through the speech perception system, as proposed by perception-based theories of speech monitoring, or whether mechanisms independent of perception are applied, as proposed by production-based theories of speech monitoring. With the use of fMRI during a tongue twister task we observed that error detection in internal speech during noise-masked overt speech production and error detection in speech perception both recruit the same neural network, which includes pre-supplementary motor area (pre-SMA), dorsal anterior cingulate cortex (dACC), anterior insula (AI), and inferior frontal gyrus (IFG). Although production and perception recruit similar areas, as proposed by perception-based accounts, we did not find activation in superior temporal areas (which are typically associated with speech perception) during internal speech monitoring in speech production as hypothesized by these accounts. On the contrary, results are highly compatible with a domain general approach to speech monitoring, by which internal speech monitoring takes place through detection of conflict between response options, which is subsequently resolved by a domain general executive center (e.g., the ACC). Copyright © 2015 Elsevier Inc. All rights reserved.
A novel radar sensor for the non-contact detection of speech signals.
Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi
2010-01-01
Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects.
A Novel Radar Sensor for the Non-Contact Detection of Speech Signals
Jiao, Mingke; Lu, Guohua; Jing, Xijing; Li, Sheng; Li, Yanfeng; Wang, Jianqi
2010-01-01
Different speech detection sensors have been developed over the years but they are limited by the loss of high frequency speech energy, and have restricted non-contact detection due to the lack of penetrability. This paper proposes a novel millimeter microwave radar sensor to detect speech signals. The utilization of a high operating frequency and a superheterodyne receiver contributes to the high sensitivity of the radar sensor for small sound vibrations. In addition, the penetrability of microwaves allows the novel sensor to detect speech signals through nonmetal barriers. Results show that the novel sensor can detect high frequency speech energies and that the speech quality is comparable to traditional microphone speech. Moreover, the novel sensor can detect speech signals through a nonmetal material of a certain thickness between the sensor and the subject. Thus, the novel speech sensor expands traditional speech detection techniques and provides an exciting alternative for broader application prospects. PMID:22399895
Acoustics in human communication: evolving ideas about the nature of speech.
Cooper, F S
1980-07-01
This paper discusses changes in attitude toward the nature of speech during the past half century. After reviewing early views on the subject, it considers the role of speech spectrograms, speech articulation, speech perception, messages and computers, and the nature of fluent speech.
Music and Speech Perception in Children Using Sung Speech
Nie, Yingjiu; Galvin, John J.; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie
2018-01-01
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners. PMID:29609496
Music and Speech Perception in Children Using Sung Speech.
Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie
2018-01-01
This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
The Comprehension of Rapid Speech by the Blind, Part III.
ERIC Educational Resources Information Center
FOULKE, EMERSON
A review of the research on the comprehension of rapid speech by the blind identifies five methods of speech compression--speech changing, electromechanical sampling, computer sampling, speech synthesis, and frequency dividing with the harmonic compressor. The speech changing and electromechanical sampling methods and the necessary apparatus have…
ERIC Educational Resources Information Center
Lee, Jimin; Hustad, Katherine C.; Weismer, Gary
2014-01-01
Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-04
... Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities; IP Captioned Telephone..., the Commission released Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities, CC Docket No. 98-67, Declaratory Ruling, published at 68 FR...
Extensions to the Speech Disorders Classification System (SDCS)
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.
2010-01-01
This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…
Brouwer, Susanne; Van Engen, Kristin J; Calandruccio, Lauren; Bradlow, Ann R
2012-02-01
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. © 2012 Acoustical Society of America
Brouwer, Susanne; Van Engen, Kristin J.; Calandruccio, Lauren; Bradlow, Ann R.
2012-01-01
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. PMID:22352516
Auditory-Perceptual Learning Improves Speech Motor Adaptation in Children
Shiller, Douglas M.; Rochon, Marie-Lyne
2015-01-01
Auditory feedback plays an important role in children’s speech development by providing the child with information about speech outcomes that is used to learn and fine-tune speech motor plans. The use of auditory feedback in speech motor learning has been extensively studied in adults by examining oral motor responses to manipulations of auditory feedback during speech production. Children are also capable of adapting speech motor patterns to perceived changes in auditory feedback, however it is not known whether their capacity for motor learning is limited by immature auditory-perceptual abilities. Here, the link between speech perceptual ability and the capacity for motor learning was explored in two groups of 5–7-year-old children who underwent a period of auditory perceptual training followed by tests of speech motor adaptation to altered auditory feedback. One group received perceptual training on a speech acoustic property relevant to the motor task while a control group received perceptual training on an irrelevant speech contrast. Learned perceptual improvements led to an enhancement in speech motor adaptation (proportional to the perceptual change) only for the experimental group. The results indicate that children’s ability to perceive relevant speech acoustic properties has a direct influence on their capacity for sensory-based speech motor adaptation. PMID:24842067
Cortical activation patterns correlate with speech understanding after cochlear implantation
Olds, Cristen; Pollonini, Luca; Abaya, Homer; Larky, Jannine; Loy, Megan; Bortfeld, Heather; Beauchamp, Michael S.; Oghalai, John S.
2015-01-01
Objectives Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, we used functional near-infrared spectroscopy (fNIRS) to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design We studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. We used fNIRS to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). We also used environmental sounds as a control stimulus. Behavioral measures consisted of the Speech Reception Threshold, CNC words, and AzBio Sentence tests measured in quiet. Results Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation to normal speech to that of scrambled speech directly correlated with the CNC Words and AzBio Sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced cortical activations in all implanted participants. Conclusions Together, these data indicate that the responses we measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation. PMID:26709749
Measures to Evaluate the Effects of DBS on Speech Production
Weismer, Gary; Yunusova, Yana; Bunton, Kate
2011-01-01
The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066
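A hedged sketch of the F2-slope measure discussed above: fit a straight line to a second-formant track over a transition interval and report the slope in Hz/ms. The formant values below are made up for illustration; in practice the track would come from a formant tracker.

import numpy as np

def f2_slope(times_ms, f2_hz):
    slope, _intercept = np.polyfit(times_ms, f2_hz, 1)
    return slope                                   # Hz per ms

t_ms = np.array([0, 10, 20, 30, 40, 50], dtype=float)
f2 = np.array([1200, 1350, 1500, 1620, 1700, 1750], dtype=float)   # rising transition
print(round(f2_slope(t_ms, f2), 2), "Hz/ms")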
Cooke, Martin; Lu, Youyi
2010-10-01
Talkers change the way they speak in noisy conditions. For energetic maskers, speech production changes are relatively well-understood, but less is known about how informational maskers such as competing speech affect speech production. The current study examines the effect of energetic and informational maskers on speech production by talkers speaking alone or in pairs. Talkers produced speech in quiet and in backgrounds of speech-shaped noise, speech-modulated noise, and competing speech. Relative to quiet, speech output level and fundamental frequency increased and spectral tilt flattened in proportion to the energetic masking capacity of the background. In response to modulated backgrounds, talkers were able to reduce substantially the degree of temporal overlap with the noise, with greater reduction for the competing speech background. Reduction in foreground-background overlap can be expected to lead to a release from both energetic and informational masking for listeners. Passive changes in speech rate, mean pause length or pause distribution cannot explain the overlap reduction, which appears instead to result from a purposeful process of listening while speaking. Talkers appear to monitor the background and exploit upcoming pauses, a strategy which is particularly effective for backgrounds containing intelligible speech.
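A hedged sketch of one way to quantify foreground-background overlap of the kind discussed above: mark frames in which each signal's short-term energy exceeds a threshold and compute the fraction of the talker's active frames that coincide with masker activity. The frame size, relative threshold, and toy gated-noise signals are assumptions for illustration.

import numpy as np

def frame_activity(x, frame=320, rel_thresh_db=-30.0):
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return energy_db > (energy_db.max() + rel_thresh_db)

def overlap_fraction(talker, masker, frame=320):
    a, b = frame_activity(talker, frame), frame_activity(masker, frame)
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    return float(np.sum(a & b) / max(np.sum(a), 1))   # share of talker frames overlapped

rng = np.random.default_rng(2)
frame = 320
gate_t = np.repeat(rng.random(50) > 0.4, frame)        # toy on/off speaking pattern
gate_m = np.repeat(rng.random(50) > 0.4, frame)
talker = rng.standard_normal(50 * frame) * gate_t
masker = rng.standard_normal(50 * frame) * gate_m
print(round(overlap_fraction(talker, masker, frame), 2))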
Speech Comprehension Difficulties in Chronic Tinnitus and Its Relation to Hyperacusis
Vielsmeier, Veronika; Kreuzer, Peter M.; Haubner, Frank; Steffens, Thomas; Semmler, Philipp R. O.; Kleinjung, Tobias; Schlee, Winfried; Langguth, Berthold; Schecklmann, Martin
2016-01-01
Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2) compare subjective reports of speech comprehension difficulties with behavioral measurements in a standardized speech comprehension test and to (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessments (pure tone audiometry, tinnitus pitch, and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments (“How would you rate your ability to understand speech?”; “How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?”). Results: Subjectively-reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both for general and in noisy environment) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated with speech comprehension difficulties in noisy environments, but not with speech comprehension difficulties in general. Conclusion: Speech comprehension deficits are frequent among tinnitus patients. Whereas speech comprehension deficits in quiet environments are primarily due to peripheral hearing loss, speech comprehension deficits in noisy environments are related to both peripheral hearing loss and dysfunctional central auditory processing. Disturbed speech comprehension in noisy environments might be modulated by a central inhibitory deficit. In addition, attentional and cognitive aspects may play a role. PMID:28018209
Speech Comprehension Difficulties in Chronic Tinnitus and Its Relation to Hyperacusis.
Vielsmeier, Veronika; Kreuzer, Peter M; Haubner, Frank; Steffens, Thomas; Semmler, Philipp R O; Kleinjung, Tobias; Schlee, Winfried; Langguth, Berthold; Schecklmann, Martin
2016-01-01
Objective: Many tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2) compare subjective reports of speech comprehension difficulties with behavioral measurements in a standardized speech comprehension test and to (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram), as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessments (pure tone audiometry, tinnitus pitch, and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments ("How would you rate your ability to understand speech?"; "How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?"). Results: Subjectively-reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both for general and in noisy environment) were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated with speech comprehension difficulties in noisy environments, but not with speech comprehension difficulties in general. Conclusion: Speech comprehension deficits are frequent among tinnitus patients. Whereas speech comprehension deficits in quiet environments are primarily due to peripheral hearing loss, speech comprehension deficits in noisy environments are related to both peripheral hearing loss and dysfunctional central auditory processing. Disturbed speech comprehension in noisy environments might be modulated by a central inhibitory deficit. In addition, attentional and cognitive aspects may play a role.
The effect of filtered speech feedback on the frequency of stuttering
NASA Astrophysics Data System (ADS)
Rami, Manish Krishnakant
2000-10-01
This study investigated the effects of filtered components of speech and whispered speech on the frequency of stuttering. It is known that choral speech, shadowing, and altered auditory feedback are the only conditions which induce fluency in people who stutter without any effort beyond that normally required to speak. All these conditions use speech as a second signal. This experiment examined the role of components of the speech signal as delineated by the source-filter theory of speech production. Three filtered speech signals, a whispered speech signal, and a choral speech signal formed the stimuli. It was postulated that if the speech signal as a whole was necessary for producing fluency in people who stutter, then all conditions except choral speech should fail to produce fluency enhancement. If the glottal source alone was adequate in restoring fluency, then only the conditions of NAF and whispered speech should fail to promote fluency. In the event that the full filter characteristics are necessary for the fluency-creating effects, then all conditions except choral speech and whispered speech should fail to produce fluency. If any part of the filter characteristics is sufficient to yield fluency, then only NAF and the approximate glottal source should fail to demonstrate an increase in the amount of fluency. Twelve adults who stuttered read passages while receiving auditory feedback consisting of one of six experimental conditions: (a) NAF; (b) approximate glottal source; (c) glottal source and first formant; (d) glottal source and first two formants; (e) whispered speech; and (f) choral speech. Frequencies of stuttering were obtained for each condition and submitted to descriptive and inferential statistical analysis. Statistically significant differences in means were found among the feedback conditions. Specifically, the choral speech, source and first formant, source and first two formants, and whispered speech conditions all decreased the frequency of stuttering, while the approximate glottal source did not. It is suggested that articulatory events, chiefly the encoded speech output of vocal tract origin, afford effective cues and induce fluent speech in people who stutter.
Su, Qiaotong; Galvin, John J.; Zhang, Guoping; Li, Yongxin
2016-01-01
Cochlear implant (CI) speech performance is typically evaluated using well-enunciated speech produced at a normal rate by a single talker. CI users often have greater difficulty with variations in speech production encountered in everyday listening. Within a single talker, speaking rate, amplitude, duration, and voice pitch information may be quite variable, depending on the production context. The coarse spectral resolution afforded by the CI limits perception of voice pitch, which is an important cue for speech prosody and for tonal languages such as Mandarin Chinese. In this study, sentence recognition from the Mandarin speech perception database was measured in adult and pediatric Mandarin-speaking CI listeners for a variety of speaking styles: voiced speech produced at slow, normal, and fast speaking rates; whispered speech; voiced emotional speech; and voiced shouted speech. Recognition of Mandarin Hearing in Noise Test sentences was also measured. Results showed that performance was significantly poorer with whispered speech relative to the other speaking styles and that performance was significantly better with slow speech than with fast or emotional speech. Results also showed that adult and pediatric performance was significantly poorer with Mandarin Hearing in Noise Test than with Mandarin speech perception sentences at the normal rate. The results suggest that adult and pediatric Mandarin-speaking CI patients are highly susceptible to whispered speech, due to the lack of lexically important voice pitch cues and perhaps other qualities associated with whispered speech. The results also suggest that test materials may contribute to differences in performance observed between adult and pediatric CI users. PMID:27363714
EEG oscillations entrain their phase to high-level features of speech sound.
Zoefel, Benedikt; VanRullen, Rufin
2016-01-01
Phase entrainment of neural oscillations, the brain's adjustment to rhythmic stimulation, is a central component in recent theories of speech comprehension: the alignment between brain oscillations and speech sound improves speech intelligibility. However, phase entrainment to everyday speech sound could also be explained by oscillations passively following the low-level periodicities (e.g., in sound amplitude and spectral content) of auditory stimulation, and not by an adjustment to the speech rhythm per se. Recently, using novel speech/noise mixture stimuli, we have shown that behavioral performance can entrain to speech sound even when high-level features (including phonetic information) are not accompanied by fluctuations in sound amplitude and spectral content. In the present study, we report that neural phase entrainment might underlie our behavioral findings. We observed phase-locking between electroencephalogram (EEG) and speech sound in response not only to original (unprocessed) speech but also to our constructed "high-level" speech/noise mixture stimuli. Phase entrainment to original speech and speech/noise sound did not differ in the degree of entrainment, but rather in the actual phase difference between EEG signal and sound. Phase entrainment was not abolished when speech/noise stimuli were presented in reverse (which disrupts semantic processing), indicating that acoustic (rather than linguistic) high-level features play a major role in the observed neural entrainment. Our results provide further evidence for phase entrainment as a potential mechanism underlying speech processing and segmentation, and for the involvement of high-level processes in the adjustment to the rhythm of speech. Copyright © 2015 Elsevier Inc. All rights reserved.
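A minimal sketch of one standard way to quantify phase entrainment between EEG and the speech envelope, the phase-locking value in the delta/theta band; the frequency band, sampling rate, and synthetic signals are illustrative assumptions, not the authors' pipeline:

```python
# Minimal sketch (not the authors' analysis) of quantifying phase entrainment:
# the phase-locking value (PLV) between a band-passed EEG channel and the
# band-passed speech envelope. Band limits and names are illustrative.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def bandpass(x, fs, lo, hi, order=4):
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def phase_locking_value(eeg, envelope, fs, band=(2.0, 8.0)):
    """PLV between EEG and speech envelope in the delta/theta band."""
    phi_eeg = np.angle(hilbert(bandpass(eeg, fs, *band)))
    phi_env = np.angle(hilbert(bandpass(envelope, fs, *band)))
    return np.abs(np.mean(np.exp(1j * (phi_eeg - phi_env))))

# Hypothetical usage with synthetic, equally sampled signals:
fs = 100.0
t = np.arange(0, 60, 1 / fs)
envelope = 1 + np.sin(2 * np.pi * 4 * t)             # 4 Hz "speech rhythm"
eeg = np.sin(2 * np.pi * 4 * t + 0.5) + 0.5 * np.random.randn(t.size)
print(f"PLV ~ {phase_locking_value(eeg, envelope, fs):.2f}")
```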
Nasal patency and otorhinolaryngologic-orofacial features in children.
Milanesi, Jovana de Moura; Berwig, Luana Cristina; Schuch, Luiz Henrique; Ritzel, Rodrigo Agne; Silva, Ana Maria Toniolo da; Corrêa, Eliane Castilhos Rodrigues
2017-11-21
Nasal obstruction is a common symptom in childhood, related to rhinitis and pharyngeal tonsil hypertrophy. In the presence of nasal obstruction, nasal patency may be reduced, and nasal breathing is replaced by mouth breathing. Orofacial and otorhinolaryngologic changes are related to this breathing mode. Objective evaluation of the upper airways may be obtained through nasal patency measurement. To compare nasal patency and otorhinolaryngologic-orofacial features in children. One hundred and twenty-three children, 6-12 years old and of both sexes, underwent speech therapy evaluation according to the Orofacial Myofunctional Evaluation protocol, clinical and endoscopic otorhinolaryngologic examination, and nasal patency measurement, using the absolute and predicted (%) peak nasal inspiratory flow values. Lower absolute and predicted peak nasal inspiratory flow values were found in children with restless sleep (p=0.006 and p=0.002), reported nasal obstruction (p=0.027 and p=0.023), runny nose (p=0.004 and p=0.012), unsystematic lip closure during mastication (p=0.040 and p=0.026), reduced masticatory speed (p=0.006 and p=0.008) and altered solid food swallowing (p=0.006 and p=0.001). Absolute peak nasal inspiratory flow was lower in children with pale inferior turbinate (p=0.040), reduced hard palate width (p=0.037) and altered speech (p=0.004). Higher absolute values were found in children with increased tongue width (p=0.027), and higher absolute and predicted (%) values in children with mild everted lip (p=0.008 and p=0.000). Nasal patency was lower in children with restless sleep, signs and symptoms of rhinitis, reduced hard palate width, and changes in mastication, deglutition and speech functions. It is also emphasized that most of the children presented signs and symptoms of allergic rhinitis. Copyright © 2017 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
ERIC Educational Resources Information Center
Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike
2014-01-01
Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…
ERIC Educational Resources Information Center
Poole, Matthew L.; Brodtmann, Amy; Darby, David; Vogel, Adam P.
2017-01-01
Purpose: Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Method: Speech and…
Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition
ERIC Educational Resources Information Center
Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.
2018-01-01
Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…
The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech
ERIC Educational Resources Information Center
Wayne, Rachel V.; Johnsrude, Ingrid S.
2012-01-01
Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…
Inner Speech's Relationship with Overt Speech in Poststroke Aphasia
ERIC Educational Resources Information Center
Stark, Brielle C.; Geva, Sharon; Warburton, Elizabeth A.
2017-01-01
Purpose: Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech…
Teacher's Guide to High School Speech.
ERIC Educational Resources Information Center
Jenkinson, Edward B., Ed.
This guide to high school speech focuses on speech as oral composition, stressing the importance of clear thinking and communication. The proposed 1-semester basic course in speech attempts to improve the student's ability to compose and deliver speeches, to think and listen critically, and to understand the social function of speech. In addition…
Peng, Jianxin; Yan, Nanjie; Wang, Dan
2015-01-01
The present study investigated Chinese speech intelligibility in 28 classrooms from nine different elementary schools in Guangzhou, China. The subjective Chinese speech intelligibility in the classrooms was evaluated with children in grades 2, 4, and 6 (7 to 12 years old). Acoustical measurements were also performed in these classrooms. Subjective Chinese speech intelligibility scores and objective speech intelligibility parameters, such as the speech transmission index (STI), were obtained at each listening position for all tests. The relationship between subjective Chinese speech intelligibility scores and STI was revealed and analyzed. The effects of age on Chinese speech intelligibility scores were compared. Results indicate high correlations between subjective Chinese speech intelligibility scores and STI for grades 2, 4, and 6 children. Chinese speech intelligibility scores increase with age under the same STI condition. The differences in scores among different age groups decrease as STI increases. To achieve 95% Chinese speech intelligibility scores, the STIs required for grades 2, 4, and 6 children are 0.75, 0.69, and 0.63, respectively.
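A toy illustration of reading off the STI required for a target intelligibility score from a fitted score-versus-STI curve; the curve below is synthetic, not the study's data:

```python
# Toy illustration (synthetic curve, not the study's data) of how the STI
# required for a target intelligibility score can be read off a fitted
# score-vs-STI relationship by interpolation.
import numpy as np

sti = np.linspace(0.3, 0.9, 13)
# Hypothetical mean scores (%) for one age group, increasing with STI:
scores = np.array([55, 62, 68, 74, 79, 83, 87, 90, 92, 94, 95.5, 96.5, 97.5])

target = 95.0
required_sti = np.interp(target, scores, sti)   # scores must be increasing
print(f"STI needed for {target:.0f}% intelligibility: {required_sti:.2f}")
```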
Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.
2015-01-01
In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient's spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e., labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggests that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749
Affective Properties of Mothers' Speech to Infants With Hearing Impairment and Cochlear Implants
Bergeson, Tonya R.; Xu, Huiping; Kitamura, Christine
2015-01-01
Purpose The affective properties of infant-directed speech influence the attention of infants with normal hearing to speech sounds. This study explored the affective quality of maternal speech to infants with hearing impairment (HI) during the 1st year after cochlear implantation as compared to speech to infants with normal hearing. Method Mothers of infants with HI and mothers of infants with normal hearing matched by age (NH-AM) or hearing experience (NH-EM) were recorded playing with their infants during 3 sessions over a 12-month period. Speech samples of 25 s were low-pass filtered, leaving intonation but not speech information intact. Sixty adults rated the stimuli along 5 scales: positive/negative affect and intention to express affection, to encourage attention, to comfort/soothe, and to direct behavior. Results Low-pass filtered speech to the HI and NH-EM groups was rated as more positive, affective, and comforting compared with such speech to the NH-AM group. Speech to infants with HI and with NH-AM was rated as more directive than speech to the NH-EM group. Mothers decreased affective qualities in speech to all infants but increased directive qualities in speech to infants with NH-EM over time. Conclusions Mothers fine-tune communicative intent in speech to their infant's developmental stage. They adjust affective qualities to infants' hearing experience rather than to chronological age but adjust directive qualities of speech to the chronological age of their infants. PMID:25679195
Freedom of racist speech: Ego and expressive threats.
White, Mark H; Crandall, Christian S
2017-09-01
Do claims of "free speech" provide cover for prejudice? We investigate whether this defense of racist or hate speech serves as a justification for prejudice. In a series of 8 studies (N = 1,624), we found that explicit racial prejudice is a reliable predictor of the "free speech defense" of racist expression. Participants endorsed free speech values for singing racist songs or posting racist comments on social media; people high in prejudice endorsed free speech more than people low in prejudice (meta-analytic r = .43). This endorsement was not principled: high levels of prejudice did not predict endorsement of free speech values when identical speech was directed at coworkers or the police. Participants low in explicit racial prejudice actively avoided endorsing free speech values in racialized conditions compared to nonracial conditions, but participants high in racial prejudice increased their endorsement of free speech values in racialized conditions. Three experiments failed to find evidence that defense of racist speech by the highly prejudiced was based in self-relevant or self-protective motives. Two experiments found evidence that the free speech argument protected participants' own freedom to express their attitudes; the defense of others' racist speech seems motivated more by threats to autonomy than threats to self-regard. These studies serve as an elaboration of the Justification-Suppression Model (Crandall & Eshleman, 2003) of prejudice expression. The justification of racist speech by endorsing fundamental political values can serve to buffer racial and hate speech from normative disapproval. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Speech Entrainment Compensates for Broca's Area Damage
Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris
2015-01-01
Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed that damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443
The role of hearing ability and speech distortion in the facilitation of articulatory motor cortex.
Nuttall, Helen E; Kennedy-Higgins, Daniel; Devlin, Joseph T; Adank, Patti
2017-01-08
Excitability of articulatory motor cortex is facilitated when listening to speech in challenging conditions. Beyond this, however, we have little knowledge of what listener-specific and speech-specific factors engage articulatory facilitation during speech perception. For example, it is unknown whether speech motor activity is independent or dependent on the form of distortion in the speech signal. It is also unknown if speech motor facilitation is moderated by hearing ability. We investigated these questions in two experiments. We applied transcranial magnetic stimulation (TMS) to the lip area of primary motor cortex (M1) in young, normally hearing participants to test if lip M1 is sensitive to the quality (Experiment 1) or quantity (Experiment 2) of distortion in the speech signal, and if lip M1 facilitation relates to the hearing ability of the listener. Experiment 1 found that lip motor evoked potentials (MEPs) were larger during perception of motor-distorted speech that had been produced using a tongue depressor, and during perception of speech presented in background noise, relative to natural speech in quiet. Experiment 2 did not find evidence of motor system facilitation when speech was presented in noise at signal-to-noise ratios where speech intelligibility was at 50% or 75%, which were significantly less severe noise levels than used in Experiment 1. However, there was a significant interaction between noise condition and hearing ability, which indicated that when speech stimuli were correctly classified at 50%, speech motor facilitation was observed in individuals with better hearing, whereas individuals with relatively worse but still normal hearing showed more activation during perception of clear speech. These findings indicate that the motor system may be sensitive to the quantity, but not quality, of degradation in the speech signal. Data support the notion that motor cortex complements auditory cortex during speech perception, and point to a role for the motor cortex in compensating for differences in hearing ability. Copyright © 2016 Elsevier Ltd. All rights reserved.
How visual timing and form information affect speech and non-speech processing.
Kim, Jeesun; Davis, Chris
2014-10-01
Auditory speech processing is facilitated when the talker's face/head movements are seen. This effect is typically explained in terms of visual speech providing form and/or timing information. We determined the effect of both types of information on a speech/non-speech task (non-speech stimuli were spectrally rotated speech). All stimuli were presented paired with the talker's static or moving face. Two types of moving face stimuli were used: full-face versions (both spoken form and timing information available) and modified face versions (only timing information provided by peri-oral motion available). The results showed that the peri-oral timing information facilitated response time for speech and non-speech stimuli compared to a static face. An additional facilitatory effect was found for full-face versions compared to the timing condition; this effect only occurred for speech stimuli. We propose the timing effect was due to cross-modal phase resetting; the form effect to cross-modal priming. Copyright © 2014 Elsevier Inc. All rights reserved.
Burnett, Greg C [Livermore, CA; Holzrichter, John F [Berkeley, CA; Ng, Lawrence C [Danville, CA
2006-08-08
The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.
2004-03-23
The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.
2006-02-14
The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
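The three patent records above describe deriving a voiced excitation function from EM-sensor measurements and using it to obtain accurate vocal-tract transfer functions. A sketch of one conventional way such a transfer function could be estimated from an excitation signal and the acoustic output (cross-spectral division); the synthetic signals and toy filter are illustrative, and this is not the patented method itself:

```python
# Sketch only: a standard cross-spectral estimate of a transfer function,
# H(f) = Pxy / Pxx, between a measured excitation (e.g., EM-sensor-derived)
# and the acoustic speech output. Signals and filter are synthetic stand-ins.
import numpy as np
from scipy.signal import csd, welch, lfilter

fs = 16000
rng = np.random.default_rng(1)
excitation = rng.standard_normal(fs)             # stand-in for the derived source
true_tract = [1.0, -1.6, 0.9]                    # toy resonant "vocal tract" (IIR denominator)
speech = lfilter([1.0], true_tract, excitation)  # simulated acoustic output

f, Pxx = welch(excitation, fs=fs, nperseg=512)
_, Pxy = csd(excitation, speech, fs=fs, nperseg=512)
H = Pxy / Pxx                                    # estimated transfer function

print("peak of |H| near the formant-like resonance at",
      f[np.argmax(np.abs(H))], "Hz")
```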
Children perceive speech onsets by ear and eye
JERGER, SUSAN; DAMIAN, MARKUS F.; TYE-MURRAY, NANCY; ABDI, HERVÉ
2016-01-01
Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see dynamic face articulate consonant/rhyme b/ag; hear non-intact onset/rhyme: −b/ag) vs. auditorily (see still face; hear exactly same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children – like adults – perceive speech onsets multimodally. Findings are critical for incorporating visual speech into developmental theories of speech perception. PMID:26752548
Impairments of speech fluency in Lewy body spectrum disorder.
Ash, Sharon; McMillan, Corey; Gross, Rachel G; Cook, Philip; Gunawardena, Delani; Morgan, Brianna; Boller, Ashley; Siderowf, Andrew; Grossman, Murray
2012-03-01
Few studies have examined connected speech in demented and non-demented patients with Parkinson's disease (PD). We assessed the speech production of 35 patients with Lewy body spectrum disorder (LBSD), including non-demented PD patients, patients with PD dementia (PDD), and patients with dementia with Lewy bodies (DLB), in a semi-structured narrative speech sample in order to characterize impairments of speech fluency and to determine the factors contributing to reduced speech fluency in these patients. Both demented and non-demented PD patients exhibited reduced speech fluency, characterized by reduced overall speech rate and long pauses between sentences. Reduced speech rate in LBSD correlated with measures of between-utterance pauses, executive functioning, and grammatical comprehension. Regression analyses related non-fluent speech, grammatical difficulty, and executive difficulty to atrophy in frontal brain regions. These findings indicate that multiple factors contribute to slowed speech in LBSD, and this is mediated in part by disease in frontal brain regions. Copyright © 2011 Elsevier Inc. All rights reserved.
Speech processing using conditional observable maximum likelihood continuity mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, John; Nix, David
A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.
Walking the talk--speech activates the leg motor cortex.
Liuzzi, Gianpiero; Ellger, Tanja; Flöel, Agnes; Breitenstein, Caterina; Jansen, Andreas; Knecht, Stefan
2008-09-01
Speech may have evolved from earlier modes of communication based on gestures. Consistent with such a motor theory of speech, cortical orofacial and hand motor areas are activated by both speech production and speech perception. However, the extent of speech-related activation of the motor cortex remains unclear. Therefore, we examined if reading and listening to continuous prose also activates non-brachiofacial motor representations like the leg motor cortex. We found corticospinal excitability of bilateral leg muscle representations to be enhanced by speech production and silent reading. Control experiments showed that speech production yielded stronger facilitation of the leg motor system than non-verbal tongue-mouth mobilization, and silent reading more than a visuo-attentional task, thus indicating speech-specificity of the effect. In the frame of the motor theory of speech, this finding suggests that the system of gestural communication, from which speech may have evolved, is not confined to the hand but includes gestural movements of other body parts as well.
Religious Speech in the Military: Freedoms and Limitations
2011-01-01
"…abridging the freedom of speech." Speech is construed broadly and includes both oral and written speech, as well as expressive conduct and displays when … intended to convey a message that is likely to be understood. Religious speech is certainly included. As a bedrock constitutional right, freedom of speech has … to good order and discipline or of a nature to bring discredit upon the armed forces), the First Amendment's freedom of speech will not provide them …
Dog-directed speech: why do we use it and do dogs pay attention to it?
Ben-Aderet, Tobey; Gallego-Abenza, Mario
2017-01-01
Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. PMID:28077769
McKinnon, David H; McLeod, Sharynne; Reilly, Sheena
2007-01-01
The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to schoolchildren with speech disorders. Students with speech disorders were identified from 10,425 students in Australia using a 4-stage process: training in the data collection process, teacher identification, confirmation by a speech-language pathologist, and consultation with district special needs advisors. The prevalence of students with speech disorders was estimated; specifically, 0.33% of students were identified as stuttering, 0.12% as having a voice disorder, and 1.06% as having a speech-sound disorder. There was a higher prevalence of speech disorders in males than in females. As grade level increased, the prevalence of speech disorders decreased. There was no significant difference in the pattern of prevalence across the three speech disorders and four socioeconomic groups; however, students who were identified with a speech disorder were more likely to be in the higher socioeconomic groups. Finally, there was a difference between the perceived and actual level of support that was provided to these students. These prevalence figures are lower than those using initial identification by speech-language pathologists and similar to those using parent report.
Role of contextual cues on the perception of spectrally reduced interrupted speech.
Patro, Chhayakanta; Mendel, Lisa Lucks
2016-08-01
Understanding speech within an auditory scene is constantly challenged by interfering noise in suboptimal listening environments when noise hinders the continuity of the speech stream. In such instances, a typical auditory-cognitive system perceptually integrates available speech information and "fills in" missing information in the light of semantic context. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal-hearing counterparts. This inefficiency in perceptual integration of speech could be attributed to further degradations in the spectral-temporal domain imposed by CIs, making it difficult to utilize contextual evidence effectively. To address these issues, 20 normal-hearing adults listened to speech that was spectrally reduced and spectrally reduced interrupted in a manner similar to CI processing. The Revised Speech Perception in Noise test, which includes contextually rich and contextually poor sentences, was used to evaluate the influence of semantic context on speech perception. Results indicated that listeners benefited more from semantic context when they listened to spectrally reduced speech alone. For the spectrally reduced interrupted speech, contextual information was not as helpful under significant spectral reductions, but became beneficial as the spectral resolution improved. These results suggest that top-down processing facilitates speech perception up to a point, and it fails to facilitate speech understanding when the speech signals are significantly degraded.
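A minimal sketch of the kind of CI-like spectral reduction (noise vocoding) referred to above; the channel count and band edges are illustrative assumptions, not the study's exact processing:

```python
# Minimal noise-vocoder sketch (channel count and band edges are illustrative):
# split speech into bands, extract each band's envelope, and use it to
# modulate band-limited noise.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(speech, fs, n_channels=8, f_lo=100.0, f_hi=7000.0):
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # quasi-logarithmic bands
    noise = np.random.randn(speech.size)
    out = np.zeros_like(speech, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))                     # band envelope
        carrier = sosfiltfilt(sos, noise)               # band-limited noise carrier
        out += env * carrier
    return out / np.max(np.abs(out))                    # simple normalization

# usage: vocoded = vocode(signal, fs, n_channels=8)
```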
Dog-directed speech: why do we use it and do dogs pay attention to it?
Ben-Aderet, Tobey; Gallego-Abenza, Mario; Reby, David; Mathevon, Nicolas
2017-01-11
Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. © 2017 The Author(s).
Crosse, Michael J; Lalor, Edmund C
2014-04-01
Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.
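A minimal sketch of a linear forward mapping from the speech envelope to an EEG channel, estimated as a lagged ridge regression in the spirit of a temporal response function; the lag range, regularization, and synthetic data are illustrative assumptions, not the authors' code:

```python
# Sketch (not the authors' code) of estimating a linear forward mapping from
# the speech envelope to one EEG channel: a lagged (time-delayed) regression
# with ridge regularization.
import numpy as np

def forward_mapping(envelope, eeg, fs, lags_ms=(0, 300), ridge=1.0):
    lags = np.arange(int(lags_ms[0] * fs / 1000), int(lags_ms[1] * fs / 1000) + 1)
    # Design matrix: envelope shifted by each lag (zero-padded).
    X = np.zeros((envelope.size, lags.size))
    for j, lag in enumerate(lags):
        X[lag:, j] = envelope[:envelope.size - lag]
    # Ridge regression: w = (X'X + lambda*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(lags.size), X.T @ eeg)
    return lags / fs * 1000, w      # lags in ms and their weights

# Hypothetical usage: synthetic EEG that "responds" ~100 ms after the envelope.
fs = 64
env = np.random.randn(fs * 120)
eeg = np.roll(env, int(0.1 * fs)) + np.random.randn(env.size)
lags_ms, weights = forward_mapping(env, eeg, fs)
print("strongest lag:", lags_ms[np.argmax(np.abs(weights))], "ms")
```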
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-09
Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities; Americans with Disabilities Act of 1990, CC Docket No. 98-67, CG Docket No. 10-123, Second Report and Order, Order on…
ERIC Educational Resources Information Center
Drijvers, Linda; Ozyurek, Asli
2017-01-01
Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…
ERIC Educational Resources Information Center
Dailey, K. Anne
Time-compressed speech (also called compressed speech, speeded speech, or accelerated speech) is an extension of the normal recording procedure for reproducing the spoken word. Compressed speech can be used to achieve dramatic reductions in listening time without significant loss in comprehension. The implications of such temporal reductions in…
Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder
ERIC Educational Resources Information Center
Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.
2006-01-01
Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…
Poor Speech Perception Is Not a Core Deficit of Childhood Apraxia of Speech: Preliminary Findings
ERIC Educational Resources Information Center
Zuk, Jennifer; Iuzzini-Seigel, Jenya; Cabbage, Kathryn; Green, Jordan R.; Hogan, Tiffany P.
2018-01-01
Purpose: Childhood apraxia of speech (CAS) is hypothesized to arise from deficits in speech motor planning and programming, but the influence of abnormal speech perception in CAS on these processes is debated. This study examined speech perception abilities among children with CAS with and without language impairment compared to those with…
Neural tracking of attended versus ignored speech is differentially affected by hearing loss.
Petersen, Eline Borch; Wöstmann, Malte; Obleser, Jonas; Lunner, Thomas
2017-01-01
Hearing loss manifests as a reduced ability to understand speech, particularly in multitalker situations. In these situations, younger normal-hearing listeners' brains are known to track attended speech through phase-locking of neural activity to the slow-varying envelope of the speech. This study investigates how hearing loss, compensated by hearing aids, affects the neural tracking of the speech-onset envelope in elderly participants with varying degree of hearing loss (n = 27, 62-86 yr; hearing thresholds 11-73 dB hearing level). In an active listening task, a to-be-attended audiobook (signal) was presented either in quiet or against a competing to-be-ignored audiobook (noise) presented at three individualized signal-to-noise ratios (SNRs). The neural tracking of the to-be-attended and to-be-ignored speech was quantified through the cross-correlation of the electroencephalogram (EEG) and the temporal envelope of speech. We primarily investigated the effects of hearing loss and SNR on the neural envelope tracking. First, we found that elderly hearing-impaired listeners' neural responses reliably track the envelope of to-be-attended speech more than to-be-ignored speech. Second, hearing loss relates to the neural tracking of to-be-ignored speech, resulting in a weaker differential neural tracking of to-be-attended vs. to-be-ignored speech in listeners with worse hearing. Third, neural tracking of to-be-attended speech increased with decreasing background noise. Critically, the beneficial effect of reduced noise on neural speech tracking decreased with stronger hearing loss. In sum, our results show that a common sensorineural processing deficit, i.e., hearing loss, interacts with central attention mechanisms and reduces the differential tracking of attended and ignored speech. The present study investigates the effect of hearing loss in older listeners on the neural tracking of competing speech. Interestingly, we observed that whereas internal degradation (hearing loss) relates to the neural tracking of ignored speech, external sound degradation (ratio between attended and ignored speech; signal-to-noise ratio) relates to tracking of attended speech. This provides the first evidence for hearing loss affecting the ability to neurally track speech. Copyright © 2017 the American Physiological Society.
De Jonge-Hoekstra, Lisette; Van der Steen, Steffie; Van Geert, Paul; Cox, Ralf F A
2016-01-01
As children learn, they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. 12 children (M = 6, F = 6) from kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more toward gestures leading speech in time, while the balance leans more toward speech leading gestures for older children. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry in gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable regarding the higher understanding levels. Gestures and speech are more synchronized in time as children get older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and to more asymmetry between gestures and speech, but only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between gestures and speech. The picture that emerges from our analyses suggests that the relation between gestures, speech and cognition is more complex than previously thought. We suggest that temporal differences and asymmetry in influence between gestures and speech arise from simultaneous coordination of synergies.
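A minimal sketch of a cross recurrence plot between two coded time series such as gesture and speech skill levels, from which CRQA measures like the recurrence rate can be derived; the coded values below are invented for illustration:

```python
# Minimal sketch of a cross recurrence plot between two coded time series
# (e.g., skill levels of gestures and of speech). Coding values are illustrative.
import numpy as np

def cross_recurrence(gesture_levels, speech_levels, radius=0):
    """Binary cross recurrence matrix: 1 where the two series match within `radius`."""
    g = np.asarray(gesture_levels)[:, None]
    s = np.asarray(speech_levels)[None, :]
    return (np.abs(g - s) <= radius).astype(int)

# Hypothetical coded skill levels for gestures and speech:
gestures = [2, 2, 3, 3, 4, 4, 4, 5, 5, 6]
speech   = [2, 3, 3, 3, 3, 4, 5, 5, 6, 6]
crp = cross_recurrence(gestures, speech)
recurrence_rate = crp.mean()
print(f"recurrence rate: {recurrence_rate:.2f}")
# Asymmetry above vs. below the main diagonal indicates which modality tends
# to lead the other in time.
```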
Luyten, A; Bettens, K; D'haeseleer, E; Hodges, A; Galiwango, G; Vermeersch, H; Van Lierde, K
2016-01-01
The purpose of the current study was to assess the short-term effectiveness of short and intensive speech therapy provided to patients with cleft (lip and) palate (C(L)P) in terms of articulation and resonance. Five Ugandan patients (age: 7.3-19.6 years) with non-syndromic C(L)P received six hours of individualized speech therapy in three to four days. Speech therapy focused on correct phonetic placement and contrasts between oral and nasal airflow and resonance. Speech evaluations performed before and immediately after speech therapy, including perceptual and instrumental assessment techniques, were compared. Post-therapy, improvement of speech was noted for most of the patients, although to varying degrees. Clinically relevant progress in objective nasalance values and/or articulation was obtained in four patients. Overall, two patients showed normal speech intelligibility, while three patients required additional speech therapy. These preliminary short-term results demonstrate that short and intensive speech therapy can be effective for patients with C(L)P in countries with limited access to speech-language therapy. However, further research is needed on the long-term effectiveness and the advantages of applying this treatment protocol in countries with good access to speech therapy. The reader will be able to (1) list the challenges in resource-poor countries to achieving access to speech-language therapy services, (2) describe when the application of speech therapy is appropriate in patients with C(L)P, (3) describe the speech therapy that can be applied to reduce compensatory articulation and resonance disorders in patients with C(L)P, and (4) list the (possible) advantages of short, intensive speech therapy for both resource-poor and developed countries. Copyright © 2016 Elsevier Inc. All rights reserved.
Davidow, Jason H
2014-01-01
Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control speech rate between conditions limits our ability to determine if the changes were necessary for fluency. This study examined the effect of speech rate on several speech production variables during one-syllable-per-beat metronomic speech in order to determine changes that may be important for fluency during this fluency-inducing condition. Thirteen persons who stutter (PWS), aged 18-62 years, completed a series of speaking tasks. Several speech production variables were compared between conditions produced at different metronome beat rates, and between a control condition and a metronome-paced speech condition produced at a rate equal to the control condition. Vowel duration, voice onset time, pressure rise time and phonated intervals were significantly impacted by metronome beat rate. Voice onset time and the percentage of short (30-100 ms) phonated intervals significantly decreased from the control condition to the equivalent rate metronome-paced speech condition. A reduction in the percentage of short phonated intervals may be important for fluency during syllable-based metronome-paced speech for PWS. Future studies should continue examining the necessity of this reduction. In addition, speech rate must be controlled in future fluency-inducing condition studies, including neuroimaging investigations, in order for this research to make a substantial contribution to finding the fluency-inducing mechanism of fluency-inducing conditions. © 2013 Royal College of Speech and Language Therapists.
Intensive treatment of speech disorders in Robin sequence: a case report.
Pinto, Maria Daniela Borro; Pegoraro-Krook, Maria Inês; Andrade, Laura Katarine Félix de; Correa, Ana Paula Carvalho; Rosa-Lugo, Linda Iris; Dutka, Jeniffer de Cássia Rillo
2017-10-23
To describe the speech of a patient with Pierre Robin Sequence (PRS) and severe speech disorders before and after participating in an Intensive Speech Therapy Program (ISTP). The ISTP consisted of two daily sessions of therapy over a 36-week period, resulting in a total of 360 therapy sessions. The sessions included the phases of establishment, generalization, and maintenance. A combination of strategies, such as modified contrast therapy and speech sound perception training, were used to elicit adequate place of articulation. The ISTP addressed correction of place of production of oral consonants and maximization of movement of the pharyngeal walls with a speech bulb reduction program. Therapy targets were addressed at the phonetic level with a gradual increase in the complexity of the productions hierarchically (e.g., syllables, words, phrases, conversation) while simultaneously addressing the velopharyngeal hypodynamism with speech bulb reductions. Re-evaluation after the ISTP revealed normal speech resonance and articulation with the speech bulb. Nasoendoscopic assessment indicated consistent velopharyngeal closure for all oral sounds with the speech bulb in place. Intensive speech therapy, combined with the use of the speech bulb, yielded positive outcomes in the rehabilitation of a clinical case with severe speech disorders associated with velopharyngeal dysfunction in Pierre Robin Sequence.
Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Green, Jordan R
2017-05-24
The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and speech delay. Participants included 48 children ranging from 4;7 to 17;8 (years;months) with CAS (n = 10), CAS + language impairment (n = 10), speech delay (n = 10), language impairment (n = 9), or typical development (n = 9). Speech inconsistency was assessed at phonemic and token-to-token levels using a variety of stimuli. Children with CAS and CAS + language impairment performed equivalently on all inconsistency assessments. Children with language impairment evidenced high levels of speech inconsistency on the phrase "buy Bobby a puppy." Token-to-token inconsistency of monosyllabic words and the phrase "buy Bobby a puppy" was sensitive and specific in differentiating children with CAS and speech delay, whereas inconsistency calculated on other stimuli (e.g., multisyllabic words) was less efficacious in differentiating between these disorders. Speech inconsistency is a core feature of CAS and is efficacious in differentiating between children with CAS and speech delay; however, sensitivity and specificity are stimuli dependent.
Shriberg, Lawrence D; Strand, Edythe A; Fourakis, Marios; Jakielski, Kathy J; Hall, Sheryl D; Karlsson, Heather B; Mabie, Heather L; McSweeny, Jane L; Tilkens, Christie M; Wilson, David L
2017-04-14
Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. PM and other scores were obtained for 264 participants in 6 groups: CAS in idiopathic, neurogenetic, and complex neurodevelopmental disorders; adult-onset apraxia of speech (AAS) consequent to stroke and primary progressive apraxia of speech; and idiopathic speech delay. Participants with CAS and AAS had significantly lower scores than typically speaking reference participants and speech delay controls on measures posited to assess representational and transcoding processes. Representational deficits differed between CAS and AAS groups, with support for both underspecified linguistic representations and memory/access deficits in CAS, but for only the latter in AAS. CAS-AAS similarities in the age-sex standardized percentages of occurrence of the most frequent type of inappropriate pauses (abrupt) and significant differences in the standardized occurrence of appropriate pauses were consistent with speech processing findings. Results support the hypotheses of core representational and transcoding speech processing deficits in CAS and theoretical coherence of the PM's pause-speech elements with these deficits.
Cortical Integration of Audio-Visual Information
Vander Wyk, Brent C.; Ramsay, Gordon J.; Hudac, Caitlin M.; Jones, Warren; Lin, David; Klin, Ami; Lee, Su Mei; Pelphrey, Kevin A.
2013-01-01
We investigated the neural basis of audio-visual processing in speech and non-speech stimuli. Physically identical auditory stimuli (speech and sinusoidal tones) and visual stimuli (animated circles and ellipses) were used in this fMRI experiment. Relative to unimodal stimuli, each of the multimodal conjunctions showed increased activation in largely non-overlapping areas. The conjunction of Ellipse and Speech, which most resembles naturalistic audiovisual speech, showed higher activation in the right inferior frontal gyrus, fusiform gyri, left posterior superior temporal sulcus, and lateral occipital cortex. The conjunction of Circle and Tone, an arbitrary audio-visual pairing with no speech association, activated middle temporal gyri and lateral occipital cortex. The conjunction of Circle and Speech showed activation in lateral occipital cortex, and the conjunction of Ellipse and Tone did not show increased activation relative to unimodal stimuli. Further analysis revealed that middle temporal regions, although identified as multimodal only in the Circle-Tone condition, were more strongly active to Ellipse-Speech or Circle-Speech, but regions that were identified as multimodal for Ellipse-Speech were always strongest for Ellipse-Speech. Our results suggest that combinations of auditory and visual stimuli may together be processed by different cortical networks, depending on the extent to which speech or non-speech percepts are evoked. PMID:20709442
Jeon, Jin Yong; Hong, Joo Young; Jang, Hyung Suk; Kim, Jae Hyeon
2015-12-01
It is necessary to consider not only annoyance of interior noises but also speech privacy to achieve acoustic comfort in a passenger car of a high-speed train because speech from other passengers can be annoying. This study aimed to explore an optimal acoustic environment to satisfy speech privacy and reduce annoyance in a passenger car. Two experiments were conducted using speech sources and compartment noise of a high speed train with varying speech-to-noise ratios (SNRA) and background noise levels (BNL). Speech intelligibility was tested in experiment I, and in experiment II, perceived speech privacy, annoyance, and acoustic comfort of combined sounds with speech and background noise were assessed. The results show that speech privacy and annoyance were significantly influenced by the SNRA. In particular, the acoustic comfort was evaluated as acceptable when the SNRA was less than -6 dB for both speech privacy and noise annoyance. In addition, annoyance increased significantly as the BNL exceeded 63 dBA, whereas the effect of the background-noise level on the speech privacy was not significant. These findings suggest that an optimal level of interior noise in a passenger car might exist between 59 and 63 dBA, taking normal speech levels into account.
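A toy decision rule that simply restates the thresholds reported above (SNRA below -6 dB, interior noise roughly 59-63 dBA); the function itself is an illustration, not part of the study:

```python
# Toy rule distilled from the reported findings; thresholds restate the
# abstract's numbers, the function is illustrative only.
def acoustically_comfortable(snra_db: float, bnl_dba: float) -> bool:
    """Acceptable comfort if the speech-to-noise ratio is below -6 dB (speech
    well masked) and the background noise level stays within ~59-63 dBA."""
    return snra_db < -6.0 and 59.0 <= bnl_dba <= 63.0

print(acoustically_comfortable(-8.0, 61.0))   # True
print(acoustically_comfortable(-3.0, 61.0))   # False: speech too audible
```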
Using on-line altered auditory feedback treating Parkinsonian speech
NASA Astrophysics Data System (ADS)
Wang, Emily; Verhagen, Leo; de Vries, Meinou H.
2005-09-01
Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered auditory feedback (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured monologue and reading) under three conditions: baseline, with SE without altered feedback, and with SE providing altered feedback. The speech samples were randomly presented and rated for speech intelligibility using UPDRS-III item 18, and speaking rate was measured. The results indicated that SpeechEasy is well tolerated and that AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted [Work partially supported by Janus Dev. Group, Inc.].
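An offline sketch of the two manipulations combined in such altered auditory feedback, a fixed delay plus a crude pitch/frequency shift implemented by resampling; the delay and shift values are illustrative, and this is not the SpeechEasy algorithm:

```python
# Offline sketch (not the SpeechEasy device or its algorithm) of altered
# auditory feedback: a fixed delay (DAF) plus a crude frequency/pitch shift
# (FAF) implemented here by resampling. Values are illustrative.
import numpy as np
from scipy.signal import resample

def altered_feedback(speech, fs, delay_ms=60.0, shift_ratio=1.06):
    delayed = np.concatenate([np.zeros(int(fs * delay_ms / 1000)), speech])
    # Resampling changes pitch and duration together; real FAF devices shift
    # frequency without changing timing.
    shifted = resample(delayed, int(delayed.size / shift_ratio))
    return shifted

# usage: feedback = altered_feedback(signal, fs, delay_ms=60, shift_ratio=1.06)
```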
An acoustic comparison of two women's infant- and adult-directed speech
NASA Astrophysics Data System (ADS)
Andruski, Jean; Katz-Gershon, Shiri
2003-04-01
In addition to having prosodic characteristics that are attractive to infant listeners, infant-directed (ID) speech shares certain characteristics of adult-directed (AD) clear speech, such as increased acoustic distance between vowels, that might be expected to make ID speech easier for adults to perceive in noise than AD conversational speech. However, perceptual tests of two women's ID productions by Andruski and Bessega [J. Acoust. Soc. Am. 112, 2355] showed that this is not always the case. In a word identification task that compared ID speech with AD clear and conversational speech, one speaker's ID productions were less well identified than AD clear speech, but better identified than AD conversational speech. For the second woman, ID speech was the least accurately identified of the three speech registers. For both speakers, hard words (infrequent words with many lexical neighbors) were also at an increased disadvantage relative to easy words (frequent words with few lexical neighbors) in speech registers that were less accurately perceived. This study will compare several acoustic properties of these women's productions, including pitch and formant-frequency characteristics. Results of the acoustic analyses will be examined together with the original perceptual results to suggest reasons for differences in listeners' accuracy in identifying these two women's ID speech in noise.
Objective speech quality evaluation of real-time speech coders
NASA Astrophysics Data System (ADS)
Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.
1984-02-01
This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape recordings of the input speech and the corresponding output speech of a real-time speech coder.
Shin, Yu-Jeong; Ko, Seung-O
2015-12-01
Velopharyngeal dysfunction in cleft palate patients following primary palate repair may result in nasal air emission, hypernasality, articulation disorder, and poor speech intelligibility. Among conservative treatment methods, a speech aid prosthesis combined with speech therapy is widely used. However, because treatment often takes more than a year and predictability is low, some clinicians prefer a surgical intervention. Thus, the purpose of this report was to draw attention to the effectiveness of speech aid prostheses by introducing a case that was successfully treated. In this clinical report, a speech bulb reduction program with intensive speech therapy was applied for a patient with velopharyngeal dysfunction, and treatment was completed within 5 months, an unusually short period for speech aid therapy. Furthermore, the advantages of preoperative speech aid therapy are discussed.
Sources of Variability in Children’s Language Growth
Huttenlocher, Janellen; Waterfall, Heidi; Vasilyeva, Marina; Vevea, Jack; Hedges, Larry V.
2010-01-01
The present longitudinal study examines the role of caregiver speech in language development, especially syntactic development, using 47 parent-child pairs of diverse SES background from 14 to 46 months. We assess the diversity (variety) of words and syntactic structures produced by caregivers and children. We use lagged correlations to examine language growth and its relation to caregiver speech. Results show substantial individual differences among children, and indicate that diversity of earlier caregiver speech significantly predicts corresponding diversity in later child speech. For vocabulary, earlier child speech also predicts later caregiver speech, suggesting mutual influence. However, for syntax, earlier child speech does not significantly predict later caregiver speech, suggesting a causal flow from caregiver to child. Finally, demographic factors, notably SES, are related to language growth, and are, at least partially, mediated by differences in caregiver speech, showing the pervasive influence of caregiver speech on language growth. PMID:20832781
Rate and rhythm control strategies for apraxia of speech in nonfluent primary progressive aphasia.
Beber, Bárbara Costa; Berbert, Monalise Costa Batista; Grawer, Ruth Siqueira; Cardoso, Maria Cristina de Almeida Freitas
2018-01-01
The nonfluent/agrammatic variant of primary progressive aphasia is characterized by apraxia of speech and agrammatism. Apraxia of speech limits patients' communication due to slow speaking rate, sound substitutions, articulatory groping, false starts and restarts, segmentation of syllables, and increased difficulty with increasing utterance length. Speech and language therapy is known to benefit individuals with apraxia of speech due to stroke, but little is known about its effects in primary progressive aphasia. This is a case report of a 72-year-old, illiterate housewife, who was diagnosed with nonfluent primary progressive aphasia and received speech and language therapy for apraxia of speech. Rate and rhythm control strategies for apraxia of speech were trained to improve initiation of speech. We discuss the importance of these strategies to alleviate apraxia of speech in this condition and the future perspectives in the area.
Multistage audiovisual integration of speech: dissociating identification and detection.
Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias S
2011-02-01
Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech signal. Here, we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers, the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multistage account of audiovisual integration of speech in which the many attributes of the audiovisual speech signal are integrated by separate integration processes.
Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds
Agnew, Z.K.; McGettigan, C.; Scott, S.K.
2012-01-01
Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557
How our own speech rate influences our perception of others.
Bosker, Hans Rutger
2017-08-01
In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects induced by our own speech through 6 experiments, specifically targeting rate normalization (i.e., perceiving phonetic segments relative to surrounding speech rate). Experiment 1 revealed that hearing prerecorded fast or slow context sentences altered the perception of ambiguous vowels, replicating earlier work. Experiment 2 demonstrated that talking at a fast or slow rate prior to target presentation also altered target perception, though the effect of preceding speech rate was reduced. Experiment 3 showed that silent talking (i.e., inner speech) at fast or slow rates did not modulate the perception of others, suggesting that the effect of self-produced speech rate in Experiment 2 arose through monitoring of the external speech signal. Experiment 4 demonstrated that, when participants were played back their own (fast/slow) speech, no reduction of the effect of preceding speech rate was observed, suggesting that the additional task of speech production may be responsible for the reduced effect in Experiment 2. Finally, Experiments 5 and 6 replicate Experiments 2 and 3 with new participant samples. Taken together, these results suggest that variation in speech production may induce variation in speech perception, thus carrying implications for our understanding of spoken communication in dialogue settings. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
ERIC Educational Resources Information Center
Davidow, Jason H.
2014-01-01
Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control…
Functional speech disorders: clinical manifestations, diagnosis, and management.
Duffy, J R
2016-01-01
Acquired psychogenic or functional speech disorders are a subtype of functional neurologic disorders. They can mimic organic speech disorders and, although any aspect of speech production can be affected, they manifest most often as dysphonia, stuttering, or prosodic abnormalities. This chapter reviews the prevalence of functional speech disorders, the spectrum of their primary clinical characteristics, and the clues that help distinguish them from organic neurologic diseases affecting the sensorimotor networks involved in speech production. Diagnosis of a speech disorder as functional can be supported by sometimes rapidly achieved positive outcomes of symptomatic speech therapy. The general principles of such therapy are reviewed. © 2016 Elsevier B.V. All rights reserved.
A model of serial order problems in fluent, stuttered and agrammatic speech.
Howell, Peter
2007-10-01
Many models of speech production have attempted to explain dysfluent speech. Most models assume that the disruptions that occur when speech is dysfluent arise because the speakers make errors while planning an utterance. In this contribution, a model of the serial order of speech is described that does not make this assumption. It involves the coordination or 'interlocking' of linguistic planning and execution stages at the language-speech interface. The model is examined to determine whether it can distinguish two forms of dysfluent speech (stuttered and agrammatic speech) that are characterized by iteration and omission of whole words and parts of words.
Speech Intelligibility Predicted from Neural Entrainment of the Speech Envelope.
Vanthornhout, Jonas; Decruy, Lien; Wouters, Jan; Simon, Jonathan Z; Francart, Tom
2018-04-01
Speech intelligibility is currently measured by scoring how well a person can identify a speech signal. The results of such behavioral measures reflect neural processing of the speech signal, but are also influenced by language processing, motivation, and memory. Very often, electrophysiological measures of hearing give insight in the neural processing of sound. However, in most methods, non-speech stimuli are used, making it hard to relate the results to behavioral measures of speech intelligibility. The use of natural running speech as a stimulus in electrophysiological measures of hearing is a paradigm shift which allows to bridge the gap between behavioral and electrophysiological measures. Here, by decoding the speech envelope from the electroencephalogram, and correlating it with the stimulus envelope, we demonstrate an electrophysiological measure of neural processing of running speech. We show that behaviorally measured speech intelligibility is strongly correlated with our electrophysiological measure. Our results pave the way towards an objective and automatic way of assessing neural processing of speech presented through auditory prostheses, reducing confounds such as attention and cognitive capabilities. We anticipate that our electrophysiological measure will allow better differential diagnosis of the auditory system, and will allow the development of closed-loop auditory prostheses that automatically adapt to individual users.
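The abstract does not spell out the decoder, but a common choice in this literature is a linear, ridge-regularized backward model that reconstructs the speech envelope from time-lagged EEG and is scored by Pearson correlation with the stimulus envelope. The sketch below assumes that approach; the lag range, regularization strength, and function names are illustrative, and in practice the model is fitted and evaluated on separate data (cross-validation).

```python
import numpy as np
from scipy.signal import hilbert

def speech_envelope(audio: np.ndarray) -> np.ndarray:
    """Broadband temporal envelope as the magnitude of the analytic signal."""
    return np.abs(hilbert(audio))

def lagged_design(eeg: np.ndarray, n_lags: int) -> np.ndarray:
    """Stack time-lagged copies of each EEG channel: (samples, channels * lags)."""
    cols = [np.roll(eeg[:, ch], lag)
            for lag in range(n_lags) for ch in range(eeg.shape[1])]
    return np.column_stack(cols)

def envelope_reconstruction_score(eeg: np.ndarray, envelope: np.ndarray,
                                  n_lags: int = 32, lam: float = 1e3) -> float:
    """Fit a ridge-regularized backward model EEG -> envelope and return the
    correlation between the reconstructed and actual envelopes.
    (For a real analysis, fit on training data and score on held-out data.)"""
    X = lagged_design(eeg, n_lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
    return float(np.corrcoef(X @ w, envelope)[0, 1])
```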
Abrams, Daniel A; Nicol, Trent; White-Schwoch, Travis; Zecker, Steven; Kraus, Nina
2017-05-01
Speech perception relies on a listener's ability to simultaneously resolve multiple temporal features in the speech signal. Little is known regarding neural mechanisms that enable the simultaneous coding of concurrent temporal features in speech. Here we show that two categories of temporal features in speech, the low-frequency speech envelope and periodicity cues, are processed by distinct neural mechanisms within the same population of cortical neurons. We measured population activity in primary auditory cortex of anesthetized guinea pig in response to three variants of a naturally produced sentence. Results show that the envelope of population responses closely tracks the speech envelope, and this cortical activity more closely reflects wider bandwidths of the speech envelope compared to narrow bands. Additionally, neuronal populations represent the fundamental frequency of speech robustly with phase-locked responses. Importantly, these two temporal features of speech are simultaneously observed within neuronal ensembles in auditory cortex in response to clear, conversation, and compressed speech exemplars. Results show that auditory cortical neurons are adept at simultaneously resolving multiple temporal features in extended speech sentences using discrete coding mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
Cognitive Load in Voice Therapy Carry-Over Exercises.
Iwarsson, Jenny; Morris, David Jackson; Balling, Laura Winther
2017-01-01
The cognitive load generated by online speech production may vary with the nature of the speech task. This article examines 3 speech tasks used in voice therapy carry-over exercises, in which a patient is required to adopt and automatize new voice behaviors, ultimately in daily spontaneous communication. Twelve subjects produced speech in 3 conditions: rote speech (weekdays), sentences in a set form, and semispontaneous speech. Subjects simultaneously performed a secondary visual discrimination task for which response times were measured. On completion of each speech task, subjects rated their experience on a questionnaire. Response times from the secondary, visual task were found to be shortest for the rote speech, longer for the semispontaneous speech, and longest for the sentences within the set framework. Principal components derived from the subjective ratings were found to be linked to response times on the secondary visual task. Acoustic measures reflecting fundamental frequency distribution and vocal fold compression varied across the speech tasks. The results indicate that consideration should be given to the selection of speech tasks during the process leading to automation of revised speech behavior and that self-reports may be a reliable index of cognitive load.
Speech identification in noise: Contribution of temporal, spectral, and visual speech cues.
Kim, Jeesun; Davis, Chris; Groot, Christopher
2009-12-01
This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual speech cues. Due to a ceiling effect, the degree to which these effects combined could not be determined. In noise, these facilitation effects were more marked and were additive. Experiment 2 examined consonant and vowel identification in the context of CVC or VCV syllables presented in noise. For consonants, both spectral and visual speech cues facilitated identification and these effects were additive. For vowels, the effect of combined cues was underadditive, with the effect of spectral cues reduced when presented with visual speech cues. Analysis indicated that without visual speech, spectral cues facilitated the transmission of place information and vowel height, whereas with visual speech, they facilitated lip rounding, with little impact on the transmission of place information.
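Cochlear implant simulations of the kind described above are usually produced with a noise vocoder: the speech is split into analysis bands, each band's amplitude envelope is extracted, and the envelopes modulate band-limited noise. The sketch below assumes that standard construction; the band edges, filter orders, and envelope cutoff are illustrative and not taken from the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def noise_vocode(audio: np.ndarray, fs: int,
                 band_edges=(100, 400, 1000, 2400, 6000),
                 env_cutoff: float = 50.0) -> np.ndarray:
    """Simple noise vocoder. With a single broad band the output carries mainly
    amplitude-envelope cues; adding bands restores coarse spectral cues."""
    rng = np.random.default_rng(0)
    out = np.zeros(len(audio), dtype=float)
    b_env, a_env = butter(2, env_cutoff / (fs / 2), btype="low")
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        band = filtfilt(b, a, audio)
        env = filtfilt(b_env, a_env, np.abs(band))          # rectify and smooth
        carrier = filtfilt(b, a, rng.standard_normal(len(audio)))
        out += np.clip(env, 0.0, None) * carrier
    return out
```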
Motor speech skills in children with Down syndrome: A descriptive study.
Rupela, Vani; Velleman, Shelley L; Andrianopoulos, Mary V
2016-10-01
Motor speech characteristics of children with Down syndrome (DS) have historically been viewed as either Childhood Dysarthria (CD) or, more infrequently, as Childhood Apraxia of Speech (CAS). The objective of this study was to investigate motor speech deficits in a systematic manner, considering characteristics from both CAS and CD. Motor speech assessments were carried out on seven 3;4-8;11-year old children with DS in comparison with younger, typically-developing children using a Language-Neutral Assessment of Motor Speech for young children (LAMS). Additionally, the motor speech and non-speech oral motor skills of all participants were analysed qualitatively using an investigator checklist of characteristics of CAS, CD and Motor Speech Disorder-Not Otherwise Specified (MSD-NOS). Results indicated that the children with DS exhibited symptoms of CAS, CD and MSD-NOS, with variability within the group and overlapping symptoms of the disorder types. This finding is different from previous assumptions that children with DS have either CD or CAS. The motor speech disorder accompanying DS is complex. The data provide some preliminary descriptions of motor speech disorders in this population and some tools that clinicians would find useful when assessing motor speech skills of young children with DS.
Davidow, Jason H; Ingham, Roger J
2013-01-01
This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also examined. An added purpose was to determine if chorus reading could be further refined so as to provide a perceptual guide for gauging the level of physical effort exerted during speech production. A repeated-measures design was used to compare data obtained during control reading conditions and during several chorus reading conditions produced at different speech rates. Participants included 8 persons who stutter (PWS) between the ages of 16 and 32 years. There were significant reductions in the frequency of short PIs from the habitual reading condition during slower chorus conditions, no change when speech rates were matched between habitual reading and chorus conditions, and an increase in the frequency of short PIs during chorus reading produced at a faster rate than the habitual condition. Speech rate did not have an effect on stuttering frequency during chorus reading. In general, speech effort ratings improved and naturalness ratings worsened as speech rate decreased. These results provide evidence that (a) a reduction in the frequency of short PIs is not necessary for fluency improvement during chorus reading, and (b) speech rate may be altered to provide PWS with a more appropriate reference for how physically effortful normally fluent speech production should be. Future investigations should examine the necessity of changes in the activation of neural regions during chorus reading, the possibility of defining individualized units on a 9-point effort scale, and if there are upper and lower speech rate boundaries for receiving ratings of "highly natural sounding" speech during chorus reading. The reader will be able to: (1) describe the effect of changes in speech rate on the frequency of short phonated intervals during chorus reading, (2) describe changes to speaker-judged speech effort as speech rate changes during chorus reading, (3) and describe the effect of changes in speech rate on listener-judged naturalness ratings during chorus reading. Copyright © 2012 Elsevier Inc. All rights reserved.
Discrimination of speech and non-speech sounds following theta-burst stimulation of the motor cortex
Rogers, Jack C.; Möttönen, Riikka; Boyles, Rowan; Watkins, Kate E.
2014-01-01
Perceiving speech engages parts of the motor system involved in speech production. The role of the motor cortex in speech perception has been demonstrated using low-frequency repetitive transcranial magnetic stimulation (rTMS) to suppress motor excitability in the lip representation and disrupt discrimination of lip-articulated speech sounds (Möttönen and Watkins, 2009). Another form of rTMS, continuous theta-burst stimulation (cTBS), can produce longer-lasting disruptive effects following a brief train of stimulation. We investigated the effects of cTBS on motor excitability and discrimination of speech and non-speech sounds. cTBS was applied for 40 s over either the hand or the lip representation of motor cortex. Motor-evoked potentials recorded from the lip and hand muscles in response to single pulses of TMS revealed no measurable change in motor excitability due to cTBS. This failure to replicate previous findings may reflect the unreliability of measurements of motor excitability related to inter-individual variability. We also measured the effects of cTBS on a listener’s ability to discriminate: (1) lip-articulated speech sounds from sounds not articulated by the lips (“ba” vs. “da”); (2) two speech sounds not articulated by the lips (“ga” vs. “da”); and (3) non-speech sounds produced by the hands (“claps” vs. “clicks”). Discrimination of lip-articulated speech sounds was impaired between 20 and 35 min after cTBS over the lip motor representation. Specifically, discrimination of across-category ba–da sounds presented with an 800-ms inter-stimulus interval was reduced to chance level performance. This effect was absent for speech sounds that do not require the lips for articulation and non-speech sounds. Stimulation over the hand motor representation did not affect discrimination of speech or non-speech sounds. These findings show that stimulation of the lip motor representation disrupts discrimination of speech sounds in an articulatory feature-specific way. PMID:25076928
A Generative Model of Speech Production in Broca’s and Wernicke’s Areas
Price, Cathy J.; Crinion, Jenny T.; MacSweeney, Mairéad
2011-01-01
Speech production involves the generation of an auditory signal from the articulators and vocal tract. When the intended auditory signal does not match the produced sounds, subsequent articulatory commands can be adjusted to reduce the difference between the intended and produced sounds. This requires an internal model of the intended speech output that can be compared to the produced speech. The aim of this functional imaging study was to identify brain activation related to the internal model of speech production after activation related to vocalization, auditory feedback, and movement in the articulators had been controlled. There were four conditions: silent articulation of speech, non-speech mouth movements, finger tapping, and visual fixation. In the speech conditions, participants produced the mouth movements associated with the words “one” and “three.” We eliminated auditory feedback from the spoken output by instructing participants to articulate these words without producing any sound. The non-speech mouth movement conditions involved lip pursing and tongue protrusions to control for movement in the articulators. The main difference between our speech and non-speech mouth movement conditions is that prior experience producing speech sounds leads to the automatic and covert generation of auditory and phonological associations that may play a role in predicting auditory feedback. We found that, relative to non-speech mouth movements, silent speech activated Broca’s area in the left dorsal pars opercularis and Wernicke’s area in the left posterior superior temporal sulcus. We discuss these results in the context of a generative model of speech production and propose that Broca’s and Wernicke’s areas may be involved in predicting the speech output that follows articulation. These predictions could provide a mechanism by which rapid movement of the articulators is precisely matched to the intended speech outputs during future articulations. PMID:21954392
Method and apparatus for obtaining complete speech signals for speech recognition applications
NASA Technical Reports Server (NTRS)
Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)
2009-01-01
The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
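The core of the method is a circular (ring) buffer that records continuously, so that audio preceding the user's start command can still be handed to the recognizer. The sketch below is only a structural illustration of that buffer; the frame size, capacity, class name, and the omitted HMM-based endpoint detection are not taken from the patent text.

```python
import numpy as np

class CircularFrameBuffer:
    """Continuously stores fixed-size audio frames, overwriting the oldest
    frames once capacity is reached."""

    def __init__(self, n_frames: int, frame_len: int):
        self.buf = np.zeros((n_frames, frame_len), dtype=np.float32)
        self.n_frames = n_frames
        self.count = 0                      # total frames written so far

    def write(self, frame: np.ndarray) -> None:
        self.buf[self.count % self.n_frames] = frame
        self.count += 1

    def last(self, k: int) -> np.ndarray:
        """Return the most recent k frames in chronological order, e.g. the
        audio captured just before the user's start-recognition command."""
        k = min(k, self.count, self.n_frames)
        idx = [(self.count - k + i) % self.n_frames for i in range(k)]
        return self.buf[idx].reshape(-1)

# When the start command arrives, prepend e.g. 50 frames of 10 ms (~0.5 s) of
# prior audio to the newly captured audio before endpointing and recognition.
```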
Automatic Speech Recognition from Neural Signals: A Focused Review.
Herff, Christian; Schultz, Tanja
2016-01-01
Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the need not to disturb bystanders, or an inability to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable to not speak but to simply envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques used with neural signals, we discuss the Brain-to-text system.
Stekelenburg, Jeroen J; Keetels, Mirjam; Vroomen, Jean
2018-05-01
Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect. Here, we examined whether written text, like visual speech, can induce an illusory change in the perception of speech sounds on both the behavioural and neural levels. In a sound categorization task, we found that both text and visual speech changed the identity of speech sounds from an /aba/-/ada/ continuum, but the size of this audiovisual effect was considerably smaller for text than visual speech. To examine at which level in the information processing hierarchy these multisensory interactions occur, we recorded electroencephalography in an audiovisual mismatch negativity (MMN, a component of the event-related potential reflecting preattentive auditory change detection) paradigm in which deviant text or visual speech was used to induce an illusory change in a sequence of ambiguous sounds halfway between /aba/ and /ada/. We found that only deviant visual speech induced an MMN, but not deviant text, which induced a late P3-like positive potential. These results demonstrate that text has much weaker effects on sound processing than visual speech does, possibly because text has different biological roots than visual speech. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Careers in Speech Communication.
ERIC Educational Resources Information Center
Speech Communication Association, New York, NY.
Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…
SAM: speech-aware applications in medicine to support structured data entry.
Wormek, A. K.; Ingenerf, J.; Orthner, H. F.
1997-01-01
In the last two years, improvement in speech recognition technology has directed the medical community's interest to porting and using such innovations in clinical systems. The acceptance of speech recognition systems in clinical domains increases with recognition speed, large medical vocabulary, high accuracy, continuous speech recognition, and speaker independence. Although some commercial speech engines approach these requirements, the greatest benefit can be achieved in adapting a speech recognizer to a specific medical application. The goals of our work are first, to develop a speech-aware core component which is able to establish connections to speech recognition engines of different vendors. This is realized in SAM. Second, with applications based on SAM we want to support the physician in his/her routine clinical care activities. Within the STAMP project (STAndardized Multimedia report generator in Pathology), we extend SAM by combining a structured data entry approach with speech recognition technology. Another speech-aware application in the field of Diabetes care is connected to a terminology server. The server delivers a controlled vocabulary which can be used for speech recognition. PMID:9357730
Calandruccio, Lauren; Zhou, Haibo
2014-01-01
Purpose: To examine whether improved speech recognition during linguistically mismatched target–masker experiments is due to linguistic unfamiliarity of the masker speech or linguistic dissimilarity between the target and masker speech. Method: Monolingual English speakers (n = 20) and English–Greek simultaneous bilinguals (n = 20) listened to English sentences in the presence of competing English and Greek speech. Data were analyzed using mixed-effects regression models to determine differences in English recognition performance between the 2 groups and 2 masker conditions. Results: English sentence recognition for monolinguals and simultaneous English–Greek bilinguals improved when the masker speech changed from competing English to competing Greek speech. Conclusion: The improvement in speech recognition that has been observed for linguistically mismatched target–masker experiments cannot be simply explained by the masker language being linguistically unknown or unfamiliar to the listeners. Listeners can improve their speech recognition in linguistically mismatched target–masker experiments even when the listener is able to obtain meaningful linguistic information from the masker speech. PMID:24167230
Audibility-based predictions of speech recognition for children and adults with normal hearing.
McCreery, Ryan W; Stelmachowicz, Patricia G
2011-12-01
This study investigated the relationship between audibility and predictions of speech recognition for children and adults with normal hearing. The Speech Intelligibility Index (SII) is used to quantify the audibility of speech signals and can be applied to transfer functions to predict speech recognition scores. Although the SII is used clinically with children, relatively few studies have evaluated SII predictions of children's speech recognition directly. Children have required more audibility than adults to reach maximum levels of speech understanding in previous studies. Furthermore, children may require greater bandwidth than adults for optimal speech understanding, which could influence frequency-importance functions used to calculate the SII. Speech recognition was measured for 116 children and 19 adults with normal hearing. Stimulus bandwidth and background noise level were varied systematically in order to evaluate speech recognition as predicted by the SII and derive frequency-importance functions for children and adults. Results suggested that children required greater audibility to reach the same level of speech understanding as adults. However, differences in performance between adults and children did not vary across frequency bands. © 2011 Acoustical Society of America
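For readers unfamiliar with the SII, the index is essentially an importance-weighted sum of band audibilities. The sketch below shows that structure in simplified form, using the usual assumption that speech occupies a roughly 30 dB dynamic range with peaks about 15 dB above its RMS spectrum level; the band-importance weights in the example are placeholders, not the values from ANSI S3.5 or from this study's derived functions, and level-distortion and other corrections are omitted.

```python
import numpy as np

def simplified_sii(speech_spectrum_db, noise_spectrum_db, band_importance):
    """Importance-weighted average of per-band audibility, clipped to [0, 1]."""
    speech = np.asarray(speech_spectrum_db, dtype=float)
    noise = np.asarray(noise_spectrum_db, dtype=float)
    w = np.asarray(band_importance, dtype=float)
    audibility = np.clip((speech - noise + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(w * audibility) / np.sum(w))

# Illustrative 4-band example (levels in dB; weights are placeholders).
print(simplified_sii([50, 48, 45, 40], [40, 42, 44, 46], [0.2, 0.3, 0.3, 0.2]))
```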
Speech fluency profile on different tasks for individuals with Parkinson's disease.
Juste, Fabiola Staróbole; Andrade, Claudia Regina Furquim de
2017-07-20
To characterize the speech fluency profile of patients with Parkinson's disease. Study participants were 40 individuals of both genders aged 40 to 80 years divided into 2 groups: Research Group - RG (20 individuals with diagnosis of Parkinson's disease) and Control Group - CG (20 individuals with no communication or neurological disorders). For all of the participants, three speech samples involving different tasks were collected: monologue, individual reading, and automatic speech. The RG presented a significant larger number of speech disruptions, both stuttering-like and typical dysfluencies, and higher percentage of speech discontinuity in the monologue and individual reading tasks compared with the CG. Both groups presented reduced number of speech disruptions (stuttering-like and typical dysfluencies) in the automatic speech task; the groups presented similar performance in this task. Regarding speech rate, individuals in the RG presented lower number of words and syllables per minute compared with those in the CG in all speech tasks. Participants of the RG presented altered parameters of speech fluency compared with those of the CG; however, this change in fluency cannot be considered a stuttering disorder.
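The rate and discontinuity measures named above are simple ratios once a sample has been transcribed and its disruptions counted. The sketch below shows one plausible way to compute them; the exact denominators used in the study (words versus syllables) are an assumption.

```python
def fluency_profile(n_words: int, n_syllables: int, n_disruptions: int,
                    duration_s: float) -> dict:
    """Speech rate and discontinuity measures of the kind reported in the study.
    Percentage of speech discontinuity is computed here as disruptions per 100
    syllables; the study's exact denominator may differ."""
    minutes = duration_s / 60.0
    return {
        "words_per_minute": n_words / minutes,
        "syllables_per_minute": n_syllables / minutes,
        "pct_discontinuity": 100.0 * n_disruptions / n_syllables,
    }

# Example: a 2-minute monologue with 180 words, 300 syllables, 18 disruptions.
print(fluency_profile(180, 300, 18, 120.0))
```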
ERIC Educational Resources Information Center
Chapman, Kathy L.
2004-01-01
This study examined the relationship between presurgery speech measures and speech and language performance at 39 months as well as the relationship between early postsurgery speech measures and speech and language performance at 39 months of age. Fifteen children with cleft lip and palate participated in the study. Spontaneous speech samples were…
ERIC Educational Resources Information Center
Davidow, Jason H.; Ingham, Roger J.
2013-01-01
Purpose: This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also…
ERIC Educational Resources Information Center
Iuzzini-Seigel, Jenya; Hogan, Tiffany P.; Green, Jordan R.
2017-01-01
Purpose: The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and…
The sensorimotor and social sides of the architecture of speech.
Pezzulo, Giovanni; Barca, Laura; D'Ausilio, Alessando
2014-12-01
Speech is a complex skill to master. In addition to sophisticated phono-articulatory abilities, speech acquisition requires neuronal systems configured for vocal learning, with adaptable sensorimotor maps that couple heard speech sounds with motor programs for speech production; imitation and self-imitation mechanisms that can train the sensorimotor maps to reproduce heard speech sounds; and a "pedagogical" learning environment that supports tutor learning.
ERIC Educational Resources Information Center
Cowan, Gloria; Khatchadourian, Desiree
2003-01-01
Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…
Total Ossicular Replacement Prosthesis: A New Fat Interposition Technique
Saliba, Issam; Sabbah, Valérie; Poirier, Jackie Bibeau
2018-01-01
Objective: To compare audiometric results between the standard total ossicular replacement prosthesis (TORP-S) and a new fat interposition total ossicular replacement prosthesis (TORP-F) in pediatric and adult patients, and to assess complications and undesirable outcomes. Study design: This is a retrospective study. Methods: This study included 104 patients who had undergone titanium implants with TORP-F and 54 patients who had undergone the procedure with TORP-S between 2008 and 2013 in our tertiary care centers. The new technique consists of interposing a fat graft between the 4 legs of the universal titanium prosthesis (Medtronic Xomed Inc, Jacksonville, FL, USA) to provide a more stable TORP in the oval window niche. Normally, this prosthesis is designed to fit on the stapes' head as a partial ossicular replacement prosthesis. Results: The postoperative air-bone gap less than 25 dB for the combined cohort was 69.2% and 41.7% for the TORP-F and the TORP-S groups, respectively. The mean follow-up was 17 months postoperatively. By stratifying data, the pediatric cohort shows 56.5% in the TORP-F group (n = 52) compared with 40% in the TORP-S group (n = 29). However, the adult cohort shows 79.3% in the TORP-F group (n = 52) compared with 43.75% in the TORP-S group (n = 25). These improvements in hearing were statistically significant. There were no statistically significant differences in the speech discrimination scores. The only undesirable outcome that was statistically different between the 2 groups was the prosthesis displacement: 7% in the TORP-F group compared with 19% in the TORP-S group (P = .03). Conclusions: The interposition of a fat graft between the legs of the titanium implants (TORP-F) provides superior hearing results compared with a standard procedure (TORP-S) in pediatric and adult populations because of its better stability in the oval window niche. PMID:29326537
Van Borsel, John; Eeckhout, Hannelore
2008-09-01
This study investigated listeners' perception of the speech naturalness of people who stutter (PWS) speaking under delayed auditory feedback (DAF) with particular attention for possible listener differences. Three panels of judges consisting of 14 stuttering individuals, 14 speech language pathologists, and 14 naive listeners rated the naturalness of speech samples of stuttering and non-stuttering individuals using a 9-point interval scale. Results clearly indicate that these three groups evaluate naturalness differently. Naive listeners appear to be more severe in their judgements than speech language pathologists and stuttering listeners, and speech language pathologists are apparently more severe than PWS. The three listener groups showed similar trends with respect to the relationship between speech naturalness and speech rate. Results of all three indicated that for PWS, the slower a speaker's rate was, the less natural speech was judged to sound. The three listener groups also showed similar trends with regard to naturalness of the stuttering versus the non-stuttering individuals. All three panels considered the speech of the non-stuttering participants more natural. The reader will be able to: (1) discuss the speech naturalness of people who stutter speaking under delayed auditory feedback, (2) discuss listener differences about the naturalness of people who stutter speaking under delayed auditory feedback, and (3) discuss the importance of speech rate for the naturalness of speech.
Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.
2017-01-01
Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other scores were obtained for 264 participants in 6 groups: CAS in idiopathic, neurogenetic, and complex neurodevelopmental disorders; adult-onset apraxia of speech (AAS) consequent to stroke and primary progressive apraxia of speech; and idiopathic speech delay. Results: Participants with CAS and AAS had significantly lower scores than typically speaking reference participants and speech delay controls on measures posited to assess representational and transcoding processes. Representational deficits differed between CAS and AAS groups, with support for both underspecified linguistic representations and memory/access deficits in CAS, but for only the latter in AAS. CAS–AAS similarities in the age–sex standardized percentages of occurrence of the most frequent type of inappropriate pauses (abrupt) and significant differences in the standardized occurrence of appropriate pauses were consistent with speech processing findings. Conclusions: Results support the hypotheses of core representational and transcoding speech processing deficits in CAS and theoretical coherence of the PM's pause-speech elements with these deficits. PMID:28384751
Scaling and universality in the human voice.
Luque, Jordi; Luque, Bartolo; Lacasa, Lucas
2015-04-06
Speech is a distinctive complex feature of human capabilities. In order to understand the physics underlying speech production, in this work we empirically analyse the statistics of large human speech datasets spanning several languages. We first show that during speech, the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg-Richter-like law in speech. We further show that such 'earthquakes in speech' show temporal correlations, as the interevent statistics are again power-law distributed. As this feature takes place in the intraphoneme range, we conjecture that the process responsible for this complex phenomenon is not cognitive, but resides in the physiological (mechanical) mechanisms of speech production. Moreover, we show that these waiting-time distributions are scale invariant under a renormalization group transformation, suggesting that the process of speech generation is indeed operating close to a critical point. These results are put in contrast with current paradigms in speech processing, which point towards low-dimensional deterministic chaos as the origin of nonlinear traits in speech fluctuations. As these latter fluctuations are indeed the aspects that humanize synthetic speech, these findings may have an impact on future speech synthesis technologies. Results are robust and independent of the communication language or the number of speakers, pointing towards a universal pattern and yet another hint of complexity in human speech. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
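A Gutenberg-Richter-like law means the distribution of released energies follows a power law p(E) proportional to E^(-alpha). The exponent can be estimated with the standard continuous maximum-likelihood (Hill-type) estimator shown below; the per-frame energy proxy and threshold are assumptions for illustration and are simpler than the event definition used in the paper.

```python
import numpy as np

def frame_energies(audio: np.ndarray, frame_len: int = 160) -> np.ndarray:
    """Crude per-frame energy proxy: sum of squared samples in each frame."""
    n = len(audio) // frame_len
    frames = audio[: n * frame_len].reshape(n, frame_len)
    return np.sum(frames ** 2, axis=1)

def power_law_exponent(energies: np.ndarray, e_min: float) -> float:
    """Maximum-likelihood exponent for a continuous power law p(E) ~ E**(-alpha),
    fitted to the values at or above the threshold e_min."""
    tail = np.asarray(energies, dtype=float)
    tail = tail[tail >= e_min]
    return 1.0 + len(tail) / np.sum(np.log(tail / e_min))
```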
Speech training alters consonant and vowel responses in multiple auditory cortex fields
Engineer, Crystal T.; Rahebi, Kimiya C.; Buell, Elizabeth P.; Fink, Melyssa K.; Kilgard, Michael P.
2015-01-01
Speech sounds evoke unique neural activity patterns in primary auditory cortex (A1). Extensive speech sound discrimination training alters A1 responses. While the neighboring auditory cortical fields each contain information about speech sound identity, each field processes speech sounds differently. We hypothesized that while all fields would exhibit training-induced plasticity following speech training, there would be unique differences in how each field changes. In this study, rats were trained to discriminate speech sounds by consonant or vowel in quiet and in varying levels of background speech-shaped noise. Local field potential and multiunit responses were recorded from four auditory cortex fields in rats that had received 10 weeks of speech discrimination training. Our results reveal that training alters speech evoked responses in each of the auditory fields tested. The neural response to consonants was significantly stronger in anterior auditory field (AAF) and A1 following speech training. The neural response to vowels following speech training was significantly weaker in ventral auditory field (VAF) and posterior auditory field (PAF). This differential plasticity of consonant and vowel sound responses may result from the greater paired pulse depression, expanded low frequency tuning, reduced frequency selectivity, and lower tone thresholds, which occurred across the four auditory fields. These findings suggest that alterations in the distributed processing of behaviorally relevant sounds may contribute to robust speech discrimination. PMID:25827927
Infants’ brain responses to speech suggest Analysis by Synthesis
Kuhl, Patricia K.; Ramírez, Rey R.; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki
2014-01-01
Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners’ knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca’s area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of “motherese” on early language learning, and (iii) the “social-gating” hypothesis and humans’ development of social understanding. PMID:25024207
Davidow, Jason H.
2013-01-01
Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control speech rate between conditions limits our ability to determine if the changes were necessary for fluency. Aims: This study examined the effect of speech rate on several speech production variables during one-syllable-per-beat metronomic speech, in order to determine changes that may be important for fluency during this fluency-inducing condition. Methods and Procedures: Thirteen persons who stutter (PWS), aged 18–62 years, completed a series of speaking tasks. Several speech production variables were compared between conditions produced at different metronome beat rates, and between a control condition and a metronome-paced speech condition produced at a rate equal to the control condition. Outcomes & Results: Vowel duration, voice onset time, pressure rise time, and phonated intervals were significantly impacted by metronome beat rate. Voice onset time and the percentage of short (30–100 ms) phonated intervals significantly decreased from the control condition to the equivalent rate metronome-paced speech condition. Conclusions & Implications: A reduction in the percentage of short phonated intervals may be important for fluency during syllable-based metronome-paced speech for PWS. Future studies should continue examining the necessity of this reduction. In addition, speech rate must be controlled in future fluency-inducing condition studies, including neuroimaging investigations, in order for this research to make a substantial contribution to finding the fluency-inducing mechanism of fluency-inducing conditions. PMID:24372888
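Phonated intervals are simply the durations of uninterrupted voicing, so the short-PI percentage tracked above can be computed once voiced stretches are identified. The sketch below uses a crude energy threshold as a stand-in for proper voicing detection (which in this literature is typically accelerometer- or algorithm-based); the frame length and threshold are assumptions, while the 30–100 ms range for short PIs is taken from the abstract.

```python
import numpy as np

def phonated_intervals(audio: np.ndarray, fs: int, frame_ms: float = 10.0,
                       rel_threshold: float = 0.05) -> list:
    """Durations (ms) of consecutive runs of 'voiced' frames, where a frame is
    treated as voiced if its RMS exceeds rel_threshold times the maximum frame
    RMS. A rough stand-in for real voicing detection."""
    flen = int(fs * frame_ms / 1000)
    n = len(audio) // flen
    rms = np.sqrt(np.mean(audio[: n * flen].reshape(n, flen) ** 2, axis=1))
    voiced = rms > rel_threshold * rms.max()
    durations, run = [], 0
    for v in voiced:
        if v:
            run += 1
        elif run:
            durations.append(run * frame_ms)
            run = 0
    if run:
        durations.append(run * frame_ms)
    return durations

def percent_short_pis(durations, lo: float = 30.0, hi: float = 100.0) -> float:
    """Percentage of phonated intervals in the 30-100 ms 'short PI' range."""
    durations = list(durations)
    if not durations:
        return 0.0
    return 100.0 * sum(lo <= d <= hi for d in durations) / len(durations)
```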
Reference-free automatic quality assessment of tracheoesophageal speech.
Huang, Andy; Falk, Tiago H; Chan, Wai-Yip; Parsa, Vijay; Doyle, Philip
2009-01-01
Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.
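To make the idea of a modulation-spectrum feature concrete, here is a minimal Python sketch that computes a crude measure: the fraction of temporal-envelope modulation energy in a low modulation-frequency band. This is not the modulation spectral measure used in the study, nor part of the ITU-T P.563 feature set; the envelope-extraction method, band edges, and frame parameters are assumptions.

# Sketch: a crude speech modulation-spectrum feature. The temporal-envelope
# extraction and band edges are assumptions for illustration only.
import numpy as np
from scipy.signal import hilbert, welch

def modulation_energy_ratio(x, fs, lo=2.0, hi=16.0):
    """Fraction of temporal-envelope modulation energy between lo and hi Hz."""
    envelope = np.abs(hilbert(x))                      # temporal envelope
    envelope = envelope - envelope.mean()
    f, pxx = welch(envelope, fs=fs, nperseg=int(fs))   # modulation spectrum
    band = (f >= lo) & (f <= hi)
    return pxx[band].sum() / pxx.sum()

# Toy usage with 2 s of noise standing in for a speech recording.
fs = 8000
x = np.random.randn(2 * fs)
print(modulation_energy_ratio(x, fs))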
An analysis of the masking of speech by competing speech using self-report data.
Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot
2009-01-01
Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.
NASA Astrophysics Data System (ADS)
Dat, Tran Huy; Takeda, Kazuya; Itakura, Fumitada
We present a multichannel speech enhancement method based on MAP speech spectral magnitude estimation using a generalized gamma model of the speech prior distribution, where the model parameters are adapted from the actual noisy speech in a frame-by-frame manner. The use of a more general prior distribution with online adaptive estimation is shown to be effective for speech spectral estimation in noisy environments. Furthermore, the multichannel information, in the form of cross-channel statistics, is shown to be useful for better adapting the prior distribution parameters to the actual observation, resulting in better performance of the speech enhancement algorithm. We tested the proposed algorithm on an in-car speech database and obtained significant improvements in speech recognition performance, particularly under non-stationary noise conditions such as music, air-conditioning and open windows.
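For orientation only, the sketch below implements a much simpler single-channel relative of this kind of enhancer: a decision-directed a priori SNR estimate with a Wiener gain applied per frame. It is a stand-in to illustrate frame-by-frame spectral-gain enhancement, not a reimplementation of the paper's MAP estimator with a generalized gamma prior or its multichannel parameter adaptation; the FFT size and smoothing constant are assumptions.

# Sketch: a simplified single-channel spectral-gain enhancer (decision-directed
# a priori SNR + Wiener gain). This is a stand-in for the paper's MAP estimator
# with a generalized gamma prior, not a reimplementation of it.
import numpy as np
from scipy.signal import stft, istft

def enhance(noisy, fs, noise_psd, alpha=0.98):
    f, t, spec = stft(noisy, fs=fs, nperseg=512)
    power = np.abs(spec) ** 2
    prev_clean = np.zeros(spec.shape[0])
    gains = np.empty_like(power)
    for i in range(spec.shape[1]):
        snr_post = power[:, i] / noise_psd             # a posteriori SNR
        snr_prio = alpha * prev_clean / noise_psd + (1 - alpha) * np.maximum(snr_post - 1, 0)
        g = snr_prio / (1 + snr_prio)                  # Wiener gain
        gains[:, i] = g
        prev_clean = (g ** 2) * power[:, i]
    _, enhanced = istft(gains * spec, fs=fs, nperseg=512)
    return enhanced

# Toy usage: white noise as both the "noisy" input and the noise PSD estimate.
fs = 16000
noisy = np.random.randn(fs)
_, _, n_spec = stft(noisy, fs=fs, nperseg=512)
print(enhance(noisy, fs, np.mean(np.abs(n_spec) ** 2, axis=1)).shape)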
Some Effects of Training on the Perception of Synthetic Speech
Schwab, Eileen C.; Nusbaum, Howard C.; Pisoni, David B.
2012-01-01
The present study was conducted to determine the effects of training on the perception of synthetic speech. Three groups of subjects were tested with synthetic speech using the same tasks before and after training. One group was trained with synthetic speech. A second group went through the identical training procedures using natural speech. The third group received no training. Although performance of the three groups was the same prior to training, significant differences on the post-test measures of word recognition were observed: the group trained with synthetic speech performed much better than the other two groups. A six-month follow-up indicated that the group trained with synthetic speech displayed long-term retention of the knowledge and experience gained with prior exposure to synthetic speech generated by a text-to-speech system. PMID:2936671
Lexical Effects on Second Language Acquisition
ERIC Educational Resources Information Center
Kemp, Renee Lorraine
2017-01-01
Speech production and perception are inextricably linked systems. Speakers modify their speech in response to listener characteristics, such as age, hearing ability, and language background. Listener-oriented modifications in speech production, commonly referred to as clear speech, have also been found to affect speech perception by enhancing…
Sperry Univac speech communications technology
NASA Technical Reports Server (NTRS)
Medress, Mark F.
1977-01-01
Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.
Auditory-Motor Processing of Speech Sounds
Möttönen, Riikka; Dutton, Rebekah; Watkins, Kate E.
2013-01-01
The motor regions that control movements of the articulators activate during listening to speech and contribute to performance in demanding speech recognition and discrimination tasks. Whether the articulatory motor cortex modulates auditory processing of speech sounds is unknown. Here, we aimed to determine whether the articulatory motor cortex affects the auditory mechanisms underlying discrimination of speech sounds in the absence of demanding speech tasks. Using electroencephalography, we recorded responses to changes in sound sequences, while participants watched a silent video. We also disrupted the lip or the hand representation in left motor cortex using transcranial magnetic stimulation. Disruption of the lip representation suppressed responses to changes in speech sounds, but not piano tones. In contrast, disruption of the hand representation had no effect on responses to changes in speech sounds. These findings show that disruptions within, but not outside, the articulatory motor cortex impair automatic auditory discrimination of speech sounds. The findings provide evidence for the importance of auditory-motor processes in efficient neural analysis of speech sounds. PMID:22581846
Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.
2006-04-25
The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Contextual modulation of reading rate for direct versus indirect speech quotations.
Yao, Bo; Scheepers, Christoph
2011-12-01
In human communication, direct speech (e.g., Mary said: "I'm hungry") is perceived to be more vivid than indirect speech (e.g., Mary said [that] she was hungry). However, the processing consequences of this distinction are largely unclear. In two experiments, participants were asked to either orally (Experiment 1) or silently (Experiment 2, eye-tracking) read written stories that contained either a direct speech or an indirect speech quotation. The context preceding those quotations described a situation that implied either a fast-speaking or a slow-speaking quoted protagonist. It was found that this context manipulation affected reading rates (in both oral and silent reading) for direct speech quotations, but not for indirect speech quotations. This suggests that readers are more likely to engage in perceptual simulations of the reported speech act when reading direct speech as opposed to meaning-equivalent indirect speech quotations, as part of a more vivid representation of the former. Copyright © 2011 Elsevier B.V. All rights reserved.
Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders.
Liss, Julie M; Utianski, Rene; Lansford, Kaitlin
2013-01-01
Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. The goals of this paper are to (i) provide a review of the cognitive-linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. Copyright © 2013 S. Karger AG, Basel.
Crosslinguistic Application of English-Centric Rhythm Descriptors in Motor Speech Disorders
Liss, Julie M.; Utianski, Rene; Lansford, Kaitlin
2014-01-01
Background Rhythmic disturbances are a hallmark of motor speech disorders, in which the motor control deficits interfere with the outward flow of speech and by extension speech understanding. As the functions of rhythm are language-specific, breakdowns in rhythm should have language-specific consequences for communication. Objective The goals of this paper are to (i) provide a review of the cognitive- linguistic role of rhythm in speech perception in a general sense and crosslinguistically; (ii) present new results of lexical segmentation challenges posed by different types of dysarthria in American English, and (iii) offer a framework for crosslinguistic considerations for speech rhythm disturbances in the diagnosis and treatment of communication disorders associated with motor speech disorders. Summary This review presents theoretical and empirical reasons for considering speech rhythm as a critical component of communication deficits in motor speech disorders, and addresses the need for crosslinguistic research to explore language-universal versus language-specific aspects of motor speech disorders. PMID:24157596
An investigation of articulatory setting using real-time magnetic resonance imaging
Ramanarayanan, Vikram; Goldstein, Louis; Byrd, Dani; Narayanan, Shrikanth S.
2013-01-01
This paper presents an automatic procedure to analyze articulatory setting in speech production using real-time magnetic resonance imaging of the moving human vocal tract. The procedure extracts frames corresponding to inter-speech pauses, speech-ready intervals and absolute rest intervals from magnetic resonance imaging sequences of read and spontaneous speech elicited from five healthy speakers of American English and uses automatically extracted image features to quantify vocal tract posture during these intervals. Statistical analyses show significant differences between vocal tract postures adopted during inter-speech pauses and those at absolute rest before speech; the latter also exhibits a greater variability in the adopted postures. In addition, the articulatory settings adopted during inter-speech pauses in read and spontaneous speech are distinct. The results suggest that adopted vocal tract postures differ on average during rest positions, ready positions and inter-speech pauses, and might, in that order, involve an increasing degree of active control by the cognitive speech planning mechanism. PMID:23862826
Alternative Speech Communication System for Persons with Severe Speech Disorders
NASA Astrophysics Data System (ADS)
Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas
2009-12-01
Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenation algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and of more than 20% are achieved by the speech synthesis systems that deal with SSDs and dysarthria, respectively.
Neurophysiological Influence of Musical Training on Speech Perception
Shahin, Antoine J.
2011-01-01
Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remains a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL. PMID:21716639
Neurophysiological influence of musical training on speech perception.
Shahin, Antoine J
2011-01-01
Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remains a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL.
Speech production gains following constraint-induced movement therapy in children with hemiparesis.
Allison, Kristen M; Reidy, Teressa Garcia; Boyle, Mary; Naber, Erin; Carney, Joan; Pidcock, Frank S
2017-01-01
The purpose of this study was to investigate changes in the speech skills of children who have hemiparesis and speech impairment after participation in a constraint-induced movement therapy (CIMT) program. While case studies have reported collateral speech gains following CIMT, to the investigators' knowledge the effect of CIMT on speech production has not previously been directly investigated. Eighteen children with hemiparesis and co-occurring speech impairment participated in a 21-day clinical CIMT program. The Goldman-Fristoe Test of Articulation-2 (GFTA-2) was used to assess children's articulation of speech sounds before and after the intervention. Changes in percent of consonants correct (PCC) on the GFTA-2 were used as a measure of change in speech production. Children made significant gains in PCC following CIMT. Gains were similar in children with left- and right-sided hemiparesis, and across age groups. This study reports significant collateral gains in speech production following CIMT and suggests that the benefits of CIMT may also extend to speech motor domains.
Long short-term memory for speaker generalization in supervised speech separation
Chen, Jitong; Wang, DeLiang
2017-01-01
Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. To improve speaker generalization, a separation model based on long short-term memory (LSTM) is proposed, which naturally accounts for the temporal dynamics of speech. Systematic evaluation shows that the proposed model substantially outperforms a DNN-based model on unseen speakers and unseen noises in terms of objective speech intelligibility. Analyzing the LSTM's internal representations reveals that it captures long-term speech contexts. The LSTM model is also more advantageous for low-latency speech separation: even without future frames, it performs better than the DNN model with future frames. The proposed model represents an effective approach for speaker- and noise-independent speech separation. PMID:28679261
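A minimal PyTorch sketch of the general approach described above: an LSTM maps frames of acoustic features from noisy speech to a sigmoid time-frequency mask. The layer sizes, feature dimension, and choice of a ratio-style mask are assumptions and do not reproduce the authors' architecture or training setup.

# Sketch: an LSTM that maps frames of noisy-speech features to a time-frequency
# mask. Layer sizes, feature dimension, and the sigmoid mask output are assumptions.
import torch
import torch.nn as nn

class MaskLSTM(nn.Module):
    def __init__(self, n_features=64, n_freq_bins=161, hidden=256, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers, batch_first=True)
        self.out = nn.Linear(hidden, n_freq_bins)

    def forward(self, x):                     # x: (batch, frames, n_features)
        h, _ = self.lstm(x)
        return torch.sigmoid(self.out(h))     # mask in [0, 1] per T-F unit

# Toy usage: a batch of 4 utterances, 100 frames each.
model = MaskLSTM()
mask = model(torch.randn(4, 100, 64))
print(mask.shape)                             # torch.Size([4, 100, 161])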
The right hemisphere is highlighted in connected natural speech production and perception.
Alexandrou, Anna Maria; Saarinen, Timo; Mäkelä, Sasu; Kujala, Jan; Salmelin, Riitta
2017-05-15
Current understanding of the cortical mechanisms of speech perception and production stems mostly from studies that focus on single words or sentences. However, it has been suggested that processing of real-life connected speech may rely on additional cortical mechanisms. In the present study, we examined the neural substrates of natural speech production and perception with magnetoencephalography by modulating three central features related to speech: amount of linguistic content, speaking rate and social relevance. The amount of linguistic content was modulated by contrasting natural speech production and perception to speech-like non-linguistic tasks. Meaningful speech was produced and perceived at three speaking rates: normal, slow and fast. Social relevance was probed by having participants attend to speech produced by themselves and an unknown person. These speech-related features were each associated with distinct spatiospectral modulation patterns that involved cortical regions in both hemispheres. Natural speech processing markedly engaged the right hemisphere in addition to the left. In particular, the right temporo-parietal junction, previously linked to attentional processes and social cognition, was highlighted in the task modulations. The present findings suggest that its functional role extends to active generation and perception of meaningful, socially relevant speech. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Lee, Jimin; Hustad, Katherine C; Weismer, Gary
2014-10-01
Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (speech motor impairment [SMI] group) and 9 judged to be free of dysarthria (no SMI [NSMI] group). Data from children with CP were compared to data from age-matched typically developing children. Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and typically developing groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (adjusted R2 = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R2 analyses revealed that any single variable explained less than 9% of speech intelligibility variability. Children in the SMI group had articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
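The following sketch illustrates the kind of regression analysis described above: fitting a multiple regression of intelligibility on several acoustic variables and reporting an adjusted R2 along with one variable's incremental contribution. The data are synthetic and the setup is invented for illustration; it is not the authors' model.

# Sketch: predicting intelligibility from acoustic measures with multiple
# regression, reporting adjusted R^2 and one incremental contribution.
# The data and variable layout are invented for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(22, 9))                           # 9 acoustic variables, 22 children
y = X[:, 0] * 0.8 + rng.normal(scale=0.5, size=22)     # toy intelligibility scores

def adjusted_r2(X, y):
    model = LinearRegression().fit(X, y)
    r2 = r2_score(y, model.predict(X))
    n, p = X.shape
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

full = adjusted_r2(X, y)
without_first = adjusted_r2(X[:, 1:], y)
print(f"adjusted R2 (full model): {full:.3f}")
print(f"incremental contribution of variable 1: {full - without_first:.3f}")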
Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.
2017-01-01
Purpose The goal of this article (PM I) is to describe the rationale for and development of the Pause Marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech from speech delay. Method The authors describe and prioritize 7 criteria with which to evaluate the research and clinical utility of a diagnostic marker for childhood apraxia of speech, including evaluation of the present proposal. An overview is given of the Speech Disorders Classification System, including extensions completed in the same approximately 3-year period in which the PM was developed. Results The finalized Speech Disorders Classification System includes a nosology and cross-classification procedures for childhood and persistent speech disorders and motor speech disorders (Shriberg, Strand, & Mabie, 2017). A PM is developed that provides procedural and scoring information, and citations to papers and technical reports that include audio exemplars of the PM and reference data used to standardize PM scores are provided. Conclusions The PM described here is an acoustic-aided perceptual sign that quantifies one aspect of speech precision in the linguistic domain of phrasing. This diagnostic marker can be used to discriminate early or persistent childhood apraxia of speech from speech delay. PMID:28384779
Stoppelman, Nadav; Harpaz, Tamar; Ben-Shachar, Michal
2013-05-01
Speech processing engages multiple cortical regions in the temporal, parietal, and frontal lobes. Isolating speech-sensitive cortex in individual participants is of major clinical and scientific importance. This task is complicated by the fact that responses to sensory and linguistic aspects of speech are tightly packed within the posterior superior temporal cortex. In functional magnetic resonance imaging (fMRI), various baseline conditions are typically used in order to isolate speech-specific from basic auditory responses. Using a short, continuous sampling paradigm, we show that reversed ("backward") speech, a commonly used auditory baseline for speech processing, removes much of the speech responses in frontal and temporal language regions of adult individuals. On the other hand, signal correlated noise (SCN) serves as an effective baseline for removing primary auditory responses while maintaining strong signals in the same language regions. We show that the response to reversed speech in the left inferior frontal gyrus decays significantly faster than the response to speech, suggesting that this response reflects bottom-up activation of speech analysis followed by top-down attenuation once the signal is classified as nonspeech. The results overall favor SCN as an auditory baseline for speech processing.
The Use of Electroencephalography in Language Production Research: A Review
Ganushchak, Lesya Y.; Christoffels, Ingrid K.; Schiller, Niels O.
2011-01-01
Speech production research long avoided electrophysiological experiments because of the suspicion that artifacts caused by the muscle activity of overt speech may lead to a poor signal-to-noise ratio in the measurements. Therefore, researchers have sought to assess speech production by using indirect speech production tasks, such as tacit or implicit naming, delayed naming, or meta-linguistic tasks such as phoneme monitoring. Covert speech may, however, involve different processes than overt speech production. Recently, overt speech has been investigated using electroencephalography (EEG). The steadily rising number of published papers clearly indicates the increasing interest in, and demand for, overt speech research within the cognitive neuroscience of language. Our main goal here is to review all currently available results on overt speech production involving EEG measurements, such as picture naming, Stroop naming, and reading aloud. We conclude that overt speech production can be successfully studied using electrophysiological measures, for instance, event-related brain potentials (ERPs). We discuss possibly relevant components in the ERP waveform of speech production and aim to address the issue of how to interpret the results of ERP research using overt speech, and whether the ERP components in language production are comparable to results from other fields. PMID:21909333
Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging.
Hagedorn, Christina; Proctor, Michael; Goldstein, Louis; Wilson, Stephen M; Miller, Bruce; Gorno-Tempini, Maria Luisa; Narayanan, Shrikanth S
2017-04-14
Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided and the nature of pathomechanisms of apraxic speech discussed. One adult male speaker with apraxia of speech was imaged using real-time MRI while producing spontaneous speech, repeated naming tasks, and self-paced repetition of word pairs designed to elicit speech errors. Articulatory data were analyzed, and speech errors were detected using time series reflecting articulatory activity in regions of interest. Real-time MRI captured two types of apraxic gestural intrusion errors in a word pair repetition task. Gestural intrusion errors in nonrepetitive speech, multiple silent initiation gestures at the onset of speech, and covert (unphonated) articulation of entire monosyllabic words were also captured. Real-time MRI and accompanying analytical methods capture and quantify many features of apraxic speech that have been previously observed using other modalities while offering high spatial resolution. This patient's apraxia of speech affected the ability to select only the appropriate vocal tract gestures for a target utterance, suppressing others, and to coordinate them in time.
Describing Speech Usage in Daily Activities in Typical Adults.
Anderson, Laine; Baylor, Carolyn R; Eadie, Tanya L; Yorkston, Kathryn M
2016-01-01
"Speech usage" refers to what people want or need to do with their speech to meet communication demands in life roles. The purpose of this study was to contribute to validation of the Levels of Speech Usage scale by providing descriptive data from a sample of adults without communication disorders, comparing this scale to a published Occupational Voice Demands scale and examining predictors of speech usage levels. This is a survey design. Adults aged ≥25 years without reported communication disorders were recruited nationally to complete an online questionnaire. The questionnaire included the Levels of Speech Usage scale, questions about relevant occupational and nonoccupational activities (eg, socializing, hobbies, childcare, and so forth), and demographic information. Participants were also categorized according to Koufman and Isaacson occupational voice demands scale. A total of 276 participants completed the questionnaires. People who worked for pay tended to report higher levels of speech usage than those who do not work for pay. Regression analyses showed employment to be the major contributor to speech usage; however, considerable variance left unaccounted for suggests that determinants of speech usage and the relationship between speech usage, employment, and other life activities are not yet fully defined. The Levels of Speech Usage may be a viable instrument to systematically rate speech usage because it captures both occupational and nonoccupational speech demands. These data from a sample of typical adults may provide a reference to help in interpreting the impact of communication disorders on speech usage patterns. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius
2015-01-01
Background and Purpose Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions regarding whether AOS emerges from a unique pattern of brain damage or as a sub-element of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Methods Forty-three patients with a history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The Apraxia of Speech Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with AOS and/or aphasia. Localized brain damage was identified using structural MRI, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS and/or aphasia, and brain damage. Results The pattern of brain damage associated with AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS and/or aphasia were associated with damage to the temporal lobe and the inferior pre-central frontal regions. Conclusion AOS likely occurs in conjunction with aphasia due to the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. PMID:25908457
42 CFR 485.715 - Condition of participation: Speech pathology services.
Code of Federal Regulations, 2010 CFR
2010-10-01
Title 42, Public Health; … Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services; § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the…
42 CFR 485.715 - Condition of participation: Speech pathology services.
Code of Federal Regulations, 2013 CFR
2013-10-01
Title 42, Public Health; … Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services; § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the…
42 CFR 485.715 - Condition of participation: Speech pathology services.
Code of Federal Regulations, 2011 CFR
2011-10-01
Title 42, Public Health; … Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services; § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the…
42 CFR 485.715 - Condition of participation: Speech pathology services.
Code of Federal Regulations, 2014 CFR
2014-10-01
Title 42, Public Health; … Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services; § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the…
42 CFR 485.715 - Condition of participation: Speech pathology services.
Code of Federal Regulations, 2012 CFR
2012-10-01
Title 42, Public Health; … Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services; § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the…
Freedom of Speech: A Clear and Present Need to Teach. ERIC Report.
ERIC Educational Resources Information Center
Boileau, Don M.
1983-01-01
Presents annotations of 21 documents in the ERIC system on the following subjects: (1) theory of freedom of speech; (2) theorists; (3) research on freedom of speech; (4) broadcasting and freedom of speech; and (5) international questions of freedom of speech. (PD)
Retrieval from Memory: Vulnerable or Inviolable?
ERIC Educational Resources Information Center
Jones, Dylan M.; Marsh, John E.; Hughes, Robert W.
2012-01-01
We show that retrieval from semantic memory is vulnerable even to the mere presence of speech. Irrelevant speech impairs semantic fluency--namely, lexical retrieval cued by a semantic category name--but only if it is meaningful (forward speech compared to reversed speech or words compared to nonwords). Moreover, speech related semantically to the…
Freedom of Speech Newsletter, September, 1975.
ERIC Educational Resources Information Center
Allen, Winfred G., Jr., Ed.
The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…
Voice and Speech after Laryngectomy
ERIC Educational Resources Information Center
Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka
2006-01-01
The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…
Speech Patterns and Racial Wage Inequality
ERIC Educational Resources Information Center
Grogger, Jeffrey
2011-01-01
Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…
The Relationship between Speech Production and Speech Perception Deficits in Parkinson's Disease
ERIC Educational Resources Information Center
De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet
2016-01-01
Purpose: This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Method: Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified…
Speech Characteristics Associated with Three Genotypes of Ataxia
ERIC Educational Resources Information Center
Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana
2011-01-01
Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…
ERIC Educational Resources Information Center
Gass, Susan M., Ed.; Neu, Joyce, Ed.
Articles on speech acts and intercultural communication include: "Investigating the Production of Speech Act Sets" (Andrew Cohen); "Non-Native Refusals: A Methodological Perspective" (Noel Houck, Susan M. Gass); "Natural Speech Act Data versus Written Questionnaire Data: How Data Collection Method Affects Speech Act…
The "Checkers" Speech and Televised Political Communication.
ERIC Educational Resources Information Center
Flaningam, Carl
Richard Nixon's 1952 "Checkers" speech was an innovative use of television for political communication. Like television news itself, the campaign fund crisis behind the speech can be thought of in the same terms as other television melodrama, with the speech serving as its climactic episode. The speech adapted well to television because…
Phonemic Characteristics of Apraxia of Speech Resulting from Subcortical Hemorrhage
ERIC Educational Resources Information Center
Peach, Richard K.; Tonkovich, John D.
2004-01-01
Reports describing subcortical apraxia of speech (AOS) have received little consideration in the development of recent speech processing models because the speech characteristics of patients with this diagnosis have not been described precisely. We describe a case of AOS with aphasia secondary to basal ganglia hemorrhage. Speech-language symptoms…
The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech
ERIC Educational Resources Information Center
Ye, Ruijuan
2010-01-01
This paper carries out a tentative interpersonal metafunction analysis of Barack Obama's victory speech, which aims to help readers understand and evaluate the speech regarding its suitability and thus to provide some guidance for readers in making better speeches. This study has promising implications for speeches as…
The Effectiveness of Clear Speech as a Masker
ERIC Educational Resources Information Center
Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.
2010-01-01
Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-25
… Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities; Structure and Practices of the Video Relay Service Program. AGENCY: Federal Communications Commission. ACTION: …-minute video relay service ("VRS") compensation rates, and adopts per-minute compensation rates for the…
Advanced Persuasive Speaking, English, Speech: 5114.112.
ERIC Educational Resources Information Center
Dade County Public Schools, Miami, FL.
Developed as a high school quinmester unit on persuasive speaking, this guide provides the teacher with teaching strategies for a course which analyzes speeches from "Vital Speeches of the Day," political speeches, TV commercials, and other types of speeches. Practical use of persuasive methods for school, community, county, state, and…
On the Nature of Speech Science.
ERIC Educational Resources Information Center
Peterson, Gordon E.
In this article the nature of the discipline of speech science is considered and the various basic and applied areas of the discipline are discussed. The basic areas encompass the various processes of the physiology of speech production, the acoustical characteristics of speech, including the speech wave types and the information-bearing acoustic…
Automated Speech Rate Measurement in Dysarthria
ERIC Educational Resources Information Center
Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc
2015-01-01
Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…
Neural and Behavioral Mechanisms of Clear Speech
ERIC Educational Resources Information Center
Luque, Jenna Silver
2017-01-01
Clear speech is a speaking style that has been shown to improve intelligibility in adverse listening conditions, for various listener and talker populations. Clear-speech phonetic enhancements include a slowed speech rate, expanded vowel space, and expanded pitch range. Although clear-speech phonetic enhancements have been demonstrated across a…
Left Lateralized Enhancement of Orofacial Somatosensory Processing Due to Speech Sounds
ERIC Educational Resources Information Center
Ito, Takayuki; Johns, Alexis R.; Ostry, David J.
2013-01-01
Purpose: Somatosensory information associated with speech articulatory movements affects the perception of speech sounds and vice versa, suggesting an intimate linkage between speech production and perception systems. However, it is unclear which cortical processes are involved in the interaction between speech sounds and orofacial somatosensory…
Audiovisual Cues and Perceptual Learning of Spectrally Distorted Speech
ERIC Educational Resources Information Center
Pilling, Michael; Thomas, Sharon
2011-01-01
Two experiments investigate the effectiveness of audiovisual (AV) speech cues (cues derived from both seeing and hearing a talker speak) in facilitating perceptual learning of spectrally distorted speech. Speech was distorted through an eight channel noise-vocoder which shifted the spectral envelope of the speech signal to simulate the properties…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-09
…: Telecommunications Relay Services and Speech-to-Speech Services for Individuals with Hearing and Speech Disabilities … enable the Commission to collect waiver reports from Telecommunications Relay Service (TRS) providers … Report and Order and Order on Reconsideration in Telecommunications Relay Services and Speech-to-Speech…
ERIC Educational Resources Information Center
Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.
2017-01-01
Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…
The influence of speech rate and accent on access and use of semantic information.
Sajin, Stanislav M; Connine, Cynthia M
2017-04-01
Circumstances in which the speech input is presented under sub-optimal conditions generally lead to processing costs that affect spoken word recognition. The current study indicates that some of the processing demands imposed by listening to difficult speech can be mitigated by feedback from semantic knowledge. A set of lexical decision experiments examined how foreign-accented speech and word duration impact access to semantic knowledge in spoken word recognition. Results indicate that when listeners process accented speech, the reliance on semantic information increases. Speech rate was not observed to influence semantic access, except in the setting in which unusually slow accented speech was presented. These findings support interactive activation models of spoken word recognition in which attention is modulated based on speech demands.
Comparison of formant detection methods used in speech processing applications
NASA Astrophysics Data System (ADS)
Belean, Bogdan
2013-11-01
The paper describes time-frequency representations of the speech signal together with the significance of formants in speech processing applications. Speech formants can be used in emotion recognition, sex discrimination, or diagnosing different neurological diseases. Taking into account the various applications of formant detection in speech signals, two methods for detecting formants are presented. First, the poles resulting from a complex analysis of the LPC coefficients are used for formant detection. The second approach uses a Kalman filter for formant prediction along the speech signal. Results are presented for both approaches on real-life speech spectrograms. A comparison of the features of the proposed methods is also performed, in order to establish which method is more suitable for different speech processing applications.
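A minimal sketch of the first approach, LPC-based formant detection from pole angles, is given below. The LPC order, pre-emphasis coefficient, and bandwidth threshold are assumptions chosen for illustration; they are not the parameters used in the paper.

# Sketch: formant estimation from LPC pole angles (the first approach described
# above). The LPC order, pre-emphasis, and bandwidth threshold are assumptions.
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc(frame, order=12):
    """Autocorrelation-method LPC coefficients (1, -a1, ..., -a_order)."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    return np.concatenate(([1.0], -a))

def formants(frame, fs, order=12, max_bw=400.0):
    frame = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])   # pre-emphasis
    roots = np.roots(lpc(frame * np.hamming(len(frame)), order))
    roots = roots[np.imag(roots) > 0]                  # keep upper half-plane poles
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bws = -fs / np.pi * np.log(np.abs(roots))          # pole bandwidths
    return np.sort(freqs[(freqs > 90) & (bws < max_bw)])

# Toy usage on a synthetic vowel-like frame with resonances near 500 and 1500 Hz.
fs = 8000
t = np.arange(int(0.03 * fs)) / fs
frame = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
frame = frame + 0.01 * np.random.randn(len(t))         # small noise for stability
print(formants(frame, fs))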
Long term rehabilitation of a total glossectomy patient.
Bachher, Gurmit Kaur; Dholam, Kanchan P
2010-09-01
Malignant tumours of the oral cavity that require resection of the tongue result in severe deficiencies in speech and deglutition. Speech misarticulation leads to loss of speech intelligibility, which can prevent or limit communication. Prosthodontic rehabilitation involves fabrication of a Palatal Augmentation Prosthesis (PAP) following partial glossectomy and a mandibular tongue prosthesis after total glossectomy [1]. Speech analysis of a total glossectomy patient rehabilitated with a tongue prosthesis was done with the help of Dr. Speech Software Version 4 (Tiger DRS, Inc., Seattle) twelve years after treatment. Speech therapy sessions, along with the prosthesis, helped him to correct the dental sounds by using the lower lip and upper dentures (labio-dentals). It was noticed that speech intelligibility, intonation pattern, speech articulation and overall loudness were noticeably improved.
Speech perception and production in severe environments
NASA Astrophysics Data System (ADS)
Pisoni, David B.
1990-09-01
The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load, or sustained attentional demands. Changes in speech production under these adverse conditions were examined using acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual-motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.
Reaction times of normal listeners to laryngeal, alaryngeal, and synthetic speech.
Evitts, Paul M; Searl, Jeff
2006-12-01
The purpose of this study was to compare listener processing demands when decoding alaryngeal compared to laryngeal speech. Fifty-six listeners were presented with single words produced by 1 proficient speaker from 5 different modes of speech: normal, tracheoesophageal (TE), esophageal (ES), electrolaryngeal (EL), and synthetic speech (SS). Cognitive processing load was indexed by listener reaction time (RT). To account for significant durational differences among the modes of speech, an RT ratio was calculated (stimulus duration divided by RT). Results indicated that the cognitive processing load was greater for ES and EL relative to normal speech. TE and normal speech did not differ in terms of RT ratio, suggesting fairly comparable cognitive demands placed on the listener. SS required greater cognitive processing load than normal and alaryngeal speech. The results are discussed relative to alaryngeal speech intelligibility and the role of the listener. Potential clinical applications and directions for future research are also presented.
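As a small illustration of the RT ratio measure described above (stimulus duration divided by reaction time, summarized per speech mode), here is a brief sketch; the trial values and column names are invented for illustration.

# Sketch: computing the RT ratio (stimulus duration / RT) per speech mode.
# The data values and column names are invented.
import pandas as pd

trials = pd.DataFrame({
    "mode": ["normal", "TE", "ES", "EL", "SS", "normal"],
    "stimulus_ms": [620, 700, 750, 810, 680, 590],
    "rt_ms": [810, 930, 1100, 1250, 1300, 790],
})
trials["rt_ratio"] = trials["stimulus_ms"] / trials["rt_ms"]
print(trials.groupby("mode")["rt_ratio"].mean())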
Should visual speech cues (speechreading) be considered when fitting hearing aids?
NASA Astrophysics Data System (ADS)
Grant, Ken
2002-05-01
When talker and listener are face-to-face, visual speech cues become an important part of the communication environment, and yet, these cues are seldom considered when designing hearing aids. Models of auditory-visual speech recognition highlight the importance of complementary versus redundant speech information for predicting auditory-visual recognition performance. Thus, for hearing aids to work optimally when visual speech cues are present, it is important to know whether the cues provided by amplification and the cues provided by speechreading complement each other. In this talk, data will be reviewed that show nonmonotonicity between auditory-alone speech recognition and auditory-visual speech recognition, suggesting that efforts designed solely to improve auditory-alone recognition may not always result in improved auditory-visual recognition. Data will also be presented showing that one of the most important speech cues for enhancing auditory-visual speech recognition performance, voicing, is often the cue that benefits least from amplification.
Yao, Bo; Belin, Pascal; Scheepers, Christoph
2012-04-15
In human communication, direct speech (e.g., Mary said, "I'm hungry") is perceived as more vivid than indirect speech (e.g., Mary said that she was hungry). This vividness distinction has previously been found to underlie silent reading of quotations: Using functional magnetic resonance imaging (fMRI), we found that direct speech elicited higher brain activity in the temporal voice areas (TVA) of the auditory cortex than indirect speech, consistent with an "inner voice" experience in reading direct speech. Here we show that listening to monotonously spoken direct versus indirect speech quotations also engenders differential TVA activity. This suggests that individuals engage in top-down simulations or imagery of enriched supra-segmental acoustic representations while listening to monotonous direct speech. The findings shed new light on the acoustic nature of the "inner voice" in understanding direct speech. Copyright © 2012 Elsevier Inc. All rights reserved.
Clear speech and lexical competition in younger and older adult listeners.
Van Engen, Kristin J
2017-08-01
This study investigated whether clear speech reduces the cognitive demands of lexical competition by crossing speaking style with lexical difficulty. Younger and older adults identified more words in clear versus conversational speech and more easy words than hard words. An initial analysis suggested that the effect of lexical difficulty was reduced in clear speech, but more detailed analyses within each age group showed this interaction was significant only for older adults. The results also showed that both groups improved over the course of the task and that clear speech was particularly helpful for individuals with poorer hearing: for younger adults, clear speech eliminated hearing-related differences that affected performance on conversational speech. For older adults, clear speech was generally more helpful to listeners with poorer hearing. These results suggest that clear speech affords perceptual benefits to all listeners and, for older adults, mitigates the cognitive challenge associated with identifying words with many phonological neighbors.
Recognizing intentions in infant-directed speech: evidence for universals.
Bryant, Gregory A; Barrett, H Clark
2007-08-01
In all languages studied to date, distinct prosodic contours characterize different intention categories of infant-directed (ID) speech. This vocal behavior likely exists universally as a species-typical trait, but little research has examined whether listeners can accurately recognize intentions in ID speech using only vocal cues, without access to semantic information. We recorded native-English-speaking mothers producing four intention categories of utterances (prohibition, approval, comfort, and attention) as both ID and adult-directed (AD) speech, and we then presented the utterances to Shuar adults (South American hunter-horticulturalists). Shuar subjects were able to reliably distinguish ID from AD speech and were able to reliably recognize the intention categories in both types of speech, although performance was significantly better with ID speech. This is the first demonstration that adult listeners in an indigenous, nonindustrialized, and nonliterate culture can accurately infer intentions from both ID speech and AD speech in a language they do not speak.
An integrated approach to improving noisy speech perception
NASA Astrophysics Data System (ADS)
Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail
2002-05-01
For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques removing and/or reducing various kinds of distortions and interference (primarily unmasking and normalization in time and frequency fields), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality ultimate results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to perform noisy speech records decoding for courts, law enforcement and emergency services, accident investigation bodies, etc.
Alpermann, Anke; Huber, Walter; Natke, Ulrich; Willmes, Klaus
2010-09-01
Improved fluency after stuttering therapy is usually measured by the percentage of stuttered syllables. However, outcome studies rarely evaluate the use of trained speech patterns that speakers use to manage stuttering. This study investigated whether the modified time interval analysis can distinguish between trained speech patterns, fluent speech, and stuttered speech. Seventeen German experts on stuttering judged a speech sample on two occasions. Speakers of the sample were stuttering adults, who were not undergoing therapy, as well as participants in a fluency shaping and a stuttering modification therapy. Results showed satisfactory inter-judge and intra-judge agreement above 80%. Intervals with trained speech patterns were identified as consistently as stuttered and fluent intervals. We discuss limitations of the study, as well as implications of our findings for the development of training for identification of trained speech patterns and future outcome studies. The reader will be able to (a) explain different methods to measure the use of trained speech patterns, (b) evaluate whether German experts are able to discriminate intervals with trained speech patterns reliably from fluent and stuttered intervals and (c) describe how the measurement of trained speech patterns can contribute to outcome studies.
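A toy sketch of how percent inter-judge agreement on interval labels (fluent, stuttered, or trained speech pattern) might be computed; the labels are invented and this is not the study's analysis procedure.

# Sketch: percent agreement between two judges on interval labels.
# The labels and interval count are invented for illustration.
import numpy as np

judge_a = np.array(["fluent", "trained", "stuttered", "fluent", "trained"])
judge_b = np.array(["fluent", "trained", "fluent", "fluent", "trained"])
agreement = 100.0 * np.mean(judge_a == judge_b)
print(f"inter-judge agreement: {agreement:.0f}%")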
A systematic review of treatment intensity in speech disorders.
Kaipa, Ramesh; Peterson, Abigail Marie
2016-12-01
Treatment intensity (sometimes referred to as "practice amount") has been well-investigated in learning non-speech tasks, but its role in treating speech disorders has not been largely analysed. This study reviewed the literature regarding treatment intensity in speech disorders. A systematic search was conducted in four databases using appropriate search terms. Seven articles from a total of 580 met the inclusion criteria. The speech disorders investigated included speech sound disorders, dysarthria, acquired apraxia of speech and childhood apraxia of speech. All seven studies were evaluated for their methodological quality, research phase and evidence level. Evidence level of reviewed studies ranged from moderate to strong. With regard to the research phase, only one study was considered to be phase III research, which corresponds to the controlled trial phase. The remaining studies were considered to be phase II research, which corresponds to the phase where magnitude of therapeutic effect is assessed. Results suggested that higher treatment intensity was favourable over lower treatment intensity of specific treatment technique(s) for treating childhood apraxia of speech and speech sound (phonological) disorders. Future research should incorporate randomised-controlled designs to establish optimal treatment intensity that is specific to each of the speech disorders.
Speech and nonspeech: What are we talking about?
Maas, Edwin
2017-08-01
Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.
Maas, Edwin; Mailend, Marja-Liisa
2012-10-01
The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Following a brief description of limitations of offline perceptual methods, we provide a narrative review of various types of RT paradigms from the (speech) motor programming and psycholinguistic literatures and their (thus far limited) application with AOS. On the basis of the review of the literature, we conclude that with careful consideration of potential challenges and caveats, RT approaches hold great promise to advance our understanding of AOS, in particular with respect to the speech planning processes that generate the speech signal before initiation. A deeper understanding of the nature and time course of speech planning and its disruptions in AOS may enhance diagnosis and treatment for AOS. Only a handful of published studies on apraxia of speech have used reaction time methods. However, these studies have provided deeper insight into speech planning impairments in AOS based on a variety of experimental paradigms.
Neural integration of iconic and unrelated coverbal gestures: a functional MRI study.
Green, Antonia; Straube, Benjamin; Weis, Susanne; Jansen, Andreas; Willmes, Klaus; Konrad, Kerstin; Kircher, Tilo
2009-10-01
Gestures are an important part of interpersonal communication, for example by illustrating physical properties of speech contents (e.g., "the ball is round"). The meaning of these so-called iconic gestures is strongly intertwined with speech. We investigated the neural correlates of the semantic integration for verbal and gestural information. Participants watched short videos of five speech and gesture conditions performed by an actor, including variation of language (familiar German vs. unfamiliar Russian), variation of gesture (iconic vs. unrelated), as well as isolated familiar language, while brain activation was measured using functional magnetic resonance imaging. For familiar speech with either of both gesture types contrasted to Russian speech-gesture pairs, activation increases were observed at the left temporo-occipital junction. Apart from this shared location, speech with iconic gestures exclusively engaged left occipital areas, whereas speech with unrelated gestures activated bilateral parietal and posterior temporal regions. Our results demonstrate that the processing of speech with speech-related versus speech-unrelated gestures occurs in two distinct but partly overlapping networks. The distinct processing streams (visual versus linguistic/spatial) are interpreted in terms of "auxiliary systems" allowing the integration of speech and gesture in the left temporo-occipital region.
Engaged listeners: shared neural processing of powerful political speeches
Häcker, Frank E. K.; Honey, Christopher J.; Hasson, Uri
2015-01-01
Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners’ brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. PMID:25653012
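The inter-subject correlation analysis referred to here can be sketched as a leave-one-out correlation of each listener's regional response time course with the average of the remaining listeners. The sketch below is a minimal illustration of that logic with toy data; the array shapes and condition labels are assumptions, not the study's data.

```python
import numpy as np

def inter_subject_correlation(responses):
    """responses: (n_subjects, n_timepoints) time courses for one brain region.
    Returns the mean leave-one-out correlation of each subject with the others' average."""
    n = responses.shape[0]
    corrs = []
    for i in range(n):
        others = np.delete(responses, i, axis=0).mean(axis=0)
        corrs.append(np.corrcoef(responses[i], others)[0, 1])
    return float(np.mean(corrs))

# Toy comparison: a condition with a strong shared component vs. a weakly shared one
rng = np.random.default_rng(0)
shared = rng.standard_normal(300)
powerful = 0.8 * shared + 0.6 * rng.standard_normal((20, 300))
weak = 0.3 * shared + 0.95 * rng.standard_normal((20, 300))
print(inter_subject_correlation(powerful) > inter_subject_correlation(weak))  # expected: True
```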
Hayes-Harb, Rachel; Smith, Bruce L.; Bent, Tessa; Bradlow, Ann R.
2009-01-01
This study investigated the intelligibility of native and Mandarin-accented English speech for native English and native Mandarin listeners. The word-final voicing contrast was considered (as in minimal pairs such as `cub' and `cup') in a forced-choice word identification task. For these particular talkers and listeners, there was evidence of an interlanguage speech intelligibility benefit for listeners (i.e., native Mandarin listeners were more accurate than native English listeners at identifying Mandarin-accented English words). However, there was no evidence of an interlanguage speech intelligibility benefit for talkers (i.e., native Mandarin listeners did not find Mandarin-accented English speech more intelligible than native English speech). When listener and talker phonological proficiency (operationalized as accentedness) was taken into account, it was found that the interlanguage speech intelligibility benefit for listeners held only for the low phonological proficiency listeners and low phonological proficiency speech. The intelligibility data were also considered in relation to various temporal-acoustic properties of native English and Mandarin-accented English speech in an effort to better understand the properties of speech that may contribute to the interlanguage speech intelligibility benefit. PMID:19606271
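Placeholder-free anchor note: no insertion intended here.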
Some articulatory details of emotional speech
NASA Astrophysics Data System (ADS)
Lee, Sungbok; Yildirim, Serdar; Bulut, Murtaza; Kazemzadeh, Abe; Narayanan, Shrikanth
2005-09-01
Differences in speech articulation among four emotion types (neutral, anger, sadness, and happiness) are investigated by analyzing tongue tip, jaw, and lip movement data collected from one male and one female speaker of American English. The data were collected using an electromagnetic articulography (EMA) system while subjects produced simulated emotional speech. Pitch, root-mean-square (rms) energy and the first three formants were estimated for vowel segments. For both speakers, angry speech exhibited the largest rms energy and largest articulatory activity in terms of displacement range and movement speed. Happy speech is characterized by the largest pitch variability. It has higher rms energy than neutral speech but articulatory activity is rather comparable to, or less than, neutral speech. That is, happy speech is more prominent in voicing activity than in articulation. Sad speech exhibits the longest sentence duration and lower rms energy. However, its articulatory activity is no less than neutral speech. Interestingly, for the male speaker, articulation for vowels in sad speech is consistently more peripheral (i.e., more forwarded displacements) when compared to other emotions. However, this does not hold for the female subject. These and other results will be discussed in detail with associated acoustics and perceived emotional qualities. [Work supported by NIH.]
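Two of the acoustic measures named above, rms energy and pitch, can be computed over a labelled vowel segment as sketched below. This is a minimal illustration (a crude autocorrelation pitch estimate on an assumed, already-segmented vowel), not the authors' EMA-based analysis pipeline.

```python
import numpy as np

def rms_energy(x):
    return float(np.sqrt(np.mean(x ** 2)))

def autocorr_f0(x, fs, fmin=75.0, fmax=400.0):
    """Crude autocorrelation pitch estimate for a voiced segment (illustrative only)."""
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag

# Hypothetical vowel segment: 200 ms of a 180 Hz voiced-like signal plus noise
fs = 16000
t = np.arange(int(0.2 * fs)) / fs
vowel = 0.3 * np.sin(2 * np.pi * 180 * t) + 0.05 * np.random.randn(t.size)
print(rms_energy(vowel), autocorr_f0(vowel, fs))  # energy and an estimate near 180 Hz
```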
Processing of speech signals for physical and sensory disabilities.
Levitt, H
1995-01-01
Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities. Images Fig. 4 PMID:7479816
Cooke, Martin; Aubanel, Vincent
2017-01-01
Algorithmic modifications to the durational structure of speech designed to avoid intervals of intense masking lead to increases in intelligibility, but the basis for such gains is not clear. The current study addressed the possibility that the reduced information load produced by speech rate slowing might explain some or all of the benefits of durational modifications. The study also investigated the influence of masker stationarity on the effectiveness of durational changes. Listeners identified keywords in sentences that had undergone linear and nonlinear speech rate changes resulting in overall temporal lengthening in the presence of stationary and fluctuating maskers. Relative to unmodified speech, a slower speech rate produced no intelligibility gains for the stationary masker, suggesting that a reduction in information rate does not underlie intelligibility benefits of durationally modified speech. However, both linear and nonlinear modifications led to substantial intelligibility increases in fluctuating noise. One possibility is that overall increases in speech duration provide no new phonetic information in stationary masking conditions, but that temporal fluctuations in the background increase the likelihood of glimpsing additional salient speech cues. Alternatively, listeners may have benefitted from an increase in the difference in speech rates between the target and background. PMID:28618803
NASA Astrophysics Data System (ADS)
Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki
2013-07-01
Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed. However, there is room for improvement, particularly in terms of sound quality. BCU speech is accompanied by a strong high-pitched tone and contains some distortion. In this study, the sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulations] and air-conducted (AC) speech was quantitatively evaluated using semantic differential and factor analysis. The results showed that all the types of BCU speech had higher metallic and lower esthetic factor scores than AC speech. On the other hand, transposed speech was generally closer to AC speech than the other types of BCU speech were; the transposed speech showed a higher powerfulness factor score than the other types of BCU speech and a higher esthetic factor score than DSB-SC speech. These results provide useful information for further development of the BCUHA.
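The two double-sideband schemes named above can be sketched in a few lines. The carrier frequency, sampling rate, and modulation index below are illustrative placeholders, not the device's actual parameters, and the "speech" signal is a stand-in tone.

```python
import numpy as np

fs = 192_000           # illustrative sampling rate high enough for an ultrasonic carrier
fc = 30_000            # illustrative carrier frequency (BCU carriers are ultrasonic)
t = np.arange(int(0.05 * fs)) / fs
speech = 0.5 * np.sin(2 * np.pi * 200 * t)   # stand-in for a band-limited speech signal
carrier = np.sin(2 * np.pi * fc * t)

dsb_sc = speech * carrier                     # double-sideband, suppressed carrier
m = 0.8                                       # modulation index (arbitrary here)
dsb_tc = (1.0 + m * speech / np.max(np.abs(speech))) * carrier  # carrier transmitted

# Transposed modulation is not shown; it shifts the speech band itself up to the carrier region.
```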
Lexical and phonological variability in preschool children with speech sound disorder.
Macrae, Toby; Tyler, Ann A; Lewis, Kerry E
2014-02-01
The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.
Effect of gap detection threshold on consistency of speech in children with speech sound disorder.
Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz
2017-02-01
The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups: typical speech, consistent speech disorder (CSD), and inconsistent speech disorder (ISD). The phonetic gap detection threshold test was used for this study, a validated test comprising six syllables with inter-stimulus intervals between 20 and 300 ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, caused by a high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.
Williams, Jason A
2012-06-01
The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.
Developing a Weighted Measure of Speech Sound Accuracy
Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.
2010-01-01
Purpose The purpose is to develop a system for numerically quantifying a speaker’s phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, we describe a system for differentially weighting speech sound errors based on various levels of phonetic accuracy with a Weighted Speech Sound Accuracy (WSSA) score. We then evaluate the reliability and validity of this measure. Method Phonetic transcriptions are analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy is compared to existing measures, is used to discriminate typical and disordered speech production, and is evaluated to determine whether it is sensitive to changes in phonetic accuracy over time. Results Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners’ judgments of severity of a child’s speech disorder. The measure separates children with and without speech sound disorders. WSSA scores also capture growth in phonetic accuracy in toddler’s speech over time. Conclusion Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children’s speech. PMID:20699344
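The weighting idea described in this abstract (scoring each target sound by how close the production is to the target, rather than simply correct/incorrect) can be illustrated with a toy scoring function. The error categories and weights below are made up for illustration; they are not the weights used in the published WSSA measure.

```python
# Hypothetical error weights: closer-to-target errors cost less.
# These numbers are illustrative only, not the published WSSA weighting scheme.
WEIGHTS = {"correct": 1.0, "distortion": 0.75, "substitution": 0.5,
           "omission": 0.0, "addition": 0.5}

def weighted_accuracy(coded_segments):
    """coded_segments: list of category labels, one per target consonant/vowel."""
    score = sum(WEIGHTS[c] for c in coded_segments)
    return 100.0 * score / len(coded_segments)

sample = ["correct", "correct", "distortion", "substitution", "omission", "correct"]
print(round(weighted_accuracy(sample), 1))   # 70.8 under these toy weights
```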
Duration, Pitch, and Loudness in Kunqu Opera Stage Speech.
Han, Qichao; Sundberg, Johan
2017-03-01
Kunqu is a special type of opera within the Chinese tradition with 600 years of history. In it, stage speech is used for the spoken dialogue. It is performed in the Mandarin of the Ming Dynasty and is a much more dominant part of the play than singing. Stage speech deviates considerably from normal conversational speech with respect to duration, loudness and pitch. This paper compares these properties in stage speech and conversational speech. A famous, highly experienced female singer performed stage speech and read the same lyrics in a conversational speech mode. Clear differences were found. As compared with conversational speech, stage speech had longer word and sentence duration and word duration was less variable. Average sound level was 16 dB higher. The mean fundamental frequency was also considerably higher and more varied. Within sentences, both loudness and fundamental frequency tended to vary according to a low-high-low pattern. Some of the findings fail to support current opinions regarding the characteristics of stage speech, and in this sense the study demonstrates the relevance of objective measurements in descriptions of vocal styles. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Infant-Directed Visual Prosody: Mothers’ Head Movements and Speech Acoustics
Smith, Nicholas A.; Strader, Heather L.
2014-01-01
Acoustical changes in the prosody of mothers' speech to infants are distinct and near universal. However, less is known about the visible properties of mothers' infant-directed (ID) speech, and their relation to speech acoustics. Mothers' head movements were tracked as they interacted with their infants using ID speech, and compared to movements accompanying their adult-directed (AD) speech. Movement measures along three dimensions of head translation and three axes of head rotation were calculated. Overall, more head movement was found for ID than AD speech, suggesting that mothers exaggerate their visual prosody in a manner analogous to the acoustical exaggerations in their speech. Regression analyses examined the relation between changing head position and changing acoustical pitch (F0) over time. Head movements and voice pitch were more strongly related in ID speech than in AD speech. When these relations were examined across time windows of different durations, stronger relations were observed for shorter time windows (< 5 sec). However, the particular form of these more local relations did not extend or generalize to longer time windows. This suggests that the multimodal correspondences in speech prosody are variable in form, and occur within limited time spans. PMID:25242907
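The window-size analysis described above can be sketched as correlations between a head-position signal and an F0 track computed within consecutive windows of a chosen duration. The study used regression over several tracked head dimensions; the sketch below simplifies to one head coordinate and plain correlation, with made-up sampling rate and signals.

```python
import numpy as np

def windowed_correlation(head_pos, f0, fs, win_s):
    """Mean Pearson correlation between head position and F0 in consecutive windows."""
    n = int(win_s * fs)
    corrs = []
    for start in range(0, len(f0) - n + 1, n):
        h, p = head_pos[start:start + n], f0[start:start + n]
        if np.std(h) > 0 and np.std(p) > 0:
            corrs.append(np.corrcoef(h, p)[0, 1])
    return float(np.mean(corrs)) if corrs else np.nan

# Toy signals sampled at 100 Hz: F0 partially follows a drifting head coordinate
fs = 100
rng = np.random.default_rng(1)
head = np.cumsum(rng.standard_normal(3000)) * 0.01
f0 = 220 + 30 * head + rng.standard_normal(3000) * 5
print(windowed_correlation(head, f0, fs, win_s=2.0))   # short windows
print(windowed_correlation(head, f0, fs, win_s=20.0))  # longer windows
```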
Processing of Speech Signals for Physical and Sensory Disabilities
NASA Astrophysics Data System (ADS)
Levitt, Harry
1995-10-01
Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.
VOT in speech-disordered individuals: History, theory, data, reminiscence
NASA Astrophysics Data System (ADS)
Weismer, Gary
2004-05-01
Forty years ago Lisker and Abramson published their landmark paper on VOT; the speech-research world has never been the same. The concept of VOT as a measure relevant to phonology, speech physiology, and speech perception made it a prime choice for scientists who saw an opportunity to exploit the techniques and analytic frameworks of ``speech science'' in the study of speech disorders. Modifications of VOT in speech disorders have been used to draw specific inferences concerning phonological representations, glottal-supraglottal timing, and speech intelligibility. This presentation will provide a review of work on VOT in speech disorders, including (among others) stuttering, hearing impairment, and neurogenic disorders. An attempt will be made to collect published data in summary graphic form, and to discuss their implications. Emphasis will be placed on how VOT has been used to inform theories of disordered speech production. I will close with some personal comments about the influence (unbeknowest to them) these two outstanding scientists had on me in the 1970s, when under the spell of their work I first became aware that the world of speech research did not start and end with moving parts.
Cortical activity patterns predict robust speech discrimination ability in noise
Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.
2012-01-01
The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
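The classifier idea described here, comparing a single-trial spatiotemporal activity pattern with the average pattern evoked by each known sound, can be sketched with a nearest-template rule. The array shapes, labels, and distance metric below are assumptions for illustration; the toy check classifies the training trials themselves, which the actual study would not do.

```python
import numpy as np

def build_templates(trials, labels):
    """trials: (n_trials, n_sites, n_timebins) activity patterns; labels: sound identity."""
    return {lab: trials[labels == lab].mean(axis=0) for lab in np.unique(labels)}

def classify(trial, templates):
    """Assign the single trial to the template with the smallest Euclidean distance."""
    dists = {lab: np.linalg.norm(trial - tmpl) for lab, tmpl in templates.items()}
    return min(dists, key=dists.get)

rng = np.random.default_rng(2)
labels = np.repeat(np.array(["sound_a", "sound_b"]), 50)
means = {"sound_a": rng.standard_normal((8, 40)), "sound_b": rng.standard_normal((8, 40))}
trials = np.stack([means[l] + 0.8 * rng.standard_normal((8, 40)) for l in labels])
templates = build_templates(trials, labels)
accuracy = np.mean([classify(tr, templates) == l for tr, l in zip(trials, labels)])
print(accuracy)   # well above chance for this toy, in-sample separation
```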
Speech fluency profile in Williams-Beuren syndrome: a preliminary study.
Rossi, Natalia Freitas; Souza, Deise Helena de; Moretti-Ferreira, Danilo; Giacheti, Célia Maria
2009-01-01
The speech fluency pattern attributed to individuals with Williams-Beuren syndrome (WBS) is supported by the effectiveness of the phonological loop. Some studies have reported the occurrence of speech disruptions caused by lexical and semantic deficits. However, the type and frequency of such speech disruptions have not been well elucidated. The aims were to determine the speech fluency profile of individuals with WBS and to compare the speech performance of these individuals to a control group matched by gender and mental age. Twelve subjects with Williams-Beuren syndrome, chronologically aged between 6.6 and 23.6 years and mental age ranging from 4.8 to 14.3 years, were evaluated. They were compared with another group consisting of 12 subjects with similar mental age and with no speech or learning difficulties. Speech fluency parameters were assessed according to the ABFW Language Test: type and frequency of speech disruptions and speech rate. The obtained results were compared between the groups. In comparison with individuals of similar mental age and typical speech and language development, the group with Williams-Beuren syndrome showed a greater percentage of speech discontinuity, and an increased frequency of common hesitations and word repetition. The speech fluency profile presented by individuals with WBS in this study suggests that the presence of disfluencies can be caused by deficits in the lexical, semantic, and syntactic processing of verbal information. The authors stress that further systematic investigations on the subject are warranted.
Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena
2015-06-01
The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.
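The predictor analysis summarized above (logistic regression on earlier measures, with discrimination reported as area under the ROC curve) can be sketched as follows. The feature names, coefficients, and data below are entirely hypothetical toy values, not the cohort's data or model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 1494
# Hypothetical 2-year predictors (e.g. speech, language, motor z-scores) and SSD outcome
X = rng.standard_normal((n, 3))
risk = 1 / (1 + np.exp(-(-3.3 - 0.8 * X[:, 0] - 0.5 * X[:, 1] - 0.3 * X[:, 2])))
y = rng.binomial(1, risk)   # low base rate, roughly comparable to a few percent prevalence

model = LogisticRegression().fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
print(round(auc, 2))        # in-sample discrimination of the simulated SSD outcome
```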
Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.
Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji
2012-12-01
Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.
Venezia, Jonathan H; Fillmore, Paul; Matchin, William; Isenberg, A Lisette; Hickok, Gregory; Fridriksson, Julius
2016-02-01
Sensory information is critical for movement control, both for defining the targets of actions and providing feedback during planning or ongoing movements. This holds for speech motor control as well, where both auditory and somatosensory information have been shown to play a key role. Recent clinical research demonstrates that individuals with severe speech production deficits can show a dramatic improvement in fluency during online mimicking of an audiovisual speech signal suggesting the existence of a visuomotor pathway for speech motor control. Here we used fMRI in healthy individuals to identify this new visuomotor circuit for speech production. Participants were asked to perceive and covertly rehearse nonsense syllable sequences presented auditorily, visually, or audiovisually. The motor act of rehearsal, which is prima facie the same whether or not it is cued with a visible talker, produced different patterns of sensorimotor activation when cued by visual or audiovisual speech (relative to auditory speech). In particular, a network of brain regions including the left posterior middle temporal gyrus and several frontoparietal sensorimotor areas activated more strongly during rehearsal cued by a visible talker versus rehearsal cued by auditory speech alone. Some of these brain regions responded exclusively to rehearsal cued by visual or audiovisual speech. This result has significant implications for models of speech motor control, for the treatment of speech output disorders, and for models of the role of speech gesture imitation in development. Copyright © 2015 Elsevier Inc. All rights reserved.
Lohmander, A; Willadsen, E; Persson, C; Henningsson, G; Bowden, M; Hutters, B
2009-07-01
The aim was to present the methodology for speech assessment in the Scandcleft project and to discuss issues from a pilot study. The design was a description of the methodology and a blinded test of the speech assessment. Speech samples and instructions for data collection and analysis for comparisons of speech outcomes across the five included languages were developed and tested. Participants and materials: randomly selected video recordings of 10 5-year-old children from each language (n = 50) were included in the project. Speech material consisted of test consonants in single words, connected speech, and syllable chains with nasal consonants. Five experienced speech and language pathologists participated as observers. Narrow phonetic transcription of test consonants was translated into cleft speech characteristics, ordinal scale rating of resonance, and perceived velopharyngeal closure (VPC). A velopharyngeal composite score (VPC-sum) was extrapolated from raw data. Intra-agreement comparisons were performed. The range of intra-agreement for consonant analysis was 53% to 89%, for hypernasality on high vowels in single words the range was 20% to 80%, and the agreement between the VPC-sum and the overall rating of VPC was 78%. Pooling data of speakers of different languages in the same trial and comparing speech outcome across trials seems possible if the assessment of speech concerns consonants and is confined to speech units that are phonetically similar across languages. Agreed conventions and rules are important. A composite variable for perceptual assessment of velopharyngeal function during speech seems usable, whereas the method for hypernasality evaluation requires further testing.
Venezia, Jonathan H.; Fillmore, Paul; Matchin, William; Isenberg, A. Lisette; Hickok, Gregory; Fridriksson, Julius
2015-01-01
Sensory information is critical for movement control, both for defining the targets of actions and providing feedback during planning or ongoing movements. This holds for speech motor control as well, where both auditory and somatosensory information have been shown to play a key role. Recent clinical research demonstrates that individuals with severe speech production deficits can show a dramatic improvement in fluency during online mimicking of an audiovisual speech signal suggesting the existence of a visuomotor pathway for speech motor control. Here we used fMRI in healthy individuals to identify this new visuomotor circuit for speech production. Participants were asked to perceive and covertly rehearse nonsense syllable sequences presented auditorily, visually, or audiovisually. The motor act of rehearsal, which is prima facie the same whether or not it is cued with a visible talker, produced different patterns of sensorimotor activation when cued by visual or audiovisual speech (relative to auditory speech). In particular, a network of brain regions including the left posterior middle temporal gyrus and several frontoparietal sensorimotor areas activated more strongly during rehearsal cued by a visible talker versus rehearsal cued by auditory speech alone. Some of these brain regions responded exclusively to rehearsal cued by visual or audiovisual speech. This result has significant implications for models of speech motor control, for the treatment of speech output disorders, and for models of the role of speech gesture imitation in development. PMID:26608242
Speech endpoint detection with non-language speech sounds for generic speech processing applications
NASA Astrophysics Data System (ADS)
McClain, Matthew; Romanowski, Brian
2009-05-01
Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
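The segment-classification idea described above, language-agnostic frame features scored against per-class hidden Markov models, can be sketched with the third-party hmmlearn package. The feature dimensionality, state count, and data below are assumptions standing in for the paper's feature set; the sketch simply picks whichever class model assigns the higher log-likelihood.

```python
import numpy as np
from hmmlearn import hmm   # third-party package; assumed available

def train_segment_model(feature_seqs, n_states=3):
    """Fit one Gaussian HMM on a list of (frames x features) arrays for a segment class."""
    X = np.concatenate(feature_seqs)
    lengths = [len(s) for s in feature_seqs]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def classify_segment(features, lss_model, nlss_model):
    """Label a segment by which class model gives the higher log-likelihood."""
    return "LSS" if lss_model.score(features) > nlss_model.score(features) else "NLSS"

# Toy frame features standing in for language-agnostic acoustic features (e.g. cepstra, energy)
rng = np.random.default_rng(4)
lss_train = [rng.standard_normal((60, 13)) + 1.0 for _ in range(20)]
nlss_train = [rng.standard_normal((25, 13)) - 1.0 for _ in range(20)]
lss_model = train_segment_model(lss_train)
nlss_model = train_segment_model(nlss_train)
print(classify_segment(rng.standard_normal((30, 13)) - 1.0, lss_model, nlss_model))  # "NLSS"
```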
Calandruccio, Lauren; Bradlow, Ann R; Dhar, Sumitrajit
2014-04-01
Masking release for an English sentence-recognition task in the presence of foreign-accented English speech compared with native-accented English speech was reported in Calandruccio et al (2010a). The masking release appeared to increase as the masker intelligibility decreased. However, it could not be ruled out that spectral differences between the speech maskers were influencing the significant differences observed. The purpose of the current experiment was to minimize spectral differences between speech maskers to determine how various amounts of linguistic information within competing speech affect masking release. A mixed-model design with within-subject (four two-talker speech maskers) and between-subject (listener group) factors was conducted. Speech maskers included native-accented English speech and high-intelligibility, moderate-intelligibility, and low-intelligibility Mandarin-accented English. Normalizing the long-term average speech spectra of the maskers to each other minimized spectral differences between the masker conditions. Three listener groups were tested, including monolingual English speakers with normal hearing, nonnative English speakers with normal hearing, and monolingual English speakers with hearing loss. The nonnative English speakers were from various native language backgrounds, not including Mandarin (or any other Chinese dialect). Listeners with hearing loss had symmetric mild sloping to moderate sensorineural hearing loss. Listeners were asked to repeat back sentences that were presented in the presence of four different two-talker speech maskers. Responses were scored based on the key words within the sentences (100 key words per masker condition). A mixed-model regression analysis was used to analyze the difference in performance scores between the masker conditions and listener groups. Monolingual English speakers with normal hearing benefited when the competing speech signal was foreign accented compared with native accented, allowing for improved speech recognition. Various levels of intelligibility across the foreign-accented speech maskers did not influence results. Neither the nonnative English-speaking listeners with normal hearing nor the monolingual English speakers with hearing loss benefited from masking release when the masker was changed from native-accented to foreign-accented English. Slight modifications between the target and the masker speech allowed monolingual English speakers with normal hearing to improve their recognition of native-accented English, even when the competing speech was highly intelligible. Further research is needed to determine which modifications within the competing speech signal caused the Mandarin-accented English to be less effective with respect to masking. Determining the influences within the competing speech that make it less effective as a masker or determining why monolingual normal-hearing listeners can take advantage of these differences could help improve speech recognition for those with hearing loss in the future. American Academy of Audiology.
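The long-term average speech spectrum (LTASS) normalization mentioned above can be sketched as an FFT-domain equalization that matches one masker's average spectrum to a reference. The block size and smoothing choices below are arbitrary, and this is a generic illustration rather than the authors' exact procedure.

```python
import numpy as np
from scipy.signal import stft, istft

def match_ltass(masker, reference, fs, nperseg=1024):
    """Filter `masker` so its long-term average spectrum approximates that of `reference`."""
    _, _, M = stft(masker, fs, nperseg=nperseg)
    _, _, R = stft(reference, fs, nperseg=nperseg)
    ltass_m = np.sqrt(np.mean(np.abs(M) ** 2, axis=1, keepdims=True))
    ltass_r = np.sqrt(np.mean(np.abs(R) ** 2, axis=1, keepdims=True))
    gain = ltass_r / np.maximum(ltass_m, 1e-12)   # per-frequency equalization gains
    _, matched = istft(M * gain, fs, nperseg=nperseg)
    return matched
```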
[Improving speech comprehension using a new cochlear implant speech processor].
Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A
2009-06-01
The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg sentences in the clinical setting S(0)N(CI), with speech signal at 0 degrees and noise lateral to the CI at 90 degrees. With the convincing findings from our evaluations of this multicenter study cohort, a trial with the Freedom 24 sound processor for all suitable CI users is recommended. For evaluating the benefits of a new processor, the comparative assessment paradigm used in our study design would be considered ideal for use with individual patients.
Cortical oscillations and entrainment in speech processing during working memory load.
Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten
2018-02-02
Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity (< 13 Hz). We found that increases in both types of WM load (background noise and n-back level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
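The linear mapping between the speech envelope and low-frequency cortical activity described above can be sketched as a lagged ridge regression from an envelope to a single EEG channel, scored by how well the fitted mapping reconstructs the channel. The lag range, regularization, and the use of a raw Hilbert envelope (with no explicit low-pass stage) are simplifying assumptions, not the study's analysis settings.

```python
import numpy as np
from scipy.signal import hilbert

def lagged_design(envelope, max_lag):
    """Stack time-lagged copies of the speech envelope (lags 0..max_lag samples)."""
    X = np.stack([np.roll(envelope, lag) for lag in range(max_lag + 1)], axis=1)
    X[:max_lag] = 0.0                      # zero out samples wrapped around by np.roll
    return X

def entrainment_score(speech, eeg_channel, max_lag=32, lam=1.0):
    """Ridge-regress EEG on the lagged envelope; return correlation of prediction with EEG."""
    env = np.abs(hilbert(speech))
    X = lagged_design(env - env.mean(), max_lag)
    y = eeg_channel - eeg_channel.mean()
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return float(np.corrcoef(X @ w, y)[0, 1])
```

Under the load manipulation described in the abstract, this score would be computed per condition and compared across noise levels and n-back levels.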
Significance of parametric spectral ratio methods in detection and recognition of whispered speech
NASA Astrophysics Data System (ADS)
Mathur, Arpit; Reddy, Shankar M.; Hegde, Rajesh M.
2012-12-01
In this article the significance of a new parametric spectral ratio method that can be used to detect whispered speech segments within normally phonated speech is described. Adaptation methods based on the maximum likelihood linear regression (MLLR) are then used to realize a mismatched train-test style speech recognition system. This proposed parametric spectral ratio method computes a ratio spectrum of the linear prediction (LP) and the minimum variance distortion-less response (MVDR) methods. The smoothed ratio spectrum is then used to detect whispered segments of speech within neutral speech segments effectively. The proposed LP-MVDR ratio method exhibits robustness at different SNRs as indicated by the whisper diarization experiments conducted on the CHAINS and the cell phone whispered speech corpus. The proposed method also performs reasonably better than the conventional methods for whisper detection. In order to integrate the proposed whisper detection method into a conventional speech recognition engine with minimal changes, adaptation methods based on the MLLR are used herein. The hidden Markov models corresponding to neutral mode speech are adapted to the whispered mode speech data in the whispered regions as detected by the proposed ratio method. The performance of this method is first evaluated on whispered speech data from the CHAINS corpus. The second set of experiments are conducted on the cell phone corpus of whispered speech. This corpus is collected using a set up that is used commercially for handling public transactions. The proposed whisper speech recognition system exhibits reasonably better performance when compared to several conventional methods. The results shown indicate the possibility of a whispered speech recognition system for cell phone based transactions.
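The ratio-spectrum idea described above can be sketched by computing an LP spectrum from autocorrelation-based linear prediction and an MVDR spectrum from the inverse autocorrelation matrix, then smoothing their ratio. The model order, frame handling, and smoothing below are arbitrary illustrative choices, not the paper's exact parameterization.

```python
import numpy as np
from scipy.linalg import toeplitz, solve_toeplitz

def autocorr(x, p):
    x = x - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + p]
    return r / len(x)

def lp_spectrum(x, p, nfft=512):
    r = autocorr(x, p)
    a = solve_toeplitz(r[:p], r[1:p + 1])        # LPC coefficients (autocorrelation method)
    err = r[0] - a @ r[1:p + 1]                  # prediction error power
    A = np.fft.rfft(np.concatenate(([1.0], -a)), nfft)
    return err / np.abs(A) ** 2

def mvdr_spectrum(x, p, nfft=512):
    r = autocorr(x, p)
    Rinv = np.linalg.inv(toeplitz(r))            # inverse (p+1)x(p+1) autocorrelation matrix
    freqs = 2 * np.pi * np.arange(nfft // 2 + 1) / nfft
    e = np.exp(-1j * np.outer(np.arange(p + 1), freqs))   # one steering vector per bin
    denom = np.real(np.einsum("kf,kl,lf->f", np.conj(e), Rinv, e))
    return 1.0 / denom

def lp_mvdr_ratio(x, p=12, nfft=512):
    """Smoothed LP/MVDR ratio spectrum used as a whisper/neutral indicator (sketch only)."""
    ratio = lp_spectrum(x, p, nfft) / np.maximum(mvdr_spectrum(x, p, nfft), 1e-12)
    return np.convolve(ratio, np.ones(5) / 5, mode="same")
```

In the framework described above, frames whose ratio spectrum exceeds a threshold pattern would be flagged as whispered, and the corresponding regions handed to MLLR-adapted acoustic models for recognition.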
Gao, Yayue; Wang, Qian; Ding, Yu; Wang, Changming; Li, Haifeng; Wu, Xihong; Qu, Tianshu; Li, Liang
2017-01-01
Human listeners are able to selectively attend to target speech in a noisy environment with multiple-people talking. Using recordings of scalp electroencephalogram (EEG), this study investigated how selective attention facilitates the cortical representation of target speech under a simulated “cocktail-party” listening condition with speech-on-speech masking. The result shows that the cortical representation of target-speech signals under the multiple-people talking condition was specifically improved by selective attention relative to the non-selective-attention listening condition, and the beta-band activity was most strongly modulated by selective attention. Moreover, measured with the Granger Causality value, selective attention to the single target speech in the mixed-speech complex enhanced the following four causal connectivities for the beta-band oscillation: the ones (1) from site FT7 to the right motor area, (2) from the left frontal area to the right motor area, (3) from the central frontal area to the right motor area, and (4) from the central frontal area to the right frontal area. However, the selective-attention-induced change in beta-band causal connectivity from the central frontal area to the right motor area, but not other beta-band causal connectivities, was significantly correlated with the selective-attention-induced change in the cortical beta-band representation of target speech. These findings suggest that under the “cocktail-party” listening condition, the beta-band oscillation in EEGs to target speech is specifically facilitated by selective attention to the target speech that is embedded in the mixed-speech complex. The selective attention-induced unmasking of target speech may be associated with the improved beta-band functional connectivity from the central frontal area to the right motor area, suggesting a top-down attentional modulation of the speech-motor process. PMID:28239344
A Networking of Community-Based Speech Therapy: Borabue District, Maha Sarakham.
Pumnum, Tawitree; Kum-ud, Weawta; Prathanee, Benjamas
2015-08-01
Most children with cleft lip and palate have articulation problems because of compensatory articulation disorders from velopharyngeal insufficiency. Theoretically, children should receive speech therapy from a speech and language pathologist (SLP) 1-2 sessions per week. In developing countries, particularly Thailand, most children cannot reach standard speech services because of the limited availability of speech services and SLPs. Networking of a Community-Based Speech Model might be an appropriate way to solve this problem. The aim was to study the effectiveness of a networking of the Khon Kaen University (KKU) Community-Based Speech Model, Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham, in decreasing the number of articulation errors for children with CLP. Six children with cleft lip and palate (CLP) who lived in Borabue and the surrounding district, Maha Sarakham, and had medical records in Srinagarind Hospital were included. They were assessed for pre- and post-articulation errors and provided speech therapy by an SLP via on-service teaching for a speech assistant (SA). Then, children with CLP received speech correction (SC) by the SA based on assignment, and caregivers practiced a home program for a year. Networking of Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham significantly reduced the number of post-articulation errors for 3 children with CLP. Factors affecting the results in the treatment of the other children were as follows: delayed speech and language development, hypernasality, and consistency of SC at the local hospital and home. A networking of the KKU Community-Based Speech Model, Non Thong Tambon Health Promotion Hospital, Borabue, and Maha Sarakham was a good way to enhance speech therapy in Thailand or other developing countries, which have limited speech services or a lack of professionals.
Park, Hyojin; Ince, Robin A A; Schyns, Philippe G; Thut, Gregor; Gross, Joachim
2015-06-15
Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C
1999-01-01
The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure, and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks were devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.
Neural evidence for predictive coding in auditory cortex during speech production.
Okada, Kayoko; Matchin, William; Hickok, Gregory
2018-02-01
Recent models of speech production suggest that motor commands generate forward predictions of the auditory consequences of those commands, that these forward predictions can be used to monitor and correct speech output, and that this system is hierarchically organized (Hickok, Houde, & Rong, Neuron, 69(3), 407-422, 2011; Pickering & Garrod, Behavioral and Brain Sciences, 36(4), 329-347, 2013). Recent psycholinguistic research has shown that internally generated speech (i.e., imagined speech) produces different types of errors than does overt speech (Oppenheim & Dell, Cognition, 106(1), 528-537, 2008; Oppenheim & Dell, Memory & Cognition, 38(8), 1147-1160, 2010). These studies suggest that articulated speech might involve predictive coding at additional levels relative to imagined speech. The current fMRI experiment investigates neural evidence of predictive coding in speech production. Twenty-four participants from UC Irvine were recruited for the study. Participants were scanned while they were visually presented with a sequence of words that they reproduced in sync with a visual metronome. On each trial, they were cued to either silently articulate the sequence or to imagine the sequence without overt articulation. As expected, silent articulation and imagined speech both engaged a left hemisphere network previously implicated in speech production. A contrast of silent articulation with imagined speech revealed greater activation for articulated speech in inferior frontal cortex, premotor cortex and the insula in the left hemisphere, consistent with greater articulatory load. Although both conditions were silent, this contrast also produced significantly greater activation in auditory cortex in dorsal superior temporal gyrus in both hemispheres. We suggest that these activations reflect forward predictions arising from additional levels of the perceptual/motor hierarchy that are involved in monitoring the intended speech output.
Park, Hyojin; Ince, Robin A.A.; Schyns, Philippe G.; Thut, Gregor; Gross, Joachim
2015-01-01
Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1, 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3, 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception. PMID:26028433
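The directed (top-down) coupling reported above rests on transfer entropy, an information-theoretic measure of how much the past of one signal improves prediction of another signal beyond the target's own past. The following is a minimal histogram-based sketch of that quantity for two one-dimensional signals; it assumes synthetic data and simple amplitude binning, not the MEG source time series or the estimator used by the authors.

```python
import numpy as np

def transfer_entropy(x, y, n_bins=8, lag=1):
    """Histogram-based transfer entropy TE(X -> Y) in bits.

    Estimates I(y_future ; x_past | y_past) by discretising each signal into
    n_bins amplitude bins. Toy estimator for illustration only; it is not the
    estimator used in the MEG study summarised above.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_future, y_past, x_past = y[lag:], y[:-lag], x[:-lag]

    def digitise(v):
        edges = np.linspace(v.min(), v.max(), n_bins + 1)
        return np.digitize(v, edges[1:-1])  # bin indices 0 .. n_bins-1

    yf, yp, xp = map(digitise, (y_future, y_past, x_past))

    # Joint probability table p(y_future, y_past, x_past) and its marginals.
    p_xyz, _ = np.histogramdd(np.stack([yf, yp, xp], axis=1),
                              bins=(n_bins, n_bins, n_bins),
                              range=[(-0.5, n_bins - 0.5)] * 3)
    p_xyz /= p_xyz.sum()
    p_yz = p_xyz.sum(axis=2)        # p(y_future, y_past)
    p_zx = p_xyz.sum(axis=0)        # p(y_past, x_past)
    p_z = p_xyz.sum(axis=(0, 2))    # p(y_past)

    te = 0.0
    for i in range(n_bins):
        for j in range(n_bins):
            for k in range(n_bins):
                p = p_xyz[i, j, k]
                if p > 0:
                    te += p * np.log2(p * p_z[j] / (p_yz[i, j] * p_zx[j, k]))
    return te

# y is a noisy, delayed copy of x, so TE(x -> y) should exceed TE(y -> x).
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
y = np.roll(x, 1) + 0.5 * rng.standard_normal(5000)
print(transfer_entropy(x, y), transfer_entropy(y, x))
```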
Gao, Yayue; Wang, Qian; Ding, Yu; Wang, Changming; Li, Haifeng; Wu, Xihong; Qu, Tianshu; Li, Liang
2017-01-01
Human listeners are able to selectively attend to target speech in a noisy environment with multiple people talking. Using recordings of scalp electroencephalogram (EEG), this study investigated how selective attention facilitates the cortical representation of target speech under a simulated "cocktail-party" listening condition with speech-on-speech masking. The results show that the cortical representation of target-speech signals under the multiple-people-talking condition was specifically improved by selective attention relative to the non-selective-attention listening condition, and the beta-band activity was most strongly modulated by selective attention. Moreover, measured with the Granger Causality value, selective attention to the single target speech in the mixed-speech complex enhanced the following four causal connectivities for the beta-band oscillation: (1) from site FT7 to the right motor area, (2) from the left frontal area to the right motor area, (3) from the central frontal area to the right motor area, and (4) from the central frontal area to the right frontal area. However, the selective-attention-induced change in beta-band causal connectivity from the central frontal area to the right motor area, but not other beta-band causal connectivities, was significantly correlated with the selective-attention-induced change in the cortical beta-band representation of target speech. These findings suggest that under the "cocktail-party" listening condition, the beta-band oscillation in EEGs to target speech is specifically facilitated by selective attention to the target speech that is embedded in the mixed-speech complex. The selective-attention-induced unmasking of target speech may be associated with the improved beta-band functional connectivity from the central frontal area to the right motor area, suggesting a top-down attentional modulation of the speech-motor process.
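The Granger Causality values referred to above quantify whether the past of one channel's band-limited activity helps predict another channel beyond that channel's own past. Below is a minimal pairwise sketch using ordinary least-squares autoregressive fits; the model order and any band-pass filtering are placeholders, not the pipeline used in this EEG study.

```python
import numpy as np

def granger_causality(x, y, order=5):
    """Pairwise Granger causality from x to y, as ln(var_restricted / var_full).

    Fits two least-squares autoregressive models for y: one on y's own past
    (restricted) and one on the past of both y and x (full). A positive value
    means x's past helps predict y. Toy estimator; the published analysis
    used its own model order, filtering and statistical thresholding.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    target = y[order:]
    lags_y = np.column_stack([y[order - k:n - k] for k in range(1, order + 1)])
    lags_x = np.column_stack([x[order - k:n - k] for k in range(1, order + 1)])

    def residual_variance(regressors):
        design = np.column_stack([np.ones(len(target)), regressors])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.var(target - design @ beta)

    var_restricted = residual_variance(lags_y)
    var_full = residual_variance(np.column_stack([lags_y, lags_x]))
    return np.log(var_restricted / var_full)
```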
Speech training alters tone frequency tuning in rat primary auditory cortex
Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Kilgard, Michael P.
2013-01-01
Previous studies in both humans and animals have documented improved performance following discrimination training. This enhanced performance is often associated with cortical response changes. In this study, we tested the hypothesis that long-term speech training on multiple tasks can improve primary auditory cortex (A1) responses compared to rats trained on a single speech discrimination task or experimentally naïve rats. Specifically, we compared the percent of A1 responding to trained sounds, the responses to both trained and untrained sounds, receptive field properties of A1 neurons, and the neural discrimination of pairs of speech sounds in speech trained and naïve rats. Speech training led to accurate discrimination of consonant and vowel sounds, but did not enhance A1 response strength or the neural discrimination of these sounds. Speech training altered tone responses in rats trained on six speech discrimination tasks but not in rats trained on a single speech discrimination task. Extensive speech training resulted in broader frequency tuning, shorter onset latencies, a decreased driven response to tones, and caused a shift in the frequency map to favor tones in the range where speech sounds are the loudest. Both the number of trained tasks and the number of days of training strongly predict the percent of A1 responding to a low frequency tone. Rats trained on a single speech discrimination task performed less accurately than rats trained on multiple tasks and did not exhibit A1 response changes. Our results indicate that extensive speech training can reorganize the A1 frequency map, which may have downstream consequences on speech sound processing. PMID:24344364
Pulse Vector-Excitation Speech Encoder
NASA Technical Reports Server (NTRS)
Davidson, Grant; Gersho, Allen
1989-01-01
Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high-quality reconstructed speech, yet requires less computation than comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.
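Coders in this family work by analysis-by-synthesis: each candidate excitation vector is passed through a vocal-tract (LPC synthesis) filter, and the candidate that minimises the coding error against the input frame is transmitted. A stripped-down version of that search is sketched below; the perceptual error weighting and the specific codebook structure of PVXC are omitted, and all names and parameter values are illustrative.

```python
import numpy as np
from scipy.signal import lfilter

def select_excitation(frame, lpc_coeffs, codebook):
    """Pick the codebook excitation that best reconstructs one speech frame.

    frame      : target speech samples for this frame
    lpc_coeffs : predictor coefficients a1..ap of the vocal-tract model A(z)
    codebook   : 2-D array, one candidate excitation vector per row
    Each candidate is filtered through 1/A(z) and compared with the frame
    using a plain squared error; the actual PVXC coder uses a perceptually
    weighted error criterion, which is omitted here.
    """
    a = np.concatenate(([1.0], np.asarray(lpc_coeffs, dtype=float)))
    best_index, best_gain, best_err = -1, 0.0, np.inf
    for index, excitation in enumerate(codebook):
        synth = lfilter([1.0], a, excitation)            # excitation through 1/A(z)
        gain = synth @ frame / (synth @ synth + 1e-12)   # optimal scalar gain
        err = np.sum((frame - gain * synth) ** 2)
        if err < best_err:
            best_index, best_gain, best_err = index, gain, err
    return best_index, best_gain, best_err
```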
Versatile simulation testbed for rotorcraft speech I/O system design
NASA Technical Reports Server (NTRS)
Simpson, Carol A.
1986-01-01
A versatile simulation testbed for the design of a rotorcraft speech I/O system is described in detail. The testbed will be used to evaluate alternative implementations of synthesized speech displays and speech recognition controls for the next generation of Army helicopters including the LHX. The message delivery logic is discussed as well as the message structure, the speech recognizer command structure and features, feedback from the recognizer, and random access to controls via speech command.
Degraded neural and behavioral processing of speech sounds in a rat model of Rett syndrome
Engineer, Crystal T.; Rahebi, Kimiya C.; Borland, Michael S.; Buell, Elizabeth P.; Centanni, Tracy M.; Fink, Melyssa K.; Im, Kwok W.; Wilson, Linda G.; Kilgard, Michael P.
2015-01-01
Individuals with Rett syndrome have greatly impaired speech and language abilities. Auditory brainstem responses to sounds are normal, but cortical responses are highly abnormal. In this study, we used the novel rat Mecp2 knockout model of Rett syndrome to document the neural and behavioral processing of speech sounds. We hypothesized that both speech discrimination ability and the neural response to speech sounds would be impaired in Mecp2 rats. We expected that extensive speech training would improve speech discrimination ability and the cortical response to speech sounds. Our results reveal that speech responses across all four auditory cortex fields of Mecp2 rats were hyperexcitable, responded slower, and were less able to follow rapidly presented sounds. While Mecp2 rats could accurately perform consonant and vowel discrimination tasks in quiet, they were significantly impaired at speech sound discrimination in background noise. Extensive speech training improved discrimination ability. Training shifted cortical responses in both Mecp2 and control rats to favor the onset of speech sounds. While training increased the response to low frequency sounds in control rats, the opposite occurred in Mecp2 rats. Although neural coding and plasticity are abnormal in the rat model of Rett syndrome, extensive therapy appears to be effective. These findings may help to explain some aspects of communication deficits in Rett syndrome and suggest that extensive rehabilitation therapy might prove beneficial. PMID:26321676
Ng, Elaine Hoi Ning; Rudner, Mary; Lunner, Thomas; Rönnberg, Jerker
2015-01-01
A hearing aid noise reduction (NR) algorithm reduces the adverse effect of competing speech on memory for target speech in individuals with hearing impairment who have high working memory capacity. In the present study, we investigated whether the positive effect of NR could be extended to individuals with low working memory capacity, as well as how NR influences recall performance for target native speech when the masker language is non-native. A sentence-final word identification and recall (SWIR) test was administered to 26 experienced hearing aid users. In this test, target spoken native language (Swedish) sentence lists were presented in competing native (Swedish) or foreign (Cantonese) speech with or without a binary-masking NR algorithm. After each sentence list, free recall of sentence-final words was prompted. Working memory capacity was measured using a reading span (RS) test. Recall performance was associated with RS. However, the benefit obtained from NR was not associated with RS. Recall performance was more disrupted by native than foreign speech babble, and NR improved recall performance in native but not foreign competing speech. Noise reduction improved memory for speech heard in competing speech for hearing aid users. Memory for native speech was more disrupted by native babble than foreign babble, but the disruptive effect of native speech babble was reduced to that of foreign babble when NR was applied.
Ward, Roslyn; Leitão, Suze; Strauss, Geoff
2014-08-01
This study evaluates perceptual changes in speech production accuracy in six children (3-11 years) with moderate-to-severe speech impairment associated with cerebral palsy before, during, and after participation in a motor-speech intervention program (Prompts for Restructuring Oral Muscular Phonetic Targets). An A1BCA2 single-subject research design was implemented. Subsequent to the baseline phase (phase A1), phase B targeted each participant's first intervention priority on the PROMPT motor-speech hierarchy. Phase C then targeted one level higher. Weekly speech probes were administered, containing trained and untrained words at the two levels of intervention, plus an additional level that served as a control goal. The speech probes were analysed for motor-speech movement parameters and perceptual accuracy. Analysis of the speech probe data showed that all participants recorded a statistically significant change. Between phases A1-B and B-C, 6/6 and 4/6 participants, respectively, recorded a statistically significant increase in performance level on the motor-speech movement patterns targeted during that phase of intervention. The preliminary data presented in this study contribute evidence supporting the use of a treatment approach aligned with dynamic systems theory to improve the motor-speech movement patterns and speech production accuracy in children with cerebral palsy.
An articulatorily constrained, maximum entropy approach to speech recognition and speech coding
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.
Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples--presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.
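For readers unfamiliar with the HMM machinery discussed above, the core recognition computation is the forward recursion, which scores an observation sequence under a given model. A log-domain sketch follows; it does not include the articulatory smoothness constraints the author proposes, and the array names are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def forward_log_likelihood(log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under a discrete-state HMM.

    log_pi : (N,)   log initial state probabilities
    log_A  : (N, N) log transition probabilities, state i -> state j
    log_B  : (T, N) log observation probabilities for each frame and state
    Standard forward recursion in the log domain; the articulatory
    constraints proposed in the abstract are not modelled here.
    """
    T, _ = log_B.shape
    alpha = log_pi + log_B[0]                     # initialise with frame 0
    for t in range(1, T):
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_B[t]
    return logsumexp(alpha)                       # marginalise over the final state
```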
Evaluation of the importance of time-frequency contributions to speech intelligibility in noise
Yu, Chengzhu; Wójcicki, Kamil K.; Loizou, Philipos C.; Hansen, John H. L.; Johnson, Michael T.
2014-01-01
Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
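The mask-based measure proposed above combines a hit rate over speech-present time-frequency units with a false-alarm rate over speech-absent units, weighting each unit by the loudness of its target or masker component. The sketch below illustrates that idea; the exact weighting and normalisation of the published measure may differ, and all array names are placeholders.

```python
import numpy as np

def loudness_weighted_hit_false(ref_mask, est_mask, target_loudness, masker_loudness):
    """Loudness-weighted hit minus false-alarm score for a binary T-F mask.

    ref_mask, est_mask : binary arrays over T-F units (1 = unit kept as speech)
    target_loudness    : per-unit loudness of the target component
    masker_loudness    : per-unit loudness of the masker component
    Hits (speech-present units correctly kept) are weighted by the loudness of
    their target component; false alarms (speech-absent units wrongly kept) by
    the loudness of their masker component.
    """
    ref = np.asarray(ref_mask).astype(bool)
    est = np.asarray(est_mask).astype(bool)
    tl = np.asarray(target_loudness, dtype=float)
    ml = np.asarray(masker_loudness, dtype=float)
    weighted_hit = tl[ref & est].sum() / (tl[ref].sum() + 1e-12)
    weighted_false_alarm = ml[~ref & est].sum() / (ml[~ref].sum() + 1e-12)
    return weighted_hit - weighted_false_alarm
```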
Multimodal processing of emotional information in 9-month-old infants I: emotional faces and voices.
Otte, R A; Donkers, F C L; Braeken, M A K A; Van den Bergh, B R H
2015-04-01
Making sense of emotions manifesting in human voice is an important social skill which is influenced by emotions in other modalities, such as that of the corresponding face. Although processing emotional information from voices and faces simultaneously has been studied in adults, little is known about the neural mechanisms underlying the development of this ability in infancy. Here we investigated multimodal processing of fearful and happy face/voice pairs using event-related potential (ERP) measures in a group of 84 9-month-olds. Infants were presented with emotional vocalisations (fearful/happy) preceded by the same or a different facial expression (fearful/happy). The ERP data revealed that the processing of emotional information appearing in human voice was modulated by the emotional expression appearing on the corresponding face: Infants responded with larger auditory ERPs after fearful compared to happy facial primes. This finding suggests that infants dedicate more processing capacities to potentially threatening than to non-threatening stimuli. Copyright © 2014 Elsevier Inc. All rights reserved.
Patterns of call communication between group-housed zebra finches change during the breeding cycle
Gill, Lisa F; Goymann, Wolfgang; Ter Maat, Andries; Gahr, Manfred
2015-01-01
Vocal signals such as calls play a crucial role for survival and successful reproduction, especially in group-living animals. However, call interactions and call dynamics within groups remain largely unexplored because their relation to relevant contexts or life-history stages could not be studied with individual-level resolution. Using on-bird microphone transmitters, we recorded the vocalisations of individual zebra finches (Taeniopygia guttata) behaving freely in social groups, while females and males previously unknown to each other passed through different stages of the breeding cycle. As birds formed pairs and shifted their reproductive status, their call repertoire composition changed. The recordings revealed that calls occurred non-randomly in fine-tuned vocal interactions and decreased within groups while pair-specific patterns emerged. Call-type combinations of vocal interactions changed within pairs and were associated with successful egg-laying, highlighting a potential fitness relevance of calling dynamics in communication systems. DOI: http://dx.doi.org/10.7554/eLife.07770.001 PMID:26441403
Le Roux, Aliza; Cherry, Michael I; Manser, Marta B
2009-05-01
We describe the vocal repertoire of a facultatively social carnivore, the yellow mongoose, Cynictis penicillata. Using a combination of close-range observations, recordings and experiments with simulated predators, we were able to obtain clear descriptions of call structure and function for a wide range of calls used by this herpestid. The vocal repertoire of the yellow mongooses comprised ten call types, half of which were used in appeasing or fearful contexts and half in aggressive interactions. Data from this study suggest that the yellow mongoose uses an urgency-based alarm calling system, indicating high and low urgency through two distinct call types. Compared to solitary mongooses, the yellow mongoose has a large proportion of 'friendly' vocalisations that enhance group cohesion, but its vocal repertoire is smaller and less context-specific than those of obligate social species. This study of the vocal repertoire of the yellow mongoose is, to our knowledge, the most complete to have been conducted on a facultatively social species in its natural habitat.
NASA Astrophysics Data System (ADS)
Le Roux, Aliza; Cherry, Michael I.; Manser, Marta B.
2009-05-01
We describe the vocal repertoire of a facultatively social carnivore, the yellow mongoose, Cynictis penicillata. Using a combination of close-range observations, recordings and experiments with simulated predators, we were able to obtain clear descriptions of call structure and function for a wide range of calls used by this herpestid. The vocal repertoire of the yellow mongooses comprised ten call types, half of which were used in appeasing or fearful contexts and half in aggressive interactions. Data from this study suggest that the yellow mongoose uses an urgency-based alarm calling system, indicating high and low urgency through two distinct call types. Compared to solitary mongooses, the yellow mongoose has a large proportion of ‘friendly’ vocalisations that enhance group cohesion, but its vocal repertoire is smaller and less context-specific than those of obligate social species. This study of the vocal repertoire of the yellow mongoose is, to our knowledge, the most complete to have been conducted on a facultatively social species in its natural habitat.
Reminiscence in dementia: a concept analysis.
Dempsey, Laura; Murphy, Kathy; Cooney, Adeline; Casey, Dympna; O'Shea, Eamon; Devane, Declan; Jordan, Fionnuala; Hunter, Andrew
2014-03-01
This paper reports an analysis of the concept of reminiscence in dementia and highlights its use as a therapeutic intervention for individuals with dementia. No single definition of reminiscence exists in the healthcare literature; however, the definitions offered have similar components. The term life review is commonly used when discussing reminiscence; however, the two terms differ in their goals, theory base and content. This concept analysis identified reminiscence as a process which occurs in stages, involving the recalling of early life events and interaction between individuals. The antecedents of reminiscence are age, life transitions, attention span, ability to recall, ability to vocalise and stressful situations. Reminiscence can lead to positive mental health, enhanced self-esteem and improved communication skills. It also facilitates preparation for death, increases interaction between people, prepares for the future and evaluates a past life. Reminiscence therapy is used extensively in dementia care, and evidence shows that when used effectively it helps individuals retain a sense of self-worth, identity and individuality.
Bourguet, Cécile; Deiss, Véronique; Tannugi, Carole Cohen; Terlouw, E M Claudia
2011-05-01
Behavioural, physiological and metabolic reactions of cattle to handling and slaughter procedures were evaluated in a commercial abattoir, from arrival until slaughter. Different genders or breeds were not subjected to the same procedures due to abattoir equipment or organisational aspects of the abattoir. Reactions to similar slaughter procedures varied according to animal characteristics and could have consequences for subsequent handling procedures. Factors that appeared to cause handling problems and vocalisation were excessive pressure during restraint, and distractions in the corridor such as noise, darkness, seeing people and activity. Post-mortem muscle metabolism depended on slaughter procedures. Following stunning or halal slaughter, some animals showed head rising movements despite the abolition of the corneal reflex, suggesting that head rising is not always indicative of consciousness. Overall, this study presents concrete data on how different types of cattle may react to slaughter procedures with a direct interest for the abattoir itself but also for scientific purposes. Copyright © 2010. Published by Elsevier Ltd.
Vocal tract length and acoustics of vocalization in the domestic dog (Canis familiaris).
Riede, T; Fitch, T
1999-10-01
The physical nature of the vocal tract results in the production of formants during vocalisation. In some animals (including humans), receivers can derive information (such as body size) about sender characteristics on the basis of formant characteristics. Domestication and selective breeding have resulted in a high variability in head size and shape in the dog (Canis familiaris), suggesting that there might be large differences in the vocal tract length, which could cause formant behaviour to affect interbreed communication. Lateral radiographs were made of dogs from several breeds ranging in size from a Yorkshire terrier (2.5 kg) to a German shepherd (50 kg) and were used to measure vocal tract length. In addition, we recorded an acoustic signal (growling) from some dogs. Significant correlations were found between vocal tract length, body mass and formant dispersion, suggesting that formant dispersion can deliver information about the body size of the vocalizer. Because of the low correlation between vocal tract length and the first formant, we predict a non-uniform vocal tract shape.
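Formant dispersion, as used above, is simply the average spacing between adjacent formant frequencies; under the textbook approximation of a uniform tube closed at the glottis, that spacing is c/(2L), so vocal tract length can be read off from it. A small sketch follows, using this uniform-tube assumption rather than the regression fitted in the paper.

```python
import numpy as np

def formant_dispersion(formants_hz):
    """Mean spacing between adjacent formant frequencies (Hz)."""
    f = np.sort(np.asarray(formants_hz, dtype=float))
    return np.mean(np.diff(f))

def vocal_tract_length_cm(formants_hz, c=35000.0):
    """Vocal tract length from formant dispersion, uniform-tube model.

    Assumes a uniform tube closed at the glottis, where adjacent formants are
    spaced by c / (2 L); c is the speed of sound in cm/s. This is the standard
    textbook approximation, not the regression reported in the paper.
    """
    return c / (2.0 * formant_dispersion(formants_hz))

# Formants near 500, 1500 and 2500 Hz give about 17.5 cm, a human-like length;
# a shorter vocal tract spreads the formants further apart.
print(vocal_tract_length_cm([500, 1500, 2500]))
```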
The physics of birdsong production
NASA Astrophysics Data System (ADS)
Mindlin, G. B.
2013-04-01
Human babies need to learn how to talk. The need for a tutor to achieve acceptable vocalisations is a feature that we share with a few species in the animal kingdom. Among those are songbirds, which account for nearly half of the known bird species. For that reason, songbirds have become an ideal animal model to study how a brain reconfigures itself during the process of learning a complex task. In the last few years, neuroscientists have invested important resources in order to unveil the neural architecture involved in birdsong production and learning. Yet behaviour emerges from the interaction between a nervous system, a peripheral biomechanical architecture and the environment, and therefore its study should be just as integrated. In particular, the physical study of the avian vocal organ can help to elucidate which features found in the song of birds are under direct control of specific neural instructions and which emerge from the biomechanics involved in its generation. This work describes recent advances in the study of the physics of birdsong production.
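Low-dimensional models of the avian vocal organ of the kind discussed above typically describe the oscillating labia with a nonlinear oscillator whose restoring force tracks labial tension and whose gain tracks air-sac pressure. The toy integrator below is in that spirit only; the equations, parameter values and scaling are illustrative assumptions, not the specific model used in this work.

```python
import numpy as np

def labial_oscillation(tension, pressure, nonlinearity=1e8, dt=1e-5, n_steps=20000):
    """Toy van der Pol-style model of labial motion in a songbird syrinx.

    dx/dt = v
    dv/dt = -tension * x - (nonlinearity * x**2 - pressure) * v
    For small displacement the pressure term pumps energy in; the cubic damping
    limits the amplitude, giving a self-sustained oscillation whose frequency
    grows with tension. Purely illustrative parameterisation, integrated with
    a simple Euler step.
    """
    x, v = 1e-3, 0.0
    trace = np.empty(n_steps)
    for i in range(n_steps):
        dx = v
        dv = -tension * x - (nonlinearity * x * x - pressure) * v
        x += dt * dx
        v += dt * dv
        trace[i] = x
    return trace

# Higher tension -> faster oscillation (higher fundamental frequency).
low = labial_oscillation(tension=1.0e8, pressure=2.0e4)
high = labial_oscillation(tension=4.0e8, pressure=2.0e4)
```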
Audio-Visual Speech Perception Is Special
ERIC Educational Resources Information Center
Tuomainen, J.; Andersen, T.S.; Tiippana, K.; Sams, M.
2005-01-01
In face-to-face conversation speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and…
Phonetic Recalibration Only Occurs in Speech Mode
ERIC Educational Resources Information Center
Vroomen, Jean; Baart, Martijn
2009-01-01
Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…
Infant Perception of Atypical Speech Signals
ERIC Educational Resources Information Center
Vouloumanos, Athena; Gelfand, Hanna M.
2013-01-01
The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…
Intensive Speech and Language Therapy for Older Children with Cerebral Palsy: A Systems Approach
ERIC Educational Resources Information Center
Pennington, Lindsay; Miller, Nick; Robson, Sheila; Steen, Nick
2010-01-01
Aim: To investigate whether speech therapy using a speech systems approach to controlling breath support, phonation, and speech rate can increase the speech intelligibility of children with dysarthria and cerebral palsy (CP). Method: Sixteen children with dysarthria and CP participated in a modified time series design. Group characteristics were…
Speech Sound Disorders in a Community Study of Preschool Children
ERIC Educational Resources Information Center
McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane
2013-01-01
Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…
Status Report on Speech Research, No. 27, July-September 1971.
ERIC Educational Resources Information Center
Haskins Labs., New Haven, CT.
This report contains fourteen papers on a wide range of current topics and experiments in speech research, ranging from the relationship between speech and reading to questions of memory and perception of speech sounds. The following papers are included: "How Is Language Conveyed by Speech?"; "Reading, the Linguistic Process, and Linguistic…
Monkey Lipsmacking Develops Like the Human Speech Rhythm
ERIC Educational Resources Information Center
Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.
2012-01-01
Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…
Increasing Parental Involvement in Speech-Sound Remediation
ERIC Educational Resources Information Center
Roberts, Micah Renee Ferguson
2014-01-01
Speech therapy homework is a key component of a successful speech therapy program, increasing carryover of learned speech sounds. Poor return rate of homework assigned, with a lack of parental involvement, is a problem. The purpose of this project study was to examine what may increase parental participation in speech therapy homework. Guided by…
Analysis of False Starts in Spontaneous Speech.
ERIC Educational Resources Information Center
O'Shaughnessy, Douglas
A primary difference between spontaneous speech and read speech concerns the use of false starts, where a speaker interrupts the flow of speech to restart his or her utterance. A study examined the acoustic aspects of such restarts in a widely-used speech database, examining approximately 1000 utterances, about 10% of which contained a restart.…
Foundational Tuning: How Infants' Attention to Speech Predicts Language Development
ERIC Educational Resources Information Center
Vouloumanos, Athena; Curtin, Suzanne
2014-01-01
Orienting biases for speech may provide a foundation for language development. Although human infants show a bias for listening to speech from birth, the relation of a speech bias to later language development has not been established. Here, we examine whether infants' attention to speech directly predicts expressive vocabulary. Infants…
ERIC Educational Resources Information Center
Schaadt, Gesa; Männel, Claudia; van der Meer, Elke; Pannekamp, Ann; Friederici, Angela D.
2016-01-01
Successful communication in everyday life crucially involves the processing of auditory and visual components of speech. Viewing our interlocutor and processing visual components of speech facilitates speech processing by triggering auditory processing. Auditory phoneme processing, analyzed by event-related brain potentials (ERP), has been shown…
Speech in the Junior High School. Michigan Speech Association Curriculum Guide Series, No. 4.
ERIC Educational Resources Information Center
Herman, Deldee; Ratliffe, Sharon
Designed to provide the student with experience in oral communication, this curriculum guide presents a one-semester speech course for junior high school students with "normal" rather than defective speech. The eight units cover speech in social interaction; group discussion and business meetings; demonstrations and reports; creative dramatics;…
Philosophical Perspectives on Values and Ethics in Speech Communication.
ERIC Educational Resources Information Center
Becker, Carl B.
There are three very different concerns of communication ethics: (1) applied speech ethics, (2) ethical rules or standards, and (3) metaethical issues. In the area of applied speech ethics, communications theorists attempt to determine whether a speech act is moral or immoral by focusing on the content and effects of specific speech acts. Specific…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-23
...] Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities... for telecommunications relay services (TRS) by eliminating standards for Internet-based relay services... comments, identified by CG Docket No. 03-123, by any of the following methods: Electronic Filers: Comments...
Tracking Change in Children with Severe and Persisting Speech Difficulties
ERIC Educational Resources Information Center
Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill
2013-01-01
Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…