Sample records for vocal affect recognition

  1. Vocal Affect Recognition and Psychopathy: Converging Findings Across Traditional and Cluster Analytic Approaches to Assessing the Construct

    PubMed Central

    Bagley, Amy D.; Abramowitz, Carolyn S.; Kosson, David S.

    2010-01-01

    Deficits in emotion processing have been widely reported to be central to psychopathy. However, few prior studies have examined vocal affect recognition in psychopaths, and these studies suffer from significant methodological limitations. Moreover, prior studies have yielded conflicting findings regarding the specificity of psychopaths’ affect recognition deficits. This study examined vocal affect recognition in 107 male inmates under conditions requiring isolated prosodic vs. semantic analysis of affective cues and compared subgroups of offenders identified via cluster analysis on vocal affect recognition. Psychopaths demonstrated deficits in vocal affect recognition under conditions requiring use of semantic cues and conditions requiring use of prosodic cues. Moreover, both primary and secondary psychopaths exhibited relatively similar emotional deficits in the semantic analysis condition compared to nonpsychopathic control participants. This study demonstrates that psychopaths’ vocal affect recognition deficits are not due to methodological limitations of previous studies and provides preliminary evidence that primary and secondary psychopaths exhibit generally similar deficits in vocal affect recognition. PMID:19413412

  2. Relationships between alexithymia, affect recognition, and empathy after traumatic brain injury.

    PubMed

    Neumann, Dawn; Zupan, Barbra; Malec, James F; Hammond, Flora

    2014-01-01

    To determine (1) alexithymia, affect recognition, and empathy differences in participants with and without traumatic brain injury (TBI); (2) the amount of affect recognition variance explained by alexithymia; and (3) the amount of empathy variance explained by alexithymia and affect recognition. Participants were 60 adults with moderate-to-severe TBI and 60 age- and gender-matched controls. Participants were evaluated for alexithymia (difficulty identifying feelings, difficulty describing feelings, and externally-oriented thinking); facial and vocal affect recognition; and affective and cognitive empathy (empathic concern and perspective-taking, respectively). Participants with TBI had significantly higher alexithymia; poorer facial and vocal affect recognition; and lower empathy scores. For TBI participants, facial and vocal affect recognition variances were significantly explained by alexithymia (12% and 8%, respectively); however, the majority of the variances were accounted for by externally-oriented thinking alone. Affect recognition and alexithymia significantly accounted for 16.5% of cognitive empathy variance. Again, the majority of the variance was primarily explained by externally-oriented thinking. Affect recognition and alexithymia did not explain affective empathy. Results suggest that people who have a tendency to avoid thinking about emotions (externally-oriented thinking) are more likely to have problems recognizing others' emotions and assuming others' points of view. Clinical implications are discussed.

  3. Development of the Ability to Use Facial, Situational, and Vocal Cues to Infer Others' Affective States.

    ERIC Educational Resources Information Center

    Farber, Ellen A.; Moely, Barbara E.

    Results of two studies investigating children's abilities to use different kinds of cues to infer another's affective state are reported in this paper. In the first study, 48 children (3, 4, and 6 to 7 years of age) were given three different kinds of tasks (interpersonal task, facial recognition task, and vocal recognition task). A cross-age…

  4. Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations.

    PubMed

    Laukka, Petri; Elfenbein, Hillary Anger; Söder, Nela; Nordström, Henrik; Althoff, Jean; Chui, Wanda; Iraki, Frederick K; Rockstuhl, Thomas; Thingujam, Nutankumar S

    2013-01-01

    Which emotions are associated with universally recognized non-verbal signals? We address this issue by examining how reliably non-linguistic vocalizations (affect bursts) can convey emotions across cultures. Actors from India, Kenya, Singapore, and USA were instructed to produce vocalizations that would convey nine positive and nine negative emotions to listeners. The vocalizations were judged by Swedish listeners using a within-valence forced-choice procedure, where positive and negative emotions were judged in separate experiments. Results showed that listeners could recognize a wide range of positive and negative emotions with accuracy above chance. For positive emotions, we observed the highest recognition rates for relief, followed by lust, interest, serenity and positive surprise, with affection and pride receiving the lowest recognition rates. Anger, disgust, fear, sadness, and negative surprise received the highest recognition rates for negative emotions, with the lowest rates observed for guilt and shame. By way of summary, results showed that the voice can reveal both basic emotions and several positive emotions other than happiness across cultures, but self-conscious emotions such as guilt, pride, and shame seem not to be well recognized from non-linguistic vocalizations. PMID:23914178

  5. Deficits in auditory processing contribute to impairments in vocal affect recognition in autism spectrum disorders: A MEG study.

    PubMed

    Demopoulos, Carly; Hopkins, Joyce; Kopald, Brandon E; Paulson, Kim; Doyle, Lauren; Andrews, Whitney E; Lewine, Jeffrey David

    2015-11-01

    The primary aim of this study was to examine whether there is an association between magnetoencephalography-based (MEG) indices of basic cortical auditory processing and vocal affect recognition (VAR) ability in individuals with autism spectrum disorder (ASD). MEG data were collected from 25 children/adolescents with ASD and 12 control participants using a paired-tone paradigm to measure quality of auditory physiology, sensory gating, and rapid auditory processing. Group differences were examined in auditory processing and vocal affect recognition ability. The relationship between differences in auditory processing and vocal affect recognition deficits was examined in the ASD group. Replicating prior studies, participants with ASD showed longer M1n latencies and impaired rapid processing compared with control participants. These variables were significantly related to VAR, with the linear combination of auditory processing variables accounting for approximately 30% of the variability after controlling for age and language skills in participants with ASD. VAR deficits in ASD are typically interpreted as part of a core, higher order dysfunction of the "social brain"; however, these results suggest they also may reflect basic deficits in auditory processing that compromise the extraction of socially relevant cues from the auditory environment. As such, they also suggest that therapeutic targeting of sensory dysfunction in ASD may have additional positive implications for other functional deficits. (c) 2015 APA, all rights reserved.
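
    The variance-decomposition result above (auditory processing measures explaining roughly 30% of VAR variability after controlling for age and language) is the kind of figure a hierarchical regression produces. Below is a minimal sketch of that analysis pattern on synthetic data; the column names (m1n_latency, rapid_proc, var_score) are hypothetical stand-ins, not the study's variables or code.

```python
# Sketch of hierarchical regression: incremental R^2 for auditory measures
# after covariates. Data are synthetic; variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 25
df = pd.DataFrame({
    "age": rng.uniform(7, 17, n),
    "language": rng.normal(100, 15, n),     # language-skill score
    "m1n_latency": rng.normal(120, 15, n),  # MEG latency (ms)
    "rapid_proc": rng.normal(0, 1, n),      # rapid-processing index
})
df["var_score"] = (0.4 * df["rapid_proc"] - 0.02 * df["m1n_latency"]
                   + rng.normal(0, 1, n))

base = smf.ols("var_score ~ age + language", data=df).fit()
full = smf.ols("var_score ~ age + language + m1n_latency + rapid_proc",
               data=df).fit()
print(f"incremental R^2 = {full.rsquared - base.rsquared:.2f}")
```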

  6. Gender Differences in the Recognition of Vocal Emotions

    PubMed Central

    Lausen, Adi; Schacht, Annekathrin

    2018-01-01

    The conflicting findings from the few studies conducted with regard to gender differences in the recognition of vocal expressions of emotion have left the exact nature of these differences unclear. Several investigators have argued that a comprehensive understanding of gender differences in vocal emotion recognition can only be achieved by replicating these studies while accounting for influential factors such as stimulus type, gender-balanced samples, number of encoders, decoders, and emotional categories. This study aimed to account for these factors by investigating whether emotion recognition from vocal expressions differs as a function of both listeners' and speakers' gender. A total of N = 290 participants were randomly and equally allocated to two groups. One group listened to words and pseudo-words, while the other group listened to sentences and affect bursts. Participants were asked to categorize the stimuli with respect to the expressed emotions in a fixed-choice response format. Overall, females were more accurate than males when decoding vocal emotions; however, when testing for specific emotions these differences were small in magnitude. Speakers' gender had a significant impact on how listeners judged emotions from the voice. The group listening to words and pseudo-words had higher identification rates for emotions spoken by male than by female actors, whereas in the group listening to sentences and affect bursts the identification rates were higher when emotions were uttered by female than male actors. The mixed pattern for emotion-specific effects, however, indicates that, in the vocal channel, the reliability of emotion judgments is not systematically influenced by speakers' gender and the related stereotypes of emotional expressivity. Together, these results extend previous findings by showing effects of listeners' and speakers' gender on the recognition of vocal emotions. They stress the importance of distinguishing these factors to explain recognition ability in the processing of emotional prosody. PMID:29922202

  8. Towards Real-Time Speech Emotion Recognition for Affective E-Learning

    ERIC Educational Resources Information Center

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2016-01-01

    This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learner's vocal intonations and facial expressions in order…

  9. Cross-Cultural Differences in the Processing of Non-Verbal Affective Vocalizations by Japanese and Canadian Listeners

    PubMed Central

    Koeda, Michihiko; Belin, Pascal; Hama, Tomoko; Masuda, Tadashi; Matsuura, Masato; Okubo, Yoshiro

    2013-01-01

    The Montreal Affective Voices (MAVs) comprise a database of non-verbal affect bursts portrayed by Canadian actors, for which high recognition accuracies were observed in Canadian listeners. Whether listeners from other cultures would be as accurate is unclear. We tested for cross-cultural differences in perception of the MAVs: Japanese listeners were asked to rate the MAVs on several affective dimensions and ratings were compared to those obtained by Canadian listeners. Significant Group × Emotion interactions were observed for ratings of Intensity, Valence, and Arousal. Whereas Intensity and Valence ratings did not differ across cultural groups for sad and happy vocalizations, they were significantly less intense and less negative in Japanese listeners for angry, disgusted, and fearful vocalizations. Similarly, pleased vocalizations were rated as less intense and less positive by Japanese listeners. These results demonstrate important cross-cultural differences in affective perception not just of non-verbal vocalizations expressing positive affect (Sauter et al., 2010), but also of vocalizations expressing basic negative emotions. PMID:23516137

  11. Syllable acoustics, temporal patterns, and call composition vary with behavioral context in Mexican free-tailed bats

    PubMed Central

    Bohn, Kirsten M.; Schmidt-French, Barbara; Ma, Sean T.; Pollak, George D.

    2008-01-01

    Recent research has shown that some bat species have rich vocal repertoires with diverse syllable acoustics. Few studies, however, have compared vocalizations across different behavioral contexts or examined the temporal emission patterns of vocalizations. In this paper, a comprehensive examination of the vocal repertoire of Mexican free-tailed bats, Tadarida brasiliensis, is presented. Syllable acoustics and temporal emission patterns for 16 types of vocalizations including courtship song revealed three main findings. First, although in some cases syllables are unique to specific calls, other syllables are shared among different calls. Second, entire calls associated with one behavior can be embedded into more complex vocalizations used in entirely different behavioral contexts. Third, when different calls are composed of similar syllables, distinctive temporal emission patterns may facilitate call recognition. These results indicate that syllable acoustics alone are unlikely to provide enough information for call recognition; rather, the acoustic context and temporal emission patterns of vocalizations may affect meaning. PMID:19045674

  12. Bilateral lesions of the medial frontal cortex disrupt recognition of social hierarchy during antiphonal communication in naked mole-rats (Heterocephalus glaber).

    PubMed

    Yosida, Shigeto; Okanoya, Kazuo

    2012-02-01

    Generation of the motor patterns of emotional sounds in mammals occurs in the periaqueductal gray matter of the midbrain and is not directly controlled by the cortex. The medial frontal cortex indirectly controls vocalizations, based on the recognition of social context. We examined whether the medial frontal cortex was responsible for antiphonal vocalization, or turn-taking, in naked mole-rats. In normal turn-taking, naked mole-rats vocalize more frequently to dominant individuals than to subordinate ones. Bilateral lesions of the medial frontal cortex disrupted differentiation of call rates to the stimulus animals, which had varied social relationships to the subject. However, medial frontal cortex lesions did not affect either the acoustic properties of the vocalizations or the timing of the vocal exchanges. This suggests that the medial frontal cortex may be involved in social cognition or decision making during turn-taking, while other regions of the brain regulate when animals vocalize and the vocalizations themselves.

  13. Nonlinguistic vocalizations from online amateur videos for emotion research: A validated corpus.

    PubMed

    Anikin, Andrey; Persson, Tomas

    2017-04-01

    This study introduces a corpus of 260 naturalistic human nonlinguistic vocalizations representing nine emotions: amusement, anger, disgust, effort, fear, joy, pain, pleasure, and sadness. The recognition accuracy in a rating task varied greatly per emotion, from <40% for joy and pain, to >70% for amusement, pleasure, fear, and sadness. In contrast, the raters' linguistic-cultural group had no effect on recognition accuracy: The predominantly English-language corpus was classified with similar accuracies by participants from Brazil, Russia, Sweden, and the UK/USA. Supervised random forest models classified the sounds as accurately as the human raters. The best acoustic predictors of emotion were pitch, harmonicity, and the spacing and regularity of syllables. This corpus of ecologically valid emotional vocalizations can be filtered to include only sounds with high recognition rates, in order to study reactions to emotional stimuli of known perceptual types (reception side), or can be used in its entirety to study the association between affective states and vocal expressions (production side).
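
    As a rough illustration of the modelling step described above (supervised random forests over acoustic predictors such as pitch and syllable spacing), the sketch below extracts simple features with librosa and cross-validates a classifier. This is not the authors' pipeline: load_corpus_index is a hypothetical helper, and the feature set is illustrative only.

```python
# Sketch: random-forest emotion classification of vocal clips from basic
# acoustic features. Not the published pipeline; features are illustrative.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def acoustic_features(path):
    y, sr = librosa.load(path, sr=16000)
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=600, sr=sr)  # pitch track
    f0 = f0[~np.isnan(f0)]
    onsets = librosa.onset.onset_detect(y=y, sr=sr, units='time')
    rate = len(onsets) / (len(y) / sr)        # crude syllable-rate proxy
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    stats = [f0.mean() if f0.size else 0.0,
             f0.std() if f0.size else 0.0,
             rate]
    return np.concatenate([stats, mfcc])

paths, labels = load_corpus_index()  # hypothetical: clip paths + emotion labels
X = np.vstack([acoustic_features(p) for p in paths])
clf = RandomForestClassifier(n_estimators=500, random_state=0)
print(cross_val_score(clf, X, labels, cv=5).mean())  # mean CV accuracy
```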

  14. The development of cross-cultural recognition of vocal emotion during childhood and adolescence.

    PubMed

    Chronaki, Georgia; Wigelsworth, Michael; Pell, Marc D; Kotz, Sonja A

    2018-06-14

    Humans have an innate set of emotions recognised universally. However, emotion recognition also depends on socio-cultural rules. Although adults recognise vocal emotions universally, they identify emotions more accurately in their native language. We examined developmental trajectories of universal vocal emotion recognition in children. Eighty native English speakers completed a vocal emotion recognition task in their native language (English) and foreign languages (Spanish, Chinese, and Arabic), with stimuli expressing anger, happiness, sadness, fear, and neutrality. Emotion recognition was compared across 8-to-10-year-olds, 11-to-13-year-olds, and adults. Measures of behavioural and emotional problems were also taken. Results showed that although emotion recognition was above chance for all languages, native English speaking children were more accurate in recognising vocal emotions in their native language. There was a larger improvement in recognising vocal emotion from the native language during adolescence. Vocal anger recognition did not improve with age for the non-native languages. This is the first study to demonstrate universality of vocal emotion recognition in children whilst supporting an "in-group advantage" for more accurate recognition in the native language. Findings highlight the role of experience in emotion recognition, have implications for child development in modern multicultural societies and address important theoretical questions about the nature of emotions.

  15. Developmental change and cross-domain links in vocal and musical emotion recognition performance in childhood.

    PubMed

    Allgood, Rebecca; Heaton, Pamela

    2015-09-01

    Although the configurations of psychoacoustic cues signalling emotions in human vocalizations and instrumental music are very similar, cross-domain links in recognition performance have yet to be studied developmentally. Two hundred and twenty 5- to 10-year-old children were asked to identify musical excerpts and vocalizations as happy, sad, or fearful. The results revealed age-related increases in overall recognition performance with significant correlations across vocal and musical conditions at all developmental stages. Recognition scores were greater for musical than vocal stimuli and were superior in females compared with males. These results confirm that recognition of emotions in vocal and musical stimuli is linked by 5 years and that sensitivity to emotions in auditory stimuli is influenced by age and gender. © 2015 The British Psychological Society.

  16. Second Language Ability and Emotional Prosody Perception

    PubMed Central

    Bhatara, Anjali; Laukka, Petri; Boll-Avetisyan, Natalie; Granjon, Lionel; Anger Elfenbein, Hillary; Bänziger, Tanja

    2016-01-01

    The present study examines the effect of language experience on vocal emotion perception in a second language. Native speakers of French with varying levels of self-reported English ability were asked to identify emotions from vocal expressions produced by American actors in a forced-choice task, and to rate their pleasantness, power, alertness and intensity on continuous scales. Stimuli included emotionally expressive English speech (emotional prosody) and non-linguistic vocalizations (affect bursts), and a baseline condition with Swiss-French pseudo-speech. Results revealed effects of English ability on the recognition of emotions in English speech but not in non-linguistic vocalizations. Specifically, higher English ability was associated with less accurate identification of positive emotions, but not with the interpretation of negative emotions. Moreover, higher English ability was associated with lower ratings of pleasantness and power, again only for emotional prosody. This suggests that second language skills may sometimes interfere with emotion recognition from speech prosody, particularly for positive emotions. PMID:27253326

  17. Slowing down presentation of facial movements and vocal sounds enhances facial expression recognition and induces facial-vocal imitation in children with autism.

    PubMed

    Tardif, Carole; Lainé, France; Rodriguez, Mélissa; Gepner, Bruno

    2007-09-01

    This study examined the effects of slowing down presentation of facial expressions and their corresponding vocal sounds on facial expression recognition and facial and/or vocal imitation in children with autism. Twelve autistic children and twenty-four normal control children were presented with emotional and non-emotional facial expressions on CD-Rom, under audio or silent conditions, and under dynamic visual conditions (slowly, very slowly, at normal speed) plus a static control. Overall, children with autism showed lower performance in expression recognition and more induced facial-vocal imitation than controls. In the autistic group, facial expression recognition and induced facial-vocal imitation were significantly enhanced in slow conditions. Findings may give new perspectives for understanding and intervention for verbal and emotional perceptive and communicative impairments in autistic populations.

  18. Experimental evidence of vocal recognition in young and adult black-legged kittiwakes

    USGS Publications Warehouse

    Mulard, Hervé; Aubin, T.; White, J.F.; Hatch, Shyla A.; Danchin, E.

    2008-01-01

    Individual recognition is required in most social interactions, and its presence has been confirmed in many species. In birds, vocal cues appear to be a major component of recognition. Curiously, vocal recognition seems absent or limited in some highly social species such as the black-legged kittiwake, Rissa tridactyla. Using playback experiments, we found that kittiwake chicks recognized their parents vocally, this capacity being detectable as early as 20 days after hatching, the youngest age tested. Mates also recognized each other's long calls. Some birds reacted to their partner's voice when only a part of the long call was played back. Nevertheless, only about a third of the tested birds reacted to their mate's or parents' call and we were unable to detect recognition among neighbours. We discuss the low reactivity of kittiwakes in relation to their cliff-nesting habit and compare our results with evidence of vocal recognition in other larids. © 2008 The Association for the Study of Animal Behaviour.

  19. Compensating for age limits through emotional crossmodal integration

    PubMed Central

    Chaby, Laurence; Boullay, Viviane Luherne-du; Chetouani, Mohamed; Plaza, Monique

    2015-01-01

    Social interactions in daily life necessitate the integration of social signals from different sensory modalities. In the aging literature, it is well established that the recognition of emotion in facial expressions declines with advancing age, and this also occurs with vocal expressions. By contrast, crossmodal integration processing in healthy aging individuals is less documented. Here, we investigated the age-related effects on emotion recognition when faces and voices were presented alone or simultaneously, allowing for crossmodal integration. In this study, 31 young adults (M = 25.8 years) and 31 older adults (M = 67.2 years) were instructed to identify several basic emotions (happiness, sadness, anger, fear, disgust) and a neutral expression, which were displayed as visual (facial expressions), auditory (non-verbal affective vocalizations) or crossmodal (simultaneous, congruent facial and vocal affective expressions) stimuli. The results showed that older adults performed more slowly and less accurately than younger adults at recognizing negative emotions from isolated faces and voices. In the crossmodal condition, although slower, older adults were as accurate as younger adults, except for anger. Importantly, additional analyses using the “race model” demonstrate that older adults benefited to the same extent as younger adults from the combination of facial and vocal emotional stimuli. These results help explain some conflicting results in the literature and may clarify emotional abilities related to daily life that are partially spared among older adults. PMID:26074845
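
    The “race model” analysis mentioned above refers to Miller's (1982) race-model inequality: if crossmodal responses are faster than the bound obtained by summing the two unimodal reaction-time distributions, genuine integration (rather than a mere race between channels) is implied. A minimal sketch with synthetic reaction times, not the study's data:

```python
# Sketch of a race-model inequality check (Miller, 1982) on synthetic RTs:
# any point where the crossmodal CDF exceeds the summed unimodal CDFs
# (capped at 1) indicates multisensory integration beyond a simple race.
import numpy as np

def ecdf(rts, grid):
    return np.searchsorted(np.sort(rts), grid, side='right') / len(rts)

rng = np.random.default_rng(1)
rt_face = rng.normal(650, 90, 200)    # ms, face-only trials (synthetic)
rt_voice = rng.normal(700, 100, 200)  # voice-only trials
rt_both = rng.normal(580, 80, 200)    # congruent face+voice trials

grid = np.linspace(300, 1200, 100)
bound = np.minimum(ecdf(rt_face, grid) + ecdf(rt_voice, grid), 1.0)
violation = ecdf(rt_both, grid) - bound
print("race model violated:", bool((violation > 0).any()))
```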

  1. Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli.

    PubMed

    Naranjo, C; Kornreich, C; Campanella, S; Noël, X; Vandriette, Y; Gillain, B; de Longueville, X; Delatte, B; Verbanck, P; Constant, E

    2011-02-01

    The processing of emotional stimuli is thought to be negatively biased in major depression. This study investigates this issue using musical, vocal and facial affective stimuli. 23 depressed in-patients and 23 matched healthy controls were recruited. Affective information processing was assessed through musical, vocal and facial emotion recognition tasks. Depression, anxiety level and attention capacity were controlled. The depressed participants demonstrated less accurate identification of emotions than the control group in all three sorts of emotion-recognition tasks. The depressed group also gave higher intensity ratings than the controls when scoring negative emotions, and they were more likely to attribute negative emotions to neutral voices and faces. Our in-patient group might differ from the more general population of depressed adults. They were all taking anti-depressant medication, which may have had an influence on their emotional information processing. Major depression is associated with a general negative bias in the processing of emotional stimuli. Emotional processing impairment in depression is not confined to interpersonal stimuli (faces and voices); it also extends to the accurate perception of emotion in music. © 2010 Elsevier B.V. All rights reserved.

  2. The voice conveys emotion in ten globalized cultures and one remote village in Bhutan.

    PubMed

    Cordaro, Daniel T; Keltner, Dacher; Tshering, Sumjay; Wangchuk, Dorji; Flynn, Lisa M

    2016-02-01

    With data from 10 different globalized cultures and 1 remote, isolated village in Bhutan, we examined universals and cultural variations in the recognition of 16 nonverbal emotional vocalizations. College students in 10 nations (Study 1) and villagers in remote Bhutan (Study 2) were asked to match emotional vocalizations to 1-sentence stories of the same valence. Guided by previous conceptualizations of recognition accuracy, across both studies, 7 of the 16 vocal burst stimuli were found to have strong or very strong recognition in all 11 cultures, 6 vocal bursts were found to have moderate recognition, and 4 were not universally recognized. All vocal burst stimuli varied significantly in terms of the degree to which they were recognized across the 11 cultures. Our discussion focuses on the implications of these results for current debates concerning the emotion conveyed in the voice. (c) 2016 APA, all rights reserved.

  3. Brief Report: Impaired Differentiation of Vegetative/Affective and Intentional Nonverbal Vocalizations in a Subject with Asperger Syndrome (AS)

    ERIC Educational Resources Information Center

    Dietrich, Susanne; Hertrich, Ingo; Riedel, Andreas; Ackermann, Hermann

    2012-01-01

    Asperger syndrome (AS) involves impaired recognition of other people's mental states. Since language-based diagnostic procedures may be confounded by cognitive-linguistic compensation strategies, nonverbal test materials were created, including human affective and vegetative sounds. Depending on video context, each sound could be interpreted…

  4. Biases in facial and vocal emotion recognition in chronic schizophrenia

    PubMed Central

    Dondaine, Thibaut; Robert, Gabriel; Péron, Julie; Grandjean, Didier; Vérin, Marc; Drapier, Dominique; Millet, Bruno

    2014-01-01

    There has been extensive research on impaired emotion recognition in schizophrenia in the facial and vocal modalities. The literature points to biases toward non-relevant emotions for emotional faces but few studies have examined biases in emotional recognition across different modalities (facial and vocal). In order to test emotion recognition biases, we exposed 23 patients with stabilized chronic schizophrenia and 23 healthy controls (HCs) to emotional facial and vocal tasks, asking them to rate emotional intensity on visual analog scales. We showed that patients with schizophrenia provided higher intensity ratings on the non-target scales (e.g., surprise scale for fear stimuli) than HCs for both tasks. Furthermore, with the exception of neutral vocal stimuli, they provided the same intensity ratings on the target scales as the HCs. These findings suggest that patients with chronic schizophrenia have emotional biases when judging emotional stimuli in the visual and vocal modalities. These biases may stem from a basic sensorial deficit, a high-order cognitive dysfunction, or both. The respective roles of prefrontal-subcortical circuitry and the basal ganglia are discussed. PMID:25202287

  5. Mother goats do not forget their kids’ calls

    PubMed Central

    Briefer, Elodie F.; Padilla de la Torre, Monica; McElligott, Alan G.

    2012-01-01

    Parent–offspring recognition is crucial for offspring survival. At long distances, this recognition is mainly based on vocalizations. Because of maturation-related changes to the structure of vocalizations, parents have to learn successive call versions produced by their offspring throughout ontogeny in order to maintain recognition. However, because of the difficulties involved in following the same individuals over years, it is not clear how long this vocal memory persists. Here, we investigated long-term vocal recognition in goats. We tested responses of mothers to their kids’ calls 7–13 months after weaning. We then compared mothers’ responses to calls of their previous kids with their responses to the same calls at five weeks postpartum. Subjects tended to respond more to their own kids at five weeks postpartum than 11–17 months later, but displayed stronger responses to their previous kids than to familiar kids from other females. Acoustic analyses showed that it is unlikely that mothers were responding to their previous kids simply because they confounded them with the new kids they were currently nursing. Therefore, our results provide evidence for strong, long-term vocal memory capacity in goats. The persistence of offspring vocal recognition beyond weaning could have important roles in kin social relationships and inbreeding avoidance. PMID:22719031

  7. Sex differences in razorbill (Family: Alcidae) parent-offspring vocal recognition

    NASA Astrophysics Data System (ADS)

    Insley, Stephen J.; Paredes Vela, Rosana; Jones, Ian L.

    2002-05-01

    In this study, we examined how a pattern of parental care may result in a sex bias in vocal recognition. In Razorbills (Alca torda), both sexes provide parental care to their chicks while at the nest, after which the male is the sole caregiver for an additional period at sea. Selection pressure acting on recognition behavior is expected to be strongest during the time when males and chicks are together at sea, and as a result, parent-offspring recognition was predicted to be better developed in the male parent, that is, to show a paternal bias. In order to test this hypothesis, vocal playback experiments were conducted on breeding Razorbills at the Gannet Islands, Labrador, in 2001. The data provide clear evidence of mutual vocal recognition between the male parent and chick but not between the female parent and chick, supporting the hypothesis that parent-offspring recognition is male biased in this species. In addition to acoustic recognition, such a bias could have important social implications for a variety of behavioral and basic life history traits such as cooperation and sex-biased dispersal.

  8. Emotional Recognition in Autism Spectrum Conditions from Voices and Faces

    ERIC Educational Resources Information Center

    Stewart, Mary E.; McAdam, Clair; Ota, Mitsuhiko; Peppe, Sue; Cleland, Joanne

    2013-01-01

    The present study reports on a new vocal emotion recognition task and assesses whether people with autism spectrum conditions (ASC) perform differently from typically developed individuals on tests of emotional identification from both the face and the voice. The new test of vocal emotion contained trials in which the vocal emotion of the sentence…

  9. Not just fear and sadness: meta-analytic evidence of pervasive emotion recognition deficits for facial and vocal expressions in psychopathy.

    PubMed

    Dawel, Amy; O'Kearney, Richard; McKone, Elinor; Palermo, Romina

    2012-11-01

    The present meta-analysis aimed to clarify whether deficits in emotion recognition in psychopathy are restricted to certain emotions and modalities or whether they are more pervasive. We also attempted to assess the influence of other important variables: age, and the affective factor of psychopathy. A systematic search of electronic databases and a subsequent manual search identified 26 studies that included 29 experiments (N = 1376) involving six emotion categories (anger, disgust, fear, happiness, sadness, surprise) across three modalities (facial, vocal, postural). Meta-analyses found evidence of pervasive impairments across modalities (facial and vocal) with significant deficits evident for several emotions (i.e., not only fear and sadness) in both adults and children/adolescents. These results are consistent with recent theorizing that the amygdala, which is believed to be dysfunctional in psychopathy, has a broad role in emotion processing. We discuss limitations of the available data that restrict the ability of meta-analysis to consider the influence of age and separate the sub-factors of psychopathy, highlighting important directions for future research. Copyright © 2012 Elsevier Ltd. All rights reserved.
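
    For readers unfamiliar with the pooling step behind a meta-analysis like this one, the sketch below implements a common random-effects estimator (DerSimonian-Laird). Whether the authors used this exact estimator is not stated in the abstract, and the per-study effect sizes and variances below are invented for illustration, not taken from the 29 experiments.

```python
# DerSimonian-Laird random-effects pooling of study effect sizes.
# Inputs here are made-up numbers, not the meta-analysis data.
import numpy as np

def pool(effects, variances):
    w = 1.0 / variances                     # fixed-effect weights
    fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed) ** 2)  # Cochran's Q (heterogeneity)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_re = 1.0 / (variances + tau2)
    est = np.sum(w_re * effects) / np.sum(w_re)
    return est, np.sqrt(1.0 / np.sum(w_re))  # pooled effect and its SE

effects = np.array([-0.42, -0.31, -0.58, -0.12, -0.37])
variances = np.array([0.02, 0.05, 0.04, 0.06, 0.03])
print(pool(effects, variances))
```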

  10. Children's Recognition of Emotions from Vocal Cues

    ERIC Educational Resources Information Center

    Sauter, Disa A.; Panattoni, Charlotte; Happe, Francesca

    2013-01-01

    Emotional cues contain important information about the intentions and feelings of others. Despite a wealth of research into children's understanding of facial signals of emotions, little research has investigated the developmental trajectory of interpreting affective cues in the voice. In this study, 48 children ranging between 5 and 10 years were…

  11. Atypical neural responses to vocal anger in attention-deficit/hyperactivity disorder.

    PubMed

    Chronaki, Georgia; Benikos, Nicholas; Fairchild, Graeme; Sonuga-Barke, Edmund J S

    2015-04-01

    Deficits in facial emotion processing, reported in attention-deficit/hyperactivity disorder (ADHD), have been linked to both early perceptual and later attentional components of event-related potentials (ERPs). However, the neural underpinnings of vocal emotion processing deficits in ADHD have yet to be characterised. Here, we report the first ERP study of vocal affective prosody processing in ADHD. Event-related potentials of 6-11-year-old children with ADHD (n = 25) and typically developing controls (n = 25) were recorded as they completed a task measuring recognition of vocal prosodic stimuli (angry, happy and neutral). Audiometric assessments were conducted to screen for hearing impairments. Children with ADHD were less accurate than controls at recognising vocal anger. Relative to controls, they displayed enhanced N100 and attenuated P300 components to vocal anger. The P300 effect was reduced, but remained significant, after controlling for N100 effects by rebaselining. Only the N100 effect was significant when children with ADHD and comorbid conduct disorder (n = 10) were excluded. This study provides the first evidence linking ADHD to atypical neural activity during the early perceptual stages of vocal anger processing. These effects may reflect preattentive hyper-vigilance to vocal anger in ADHD. © 2014 Association for Child and Adolescent Mental Health.

  12. Cochlear Implants Special Issue Article: Vocal Emotion Recognition by Normal-Hearing Listeners and Cochlear Implant Users

    PubMed Central

    Luo, Xin; Fu, Qian-Jie; Galvin, John J.

    2007-01-01

    The present study investigated the ability of normal-hearing listeners and cochlear implant users to recognize vocal emotions. Sentences were produced by 1 male and 1 female talker according to 5 target emotions: angry, anxious, happy, sad, and neutral. Overall amplitude differences between the stimuli were either preserved or normalized. In experiment 1, vocal emotion recognition was measured in normal-hearing and cochlear implant listeners; cochlear implant subjects were tested using their clinically assigned processors. When overall amplitude cues were preserved, normal-hearing listeners achieved near-perfect performance, whereas cochlear implant listeners recognized less than half of the target emotions. Removing the overall amplitude cues significantly worsened mean normal-hearing and cochlear implant performance. In experiment 2, vocal emotion recognition was measured in cochlear implant listeners as a function of the number of channels (from 1 to 8) and envelope filter cutoff frequency (50 vs 400 Hz) in experimental speech processors. In experiment 3, vocal emotion recognition was measured in normal-hearing listeners as a function of the number of channels (from 1 to 16) and envelope filter cutoff frequency (50 vs 500 Hz) in acoustic cochlear implant simulations. Results from experiments 2 and 3 showed that both cochlear implant and normal-hearing performance significantly improved as the number of channels or the envelope filter cutoff frequency was increased. The results suggest that spectral, temporal, and overall amplitude cues each contribute to vocal emotion recognition. The poorer cochlear implant performance is most likely attributable to the lack of salient pitch cues and the limited functional spectral resolution. PMID:18003871
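
    Experiment 3's acoustic cochlear implant simulation is, in signal-processing terms, a noise vocoder: split speech into spectral channels, extract each channel's envelope with a low-pass filter at the stated cutoff, and re-impose the envelopes on noise carriers. A minimal sketch follows; the band edges (100-7000 Hz, log-spaced) and filter orders are assumptions, not the study's exact parameters.

```python
# Minimal noise-vocoder sketch of an acoustic CI simulation: n_channels
# bands, envelopes low-passed at env_cutoff Hz, noise carriers.
import numpy as np
from scipy.signal import butter, filtfilt

def vocode(x, sr, n_channels=8, env_cutoff=50.0):
    edges = np.logspace(np.log10(100.0), np.log10(7000.0), n_channels + 1)
    b_env, a_env = butter(2, env_cutoff, btype='low', fs=sr)
    carrier = np.random.default_rng(0).standard_normal(len(x))
    y = np.zeros_like(x)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        b, a = butter(4, [f1, f2], btype='band', fs=sr)
        env = filtfilt(b_env, a_env, np.abs(filtfilt(b, a, x)))  # channel envelope
        y += filtfilt(b, a, carrier) * np.clip(env, 0.0, None)   # modulated noise
    rms = lambda s: np.sqrt(np.mean(s ** 2))
    return y * rms(x) / (rms(y) + 1e-12)  # match overall level

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 150 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
y = vocode(x, sr, n_channels=8, env_cutoff=400.0)  # "8-channel, 400-Hz" condition
```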

  13. In the ear of the beholder: how age shapes emotion processing in nonverbal vocalizations.

    PubMed

    Lima, César F; Alves, Tiago; Scott, Sophie K; Castro, São Luís

    2014-02-01

    It is well established that emotion recognition of facial expressions declines with age, but evidence for age-related differences in vocal emotions is more limited. This is especially true for nonverbal vocalizations such as laughter, sobs, or sighs. In this study, 43 younger adults (M = 22 years) and 43 older ones (M = 61.4 years) provided multiple emotion ratings of nonverbal emotional vocalizations. Contrasting with previous research, which often includes only one positive emotion (happiness) versus several negative ones, we examined 4 positive and 4 negative emotions: achievement/triumph, amusement, pleasure, relief, anger, disgust, fear, and sadness. We controlled for hearing loss and assessed general cognitive decline, cognitive control, verbal intelligence, working memory, current affect, emotion regulation, and personality. Older adults were less sensitive than younger ones to the intended vocal emotions, as indicated by decrements in ratings on the intended emotion scales and accuracy. These effects were similar for positive and negative emotions, and they were independent of age-related differences in cognitive, affective, and personality measures. Regression analyses revealed that younger and older participants' responses could be predicted from the acoustic properties of the temporal, intensity, fundamental frequency, and spectral profile of the vocalizations. The two groups were similarly efficient in using the acoustic cues, but there were differences in the patterns of emotion-specific predictors. This study suggests that ageing produces specific changes on the processing of nonverbal vocalizations. That decrements were not attenuated for positive emotions indicates that they cannot be explained by a positivity effect in older adults. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  14. Emotion recognition and social adjustment in school-aged girls and boys.

    PubMed

    Leppänen, J M; Hietanen, J K

    2001-12-01

    The present study investigated emotion recognition accuracy and its relation to social adjustment in 7- to 10-year-old children. The ability to recognize basic emotions from facial and vocal expressions was measured and compared to peer popularity and to teacher-rated social competence. The results showed that emotion recognition was related to these measures of social adjustment, but the gender of a child and emotion category affected this relationship. Emotion recognition accuracy was significantly related to social adjustment for the girls, but not for the boys. For the girls, especially the recognition of surprise was related to social adjustment. Together, these results suggest that the ability to recognize others' emotional states from nonverbal cues is an important socio-cognitive ability for school-aged girls.

  15. Intelligibility of emotional speech in younger and older adults.

    PubMed

    Dupuis, Kate; Pichora-Fuller, M Kathleen

    2014-01-01

    Little is known about the influence of vocal emotions on speech understanding. Word recognition accuracy for stimuli spoken to portray seven emotions (anger, disgust, fear, sadness, neutral, happiness, and pleasant surprise) was tested in younger and older listeners. Emotions were presented in either mixed (heterogeneous emotions mixed in a list) or blocked (homogeneous emotion blocked in a list) conditions. Three main hypotheses were tested. First, vocal emotion affects word recognition accuracy; specifically, portrayals of fear enhance word recognition accuracy because listeners orient to threatening information and/or distinctive acoustical cues such as high pitch mean and variation. Second, older listeners recognize words less accurately than younger listeners, but the effects of different emotions on intelligibility are similar across age groups. Third, blocking emotions in a list results in better word recognition accuracy, especially for older listeners, and reduces the effect of emotion on intelligibility because as listeners develop expectations about vocal emotion, the allocation of processing resources can shift from emotional to lexical processing. Emotion was the within-subjects variable: all participants heard speech stimuli consisting of a carrier phrase followed by a target word spoken by either a younger or an older talker, with an equal number of stimuli portraying each of seven vocal emotions. The speech was presented in multi-talker babble at signal-to-noise ratios adjusted for each talker and each listener age group. Listener age (younger, older), condition (mixed, blocked), and talker (younger, older) were the main between-subjects variables. Fifty-six students (Mage = 18.3 years) were recruited from an undergraduate psychology course; 56 older adults (Mage = 72.3 years) were recruited from a volunteer pool. All participants had clinically normal pure-tone audiometric thresholds at frequencies ≤3000 Hz. There were significant main effects of emotion, listener age group, and condition on the accuracy of word recognition in noise. Stimuli spoken in a fearful voice were the most intelligible, while those spoken in a sad voice were the least intelligible. Overall, word recognition accuracy was poorer for older than younger adults, but there was no main effect of talker, and the pattern of the effects of different emotions on intelligibility did not differ significantly across age groups. Acoustical analyses helped elucidate the effect of emotion and some intertalker differences. Finally, all participants performed better when emotions were blocked. For both groups, performance improved over repeated presentations of each emotion in both blocked and mixed conditions. These results are the first to demonstrate a relationship between vocal emotion and word recognition accuracy in noise for younger and older listeners. In particular, the enhancement of intelligibility by emotion is greatest for words spoken to portray fear and presented heterogeneously with other emotions. Fear may have a specialized role in orienting attention to words heard in noise. This finding may be an auditory counterpart to the enhanced detection of threat information in visual displays. The effect of vocal emotion on word recognition accuracy is preserved in older listeners with good audiograms and both age groups benefit from blocking and the repetition of emotions.
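
    The stimulus construction described above (target words in multi-talker babble at per-talker, per-group signal-to-noise ratios) reduces to one piece of arithmetic: scaling the babble so the mixture hits a requested SNR. A sketch with synthetic placeholder signals:

```python
# Mix speech with babble at a requested SNR (dB). Signals are synthetic
# placeholders for the carrier-phrase stimuli described above.
import numpy as np

def mix_at_snr(speech, babble, snr_db):
    babble = babble[:len(speech)]
    p_s = np.mean(speech ** 2)  # speech power
    p_n = np.mean(babble ** 2)  # babble power
    gain = np.sqrt(p_s / (p_n * 10 ** (snr_db / 10.0)))
    return speech + gain * babble

rng = np.random.default_rng(0)
speech = rng.normal(0.0, 1.0, 16000)
babble = rng.normal(0.0, 0.8, 16000)
mix = mix_at_snr(speech, babble, snr_db=-6.0)  # harder listening condition
```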

  16. Music Education Intervention Improves Vocal Emotion Recognition

    ERIC Educational Resources Information Center

    Mualem, Orit; Lavidor, Michal

    2015-01-01

    The current study is an interdisciplinary examination of the interplay among music, language, and emotions. It consisted of two experiments designed to investigate the relationship between musical abilities and vocal emotional recognition. In experiment 1 (N = 24), we compared the influence of two short-term intervention programs--music and…

  17. On the effectiveness of vocal imitations and verbal descriptions of sounds.

    PubMed

    Lemaitre, Guillaume; Rocchesso, Davide

    2014-02-01

    Describing unidentified sounds with words is a frustrating task and vocally imitating them is often a convenient way to address the issue. This article reports on a study that compared the effectiveness of vocal imitations and verbalizations to communicate different referent sounds. The stimuli included mechanical and synthesized sounds and were selected on the basis of participants' confidence in identifying the cause of the sounds, ranging from easy-to-identify to unidentifiable sounds. The study used a selection of vocal imitations and verbalizations deemed adequate descriptions of the referent sounds. These descriptions were used in a nine-alternative forced-choice experiment: Participants listened to a description and picked one sound from a list of nine possible referent sounds. Results showed that recognition based on verbalizations was maximally effective when the referent sounds were identifiable. Recognition accuracy with verbalizations dropped when identifiability of the sounds decreased. Conversely, recognition accuracy with vocal imitations did not depend on the identifiability of the referent sounds and was as high as with the best verbalizations. This shows that vocal imitations are an effective means of representing and communicating sounds and suggests that they could be used in a number of applications.

  18. Feeling backwards? How temporal order in speech affects the time course of vocal emotion recognition

    PubMed Central

    Rigoulot, Simon; Wassiliwizky, Eugen; Pell, Marc D.

    2013-01-01

    Recent studies suggest that the time course for recognizing vocal expressions of basic emotion in speech varies significantly by emotion type, implying that listeners uncover acoustic evidence about emotions at different rates in speech (e.g., fear is recognized most quickly whereas happiness and disgust are recognized relatively slowly; Pell and Kotz, 2011). To investigate whether vocal emotion recognition is largely dictated by the amount of time listeners are exposed to speech or the position of critical emotional cues in the utterance, 40 English participants judged the meaning of emotionally-inflected pseudo-utterances presented in a gating paradigm, where utterances were gated as a function of their syllable structure in segments of increasing duration from the end of the utterance (i.e., gated syllable-by-syllable from the offset rather than the onset of the stimulus). Accuracy for detecting six target emotions in each gate condition and the mean identification point for each emotion in milliseconds were analyzed and compared to results from Pell and Kotz (2011). We again found significant emotion-specific differences in the time needed to accurately recognize emotions from speech prosody, and new evidence that utterance-final syllables tended to facilitate listeners' accuracy in many conditions when compared to utterance-initial syllables. The time needed to recognize fear, anger, sadness, and neutral from speech cues was not influenced by how utterances were gated, although happiness and disgust were recognized significantly faster when listeners heard the end of utterances first. Our data provide new clues about the relative time course for recognizing vocally-expressed emotions within the 400–1200 ms time window, while highlighting that emotion recognition from prosody can be shaped by the temporal properties of speech. PMID:23805115
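
    The offset-gating manipulation described above is easy to state concretely: given syllable boundaries, build stimuli that grow backward from the end of the utterance one syllable at a time. A minimal sketch, assuming hand-labelled boundary times rather than any particular annotation format:

```python
# Build offset gates: last syllable, last two syllables, ..., whole
# utterance. Syllable onset times (seconds) are assumed to come from
# a hand-labelled annotation, as in gating studies.
import numpy as np

def offset_gates(y, sr, syllable_onsets):
    starts = sorted(int(t * sr) for t in syllable_onsets)
    return [y[s:] for s in reversed(starts)]

sr = 16000
y = np.zeros(sr)  # placeholder 1-s pseudo-utterance
gates = offset_gates(y, sr, syllable_onsets=[0.0, 0.22, 0.45, 0.70])
# gates[0] is the final syllable alone; gates[-1] is the full utterance
```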

  19. Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations

    PubMed Central

    Sauter, Disa A.; Eisner, Frank; Ekman, Paul; Scott, Sophie K.

    2010-01-01

    Emotional signals are crucial for sharing important information, with conspecifics, for example, to warn humans of danger. Humans use a range of different cues to communicate to others how they feel, including facial, vocal, and gestural signals. We examined the recognition of nonverbal emotional vocalizations, such as screams and laughs, across two dramatically different cultural groups. Western participants were compared to individuals from remote, culturally isolated Namibian villages. Vocalizations communicating the so-called “basic emotions” (anger, disgust, fear, joy, sadness, and surprise) were bidirectionally recognized. In contrast, a set of additional emotions was only recognized within, but not across, cultural boundaries. Our findings indicate that a number of primarily negative emotions have vocalizations that can be recognized across cultures, while most positive emotions are communicated with culture-specific signals. PMID:20133790

  20. The effect of time on word learning: an examination of decay of the memory trace and vocal rehearsal in children with and without specific language impairment.

    PubMed

    Alt, Mary; Spaulding, Tammie

    2011-01-01

    The purpose of this study was to measure the effect of time to response in a fast-mapping word learning task for children with specific language impairment (SLI) and children with typically developing language skills (TD). Manipulating time to response allows us to examine decay of the memory trace, the use of vocal rehearsal, and their effects on word learning. Participants included 40 school-age children: half with SLI and half with TD. The children were asked to expressively and receptively fast-map 24 novel labels for 24 novel animated dinosaurs. They were asked to demonstrate learning either immediately after presentation of the novel word or after a 10-second delay. Data were collected on the use of vocal rehearsal and for recognition and production accuracy. Although the SLI group was less accurate overall, there was no evidence of decay of the memory trace. Both groups used vocal rehearsal at comparable rates, which did not vary when learning was tested immediately or after a delay. Use of vocal rehearsal resulted in better accuracy on the recognition task, but only for the TD group. A delay in time to response without interference was not an undue burden for either group. Despite the fact that children with SLI used a vocal rehearsal strategy as often as unimpaired peers, they did not benefit from the strategy in the same way as their peers. Possible explanations for these findings and clinical implications will be discussed. Readers will learn about how time to response affects word learning in children with specific language impairment and unimpaired peers. They will see how this issue fits into a framework of phonological working memory. They will also become acquainted with the effect of vocal rehearsal on word learning. Copyright © 2011 Elsevier Inc. All rights reserved.

  1. How Psychological Stress Affects Emotional Prosody.

    PubMed

    Paulmann, Silke; Furnes, Desire; Bøkenes, Anne Ming; Cozzolino, Philip J

    2016-01-01

    We explored how experimentally induced psychological stress affects the production and recognition of vocal emotions. In Study 1a, we demonstrate that sentences spoken by stressed speakers are judged by naïve listeners as sounding more stressed than sentences uttered by non-stressed speakers. In Study 1b, negative emotions produced by stressed speakers are generally less well recognized than the same emotions produced by non-stressed speakers. Multiple mediation analyses suggest this poorer recognition of negative stimuli was due to a mismatch between the variation of volume voiced by speakers and the range of volume expected by listeners. Together, this suggests that the stress level of the speaker affects judgments made by the receiver. In Study 2, we demonstrate that participants who were induced with a feeling of stress before carrying out an emotional prosody recognition task performed worse than non-stressed participants. Overall, findings suggest detrimental effects of induced stress on interpersonal sensitivity.

  2. How Psychological Stress Affects Emotional Prosody

    PubMed Central

    Paulmann, Silke; Furnes, Desire; Bøkenes, Anne Ming; Cozzolino, Philip J.

    2016-01-01

    We explored how experimentally induced psychological stress affects the production and recognition of vocal emotions. In Study 1a, we demonstrate that sentences spoken by stressed speakers are judged by naïve listeners as sounding more stressed than sentences uttered by non-stressed speakers. In Study 1b, negative emotions produced by stressed speakers are generally less well recognized than the same emotions produced by non-stressed speakers. Multiple mediation analyses suggest this poorer recognition of negative stimuli was due to a mismatch between the variation of volume voiced by speakers and the range of volume expected by listeners. Together, this suggests that the stress level of the speaker affects judgments made by the receiver. In Study 2, we demonstrate that participants who were induced with a feeling of stress before carrying out an emotional prosody recognition task performed worse than non-stressed participants. Overall, findings suggest detrimental effects of induced stress on interpersonal sensitivity. PMID:27802287

  3. Vocal Tract Representation in the Recognition of Cerebral Palsied Speech

    ERIC Educational Resources Information Center

    Rudzicz, Frank; Hirst, Graeme; van Lieshout, Pascal

    2012-01-01

    Purpose: In this study, the authors explored articulatory information as a means of improving the recognition of dysarthric speech by machine. Method: Data were derived chiefly from the TORGO database of dysarthric articulation (Rudzicz, Namasivayam, & Wolff, 2011) in which motions of various points in the vocal tract are measured during speech.…

  4. Social Communication and Vocal Recognition in Free-Ranging Rhesus Monkeys

    NASA Astrophysics Data System (ADS)

    Rendall, Christopher Andrew

    Kinship and individual identity are key determinants of primate sociality, and the capacity for vocal recognition of individuals and kin is hypothesized to be an important adaptation facilitating intra-group social communication. Research was conducted on adult female rhesus monkeys on Cayo Santiago, Puerto Rico to test this hypothesis for three acoustically distinct calls characterized by varying selective pressures on communicating identity: coos (contact calls), grunts (close range social calls), and noisy screams (agonistic recruitment calls). Vocalization playback experiments confirmed a capacity for both individual and kin recognition of coos, but not screams (grunts were not tested). Acoustic analyses, using traditional spectrographic methods as well as linear predictive coding techniques, indicated that coos (but not grunts or screams) were highly distinctive, and that the effects of vocal tract filtering -- formants -- contributed more to statistical discriminations of both individuals and kin groups than did temporal or laryngeal source features. Formants were identified from very short (23 ms) segments of coos and were stable within calls, indicating that formant cues to individual and kin identity were available throughout a call. This aspect of formant cues is predicted to be an especially important design feature for signaling identity efficiently in complex acoustic environments. Results of playback experiments involving manipulated coo stimuli provided preliminary perceptual support for the statistical inference that formant cues take precedence in facilitating vocal recognition. The similarity of formants among female kin suggested a mechanism for the development of matrilineal vocal signatures from the genetic and environmental determinants of vocal tract morphology shared among relatives. The fact that screams -- calls strongly expected to communicate identity -- were neither individually distinctive nor recognized suggested the possibility that their acoustic structure and role in signaling identity might be constrained by functional or morphological design requirements associated with their role in signaling submission.
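
    The linear predictive coding analysis mentioned above can be sketched minimally: estimate LPC coefficients from one short windowed segment and read candidate formant frequencies off the roots of the LPC polynomial. The segment below is synthetic, and the order and pre-emphasis settings are illustrative assumptions, not the dissertation's settings.

```python
# Hedged sketch of LPC-based formant estimation from a short call segment.
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> LPC coefficients."""
    a = np.zeros(order + 1)
    a[0], e = 1.0, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / e
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]   # reflect previous coefficients
        a[i] = k
        e *= 1.0 - k * k                      # update prediction error
    return a

def lpc_formants(seg, sr, order=8):
    """Estimate formant frequencies (Hz) from roots of the LPC polynomial."""
    x = np.asarray(seg, float) * np.hamming(len(seg))
    x = np.append(x[0], x[1:] - 0.97 * x[:-1])          # pre-emphasis
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    roots = [z for z in np.roots(levinson(r, order)) if z.imag > 0]
    return sorted(np.angle(z) * sr / (2 * np.pi) for z in roots)

# Illustrative 23 ms segment with resonance-like components near 800/2800 Hz:
rng = np.random.default_rng(0)
sr = 22050
t = np.arange(int(0.023 * sr)) / sr
seg = np.sin(2 * np.pi * 800 * t) + 0.5 * np.sin(2 * np.pi * 2800 * t)
seg += 0.01 * rng.normal(size=t.size)                   # avoid degeneracy
print([round(f) for f in lpc_formants(seg, sr)])
```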

  5. Paternal kin recognition in the high frequency/ultrasonic range in a solitary foraging mammal

    PubMed Central

    2012-01-01

    Background Kin selection is a driving force in the evolution of mammalian social complexity. Recognition of paternal kin using vocalizations occurs in taxa with cohesive, complex social groups. This is the first investigation of paternal kin recognition via vocalizations in a small-brained, solitary foraging mammal, the grey mouse lemur (Microcebus murinus), a frequent model for ancestral primates. We analyzed the high frequency/ultrasonic male advertisement (courtship) call and alarm call. Results Multi-parametric analyses of the calls’ acoustic parameters and discriminant function analyses showed that advertisement calls, but not alarm calls, contain patrilineal signatures. Playback experiments controlling for familiarity showed that females paid more attention to advertisement calls from unrelated males than from their fathers. Reactions to alarm calls from unrelated males and fathers did not differ. Conclusions 1) Findings provide the first evidence of paternal kin recognition via vocalizations in a small-brained, solitarily foraging mammal. 2) High predation, small body size, and dispersed social systems may select for acoustic paternal kin recognition in the high frequency/ultrasonic ranges, thus limiting risks of inbreeding and eavesdropping by predators or conspecific competitors. 3) Paternal kin recognition via vocalizations in mammals is not dependent upon a large brain and high social complexity, but may already have been an integral part of the dispersed social networks from which more complex, kin-based sociality emerged. PMID:23198727
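
    The discriminant function analysis used here to test for patrilineal signatures can be sketched as follows: classify calls by sire from a handful of acoustic parameters and compare cross-validated assignment accuracy against chance. All data, dimensions, and effect sizes below are simulated for illustration.

```python
# Hedged sketch of a DFA-style test for patrilineal call signatures.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_sires, calls_per, n_params = 4, 20, 6
# Simulate a patriline effect: each sire shifts the mean of each parameter.
sire_means = rng.normal(0.0, 1.0, (n_sires, n_params))
X = np.vstack([rng.normal(m, 1.0, (calls_per, n_params)) for m in sire_means])
y = np.repeat(np.arange(n_sires), calls_per)

acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
print(f"cross-validated assignment accuracy: {acc:.2f} "
      f"(chance = {1 / n_sires:.2f})")
```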

  6. Emotional recognition from dynamic facial, vocal and musical expressions following traumatic brain injury.

    PubMed

    Drapeau, Joanie; Gosselin, Nathalie; Peretz, Isabelle; McKerral, Michelle

    2017-01-01

    To assess emotion recognition from dynamic facial, vocal and musical expressions in sub-groups of adults with traumatic brain injuries (TBI) of different severities and to identify possible common underlying mechanisms across domains. Forty-one adults participated in this study: 10 with moderate-severe TBI, nine with complicated mild TBI, 11 with uncomplicated mild TBI and 11 healthy controls, who were administered experimental (emotional recognition, valence-arousal) and control tasks (emotional and structural discrimination) for each domain. Recognition of fearful faces was significantly impaired in the moderate-severe and complicated mild TBI sub-groups, as compared to those with uncomplicated mild TBI and controls. Effect sizes were medium-large. Participants with lower Glasgow Coma Scale (GCS) scores performed more poorly when recognizing fearful dynamic facial expressions. Emotion recognition from the auditory domains was preserved following TBI, irrespective of severity. All groups performed equally on control tasks, indicating no perceptual disorders. Although emotion recognition from vocal and musical expressions was preserved, no correlation was found across auditory domains. This preliminary study may contribute to a better understanding of emotion recognition following TBI. Future studies of larger samples could usefully include measures of the functional impact of recognition deficits for fearful facial expressions. These could help refine interventions for emotion recognition following a brain injury.

  7. Sensory contribution to vocal emotion deficit in Parkinson's disease after subthalamic stimulation.

    PubMed

    Péron, Julie; Cekic, Sezen; Haegelen, Claire; Sauleau, Paul; Patel, Sona; Drapier, Dominique; Vérin, Marc; Grandjean, Didier

    2015-02-01

    Subthalamic nucleus (STN) deep brain stimulation in Parkinson's disease induces modifications in the recognition of emotion from voices (or emotional prosody). Nevertheless, the underlying mechanisms are still only poorly understood, and the role of acoustic features in these deficits has yet to be elucidated. Our aim was to identify the influence of acoustic features on changes in emotional prosody recognition following STN stimulation in Parkinson's disease. To this end, we analysed the performances of patients on vocal emotion recognition in pre- versus post-operative groups, as well as of matched controls, entering the acoustic features of the stimuli into our statistical models. Analyses revealed that the post-operative biased ratings on the Fear scale when patients listened to happy stimuli were correlated with loudness, while the biased ratings on the Sadness scale when they listened to happiness were correlated with fundamental frequency (F0). Furthermore, disturbed ratings on the Happiness scale when the post-operative patients listened to sadness were found to be correlated with F0. These results suggest that inadequate use of acoustic features following subthalamic stimulation has a significant impact on emotional prosody recognition in patients with Parkinson's disease, affecting the extraction and integration of acoustic cues during emotion perception. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Sensory contribution to vocal emotion deficit in Parkinson’s disease after subthalamic stimulation

    PubMed Central

    Péron, Julie; Cekic, Sezen; Haegelen, Claire; Sauleau, Paul; Patel, Sona; Drapier, Dominique; Vérin, Marc; Grandjean, Didier

    2016-01-01

    Subthalamic nucleus (STN) deep brain stimulation in Parkinson’s disease induces modifications in the recognition of emotion from voices (or emotional prosody). Nevertheless, the underlying mechanisms are still only poorly understood, and the role of acoustic features in these deficits has yet to be elucidated. Our aim was to identify the influence of acoustic features on changes in emotional prosody recognition following STN stimulation in Parkinson’s disease. To this end, we analysed the performances of patients on vocal emotion recognition in pre- versus post-operative groups, as well as of matched controls, entering the acoustic features of the stimuli into our statistical models. Analyses revealed that the post-operative biased ratings on the Fear scale when patients listened to happy stimuli were correlated with loudness, while the biased ratings on the Sadness scale when they listened to happiness were correlated with fundamental frequency (F0). Furthermore, disturbed ratings on the Happiness scale when the post-operative patients listened to sadness were found to be correlated with F0. These results suggest that inadequate use of acoustic features following subthalamic stimulation has a significant impact on emotional prosody recognition in patients with Parkinson’s disease, affecting the extraction and integration of acoustic cues during emotion perception. PMID:25282055

  9. Age-related differences in emotion recognition ability: a cross-sectional study.

    PubMed

    Mill, Aire; Allik, Jüri; Realo, Anu; Valk, Raivo

    2009-10-01

    Experimental studies indicate that recognition of emotions, particularly negative emotions, decreases with age. However, there is no consensus at which age the decrease in emotion recognition begins, how selective this is to negative emotions, and whether this applies to both facial and vocal expression. In the current cross-sectional study, 607 participants ranging in age from 18 to 84 years (mean age = 32.6 ± 14.9 years) were asked to recognize emotions expressed either facially or vocally. In general, older participants were found to be less accurate at recognizing emotions, with the most distinctive age difference pertaining to a certain group of negative emotions. Both modalities revealed an age-related decline in the recognition of sadness and, to a lesser degree, anger, starting at about 30 years of age. Although age-related differences in the recognition of expression of emotion were not mediated by personality traits, 2 of the Big 5 traits, openness and conscientiousness, made an independent contribution to emotion-recognition performance. Implications of age-related differences in facial and vocal emotion expression and early onset of the selective decrease in emotion recognition are discussed in terms of previous findings and relevant theoretical models.

  10. Adaptation to Vocal Expressions Reveals Multistep Perception of Auditory Emotion

    PubMed Central

    Bestelmeyer, Patricia E. G.; Maurage, Pierre; Rouger, Julien; Latinus, Marianne; Belin, Pascal

    2014-01-01

    The human voice carries speech as well as important nonlinguistic signals that influence our social interactions. Among these cues that impact our behavior and communication with other people is the perceived emotional state of the speaker. A theoretical framework for the neural processing stages of emotional prosody has suggested that auditory emotion is perceived in multiple steps (Schirmer and Kotz, 2006) involving low-level auditory analysis and integration of the acoustic information followed by higher-level cognition. Empirical evidence for this multistep processing chain, however, is still sparse. We examined this question using functional magnetic resonance imaging and a continuous carry-over design (Aguirre, 2007) to measure brain activity while volunteers listened to non-speech-affective vocalizations morphed on a continuum between anger and fear. Analyses dissociated neuronal adaptation effects induced by similarity in perceived emotional content between consecutive stimuli from those induced by their acoustic similarity. We found that bilateral voice-sensitive auditory regions as well as right amygdala coded the physical difference between consecutive stimuli. In contrast, activity in bilateral anterior insulae, medial superior frontal cortex, precuneus, and subcortical regions such as bilateral hippocampi depended predominantly on the perceptual difference between morphs. Our results suggest that the processing of vocal affect recognition is a multistep process involving largely distinct neural networks. Amygdala and auditory areas predominantly code emotion-related acoustic information while more anterior insular and prefrontal regions respond to the abstract, cognitive representation of vocal affect. PMID:24920615

  11. Adaptation to vocal expressions reveals multistep perception of auditory emotion.

    PubMed

    Bestelmeyer, Patricia E G; Maurage, Pierre; Rouger, Julien; Latinus, Marianne; Belin, Pascal

    2014-06-11

    The human voice carries speech as well as important nonlinguistic signals that influence our social interactions. Among these cues that impact our behavior and communication with other people is the perceived emotional state of the speaker. A theoretical framework for the neural processing stages of emotional prosody has suggested that auditory emotion is perceived in multiple steps (Schirmer and Kotz, 2006) involving low-level auditory analysis and integration of the acoustic information followed by higher-level cognition. Empirical evidence for this multistep processing chain, however, is still sparse. We examined this question using functional magnetic resonance imaging and a continuous carry-over design (Aguirre, 2007) to measure brain activity while volunteers listened to non-speech-affective vocalizations morphed on a continuum between anger and fear. Analyses dissociated neuronal adaptation effects induced by similarity in perceived emotional content between consecutive stimuli from those induced by their acoustic similarity. We found that bilateral voice-sensitive auditory regions as well as right amygdala coded the physical difference between consecutive stimuli. In contrast, activity in bilateral anterior insulae, medial superior frontal cortex, precuneus, and subcortical regions such as bilateral hippocampi depended predominantly on the perceptual difference between morphs. Our results suggest that the processing of vocal affect recognition is a multistep process involving largely distinct neural networks. Amygdala and auditory areas predominantly code emotion-related acoustic information while more anterior insular and prefrontal regions respond to the abstract, cognitive representation of vocal affect. Copyright © 2014 Bestelmeyer et al.
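
    The carry-over logic of dissociating acoustic from perceptual similarity can be sketched as a multiple regression in which both distances between consecutive stimuli enter together as predictors of a region's response. The simulated "auditory-like" and "insula-like" responses below are illustrative assumptions, not the study's data or analysis code.

```python
# Hedged sketch: dissociating acoustic vs. perceptual adaptation effects.
import numpy as np

rng = np.random.default_rng(2)
n_trials = 200
acoustic_dist = rng.uniform(0, 1, n_trials)       # |morph step| between trials
percept_dist = 0.6 * acoustic_dist + rng.uniform(0, 0.4, n_trials)  # correlated

# Simulated responses: an "auditory-like" region tracks acoustic change,
# an "insula-like" region tracks perceived-emotion change.
auditory = 1.0 * acoustic_dist + rng.normal(0, 0.3, n_trials)
insula = 1.0 * percept_dist + rng.normal(0, 0.3, n_trials)

X = np.column_stack([np.ones(n_trials), acoustic_dist, percept_dist])
for name, resp in [("auditory", auditory), ("insula", insula)]:
    b = np.linalg.lstsq(X, resp, rcond=None)[0]   # joint regression
    print(f"{name}: beta_acoustic={b[1]:+.2f}, beta_percept={b[2]:+.2f}")
```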

  12. In the Beginning Was the Familiar Voice: Personally Familiar Voices in the Evolutionary and Contemporary Biology of Communication

    PubMed Central

    Sidtis, Diana; Kreiman, Jody

    2011-01-01

    The human voice is described in dialogic linguistics as an embodiment of self in a social context, contributing to expression, perception and mutual exchange of self, consciousness, inner life, and personhood. While these approaches are subjective and arise from phenomenological perspectives, scientific facts about personal vocal identity, and its role in biological development, support these views. It is our purpose to review studies of the biology of personal vocal identity -- the familiar voice pattern -- as providing an empirical foundation for the view that the human voice is an embodiment of self in the social context. Recent developments in the biology and evolution of communication are concordant with these notions, revealing that familiar voice recognition (also known as vocal identity recognition or individual vocal recognition) contributed to survival in the earliest vocalizing species. Contemporary ethology documents the crucial role of familiar voices across animal species in signaling and perceiving internal states and personal identities. Neuropsychological studies of voice reveal multimodal cerebral associations arising across brain structures involved in memory, emotion, attention, and arousal in vocal perception and production, such that the voice represents the whole person. Although its roots are in evolutionary biology, human competence for processing layered social and personal meanings in the voice, as well as personal identity in a large repertory of familiar voice patterns, has achieved an immense sophistication. PMID:21710374

  13. On the recognition of emotional vocal expressions: motivations for a holistic approach.

    PubMed

    Esposito, Anna; Esposito, Antonietta M

    2012-10-01

    Human beings seem to be able to recognize emotions from speech very well and information communication technology aims to implement machines and agents that can do the same. However, to be able to automatically recognize affective states from speech signals, it is necessary to solve two main technological problems. The former concerns the identification of effective and efficient processing algorithms capable of capturing emotional acoustic features from speech sentences. The latter focuses on finding computational models able to classify, with an approximation as good as human listeners, a given set of emotional states. This paper will survey these topics and provide some insights for a holistic approach to the automatic analysis, recognition and synthesis of affective states.
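
    The two technological problems named above (acoustic feature extraction, then classification over emotional states) can be sketched end to end. The crude prosodic proxies and the two synthetic "emotion" classes below (louder, higher-pitched versus softer, lower-pitched tones) are illustrative stand-ins for real features and corpora.

```python
# Hedged sketch of a minimal vocal-emotion classification pipeline.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
SR = 16000

def features(x):
    """Crude prosodic proxies: energy, zero-crossing rate, spectral centroid."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / SR)
    return [np.mean(x ** 2),
            np.mean(np.abs(np.diff(np.sign(x)))) / 2,
            np.sum(freqs * spec) / np.sum(spec)]

def synth(f0, amp, n=SR):
    """Toy one-second 'utterance': a noisy tone at fundamental f0."""
    t = np.arange(n) / SR
    return amp * np.sin(2 * np.pi * f0 * t) + rng.normal(0, 0.1, n)

X = np.array([features(synth(rng.uniform(220, 300), 1.0)) for _ in range(40)]
             + [features(synth(rng.uniform(100, 160), 0.4)) for _ in range(40)])
y = np.array([1] * 40 + [0] * 40)   # 1 = "aroused", 0 = "subdued" (toy labels)
clf = make_pipeline(StandardScaler(), SVC())
print(cross_val_score(clf, X, y, cv=5).mean())
```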

  14. Humans (Homo sapiens) judge the emotional content of piglet (Sus scrofa domestica) calls based on simple acoustic parameters, not personality, empathy, nor attitude toward animals.

    PubMed

    Maruščáková, Iva L; Linhart, Pavel; Ratcliffe, Victoria F; Tallet, Céline; Reby, David; Špinka, Marek

    2015-05-01

    The vocal expression of emotion is likely driven by shared physiological principles among species. However, which acoustic features promote decoding of emotional state and how the decoding is affected by the listener's psychology remain poorly understood. Here we tested how acoustic features of piglet vocalizations interact with psychological profiles of human listeners to affect judgments of the emotional content of heterospecific vocalizations. We played back 48 piglet call sequences recorded in four different contexts (castration, isolation, reunion, nursing) to 60 listeners. Listeners judged the emotional intensity and valence of the recordings and were further asked to attribute a context of emission from four proposed contexts. Furthermore, listeners completed a series of questionnaires assessing their personality (NEO-FFI personality inventory), empathy [Interpersonal Reactivity Index (IRI)] and attitudes to animals (Animal Attitudes Scale). None of the listeners' psychological traits affected the judgments. In contrast, acoustic properties of the recordings had a substantial effect on ratings. Recordings were rated as more intense with increasing pitch (mean fundamental frequency) and increasing proportion of vocalized sound within each stimulus recording, and more negative with increasing pitch and increasing duration of the calls within the recording. More complex acoustic properties (jitter, harmonic-to-noise ratio, and presence of subharmonics) did not seem to affect the judgments. The probability of correct context recognition correlated positively with the assessed emotion intensity for castration and reunion calls, and negatively for nursing calls. In conclusion, listeners judged emotions from pig calls using simple acoustic properties, and the perceived emotional intensity might guide the identification of the context. (c) 2015 APA, all rights reserved.

  15. Subauditory Speech Recognition based on EMG/EPG Signals

    NASA Technical Reports Server (NTRS)

    Jorgensen, Charles; Lee, Diana Dee; Agabon, Shane; Lau, Sonie (Technical Monitor)

    2003-01-01

    Sub-vocal electromyogram/electropalatogram (EMG/EPG) signal classification is demonstrated as a method for silent speech recognition. Recorded electrode signals from the larynx and sublingual areas below the jaw are noise filtered and transformed into features using complex dual quad tree wavelet transforms. Feature sets for six sub-vocally pronounced words are trained using a trust region scaled conjugate gradient neural network. Real-time signals for previously unseen patterns are classified into categories suitable for primitive control of graphic objects. Feature construction, recognition accuracy and an approach for extension of the technique to a variety of real world application areas are presented.
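
    The shape of that pipeline can be sketched as follows, with a standard db4 discrete wavelet transform (via PyWavelets) and an off-the-shelf multilayer perceptron standing in for the paper's complex dual quad tree wavelets and trust-region scaled-conjugate-gradient training. The "EMG" signals are simulated.

```python
# Hedged sketch: wavelet sub-band energy features + a small neural network.
import numpy as np
import pywt
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)

def wavelet_features(sig):
    """Log energy in each wavelet sub-band of one signal window."""
    coeffs = pywt.wavedec(sig, "db4", level=4)
    return [np.log(np.sum(c ** 2) + 1e-9) for c in coeffs]

def toy_emg(cls, n=512):
    """Simulated 'EMG': noise smoothed by a class-specific window length."""
    w = rng.normal(0.0, 1.0, n)
    return np.convolve(w, np.ones(cls + 2) / (cls + 2), mode="same")

# Six "word" classes, thirty windows each:
X = np.array([wavelet_features(toy_emg(c)) for c in range(6) for _ in range(30)])
y = np.repeat(np.arange(6), 30)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```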

  16. Social power and recognition of emotional prosody: High power is associated with lower recognition accuracy than low power.

    PubMed

    Uskul, Ayse K; Paulmann, Silke; Weick, Mario

    2016-02-01

    Listeners have to pay close attention to a speaker's tone of voice (prosody) during daily conversations. This is particularly important when trying to infer the emotional state of the speaker. Although a growing body of research has explored how emotions are processed from speech in general, little is known about how psychosocial factors such as social power can shape the perception of vocal emotional attributes. Thus, the present studies explored how social power affects emotional prosody recognition. In a correlational study (Study 1) and an experimental study (Study 2), we show that high power is associated with lower accuracy in emotional prosody recognition than low power. These results, for the first time, suggest that individuals experiencing high or low power perceive emotional tone of voice differently. (c) 2016 APA, all rights reserved.

  17. Evidence for a Caregiving Instinct: Rapid Differentiation of Infant from Adult Vocalizations Using Magnetoencephalography.

    PubMed

    Young, Katherine S; Parsons, Christine E; Jegindoe Elmholdt, Else-Marie; Woolrich, Mark W; van Hartevelt, Tim J; Stevner, Angus B A; Stein, Alan; Kringelbach, Morten L

    2016-03-01

    Crying is the most salient vocal signal of distress. The cries of a newborn infant alert adult listeners and often elicit caregiving behavior. For the parent, rapid responding to an infant in distress is an adaptive behavior, functioning to ensure offspring survival. The ability to react rapidly requires quick recognition and evaluation of stimuli followed by a co-ordinated motor response. Previous neuroimaging research has demonstrated early specialized activity in response to infant faces. Using magnetoencephalography, we found similarly early (100-200 ms) differences in neural responses to infant and adult cry vocalizations in auditory, emotional, and motor cortical brain regions. We propose that this early differential activity may help to rapidly identify infant cries and engage affective and motor neural circuitry to promote adaptive behavioral responding, before conscious awareness. These differences were observed in adults who were not parents, perhaps indicative of a universal brain-based "caregiving instinct." © The Author 2015. Published by Oxford University Press.

  18. Evidence for a Caregiving Instinct: Rapid Differentiation of Infant from Adult Vocalizations Using Magnetoencephalography

    PubMed Central

    Young, Katherine S.; Parsons, Christine E.; Jegindoe Elmholdt, Else-Marie; Woolrich, Mark W.; van Hartevelt, Tim J.; Stevner, Angus B. A.; Stein, Alan; Kringelbach, Morten L.

    2016-01-01

    Crying is the most salient vocal signal of distress. The cries of a newborn infant alert adult listeners and often elicit caregiving behavior. For the parent, rapid responding to an infant in distress is an adaptive behavior, functioning to ensure offspring survival. The ability to react rapidly requires quick recognition and evaluation of stimuli followed by a co-ordinated motor response. Previous neuroimaging research has demonstrated early specialized activity in response to infant faces. Using magnetoencephalography, we found similarly early (100–200 ms) differences in neural responses to infant and adult cry vocalizations in auditory, emotional, and motor cortical brain regions. We propose that this early differential activity may help to rapidly identify infant cries and engage affective and motor neural circuitry to promote adaptive behavioral responding, before conscious awareness. These differences were observed in adults who were not parents, perhaps indicative of a universal brain-based “caregiving instinct.” PMID:26656998

  19. Encoding conditions affect recognition of vocally expressed emotions across cultures.

    PubMed

    Jürgens, Rebecca; Drolet, Matthis; Pirow, Ralph; Scheiner, Elisabeth; Fischer, Julia

    2013-01-01

    Although the expression of emotions in humans is considered to be largely universal, cultural effects contribute to both emotion expression and recognition. To disentangle the interplay between these factors, play-acted and authentic (non-instructed) vocal expressions of emotions were used, on the assumption that cultural effects may contribute differentially to the recognition of staged and spontaneous emotions. Speech tokens depicting four emotions (anger, sadness, joy, fear) were obtained from German radio archives and re-enacted by professional actors, and presented to 120 participants from Germany, Romania, and Indonesia. Participants in all three countries were poor at distinguishing between play-acted and spontaneous emotional utterances (58.73% correct on average with only marginal cultural differences). Nevertheless, authenticity influenced emotion recognition: across cultures, anger was recognized more accurately when play-acted (z = 15.06, p < 0.001) and sadness when authentic (z = 6.63, p < 0.001), replicating previous findings from German populations. German subjects revealed a slight advantage in recognizing emotions, indicating a moderate in-group advantage. There was no difference between Romanian and Indonesian subjects in the overall emotion recognition. Differential cultural effects became particularly apparent in terms of differential biases in emotion attribution. While all participants labeled play-acted expressions as anger more frequently than expected, German participants exhibited a further bias toward choosing anger for spontaneous stimuli. In contrast to the German sample, Romanian and Indonesian participants were biased toward choosing sadness. These results support the view that emotion recognition rests on a complex interaction of human universals and cultural specificities. Whether and in which way the observed biases are linked to cultural differences in self-construal remains an issue for further investigation.

  20. Contextual influences on children's use of vocal affect cues during referential interpretation.

    PubMed

    Berman, Jared M J; Graham, Susan A; Chambers, Craig G

    2013-01-01

    In three experiments, we investigated 5-year-olds' sensitivity to speaker vocal affect during referential interpretation in cases where the indeterminacy is or is not resolved by speech information. In Experiment 1, analyses of eye gaze patterns and pointing behaviours indicated that 5-year-olds used vocal affect cues at the point where an ambiguous description was encountered. In Experiments 2 and 3, we used unambiguous situations to investigate how the referential context influences the ability to use affect cues earlier in the utterance. Here, we found a differential use of speaker vocal affect whereby 5-year-olds' referential hypotheses were influenced by negative vocal affect cues in advance of the noun, but not by positive affect cues. Together, our findings reveal how 5-year-olds use a speaker's vocal affect to identify potential referents in different contextual situations and also suggest that children may be more attuned to negative vocal affect than positive vocal affect, particularly early in an utterance.

  1. A Comparison of Social Cognitive Profiles in Children with Autism Spectrum Disorders and Attention-Deficit/Hyperactivity Disorder: A Matter of Quantitative but Not Qualitative Difference?

    ERIC Educational Resources Information Center

    Demopoulos, Carly; Hopkins, Joyce; Davis, Amy

    2013-01-01

    The aim of this study was to compare social cognitive profiles of children and adolescents with Autism Spectrum Disorders (ASD) and ADHD. Participants diagnosed with an ASD (n = 137) were compared to participants with ADHD (n = 436) on tests of facial and vocal affect recognition, social judgment and problem-solving, and parent- and teacher-report…

  2. Pupils dilate for vocal or familiar music.

    PubMed

    Weiss, Michael W; Trehub, Sandra E; Schellenberg, E Glenn; Habashi, Peter

    2016-08-01

    Previous research reveals that vocal melodies are remembered better than instrumental renditions. Here we explored the possibility that the voice, as a highly salient stimulus, elicits greater arousal than nonvocal stimuli, resulting in greater pupil dilation for vocal than for instrumental melodies. We also explored the possibility that pupil dilation indexes memory for melodies. We tracked pupil dilation during a single exposure to 24 unfamiliar folk melodies (half sung to la la, half piano) and during a subsequent recognition test in which the previously heard melodies were intermixed with 24 novel melodies (half sung, half piano) from the same corpus. Pupil dilation was greater for vocal melodies than for piano melodies in the exposure phase and in the test phase. It was also greater for previously heard melodies than for novel melodies. Our findings provide the first evidence that pupillometry can be used to measure recognition of stimuli that unfold over several seconds. They also provide the first evidence of enhanced arousal to vocal melodies during encoding and retrieval, thereby supporting the more general notion of the voice as a privileged signal. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  3. Individual vocal signatures in barn owl nestlings: does individual recognition have an adaptive role in sibling vocal competition?

    PubMed

    Dreiss, A N; Ruppli, C A; Roulin, A

    2014-01-01

    To compete over limited parental resources, young animals communicate with their parents and siblings by producing honest vocal signals of need. Components of begging calls that are sensitive to food deprivation may honestly signal need, whereas other components may be associated with individual-specific attributes that do not change with time such as identity, sex, absolute age and hierarchy. In a sib-sib communication system where barn owl (Tyto alba) nestlings vocally negotiate priority access to food resources, we show that calls have individual signatures that are used by nestlings to recognize which siblings are motivated to compete, even if most vocalization features vary with hunger level. Nestlings were more identifiable when food-deprived than food-satiated, suggesting that vocal identity is emphasized when the benefit of winning a vocal contest is higher. In broods where siblings interact iteratively, we speculate that individual-specific signature permits siblings to verify that the most vocal individual in the absence of parents is the one that indeed perceived the food brought by parents. Individual recognition may also allow nestlings to associate identity with individual-specific characteristics such as position in the within-brood dominance hierarchy. Calls indeed revealed age hierarchy and to a lower extent sex and absolute age. Using a cross-fostering experimental design, we show that most acoustic features were related to the nest of origin (but not the nest of rearing), suggesting a genetic or an early developmental effect on the ontogeny of vocal signatures. To conclude, our study suggests that sibling competition has promoted the evolution of vocal behaviours that signal not only hunger level but also intrinsic individual characteristics such as identity, family, sex and age. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.

  4. Bihippocampal damage with emotional dysfunction: impaired auditory recognition of fear.

    PubMed

    Ghika-Schmid, F; Ghika, J; Vuilleumier, P; Assal, G; Vuadens, P; Scherer, K; Maeder, P; Uske, A; Bogousslavsky, J

    1997-01-01

    A right-handed man developed a sudden, transient amnestic syndrome associated with bilateral hemorrhage of the hippocampi, probably due to Urbach-Wiethe disease. In the 3rd month, despite significant hippocampal structural damage on imaging, only a milder degree of retrograde and anterograde amnesia persisted on detailed neuropsychological examination. On systematic testing of the recognition of facial and vocal expressions of emotion, we found an impairment in the vocal perception of fear, but not of other emotions such as joy, sadness and anger. Such selective impairment of fear perception was not present in the recognition of facial expressions of emotion. Thus emotional perception varies according to the different aspects of emotions and the modality of presentation (faces versus voices). This is consistent with the idea that there may be multiple emotion systems. The study of emotional perception in this unique case of bilateral involvement of the hippocampus suggests that this structure may play a critical role in the recognition of fear in vocal expression, possibly dissociated from that of other emotions and from that of fear in facial expression. In view of recent data suggesting that the amygdala plays a role in the recognition of fear in the auditory as well as the visual modality, this could suggest that the hippocampus may be part of the auditory pathway of fear recognition.

  5. Auditory perception vs. recognition: representation of complex communication sounds in the mouse auditory cortical fields.

    PubMed

    Geissler, Diana B; Ehret, Günter

    2004-02-01

    Details of brain areas for acoustical Gestalt perception and the recognition of species-specific vocalizations are not known. Here we show how spectral properties and the recognition of the acoustical Gestalt of wriggling calls of mouse pups based on a temporal property are represented in auditory cortical fields and an association area (dorsal field) of the pups' mothers. We stimulated either with a call model releasing maternal behaviour at a high rate (call recognition) or with two models of low behavioural significance (perception without recognition). Brain activation was quantified using c-Fos immunocytochemistry, counting Fos-positive cells in electrophysiologically mapped auditory cortical fields and the dorsal field. A frequency-specific labelling in two primary auditory fields is related to call perception but not to the discrimination of the biological significance of the call models used. Labelling related to call recognition is present in the second auditory field (AII). A left hemisphere advantage of labelling in the dorsoposterior field seems to reflect an integration of call recognition with maternal responsiveness. The dorsal field is activated only in the left hemisphere. The spatial extent of Fos-positive cells within the auditory cortex and its fields is larger in the left than in the right hemisphere. Our data show that a left hemisphere advantage in processing of a species-specific vocalization up to recognition is present in mice. The differential representation of vocalizations of high vs. low biological significance, as seen only in higher-order and not in primary fields of the auditory cortex, is discussed in the context of perceptual strategies.

  6. Analysis of human scream and its impact on text-independent speaker verification.

    PubMed

    Hansen, John H L; Nandwana, Mahesh Kumar; Shokouhi, Navid

    2017-04-01

    Screams are defined as sustained, high-energy vocalizations that lack phonological structure; this absence of phonological structure is what distinguishes a scream from other forms of loud vocalization, such as a "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and the Lombard effect contributes to degraded performance in automatic speech systems (i.e., speech recognition, speaker identification, diarization, etc.). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study considers a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture model-universal background model (GMM-UBM) framework is unreliable when evaluated with screams.
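
    A simplified sketch of the GMM-UBM framework named above: fit a universal background model, adapt its component means to a target speaker's enrollment frames by relevance MAP, and score test frames with the average log-likelihood ratio. The two-dimensional toy "features" and means-only adaptation are simplifying assumptions relative to real systems.

```python
# Hedged sketch of GMM-UBM speaker verification scoring.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
bg = rng.normal(0.0, 1.0, (2000, 2))              # background population frames
ubm = GaussianMixture(n_components=8, covariance_type="diag",
                      random_state=0).fit(bg)

def map_adapt_means(ubm, frames, r=16.0):
    """Relevance-MAP adaptation of component means (weights/covs shared)."""
    post = ubm.predict_proba(frames)              # frame-component posteriors
    n = post.sum(axis=0)                          # soft counts per component
    ex = post.T @ frames / np.maximum(n, 1e-8)[:, None]
    alpha = (n / (n + r))[:, None]                # data-vs-prior weighting
    spk = GaussianMixture(n_components=8, covariance_type="diag")
    spk.weights_, spk.covariances_ = ubm.weights_, ubm.covariances_
    spk.means_ = alpha * ex + (1.0 - alpha) * ubm.means_
    spk.precisions_cholesky_ = ubm.precisions_cholesky_
    return spk

target = rng.normal(0.8, 1.0, (300, 2))           # enrollment frames
spk = map_adapt_means(ubm, target)
tests = {"same speaker": rng.normal(0.8, 1.0, (100, 2)),
         "different speaker": rng.normal(-0.8, 1.0, (100, 2))}
for name, t in tests.items():
    llr = spk.score(t) - ubm.score(t)             # mean log-likelihood ratio
    print(f"{name}: LLR = {llr:+.2f}")
```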

  7. Mother-offspring recognition in the domestic cat: Kittens recognize their own mother's call.

    PubMed

    Szenczi, Péter; Bánszegi, Oxána; Urrutia, Andrea; Faragó, Tamás; Hudson, Robyn

    2016-07-01

    Acoustic communication can play an important part in mother-young recognition in many mammals. To date, however, this has been investigated mainly in a small range of herd- or colony-living species. Here we report on the behavioral response of kittens of the domestic cat, a typically solitary carnivore, to playbacks of "greeting chirps" and "meows" from their own versus alien mothers. We found significantly stronger responses to the chirps of kittens' own mothers than to their meows or to the chirps or meows of alien mothers. Acoustic analysis revealed greater variation between vocalizations from different mothers than between vocalizations from the same mother. We conclude that chirps emitted by mother cats at the nest represent a specific form of vocal communication with their young, and that kittens learn and respond positively to these while still in the nest, distinguishing them from the chirps of other mothers and from other cat vocalizations. © 2016 Wiley Periodicals, Inc. Dev Psychobiol 58: 568-577, 2016.

  8. Speech recognition for embedded automatic positioner for laparoscope

    NASA Astrophysics Data System (ADS)

    Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin

    2014-07-01

    In this paper a novel speech recognition methodology based on hidden Markov models (HMMs) is proposed for an embedded Automatic Positioner for Laparoscope (APL), built around a fixed-point ARM processor. The APL system is designed to assist the doctor in laparoscopic surgery by giving the surgeon direct vocal control of the laparoscope. Real-time response to voice commands demands an efficient speech recognition algorithm on the APL. To reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, relying chiefly on arithmetic optimizations, a fixed-point front end for speech feature analysis is built to suit the characteristics of the ARM processor. Then a fast likelihood computation algorithm is used to reduce the computational complexity of the HMM-based recognition algorithm. The experimental results show that the method keeps recognition time under 0.5 s while maintaining accuracy above 99%, demonstrating its ability to achieve real-time vocal control of the APL.
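
    As a sketch of the decoding step such a system needs, the following runs Viterbi over a small left-to-right HMM with log-probabilities pre-scaled to Q10 integers, so the inner loop uses only integer arithmetic in the spirit of a fixed-point ARM implementation. Model sizes and scores are illustrative, not the APL system's actual models.

```python
# Hedged sketch: integer-scaled (fixed-point style) Viterbi for one word HMM.
import numpy as np

SCALE = 1 << 10                                   # Q10 fixed-point scaling

def to_fixed(logp):
    return np.round(np.asarray(logp) * SCALE).astype(np.int64)

def viterbi_fixed(log_trans, log_emit):
    """log_trans: (S,S), log_emit: (T,S), both integer-scaled log-probs.
    Returns the best integer path score; all arithmetic stays in int64."""
    score = log_emit[0].copy()
    for t in range(1, len(log_emit)):
        # best predecessor for each state, then add emission score
        score = (score[:, None] + log_trans).max(axis=0) + log_emit[t]
    return score.max()

# Toy 3-state left-to-right word model, 8 frames of emission scores:
rng = np.random.default_rng(6)
trans = np.log(np.array([[.6, .4, 0.], [0., .7, .3], [0., 0., 1.]]) + 1e-12)
emit = np.log(rng.dirichlet(np.ones(3), size=8))
print("best path score (Q10):", viterbi_fixed(to_fixed(trans), to_fixed(emit)))
```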

  9. Early Sign Language Experience Goes Along with an Increased Cross-modal Gain for Affective Prosodic Recognition in Congenitally Deaf CI Users.

    PubMed

    Fengler, Ineke; Delfau, Pia-Céline; Röder, Brigitte

    2018-04-01

    It is yet unclear whether congenitally deaf cochlear implant (CD CI) users' visual and multisensory emotion perception is influenced by their history of sign language acquisition. We hypothesized that early-signing CD CI users, relative to late-signing CD CI users and hearing, non-signing controls, show better facial expression recognition and rely more on the facial cues of audio-visual emotional stimuli. Two groups of young adult CD CI users -- early signers (ES CI users; n = 11) and late signers (LS CI users; n = 10) -- and a group of hearing, non-signing, age-matched controls (n = 12) performed an emotion recognition task with auditory, visual, and cross-modal emotionally congruent and incongruent speech stimuli. On different trials, participants categorized either the facial or the vocal expressions. The ES CI users more accurately recognized affective prosody than the LS CI users in the presence of congruent facial information. Furthermore, the ES CI users, but not the LS CI users, gained more than the controls from congruent visual stimuli when recognizing affective prosody. Both CI groups performed overall worse than the controls in recognizing affective prosody. These results suggest that early sign language experience affects multisensory emotion perception in CD CI users.

  10. Vocal responses of austral forest frogs to amplitude and degradation patterns of advertisement calls.

    PubMed

    Penna, Mario; Moreno-Gómez, Felipe N; Muñoz, Matías I; Cisternas, Javiera

    2017-07-01

    Degradation phenomena affecting animal acoustic signals may provide cues to assess the distance of emitters. Recognition of degraded signals has been extensively demonstrated in birds, and recently studies have also reported detection of degraded patterns in anurans that call at or above ground level. In the current study we explore the vocal responses of the syntopic burrowing male frogs Eupsophus emiliopugini and E. calcaratus from the South American temperate forest to synthetic conspecific calls differing in amplitude and emulating degraded and non-degraded signal patterns. The results show a strong dependence of vocal responses on signal amplitude, and a general lack of differential responses to signals with different pulse amplitude modulation depths in E. emiliopugini and no effect of relative amplitude of harmonics in E. calcaratus. Such limited discrimination of signal degradation patterns from non-degraded signals is likely related to the burrowing habits of these species. Shelters amplify outgoing and incoming conspecific vocalizations, but do not counteract signal degradation to an extent comparable to calling strategies used by other frogs. The limited detection abilities and resultant response permissiveness to degraded calls in these syntopic burrowing species would be advantageous for animals communicating in circumstances in which signal alteration prevails. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Auditory–vocal mirroring in songbirds

    PubMed Central

    Mooney, Richard

    2014-01-01

    Mirror neurons are theorized to serve as a neural substrate for spoken language in humans, but the existence and functions of auditory–vocal mirror neurons in the human brain remain largely matters of speculation. Songbirds resemble humans in their capacity for vocal learning and depend on their learned songs to facilitate courtship and individual recognition. Recent neurophysiological studies have detected putative auditory–vocal mirror neurons in a sensorimotor region of the songbird's brain that plays an important role in expressive and receptive aspects of vocal communication. This review discusses the auditory and motor-related properties of these cells, considers their potential role in song learning and communication in relation to classical studies of birdsong, and points to the circuit and developmental mechanisms that may give rise to auditory–vocal mirroring in the songbird's brain. PMID:24778375

  12. Auditory-vocal mirroring in songbirds.

    PubMed

    Mooney, Richard

    2014-01-01

    Mirror neurons are theorized to serve as a neural substrate for spoken language in humans, but the existence and functions of auditory-vocal mirror neurons in the human brain remain largely matters of speculation. Songbirds resemble humans in their capacity for vocal learning and depend on their learned songs to facilitate courtship and individual recognition. Recent neurophysiological studies have detected putative auditory-vocal mirror neurons in a sensorimotor region of the songbird's brain that plays an important role in expressive and receptive aspects of vocal communication. This review discusses the auditory and motor-related properties of these cells, considers their potential role in song learning and communication in relation to classical studies of birdsong, and points to the circuit and developmental mechanisms that may give rise to auditory-vocal mirroring in the songbird's brain.

  13. Impaired perception of facial emotion in developmental prosopagnosia.

    PubMed

    Biotti, Federica; Cook, Richard

    2016-08-01

    Developmental prosopagnosia (DP) is a neurodevelopmental condition characterised by difficulties recognising faces. Despite severe difficulties recognising facial identity, expression recognition is typically thought to be intact in DP; case studies have described individuals who are able to correctly label photographic displays of facial emotion, and no group differences have been reported. This pattern of deficits suggests a locus of impairment relatively late in the face processing stream, after the divergence of expression and identity analysis pathways. To date, however, there has been little attempt to investigate emotion recognition systematically in a large sample of developmental prosopagnosics using sensitive tests. In the present study, we describe three complementary experiments that examine emotion recognition in a sample of 17 developmental prosopagnosics. In Experiment 1, we investigated observers' ability to make binary classifications of whole-face expression stimuli drawn from morph continua. In Experiment 2, observers judged facial emotion using only the eye-region (the rest of the face was occluded). Analyses of both experiments revealed diminished ability to classify facial expressions in our sample of developmental prosopagnosics, relative to typical observers. Imprecise expression categorisation was particularly evident in those individuals exhibiting apperceptive profiles, associated with problems encoding facial shape accurately. Having split the sample of prosopagnosics into apperceptive and non-apperceptive subgroups, only the apperceptive prosopagnosics were impaired relative to typical observers. In our third experiment, we examined observers' ability to classify the emotion present within segments of vocal affect. Despite difficulties judging facial emotion, the prosopagnosics exhibited excellent recognition of vocal affect. Contrary to the prevailing view, our results suggest that many prosopagnosics do experience difficulties classifying expressions, particularly those with apperceptive profiles. These individuals may have difficulties forming view-invariant structural descriptions at an early stage in the face processing stream, before identity and expression pathways diverge. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Speech therapy and voice recognition instrument

    NASA Technical Reports Server (NTRS)

    Cohen, J.; Babcock, M. L.

    1972-01-01

    Characteristics of an electronic circuit for examining variations in vocal excitation for diagnostic purposes and in speech recognition for determining voice patterns and pitch changes are described. Operation of the circuit is discussed and a circuit diagram is provided.

  15. Path Models of Vocal Emotion Communication

    PubMed Central

    Bänziger, Tanja; Hosoya, Georg; Scherer, Klaus R.

    2015-01-01

    We propose to use a comprehensive path model of vocal emotion communication, encompassing encoding, transmission, and decoding processes, to empirically model data sets on emotion expression and recognition. The utility of the approach is demonstrated for two data sets from two different cultures and languages, based on corpora of vocal emotion enactment by professional actors and emotion inference by naïve listeners. Lens model equations, hierarchical regression, and multivariate path analysis are used to compare the relative contributions of objectively measured acoustic cues in the enacted expressions and subjective voice cues as perceived by listeners to the variance in emotion inference from vocal expressions for four emotion families (fear, anger, happiness, and sadness). While the results confirm the central role of arousal in vocal emotion communication, the utility of applying an extended path modeling framework is demonstrated by the identification of unique combinations of distal cues and proximal percepts carrying information about specific emotion families, independent of arousal. The statistical models generated show that more sophisticated acoustic parameters need to be developed to explain the distal underpinnings of subjective voice quality percepts that account for much of the variance in emotion inference, in particular voice instability and roughness. The general approach advocated here, as well as the specific results, open up new research strategies for work in psychology (specifically emotion and social perception research) and engineering and computer science (specifically research and development in the domain of affective computing, particularly on automatic emotion detection and synthetic emotion expression in avatars). PMID:26325076
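
    The lens-model comparison at the heart of this approach can be sketched by regressing emotion inferences on distal acoustic cues alone and then on distal cues plus proximal percepts, and comparing explained variance. The simulated cues, percepts, and weights below are illustrative assumptions, not the corpora or models used in the paper.

```python
# Hedged sketch of a lens-model style variance comparison.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 120
distal = rng.normal(size=(n, 3))              # e.g., f0 mean, intensity, rate
# Proximal percepts (e.g., perceived roughness) depend noisily on distal cues:
proximal = distal @ rng.normal(size=(3, 2)) * 0.5 + rng.normal(size=(n, 2))
# Inference driven partly by a distal cue, partly by a percept:
inference = distal[:, 0] + 0.8 * proximal[:, 1] + rng.normal(0, 0.5, n)

r2_distal = LinearRegression().fit(distal, inference).score(distal, inference)
both = np.hstack([distal, proximal])
r2_both = LinearRegression().fit(both, inference).score(both, inference)
print(f"R^2 distal cues only: {r2_distal:.2f}; + proximal percepts: {r2_both:.2f}")
```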

  16. Within-individual variation in bullfrog vocalizations: implications for a vocally mediated social recognition system.

    PubMed

    Bee, Mark A

    2004-12-01

    Acoustic signals provide a basis for social recognition in a wide range of animals. Few studies, however, have attempted to relate the patterns of individual variation in signals to behavioral discrimination thresholds used by receivers to discriminate among individuals. North American bullfrogs (Rana catesbeiana) discriminate among familiar and unfamiliar individuals based on individual variation in advertisement calls. The sources, patterns, and magnitudes of variation in eight acoustic properties of multiple-note advertisement calls were examined to understand how patterns of within-individual variation might either constrain, or provide additional cues for, vocal recognition. Six of eight acoustic properties exhibited significant note-to-note variation within multiple-note calls. Despite this source of within-individual variation, all call properties varied significantly among individuals, and multivariate analyses indicated that call notes were individually distinct. Fine-temporal and spectral call properties exhibited less within-individual variation compared to gross-temporal properties and contributed most toward statistically distinguishing among individuals. Among-individual differences in the patterns of within-individual variation in some properties suggest that within-individual variation could also function as a recognition cue. The distributions of among-individual and within-individual differences were used to generate hypotheses about the expected behavioral discrimination thresholds of receivers.
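
    The among- versus within-individual comparison can be sketched per acoustic property with a one-way ANOVA across callers plus a ratio of among- to within-individual coefficients of variation. The simulated "spectral" and "temporal" properties below are illustrative, not the study's measurements.

```python
# Hedged sketch: quantifying individual distinctiveness per call property.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(8)
n_frogs, calls_per = 10, 12
# A "spectral" property with strong individual signatures vs. a labile one:
props = {
    "dominant_freq": rng.normal(rng.normal(200, 30, n_frogs)[:, None], 5,
                                (n_frogs, calls_per)),
    "call_duration": rng.normal(rng.normal(800, 20, n_frogs)[:, None], 80,
                                (n_frogs, calls_per)),
}
for name, m in props.items():
    F, p = f_oneway(*m)                           # each row = one individual
    cv_within = np.mean(m.std(axis=1) / m.mean(axis=1))
    cv_among = m.mean(axis=1).std() / m.mean()
    print(f"{name}: F={F:.1f} (p={p:.1g}), "
          f"CV_among/CV_within={cv_among / cv_within:.2f}")
```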

  17. Automated Assessment of Child Vocalization Development Using LENA.

    PubMed

    Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance

    2017-07-12

    To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and entered into age-based multiple linear regression models to predict independently collected criterion expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.
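
    The modeling chain described in the abstract (biphone frequencies, principal components, age-band linear regression onto a criterion score, then an age-standardized estimate) can be sketched minimally as follows. The data are simulated, since the actual AVA models and LENA feature inventory are not reproduced here.

```python
# Hedged sketch of an AVA-style estimate from biphone frequency vectors.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)
n_children, n_biphones = 200, 60
counts = rng.poisson(5, (n_children, n_biphones)).astype(float)
freqs = counts / counts.sum(axis=1, keepdims=True)   # biphone frequencies
# Simulated criterion expressive language score for one age band:
criterion = freqs @ rng.normal(0, 40, n_biphones) + rng.normal(0, 3, n_children)

pcs = PCA(n_components=10).fit_transform(freqs)      # principal components
pred = LinearRegression().fit(pcs, criterion).predict(pcs)

# Age-standardized estimate (mean 100, SD 15 within this age band):
ava = 100 + 15 * (pred - pred.mean()) / pred.std()
print(ava[:5].round(1))
```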

  18. Discussion: Changes in Vocal Production and Auditory Perception after Hair Cell Regeneration.

    ERIC Educational Resources Information Center

    Ryals, Brenda M.; Dooling, Robert J.

    2000-01-01

    A bird study found that with sufficient time and training after hair cell and hearing loss and hair cell regeneration, the mature avian auditory system can accommodate input from a newly regenerated periphery sufficiently to allow for recognition of previously familiar vocalizations and the learning of new complex acoustic classifications.…

  19. Somatosensory Representations Link the Perception of Emotional Expressions and Sensory Experience.

    PubMed

    Kragel, Philip A; LaBar, Kevin S

    2016-01-01

    Studies of human emotion perception have linked a distributed set of brain regions to the recognition of emotion in facial, vocal, and body expressions. In particular, lesions to somatosensory cortex in the right hemisphere have been shown to impair recognition of facial and vocal expressions of emotion. Although these findings suggest that somatosensory cortex represents body states associated with distinct emotions, such as a furrowed brow or gaping jaw, functional evidence directly linking somatosensory activity and subjective experience during emotion perception is critically lacking. Using functional magnetic resonance imaging and multivariate decoding techniques, we show that perceiving vocal and facial expressions of emotion yields hemodynamic activity in right somatosensory cortex that discriminates among emotion categories, exhibits somatotopic organization, and tracks self-reported sensory experience. The findings both support embodied accounts of emotion and provide mechanistic insight into how emotional expressions are capable of biasing subjective experience in those who perceive them.
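
    The multivariate decoding step can be sketched as cross-validated classification of emotion category from voxel patterns, with above-chance accuracy as the quantity of interest. The simulated patterns below are illustrative; real analyses add subject-level statistics and permutation testing.

```python
# Hedged sketch: decoding emotion categories from simulated voxel patterns.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
n_per_class, n_voxels, n_classes = 40, 150, 5
# Each emotion category gets a weak, distributed activation pattern:
patterns = rng.normal(0.0, 0.5, (n_classes, n_voxels))
X = np.vstack([p + rng.normal(0.0, 1.0, (n_per_class, n_voxels))
               for p in patterns])
y = np.repeat(np.arange(n_classes), n_per_class)

acc = cross_val_score(LinearSVC(max_iter=5000), X, y, cv=5).mean()
print(f"decoding accuracy: {acc:.2f} (chance = {1 / n_classes:.2f})")
```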

  1. Coding of vocalizations by single neurons in ventrolateral prefrontal cortex.

    PubMed

    Plakke, Bethany; Diltz, Mark D; Romanski, Lizabeth M

    2013-11-01

    Neuronal activity in single prefrontal neurons has been correlated with behavioral responses, rules, task variables, and stimulus features. In the non-human primate, neurons recorded in ventrolateral prefrontal cortex (VLPFC) have been found to respond to species-specific vocalizations. Previous studies have found multisensory neurons in this region that respond to simultaneously presented faces and vocalizations. Behavioral data suggest that face and vocal information are inextricably linked in animals and humans and may therefore also be tightly linked in the coding of communication calls in prefrontal neurons. In this study we therefore examined the role of VLPFC in encoding vocalization call type information. Specifically, we examined previously recorded single unit responses from the VLPFC of awake, behaving rhesus macaques in response to three types of species-specific vocalizations made by three individual callers. Analysis of responses by vocalization call type and caller identity showed that ∼19% of cells had a main effect of call type, with fewer cells encoding caller. Classification performance of VLPFC neurons was ∼42% averaged across the population. When assessed at discrete time bins, classification performance reached 70% for coos in the first 300 ms and remained above chance for the duration of the response period, though performance was lower for other call types. In light of the suboptimal classification performance of the majority of VLPFC neurons when only vocal information is present, and the recent evidence that most VLPFC neurons are multisensory, the potential enhancement of classification with the addition of accompanying face information is discussed and additional studies recommended. Behavioral and neuronal evidence has shown a considerable benefit in recognition and memory performance when faces and voices are presented simultaneously. In the natural environment both facial and vocal information are present simultaneously, and neural systems no doubt evolved to integrate multisensory stimuli during recognition. This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".

  2. Individual identity and affective valence in marmoset calls: in vivo brain imaging with vocal sound playback.

    PubMed

    Kato, Masaki; Yokoyama, Chihiro; Kawasaki, Akihiro; Takeda, Chiho; Koike, Taku; Onoe, Hirotaka; Iriki, Atsushi

    2018-05-01

    As with humans, vocal communication is an important social tool for nonhuman primates. Common marmosets (Callithrix jacchus) often produce whistle-like 'phee' calls when they are visually separated from conspecifics. The neural processes specific to phee call perception, however, are largely unknown, despite the possibility that these processes involve social information. Here, we examined behavioral and whole-brain mapping evidence regarding the detection of individual conspecific phee calls using an audio playback procedure. Phee calls evoked sound exploratory responses when the caller changed, indicating that marmosets can discriminate between caller identities. Positron emission tomography with [18F]fluorodeoxyglucose revealed that perception of phee calls from a single subject was associated with activity in the dorsolateral prefrontal, medial prefrontal, and orbitofrontal cortices, and the amygdala. These findings suggest that these regions are implicated in cognitive and affective processing of salient social information. However, phee calls from multiple subjects induced brain activation in only some of these regions, such as the dorsolateral prefrontal cortex. We also found distinctive brain deactivation and functional connectivity associated with phee call perception depending on the caller change. According to changes in pupillary size, phee calls from a single subject induced a higher arousal level compared with those from multiple subjects. These results suggest that marmoset phee calls convey information about individual identity and affective valence depending on the consistency or variability of the caller. Based on the flexible perception of the call based on individual recognition, humans and marmosets may share some neural mechanisms underlying conspecific vocal perception.

  3. Asymmetries in the individual distinctiveness and maternal recognition of infant contact calls and distress screams in baboons

    PubMed Central

    Rendall, Drew; Notman, Hugh; Owren, Michael J.

    2009-01-01

    A key component of nonhuman primate vocal communication is the production and recognition of clear cues to social identity that function in the management of these species’ individualistic social relationships. However, it remains unclear how ubiquitous such identity cues are across call types and age-sex classes and what the underlying vocal production mechanisms responsible might be. This study focused on two structurally distinct call types produced by infant baboons in contexts that place a similar functional premium on communicating clear cues to caller identity: (1) contact calls produced when physically separated from, and attempting to relocate, mothers and (2) distress screams produced when aggressively attacked by other group members. Acoustic analyses and field experiments were conducted to examine individual differentiation in single vocalizations of each type and to test mothers’ ability to recognize infant calls. Both call types showed statistically significant individual differentiation, but the magnitude of the differentiation was substantially higher in contact calls. Mothers readily discriminated own-offspring contact calls from those of familiar but unrelated infants, but did not do so when it came to distress screams. Several possible explanations for these asymmetries in call differentiation and recognition are considered. PMID:19275336

  4. Pinniped bioacoustics: Atmospheric and hydrospheric signal production, reception, and function

    NASA Astrophysics Data System (ADS)

    Schusterman, Ronald J.; Kastak, David; Reichmuth Kastak, Colleen; Holt, Marla; Southall, Brandon L.

    2004-05-01

    There is no convincing evidence that any of the 33 pinniped species evolved acoustic specializations for echolocation. However, all species produce and localize signals amphibiously in different communicative contexts. In the setting of sexual selection, aquatically mating male phocids and walruses tend to emit underwater calls, while male otariids and phocids that breed terrestrially emit airborne calls. Signature vocalizations are widespread among pinnipeds. There is evidence that males use signature threat calls, and it is possible that vocal recognition may be used by territorial males to form categories consisting of neighbors and strangers. In terms of mother-offspring recognition, both otariid females and their pups use acoustical cues for mutual recognition. In contrast, reunions between phocid females and their dependent pups depend mostly on pup vocalizations. In terms of signal reception, audiometric studies show that otariids are highly sensitive to aerial sounds but slightly less sensitive to underwater sounds. Conversely, except for deep-diving elephant seals, phocids are quite sensitive to acoustic signals both in air and under water. Finally, despite differences in absolute hearing sensitivity, pinnipeds have similar masked hearing capabilities in both media, supporting the notion that cochlear mechanics determine the effects of noise on hearing.

  5. The program complex for vocal recognition

    NASA Astrophysics Data System (ADS)

    Konev, Anton; Kostyuchenko, Evgeny; Yakimuk, Alexey

    2017-01-01

    This article discusses the possibility of applying a pitch-frequency estimation algorithm to note recognition problems. A preliminary survey of analogous programs offering a "music recognition" function was carried out. A software package based on the pitch-frequency calculation algorithm was implemented and tested. The algorithm was shown to recognize the notes in a user's vocal performance. The sound source can be a single musical instrument, a set of musical instruments, or a human voice humming a tune. The input file is initially presented in the .wav format or is recorded in this format from a microphone. Processing is performed by sequentially determining the pitch frequency and converting its values to notes. Based on the test results, modifications to the algorithms used in the complex were planned.
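
    The core conversion step, mapping an estimated pitch frequency to the nearest equal-tempered note, is simple enough to sketch. The pitch-estimation algorithm itself is not detailed in the abstract, so only the frequency-to-note mapping is shown, assuming A4 = 440 Hz.

    ```python
    # Map an estimated pitch frequency (Hz) to the nearest equal-tempered note.
    import math

    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

    def freq_to_note(f_hz: float) -> str:
        """Return the nearest note name, using A4 = 440 Hz (MIDI note 69)."""
        midi = round(69 + 12 * math.log2(f_hz / 440.0))
        return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

    for f in (261.6, 329.6, 440.0):       # C4, E4, A4
        print(f, "->", freq_to_note(f))
    ```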

  6. Apoptosis and Vocal Fold Disease: Clinically Relevant Implications of Epithelial Cell Death

    ERIC Educational Resources Information Center

    Novaleski, Carolyn K.; Carter, Bruce D.; Sivasankar, M. Preeti; Ridner, Sheila H.; Dietrich, Mary S.; Rousseau, Bernard

    2017-01-01

    Purpose: Vocal fold diseases affecting the epithelium have a detrimental impact on vocal function. This review article provides an overview of apoptosis, the most commonly studied type of programmed cell death. Because apoptosis can damage epithelial cells, this article examines the implications of apoptosis on diseases affecting the vocal fold…

  7. Automated recognition of bird song elements from continuous recordings using dynamic time warping and hidden Markov models: a comparative study.

    PubMed

    Kogan, J A; Margoliash, D

    1998-04-01

    The performance of two techniques is compared for automated recognition of bird song units from continuous recordings. The advantages and limitations of dynamic time warping (DTW) and hidden Markov models (HMMs) are evaluated on a large database of male songs of zebra finches (Taeniopygia guttata) and indigo buntings (Passerina cyanea), which have different types of vocalizations and have been recorded under different laboratory conditions. Depending on the quality of recordings and complexity of song, the DTW-based technique gives excellent to satisfactory performance. Under challenging conditions such as noisy recordings or presence of confusing short-duration calls, good performance of the DTW-based technique requires careful selection of templates that may demand expert knowledge. Because HMMs are trained, equivalent or even better performance of HMMs can be achieved based only on segmentation and labeling of constituent vocalizations, albeit with many more training examples than DTW templates. One weakness in HMM performance is the misclassification of short-duration vocalizations or song units with more variable structure (e.g., some calls, and syllables of plastic songs). To address these and other limitations, new approaches for analyzing bird vocalizations are discussed.
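
    The DTW side of the comparison reduces to aligning a candidate segment against a labeled template and scoring the residual dissimilarity, as in the minimal sketch below. Real systems align spectral feature sequences (e.g., per-frame MFCCs) rather than toy 1-D contours.

    ```python
    # Minimal dynamic time warping: align two sequences and return the
    # cumulative alignment cost (0 means a perfect warped match).
    import numpy as np

    def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    template = np.array([0., 1., 3., 4., 3., 1., 0.])
    segment = np.array([0., 0., 1., 3., 4., 4., 3., 1., 0.])
    print(dtw_distance(template, segment))   # 0.0: the segment matches the template
    ```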

  8. Distributed Recognition of Natural Songs by European Starlings

    ERIC Educational Resources Information Center

    Knudsen, Daniel; Thompson, Jason V.; Gentner, Timothy Q.

    2010-01-01

    Individual vocal recognition behaviors in songbirds provide an excellent framework for the investigation of comparative psychological and neurobiological mechanisms that support the perception and cognition of complex acoustic communication signals. To this end, the complex songs of European starlings have been studied extensively. Yet, several…

  9. On the Time Course of Vocal Emotion Recognition

    PubMed Central

    Pell, Marc D.; Kotz, Sonja A.

    2011-01-01

    How quickly do listeners recognize emotions from a speaker's voice, and does the time course for recognition vary by emotion type? To address these questions, we adapted the auditory gating paradigm to estimate how much vocal information is needed for listeners to categorize five basic emotions (anger, disgust, fear, sadness, happiness) and neutral utterances produced by male and female speakers of English. Semantically anomalous pseudo-utterances (e.g., The rivix jolled the silling) conveying each emotion were divided into seven gate intervals according to the number of syllables that listeners heard from sentence onset. Participants (n = 48) judged the emotional meaning of stimuli presented at each gate duration interval, in a successive, blocked presentation format. Analyses looked at how recognition of each emotion evolves as an utterance unfolds and estimated the “identification point” for each emotion. Results showed that anger, sadness, fear, and neutral expressions are recognized more accurately at short gate intervals than happiness, and particularly disgust; however, as speech unfolds, recognition of happiness improves significantly towards the end of the utterance (and fear is recognized more accurately than other emotions). When the gate associated with the emotion identification point of each stimulus was calculated, data indicated that fear (M = 517 ms), sadness (M = 576 ms), and neutral (M = 510 ms) expressions were identified from shorter acoustic events than the other emotions. These data reveal differences in the underlying time course for conscious recognition of basic emotions from vocal expressions, which should be accounted for in studies of emotional speech processing. PMID:22087275
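
    The notion of an "identification point" can be made concrete: it is the earliest gate from which a listener's response is correct and stays correct through every later gate. The sketch below uses hypothetical gate durations; the study's gates were syllable-based, not fixed-length.

    ```python
    # Find the earliest gate after which all responses remain correct.
    def identification_point(correct_by_gate, gate_durations_ms):
        for g in range(len(correct_by_gate)):
            if all(correct_by_gate[g:]):
                return gate_durations_ms[g]
        return None   # the emotion was never reliably identified

    responses = [False, False, True, True, True, True, True]   # 7 gates
    durations = [180, 360, 540, 720, 900, 1080, 1260]          # hypothetical ms values
    print(identification_point(responses, durations))          # -> 540
    ```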

  10. The effect of time on word learning: An examination of decay of the memory trace and vocal rehearsal in children with and without specific language impairment

    PubMed Central

    Alt, Mary; Spaulding, Tammie

    2011-01-01

    Purpose The purpose of this study was to measure the effect of time to response in a fast-mapping word learning task for children with Specific Language Impairment (SLI) and children with typically-developing language skills (TD). Manipulating time to response allows us to examine decay of the memory trace, the use of vocal rehearsal, and their effects on word learning. Method Participants included 40 school-age children: half with SLI and half with TD. The children were asked to expressively and receptively fast-map 24 novel labels for 24 novel animated dinosaurs. They were asked to demonstrate learning either immediately after presentation of the novel word or after a 10-second delay. Data were collected on the use of vocal rehearsal and for recognition and production accuracy. Results Although the SLI group was less accurate overall, there was no evidence of decay of the memory trace. Both groups used vocal rehearsal at comparable rates, which did not vary when learning was tested immediately or after a delay. Use of vocal rehearsal resulted in better accuracy on the recognition task, but only for the TD group. Conclusions A delay in time to response without interference was not an undue burden for either group. Despite the fact that children with SLI used a vocal rehearsal strategy as often as unimpaired peers, they did not benefit from the strategy in the same way as their peers. Possible explanations for these findings and clinical implications will be discussed. PMID:21885056

  11. Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data

    PubMed Central

    Cao, Beiming; Kim, Myungjong; Mau, Ted; Wang, Jun

    2017-01-01

    Individuals with an impaired larynx (vocal folds) have problems controlling glottal vibration, producing whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as tongue and lip motion may help in recognizing whispered speech, since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with a surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed that adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficients (MFCCs) significantly reduced the phone error rates of both speech recognition systems. Adding both tongue and lip data achieved the best performance. PMID:29423453
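
    The fusion idea is straightforward at the feature level: per-frame acoustic features are concatenated with synchronized articulatory features before being passed to the GMM-HMM or DNN-HMM recognizer. The sketch below uses synthetic arrays and assumed sensor counts.

    ```python
    # Concatenate acoustic and articulatory observations frame by frame.
    import numpy as np

    n_frames = 120
    mfcc = np.random.randn(n_frames, 13)     # 13 MFCCs per frame (a typical choice)
    tongue = np.random.randn(n_frames, 6)    # coordinates of two tongue sensors (assumed)
    lips = np.random.randn(n_frames, 6)      # coordinates of two lip sensors (assumed)

    fused = np.hstack([mfcc, tongue, lips])  # acoustic + articulatory feature vector
    print(fused.shape)                       # (120, 25) -> input to a GMM-/DNN-HMM
    ```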

  12. Paradigms and progress in vocal fold restoration.

    PubMed

    Ford, Charles N

    2008-09-01

    Science advances occur through orderly steps, puzzle-solving leaps, or divergences from the accepted disciplinary matrix that occasionally result in a revolutionary paradigm shift. Key advances must overcome bias, criticism, and rejection. Examples in biological science include use of embryonic stem cells, recognition of Helicobacter pylori in the etiology of ulcer disease, and the evolution of species. Our work in vocal fold restoration reflects these patterns. We progressed through phases of tissue replacement with fillers and biological implants, to current efforts at vocal fold regeneration through tissue engineering, and face challenges of a new "systems biology" paradigm embracing genomics and proteomics.

  13. Comments on "Intraspecific and geographic variation of West Indian manatee (Trichechus manatus spp.) vocalizations" [J. Acoust. Soc. Am. 114, 66-69 (2003)].

    PubMed

    Sousa-Lima, Renata S

    2006-06-01

    This letter concerns the paper "Intraspecific and geographic variation of West Indian manatee (Trichechus manatus spp.) vocalizations" [Nowacek et al., J. Acoust. Soc. Am. 114, 66-69 (2003)]. The purpose here is to correct the fundamental frequency range and information on intraindividual variation in the vocalizations of Amazonian manatees reported by Nowacek et al. (2003) in citing the paper "Signature information and individual recognition in the isolation calls of Amazonian manatees, Trichechus inunguis (Mammalia: Sirenia)" [Sousa-Lima et al., Anim. Behav. 63, 301-310 (2002)].

  14. Acoustic correlates of body size and individual identity in banded penguins

    PubMed Central

    Gamba, Marco; Gili, Claudia; Pessani, Daniela

    2017-01-01

    Animal vocalisations play a role in individual recognition and mate choice. In nesting penguins, acoustic variation in vocalisations originates from distinctiveness in the morphology of the vocal apparatus. Using the source-filter theory approach, we investigated vocal individuality cues and correlates of body size and mass in the ecstatic display songs of the Humboldt and Magellanic penguins. We demonstrate that both fundamental frequency (f0) and formants (F1-F4) are essential vocal features for discriminating among individuals. However, we show that only duration and f0 are honest indicators of body size and mass, respectively. We did not find any effect of body dimensions on formants, formant dispersion, or estimated vocal tract length of the emitters. Overall, our findings provide the first evidence that the resonant frequencies of the vocal tract do not correlate with body size in penguins. Our results add important information to a growing body of literature on the role of the different vocal parameters in conveying biologically meaningful information in bird vocalisations. PMID:28199318
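
    A toy source-filter sketch can illustrate the formant measurements involved, under stated assumptions (a synthetic signal with two known resonances, recovered via a Yule-Walker LPC fit). It shows the analysis idea only, not the study's actual method.

    ```python
    # Impose two known "formants" on white noise, then recover them as the
    # pole angles of a linear-prediction (LPC) model.
    import numpy as np
    from scipy.linalg import solve_toeplitz
    from scipy.signal import lfilter

    sr = 16000
    x = np.random.default_rng(5).normal(size=sr)
    for f in (700.0, 1900.0):                 # resonance frequencies to embed (Hz)
        r, w = 0.98, 2 * np.pi * f / sr
        x = lfilter([1.0], [1.0, -2 * r * np.cos(w), r * r], x)

    order = 4                                 # matches the two embedded pole pairs
    ac = np.array([x[: x.size - k] @ x[k:] for k in range(order + 1)])
    a = np.concatenate([[1.0], -solve_toeplitz(ac[:order], ac[1:])])   # Yule-Walker LPC
    roots = [z for z in np.roots(a) if z.imag > 0]
    formants = sorted(np.angle(z) * sr / (2 * np.pi) for z in roots)
    print([round(f) for f in formants])       # approximately [700, 1900]
    ```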

  15. Recognition of facial, auditory, and bodily emotions in older adults.

    PubMed

    Ruffman, Ted; Halberstadt, Jamin; Murray, Janice

    2009-11-01

    Understanding older adults' social functioning difficulties requires insight into their recognition of emotions in voices and bodies, not just faces, the focus of most prior research. We examined 60 young and 61 older adults' recognition of basic emotions in facial, vocal, and bodily expressions, and when matching faces and bodies to voices, using 120 emotion items. Older adults were worse than young adults in 17 of 30 comparisons, with consistent difficulties in recognizing both positive (happy) and negative (angry and sad) vocal and bodily expressions. Nearly three quarters of older adults functioned at a level similar to the lowest one fourth of young adults, suggesting that age-related changes are common. In addition, we found that older adults' difficulty in matching emotions was not explained by difficulty on the component sources (i.e., faces or voices on their own), suggesting an additional problem of integration.

  16. A memory like a female Fur Seal: long-lasting recognition of pup's voice by mothers.

    PubMed

    Mathevon, Nicolas; Charrier, Isabelle; Aubin, Thierry

    2004-06-01

    In colonial mammals like fur seals, mutual vocal recognition between mothers and their pups is of primary importance for breeding success. Females alternate feeding trips at sea with suckling periods on land, and when coming back from the ocean, they have to vocally find their offspring among numerous similar-looking pups. Young fur seals emit a 'mother-attraction call' that presents individual characteristics. In this paper, we review the perceptual process of pup call recognition by Subantarctic Fur Seal Arctocephalus tropicalis mothers. To identify their progeny, females rely on the frequency modulation pattern and spectral features of this call. As the acoustic characteristics of a pup's call change throughout the lactation period as the pup grows, mothers must continually update their memory of their pup's voice. Field experiments show that female Fur Seals are able to retain all the successive versions of their pup's call.

  17. Towards a computer-aided diagnosis system for vocal cord diseases.

    PubMed

    Verikas, A; Gelzinis, A; Bacauskiene, M; Uloza, V

    2006-01-01

    The objective of this work is to investigate the possibility of creating a computer-aided decision support system for automated analysis of vocal cord images, aiming to categorize diseases of the vocal cords. The problem is treated as a pattern recognition task. To obtain a concise and informative representation of a vocal cord image, colour, texture, and geometrical features are used. The representation is then analyzed by a pattern classifier categorizing the image into healthy, diffuse, and nodular classes. The approach was tested on 785 vocal cord images collected at the Department of Otolaryngology, Kaunas University of Medicine, Lithuania. A correct classification rate of over 87% was obtained when categorizing a set of unseen images into the aforementioned three classes. Bearing in mind the high similarity of the decision classes, the results are rather encouraging, and the developed tools could be very helpful for ensuring objective analysis of images of laryngeal diseases.
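
    The pipeline described (hand-crafted colour/texture/geometry features feeding a three-class classifier) can be sketched as follows. Features and labels are synthetic stand-ins; the original system's exact descriptors and classifier are not reproduced.

    ```python
    # Feature vectors -> three-class classifier (healthy / diffuse / nodular),
    # evaluated with cross-validation; random data, so accuracy sits near chance.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    n_images, n_features = 300, 40           # e.g., colour histograms + texture statistics
    X = rng.normal(size=(n_images, n_features))
    y = rng.integers(0, 3, size=n_images)    # 0 = healthy, 1 = diffuse, 2 = nodular

    print(cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean())
    ```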

  18. Flight calls signal group and individual identity but not kinship in a cooperatively breeding bird.

    PubMed

    Keen, Sara C; Meliza, C Daniel; Rubenstein, Dustin R

    2013-11-01

    In many complex societies, intricate communication and recognition systems may evolve to help support both direct and indirect benefits of group membership. In cooperatively breeding species where groups typically comprise relatives, both learned and innate vocal signals may serve as reliable cues for kin recognition. Here, we investigated vocal communication in the plural cooperatively breeding superb starling, Lamprotornis superbus, where flight calls (short, stereotyped vocalizations used when approaching conspecifics) may communicate kin relationships, group membership, and/or individual identity. We found that flight calls were most similar within individual repertoires but were also more similar within groups than within the larger population. Although starlings responded differently to playback of calls from their own versus other neighboring and distant social groups, call similarity was uncorrelated with genetic relatedness. Additionally, immigrant females showed similar patterns to birds born in the study population. Together, these results suggest that flight calls are learned signals that reflect social association but may also carry a signal of individuality. Flight calls, therefore, provide a reliable recognition mechanism for groups and may also be used to recognize individuals. In complex societies comprising related and unrelated individuals, signaling individuality and group association, rather than kinship, may be a route to cooperation.

  19. The Effect of Classroom Capacity on Vocal Fatigue as Quantified by the Vocal Fatigue Index.

    PubMed

    Banks, Russell E; Bottalico, Pasquale; Hunter, Eric J

    2017-01-01

    Previous research has concluded that teachers are at a higher-than-normal risk for voice issues that can cause occupational limitations. While some risk factors have been identified, there are still many unknowns. A survey was distributed electronically to 506 female teacher respondents. The survey included questions to quantify three aspects of vocal fatigue as captured by the Vocal Fatigue Index (VFI): (1) general tiredness of voice (performance), (2) physical discomfort associated with voicing (pain), and (3) improvement of symptoms with rest (recovery). The effect of classroom capacity on US teachers' self-reported experience of vocal fatigue was analyzed. The results indicated that classroom capacity significantly affected teachers' reported amount of vocal fatigue, as did teacher age. The age effect was quadratic rather than linear, with the largest effect occurring at around 40-45 years in all three factors of the VFI. Further factors that may affect vocal fatigue must be explored in future research. By understanding what increases the risk for vocal fatigue, educators and school administrators can take precautions to mitigate the occupational risk of short- and long-term vocal health issues in school teachers.
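
    The quadratic age effect reported here corresponds to fitting VFI = b0 + b1*age + b2*age^2, so the fitted curve can peak at an intermediate age. The sketch below recovers such a peak from simulated data.

    ```python
    # Fit a quadratic regression of vocal fatigue on age and locate its peak.
    import numpy as np

    rng = np.random.default_rng(2)
    age = rng.uniform(22, 65, 506)
    vfi = -0.02 * (age - 42) ** 2 + 30 + rng.normal(0, 2, age.size)   # simulated peak near 42

    b2, b1, b0 = np.polyfit(age, vfi, deg=2)   # highest-order coefficient first
    peak_age = -b1 / (2 * b2)                  # vertex of the fitted parabola
    print(round(peak_age, 1))                  # recovers a peak around age 42
    ```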

  20. Recognition of Facial Expressions and Prosodic Cues with Graded Emotional Intensities in Adults with Asperger Syndrome

    ERIC Educational Resources Information Center

    Doi, Hirokazu; Fujisawa, Takashi X.; Kanai, Chieko; Ohta, Haruhisa; Yokoi, Hideki; Iwanami, Akira; Kato, Nobumasa; Shinohara, Kazuyuki

    2013-01-01

    This study investigated the ability of adults with Asperger syndrome to recognize emotional categories of facial expressions and emotional prosodies with graded emotional intensities. The individuals with Asperger syndrome showed poorer recognition performance for angry and sad expressions from both facial and vocal information. The group…

  1. Communication Skills Training Exploiting Multimodal Emotion Recognition

    ERIC Educational Resources Information Center

    Bahreini, Kiavash; Nadolski, Rob; Westera, Wim

    2017-01-01

    The teaching of communication skills is a labour-intensive task because of the detailed feedback that should be given to learners during their prolonged practice. This study investigates to what extent our FILTWAM facial and vocal emotion recognition software can be used for improving a serious game (the Communication Advisor) that delivers a…

  2. Effect of Acting Experience on Emotion Expression and Recognition in Voice: Non-Actors Provide Better Stimuli than Expected.

    PubMed

    Jürgens, Rebecca; Grass, Annika; Drolet, Matthis; Fischer, Julia

    Both in the performative arts and in emotion research, professional actors are assumed to be capable of delivering emotions comparable to spontaneous emotional expressions. This study examines the effects of acting training on vocal emotion depiction and recognition. We predicted that professional actors express emotions in a more realistic fashion than non-professional actors. However, professional acting training may lead to a particular speech pattern; this might account for vocal expressions by actors that are less comparable to authentic samples than the ones by non-professional actors. We compared 80 emotional speech tokens from radio interviews with 80 re-enactments by professional and inexperienced actors, respectively. We analyzed recognition accuracies for emotion and authenticity ratings and compared the acoustic structure of the speech tokens. Both play-acted conditions yielded similar recognition accuracies and possessed more variable pitch contours than the spontaneous recordings. However, professional actors exhibited signs of different articulation patterns compared to non-trained speakers. Our results indicate that for emotion research, emotional expressions by professional actors are not better suited than those from non-actors.

  3. Cetacean vocal learning and communication.

    PubMed

    Janik, Vincent M

    2014-10-01

    The cetaceans are one of the few mammalian clades capable of vocal production learning. Evidence for this comes from synchronous changes in song patterns of baleen whales and experimental work on toothed whales in captivity. While baleen whales like many vocal learners use this skill in song displays that are involved in sexual selection, toothed whales use learned signals in individual recognition and the negotiation of social relationships. Experimental studies demonstrated that dolphins can use learned signals referentially. Studies on wild dolphins demonstrated how this skill appears to be useful in their own communication system, making them an interesting subject for comparative communication studies.

  4. Vocal copying of individually distinctive signature whistles in bottlenose dolphins

    PubMed Central

    King, Stephanie L.; Sayigh, Laela S.; Wells, Randall S.; Fellner, Wendi; Janik, Vincent M.

    2013-01-01

    Vocal learning is relatively common in birds but less so in mammals. Sexual selection and individual or group recognition have been identified as major forces in its evolution. While important in the development of vocal displays, vocal learning also allows signal copying in social interactions. Such copying can function in addressing or labelling selected conspecifics. Most examples of addressing in non-humans come from bird song, where matching occurs in an aggressive context. However, in other animals, addressing with learned signals is very much an affiliative signal. We studied the function of vocal copying in a mammal that shows vocal learning as well as complex cognitive and social behaviour, the bottlenose dolphin (Tursiops truncatus). Copying occurred almost exclusively between close associates such as mother–calf pairs and male alliances during separation and was not followed by aggression. All copies were clearly recognizable as such because copiers consistently modified some acoustic parameters of a signal when copying it. We found no evidence for the use of copying in aggression or deception. This use of vocal copying is similar to its use in human language, where the maintenance of social bonds appears to be more important than the immediate defence of resources. PMID:23427174

  5. Rapid onset of maternal vocal recognition in a colonially breeding mammal, the Australian sea lion.

    PubMed

    Pitcher, Benjamin J; Harcourt, Robert G; Charrier, Isabelle

    2010-08-13

    In many gregarious mammals, mothers and offspring have developed the abilities to recognise each other using acoustic signals. Such capacity may develop at different rates after birth/parturition, varying between species and between the participants, i.e., mothers and young. Differences in selective pressures between species, and between mothers and offspring, are likely to drive the timing of the onset of mother-young recognition. We tested the ability of Australian sea lion mothers to identify their offspring by vocalisation, and examined the onset of this behaviour in these females. We hypothesise that a rapid onset of recognition may reflect an adaptation to a colonial lifestyle. In a playback study maternal responses to own pup and non-filial vocalisations were compared at 12, 24 and every subsequent 24 hours until the females' first departure post-partum. Mothers showed a clear ability to recognise their pup's voice by 48 hours of age. At 24 hours mothers called more, at 48 hours they called sooner and at 72 hours they looked sooner in response to their own pup's vocalisations compared to those of non-filial pups. We demonstrate that Australian sea lion females can vocally identify offspring within two days of birth and before mothers leave to forage post-partum. We suggest that this rapid onset is a result of selection pressures imposed by a colonial lifestyle and may be seen in other colonial vertebrates. This is the first demonstration of the timing of the onset of maternal vocal recognition in a pinniped species.

  6. The Glasgow Voice Memory Test: Assessing the ability to memorize and recognize unfamiliar voices.

    PubMed

    Aglieri, Virginia; Watson, Rebecca; Pernet, Cyril; Latinus, Marianne; Garrido, Lúcia; Belin, Pascal

    2017-02-01

    One thousand one hundred and twenty subjects, as well as a developmental phonagnosic subject (KH) and age-matched controls, performed the Glasgow Voice Memory Test, which assesses the ability to encode and immediately recognize, through an old/new judgment, both unfamiliar voices (delivered as vowels, making language requirements minimal) and bell sounds. The inclusion of non-vocal stimuli allows the detection of significant dissociations between the two categories (vocal vs. non-vocal stimuli). The distributions of accuracy and sensitivity scores (d') reflected a wide range of individual differences in voice recognition performance in the population. As expected, KH showed a dissociation between the recognition of voices and bell sounds, her performance being significantly poorer than that of matched controls for voices but not for bells. By providing normative data from a large sample and by testing a developmental phonagnosic subject, we demonstrated that the Glasgow Voice Memory Test, available online and accessible from all over the world, can be a valid screening tool (~5 min) for the preliminary detection of potential cases of phonagnosia and of "super recognizers" for voices.
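
    The sensitivity score d' used here is computed from hit and false-alarm rates in the old/new judgment, d' = z(hit rate) - z(false-alarm rate), with a small correction keeping the rates away from 0 and 1. The counts below are illustrative.

    ```python
    # Compute d' for an old/new recognition test.
    from statistics import NormalDist

    def d_prime(hits, misses, false_alarms, correct_rejections):
        z = NormalDist().inv_cdf
        # Correction avoids infinite z-scores at rates of exactly 0 or 1.
        hit_rate = (hits + 0.5) / (hits + misses + 1)
        fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
        return z(hit_rate) - z(fa_rate)

    print(round(d_prime(hits=18, misses=2, false_alarms=5, correct_rejections=15), 2))
    ```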

  7. The Distribution and Severity of Tremor in Speech Structures of Persons with Vocal Tremor.

    PubMed

    Hemmerich, Abby L; Finnegan, Eileen M; Hoffman, Henry T

    2017-05-01

    Vocal tremor may be associated with cyclic oscillations in the pulmonary, laryngeal, velopharyngeal, or oral regions. This study aimed to correlate the overall severity of vocal tremor with the distribution and severity of tremor in the structures involved. Endoscopic and clinical examinations were completed on 20 adults with vocal tremor and two age-matched controls during sustained phonation. Two judges rated the severity of vocal tremor and the severity of tremor affecting each of 13 structures. Participants with mild vocal tremor typically presented with tremor in three laryngeal structures, those with moderate vocal tremor in five structures (laryngeal and another region), and those with severe vocal tremor in eight structures affecting all regions. The severity of tremor was lowest (mean = 1.2 out of 3) in persons with mild vocal tremor and greater in persons with moderate (mean = 1.5) and severe vocal tremor (mean = 1.4). Laryngeal structures were most frequently (95%) and severely (1.7 out of 3) affected, followed by the velopharyngeal (40% occurrence, 1.3 severity), pulmonary (40% occurrence, 1.1 severity), and oral (40% occurrence, 1.0 severity) regions. Regression analyses indicated that tremor severity of the supraglottic structures and vertical laryngeal movement contributed most to vocal tremor severity during sustained phonation (r = 0.77, F = 16.17, P < 0.0001). A strong positive correlation (r = 0.72) was found between the Tremor Index and the severity of the vocal tremor during sustained phonation. It is useful to obtain a wide endoscopic view of the larynx to visualize tremor, which is rarely isolated to the true vocal folds alone.

  8. Vocal cysts: clinical, endoscopic, and surgical aspects.

    PubMed

    Martins, Regina Helena Garcia; Santana, Marcela Ferreira; Tavares, Elaine Lara Mendes

    2011-01-01

    Vocal cysts are benign laryngeal lesions that affect children and adults. They can be classified as epidermic or mucous-retention cysts. The objective was to study the clinical, endoscopic, and surgical aspects of vocal cysts. We reviewed the medical charts of 72 patients with vocal cysts, considering age, gender, occupation, duration of vocal symptoms, nasosinusal and gastroesophageal symptoms, vocal abuse, smoking, alcoholism, associated lesions, treatment, and histological details. Of the 72 cases, 46 were adults (36 female and 10 male) and 26 were children (8 girls and 18 boys). As far as occupation is concerned, there was a higher incidence of students and teachers. All the patients had symptoms of chronic hoarseness. Nasosinusal (27.77%) and gastroesophageal (32%) symptoms were not relevant. Vocal abuse was reported by 45.83% of the patients, smoking by 18%, and alcoholism by 8.4%. Unilateral cysts were seen in 93% of the cases, and 22 patients had associated lesions, such as bridge, sulcus vocalis, and microweb. Surgical treatment was performed in 46 cases. Histological analysis of the epidermic cysts revealed a cavity with caseous content, covered by stratified squamous epithelium, often keratinized. Mucous cysts presented mucous content, and their walls were coated by a ciliated columnar epithelium. Vocal cysts are benign vocal fold lesions that affect children and adults, are often associated with vocal overuse, and frequently affect people who use their voices professionally. Vocal symptoms follow a chronic course, often since childhood, and the treatment of choice is surgical removal. A careful examination of the vocal folds is necessary during surgery, because other laryngeal lesions may be associated with vocal cysts.

  9. Affective State Level Recognition in Naturalistic Facial and Vocal Expressions.

    PubMed

    Meng, Hongying; Bianchi-Berthouze, Nadia

    2014-03-01

    Naturalistic affective expressions change at a rate much slower than the typical rate at which video or audio is recorded. This increases the probability that consecutive recorded instants of expressions represent the same affective content. In this paper, we exploit such a relationship to improve the recognition performance of continuous naturalistic affective expressions. Using datasets of naturalistic affective expressions (AVEC 2011 audio and video dataset, PAINFUL video dataset) continuously labeled over time and over different dimensions, we analyze the transitions between levels of those dimensions (e.g., transitions in pain intensity level). We use an information theory approach to show that the transitions occur very slowly and hence suggest modeling them as first-order Markov models. The dimension levels are considered to be the hidden states in the Hidden Markov Model (HMM) framework. Their discrete transition and emission matrices are trained by using the labels provided with the training set. The recognition problem is converted into a best path-finding problem to obtain the best hidden states sequence in HMMs. This is a key difference from previous use of HMMs as classifiers. Modeling of the transitions between dimension levels is integrated in a multistage approach, where the first level performs a mapping between the affective expression features and a soft decision value (e.g., an affective dimension level), and further classification stages are modeled as HMMs that refine that mapping by taking into account the temporal relationships between the output decision labels. The experimental results for each of the unimodal datasets show overall performance to be significantly above that of a standard classification system that does not take into account temporal relationships. In particular, the results on the AVEC 2011 audio dataset outperform all other systems presented at the international competition.
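
    The "best path-finding" step described above is Viterbi decoding: affect levels are hidden states whose slow transitions are captured by a strongly diagonal transition matrix, and per-frame soft decisions act as emission scores. The matrices in the sketch are illustrative, not the trained AVEC models.

    ```python
    # Viterbi decoding of a level sequence from frame-wise emission scores.
    import numpy as np

    def viterbi(log_trans, log_emit, log_init):
        T, S = log_emit.shape
        score = log_init + log_emit[0]
        back = np.zeros((T, S), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + log_trans        # cand[i, j]: prev state i -> state j
            back[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + log_emit[t]
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))
        return path[::-1]

    # Three affect levels; strong self-transitions encode slowly changing labels.
    trans = np.log(np.array([[.98, .01, .01], [.01, .98, .01], [.01, .01, .98]]))
    emit = np.log(np.full((10, 3), 1 / 3))           # uninformative early frames
    emit[5:, 2] = np.log(.9)                         # later frames favour level 2
    print(viterbi(trans, emit, np.log(np.full(3, 1 / 3))))   # -> all level 2
    ```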

  10. Multisensory emotion perception in congenitally, early, and late deaf CI users

    PubMed Central

    Fengler, Ineke; Nava, Elena; Villwock, Agnes K.; Büchner, Andreas; Lenarz, Thomas; Röder, Brigitte

    2017-01-01

    Emotions are commonly recognized by combining auditory and visual signals (i.e., vocal and facial expressions). Yet it is unknown whether the ability to link emotional signals across modalities depends on early experience with audio-visual stimuli. In the present study, we investigated the role of auditory experience at different stages of development for auditory, visual, and multisensory emotion recognition abilities in three groups of adolescent and adult cochlear implant (CI) users. CI users had a different deafness onset and were compared to three groups of age- and gender-matched hearing control participants. We hypothesized that congenitally deaf (CD) but not early deaf (ED) and late deaf (LD) CI users would show reduced multisensory interactions and a higher visual dominance in emotion perception than their hearing controls. The CD (n = 7), ED (deafness onset: <3 years of age; n = 7), and LD (deafness onset: >3 years; n = 13) CI users and the control participants performed an emotion recognition task with auditory, visual, and audio-visual emotionally congruent and incongruent nonsense speech stimuli. In different blocks, participants judged either the vocal (Voice task) or the facial expressions (Face task). In the Voice task, all three CI groups performed overall less efficiently than their respective controls and experienced higher interference from incongruent facial information. Furthermore, the ED CI users benefitted more than their controls from congruent faces and the CD CI users showed an analogous trend. In the Face task, recognition efficiency of the CI users and controls did not differ. Our results suggest that CI users acquire multisensory interactions to some degree, even after congenital deafness. When judging affective prosody they appear impaired and more strongly biased by concurrent facial information than typically hearing individuals. We speculate that limitations inherent to the CI contribute to these group differences. PMID:29023525

  12. Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition

    PubMed Central

    Borowiak, Kamila; von Kriegstein, Katharina

    2016-01-01

    The ability to recognise the identity of others is a key requirement for successful communication. Brain regions that respond selectively to voices exist in humans from early infancy on. Currently, it is unclear whether dysfunction of these voice-sensitive regions can explain voice identity recognition impairments. Here, we used two independent functional magnetic resonance imaging studies to investigate voice processing in a population that has been reported to have no voice-sensitive regions: autism spectrum disorder (ASD). Our results refute the earlier report that individuals with ASD have no responses in voice-sensitive regions: Passive listening to vocal, compared to non-vocal, sounds elicited typical responses in voice-sensitive regions in the high-functioning ASD group and controls. In contrast, the ASD group had a dysfunction in voice-sensitive regions during voice identity but not speech recognition in the right posterior superior temporal sulcus/gyrus (STS/STG)—a region implicated in processing complex spectrotemporal voice features and unfamiliar voices. The right anterior STS/STG correlated with voice identity recognition performance in controls but not in the ASD group. The findings suggest that right STS/STG dysfunction is critical for explaining voice recognition impairments in high-functioning ASD and show that ASD is not characterised by a general lack of voice-sensitive responses. PMID:27369067

  13. Affective responses in tamarins elicited by species-specific music

    PubMed Central

    Snowdon, Charles T.; Teie, David

    2010-01-01

    Theories of music evolution agree that human music has an affective influence on listeners. Tests of non-humans provided little evidence of preferences for human music. However, prosodic features of speech (‘motherese’) influence affective behaviour of non-verbal infants as well as domestic animals, suggesting that features of music can influence the behaviour of non-human species. We incorporated acoustical characteristics of tamarin affiliation vocalizations and tamarin threat vocalizations into corresponding pieces of music. We compared music composed for tamarins with that composed for humans. Tamarins were generally indifferent to playbacks of human music, but responded with increased arousal to tamarin threat vocalization based music, and with decreased activity and increased calm behaviour to tamarin affective vocalization based music. Affective components in human music may have evolutionary origins in the structure of calls of non-human animals. In addition, animal signals may have evolved to manage the behaviour of listeners by influencing their affective state. PMID:19726444

  14. Modulation of voice related to tremor and vibrato

    NASA Astrophysics Data System (ADS)

    Lester, Rosemary Anne

    Modulation of voice is a result of physiologic oscillation within one or more components of the vocal system including the breathing apparatus (i.e., pressure supply), the larynx (i.e. sound source), and the vocal tract (i.e., sound filter). These oscillations may be caused by pathological tremor associated with neurological disorders like essential tremor or by volitional production of vibrato in singers. Because the acoustical characteristics of voice modulation specific to each component of the vocal system and the effect of these characteristics on perception are not well-understood, it is difficult to assess individuals with vocal tremor and to determine the most effective interventions for reducing the perceptual severity of the disorder. The purpose of the present studies was to determine how the acoustical characteristics associated with laryngeal-based vocal tremor affect the perception of the magnitude of voice modulation, and to determine if adjustments could be made to the voice source and vocal tract filter to alter the acoustic output and reduce the perception of modulation. This research was carried out using both a computational model of speech production and trained singers producing vibrato to simulate laryngeal-based vocal tremor with different voice source characteristics (i.e., vocal fold length and degree of vocal fold adduction) and different vocal tract filter characteristics (i.e., vowel shapes). It was expected that, by making adjustments to the voice source and vocal tract filter that reduce the amplitude of the higher harmonics, the perception of magnitude of voice modulation would be reduced. The results of this study revealed that listeners' perception of the magnitude of modulation of voice was affected by the degree of vocal fold adduction and the vocal tract shape with the computational model, but only by the vocal quality (corresponding to the degree of vocal fold adduction) with the female singer. Based on regression analyses, listeners' judgments were predicted by modulation information in both low and high frequency bands. The findings from these studies indicate that production of a breathy vocal quality might be a useful compensatory strategy for reducing the perceptual severity of modulation of voice for individuals with tremor affecting the larynx.
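
    The basic mechanism under study, a slow physiologic oscillation imposed on the voice source, can be sketched as combined frequency and amplitude modulation of a steady tone. Rates and depths below are illustrative; the dissertation's computational model of speech production is far richer.

    ```python
    # Synthesize a tone with 5 Hz tremor as frequency + amplitude modulation.
    import numpy as np

    sr, dur = 16000, 1.0
    t = np.arange(int(sr * dur)) / sr
    f0, tremor_rate = 200.0, 5.0                 # voice pitch and tremor rate (Hz)
    fm_depth, am_depth = 0.03, 0.3               # 3% frequency, 30% amplitude modulation

    inst_freq = f0 * (1 + fm_depth * np.sin(2 * np.pi * tremor_rate * t))
    phase = 2 * np.pi * np.cumsum(inst_freq) / sr   # integrate frequency to get phase
    voice = (1 + am_depth * np.sin(2 * np.pi * tremor_rate * t)) * np.sin(phase)
    print(voice.shape)                           # one second of modulated "voice"
    ```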

  15. Human and animal sounds influence recognition of body language.

    PubMed

    Van den Stock, Jan; Grèzes, Julie; de Gelder, Beatrice

    2008-11-25

    In naturalistic settings emotional events have multiple correlates and are simultaneously perceived by several sensory systems. Recent studies have shown that recognition of facial expressions is biased towards the emotion expressed by a simultaneously presented emotional expression in the voice, even if attention is directed to the face only. So far, no study has examined whether this phenomenon also applies to whole body expressions, although there is no obvious reason why this crossmodal influence would be specific to faces. Here we investigated whether perception of emotions expressed in whole body movements is influenced by affective information provided by human and animal vocalizations. Participants were instructed to attend to the action displayed by the body and to categorize the expressed emotion. The results indicate that recognition of body language is biased towards the emotion expressed by the simultaneously presented auditory information, whether it consists of human or animal sounds. Our results show that a crossmodal influence from auditory to visual emotional information occurs for whole body video images with the facial expression blanked, and that it includes human as well as animal sounds.

  16. The Use of Voice Cues for Speaker Gender Recognition in Cochlear Implant Recipients

    ERIC Educational Resources Information Center

    Meister, Hartmut; Fürsen, Katrin; Streicher, Barbara; Lang-Roth, Ruth; Walger, Martin

    2016-01-01

    Purpose: The focus of this study was to examine the influence of fundamental frequency (F0) and vocal tract length (VTL) modifications on speaker gender recognition in cochlear implant (CI) recipients for different stimulus types. Method: Single words and sentences were manipulated using isolated or combined F0 and VTL cues. Using an 11-point…

  17. Blunted vocal affect and expression is not associated with schizophrenia: A computerized acoustic analysis of speech under ambiguous conditions.

    PubMed

    Meaux, Lauren T; Mitchell, Kyle R; Cohen, Alex S

    2018-05-01

    Patients with schizophrenia are consistently rated by clinicians as having high levels of blunted vocal affect and alogia. However, objective technologies have often failed to substantiate these abnormalities. It could be the case that negative symptoms are context-dependent. The present study examined speech elicited under conditions demonstrated to exacerbate thought disorder. The Rorschach Test was administered to 36 outpatients with schizophrenia and 25 nonpatient controls. Replies to separate "perceptual" and "memory" phases were analyzed using validated acoustic analytic methods. Compared to nonpatient controls, schizophrenia patients did not display abnormal speech expression on objective measures of blunted vocal affect or alogia. Moreover, clinical ratings of negative symptoms were not significantly correlated with objective measures. These findings suggest that in patients with schizophrenia, vocal affect/alogia is generally unremarkable under ambiguous conditions. Clarifying the nature of blunted vocal affect and alogia, and how objective measures correspond to what clinicians attend to when making clinical ratings, are important directions for future research.

  18. Differing Roles of the Face and Voice in Early Human Communication: Roots of Language in Multimodal Expression

    PubMed Central

    Jhang, Yuna; Franklin, Beau; Ramsdell-Hudock, Heather L.; Oller, D. Kimbrough

    2017-01-01

    Seeking roots of language, we probed infant facial expressions and vocalizations. Both have roles in language, but the voice plays an especially flexible role, expressing a variety of functions and affect conditions with the same vocal categories—a word can be produced with many different affective flavors. This requirement of language is seen in very early infant vocalizations. We examined the extent to which affect is transmitted by early vocal categories termed “protophones” (squeals, vowel-like sounds, and growls) and by their co-occurring facial expressions, and similarly the extent to which vocal type is transmitted by the voice and co-occurring facial expressions. Our coder agreement data suggest infant affect during protophones was most reliably transmitted by the face (judged in video-only), while vocal type was transmitted most reliably by the voice (judged in audio-only). Voice alone transmitted negative affect more reliably than neutral or positive affect, suggesting infant protophones may be used especially to call for attention when the infant is in distress. By contrast, the face alone provided no significant information about protophone categories. Indeed coders in VID could scarcely recognize the difference between silence and voice when coding protophones in VID. The results suggest that partial decoupling of communicative roles for face and voice occurs even in the first months of life. Affect in infancy appears to be transmitted in a way that audio and video aspects are flexibly interwoven, as in mature language. PMID:29423398

  20. Vocalizations of adult male Asian koels (Eudynamys scolopacea) in the breeding season.

    PubMed

    Khan, Abdul Aziz; Qureshi, Irfan Zia

    2017-01-01

    Defining the vocal repertoire provides a basis for understanding the role of acoustic signals in the sexual and social interactions of an animal. The Asian koel (Eudynamys scolopacea) is a migratory bird which spends its summer breeding season in the plains of Pakistan. The bird is typically wary and secretive but produces loud and distinct calls, making it easily detected when unseen. Like other birds in the wild, Asian koels presumably use their calls for social cohesion and the coordination of different behaviors. To date, a description of the vocal repertoire of the male Asian koel has been lacking. Here we analyzed and described, for the first time, the vocalizations of the adult male Asian koel, recorded over two consecutive breeding seasons. Using 10 call parameters, we categorized the vocalizations into six different categories on the basis of spectrographic and statistical analyses, namely the "type 1 cooee call", "type 2 cooee call", "type 1 coegh call", "type 2 coegh call", "wurroo call" and "coe call". These names were not assigned on the basis of functional analysis and are therefore onomatopoeic. Stepwise cross-validated discriminant function analysis classified the vocalizations correctly (100%) into the vocal categories that we initially defined on the basis of spectrographic examination. Our findings enrich biological knowledge about the vocalizations of the adult male Asian koel and provide a foundation for future acoustic monitoring of the species, as well as for comparative studies with the vocalizations of other bird species of the cuckoo family. Further studies of Asian koel vocalizations are required to unravel their functions in sexual selection and individual recognition.

  1. Heterospecific eavesdropping in ant-following birds of the Neotropics is a learned behaviour.

    PubMed

    Pollock, Henry S; Martínez, Ari E; Kelley, J Patrick; Touchton, Janeene M; Tarwater, Corey E

    2017-10-25

    Animals eavesdrop on other species to obtain information about their environments. Heterospecific eavesdropping can yield tangible fitness benefits by providing valuable information about food resources and predator presence. The ability to eavesdrop may therefore be under strong selection, although extensive research on alarm-calling in avian mixed-species flocks has found only limited evidence that close association with another species could select for innate signal recognition. Nevertheless, very little is known about the evolution of eavesdropping behaviour and the mechanism of heterospecific signal recognition, particularly in other ecological contexts, such as foraging. To understand whether heterospecific eavesdropping was an innate or learned behaviour in a foraging context, we studied heterospecific signal recognition in ant-following birds of the Neotropics, which eavesdrop on vocalizations of obligate ant-following species to locate and recruit to swarms of the army ant Eciton burchellii, a profitable food resource. We used a playback experiment to compare recruitment of ant-following birds to vocalizations of two obligate species at a mainland site (where both species are present) and a nearby island site (where one species remains whereas the other went extinct approx. 40 years ago). We found that ant-following birds recruited strongly to playbacks of the obligate species present at both island and mainland sites, but the island birds did not recruit to playbacks of the absent obligate species. Our results strongly suggest that (i) ant-following birds learn to recognize heterospecific vocalizations from ecological experience and (ii) island birds no longer recognize the locally extinct obligate species after eight generations of absence from the island. Although learning appears to be the mechanism of heterospecific signal recognition in ant-following birds, more experimental tests are needed to fully understand the evolution of eavesdropping behaviour. © 2017 The Author(s).

  2. Memristive Computational Architecture of an Echo State Network for Real-Time Speech Emotion Recognition

    DTIC Science & Technology

    2015-05-28

    …recognition is simpler and requires fewer computational resources compared to other inputs such as facial expressions. The Berlin Database of Emotional Speech is used as the emotion corpus.

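    For orientation, the reservoir-computing scheme named in this record can be sketched in a few lines. The following is a generic software echo state network with placeholder features and labels, not the memristive hardware architecture the report describes:

    ```python
    # Minimal echo state network (ESN) sketch for utterance-level emotion
    # classification. Feature dimensions, class count, and all data are
    # illustrative placeholders.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res, n_cls = 13, 200, 7            # e.g. 13 MFCCs, 7 emotion classes
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # scale spectral radius to 0.9

    def utterance_state(frames):
        """Run feature frames (T x n_in) through the fixed reservoir and
        return the mean state as a fixed-length utterance embedding."""
        x = np.zeros(n_res)
        states = []
        for u in frames:
            x = np.tanh(W_in @ u + W @ x)
            states.append(x)
        return np.mean(states, axis=0)

    # Toy training set: 50 random "utterances" with random one-hot labels.
    X = np.array([utterance_state(rng.normal(size=(100, n_in))) for _ in range(50)])
    Y = np.eye(n_cls)[rng.integers(0, n_cls, 50)]

    # Ridge-regression readout: the only trained part of an ESN.
    ridge = 1e-2
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y).T

    test = utterance_state(rng.normal(size=(100, n_in)))
    print("predicted class:", int(np.argmax(W_out @ test)))
    ```
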
  3. Social calls provide novel insights into the evolution of vocal learning

    PubMed Central

    Sewall, Kendra B.; Young, Anna M.; Wright, Timothy F.

    2016-01-01

    Learned song is among the best-studied models of animal communication. In oscine songbirds, where learned song is most prevalent, it is used primarily for intrasexual selection and mate attraction. Learning of a different class of vocal signals, known as contact calls, is found in a diverse array of species, where they are used to mediate social interactions among individuals. We argue that call learning provides a taxonomically rich system for studying testable hypotheses for the evolutionary origins of vocal learning. We describe and critically evaluate four nonmutually exclusive hypotheses for the origin and current function of vocal learning of calls, which propose that call learning (1) improves auditory detection and recognition, (2) signals local knowledge, (3) signals group membership, or (4) allows for the encoding of more complex social information. We propose approaches to testing these four hypotheses but emphasize that all of them share the idea that social living, not sexual selection, is a central driver of vocal learning. Finally, we identify future areas for research on call learning that could provide new perspectives on the origins and mechanisms of vocal learning in both animals and humans. PMID:28163325

  4. Chronic naltrindole administration does not modify the inhibitory effect of morphine on vocalization responses in the tail electric stimulation test in rats.

    PubMed

    Fernández, B; Alberti, I; Kitchen, I; Paz Viveros, M

    1999-01-29

    To address the existence of possible functional interactions between delta- and mu-opioid receptors in relation to the affective component of pain, we studied the effects of functional blockade of delta-receptors by chronic treatment with naltrindole (1 mg/kg, 8 consecutive days) on antinociceptive responses to morphine (2 and 5 mg/kg) in the tail electric stimulation test in adult male rats. The thresholds for the motor response (tail withdrawal), vocalization during stimulus and vocalization afterdischarge were assessed. These responses are considered to be integrated at spinal, medulla oblongata and diencephalon-rhinencephalon levels, respectively. The results show that the vocalization during stimulus and the vocalization afterdischarge were significantly affected by morphine in a dose-dependent manner, the latter response being the most sensitive to the effects of the mu-opioid agonist. However, no significant effect was observed on motor responses at the doses used in this study. Chronic naltrindole treatment did not modify the inhibitory effect of morphine on the vocalization responses. Since the vocalization afterdischarge is related to the affective component of pain, the data suggest that the delta-opioid receptor is not involved in the supraspinal mechanisms at which these responses are organized and that there is no mu-delta interaction in the modulation of the affective responses to noxious electrical stimulation.

  5. Obligatory and facultative brain regions for voice-identity recognition

    PubMed Central

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Abstract Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is only a facultative component of voice-identity recognition in situations where additional face-identity processing is required. PMID:29228111

  7. The perceptual features of vocal fatigue as self-reported by a group of actors and singers.

    PubMed

    Kitch, J A; Oates, J

    1994-09-01

    Performers (10 actors, 10 singers) rated, via a self-report questionnaire, the severity of their voice-related changes when vocally fatigued. Similar frequency patterns and perceptual features of vocal fatigue were found across subjects. Actors rated "power" aspects (e.g., voice projection) and singers rated vocal dynamic aspects (e.g., pitch range) of their voices as most affected when vocally fatigued. Vocal fatigue was evidenced by changes in kinesthetic/proprioceptive sensations and vocal dynamics. The reported causes and contexts of vocal fatigue were vocal misuse, being "run down," high performance demands, and using high pitch/volume levels. Further research is needed to delineate the perceptual features of "normal" levels of vocal fatigue and its possible causes.

  8. A study of voice production characteristics of astronaut speech during Apollo 11 for speaker modeling in space.

    PubMed

    Yu, Chengzhu; Hansen, John H L

    2017-03-01

    Human physiology has evolved to accommodate environmental conditions, including temperature, pressure, and air chemistry unique to Earth. However, the environment in space varies significantly compared to that on Earth and, therefore, variability is expected in astronauts' speech production mechanism. In this study, the variations of astronaut voice characteristics during the NASA Apollo 11 mission are analyzed. Specifically, acoustical features such as fundamental frequency and phoneme formant structure that are closely related to the speech production system are studied. For a further understanding of astronauts' vocal tract spectrum variation in space, a maximum likelihood frequency warping based analysis is proposed to detect the vocal tract spectrum displacement during space conditions. The results from fundamental frequency, formant structure, as well as vocal spectrum displacement indicate that astronauts change their speech production mechanism when in space. Moreover, the experimental results for astronaut voice identification tasks indicate that current speaker recognition solutions are highly vulnerable to astronaut voice production variations in space conditions. Future recommendations from this study suggest that successful applications of speaker recognition during extended space missions require robust speaker modeling techniques that could effectively adapt to voice production variation caused by diverse space conditions.

  9. A Kinect-Based Sign Language Hand Gesture Recognition System for Hearing- and Speech-Impaired: A Pilot Study of Pakistani Sign Language.

    PubMed

    Halim, Zahid; Abbas, Ghulam

    2015-01-01

    Sign language provides hearing- and speech-impaired individuals with an interface to communicate with other members of society. Unfortunately, sign language is not understood by most people. For this reason, a gadget based on image processing and pattern recognition can provide a vital aid for detecting and translating sign language into a vocal language. This work presents a system for detecting and understanding sign language gestures with a custom-built software tool and then translating the gesture into a vocal language. For the purpose of recognizing a particular gesture, the system employs a Dynamic Time Warping (DTW) algorithm, and an off-the-shelf software tool is employed for vocal language generation. Microsoft® Kinect is the primary tool used to capture the video stream of a user. The proposed method is capable of successfully detecting gestures stored in the dictionary with an accuracy of 91%. The proposed system has the ability to define and add custom-made gestures. Based on an experiment in which 10 individuals with impairments used the system to communicate with 5 people with no disability, 87% agreed that the system was useful.

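    The record names Dynamic Time Warping as the matching algorithm. A minimal sketch of DTW-based gesture lookup, with random feature vectors standing in for Kinect joint trajectories, could look like this:

    ```python
    # Classic dynamic-programming DTW, then nearest-template recognition
    # against a gesture dictionary. All data here are synthetic stand-ins.
    import numpy as np

    def dtw_distance(a, b):
        """DTW distance between two sequences of feature vectors (T1 x D, T2 x D)."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def recognize(gesture, dictionary):
        """Return the label of the dictionary template nearest under DTW."""
        return min(dictionary, key=lambda lbl: dtw_distance(gesture, dictionary[lbl]))

    rng = np.random.default_rng(1)
    templates = {"hello": rng.normal(size=(40, 6)), "thanks": rng.normal(size=(55, 6))}
    query = templates["hello"][::2] + 0.01 * rng.normal(size=(20, 6))
    print(recognize(query, templates))   # -> hello
    ```
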
  10. Differential short-term memorisation for vocal and instrumental rhythms.

    PubMed

    Klyn, Niall A M; Will, Udo; Cheong, Yong-Jeon; Allen, Erin T

    2016-07-01

    This study explores differential processing of vocal and instrumental rhythms in short-term memory with three decision (same/different judgments) and one reproduction experiment. In the first experiment, memory performance declined for delayed versus immediate recall, with accuracy for the two rhythms being affected differently: Musicians performed better than non-musicians on clapstick but not on vocal rhythms, and musicians were better on vocal rhythms in the same than in the different condition. Results for the second experiment showed that concurrent sub-vocal articulation and finger-tapping differentially affected the two rhythms and same/different decisions, but produced no evidence for articulatory loop involvement in delayed decision tasks. In a third experiment, which tested rhythm reproduction, concurrent sub-vocal articulation decreased memory performance, with a stronger deleterious effect on the reproduction of vocal than of clapstick rhythms. This suggests that the articulatory loop may only be involved in delayed reproduction not in decision tasks. The fourth experiment tested whether differences between filled and empty rhythms (continuous vs. discontinuous sounds) can explain the different memorisation of vocal and clapstick rhythms. Though significant differences were found for empty and filled instrumental rhythms, the differences between vocal and clapstick can only be explained by considering additional voice specific features.

  12. Vocal contagion of emotions in non-human animals

    PubMed Central

    2018-01-01

    Communicating emotions to conspecifics (emotion expression) allows the regulation of social interactions (e.g. approach and avoidance). Moreover, when emotions are transmitted from one individual to the next, leading to state matching (emotional contagion), information transfer and coordination between group members are facilitated. Despite the high potential for vocalizations to influence the affective state of surrounding individuals, vocal contagion of emotions has been largely unexplored in non-human animals. In this paper, I review the evidence for discrimination of vocal expression of emotions, which is a necessary step for emotional contagion to occur. I then describe possible proximate mechanisms underlying vocal contagion of emotions, propose criteria to assess this phenomenon and review the existing evidence. The literature so far shows that non-human animals are able to discriminate and be affected by conspecific and also potentially heterospecific (e.g. human) vocal expression of emotions. Since humans heavily rely on vocalizations to communicate (speech), I suggest that studying vocal contagion of emotions in non-human animals can lead to a better understanding of the evolution of emotional contagion and empathy. PMID:29491174

  13. Effects of social games on infant vocalizations*.

    PubMed

    Hsu, Hui-Chin; Iyer, Suneeti Nathani; Fogel, Alan

    2014-01-01

    The aim of the present study was to examine the contextual effects of social games on prelinguistic vocalizations. The two main goals were to (1) investigate the functions of vocalizations as symptoms of affective arousal and symbols of social understanding, and (2) explore form-function (de)coupling relations between vocalization types and game contexts. Seventy-one six-month-olds and sixty-four twelve-month-olds played with their mothers in normal and perturbed tickle and peek-a-boo games. The effects of infant age, game, game climax, and game perturbation on the frequency and types of infant vocalizations were examined. Results showed twelve-month-olds vocalized more mature canonical syllables during peek-a-boo and more primitive quasi-resonant nuclei during tickle than six-month-olds. Six- and twelve-month-olds increased their vocalizations from the set-up to climax during peek-a-boo, but they did not show such an increase during tickle. Findings support the symptom function of prelinguistic vocalizations reflecting affective arousal and the prevalence of form-function decoupling during the first year of life.

  14. Identifying Knowledge Gaps in Clinicians Who Evaluate and Treat Vocal Performing Artists in College Health Settings.

    PubMed

    McKinnon-Howe, Leah; Dowdall, Jayme

    2018-05-01

    The goal of this study was to identify knowledge gaps in clinicians who evaluate and treat performing artists for illnesses and injuries that affect vocal function in college health settings. This pilot study utilized a web-based cross-sectional survey design incorporating common clinical scenarios to test knowledge of evaluation and management strategies in the vocal performing artist. A web-based survey was administered to a purposive sample of 28 clinicians to identify the approach utilized to evaluate and treat vocal performing artists in college health settings, and factors that might affect knowledge gaps and influence referral patterns to voice specialists. Twenty-eight clinicians were surveyed, with 36% of respondents incorrectly identifying appropriate vocal hygiene measures, 56% of respondents failing to identify symptoms of vocal fold hemorrhage, 84% failing to identify other indications for referral to a voice specialist, 96% of respondents acknowledging unfamiliarity with the Voice Handicap Index and the Singers Voice Handicap Index, and 68% acknowledging unfamiliarity with the Reflux Symptom Index. The data elucidated specific knowledge gaps in college health providers who are responsible for evaluating and treating common illnesses that affect vocal function, and triaging and referring students experiencing symptoms of potential vocal emergencies. Future work is needed to improve the standard of care for this population. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  15. Effects of cue modality and emotional category on recognition of nonverbal emotional signals in schizophrenia.

    PubMed

    Vogel, Bastian D; Brück, Carolin; Jacob, Heike; Eberle, Mark; Wildgruber, Dirk

    2016-07-07

    Impaired interpretation of nonverbal emotional cues in patients with schizophrenia has been reported in several studies, and a clinical relevance of these deficits for social functioning has been assumed. However, it is unclear to what extent the impairments depend on specific emotions or specific channels of nonverbal communication. Here, the effect of cue modality and emotional category on the accuracy of emotion recognition was evaluated in 21 patients with schizophrenia and compared to a healthy control group (n = 21). To this end, dynamic stimuli comprising speakers of both genders in three different sensory modalities (auditory, visual and audiovisual) and five emotional categories (happy, alluring, neutral, angry and disgusted) were used. Patients with schizophrenia were found to be impaired in emotion recognition in comparison to the control group across all stimuli. Considering specific emotions, more severe deficits were revealed in the recognition of alluring stimuli and less severe deficits in the recognition of disgusted stimuli as compared to all other emotions. Regarding cue modality, the extent of the impairment in emotion recognition did not significantly differ between auditory and visual cues across all emotional categories. However, patients with schizophrenia showed significantly more severe disturbances for vocal as compared to facial cues when sexual interest is expressed (alluring stimuli), whereas more severe disturbances for facial as compared to vocal cues were observed when happiness or anger is expressed. Our results confirmed that perceptual impairments can be observed for vocal as well as facial cues conveying various social and emotional connotations. The observed differences in severity of impairment, with the most severe deficits for alluring expressions, might be related to specific difficulties in recognizing the complex social emotional information of interpersonal intentions as compared to "basic" emotional states. Therefore, future studies evaluating the perception of nonverbal cues should consider a broader range of social and emotional signals beyond basic emotions, including attitudes and interpersonal intentions. Identifying specific domains of social perception particularly prone to misunderstandings in patients with schizophrenia might allow for a refinement of interventions aiming at improving social functioning.

  16. Comparative analysis of perceptual evaluation, acoustic analysis and indirect laryngoscopy for vocal assessment of a population with vocal complaint.

    PubMed

    Nemr, Kátia; Amar, Ali; Abrahão, Marcio; Leite, Grazielle Capatto de Almeida; Köhle, Juliana; Santos, Alexandra de O; Correa, Luiz Artur Costa

    2005-01-01

    As a result of technological evolution and development, methods of voice evaluation have changed in both medical and speech-language pathology practice. The aim was to relate the results of perceptual evaluation, acoustic analysis and medical evaluation in the diagnosis of vocal and/or laryngeal affections in a population with vocal complaints, in a prospective clinical study. Twenty-nine people who attended a vocal health protection campaign were evaluated. They were submitted to perceptual evaluation (AFPA), acoustic analysis (AA), indirect laryngoscopy (LI) and telelaryngoscopy (TL). Correlations between medical and speech-language pathology evaluation methods were established, testing for statistical significance with Fisher's exact test. There were statistically significant results in the correlations between AFPA and LI, AFPA and TL, and LI and TL. This study, conducted during a vocal health protection campaign, showed correlations between speech-language pathology perceptual evaluation and clinical evaluation in the diagnosis of vocal and/or laryngeal affections.

  17. Recognizing vocal emotions in Mandarin Chinese: a validated database of Chinese vocal emotional stimuli.

    PubMed

    Liu, Pan; Pell, Marc D

    2012-12-01

    To establish a valid database of vocal emotional stimuli in Mandarin Chinese, a set of Chinese pseudosentences (i.e., semantically meaningless sentences that resembled real Chinese) were produced by four native Mandarin speakers to express seven emotional meanings: anger, disgust, fear, sadness, happiness, pleasant surprise, and neutrality. These expressions were identified by a group of native Mandarin listeners in a seven-alternative forced choice task, and items reaching a recognition rate of at least three times chance performance in the seven-choice task were selected as a valid database and then subjected to acoustic analysis. The results demonstrated expected variations in both perceptual and acoustic patterns of the seven vocal emotions in Mandarin. For instance, fear, anger, sadness, and neutrality were associated with relatively high recognition, whereas happiness, disgust, and pleasant surprise were recognized less accurately. Acoustically, anger and pleasant surprise exhibited relatively high mean f0 values and large variation in f0 and amplitude; in contrast, sadness, disgust, fear, and neutrality exhibited relatively low mean f0 values and small amplitude variations, and happiness exhibited a moderate mean f0 value and f0 variation. Emotional expressions varied systematically in speech rate and harmonics-to-noise ratio values as well. This validated database is available to the research community and will contribute to future studies of emotional prosody for a number of purposes. To access the database, please contact pan.liu@mail.mcgill.ca.

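    The inclusion criterion above is easy to make concrete: with seven response alternatives, chance performance is 1/7, so "three times chance" corresponds to a recognition rate of 3/7, about 42.9%. A small sketch with hypothetical item rates:

    ```python
    # Filtering items by the record's criterion of three times chance in a
    # seven-alternative forced-choice task. Item names and rates are made up.
    n_alternatives = 7
    threshold = 3 / n_alternatives            # 3 * (1/7) ≈ 0.4286

    recognition_rates = {"anger_01": 0.88, "happy_03": 0.35, "fear_02": 0.61}
    retained = {k: r for k, r in recognition_rates.items() if r >= threshold}
    print(f"threshold = {threshold:.3f}; retained items: {sorted(retained)}")
    ```
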
  18. Whispering - The hidden side of auditory communication.

    PubMed

    Frühholz, Sascha; Trost, Wiebke; Grandjean, Didier

    2016-11-15

    Whispering is a unique expression mode that is specific to auditory communication. Individuals switch their vocalization mode to whispering especially when affected by inner emotions in certain social contexts, such as in intimate relationships or intimidating social interactions. Although this context-dependent whispering is adaptive, whispered voices are acoustically far less rich than phonated voices and thus impose higher hearing and neural auditory decoding demands for recognizing their socio-affective value by listeners. The neural dynamics underlying this recognition especially from whispered voices are largely unknown. Here we show that whispered voices in humans are considerably impoverished as quantified by an entropy measure of spectral acoustic information, and this missing information needs large-scale neural compensation in terms of auditory and cognitive processing. Notably, recognizing the socio-affective information from voices was slightly more difficult from whispered voices, probably based on missing tonal information. While phonated voices elicited extended activity in auditory regions for decoding of relevant tonal and time information and the valence of voices, whispered voices elicited activity in a complex auditory-frontal brain network. Our data suggest that a large-scale multidirectional brain network compensates for the impoverished sound quality of socially meaningful environmental signals to support their accurate recognition and valence attribution. Copyright © 2016 Elsevier Inc. All rights reserved.

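    The abstract does not define its entropy measure; a common generic choice (an assumption here, not necessarily the authors' exact formulation) is the Shannon entropy of the normalized power spectrum, which is low for tonal, phonated signals and high for noise-like, whispered ones:

    ```python
    # Spectral entropy as a simple proxy for how "rich" vs. "impoverished"
    # the spectral information of a voice signal is.
    import numpy as np

    def spectral_entropy(signal, n_fft=1024):
        """Shannon entropy (bits) of the normalized power spectrum."""
        power = np.abs(np.fft.rfft(signal, n=n_fft)) ** 2
        p = power / power.sum()
        p = p[p > 0]                          # avoid log(0)
        return float(-np.sum(p * np.log2(p)))

    sr = 16000
    t = np.arange(sr) / sr
    phonated = np.sin(2 * np.pi * 150 * t)                   # tonal: low entropy
    whisper_like = np.random.default_rng(2).normal(size=sr)  # noisy: high entropy
    print(spectral_entropy(phonated), spectral_entropy(whisper_like))
    ```
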
  19. Recalibration of vocal affect by a dynamic face.

    PubMed

    Baart, Martijn; Vroomen, Jean

    2018-04-25

    Perception of vocal affect is influenced by the concurrent sight of an emotional face. We demonstrate that the sight of an emotional face also can induce recalibration of vocal affect. Participants were exposed to videos of a 'happy' or 'fearful' face in combination with a slightly incongruous sentence with ambiguous prosody. After this exposure, ambiguous test sentences were rated as more 'happy' when the exposure phase contained 'happy' instead of 'fearful' faces. This auditory shift likely reflects recalibration that is induced by error minimization of the inter-sensory discrepancy. In line with this view, when the prosody of the exposure sentence was non-ambiguous and congruent with the face (without audiovisual discrepancy), aftereffects went in the opposite direction, likely reflecting adaptation. Our results demonstrate, for the first time, that perception of vocal affect is flexible and can be recalibrated by slightly discrepant visual information.

  20. Biomechanical effects of hydration in vocal fold tissues.

    PubMed

    Chan, Roger W; Tayama, Niro

    2002-05-01

    It has often been hypothesized, with little empirical support, that vocal fold hydration affects voice production by mediating changes in vocal fold tissue rheology. To test this hypothesis, we attempted in this study to quantify the effects of hydration on the viscoelastic shear properties of vocal fold tissues in vitro. Osmotic changes in hydration (dehydration and rehydration) of 5 excised canine larynges were induced by sequential incubation of the tissues in isotonic, hypertonic, and hypotonic solutions. Elastic shear modulus (G′), dynamic viscosity (η′), and damping ratio (ζ) of the vocal fold mucosa (lamina propria) were measured as a function of frequency (0.01 to 15 Hz) with a torsional rheometer. Vocal fold tissue stiffness (G′) and viscosity (η′) increased significantly (by 4 to 7 times) with the osmotically induced dehydration, whereas they decreased by 22% to 38% on the induced rehydration. Damping ratio (ζ) also increased with dehydration and decreased with rehydration, but the detected differences were not statistically significant at all frequencies. These findings support the long-standing hypothesis that hydration affects vocal fold vibration by altering tissue rheologic (or viscoelastic) properties. Our results demonstrated the biomechanical importance of hydration in vocal fold tissues and suggested that hydration approaches may potentially improve the biomechanics of phonation in vocal fold lesions involving disordered fluid balance.

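    For reference, the three reported quantities are linked by the standard small-strain viscoelasticity definitions (assuming the usual rheological conventions, which the record appears to follow):

    ```latex
    % Complex shear modulus, dynamic viscosity, and damping ratio:
    G^*(\omega) = G'(\omega) + i\,G''(\omega), \qquad
    \eta'(\omega) = \frac{G''(\omega)}{\omega}, \qquad
    \zeta(\omega) = \frac{G''(\omega)}{2\,G'(\omega)} = \frac{\tan\delta}{2}
    ```
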
  1. Using Ambulatory Voice Monitoring to Investigate Common Voice Disorders: Research Update

    PubMed Central

    Mehta, Daryush D.; Van Stan, Jarrad H.; Zañartu, Matías; Ghassemi, Marzyeh; Guttag, John V.; Espinoza, Víctor M.; Cortés, Juan P.; Cheyne, Harold A.; Hillman, Robert E.

    2015-01-01

    Many common voice disorders are chronic or recurring conditions that are likely to result from inefficient and/or abusive patterns of vocal behavior, referred to as vocal hyperfunction. The clinical management of hyperfunctional voice disorders would be greatly enhanced by the ability to monitor and quantify detrimental vocal behaviors during an individual’s activities of daily life. This paper provides an update on ongoing work that uses a miniature accelerometer on the neck surface below the larynx to collect a large set of ambulatory data on patients with hyperfunctional voice disorders (before and after treatment) and matched-control subjects. Three types of analysis approaches are being employed in an effort to identify the best set of measures for differentiating among hyperfunctional and normal patterns of vocal behavior: (1) ambulatory measures of voice use that include vocal dose and voice quality correlates, (2) aerodynamic measures based on glottal airflow estimates extracted from the accelerometer signal using subject-specific vocal system models, and (3) classification based on machine learning and pattern recognition approaches that have been used successfully in analyzing long-term recordings of other physiological signals. Preliminary results demonstrate the potential for ambulatory voice monitoring to improve the diagnosis and treatment of common hyperfunctional voice disorders. PMID:26528472

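    As a rough illustration of the vocal-dose idea in approach (1) above, the sketch below marks phonated frames in a (simulated) neck-surface signal and accumulates phonation time and cycle dose, the integral of F0 over voiced time. The crude autocorrelation F0 estimator and its thresholds are placeholders, not the authors' validated pipeline:

    ```python
    # Toy vocal-dose accumulation from a simulated neck-surface signal.
    import numpy as np

    def frame_f0(frame, sr, fmin=60.0, fmax=400.0):
        """Crude autocorrelation F0 estimate; 0.0 means the frame looks unvoiced."""
        frame = frame - frame.mean()
        ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        lo, hi = int(sr / fmax), int(sr / fmin)
        lag = lo + int(np.argmax(ac[lo:hi]))
        return sr / lag if ac[lag] > 0.3 * ac[0] else 0.0

    def vocal_doses(signal, sr, frame_ms=40):
        hop = int(sr * frame_ms / 1000)
        phonation_s, cycles = 0.0, 0.0
        for i in range(0, len(signal) - hop, hop):
            f0 = frame_f0(signal[i:i + hop], sr)
            if f0 > 0:
                phonation_s += hop / sr
                cycles += f0 * hop / sr   # cycle dose: integral of F0 over voiced time
        return phonation_s, cycles

    sr = 8000
    t = np.arange(5 * sr) / sr
    rng = np.random.default_rng(3)
    sig = np.where(t < 2.0, np.sin(2 * np.pi * 180 * t), 0.0) + 0.005 * rng.normal(size=t.size)
    print(vocal_doses(sig, sr))           # ≈ 2.0 s phonation, ≈ 360 cycles
    ```
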
  2. Linear Classifier with Reject Option for the Detection of Vocal Fold Paralysis and Vocal Fold Edema

    NASA Astrophysics Data System (ADS)

    Kotropoulos, Constantine; Arce, Gonzalo R.

    2009-12-01

    Two distinct two-class pattern recognition problems are studied, namely, the detection of male subjects who are diagnosed with vocal fold paralysis against male subjects who are diagnosed as normal and the detection of female subjects who are suffering from vocal fold edema against female subjects who do not suffer from any voice pathology. To do so, utterances of the sustained vowel "ah" are employed from the Massachusetts Eye and Ear Infirmary database of disordered speech. Linear prediction coefficients extracted from the aforementioned utterances are used as features. The receiver operating characteristic curve of the linear classifier, that stems from the Bayes classifier when Gaussian class conditional probability density functions with equal covariance matrices are assumed, is derived. The optimal operating point of the linear classifier is specified with and without reject option. First results using utterances of the "rainbow passage" are also reported for completeness. The reject option is shown to yield statistically significant improvements in the accuracy of detecting the voice pathologies under study.

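    The decision rule in this record can be made concrete: with Gaussian class-conditional densities sharing one covariance matrix, the Bayes classifier is linear in the features, and a reject option simply withholds the decision when the winning posterior is not confident enough. A sketch on synthetic stand-ins for the linear prediction coefficients:

    ```python
    # Two-class linear (shared-covariance Gaussian) classifier with a reject
    # option. Means, covariance, and the 0.75 threshold are illustrative.
    import numpy as np

    rng = np.random.default_rng(4)
    d = 12                                        # e.g. 12 LPC coefficients
    mu0, mu1 = np.zeros(d), 0.8 * np.ones(d)      # "normal" vs. "pathological" means
    cov = np.eye(d)                               # shared covariance

    def fit_linear(mu0, mu1, cov, prior1=0.5):
        """Return (w, b) so that w @ x + b is the log-odds of class 1."""
        icov = np.linalg.inv(cov)
        w = icov @ (mu1 - mu0)
        b = -0.5 * (mu1 @ icov @ mu1 - mu0 @ icov @ mu0) + np.log(prior1 / (1 - prior1))
        return w, b

    def classify(x, w, b, reject_threshold=0.75):
        """Return 0, 1, or None (reject) from the winning posterior."""
        p1 = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # posterior P(class 1 | x)
        if max(p1, 1.0 - p1) < reject_threshold:
            return None                           # withhold a decision
        return int(p1 > 0.5)

    w, b = fit_linear(mu0, mu1, cov)
    for x in (rng.normal(mu0), rng.normal(mu1), 0.4 * np.ones(d)):
        print(classify(x, w, b))                  # the ambiguous midpoint is rejected
    ```
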
  3. What's in a voice? Prosody as a test case for the Theory of Mind account of autism.

    PubMed

    Chevallier, Coralie; Noveck, Ira; Happé, Francesca; Wilson, Deirdre

    2011-02-01

    The human voice conveys a variety of information about people's feelings, emotions and mental states. Some of this information relies on sophisticated Theory of Mind (ToM) skills, whilst others are simpler and do not require ToM. This variety provides an interesting test case for the ToM account of autism, which would predict greater impairment as ToM requirements increase. In this paper, we draw on psychological and pragmatic theories to classify vocal cues according to the amount of mindreading required to identify them. Children with a high functioning Autism Spectrum Disorder and matched controls were tested in three experiments where the speakers' state had to be extracted from their vocalizations. Although our results confirm that people with autism have subtle difficulties dealing with vocal cues, they show a pattern of performance that is inconsistent with the view that atypical recognition of vocal cues is caused by impaired ToM. Copyright © 2010 Elsevier Ltd. All rights reserved.

  4. Vocal learning in elephants: neural bases and adaptive context

    PubMed Central

    Stoeger, Angela S; Manger, Paul

    2014-01-01

    In the last decade clear evidence has accumulated that elephants are capable of vocal production learning. Examples of vocal imitation are documented in African (Loxodonta africana) and Asian (Elephas maximus) elephants, but little is known about the function of vocal learning within the natural communication systems of either species. We are also just starting to identify the neural basis of elephant vocalizations. The African elephant diencephalon and brainstem possess specializations related to aspects of neural information processing in the motor system (affecting the timing and learning of trunk movements) and the auditory and vocalization system. Comparative interdisciplinary (from behavioral to neuroanatomical) studies are strongly warranted to increase our understanding of both vocal learning and vocal behavior in elephants. PMID:25062469

  5. Parent-offspring communication in the western sandpiper

    USGS Publications Warehouse

    Johnson, M.; Aref, S.; Walters, J.R.

    2008-01-01

    Western sandpiper (Calidris mauri) chicks are precocial and leave the nest shortly after hatch to forage independently. Chicks require thermoregulatory assistance from parents (brooding) for 5-7 days posthatch, and parents facilitate chick survival for 2-3 weeks posthatch by leading and defending chicks. Parental vocal signals are likely involved in protecting chicks from predators, preventing them from wandering away and becoming lost, and leading them to good foraging locations. Using observational and experimental methods in the field, we describe and demonstrate the form and function of parent-chick communication in the western sandpiper. We document 4 distinct calls produced by parents that are apparently directed toward their chicks (brood, gather, alarm, and freeze calls). Through experimental playback of parental and non-parental vocalizations to chicks in a small arena, we demonstrated the following: 1) chicks respond to the alarm call by vocalizing relatively less often and moving away from the signal source, 2) chicks respond to the gather call by vocalizing relatively more often and moving toward the signal source, and 3) chicks respond to the freeze call by vocalizing relatively less often and crouching motionless on the substrate for extended periods of time. Chicks exhibited consistent directional movement and space use to parental and non-parental signals. Although fewer vocalizations were given in response to non-parental signals, which may indicate a weaker response to unfamiliar individuals, the relative number of chick calls given to each type of call signal was consistent between parental and non-parental signals. We also discovered 2 distinct chick vocalizations (chick-contact and chick-alarm calls) during arena playback experiments. Results indicate that sandpiper parents are able to elicit antipredatory chick behaviors and direct chick movement and vocalizations through vocal signals. Future study of parent-offspring communication should determine whether shorebird chicks exhibit parental recognition through vocalizations and the role of chick vocalizations in parental behavior. © The Author 2008. Published by Oxford University Press on behalf of the International Society for Behavioral Ecology. All rights reserved.

  6. Mutual mother-infant recognition in mice: The role of pup ultrasonic vocalizations.

    PubMed

    Mogi, Kazutaka; Takakuda, Ayaka; Tsukamoto, Chihiro; Ooyama, Rumi; Okabe, Shota; Koshida, Nobuyoshi; Nagasawa, Miho; Kikusui, Takefumi

    2017-05-15

    The importance of the mother-infant bond for the development of offspring health and sociality has been studied not only in primate species but also in rodent species. A social bond is defined as affiliative behaviors toward a specific partner. However, controversy remains concerning whether mouse pups can distinguish between their own mother and an alien mother, and whether mothers can differentiate their own pups from alien pups. In this study, we investigated whether mutual recognition exists between mother and infant in ICR mice. Furthermore, we studied pup ultrasonic vocalizations (USVs), which are emitted by pups when isolated from their mothers, to determine whether they constituted an individual signature used by the mother for pup recognition. We conducted a variety of two-choice tests and selective-retrieving tests. In a two-choice test for mother recognition by the pup, pups between the ages of 17 and 21 days preferred their own mothers to alien mothers. In a two-choice test for pup recognition by its mother, the mothers located their own pups faster than alien pups at the beginning of the test, yet displayed similar retrieving activity for both their own and alien pups in the subsequent selective-retrieving test. Furthermore, after recording USVs from pups from subject and alien mothers, then playing them simultaneously, subject mothers displayed a preference for pup USVs emitted by their own pups. Overall, our findings support the existence of mother-infant bonding in mice and suggest that pup USVs contribute to pup recognition by mothers. Copyright © 2016. Published by Elsevier B.V.

  7. Human vocal attractiveness as signaled by body size projection.

    PubMed

    Xu, Yi; Lee, Albert; Wu, Wing-Li; Liu, Xuan; Birkholz, Peter

    2013-01-01

    Voice, as a secondary sexual characteristic, is known to affect the perceived attractiveness of human individuals. But the underlying mechanism of vocal attractiveness has remained unclear. Here, we presented human listeners with acoustically altered natural sentences and fully synthetic sentences with systematically manipulated pitch, formants and voice quality based on a principle of body size projection reported for animal calls and emotional human vocal expressions. The results show that male listeners preferred a female voice that signals a small body size, with relatively high pitch, wide formant dispersion and breathy voice, while female listeners preferred a male voice that signals a large body size with low pitch and narrow formant dispersion. Interestingly, however, male vocal attractiveness was also enhanced by breathiness, which presumably softened the aggressiveness associated with a large body size. These results, together with the additional finding that the same vocal dimensions also affect emotion judgment, indicate that humans still employ a vocal interaction strategy used in animal calls despite the development of complex language.

  8. Interactive voice technology: Variations in the vocal utterances of speakers performing a stress-inducing task

    NASA Astrophysics Data System (ADS)

    Mosko, J. D.; Stevens, K. N.; Griffin, G. R.

    1983-08-01

    Acoustical analyses were conducted of words produced by four speakers in a motion stress-inducing situation. The aim of the analyses was to document the kinds of changes that occur in the vocal utterances of speakers who are exposed to motion stress and to comment on the implications of these results for the design and development of voice interactive systems. The speakers differed markedly in the types and magnitudes of the changes that occurred in their speech. For some speakers, the stress-inducing experimental condition caused an increase in fundamental frequency, changes in the pattern of vocal fold vibration, shifts in vowel production and changes in the relative amplitudes of sounds containing turbulence noise. All speakers showed greater variability in the experimental condition than in the more relaxed control situation. The variability was manifested in the acoustical characteristics of individual phonetic elements, particularly in unstressed syllables. The kinds of changes and variability observed serve to emphasize the limitations of speech recognition systems based on template matching of patterns that are stored in the system during a training phase. There is a need for a better understanding of these phonetic modifications and for developing ways of incorporating knowledge about these changes within a speech recognition system.

  9. Interstitial protein alterations in rabbit vocal fold with scar.

    PubMed

    Thibeault, Susan L; Bless, Diane M; Gray, Steven D

    2003-09-01

    Fibrous and interstitial proteins compose the extracellular matrix of the vocal fold lamina propria and account for its biomechanical properties. Vocal fold scarring is characterized by altered biomechanical properties, which create dysphonia. Although alterations of the fibrous proteins have been confirmed in the rabbit vocal fold scar, interstitial proteins, which are known to be important in wound repair, have not been investigated to date. Using a rabbit model, the interstitial proteins decorin, fibromodulin, and fibronectin were examined immunohistologically, two months postinduction of vocal fold scar by means of forceps biopsy. Significantly decreased decorin and fibromodulin with significantly increased fibronectin characterized scarred vocal fold tissue. The implications of altered interstitial protein levels and their effect on the fibrous proteins will be discussed in relation to the increased vocal fold stiffness and viscosity that characterize vocal fold scar.

  10. Multilevel Analysis in Analyzing Speech Data

    ERIC Educational Resources Information Center

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  11. Noise levels in an urban Asian school environment

    PubMed Central

    Chan, Karen M.K.; Li, Chi Mei; Ma, Estella P.M.; Yiu, Edwin M.L.; McPherson, Bradley

    2015-01-01

    Background noise is known to adversely affect speech perception and speech recognition. High levels of background noise in school classrooms may affect student learning, especially for those pupils who are learning in a second language. The current study aimed to determine the noise level and teacher speech-to-noise ratio (SNR) in Hong Kong classrooms. Noise level was measured in 146 occupied classrooms in 37 schools, including kindergartens, primary schools, secondary schools and special schools, in Hong Kong. The mean noise levels in occupied kindergarten, primary school, secondary school and special school classrooms all exceeded recommended maximum noise levels, and noise reduction measures were seldom used in classrooms. The measured SNRs were not optimal and could have adverse implications for student learning and teachers’ vocal health. Schools in urban Asian environments are advised to consider noise reduction measures in classrooms to better comply with recommended maximum noise levels for classrooms. PMID:25599758

  13. Quantitative Tools for Examining the Vocalizations of Juvenile Songbirds

    PubMed Central

    Wellock, Cameron D.; Reeke, George N.

    2012-01-01

    The singing of juvenile songbirds is highly variable and not well stereotyped, a feature that makes it difficult to analyze with existing computational techniques. We present here a method suitable for analyzing such vocalizations, windowed spectral pattern recognition (WSPR). Rather than performing pairwise sample comparisons, WSPR measures the typicality of a sample against a large sample set. We also illustrate how WSPR can be used to perform a variety of tasks, such as sample classification, song ontogeny measurement, and song variability measurement. Finally, we present a novel measure, based on WSPR, for quantifying the apparent complexity of a bird's singing. PMID:22701474

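    WSPR itself is not specified in enough detail in this record to reproduce, but the underlying idea of scoring a sample's typicality against a large reference set, rather than making pairwise comparisons, can be illustrated with a generic density-based stand-in (not the authors' algorithm):

    ```python
    # Toy typicality scoring: fit a diagonal Gaussian over windowed log
    # spectra of a reference set, then score new samples by their mean
    # log-likelihood. All signals are synthetic.
    import numpy as np

    def windowed_spectra(signal, win=256, hop=128):
        """Stack log power spectra of overlapping windows (frames x bins)."""
        frames = [signal[i:i + win] for i in range(0, len(signal) - win, hop)]
        return np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + 1e-12)

    def fit_reference(samples):
        feats = np.vstack([windowed_spectra(s) for s in samples])
        return feats.mean(axis=0), feats.var(axis=0) + 1e-6

    def typicality(signal, mean, var):
        """Mean diagonal-Gaussian log-likelihood of the sample's windows."""
        f = windowed_spectra(signal)
        return float(np.mean(-0.5 * ((f - mean) ** 2 / var + np.log(2 * np.pi * var))))

    rng = np.random.default_rng(5)
    reference = [np.sin(0.1 * np.pi * np.arange(4000) + rng.uniform(0, 6.28))
                 for _ in range(20)]
    mean, var = fit_reference(reference)
    print(typicality(reference[0], mean, var))           # high: typical of the set
    print(typicality(rng.normal(size=4000), mean, var))  # low: atypical
    ```
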
  14. Gender differences affecting vocal health of women in vocally demanding careers

    PubMed Central

    Hunter, Eric J.; Smith, Marshall E.; Tanner, Kristine

    2012-01-01

    Studies suggest that occupational voice users have a greater incidence of vocal issues than the general population. Women have been found to experience vocal health problems more frequently than men, regardless of their occupation. Traditionally, it has been assumed that differences in the laryngeal system are the cause of this disproportion. Nevertheless, it is valuable to identify other potential gender distinctions which may make women more vulnerable to voice disorders. A search of the literature was conducted for gender-specific characteristics which might impact the vocal health of women. This search can be used by healthcare practitioners to help female patients avoid serious vocal health injuries, as well as to better treat women who already suffer from such vocal health issues. PMID:21722077

  15. Name that tune: Melodic recognition by songbirds.

    PubMed

    Templeton, Christopher N

    2016-12-01

    Recent findings have indicated that European starlings perceive overall spectral shape and use this, rather than absolute pitch or timbre, to generalize between similar melodic progressions. This finding highlights yet another parallel between human and avian vocal communication systems and has many biological implications.

  16. Strain Modulations as a Mechanism to Reduce Stress Relaxation in Laryngeal Tissues

    PubMed Central

    Hunter, Eric J.; Siegmund, Thomas; Chan, Roger W.

    2014-01-01

    Vocal fold tissues in animal and human species undergo deformation processes at several types of loading rates: a slow strain involved in vocal fold posturing (on the order of 1 Hz or so), cyclic and faster posturing often found in speech tasks or vocal embellishment (1–10 Hz), and shear strain associated with vocal fold vibration during phonation (100 Hz and higher). Relevant to these deformation patterns are the viscous properties of laryngeal tissues, which exhibit non-linear stress relaxation and recovery. In the current study, a large strain time-dependent constitutive model of human vocal fold tissue is used to investigate effects of phonatory posturing cyclic strain in the range of 1 Hz to 10 Hz. Tissue data for two subjects are considered and used to contrast the potential effects of age. Results suggest that modulation frequency and extent (amplitude), as well as the amount of vocal fold overall strain, all affect the change in stress relaxation with modulation added. Generally, the vocal fold cover reduces the rate of relaxation while the opposite is true for the vocal ligament. Further, higher modulation frequencies appear to reduce the rate of relaxation, primarily affecting the ligament. The potential benefits of cyclic strain, often found in vibrato (around 5 Hz modulation) and intonational inflection, are discussed in terms of vocal effort and vocal pitch maintenance. Additionally, elderly tissue appears to not exhibit these benefits to modulation. The exacerbating effect such modulations may have on certain voice disorders, such as muscle tension dysphonia, are explored. PMID:24614616

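    A quick numerical probe of the idea discussed above: superimpose a small 5 Hz strain modulation on a held stretch and compare the stress trace with pure relaxation. The standard-linear-solid model and all parameter values below are generic placeholders, not the paper's large-strain constitutive model for vocal fold tissue; notably, this linear baseline leaves the mean stress essentially unchanged under modulation, which is consistent with the reported effect depending on the non-linear relaxation behavior the abstract mentions:

    ```python
    # Stress response of a standard linear solid (spring E1 in parallel with
    # a Maxwell arm E2-eta) to a ramp-and-hold strain, with and without a
    # superimposed 5 Hz modulation.
    import numpy as np

    def sls_stress(strain, dt, E1=1.0, E2=2.0, eta=0.5):
        sigma_maxwell, out = 0.0, []
        for i in range(1, len(strain)):
            d_eps = strain[i] - strain[i - 1]
            # Maxwell arm: d(sigma)/dt = E2 * d(eps)/dt - (E2/eta) * sigma
            sigma_maxwell += E2 * d_eps - (E2 / eta) * sigma_maxwell * dt
            out.append(E1 * strain[i] + sigma_maxwell)
        return np.array(out)

    dt = 1e-3
    t = np.arange(0, 10, dt)
    hold = np.minimum(t, 0.5) * 0.4          # ramp to 20% strain, then hold
    modulated = hold + 0.02 * np.sin(2 * np.pi * 5 * t) * (t > 0.5)

    for name, eps in (("pure hold", hold), ("5 Hz modulation", modulated)):
        sigma = sls_stress(eps, dt)
        print(name, "mean stress over final second:",
              round(float(sigma[-1000:].mean()), 4))
    ```
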
  17. Effects of Physical Attractiveness on Evaluation of Vocal Performance.

    ERIC Educational Resources Information Center

    Wapnick, Joel; Darrow, Alice Ann; Kovacs, Jolan; Dalrymple, Lucinda

    1997-01-01

    Studies whether physical attractiveness of singers affects judges' ratings of their vocal performances. Reveals that physical attractiveness does impact evaluation, that male raters were more severe than female raters, and that the ratings of undergraduate majors versus graduate students and professors combined were not differently affected by…

  18. Reducing Vocalized Pauses in Public Speaking Situations Using the VP Card

    ERIC Educational Resources Information Center

    Ramos Salazar, Leslie

    2014-01-01

    This article describes a speaking problem very common in today's world--"vocalized pauses" (VP). Vocalized pauses are defined as utterances such as "uh," "like," and "um" that occur between words in oral sentences. This practice of everyday speech can affect how a speaker's intentions are…

  19. Vocal Parameters and Self-Perception in Individuals With Adductor Spasmodic Dysphonia.

    PubMed

    Rojas, Gleidy Vannesa E; Ricz, Hilton; Tumas, Vitor; Rodrigues, Guilherme R; Toscano, Patrícia; Aguiar-Ricz, Lílian

    2017-05-01

    The study aimed to compare and correlate perceptual-auditory analysis of vocal parameters and self-perception in individuals with adductor spasmodic dysphonia before and after the application of botulinum toxin. This is a prospective cohort study. Sixteen individuals with a diagnosis of adductor spasmodic dysphonia were submitted to the application of botulinum toxin in the thyroarytenoid muscle, to the recording of a voice signal, and to the Voice Handicap Index (VHI) questionnaire before the application and at two time points after application. Two judges performed a perceptual-auditory analysis of eight vocal parameters with the aid of the Praat software for the visualization of narrow-band spectrography, pitch, and intensity contour. Comparison of the vocal parameters before toxin application and on the first return revealed a reduction of oscillation intensity (P = 0.002), voice breaks (P = 0.002), and vocal tremor (P = 0.002). The same parameters increased on the second return. The degree of severity, strained-strangled voice, roughness, breathiness, and asthenia remained unchanged. The total score and the emotional domain score of the VHI were reduced on the first return. There was a moderate correlation between the degree of voice severity and the total VHI score before application and on the second return, and a weak correlation on the first return. Perceptual-auditory analysis and self-perception proved to be efficient in the recognition of vocal changes and of the vocal impact on individuals with adductor spasmodic dysphonia under treatment with botulinum toxin, permitting the quantification of changes over time. Copyright © 2017. Published by Elsevier Inc.

  1. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback.

    PubMed

    Behroozmand, Roozbeh; Larson, Charles R

    2011-06-06

    The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.
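
    The stimulus magnitudes are specified in cents, a logarithmic pitch unit in which a shift of c cents multiplies frequency by 2^(c/1200). A small helper makes the feedback frequencies concrete; the 120 Hz baseline is an arbitrary example, not a value taken from the study.

    ```python
    def shifted_f0(f0_hz: float, cents: float) -> float:
        """Frequency heard in the feedback channel after a pitch shift of
        `cents` (100 cents = 1 semitone, 1200 cents = 1 octave)."""
        return f0_hz * 2.0 ** (cents / 1200.0)

    # For a speaker vocalizing at 120 Hz, the five stimulus magnitudes map to:
    for cents in (0, 50, 100, 200, 400):
        print(f"+{cents:3d} cents -> {shifted_f0(120.0, cents):6.1f} Hz")
    ```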

  2. The effect of voice amplification on occupational vocal dose in elementary school teachers.

    PubMed

    Gaskill, Christopher S; O'Brien, Shenendoah G; Tinter, Sara R

    2012-09-01

    Two elementary school teachers, one with and one without a history of vocal complaints, wore a vocal dosimeter all day at school for a 3-week period. In the second week, each teacher wore a portable voice amplifier. Each teacher showed a reduction in vocal intensity during the week of amplification, with a larger effect for the teacher with vocal difficulties. This teacher also showed a decrease in hourly vocal fold distance dose as measured by the dosimeter despite incurring longer phonation times. Fundamental frequency and vocal fold cycle dose did not appear to be affected by the use of amplification during the teaching day. Both teachers showed evidence of a moderate adjustment of vocal intensity in the week after amplification, possibly as a means of recalibrating their perceived vocal loudness. This study demonstrates the usefulness of both vocal dosimetry and amplification in monitoring and modifying vocal dose in an occupational setting and reinforces previous data suggesting the effectiveness of amplification in reducing the vocal load in schoolteachers. Implications of the data for future research regarding prevention and treatment of occupational voice disorders are discussed. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
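
    Phonation time and cycle dose of the kind reported here can be accumulated from a frame-wise voicing decision and an F0 track. The sketch below is a generic illustration of that bookkeeping, not the algorithm of any particular dosimeter.

    ```python
    import numpy as np

    def time_and_cycle_dose(f0_track, frame_s=0.03):
        """Accumulate phonation time (s) and vocal fold cycle dose (number
        of oscillation cycles) from a frame-wise F0 track, with unvoiced
        frames marked as F0 = 0."""
        f0 = np.asarray(f0_track, dtype=float)
        voiced = f0 > 0
        phonation_time = voiced.sum() * frame_s
        cycle_dose = (f0[voiced] * frame_s).sum()   # cycles = F0 * duration
        return phonation_time, cycle_dose

    # Example: one minute of speech alternating 30 ms voiced frames at
    # ~220 Hz with unvoiced pauses.
    track = [220.0, 0.0] * 1000
    t, cycles = time_and_cycle_dose(track)
    print(f"phonation time {t:.0f} s, cycle dose {cycles:,.0f} cycles")
    ```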

  3. Texting while driving using Google Glass™: Promising but not distraction-free.

    PubMed

    He, Jibo; Choi, William; McCarley, Jason S; Chaparro, Barbara S; Wang, Chun

    2015-08-01

    Texting while driving is risky but common. This study evaluated how texting using a Head-Mounted Display, Google Glass, impacts driving performance. Experienced drivers performed a classic car-following task while using three different interfaces to text: fully manual interaction with a head-down smartphone, vocal interaction with a smartphone, and vocal interaction with Google Glass. Fully manual interaction produced worse driving performance than either of the other interaction methods, leading to more lane excursions, more variable vehicle control, and higher workload. Compared to texting vocally with a smartphone, texting using Google Glass produced fewer lane excursions, more braking responses, and lower workload. All forms of texting impaired driving performance compared to undistracted driving. These results imply that the use of Google Glass for texting impairs driving, but its Head-Mounted Display configuration and speech recognition technology may be safer than texting using a smartphone. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Vocal fold nodules in adult singers: regional opinions about etiologic factors, career impact, and treatment. A survey of otolaryngologists, speech pathologists, and teachers of singing.

    PubMed

    Hogikyan, N D; Appel, S; Guinn, L W; Haxer, M J

    1999-03-01

    This study was undertaken to better understand current regional opinions regarding vocal fold nodules in adult singers. A questionnaire was sent to 298 persons representing the 3 professional groups most involved with the care of singers with vocal nodules: otolaryngologists, speech pathologists, and teachers of singing. The questionnaire queried respondents about their level of experience with this problem, and their beliefs about causative factors, career impact, and optimum treatment. Responses within and between groups were similar, with differences between groups primarily in the magnitude of positive or negative responses, rather than in the polarity of the responses. Prevailing opinions included: recognition of causative factors in both singing and speaking voice practices, optimism about responsiveness to appropriate treatment, enthusiasm for coordinated voice therapy and voice training as first-line treatment, and acceptance of microsurgical management as appropriate treatment if behavioral management fails.

  5. Temporal processing of speech in a time-feature space

    NASA Astrophysics Data System (ADS)

    Avendano, Carlos

    1997-09-01

    The performance of speech communication systems often degrades under realistic environmental conditions. Adverse environmental factors include additive noise sources, room reverberation, and transmission channel distortions. This work studies the processing of speech in the temporal-feature or modulation spectrum domain, aiming for alleviation of the effects of such disturbances. Speech reflects the geometry of the vocal organs, and the linguistically dominant component is in the shape of the vocal tract. At any given point in time, the shape of the vocal tract is reflected in the short-time spectral envelope of the speech signal. The rate of change of the vocal tract shape appears to be important for the identification of linguistic components. This rate of change, or the rate of change of the short-time spectral envelope can be described by the modulation spectrum, i.e. the spectrum of the time trajectories described by the short-time spectral envelope. For a wide range of frequency bands, the modulation spectrum of speech exhibits a maximum at about 4 Hz, the average syllabic rate. Disturbances often have modulation frequency components outside the speech range, and could in principle be attenuated without significantly affecting the range with relevant linguistic information. Early efforts for exploiting the modulation spectrum domain (temporal processing), such as the dynamic cepstrum or the RASTA processing, used ad hoc designed processing and appear to be suboptimal. As a major contribution, in this dissertation we aim for a systematic data-driven design of temporal processing. First we analytically derive and discuss some properties and merits of temporal processing for speech signals. We attempt to formalize the concept and provide a theoretical background which has been lacking in the field. In the experimental part we apply temporal processing to a number of problems including adaptive noise reduction in cellular telephone environments, reduction of reverberation for speech enhancement, and improvements on automatic recognition of speech degraded by linear distortions and reverberation.
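
    The modulation spectrum described here is the spectrum of each band's short-time log-envelope trajectory. A minimal sketch, assuming numpy and scipy are available; RASTA-style processing would band-pass these trajectories around the syllabic rate rather than merely analyze them.

    ```python
    import numpy as np
    from scipy.signal import stft

    def modulation_spectrum(x, fs, frame_s=0.016, hop_s=0.008):
        """Spectrum of the band-wise log-energy trajectories of a signal,
        averaged across frequency bands."""
        nper = int(frame_s * fs)
        f, t, Z = stft(x, fs, nperseg=nper, noverlap=nper - int(hop_s * fs))
        env = np.log(np.abs(Z) + 1e-10)           # short-time log envelope
        env -= env.mean(axis=1, keepdims=True)    # remove per-band DC
        M = np.abs(np.fft.rfft(env, axis=1))      # per-band modulation spectrum
        mod_freqs = np.fft.rfftfreq(env.shape[1], d=hop_s)
        return mod_freqs, M.mean(axis=0)

    # Toy input: noise amplitude-modulated at 4 Hz, the average syllabic rate.
    rng = np.random.default_rng(0)
    fs = 8000
    t = np.arange(fs * 4) / fs
    x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * rng.standard_normal(t.size)
    freqs, mag = modulation_spectrum(x, fs)
    print("modulation peak near %.2f Hz" % freqs[1:][mag[1:].argmax()])
    ```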

  6. Vocal tract length and acoustics of vocalization in the domestic dog (Canis familiaris).

    PubMed

    Riede, T; Fitch, T

    1999-10-01

    The physical nature of the vocal tract results in the production of formants during vocalisation. In some animals (including humans), receivers can derive information (such as body size) about sender characteristics on the basis of formant characteristics. Domestication and selective breeding have resulted in a high variability in head size and shape in the dog (Canis familiaris), suggesting that there might be large differences in the vocal tract length, which could cause formant behaviour to affect interbreed communication. Lateral radiographs were made of dogs from several breeds ranging in size from a Yorkshire terrier (2.5 kg) to a German shepherd (50 kg) and were used to measure vocal tract length. In addition, we recorded an acoustic signal (growling) from some dogs. Significant correlations were found between vocal tract length, body mass and formant dispersion, suggesting that formant dispersion can deliver information about the body size of the vocalizer. Because of the low correlation between vocal tract length and the first formant, we predict a non-uniform vocal tract shape.
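
    For an idealized uniform tube closed at the glottis, neighboring formants are spaced by a constant dispersion dF = c/2L, so measured dispersion yields a first-order vocal tract length estimate. The sketch below applies that textbook approximation; the formant values are invented for illustration and are not data from the study.

    ```python
    C = 35000.0  # approximate speed of sound in warm, humid air (cm/s)

    def formant_dispersion(formants_hz):
        """Mean spacing (Hz) between consecutive formants."""
        gaps = [b - a for a, b in zip(formants_hz, formants_hz[1:])]
        return sum(gaps) / len(gaps)

    def vocal_tract_length_cm(formants_hz):
        """Uniform-tube estimate: dF = C / (2 L)  =>  L = C / (2 dF).
        Real vocal tracts are not uniform tubes, so this is only a
        first-order approximation."""
        return C / (2.0 * formant_dispersion(formants_hz))

    # Invented growl formants for a large and a small dog:
    print("%.1f cm" % vocal_tract_length_cm([400, 1400, 2400, 3400]))  # ~17.5
    print("%.1f cm" % vocal_tract_length_cm([700, 2450, 4200]))        # ~10.0
    ```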

  7. The Influence of Vocal Qualities and Confirmation of Nonnative English-Speaking Teachers on Student Receiver Apprehension, Affective Learning, and Cognitive Learning

    ERIC Educational Resources Information Center

    Hsu, Chia-Fang

    2012-01-01

    This study investigated the influence of teacher vocal qualities and confirmation behaviors on student learning. Students (N = 197) enrolled in nonnative English-speaking teachers' classes completed a battery of instruments. Results indicated that both vocal qualities and confirmation behaviors were negatively related to receiver apprehension,…

  8. Limiting parental interaction during vocal development affects acoustic call structure in marmoset monkeys.

    PubMed

    Gultekin, Yasemin B; Hage, Steffen R

    2018-04-01

    Human vocal development is dependent on learning by imitation through social feedback between infants and caregivers. Recent studies have revealed that vocal development is also influenced by parental feedback in marmoset monkeys, suggesting vocal learning mechanisms in nonhuman primates. Marmoset infants that experience more contingent vocal feedback than their littermates develop vocalizations more rapidly, and infant marmosets with limited parental interaction exhibit immature vocal behavior beyond infancy. However, it is yet unclear whether direct parental interaction is an obligate requirement for proper vocal development because all monkeys in the aforementioned studies were able to produce the adult call repertoire after infancy. Using quantitative measures to compare distinct call parameters and vocal sequence structure, we show that social interaction has a direct impact not only on the maturation of the vocal behavior but also on acoustic call structures during vocal development. Monkeys with limited parental interaction during development show systematic differences in call entropy, a measure for maturity, compared with their normally raised siblings. In addition, different call types were occasionally uttered in motif-like sequences similar to those exhibited by vocal learners, such as birds and humans, in early vocal development. These results indicate that a lack of parental interaction leads to long-term disturbances in the acoustic structure of marmoset vocalizations, suggesting an imperative role for social interaction in proper primate vocal development.

  9. [Clinical study on vocal cords spontaneous rehabilitation after CO2 laser surgery].

    PubMed

    Zhang, Qingxiang; Hu, Huiying; Sun, Guoyan; Yu, Zhenkun

    2014-10-01

    To study the spontaneous rehabilitation and phonation quality of the vocal cords after different types of CO2 laser microsurgery, with surgical procedures based on the Remacle classification (Type I, Type II, Type III, Type IV, and Type Va, respectively). Three hundred and fifteen patients with hoarseness, evaluated by strobe laryngoscopy, were prospectively assigned to groups according to the appearance of the vocal lesions, vocal fold vibration, and laryngeal CT/MRI imaging; each group held 63 cases. The investigation covered the morphological features of the vocal cords, the patients' subjective impressions, and objective voice measures. No severe perioperative complications occurred in any patient. Vocal scarring after surgery was found in 1 case of Type I, 9 cases of Type II, 47 cases of Type III, 61 cases of Type IV, and 63 cases of Type Va; the difference in scar formation between surgical procedures was statistically significant (χ2 = 222.24, P < 0.05). Hoarseness improved after surgery in 59 cases of Type I, 51 cases of Type II, 43 cases of Type III, 21 cases of Type IV, and 17 cases of Type Va, also a statistically significant difference between procedures (χ2 = 89.46, P < 0.05). Among the stroboscopic parameters, jitter differed significantly between procedures (F = 44.51, P < 0.05) but not between Type I and Type II (P > 0.05); the same pattern held for shimmer and the maximum phonation time (MPT), and MPT did not differ between Type IV and Type Va (P > 0.05). Morphological and functional rehabilitation of the vocal cord is markedly impaired when the body layer is injured; the depth and range of CO2 laser microsurgery are the key factors affecting vocal rehabilitation.

  10. Two organizing principles of vocal production: Implications for nonhuman and human primates.

    PubMed

    Owren, Michael J; Amoss, R Toby; Rendall, Drew

    2011-06-01

    Vocal communication in nonhuman primates receives considerable research attention, with many investigators arguing for similarities between this calling and speech in humans. Data from development and neural organization show a central role of affect in monkey and ape sounds, however, suggesting that their calls are homologous to spontaneous human emotional vocalizations while having little relation to spoken language. Based on this evidence, we propose two principles that can be useful in evaluating the many and disparate empirical findings that bear on the nature of vocal production in nonhuman and human primates. One principle distinguishes production-first from reception-first vocal development, referring to the markedly different role of auditory-motor experience in each case. The second highlights a phenomenon dubbed dual neural pathways, specifically that when a species with an existing vocal system evolves a new functionally distinct vocalization capability, it occurs through emergence of a second parallel neural pathway rather than through expansion of the extant circuitry. With these principles as a backdrop, we review evidence of acoustic modification of calling associated with background noise, conditioning effects, audience composition, and vocal convergence and divergence in nonhuman primates. Although each kind of evidence has been interpreted to show flexible cognitively mediated control over vocal production, we suggest that most are more consistent with affectively grounded mechanisms. The lone exception is production of simple, novel sounds in great apes, which is argued to reveal at least some degree of volitional vocal control. If also present in early hominins, the cortically based circuitry surmised to be associated with these rudimentary capabilities likely also provided the substrate for later emergence of the neural pathway allowing volitional production in modern humans. © 2010 Wiley-Liss, Inc.

  11. Vocalization-Induced Enhancement of the Auditory Cortex Responsiveness during Voice F0 Feedback Perturbation

    PubMed Central

    Behroozmand, Roozbeh; Karvelis, Laura; Liu, Hanjun; Larson, Charles R.

    2009-01-01

    Objective The present study investigated whether self-vocalization enhances auditory neural responsiveness to voice pitch feedback perturbation and how this vocalization-induced neural modulation can be affected by the extent of the feedback deviation. Method Event related potentials (ERPs) were recorded in 15 subjects in response to +100, +200 and +500 cents pitch-shifted voice auditory feedback during active vocalization and passive listening to the playback of the self-produced vocalizations. Result The amplitude of the evoked P1 (latency: 73.51 ms) and P2 (latency: 199.55 ms) ERP components in response to feedback perturbation were significantly larger during vocalization than listening. The difference between P2 peak amplitudes during vocalization vs. listening was shown to be significantly larger for +100 than +500 cents stimulus. Conclusion Results indicate that the human auditory cortex is more responsive to voice F0 feedback perturbations during vocalization than passive listening. Greater vocalization-induced enhancement of the auditory responsiveness to smaller feedback perturbations may imply that the audio-vocal system detects and corrects for errors in vocal production that closely match the expected vocal output. Significance Findings of this study support previous suggestions regarding the enhanced auditory sensitivity to feedback alterations during self-vocalization, which may serve the purpose of feedback-based monitoring of one’s voice. PMID:19520602
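
    Component amplitudes such as P1 and P2 are conventionally measured by averaging stimulus-locked epochs and picking a peak within a latency window. A generic sketch of that step follows; the window, sampling rate, and toy data are arbitrary and do not reflect the study's pipeline (for a negative component such as N1, call with sign=-1).

    ```python
    import numpy as np

    def erp_peak(epochs, fs, window_ms, sign=+1):
        """Average stimulus-locked epochs (trials x samples, time zero at
        stimulus onset) and return peak amplitude and latency of a
        component inside `window_ms`. Use sign=-1 for negative components."""
        erp = epochs.mean(axis=0)
        lo, hi = (int(w / 1000 * fs) for w in window_ms)
        seg = sign * erp[lo:hi]
        i = int(np.argmax(seg))
        return sign * seg[i], (lo + i) / fs * 1000.0

    # Toy data: 100 trials with a positive deflection near 200 ms (a 'P2').
    rng = np.random.default_rng(0)
    fs, n = 500, 300                      # 600 ms of post-stimulus samples
    t = np.arange(n) / fs
    component = 3.0 * np.exp(-((t - 0.2) ** 2) / (2 * 0.02 ** 2))
    epochs = component + rng.standard_normal((100, n))
    amp, lat = erp_peak(epochs, fs, (150, 250))
    print(f"P2-like peak: {amp:.2f} uV at {lat:.0f} ms")
    ```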

  12. Altered vocal fold kinematics in synthetic self-oscillating models that employ adipose tissue as a lateral boundary condition.

    NASA Astrophysics Data System (ADS)

    Saidi, Hiba; Erath, Byron D.

    2015-11-01

    The vocal folds play a major role in human communication by initiating voiced sound production. During voiced speech, the vocal folds are set into sustained vibrations. Synthetic self-oscillating vocal fold models are regularly employed to gain insight into flow-structure interactions governing the phonation process. Commonly, a fixed boundary condition is applied to the lateral, anterior, and posterior sides of the synthetic vocal fold models. However, physiological observations reveal the presence of adipose tissue on the lateral surface between the thyroid cartilage and the vocal folds. The goal of this study is to investigate the influence of including this substrate layer of adipose tissue on the dynamics of phonation. For a more realistic representation of the human vocal folds, synthetic multi-layer vocal fold models have been fabricated and tested while including a soft lateral layer representative of adipose tissue. Phonation parameters have been collected and are compared to those of the standard vocal fold models. Results show that vocal fold kinematics are affected by adding the adipose tissue layer as a new boundary condition.

  13. Major depressive disorder discrimination using vocal acoustic features.

    PubMed

    Taguchi, Takaya; Tachikawa, Hirokazu; Nemoto, Kiyotaka; Suzuki, Masayuki; Nagano, Toru; Tachibana, Ryuki; Nishimura, Masafumi; Arai, Tetsuaki

    2018-01-01

    The voice carries various information produced by vibrations of the vocal cords and the vocal tract. Though many studies have reported a relationship between vocal acoustic features and depression, including mel-frequency cepstrum coefficients (MFCCs) originally applied in speech recognition, there have been few studies in which acoustic features allowed discrimination of patients with depressive disorder. Vocal acoustic features serving as biomarkers of depression could support the differential diagnosis of depressive states. In order to achieve differential diagnosis of depression, in this preliminary study, we examined whether vocal acoustic features could allow discrimination between depressive patients and healthy controls. Subjects were 36 patients who met the criteria for major depressive disorder and 36 healthy controls with no current or past psychiatric disorders. Voices reading out digits before and after a verbal fluency task were recorded. Voices were analyzed using OpenSMILE. The extracted acoustic features, including MFCCs, were used for group comparison and discriminant analysis between patients and controls. The second dimension of MFCC (MFCC 2) was significantly different between groups and allowed the discrimination between patients and controls with a sensitivity of 77.8% and a specificity of 86.1%. The difference in MFCC 2 between the two groups reflected an energy difference around 2000-3000 Hz. The MFCC 2 was significantly different between depressive patients and controls. This feature could be a useful biomarker to detect major depressive disorder. The sample size was relatively small, and psychotropics could have a confounding effect on voice. Copyright © 2017 Elsevier B.V. All rights reserved.
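
    A rough sketch of the feature-extraction step, substituting librosa for the openSMILE toolkit the study used; the file paths and labels referenced in the usage comment are hypothetical placeholders, not a real corpus.

    ```python
    import numpy as np
    import librosa
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    def mean_mfccs(wav_path, n_mfcc=13):
        """Per-recording feature vector: time-averaged MFCCs. The study's
        'MFCC 2' corresponds to one of the low-order rows; the exact index
        depends on whether the 0th coefficient is counted."""
        y, sr = librosa.load(wav_path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)

    # With a real corpus (`paths` and 0/1 `labels` are placeholders):
    #   X = np.vstack([mean_mfccs(p) for p in paths])
    #   acc = cross_val_score(LinearDiscriminantAnalysis(), X, labels, cv=5)
    #   print("cross-validated accuracy: %.2f" % acc.mean())
    ```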

  14. The disassociation of visual and acoustic conspecific cues decreases discrimination by female zebra finches (Taeniopygia guttata).

    PubMed

    Campbell, Dana L M; Hauber, Mark E

    2009-08-01

    Female zebra finches (Taeniopygia guttata) use visual and acoustic traits for accurate recognition of male conspecifics. Evidence from video playbacks confirms that both sensory modalities are important for conspecific and species discrimination, but experimental evidence of the individual roles of these cue types affecting live conspecific recognition is limited. In a spatial paradigm to test discrimination, the authors used live male zebra finch stimuli of 2 color morphs, wild-type (conspecific) and white with a painted black beak (foreign), producing 1 of 2 vocalization types: songs and calls learned from zebra finch parents (conspecific) or cross-fostered songs and calls learned from Bengalese finch (Lonchura striata vars. domestica) foster parents (foreign). The authors found that female zebra finches consistently preferred males with conspecific visual and acoustic cues over males with foreign cues, but did not discriminate when the conspecific and foreign visual and acoustic cues were mismatched. These results indicate the importance of both visual and acoustic features for female zebra finches when discriminating between live conspecific males. Copyright 2009 APA, all rights reserved.

  15. Early development of turn-taking with parents shapes vocal acoustics in infant marmoset monkeys

    PubMed Central

    Takahashi, Daniel Y.; Fenley, Alicia R.; Ghazanfar, Asif A.

    2016-01-01

    In humans, vocal turn-taking is a ubiquitous form of social interaction. It is a communication system that exhibits the properties of a dynamical system: two individuals become coupled to each other via acoustic exchanges and mutually affect each other. Human turn-taking develops during the first year of life. We investigated the development of vocal turn-taking in infant marmoset monkeys, a New World species whose adult vocal behaviour exhibits the same universal features of human turn-taking. We find that marmoset infants undergo the same trajectory of change for vocal turn-taking as humans, and do so during the same life-history stage. Our data show that turn-taking by marmoset infants depends on the development of self-monitoring, and that contingent parental calls elicit more mature-sounding calls from infants. As in humans, there was no evidence that parental feedback affects the rate of turn-taking maturation. We conclude that vocal turn-taking by marmoset monkeys and humans is an instance of convergent evolution, possibly as a result of pressures on both species to adopt a cooperative breeding strategy and increase volubility. PMID:27069047

  16. Stimulus-Dependent Flexibility in Non-Human Auditory Pitch Processing

    ERIC Educational Resources Information Center

    Bregman, Micah R.; Patel, Aniruddh D.; Gentner, Timothy Q.

    2012-01-01

    Songbirds and humans share many parallels in vocal learning and auditory sequence processing. However, the two groups differ notably in their abilities to recognize acoustic sequences shifted in absolute pitch (pitch height). Whereas humans maintain accurate recognition of words or melodies over large pitch height changes, songbirds are…

  17. Predicting Achievable Fundamental Frequency Ranges in Vocalization Across Species

    PubMed Central

    Titze, Ingo; Riede, Tobias; Mau, Ted

    2016-01-01

    Vocal folds are used as sound sources in various species, but it is unknown how vocal fold morphologies are optimized for different acoustic objectives. Here we identify two main variables affecting range of vocal fold vibration frequency, namely vocal fold elongation and tissue fiber stress. A simple vibrating string model is used to predict fundamental frequency ranges across species of different vocal fold sizes. While average fundamental frequency is predominantly determined by vocal fold length (larynx size), range of fundamental frequency is facilitated by (1) laryngeal muscles that control elongation and by (2) nonlinearity in tissue fiber tension. One adaptation that would increase fundamental frequency range is greater freedom in joint rotation or gliding of two cartilages (thyroid and cricoid), so that vocal fold length change is maximized. Alternatively, tissue layers can develop to bear a disproportionate fiber tension (i.e., a ligament with high density collagen fibers), increasing the fundamental frequency range and thereby vocal versatility. The range of fundamental frequency across species is thus not simply one-dimensional, but can be conceptualized as the dependent variable in a multi-dimensional morphospace. In humans, this could allow for variations that could be clinically important for voice therapy and vocal fold repair. Alternative solutions could also have importance in vocal training for singing and other highly-skilled vocalizations. PMID:27309543
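
    The ideal-string relation is F0 = (1/2L) * sqrt(sigma/rho). The sketch below sweeps elongation with an assumed exponential fiber-stress rule to show how tissue nonlinearity expands the achievable F0 range; every constant is an illustrative assumption rather than a value from the paper.

    ```python
    import math

    RHO = 1040.0  # approximate tissue density, kg/m^3

    def string_f0(length_m, stress_pa):
        """Ideal vibrating string: F0 = (1 / 2L) * sqrt(sigma / rho)."""
        return math.sqrt(stress_pa / RHO) / (2.0 * length_m)

    def fiber_stress(strain, a=2000.0, b=8.0):
        """Assumed exponential stress-strain rule, sigma = a*(e^(b*eps) - 1);
        the constants are placeholders, not measured ligament data."""
        return a * (math.exp(b * strain) - 1.0)

    L0 = 0.016  # resting vocal fold length, m (roughly adult male)
    for strain in (0.0, 0.1, 0.2, 0.3, 0.4):
        length = L0 * (1.0 + strain)
        sigma = fiber_stress(strain) + 500.0   # small passive offset, Pa
        print(f"strain {strain:.1f}: F0 ~ {string_f0(length, sigma):6.1f} Hz")
    ```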

  18. Using active shape modeling based on MRI to study morphologic and pitch-related functional changes affecting vocal structures and the airway.

    PubMed

    Miller, Nicola A; Gregory, Jennifer S; Aspden, Richard M; Stollery, Peter J; Gilbert, Fiona J

    2014-09-01

    The shape of the vocal tract and associated structures (eg, tongue and velum) is complicated and varies according to development and function. This variability challenges interpretation of voice experiments. Quantifying differences between shapes and understanding how vocal structures move in relation to each other is difficult using traditional linear and angle measurements. With statistical shape models, shape can be characterized in terms of independent modes of variation. Here, we build an active shape model (ASM) to assess morphologic and pitch-related functional changes affecting vocal structures and the airway. Using a cross-sectional study design, we obtained six midsagittal magnetic resonance images from each of 10 healthy adults (five men and five women) at rest, while breathing out, and while listening to and humming low and high notes. Eighty landmark points were chosen to define the shape of interest and an ASM was built using these 60 images. Principal component analysis was used to identify independent modes of variation, and statistical analysis was performed using one-way repeated-measures analysis of variance. Twenty modes of variation were identified, with modes 1 and 2 accounting for half the total variance. Modes 1 and 9 were significantly associated with humming low and high notes (P < 0.001) and showed coordinated changes affecting the cervical spine, vocal structures, and airway. Mode 2 highlighted wide structural variations between subjects. This study highlights the potential of active shape modeling to advance understanding of factors underlying morphologic and pitch-related functional variations affecting vocal structures and the airway in health and disease. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
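
    The principal component step at the heart of an active shape model fits in a few lines, assuming the landmark sets have already been aligned (the Procrustes step is omitted). The toy data below are random and merely stand in for the study's 60 annotated images.

    ```python
    import numpy as np

    def shape_modes(landmarks, var_kept=0.95):
        """PCA step of an active shape model. `landmarks` has shape
        (n_images, n_points, 2), already aligned. Returns the mean shape
        vector, the retained modes, and their variances."""
        X = landmarks.reshape(len(landmarks), -1)   # flatten to shape vectors
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        var = s ** 2 / (len(X) - 1)                 # variance per mode
        k = int(np.searchsorted(np.cumsum(var) / var.sum(), var_kept)) + 1
        return mean, Vt[:k], var[:k]

    def synthesize(mean, modes, b):
        """Generate landmark coordinates from mode weights b."""
        return (mean + np.asarray(b) @ modes).reshape(-1, 2)

    rng = np.random.default_rng(0)
    data = rng.normal(size=(60, 80, 2))   # 60 'images' of 80 landmarks
    mean, modes, var = shape_modes(data)
    b = np.zeros(len(modes))
    b[0] = 2.0 * np.sqrt(var[0])          # mode 1 at +2 standard deviations
    print(len(modes), "modes retained;", synthesize(mean, modes, b).shape)
    ```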

  19. The Effect of Traditional Singing Warm-Up Versus Semioccluded Vocal Tract Exercises on the Acoustic Parameters of Singing Voice.

    PubMed

    Duke, Emily; Plexico, Laura W; Sandage, Mary J; Hoch, Matthew

    2015-11-01

    This study investigated the effect of traditional vocal warm-up versus semioccluded vocal tract exercises on the acoustic parameters of voice through three questions: does vocal warm-up condition significantly alter the singing power ratio of the singing voice? Is singing power ratio dependent upon vowel? Is perceived phonatory effort affected by warm-up condition? Hypotheses were that vocal warm-up would alter the singing power ratio, that semioccluded vocal tract warm-up would affect it more than no warm-up or traditional warm-up, that singing power ratio would vary across vowels, and that perceived phonatory effort would vary with warm-up condition. This study was a within-participant repeated-measures design with counterbalanced conditions. Thirteen male singers were recorded under three different conditions: no warm-up, traditional warm-up, and semioccluded vocal tract exercise warm-up. Recordings were made of these singers performing the Star Spangled Banner, and singing power ratio (SPR) was calculated from four vowels. Singers rated their perceived phonatory effort (PPE) singing the Star Spangled Banner after each warm-up condition. Warm-up condition did not significantly affect SPR. SPR was significantly different for /i/ and /e/. PPE was not significantly different between warm-up conditions. The present study did not find significant differences in SPR between warm-up conditions. The SPR differences for /i/ support previous findings. PPE did not differ significantly across warm-up conditions despite the expectation that traditional or semioccluded warm-up would cause a decrease. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
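
    Singing power ratio is commonly computed as the level difference between the strongest spectral peak in the 0-2 kHz band and the strongest in the 2-4 kHz band. The sketch below implements that common definition and may not match the study's exact analysis settings; the toy vowel is synthetic.

    ```python
    import numpy as np

    def singing_power_ratio(x, fs):
        """SPR in dB: strongest spectral peak in 0-2 kHz minus strongest
        peak in 2-4 kHz (larger values mean relatively less energy in the
        singer's-formant region)."""
        spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
        freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
        low = spec[freqs < 2000].max()
        high = spec[(freqs >= 2000) & (freqs < 4000)].max()
        return 20.0 * np.log10(low / high)

    # Toy vowel: harmonics of 220 Hz with a 1/k spectral roll-off.
    fs, dur, f0 = 16000, 0.5, 220.0
    t = np.arange(int(fs * dur)) / fs
    x = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 18))
    print("SPR = %.1f dB" % singing_power_ratio(x, fs))
    ```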

  1. Psychogenic Respiratory Distress: A Case of Paradoxical Vocal Cord Dysfunction and Literature Review

    PubMed Central

    Leo, Raphael J.; Konakanchi, Ramesh

    1999-01-01

    Background: Pulmonary disease such as asthma is a psychosomatic disorder vulnerable to exacerbations precipitated by psychological factors. A case is described in which a patient thought to have treatment-refractory asthma was discovered to have a conversion reaction, specifically paradoxical vocal cord dysfunction (PVCD), characterized by abnormal vocal cord adduction during inspiration. Data Sources: Reports of PVCD were located using a MEDLINE search and review of bibliographies. MEDLINE (English language only) was searched from 1966 through December 1998 using the terms functional asthma, functional upper airway obstruction, laryngeal diseases, Munchausen's stridor, paradoxical vocal cord dysfunction, psychogenic stridor, respiratory stridor, vocal cord dysfunction, and vocal cord paralysis. A total of 170 cases of PVCD were reviewed. Study Findings: PVCD appears to be significantly more common among females. PVCD spans all age groups, including pediatric, adolescent, and adult patients. PVCD was most often misdiagnosed as asthma or upper airway disease. Because patients present with atypical and/or refractory symptoms, several diagnostic tests are employed to evaluate patients with PVCD; laryngoscopy is the most common. Direct visualization of abnormal vocal cord movement is the most definitive means of establishing the diagnosis of PVCD. A number of psychiatric disturbances are related to PVCD, including conversion and anxiety disorders. PVCD is associated with severe psychosocial stress and difficulties with modulation of intense emotional states. Conclusions: Psychogenic respiratory distress produced by PVCD can be easily misdiagnosed as severe or refractory asthma or other pulmonary disease states. Recognition of PVCD is important to avoid unnecessary medications and invasive treatments. Primary care physicians can detect cases of PVCD by attending to clinical symptoms, implementing appropriate laboratory investigations, and examining the psychological covariates of the disorder. Psychotherapy and speech therapy are effective in treating most cases of PVCD. PMID:15014694

  2. Perceived differences in social status between speaker and listener affect the speaker's vocal characteristics

    PubMed Central

    Mileva, Viktoria R.; Little, Anthony C.; Roberts, S. Craig

    2017-01-01

    Non-verbal behaviours, including voice characteristics during speech, are an important way to communicate social status. Research suggests that individuals can obtain high social status through dominance (using force and intimidation) or through prestige (by being knowledgeable and skilful). However, little is known regarding differences in the vocal behaviour of men and women in response to dominant and prestigious individuals. Here, we tested within-subject differences in vocal parameters of interviewees during simulated job interviews with dominant, prestigious, and neutral employers (targets), while responding to questions which were classified as introductory, personal, and interpersonal. We found that vocal modulations were apparent between responses to the neutral and high-status targets, with participants, especially those who perceived themselves as low in dominance, increasing fundamental frequency (F0) in response to the dominant and prestigious targets relative to the neutral target. Self-perceived prestige, however, was less related to contextual vocal modulations than self-perceived dominance. Finally, we found that differences in the context of the interview questions participants were asked to respond to (introductory, personal, interpersonal), also affected their vocal parameters, being more prominent in responses to personal and interpersonal questions. Overall, our results suggest that people adjust their vocal parameters according to the perceived social status of the listener as well as their own self-perceived social status. PMID:28614413

  3. Amygdala and auditory cortex exhibit distinct sensitivity to relevant acoustic features of auditory emotions.

    PubMed

    Pannese, Alessia; Grandjean, Didier; Frühholz, Sascha

    2016-12-01

    Discriminating between auditory signals of different affective value is critical to successful social interaction. It is commonly held that acoustic decoding of such signals occurs in the auditory system, whereas affective decoding occurs in the amygdala. However, given that the amygdala receives direct subcortical projections that bypass the auditory cortex, it is possible that some acoustic decoding occurs in the amygdala as well, when the acoustic features are relevant for affective discrimination. We tested this hypothesis by combining functional neuroimaging with the neurophysiological phenomena of repetition suppression (RS) and repetition enhancement (RE) in human listeners. Our results show that both amygdala and auditory cortex responded differentially to physical voice features, suggesting that the amygdala and auditory cortex decode the affective quality of the voice not only by processing the emotional content from previously processed acoustic features, but also by processing the acoustic features themselves, when these are relevant to the identification of the voice's affective value. Specifically, we found that the auditory cortex is sensitive to spectral high-frequency voice cues when discriminating vocal anger from vocal fear and joy, whereas the amygdala is sensitive to vocal pitch when discriminating between negative vocal emotions (i.e., anger and fear). Vocal pitch is an instantaneously recognized voice feature, which is potentially transferred to the amygdala by direct subcortical projections. These results together provide evidence that, besides the auditory cortex, the amygdala too processes acoustic information, when this is relevant to the discrimination of auditory emotions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Does affective information influence domestic dogs' (Canis lupus familiaris) point-following behavior?

    PubMed

    Flom, Ross; Gartman, Peggy

    2016-03-01

    Several studies have examined dogs' (Canis lupus familiaris) comprehension and use of human communicative cues. Relatively few studies have, however, examined the effects of human affective behavior (i.e., facial and vocal expressions) on dogs' exploratory and point-following behavior. In two experiments, we examined dogs' frequency of following an adult's pointing gesture in locating a hidden reward or treat when it occurred silently, or when it was paired with a positive or negative facial and vocal affective expression. Like prior studies, the current results demonstrate that dogs reliably follow human pointing cues. Unlike prior studies, the current results also demonstrate that the addition of a positive affective facial and vocal expression, when paired with a pointing gesture, did not reliably increase dogs' frequency of locating a hidden piece of food compared to pointing alone. In addition, and within the negative facial and vocal affect conditions of Experiment 1 and 2, dogs were delayed in their exploration, or approach, toward a baited or sham-baited bowl. However, in Experiment 2, dogs continued to follow an adult's pointing gesture, even when paired with a negative expression, as long as the attention-directing gesture referenced a baited bowl. Together these results suggest that the addition of affective information does not significantly increase or decrease dogs' point-following behavior. Rather these results demonstrate that the presence or absence of affective expressions influences a dogs' exploratory behavior and the presence or absence of reward affects whether they will follow an unfamiliar adult's attention-directing gesture.

  5. Improvement of a Vocal Fold Imaging System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krauter, K. G.

    Medical professionals can better serve their patients through continual updates of their imaging tools. A wide range of pathologies and diseases may afflict human vocal cords or, as they are also known, vocal folds. These diseases can affect human speech, hampering the patient's ability to communicate. Vocal folds must be opened for breathing and then closed to produce speech. Current methodologies for imaging markers of potential pathologies are difficult to use and often fail to detect early signs of disease. These methodologies rely on a strobe light and a slower-frame-rate camera in an attempt to obtain images as the vocal folds travel over the full extent of their motion.

  6. Literature review of voice recognition and generation technology for Army helicopter applications

    NASA Astrophysics Data System (ADS)

    Christ, K. A.

    1984-08-01

    This report is a literature review on the topics of voice recognition and generation. Areas covered are: manual versus vocal data input, vocabulary, stress and workload, noise, protective masks, feedback, and voice warning systems. Results of the studies presented in this report indicate that voice data entry has less of an impact on a pilot's flight performance, during low-level flying and other difficult missions, than manual data entry. However, the stress resulting from such missions may cause the pilot's voice to change, reducing the recognition accuracy of the system. The noise present in helicopter cockpits also causes the recognition accuracy to decrease. Noise-cancelling devices are being developed and improved upon to increase the recognition performance in noisy environments. Future research in the fields of voice recognition and generation should be conducted in the areas of stress and workload, vocabulary, and the types of voice generation best suited for the helicopter cockpit. Also, specific tasks should be studied to determine whether voice recognition and generation can be effectively applied.

  7. Impact of call center work in subjective voice symptoms and complaints--an analytic study.

    PubMed

    Rechenberg, Leila; Goulart, Bárbara Niegia Garcia de; Roithmann, Renato

    2011-12-01

    To estimate the prevalence of vocal symptoms, occupational risk factors, associated symptoms, and their impact on the professional activity of telemarketers. Cross-sectional analytical study with 124 telemarketers and 109 administrative workers (control group) selected from a random sample stratified by gender. The subjects answered an anonymous self-administered questionnaire on the presence of vocal symptoms, potential risk factors for dysphonia, and the impact of vocal symptoms on professional activity. The presence of one or more voice symptoms occurring daily or weekly was considered positive for vocal symptoms. Vocal symptoms were found in 33% of telemarketers and in 21% of the control group, indicating an association between vocal symptoms and the telemarketer's activity. When adjusted for confounders, this association remained as a risk factor. Among telemarketers with vocal symptoms, the sensation of dry air, ambient noise, and lack of vocal rest were the most frequently reported complaints. Almost 70% of telemarketers with vocal symptoms reported that these symptoms interfere with their professional activity. The rate of absenteeism due to vocal symptoms in this group was 29%. Vocal symptoms are common among telemarketers compared to their peer controls, and significantly affect their job performance.

  8. Methods and apparatus for non-acoustic speech characterization and recognition

    DOEpatents

    Holzrichter, John F.

    1999-01-01

    By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.

  9. Effects of musical expertise on oscillatory brain activity in response to emotional sounds.

    PubMed

    Nolden, Sophie; Rigoulot, Simon; Jolicoeur, Pierre; Armony, Jorge L

    2017-08-01

    Emotions can be conveyed through a variety of channels in the auditory domain, be it via music, non-linguistic vocalizations, or speech prosody. Moreover, recent studies suggest that expertise in one sound category can impact the processing of emotional sounds in other sound categories as they found that musicians process more efficiently emotional musical and vocal sounds than non-musicians. However, the neural correlates of these modulations, especially their time course, are not very well understood. Consequently, we focused here on how the neural processing of emotional information varies as a function of sound category and expertise of participants. Electroencephalogram (EEG) of 20 non-musicians and 17 musicians was recorded while they listened to vocal (speech and vocalizations) and musical sounds. The amplitude of EEG-oscillatory activity in the theta, alpha, beta, and gamma band was quantified and Independent Component Analysis (ICA) was used to identify underlying components of brain activity in each band. Category differences were found in theta and alpha bands, due to larger responses to music and speech than to vocalizations, and in posterior beta, mainly due to differential processing of speech. In addition, we observed greater activation in frontal theta and alpha for musicians than for non-musicians, as well as an interaction between expertise and emotional content of sounds in frontal alpha. The results reflect musicians' expertise in recognition of emotion-conveying music, which seems to also generalize to emotional expressions conveyed by the human voice, in line with previous accounts of effects of expertise on musical and vocal sounds processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
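
    Band-limited oscillatory amplitude of the kind quantified here is typically obtained by band-passing the signal and taking the Hilbert envelope; the study's ICA decomposition is not reproduced. A generic single-channel sketch, assuming scipy, with conventional band edges that may differ from the study's.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

    def band_amplitude(eeg, fs, lo, hi, order=4):
        """Mean oscillatory amplitude in one band: zero-phase band-pass
        filtering followed by the Hilbert envelope."""
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return np.abs(hilbert(filtfilt(b, a, eeg))).mean()

    # Toy single-channel 'EEG': a 10 Hz rhythm buried in noise.
    rng = np.random.default_rng(0)
    fs = 250
    t = np.arange(fs * 10) / fs
    eeg = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
    for name, (lo, hi) in BANDS.items():
        print(f"{name:5s}: {band_amplitude(eeg, fs, lo, hi):.2f} (a.u.)")
    ```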

  10. Impaired Object Recognition but Normal Social Behavior and Ultrasonic Communication in Cofilin1 Mutant Mice

    PubMed Central

    Sungur, A. Özge; Stemmler, Lea; Wöhr, Markus; Rust, Marco B.

    2018-01-01

    Autism spectrum disorder (ASD), schizophrenia (SCZ) and intellectual disability (ID) show a remarkable overlap in symptoms, including impairments in cognition, social behavior and communication. Human genetic studies revealed an enrichment of mutations in actin-related genes for these disorders, and some of the strongest candidate genes control actin dynamics. These findings led to the hypotheses: (i) that ASD, SCZ and ID share common disease mechanisms; and (ii) that, at least in a subgroup of affected individuals, defects in the actin cytoskeleton cause or contribute to their pathologies. Cofilin1 emerged as a key regulator of actin dynamics and we previously demonstrated its critical role for synaptic plasticity and associative learning. Notably, recent studies revealed an over-activation of cofilin1 in mutant mice displaying ASD- or SCZ-like behavioral phenotypes, suggesting that dysregulated cofilin1-dependent actin dynamics contribute to their behavioral abnormalities, such as deficits in social behavior. These findings let us hypothesize: (i) that, apart from cognitive impairments, cofilin1 mutants display additional behavioral deficits with relevance to ASD or SCZ; and (ii) that our cofilin1 mutants represent a valuable tool to study the underlying disease mechanisms. To test our hypotheses, we compared social behavior and ultrasonic communication of juvenile mutants to control littermates, and we did not obtain evidence for impaired direct reciprocal social interaction, social approach or social memory. Moreover, concomitant emission of ultrasonic vocalizations was not affected and time-locked to social activity, supporting the notion that ultrasonic vocalizations serve a pro-social communicative function as social contact calls maintaining social proximity. Finally, cofilin1 mutants did not display abnormal repetitive behaviors. Instead, they performed weaker in novel object recognition, thereby demonstrating that cofilin1 is relevant not only for associative learning, but also for “non-matching-to-sample” learning. Here we report the absence of an ASD- or a SCZ-like phenotype in cofilin1 mutants, and we conclude that cofilin1 is relevant specifically for non-social cognition. PMID:29515378

  11. Promoting Vocal Health in the Choral Rehearsal: When Planning for and Conducting Choral Rehearsals, Guide Your Students in Healthful Singing

    ERIC Educational Resources Information Center

    Webb, Jeffrey L.

    2007-01-01

    Choral conductors can positively affect the voices in their choirs through their instruction. It is their job to teach the choir not only the music, but also the healthy ways of singing it. Promoting vocal health benefits both singers and conductors. For singers, it helps remove the risk factors for vocal fatigue. For the choral conductor,…

  12. Neural Processing of Vocal Emotion and Identity

    ERIC Educational Resources Information Center

    Spreckelmeyer, Katja N.; Kutas, Marta; Urbach, Thomas; Altenmuller, Eckart; Munte, Thomas F.

    2009-01-01

    The voice is a marker of a person's identity which allows individual recognition even if the person is not in sight. Listening to a voice also affords inferences about the speaker's emotional state. Both these types of personal information are encoded in characteristic acoustic feature patterns analyzed within the auditory cortex. In the present…

  13. Music Listening--The Classical Period (1720-1815), Music: 5635.793.

    ERIC Educational Resources Information Center

    Pearl, Jesse; Carter, Raymond

    This 9-week, Quinmester course of study is designed to teach the principal types of vocal, instrumental, and operatic compositions of the classical period through listening to the styles of different composers and acquiring recognition of their works, as well as through developing fastidious listening habits. The course is intended for those…

  14. Music Listening--Romantic Period (1815-1914), Music: 5635.794.

    ERIC Educational Resources Information Center

    Pearl, Jesse; Carter, Raymond

    This secondary level Quinmester course is designed to teach the principal types of vocal, instrumental, and operatic compositions of the Romantic period through listening to the styles of different composers and acquiring recognition of their works. The course is intended for students who have participated in fine or performing arts and for pupils…

  17. Video indexing based on image and sound

    NASA Astrophysics Data System (ADS)

    Faudemay, Pascal; Montacie, Claude; Caraty, Marie-Jose

    1997-10-01

    Video indexing is a major challenge for both scientific and economic reasons. Information extraction can sometimes be easier from the sound channel than from the image channel. We first present a multi-channel and multi-modal query interface, to query sound, image and script through 'pull' and 'push' queries. We then summarize the segmentation phase, which needs information from the image channel. Detection of critical segments is proposed. It should speed up both automatic and manual indexing. We then present an overview of the information extraction phase. Information can be extracted from the sound channel, through speaker recognition, vocal dictation with unconstrained vocabularies, and script alignment with speech. We present experimental results for these various techniques. Speaker recognition methods were tested on the TIMIT and NTIMIT databases. Vocal dictation was tested on newspaper sentences spoken by several speakers. Script alignment was tested on part of a cartoon movie, 'Ivanhoe'. For good-quality sound segments, error rates are low enough for use in indexing applications. Major issues are the processing of sound segments with noise or music, and performance improvement through the use of appropriate, low-cost architectures or networks of workstations.

  18. Vocal Dose Measures: Quantifying Accumulated Vibration Exposure in Vocal Fold Tissues

    PubMed Central

    Titze, Ingo R.; Švec, Jan G.; Popolo, Peter S.

    2011-01-01

    To measure the exposure to self-induced tissue vibration in speech, three vocal doses were defined and described: distance dose, which accumulates the distance that tissue particles of the vocal folds travel in an oscillatory trajectory; energy dissipation dose, which accumulates the total amount of heat dissipated over a unit volume of vocal fold tissues; and time dose, which accumulates the total phonation time. These doses were compared to a previously used vocal dose measure, the vocal loading index, which accumulates the number of vibration cycles of the vocal folds. Empirical rules for viscosity and vocal fold deformation were used to calculate all the doses from the fundamental frequency (F0) and sound pressure level (SPL) values of speech. Six participants were asked to read in normal, monotone, and exaggerated speech and the doses associated with these vocalizations were calculated. The results showed that large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The vibration exposure of the vocal folds in normal speech was related to the industrial limits for hand-transmitted vibration, in which the safe distance dose was derived to be about 500 m. This limit was found rather low for vocalization; it was related to a comparable time dose of about 17 min of continuous vocalization, or about 35 min of continuous reading with normal breathing and unvoiced segments. The voicing pauses in normal speech and dialogue effectively prolong the safe time dose. The derived safety limits for vocalization will likely require refinement based on a more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses. PMID:12959470

  19. Vocal dose measures: quantifying accumulated vibration exposure in vocal fold tissues.

    PubMed

    Titze, Ingo R; Svec, Jan G; Popolo, Peter S

    2003-08-01

    To measure the exposure to self-induced tissue vibration in speech, three vocal doses were defined and described: distance dose, which accumulates the distance that tissue particles of the vocal folds travel in an oscillatory trajectory; energy dissipation dose, which accumulates the total amount of heat dissipated over a unit volume of vocal fold tissues; and time dose, which accumulates the total phonation time. These doses were compared to a previously used vocal dose measure, the vocal loading index, which accumulates the number of vibration cycles of the vocal folds. Empirical rules for viscosity and vocal fold deformation were used to calculate all the doses from the fundamental frequency (F0) and sound pressure level (SPL) values of speech. Six participants were asked to read in normal, monotone, and exaggerated speech and the doses associated with these vocalizations were calculated. The results showed that large F0 and SPL variations in speech affected the dose measures, suggesting that accumulation of phonation time alone is insufficient. The vibration exposure of the vocal folds in normal speech was related to the industrial limits for hand-transmitted vibration, in which the safe distance dose was derived to be about 500 m. This limit was found rather low for vocalization; it was related to a comparable time dose of about 17 min of continuous vocalization, or about 35 min of continuous reading with normal breathing and unvoiced segments. The voicing pauses in normal speech and dialogue effectively prolong the safe time dose. The derived safety limits for vocalization will likely require refinement based on a more detailed knowledge of the differences in hand and vocal fold tissue morphology and their response to vibrational stress, and on the effect of recovery of the vocal fold tissue during voicing pauses.
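
    The dose definitions above reduce to simple frame-by-frame accumulation once per-frame F0 and SPL are available. The sketch below is a minimal illustration of that bookkeeping, assuming a purely hypothetical amplitude rule in place of the authors' published empirical viscosity and deformation rules; the function name and all constants are invented for illustration.

    ```python
    # Minimal sketch of frame-wise vocal dose accumulation (illustrative only).
    # The amplitude rule below is a placeholder, NOT the empirical rule of
    # Titze, Svec & Popolo; a real dose calculation requires their published
    # viscosity and vocal fold deformation rules.

    def vocal_doses(f0_hz, spl_db, frame_s=0.03):
        """Accumulate time dose (s) and distance dose (m) over voiced frames.

        f0_hz, spl_db: per-frame fundamental frequency and sound pressure
        level; f0 is 0 (or None) for unvoiced frames, which contribute nothing.
        """
        time_dose = 0.0      # total phonation time, in seconds
        distance_dose = 0.0  # total oscillatory path length of tissue, in meters
        for f0, spl in zip(f0_hz, spl_db):
            if not f0:                      # skip unvoiced frames
                continue
            time_dose += frame_s
            # Placeholder amplitude rule (meters): louder and lower-pitched
            # phonation -> larger vibration amplitude. Purely illustrative.
            amplitude = 1e-3 * 10 ** ((spl - 70.0) / 40.0) * (120.0 / f0) ** 0.5
            # Per cycle, a tissue particle travels roughly 4x the amplitude;
            # f0 cycles occur per second of voicing.
            distance_dose += 4.0 * amplitude * f0 * frame_s
        return time_dose, distance_dose

    # Example: 10 minutes of continuous phonation at 120 Hz and 70 dB SPL
    frames = int(600 / 0.03)
    t_dose, d_dose = vocal_doses([120.0] * frames, [70.0] * frames)
    print(f"time dose: {t_dose:.0f} s, distance dose: {d_dose:.0f} m")
    ```

    With these made-up constants the example accumulates a distance dose of roughly 288 m over 10 minutes, which at least shows why a safe distance dose on the order of 500 m corresponds to only tens of minutes of continuous voicing.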

  20. Evidence for cultural dialects in vocal emotion expression: acoustic classification within and across five nations.

    PubMed

    Laukka, Petri; Neiberg, Daniel; Elfenbein, Hillary Anger

    2014-06-01

    The possibility of cultural differences in the fundamental acoustic patterns used to express emotion through the voice is an unanswered question central to the larger debate about the universality versus cultural specificity of emotion. This study used emotionally inflected standard-content speech segments expressing 11 emotions produced by 100 professional actors from 5 English-speaking cultures. Machine learning simulations were employed to classify expressions based on their acoustic features, using conditions where training and testing were conducted on stimuli coming from either the same or different cultures. A wide range of emotions were classified with above-chance accuracy in cross-cultural conditions, suggesting vocal expressions share important characteristics across cultures. However, classification showed an in-group advantage with higher accuracy in within- versus cross-cultural conditions. This finding demonstrates cultural differences in expressive vocal style, and supports the dialect theory of emotions according to which greater recognition of expressions from in-group members results from greater familiarity with culturally specific expressive styles.
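
    The within- versus cross-cultural comparison in this design maps onto a standard train/test split over culture labels: cross-validate a classifier inside one culture, then train on one culture and test on another. The scikit-learn sketch below is a hypothetical illustration with random placeholder features, not the authors' pipeline; the culture labels, feature dimensions, and sample counts are all invented.

    ```python
    # Hypothetical sketch of within- vs cross-cultural emotion classification.
    # X: acoustic feature matrix (e.g., F0, intensity, spectral measures),
    # y: emotion labels, culture: culture label per sample. All made up here.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    n, d = 1000, 20
    X = rng.normal(size=(n, d))                  # placeholder acoustic features
    y = rng.integers(0, 11, size=n)              # 11 emotion categories
    culture = rng.choice(["US", "AU", "IN", "KE", "SG"], size=n)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

    # Within-culture accuracy: cross-validation inside one culture.
    mask = culture == "US"
    within = cross_val_score(clf, X[mask], y[mask], cv=3).mean()

    # Cross-cultural accuracy: train on one culture, test on another.
    train, test = culture == "US", culture == "AU"
    cross = clf.fit(X[train], y[train]).score(X[test], y[test])

    print(f"within-culture: {within:.2f}, cross-cultural: {cross:.2f}")
    ```

    On real acoustic features, the in-group advantage the study reports would show up as the within-culture score exceeding the cross-cultural one; on the random placeholder data here, both hover at chance.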

  1. Take time to smell the frogs: vocal sac glands of reed frogs (Anura: Hyperoliidae) contain species-specific chemical cocktails

    PubMed Central

    Starnberger, Iris; Poth, Dennis; Peram, Pardha Saradhi; Schulz, Stefan; Vences, Miguel; Knudsen, Jette; Barej, Michael F; Rödel, Mark-Oliver; Walzl, Manfred; Hödl, Walter

    2013-01-01

    Males of all reed frog species (Anura: Hyperoliidae) have a prominent, often colourful, gular patch on their vocal sac, which is particularly conspicuous once the vocal sac is inflated. Although the presence, shape, and form of the gular patch are well-known diagnostic characters for these frogs, its function remains unknown. By integrating biochemical and histological methods, we found strong evidence that the gular patch is a gland producing volatile compounds, which might be emitted while calling. Volatile compounds were confirmed by gas chromatography–mass spectrometry in the gular glands in 11 species of the hyperoliid genera Afrixalus, Heterixalus, Hyperolius, and Phlyctimantis. Comparing the gular gland contents of 17 specimens of four sympatric Hyperolius species yielded a large variety of 65 compounds in species-specific combinations. We suggest that reed frogs might use a complex combination of at least acoustic and chemical signals in species recognition and mate choice. PMID:24277973

  2. The perception of self in birds.

    PubMed

    Derégnaucourt, Sébastien; Bovet, Dalila

    2016-10-01

    The perception of self is an important topic in several disciplines such as ethology, behavioral ecology, psychology, developmental and cognitive neuroscience. Self-perception is investigated by experimentally exposing different species of animals to self-stimuli such as their own image, smell or vocalizations. Here we review more than one hundred studies using these methods in birds, a taxonomic group that exhibits a rich diversity regarding ecology and behavior. Exposure to self-image is the main method for studying self-recognition, while exposing birds to their own smell is generally used for the investigation of homing or odor-based kin discrimination. Self-produced vocalizations - especially in oscine songbirds - are used as stimuli for understanding the mechanisms of vocal coding/decoding both at the neural and at the behavioral levels. With this review, we highlight the necessity to study the perception of self in animals cross-modally and to consider the role of experience and development, aspects that can be easily monitored in captive populations of birds. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. NMDA or non-NMDA receptor antagonism within the amygdaloid central nucleus suppresses the affective dimension of pain in rats: evidence for hemispheric synergy.

    PubMed

    Spuz, Catherine A; Borszcz, George S

    2012-04-01

    The amygdala contributes to generation of affective behaviors to threats. The prototypical threat to an individual is exposure to a noxious stimulus and the amygdaloid central nucleus (CeA) receives nociceptive input that is mediated by glutamatergic neurotransmission. The present study evaluated the contribution of glutamate receptors in CeA to generation of the affective response to acute pain in rats. Vocalizations that occur following a brief noxious tail shock (vocalization afterdischarges) are a validated rodent model of pain affect, and were preferentially suppressed by bilateral injection into CeA of the NMDA receptor antagonist D-2-amino-5-phosphonovalerate (AP5, 1 μg, 2 μg, or 4 μg) or the non-NMDA receptor antagonist 6-Cyano-7-nitroquinoxaline-2,3-dione disodium (CNQX, .25 μg, .5 μg, 1 μg, or 2 μg). Vocalizations that occur during tail shock were suppressed to a lesser degree, whereas spinal motor reflexes (tail flick and hind limb movements) were unaffected by injection of AP5 or CNQX into CeA. Unilateral administration of AP5 or CNQX into CeA of either hemisphere also selectively elevated vocalization thresholds. Bilateral administration of AP5 or CNQX produced greater increases in vocalization thresholds than the same doses of antagonists administered unilaterally into either hemisphere, indicating synergistic hemispheric interactions. The amygdala contributes to production of emotional responses to environmental threats. Blocking glutamate neurotransmission within the central nucleus of the amygdala suppressed rats' emotional response to acute painful stimulation. Understanding the neurobiology underlying emotional responses to pain will provide insights into new treatments for pain and its associated affective disorders. Copyright © 2012 American Pain Society. Published by Elsevier Inc. All rights reserved.

  4. Effects of speech style, room acoustics, and vocal fatigue on vocal effort

    PubMed Central

    Bottalico, Pasquale; Graetzer, Simone; Hunter, Eric J.

    2016-01-01

    Vocal effort is a physiological measure that accounts for changes in voice production as vocal loading increases. It has been quantified in terms of sound pressure level (SPL). This study investigates how vocal effort is affected by speaking style, room acoustics, and short-term vocal fatigue. Twenty subjects were recorded while reading a text at normal and loud volumes in anechoic, semi-reverberant, and reverberant rooms in the presence of classroom babble noise. The acoustics in each environment were modified by creating a strong first reflection in the talker position. After each task, the subjects answered questions addressing their perception of the vocal effort, comfort, control, and clarity of their own voice. Variation in SPL for each subject was measured per task. It was found that SPL and self-reported effort increased in the loud style and decreased when the reflective panels were present and when reverberation time increased. Self-reported comfort and control decreased in the loud style, while self-reported clarity increased when panels were present. The lowest magnitude of vocal fatigue was experienced in the semi-reverberant room. The results indicate that early reflections may be used to reduce vocal effort without modifying reverberation time. PMID:27250179

  5. Integrating perspectives on vocal performance and consistency

    PubMed Central

    Sakata, Jon T.; Vehrencamp, Sandra L.

    2012-01-01

    Recent experiments in divergent fields of birdsong have revealed that vocal performance is important for reproductive success and under active control by distinct neural circuits. Vocal consistency, the degree to which the spectral properties (e.g. dominant or fundamental frequency) of song elements are produced consistently from rendition to rendition, has been highlighted as a biologically important aspect of vocal performance. Here, we synthesize functional, developmental and mechanistic (neurophysiological) perspectives to generate an integrated understanding of this facet of vocal performance. Behavioral studies in the field and laboratory have found that vocal consistency is affected by social context, season and development, and, moreover, positively correlated with reproductive success. Mechanistic investigations have revealed a contribution of forebrain and basal ganglia circuits and sex steroid hormones to the control of vocal consistency. Across behavioral, developmental and mechanistic studies, a convergent theme regarding the importance of vocal practice in juvenile and adult songbirds emerges, providing a basis for linking these levels of analysis. By understanding vocal consistency at these levels, we gain an appreciation for the various dimensions of song control and plasticity and argue that genes regulating the function of basal ganglia circuits and sex steroid hormones could be sculpted by sexual selection. PMID:22189763

  6. Effect of artificially lengthened vocal tract on vocal fold oscillation's fundamental frequency.

    PubMed

    Hanamitsu, Masakazu; Kataoka, Hideyuki

    2004-06-01

    The fundamental frequency of vocal fold oscillation (F(0)) is controlled by laryngeal mechanics and aerodynamic properties. F(0) change per unit change of transglottal pressure (dF/dP) has been studied using a shutter valve and found to have a nonlinear, V-shaped relationship with F(0). On the other hand, the vocal tract is also known to affect vocal fold oscillation. This study examined the effect of an artificially lengthened vocal tract on dF/dP. dF/dP was measured in six men using two mouthpieces of different lengths. The dF/dP graph for the longer vocal tract was shifted leftward relative to the shorter one. Using the one-mass model, the nadir of the "V" on the dF/dP graph was strongly influenced by the resonance around the first formant frequency. However, a more precise model is needed to account for the effects of viscosity and turbulence.
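
    Since dF/dP is simply the slope of F0 against transglottal pressure, it can be estimated from paired (pressure, F0) measurements by a least-squares fit over a small pressure window. The numbers below are invented for illustration and are not data from the study.

    ```python
    # Illustrative estimate of dF/dP (Hz per unit pressure) from paired
    # (transglottal pressure, F0) measurements. All numbers are invented.
    import numpy as np

    pressure_kpa = np.array([0.6, 0.7, 0.8, 0.9, 1.0])      # hypothetical
    f0_hz = np.array([118.0, 121.5, 124.0, 127.2, 130.1])   # hypothetical

    # Least-squares slope of F0 against pressure over the window
    dF_dP, intercept = np.polyfit(pressure_kpa, f0_hz, 1)
    print(f"dF/dP ≈ {dF_dP:.1f} Hz/kPa")
    ```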

  7. Visualizing Collagen Network Within Human and Rhesus Monkey Vocal Folds Using Polarized Light Microscopy

    PubMed Central

    Julias, Margaret; Riede, Tobias; Cook, Douglas

    2014-01-01

    Objectives Collagen fiber content and orientation affect the viscoelastic properties of the vocal folds, determining oscillation characteristics during speech and other vocalization. The investigation and reconstruction of the collagen network in vocal folds remains a challenge, because the collagen network requires at least micron-scale resolution. In this study, we used polarized light microscopy to investigate the distribution and alignment of collagen fibers within the vocal folds. Methods Data were collected in sections of human and rhesus monkey (Macaca mulatta) vocal folds cut at 3 different angles and stained with picrosirius red. Results Statistically significant differences were found between different section angles, implying that more than one section angle is required to capture the network’s complexity. In the human vocal folds, the collagen fiber distribution continuously varied across the lamina propria (medial to lateral). Distinct differences in birefringence distribution were observed between the species. For the human vocal folds, high birefringence was observed near the thyroarytenoid muscle and near the epithelium. However, in the rhesus monkey vocal folds, high birefringence was observed near the epithelium, and lower birefringence was seen near the thyroarytenoid muscle. Conclusions The differences between the collagen networks in human and rhesus monkey vocal folds provide a morphological basis for differences in viscoelastic properties between species. PMID:23534129

  8. The Human Voice in Speech and Singing

    NASA Astrophysics Data System (ADS)

    Lindblom, Björn; Sundberg, Johan

    This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this air stream to an intermittent or pulsating air stream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the air stream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter, how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7. The impressive control of all these acoustic characteristics of vocal signals is discussed in Sect. 16.8, while Sect. 16.9 considers expressive aspects of vocal communication.

  9. The Human Voice in Speech and Singing

    NASA Astrophysics Data System (ADS)

    Lindblom, Björn; Sundberg, Johan

    This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this air stream to an intermittent or pulsating air stream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the air stream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter, how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7. The impressive control of all these acoustic characteristics of vocal signals is discussed in Sect. 16.8, while Sect. 16.9 considers expressive aspects of vocal communication.
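
    The three-stage account in this chapter corresponds to the classic source-filter view: a glottal pulse train at the fundamental frequency is shaped by vocal tract resonances (formants). The sketch below is a textbook-style simplification, not the chapter's own formulation; the impulse-train source, the two-pole resonator formant filters, and the approximate /a/ formant values are all standard simplifying assumptions.

    ```python
    # Minimal source-filter sketch: an impulse-train "voice source" filtered
    # through two-pole resonators approximating the first two formants.
    # Formant values for /a/ are textbook approximations, not from the chapter.
    import numpy as np
    from scipy.signal import lfilter

    fs = 16000                  # sampling rate (Hz)
    f0 = 120                    # fundamental frequency (Hz)
    dur = 0.5                   # duration (s)

    # Source: glottal pulse train approximated as an impulse train at F0
    n = int(fs * dur)
    source = np.zeros(n)
    source[:: fs // f0] = 1.0

    def resonator(x, freq, bw, fs):
        """Filter x through a two-pole resonance at `freq` Hz, bandwidth `bw` Hz."""
        r = np.exp(-np.pi * bw / fs)
        theta = 2 * np.pi * freq / fs
        a = [1.0, -2 * r * np.cos(theta), r * r]   # poles set the formant
        b = [1 - r]                                 # rough gain normalization
        return lfilter(b, a, x)

    # Filter: cascade two formants, roughly /a/-like (F1 ~ 700 Hz, F2 ~ 1200 Hz)
    vowel = resonator(resonator(source, 700, 130, fs), 1200, 170, fs)
    ```

    Changing the resonator frequencies while keeping the source fixed changes the vowel quality without changing the pitch, which is exactly the separation of source and filter the chapter describes.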

  10. Understanding environmental sounds in sentence context.

    PubMed

    Uddin, Sophia; Heald, Shannon L M; Van Hedger, Stephen C; Klos, Serena; Nusbaum, Howard C

    2018-03-01

    There is debate about how individuals use context to successfully predict and recognize words. One view argues that context supports neural predictions that make use of the speech motor system, whereas other views argue for a sensory or conceptual level of prediction. While environmental sounds can convey clear referential meaning, they are not linguistic signals, and are thus neither produced with the vocal tract nor typically encountered in sentence context. We compared the effect of spoken sentence context on recognition and comprehension of spoken words versus nonspeech, environmental sounds. In Experiment 1, sentence context decreased the amount of signal needed for recognition of spoken words and environmental sounds in similar fashion. In Experiment 2, listeners judged sentence meaning in both high and low contextually constraining sentence frames, when the final word was present or replaced with a matching environmental sound. Results showed that sentence constraint affected decision time similarly for speech and nonspeech, such that high constraint sentences (i.e., frame plus completion) were processed faster than low constraint sentences for speech and nonspeech. Linguistic context facilitates the recognition and understanding of nonspeech sounds in much the same way as for spoken words. This argues against a simple form of a speech-motor explanation of predictive coding in spoken language understanding, and suggests support for conceptual-level predictions. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Vocal performance affects metabolic rate in dolphins: implications for animals communicating in noisy environments.

    PubMed

    Holt, Marla M; Noren, Dawn P; Dunkin, Robin C; Williams, Terrie M

    2015-06-01

    Many animals produce louder, longer or more repetitious vocalizations to compensate for increases in environmental noise. Biological costs of increased vocal effort in response to noise, including energetic costs, remain empirically undefined in many taxa, particularly in marine mammals that rely on sound for fundamental biological functions in increasingly noisy habitats. For this investigation, we tested the hypothesis that an increase in vocal effort would result in an energetic cost to the signaler by experimentally measuring oxygen consumption during rest and a 2 min vocal period in dolphins that were trained to vary vocal loudness across trials. Vocal effort was quantified as the total acoustic energy of sounds produced. Metabolic rates during the vocal period were, on average, 1.2 and 1.5 times resting metabolic rate (RMR) in dolphin A and B, respectively. As vocal effort increased, we found that there was a significant increase in metabolic rate over RMR during the 2 min following sound production in both dolphins, and in total oxygen consumption (metabolic cost of sound production plus recovery costs) in the dolphin that showed a wider range of vocal effort across trials. Increases in vocal effort, as a consequence of increases in vocal amplitude, repetition rate and/or duration, are consistent with behavioral responses to noise in free-ranging animals. Here, we empirically demonstrate for the first time in a marine mammal, that these vocal modifications can have an energetic impact at the individual level and, importantly, these data provide a mechanistic foundation for evaluating biological consequences of vocal modification in noise-polluted habitats. © 2015. Published by The Company of Biologists Ltd.

  12. Weight-Bearing MR Imaging as an Option in the Study of Gravitational Effects on the Vocal Tract of Untrained Subjects in Singing Phonation

    PubMed Central

    Traser, Louisa; Burdumy, Michael; Richter, Bernhard; Vicari, Marco; Echternach, Matthias

    2014-01-01

    Magnetic Resonance Imaging (MRI) of subjects in a supine position can be used to evaluate the configuration of the vocal tract during phonation. However, studies of speech phonation have shown that gravity can affect vocal tract shape and bias measurements. This is one of the reasons that MRI studies of singing phonation have used professionally trained singers as subjects, because they are generally considered to be less affected by the supine body position and environmental distractions. A study of untrained singers might not only contribute to the understanding of intuitive singing function and aid the evaluation of potential hazards for vocal health, but also provide insights into the effect of the supine position on singers in general. In the present study, an open configuration 0.25 T MRI system with a rotatable examination bed was used to study the effect of body position in 20 vocally untrained subjects. The subjects were asked to sing sustained tones in both supine and upright body positions on different pitches and in different register conditions. Morphometric measurements were taken from the acquired images of a sagittal slice depicting the vocal tract. The analysis concerning the vocal tract configuration in the two body positions revealed differences in 5 out of 10 measured articulatory parameters. In the upright position the jaw was less protruded, the uvula was elongated, the larynx was more tilted, and the tongue was positioned more to the front of the mouth than in the supine position. The findings presented are in agreement with several studies on gravitational effects in speech phonation, but contrast with the results of a previous study by our group on professional singers, where only minor differences between upright and supine body posture were observed. The present study demonstrates that imaging of the vocal tract using weight-bearing MR imaging is a feasible tool for the study of sustained phonation in singing for vocally untrained subjects. PMID:25379885

  13. Weight-bearing MR imaging as an option in the study of gravitational effects on the vocal tract of untrained subjects in singing phonation.

    PubMed

    Traser, Louisa; Burdumy, Michael; Richter, Bernhard; Vicari, Marco; Echternach, Matthias

    2014-01-01

    Magnetic Resonance Imaging (MRI) of subjects in a supine position can be used to evaluate the configuration of the vocal tract during phonation. However, studies of speech phonation have shown that gravity can affect vocal tract shape and bias measurements. This is one of the reasons that MRI studies of singing phonation have used professionally trained singers as subjects, because they are generally considered to be less affected by the supine body position and environmental distractions. A study of untrained singers might not only contribute to the understanding of intuitive singing function and aid the evaluation of potential hazards for vocal health, but also provide insights into the effect of the supine position on singers in general. In the present study, an open configuration 0.25 T MRI system with a rotatable examination bed was used to study the effect of body position in 20 vocally untrained subjects. The subjects were asked to sing sustained tones in both supine and upright body positions on different pitches and in different register conditions. Morphometric measurements were taken from the acquired images of a sagittal slice depicting the vocal tract. The analysis concerning the vocal tract configuration in the two body positions revealed differences in 5 out of 10 measured articulatory parameters. In the upright position the jaw was less protruded, the uvula was elongated, the larynx was more tilted, and the tongue was positioned more to the front of the mouth than in the supine position. The findings presented are in agreement with several studies on gravitational effects in speech phonation, but contrast with the results of a previous study by our group on professional singers, where only minor differences between upright and supine body posture were observed. The present study demonstrates that imaging of the vocal tract using weight-bearing MR imaging is a feasible tool for the study of sustained phonation in singing for vocally untrained subjects.

  14. Modulating Phonation Through Alteration of Vocal Fold Medial Surface Contour

    PubMed Central

    Mau, Ted; Muhlestein, Joseph; Callahan, Sean; Chan, Roger W.

    2012-01-01

    Objectives 1. To test whether alteration of the vocal fold medial surface contour can improve phonation. 2. To demonstrate that implant material properties affect vibration even when the implant is deep to the vocal fold lamina propria. Study Design Induced phonation of excised human larynges. Methods Thirteen larynges were harvested within 24 hours post-mortem. Phonation threshold pressure (PTP) and flow (PTF) were measured before and after vocal fold injections using either calcium hydroxylapatite (CaHA) or hyaluronic acid (HA). Small-volume injections (median 0.0625 mL) were targeted to the infero-medial aspect of the thyroarytenoid (TA) muscle. Implant locations were assessed histologically. Results The effect of implantation on PTP was material-dependent. CaHA tended to increase PTP, whereas HA tended to decrease PTP (Wilcoxon test P = 0.00013 for onset). In contrast, the effect of implantation on PTF was similar, with both materials tending to decrease PTF (P = 0.16 for onset). Histology confirmed implant presence in the inferior half of the vocal fold vertical thickness. Conclusions Taken together, these data suggested the implants may have altered the vocal fold medial surface contour, potentially resulting in a less convergent or more rectangular glottal geometry as a means to improve phonation. An implant with a closer viscoelastic match to the vocal fold cover is desirable for this purpose, as material properties can affect vibration even when the implant is not placed within the lamina propria. This result is consistent with theoretical predictions and implies greater need for surgical precision in implant placement and care in material selection. PMID:22865592

  15. Tissue Engineering-based Therapeutic Strategies for Vocal Fold Repair and Regeneration

    PubMed Central

    Li, Linqing; Stiadle, Jeanna M.; Lau, Hang K.; Zerdoum, Aidan B.; Jia, Xinqiao; Thibeault, Susan L.; Kiick, Kristi L.

    2016-01-01

    Vocal folds are soft laryngeal connective tissues with distinct layered structures and complex multicomponent matrix compositions that endow phonatory and respiratory functions. This delicate tissue is easily damaged by various environmental factors and pathological conditions, altering vocal biomechanics and causing debilitating vocal disorders that detrimentally affect the daily lives of suffering individuals. Modern techniques and advanced knowledge of regenerative medicine have led to a deeper understanding of the microstructure, microphysiology, and micropathophysiology of vocal fold tissues. State-of-the-art materials ranging from extracellular-matrix (ECM)-derived biomaterials to synthetic polymer scaffolds have been proposed for the prevention and treatment of voice disorders including vocal fold scarring and fibrosis. This review intends to provide a thorough overview of current achievements in the field of vocal fold tissue engineering, including the fabrication of injectable biomaterials to mimic in vitro cell microenvironments, novel designs of bioreactors that capture in vivo tissue biomechanics, and establishment of various animal models to characterize the in vivo biocompatibility of these materials. The combination of polymeric scaffolds, cell transplantation, biomechanical stimulation, and delivery of antifibrotic growth factors will lead to successful restoration of functional vocal folds and improved vocal recovery in animal models, facilitating the application of these materials and related methodologies in clinical practice. PMID:27619243

  16. The effects of physiological adjustments on the perceptual and acoustical characteristics of simulated laryngeal vocal tremor

    PubMed Central

    Lester, Rosemary A.; Story, Brad H.

    2015-01-01

    The purpose of this study was to determine if adjustments to the voice source [i.e., fundamental frequency (F0), degree of vocal fold adduction] or vocal tract filter (i.e., vocal tract shape for vowels) reduce the perception of simulated laryngeal vocal tremor and to determine if listener perception could be explained by characteristics of the acoustical modulations. This research was carried out using a computational model of speech production that allowed for precise control and manipulation of the glottal and vocal tract configurations. Forty-two healthy adults participated in a perceptual study involving paired comparisons of the magnitude of “shakiness” in simulated samples of laryngeal vocal tremor. Results revealed that listeners perceived a higher magnitude of voice modulation when simulated samples had a higher mean F0, a greater degree of vocal fold adduction, and a vocal tract shape for /i/ vs /ɑ/. However, the effect of F0 was significant only when glottal noise was not present in the acoustic signal. Acoustical analyses were performed with the simulated samples to determine the features that affected listeners' judgments. Based on regression analyses, listeners' judgments were predicted to some extent by modulation information present in both low and high frequency bands. PMID:26328711
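
    In such simulations, laryngeal tremor is commonly approximated as low-frequency modulation of the voice source. The sketch below synthesizes a frequency-modulated tone as a minimal stand-in; all parameters (tremor rate, depth, mean F0) are illustrative, and the computational speech-production model used in the study is far more detailed.

    ```python
    # Illustrative F0 modulation for simulated vocal tremor: a sinusoid whose
    # instantaneous frequency wobbles at a ~5 Hz tremor rate. Parameters are
    # invented; the study used a full computational model of speech production.
    import numpy as np

    fs = 16000
    t = np.arange(0, 2.0, 1 / fs)        # 2 s of signal

    f0_mean = 200.0                      # mean fundamental frequency (Hz)
    tremor_rate = 5.0                    # modulation rate (Hz)
    tremor_depth = 0.04                  # +/-4% frequency excursion

    # Instantaneous F0; integrating it gives the phase of the tone
    f0_t = f0_mean * (1 + tremor_depth * np.sin(2 * np.pi * tremor_rate * t))
    phase = 2 * np.pi * np.cumsum(f0_t) / fs
    signal = np.sin(phase)               # frequency-modulated "voice"
    ```

    Raising f0_mean or tremor_depth in this toy signal mimics the manipulations whose perceptual consequences the study measured.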

  17. Facial Expression Enhances Emotion Perception Compared to Vocal Prosody: Behavioral and fMRI Studies.

    PubMed

    Zhang, Heming; Chen, Xuhai; Chen, Shengdong; Li, Yansong; Chen, Changming; Long, Quanshan; Yuan, Jiajin

    2018-05-09

    Facial and vocal expressions are essential modalities mediating the perception of emotion and social communication. Nonetheless, currently little is known about how emotion perception and its neural substrates differ across facial expression and vocal prosody. To clarify this issue, functional MRI scans were acquired in Study 1, in which participants were asked to discriminate the valence of emotional expression (angry, happy or neutral) from facial, vocal, or bimodal stimuli. In Study 2, we used an affective priming task (unimodal materials as primers and bimodal materials as target) and participants were asked to rate the intensity, valence, and arousal of the targets. Study 1 showed higher accuracy and shorter response latencies in the facial than in the vocal modality for a happy expression. Whole-brain analysis showed enhanced activation during facial compared to vocal emotions in the inferior temporal-occipital regions. Region of interest analysis showed a higher percentage signal change for facial than for vocal anger in the superior temporal sulcus. Study 2 showed that facial relative to vocal priming of anger had a greater influence on perceived emotion for bimodal targets, irrespective of the target valence. These findings suggest that facial expression is associated with enhanced emotion perception compared to equivalent vocal prosodies.

  18. Is laughter a better vocal change detector than a growl?

    PubMed

    Pinheiro, Ana P; Barros, Carla; Vasconcelos, Margarida; Obermeier, Christian; Kotz, Sonja A

    2017-07-01

    The capacity to predict what should happen next and to minimize any discrepancy between an expected and an actual sensory input (prediction error) is a central aspect of perception. Particularly in vocal communication, the effective prediction of an auditory input that informs the listener about the emotionality of a speaker is critical. What is currently unknown is how the perceived valence of an emotional vocalization affects the capacity to predict and detect a change in the auditory input. This question was probed in a combined event-related potential (ERP) and time-frequency analysis approach. Specifically, we examined the brain response to standards (Repetition Positivity) and to deviants (Mismatch Negativity - MMN), as well as the anticipatory response to the vocal sounds (pre-stimulus beta oscillatory power). Short neutral, happy (laughter), and angry (growls) vocalizations were presented both as standard and deviant stimuli in a passive oddball listening task while participants watched a silent movie and were instructed to ignore the vocalizations. MMN amplitude was increased for happy compared to neutral and angry vocalizations. The Repetition Positivity was enhanced for happy standard vocalizations. Induced pre-stimulus upper beta power was increased for happy vocalizations, and predicted the modulation of the standard Repetition Positivity. These findings indicate enhanced sensory prediction for positive vocalizations such as laughter. Together, the results suggest that positive vocalizations are more effective predictors in social communication than angry and neutral ones, possibly due to their high social significance. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Experimental analysis of the characteristics of artificial vocal folds.

    PubMed

    Misun, Vojtech; Svancara, Pavel; Vasek, Martin

    2011-05-01

    Specialized literature presents a number of models describing the function of the vocal folds. In most of those models, an emphasis is placed on the air flowing through the glottis and, further, on the effect of the parameters of the air alone (its mass, speed, and so forth). The article focuses on the constructional definition of artificial vocal folds and their experimental analysis. The analysis is conducted for source voice phonation and for a changing mean value of the subglottal pressure. The article further deals with the analysis of the pressure of the airflow through the vocal folds, which is cut (separated) into individual pulses by the vibrating vocal folds. The analysis results show that air pulse characteristics are relevant to voice generation, as they are produced by the flowing air and the vibrating vocal folds. A number of artificial vocal folds have been constructed to date, and the aforementioned view of their phonation is confirmed by their analysis. The experiments have confirmed that man is able to consciously affect only two parameters of the source voice, that is, its fundamental frequency and voice intensity. The main forces acting on the vocal folds during phonation are as follows: subglottal air pressure and the elastic and inertial forces of the vocal folds' structure. The correctness of the function of the artificial vocal folds is documented by the experimental verification of the spectra of several types of artificial vocal folds. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  20. Overcoming the Effects of Variation in Infant Speech Segmentation: Influences of Word Familiarity

    PubMed Central

    Singh, Leher; Nestor, Sarah S.; Bortfeld, Heather

    2010-01-01

    Previous studies have shown that 7.5-month-olds can track and encode words in fluent speech, but they fail to equate instances of a word that contrast in talker gender, vocal affect, and fundamental frequency. By 10.5 months, they succeed at generalizing across such variability, marking a clear transition period during which infants’ word recognition skills become qualitatively more mature. Here we explore the role of word familiarity in this critical transition and, in particular, whether words that occur frequently in a child’s listening environment (i.e., “Mommy” and “Daddy”) are more easily recognized when they differ in surface characteristics than those that infants have not previously encountered (termed nonwords). Results demonstrate that words are segmented from continuous speech in a more linguistically mature fashion than nonwords at 7.5 months, but at 10.5 months, both words and nonwords are segmented in a relatively mature fashion. These findings suggest that early word recognition is facilitated in cases where infants have had significant exposure to items, but at later stages, infants are able to segment items regardless of their presumed familiarity. PMID:21088702

  1. A Volumetric Analysis of the Vocal Tract Associated with Laryngectomees Using Acoustic Reflection Technology.

    PubMed

    Ng, Manwa L; Yan, Nan; Chan, Venus; Chen, Yang; Lam, Paul K Y

    2018-06-28

    Previous studies of the laryngectomized vocal tract using formant frequencies reported contradictory findings. Imaging studies of the vocal tract in alaryngeal speakers are limited due to the possible radiation effect as well as the cost and time associated with the studies. The present study examined the vocal tract configuration of laryngectomized individuals using acoustic reflection technology. Thirty alaryngeal and 30 laryngeal male speakers of Cantonese participated in the study. A pharyngometer was used to obtain volumetric information of the vocal tract. All speakers were instructed to imitate the production of /a/ while the length and volume information of the oral cavity, pharyngeal cavity, and the entire vocal tract was obtained. The data of alaryngeal and laryngeal speakers were compared. Pharyngometric measurements revealed no significant difference in the vocal tract dimensions between laryngeal and alaryngeal speakers. Despite the removal of the larynx and a possible alteration in the pharyngeal cavity during total laryngectomy, the vocal tract configuration (length and volume) in laryngectomized individuals was not significantly different from that of laryngeal speakers. It is suggested that other factors might have affected formant measures in previous studies. © 2018 S. Karger AG, Basel.

  2. Precise Motor Control Enables Rapid Flexibility in Vocal Behavior of Marmoset Monkeys.

    PubMed

    Pomberger, Thomas; Risueno-Segovia, Cristina; Löschner, Julia; Hage, Steffen R

    2018-03-05

    Investigating the evolution of human speech is difficult and controversial because human speech surpasses nonhuman primate vocal communication in scope and flexibility [1-3]. Monkey vocalizations have been assumed to be largely innate, highly affective, and stereotyped for over 50 years [4, 5]. Recently, this perception has dramatically changed. Current studies have revealed distinct learning mechanisms during vocal development [6-8] and vocal flexibility, allowing monkeys to cognitively control when [9, 10], where [11], and what to vocalize [10, 12, 13]. However, specific call features (e.g., duration, frequency) remain surprisingly robust and stable in adult monkeys, resulting in rather stereotyped and discrete call patterns [14]. Additionally, monkeys seem to be unable to modulate their acoustic call structure under reinforced conditions beyond natural constraints [15, 16]. Behavioral experiments have shown that monkeys can stop sequences of calls immediately after acoustic perturbation but cannot interrupt ongoing vocalizations, suggesting that calls consist of single impartible pulses [17, 18]. Using acoustic perturbation triggered by the vocal behavior itself and quantitative measures of resulting vocal adjustments, we show that marmoset monkeys are capable of producing calls with durations beyond the natural boundaries of their repertoire by interrupting ongoing vocalizations rapidly after perturbation onset. Our results indicate that marmosets are capable of interrupting vocalizations only at periodic time points throughout calls, further supported by the occurrence of periodically segmented phees. These ideas overturn decades-old concepts on primate vocal pattern generation, indicating that vocalizations do not consist of one discrete call pattern but are built of many sequentially uttered units, like human speech. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.

  3. Current Understanding and Future Directions for Vocal Fold Mechanobiology

    PubMed Central

    Li, Nicole Y.K.; Heris, Hossein K.; Mongeau, Luc

    2013-01-01

    The vocal folds, which are located in the larynx, are the main organ of voice production for human communication. The vocal folds are under continuous biomechanical stress similar to other mechanically active organs, such as the heart, lungs, tendons and muscles. During speech and singing, the vocal folds oscillate at frequencies ranging from 20 Hz to 3 kHz with amplitudes of a few millimeters. The biomechanical stress associated with accumulated phonation is believed to alter vocal fold cell activity and tissue structure in many ways. Excessive phonatory stress can damage tissue structure and induce a cell-mediated inflammatory response, resulting in a pathological vocal fold lesion. On the other hand, phonatory stress is one major factor in the maturation of the vocal folds into a specialized tri-layer structure. One specific form of vocal fold oscillation, which involves low impact and large amplitude excursion, is prescribed therapeutically for patients with mild vocal fold injuries. Although biomechanical forces affect vocal fold physiology and pathology, there is little understanding of how mechanical forces regulate these processes at the cellular and molecular level. Research into vocal fold mechanobiology has burgeoned over the past several years. Vocal fold bioreactors are being developed in several laboratories to provide a biomimic environment that allows the systematic manipulation of physical and biological factors on the cells of interest in vitro. Computer models have been used to simulate the integrated response of cells and proteins as a function of phonation stress. The purpose of this paper is to review current research on the mechanobiology of the vocal folds as it relates to growth, pathogenesis and treatment as well as to propose specific research directions that will advance our understanding of this subject. PMID:24812638

  4. Mating Signals Indicating Sexual Receptiveness Induce Unique Spatio-Temporal EEG Theta Patterns in an Anuran Species

    PubMed Central

    Fang, Guangzhan; Yang, Ping; Cui, Jianguo; Yao, Dezhong; Brauth, Steven E.; Tang, Yezhong

    2012-01-01

    Female mate choice is of importance for individual fitness as well as a determining factor in genetic diversity and speciation. Nevertheless relatively little is known about how females process information acquired from males during mate selection. In the Emei music frog, Babina daunchina, males normally call from hidden burrows and females in the reproductive stage prefer male calls produced from inside burrows compared with ones from outside burrows. The present study evaluated changes in electroencephalogram (EEG) power output in four frequency bands induced by male courtship vocalizations on both sides of the telencephalon and mesencephalon in females. The results show that (1) both the values of left hemispheric theta relative power and global lateralization in the theta band are modulated by the sexual attractiveness of the acoustic stimulus in the reproductive stage, suggesting the theta oscillation is closely correlated with processing information associated with mate choice; (2) mean relative power in the beta band is significantly greater in the mesencephalon than in the left telencephalon, regardless of reproductive status or the biological significance of signals, indicating it is associated with processing acoustic features; and (3) relative power in the delta and alpha bands is not affected by reproductive status or acoustic stimuli. The results imply that EEG power in the theta and beta bands reflects different information processing mechanisms related to vocal recognition and auditory perception in anurans. PMID:23285010

  5. Evidence for an audience effect in mice: male social partners alter the male vocal response to female cues

    PubMed Central

    Seagraves, Kelly M.; Arthur, Ben J.; Egnor, S. E. Roian

    2016-01-01

    Mice (Mus musculus) form large and dynamic social groups and emit ultrasonic vocalizations in a variety of social contexts. Surprisingly, these vocalizations have been studied almost exclusively in the context of cues from only one social partner, despite the observation that in many social species the presence of additional listeners changes the structure of communication signals. Here, we show that male vocal behavior elicited by female odor is affected by the presence of a male audience – with changes in vocalization count, acoustic structure and syllable complexity. We further show that single sensory cues are not sufficient to elicit this audience effect, indicating that multiple cues may be necessary for an audience to be apparent. Together, these experiments reveal that some features of mouse vocal behavior are only expressed in more complex social situations, and introduce a powerful new assay for measuring detection of the presence of social partners in mice. PMID:27207951

  6. Mycosis fungoides of the true vocal cord: a case report and review of the literature.

    PubMed

    Maleki, Zahra; Azmi, Farrukh

    2010-09-01

    Mycosis fungoides is the most common type of cutaneous malignant T cell lymphoma, which primarily affects the skin. However, extracutaneous manifestations may occur in advanced stages, mostly observed in postmortem studies. We present a case of mycosis fungoides that disseminated to the true vocal cord of a 48-year-old African American man who presented with hoarseness. Only two cases demonstrating this rare involvement of the true vocal cord have been reported in the English literature. In both cases, mycosis fungoides infiltration of the true vocal cord was seen postmortem, along with visceral dissemination of mycosis fungoides. We herein describe a single extracutaneous manifestation of mycosis fungoides in the true vocal cord of a living patient with a 21-year history of mycosis fungoides. Vocal cord involvement by mycosis fungoides must be considered among the differential diagnoses in any mycosis fungoides patient who complains of persistent hoarseness. Awareness of this entity is clinically important due to the necessity of different management.

  7. Evidence for an audience effect in mice: male social partners alter the male vocal response to female cues.

    PubMed

    Seagraves, Kelly M; Arthur, Ben J; Egnor, S E Roian

    2016-05-15

    Mice (Mus musculus) form large and dynamic social groups and emit ultrasonic vocalizations in a variety of social contexts. Surprisingly, these vocalizations have been studied almost exclusively in the context of cues from only one social partner, despite the observation that in many social species the presence of additional listeners changes the structure of communication signals. Here, we show that male vocal behavior elicited by female odor is affected by the presence of a male audience - with changes in vocalization count, acoustic structure and syllable complexity. We further show that single sensory cues are not sufficient to elicit this audience effect, indicating that multiple cues may be necessary for an audience to be apparent. Together, these experiments reveal that some features of mouse vocal behavior are only expressed in more complex social situations, and introduce a powerful new assay for measuring detection of the presence of social partners in mice. © 2016. Published by The Company of Biologists Ltd.

  8. Exploring the zebra finch Taeniopygia guttata as a novel animal model for the speech-language deficit of fragile X syndrome.

    PubMed

    Winograd, Claudia; Ceman, Stephanie

    2012-01-01

    Fragile X syndrome (FXS) is the most common cause of inherited intellectual disability and presents with markedly atypical speech-language, likely due to impaired vocal learning. Although current models have been useful for studies of some aspects of FXS, zebra finch is the only tractable lab model for vocal learning. The neural circuits for vocal learning in the zebra finch have clear relationships to the pathways in the human brain that may be affected in FXS. Further, finch vocal learning may be quantified using software designed specifically for this purpose. Knockdown of the zebra finch FMR1 gene may ultimately enable novel tests of therapies that are modality-specific, using drugs or even social strategies, to ameliorate deficits in vocal development and function. In this chapter, we describe the utility of the zebra finch model and present a hypothesis for the role of FMRP in the developing neural circuitry for vocalization.

  9. Lubrication mechanism of the larynx during phonation: an experiment in excised canine larynges.

    PubMed

    Nakagawa, H; Fukuda, H; Kawaida, M; Shiotani, A; Kanzaki, J

    1998-01-01

    To evaluate how the viscosity of the laryngeal mucus influences vocal fold vibration, two fluids of differing viscosity were applied separately to excised canine larynges and experimental phonation was induced. Vibration of the vocal folds was measured by use of a laryngostroboscope and an X-ray stroboscope. With the high viscosity fluid, the amplitude of vibration of the free edge and the peak glottal area was decreased while the open quotient was increased. Because the viscosity of this fluid affected the wave motion of the vocal fold mucosa, changes in viscosity of the mucus may be involved in causing such disorders as hoarseness, in the absence of apparent changes in the vocal folds themselves.

  10. A Vowel-Based Method for Vocal Tract Control in Clarinet Pedagogy

    ERIC Educational Resources Information Center

    González, Darleny; Payri, Blas

    2017-01-01

    Our review of the scientific literature shows that activity inside the clarinetist's vocal tract (VT) affects pitch and timbre, while also facilitating technical exercises. Clarinetists adapt their VT intuitively and, in some cases, may compensate for an inadequate VT configuration through unnecessary pressure, resulting in technical blockage,…

  11. Aging Affects Identification of Vocal Emotions in Semantically Neutral Sentences

    ERIC Educational Resources Information Center

    Dupuis, Kate; Pichora-Fuller, M. Kathleen

    2015-01-01

    Purpose: The authors determined the accuracy of younger and older adults in identifying vocal emotions using the Toronto Emotional Speech Set (TESS; Dupuis & Pichora-Fuller, 2010a) and investigated the possible contributions of auditory acuity and suprathreshold processing to emotion identification accuracy. Method: In 2 experiments, younger…

  12. Production, Usage, and Comprehension in Animal Vocalizations

    ERIC Educational Resources Information Center

    Seyfarth, Robert M.; Cheney, Dorothy L.

    2010-01-01

    In this review, we place equal emphasis on production, usage, and comprehension because these components of communication may exhibit different developmental trajectories and be affected by different neural mechanisms. In the animal kingdom generally, learned, flexible vocal production is rare, appearing in only a few orders of birds and few…

  13. A Computational Study of Vocal Fold Dehydration During Phonation.

    PubMed

    Wu, Liang; Zhang, Zhaoyan

    2017-12-01

While vocal fold dehydration is often considered an important factor contributing to vocal fatigue, it remains unclear whether vocal fold vibration alone can induce dehydration severe enough to have a noticeable effect on phonation and perceived vocal effort. A three-dimensional model was developed to investigate vocal fold systemic dehydration and surface dehydration during phonation. Based on linear poroelastic theory, the model considered water resupply from blood vessels through the lateral boundary, water movement within the vocal folds, water exchange between the vocal folds and the surface liquid layer through the epithelium, and surface fluid accumulation and discharge to the glottal airway. Parametric studies were conducted to investigate water loss within the vocal folds and from the surface after a 5-min sustained phonation under different permeability and vibration conditions. The results showed that dehydration generally increased with increasing vibration amplitude, increasing epithelial permeability, and reduced water resupply. With adequate water resupply, a large-amplitude vibration can induce an overall systemic dehydration as high as 3%. The distribution of water loss within the vocal folds was non-uniform, and a local dehydration higher than 5% was observed even under conditions of low overall systemic dehydration (<1%). Such a high level of water loss may severely affect tissue properties, muscular function, and phonation characteristics. In contrast, water loss from the surface liquid layer was generally an order of magnitude higher than water loss inside the vocal folds, indicating that the surface dehydration level is likely not a good indicator of systemic dehydration.
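
    The three-dimensional poroelastic model is not reproduced here, but the water budget it describes can be caricatured with a lumped-parameter sketch in which tissue water is lost at a rate set by vibration amplitude and epithelial permeability, and replenished from the vasculature in proportion to the deficit. All rate constants below are hypothetical round numbers chosen only to show the qualitative trade-off, not values from the paper.

        # Minimal lumped water-balance sketch of vocal fold hydration during
        # phonation. NOT the authors' 3D poroelastic model; all rate constants
        # are hypothetical and illustrate only the qualitative behavior.
        def dehydration_after(minutes=5.0, dt=0.01, amplitude=1.0,
                              permeability=0.010,  # epithelial loss factor (1/min)
                              resupply=0.30):      # vascular resupply rate (1/min)
            w = 1.0                                # water content; 1.0 = hydrated
            for _ in range(int(minutes / dt)):
                loss = permeability * amplitude    # loss grows with vibration
                gain = resupply * (1.0 - w)        # resupply driven by deficit
                w += dt * (gain - loss)
            return 1.0 - w                         # fractional dehydration

        for amp in (0.5, 1.0, 2.0):                # small to large vibration
            print(f"amplitude {amp:.1f}: "
                  f"{100 * dehydration_after(amplitude=amp):.1f}% water loss")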

  14. Sensing emotion in voices: Negativity bias and gender differences in a validation study of the Oxford Vocal ('OxVoc') sounds database.

    PubMed

    Young, Katherine S; Parsons, Christine E; LeBeau, Richard T; Tabak, Benjamin A; Sewart, Amy R; Stein, Alan; Kringelbach, Morten L; Craske, Michelle G

    2017-08-01

Emotional expressions are an essential element of human interactions. Recent work has increasingly recognized that emotional vocalizations can color and shape interactions between individuals. Here we present data on the psychometric properties of a recently developed database of authentic nonlinguistic emotional vocalizations from human adults and infants (the Oxford Vocal 'OxVoc' Sounds Database; Parsons, Young, Craske, Stein, & Kringelbach, 2014). In a large sample (n = 562), we demonstrate that adults can reliably categorize these sounds (as 'positive,' 'negative,' or 'sounds with no emotion') and rate their valence consistently over time. In an extended sample (n = 945, including the initial n = 562), we also investigated a number of individual-difference factors in relation to valence ratings of these vocalizations. Results demonstrated small but significant effects: (a) symptoms of depression and anxiety were associated with more negative ratings of adult neutral vocalizations (R2 = .011 and R2 = .008, respectively), and (b) gender differences emerged in perceived valence, such that female listeners rated adult neutral vocalizations more positively and infant cry vocalizations more negatively than male listeners (R2 = .021 and R2 = .010, respectively). Of note, we did not find evidence of a negativity bias among other affective vocalizations, or of gender differences in the perceived valence of adult laughter, adult cries, infant laughter, or infant neutral vocalizations. Together, these findings largely converge with factors previously shown to impact the processing of emotional facial expressions, suggesting a modality-independent impact of depression, anxiety, and listener gender, particularly among vocalizations with more ambiguous valence. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  15. A Mechanism for Frequency Modulation in Songbirds Shared with Humans

    PubMed Central

    Margoliash, Daniel

    2013-01-01

    In most animals that vocalize, control of fundamental frequency is a key element for effective communication. In humans, subglottal pressure controls vocal intensity but also influences fundamental frequency during phonation. Given the underlying similarities in the biomechanical mechanisms of vocalization in humans and songbirds, songbirds offer an attractive opportunity to study frequency modulation by pressure. Here, we present a novel technique for dynamic control of subsyringeal pressure in zebra finches. By regulating the opening of a custom-built fast valve connected to the air sac system, we achieved partial or total silencing of specific syllables, and could modify syllabic acoustics through more complex manipulations of air sac pressure. We also observed that more nuanced pressure variations over a limited interval during production of a syllable concomitantly affected the frequency of that syllable segment. These results can be explained in terms of a mathematical model for phonation that incorporates a nonlinear description for the vocal source capable of generating the observed frequency modulations induced by pressure variations. We conclude that the observed interaction between pressure and frequency was a feature of the source, not a result of feedback control. Our results indicate that, beyond regulating phonation or its absence, regulation of pressure is important for control of fundamental frequencies of vocalizations. Thus, although there are separate brainstem pathways for syringeal and respiratory control of song production, both can affect airflow and frequency. We hypothesize that the control of pressure and frequency is combined holistically at higher levels of the vocalization pathways. PMID:23825417

  16. A mechanism for frequency modulation in songbirds shared with humans.

    PubMed

    Amador, Ana; Margoliash, Daniel

    2013-07-03

    In most animals that vocalize, control of fundamental frequency is a key element for effective communication. In humans, subglottal pressure controls vocal intensity but also influences fundamental frequency during phonation. Given the underlying similarities in the biomechanical mechanisms of vocalization in humans and songbirds, songbirds offer an attractive opportunity to study frequency modulation by pressure. Here, we present a novel technique for dynamic control of subsyringeal pressure in zebra finches. By regulating the opening of a custom-built fast valve connected to the air sac system, we achieved partial or total silencing of specific syllables, and could modify syllabic acoustics through more complex manipulations of air sac pressure. We also observed that more nuanced pressure variations over a limited interval during production of a syllable concomitantly affected the frequency of that syllable segment. These results can be explained in terms of a mathematical model for phonation that incorporates a nonlinear description for the vocal source capable of generating the observed frequency modulations induced by pressure variations. We conclude that the observed interaction between pressure and frequency was a feature of the source, not a result of feedback control. Our results indicate that, beyond regulating phonation or its absence, regulation of pressure is important for control of fundamental frequencies of vocalizations. Thus, although there are separate brainstem pathways for syringeal and respiratory control of song production, both can affect airflow and frequency. We hypothesize that the control of pressure and frequency is combined holistically at higher levels of the vocalization pathways.
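
    The source model itself is not reproduced in the abstract, but the qualitative mechanism, amplitude-dependent frequency in a nonlinear self-sustained oscillator, can be sketched with a toy van der Pol oscillator given a hardening (cubic) restoring force: raising the pressure-like energy parameter raises the oscillation amplitude and, through the nonlinearity, the frequency. All constants below are hypothetical; this is an illustration in the spirit of the model described, not the authors' equations.

        # Toy self-oscillating source: a van der Pol oscillator with a cubic
        # (hardening) restoring force. The pressure-like parameter p feeds
        # energy in, so amplitude grows with p, and the hardening spring then
        # raises the oscillation frequency. Constants are hypothetical.
        import numpy as np
        from scipy.integrate import solve_ivp

        K = (2 * np.pi * 100.0) ** 2   # linear stiffness
        BETA = 5e7                     # hardening (cubic) stiffness
        MU = 300.0                     # energy injection/dissipation scale

        def source(t, y, p):
            x, v = y
            return [v, MU * (p - x**2) * v - K * x - BETA * x**3]

        def f0(p, dur=0.2, fs=40000):
            t = np.linspace(0.0, dur, int(dur * fs))
            sol = solve_ivp(source, (0.0, dur), [0.01, 0.0], t_eval=t,
                            args=(p,), rtol=1e-8, atol=1e-10)
            x = sol.y[0][len(t) // 2:]                     # drop the transient
            up = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]  # upward zero crossings
            return (len(up) - 1) * fs / (up[-1] - up[0])

        for p in (0.5, 1.0, 2.0):      # higher "pressure" yields higher f0
            print(f"pressure parameter {p:.1f} -> f0 ~ {f0(p):.0f} Hz")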

  17. Classification of complex information: inference of co-occurring affective states from their expressions in speech.

    PubMed

    Sobol-Shikler, Tal; Robinson, Peter

    2010-07-01

    We present a classification algorithm for inferring affective states (emotions, mental states, attitudes, and the like) from their nonverbal expressions in speech. It is based on the observations that affective states can occur simultaneously and different sets of vocal features, such as intonation and speech rate, distinguish between nonverbal expressions of different affective states. The input to the inference system was a large set of vocal features and metrics that were extracted from each utterance. The classification algorithm conducted independent pairwise comparisons between nine affective-state groups. The classifier used various subsets of metrics of the vocal features and various classification algorithms for different pairs of affective-state groups. Average classification accuracy of the 36 pairwise machines was 75 percent, using 10-fold cross validation. The comparison results were consolidated into a single ranked list of the nine affective-state groups. This list was the output of the system and represented the inferred combination of co-occurring affective states for the analyzed utterance. The inference accuracy of the combined machine was 83 percent. The system automatically characterized over 500 affective state concepts from the Mind Reading database. The inference of co-occurring affective states was validated by comparing the inferred combinations to the lexical definitions of the labels of the analyzed sentences. The distinguishing capabilities of the system were comparable to human performance.
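
    The pairwise architecture is straightforward to sketch. The example below (Python, scikit-learn, synthetic feature vectors) trains one binary classifier per pair of nine groups (36 machines) and consolidates their decisions into a single ranked list, the form of output the abstract describes. The actual system used different vocal-feature subsets and different classification algorithms per pair; everything here is illustrative.

        # Pairwise classification of nine affective-state groups, consolidated
        # into a ranked list. Synthetic features stand in for the vocal metrics.
        from itertools import combinations
        import numpy as np
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        GROUPS = list(range(9))                    # nine affective-state groups
        X = rng.normal(size=(900, 20)) + np.repeat(np.arange(9), 100)[:, None] * 0.5
        y = np.repeat(GROUPS, 100)

        # One binary machine per pair: 36 machines for 9 groups.
        machines = {}
        for a, b in combinations(GROUPS, 2):
            mask = (y == a) | (y == b)
            machines[(a, b)] = SVC().fit(X[mask], y[mask])

        def rank_states(utterance_features):
            """Consolidate the 36 pairwise decisions into a ranked list."""
            votes = dict.fromkeys(GROUPS, 0)
            for clf in machines.values():
                votes[clf.predict(utterance_features[None, :])[0]] += 1
            return sorted(GROUPS, key=lambda g: votes[g], reverse=True)

        print(rank_states(X[0]))   # groups ordered from most to least supported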

  18. Current treatment of vocal fold scarring.

    PubMed

    Hirano, Shigeru

    2005-06-01

Vocal fold scarring remains a therapeutic challenge, with the most problematic issue being the histologic changes that are primarily responsible for altering the viscoelasticity of the vocal fold mucosa. Optimal treatment for vocal fold scarring has not yet been established. To restore or regenerate damaged vocal folds, it is important to investigate the changes to the layer structure of the lamina propria. Tissue engineering and regenerative medicine may provide new strategies for the prevention and treatment of vocal fold scarring. Recent developments in this field are reviewed in the present article. Histologic studies have revealed that hyaluronic acid, fibronectin, decorin, and various other extracellular matrix components, as well as collagen, may contribute to determining the vibratory properties of the vocal fold mucosa. Changes in these molecules are thought to affect the viscoelasticity of scarred vocal folds. Based on such histologic findings, innovative approaches have been developed, including administration of hyaluronic acid into injured or scarred vocal folds. Other strategies that have recently shown advances include growth factor therapy and cell therapy using stem cells or mature fibroblasts. The effects of these new treatments have not yet been fully confirmed clinically, but there seems to be great therapeutic potential in such regenerative medical strategies. Recent research has revealed the detailed histologic and rheologic changes related to vocal fold scarring. Based on these findings, various new therapeutic strategies have been developed in animal models using tissue engineering and regenerative medicine. However, no clinical trials have been performed, and more studies are necessary to establish the optimal modality.

  19. Rhesus macaques recognize unique multi-modal face-voice relations of familiar individuals and not of unfamiliar ones

    PubMed Central

    Habbershon, Holly M.; Ahmed, Sarah Z.; Cohen, Yale E.

    2013-01-01

    Communication signals in non-human primates are inherently multi-modal. However, for laboratory-housed monkeys, there is relatively little evidence in support of the use of multi-modal communication signals in individual recognition. Here, we used a preferential-looking paradigm to test whether laboratory-housed rhesus could “spontaneously” (i.e., in the absence of operant training) use multi-modal communication stimuli to discriminate between known conspecifics. The multi-modal stimulus was a silent movie of two monkeys vocalizing and an audio file of the vocalization from one of the monkeys in the movie. We found that the gaze patterns of those monkeys that knew the individuals in the movie were reliably biased toward the individual that did not produce the vocalization. In contrast, there was not a systematic gaze pattern for those monkeys that did not know the individuals in the movie. These data are consistent with the hypothesis that laboratory-housed rhesus can recognize and distinguish between conspecifics based on auditory and visual communication signals. PMID:23774779

  20. Identification of prelinguistic phonological categories.

    PubMed

    Ramsdell, Heather L; Oller, D Kimbrough; Buder, Eugene H; Ethington, Corinna A; Chorna, Lesya

    2012-12-01

    The prelinguistic infant's babbling repertoire of syllables--the phonological categories that form the basis for early word learning--is noticed by caregivers who interact with infants around them. Prior research on babbling has not explored the caregiver's role in recognition of early vocal categories as foundations for word learning. In the present work, the authors begin to address this gap. The authors explored vocalizations produced by 8 infants at 3 ages (8, 10, and 12 months) in studies illustrating identification of phonological categories through caregiver report, laboratory procedures simulating the caregiver's natural mode of listening, and the more traditional laboratory approach (phonetic transcription). Caregivers reported small repertoires of syllables for their infants. Repertoires of similar size and phonetic content were discerned in the laboratory by judges who simulated the caregiver's natural mode of listening. However, phonetic transcription with repeated listening to infant recordings yielded repertoire sizes that vastly exceeded those reported by caregivers and naturalistic listeners. The results suggest that caregiver report and naturalistic listening by laboratory staff can provide a new way to explore key characteristics of early infant vocal categories, a way that may provide insight into later speech and language development.

  1. Ultrasonic vocalization changes and FOXP2 expression after experimental stroke.

    PubMed

    Doran, Sarah J; Trammel, Cassandra; Benashaski, Sharon E; Venna, Venugopal Reddy; McCullough, Louise D

    2015-04-15

Speech impairments affect one in four stroke survivors. However, animal models of post-ischemic vocalization deficits are limited. Male mice vocalize at ultrasonic frequencies when exposed to an estrous female mouse. In this study we assessed vocalization patterns and quantity in male mice after cerebral ischemia. FOXP2, a gene associated with verbal dyspraxia in humans, with known roles in neurogenesis and synaptic plasticity, was also examined after injury. Using a transient middle cerebral artery occlusion (MCAO) model, we assessed correlates of vocal impairment at several time-points after stroke. Further, to identify possible lateralization of vocalization deficits, the effects of left and right hemispheric strokes were compared. Significant differences in vocalization quantity were observed between stroke and sham animals that persisted for a month after injury. Injury to the left hemisphere reduced early vocalizations more profoundly than injury to the right hemisphere. Foxp2 expression was elevated in the nucleus early after stroke (at 6 h) but was significantly decreased in both the nucleus and the cytoplasm 24 h after injury. Neuronal Foxp2 expression increased in stroke mice compared to sham animals 4 weeks after injury. This study demonstrates that quantifiable deficits in ultrasonic vocalizations (USVs) are seen after stroke. USVs may be a useful tool to assess chronic behavioral recovery in murine models of stroke. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Bi-stable vocal fold adduction: a mechanism of modal-falsetto register shifts and mixed registration.

    PubMed

    Titze, Ingo R

    2014-04-01

    The origin of vocal registers has generally been attributed to differential activation of cricothyroid and thyroarytenoid muscles in the larynx. Register shifts, however, have also been shown to be affected by glottal pressures exerted on vocal fold surfaces, which can change with loudness, pitch, and vowel. Here it is shown computationally and with empirical data that intraglottal pressures can change abruptly when glottal adductory geometry is changed relatively smoothly from convergent to divergent. An intermediate shape between large convergence and large divergence, namely, a nearly rectangular glottal shape with almost parallel vocal fold surfaces, is associated with mixed registration. It can be less stable than either of the highly angular shapes unless transglottal pressure is reduced and upper stiffness of vocal fold tissues is balanced with lower stiffness. This intermediate state of adduction is desirable because it leads to a low phonation threshold pressure with moderate vocal fold collision. Achieving mixed registration consistently across wide ranges of F0, lung pressure, and vocal tract shapes appears to be a balancing act of coordinating laryngeal muscle activation with vocal tract pressures. Surprisingly, a large transglottal pressure is not facilitative in this process, exacerbating the bi-stable condition and the associated register contrast.

  3. Bi-stable vocal fold adduction: A mechanism of modal-falsetto register shifts and mixed registration

    PubMed Central

    Titze, Ingo R.

    2014-01-01

    The origin of vocal registers has generally been attributed to differential activation of cricothyroid and thyroarytenoid muscles in the larynx. Register shifts, however, have also been shown to be affected by glottal pressures exerted on vocal fold surfaces, which can change with loudness, pitch, and vowel. Here it is shown computationally and with empirical data that intraglottal pressures can change abruptly when glottal adductory geometry is changed relatively smoothly from convergent to divergent. An intermediate shape between large convergence and large divergence, namely, a nearly rectangular glottal shape with almost parallel vocal fold surfaces, is associated with mixed registration. It can be less stable than either of the highly angular shapes unless transglottal pressure is reduced and upper stiffness of vocal fold tissues is balanced with lower stiffness. This intermediate state of adduction is desirable because it leads to a low phonation threshold pressure with moderate vocal fold collision. Achieving mixed registration consistently across wide ranges of F0, lung pressure, and vocal tract shapes appears to be a balancing act of coordinating laryngeal muscle activation with vocal tract pressures. Surprisingly, a large transglottal pressure is not facilitative in this process, exacerbating the bi-stable condition and the associated register contrast. PMID:25235006

  4. Using bedding in a test environment critically affects 50-kHz ultrasonic vocalizations in laboratory rats.

    PubMed

    Natusch, C; Schwarting, R K W

    2010-09-01

Rats utter distinct classes of ultrasonic vocalizations depending on their developmental stage, current state, and situational factors. One class, comprising the so-called 50-kHz calls, is typical of situations where rats are anticipating or actually experiencing rewarding stimuli, like being tickled by an experimenter, or when treated with drugs of abuse, such as the psychostimulant amphetamine. Furthermore, rats emit 50-kHz calls when exposed to a clean housing cage. Here, we show that such vocalization effects can depend on subtle details of the testing situation, namely the presence of fresh rodent bedding. Indeed, we found that adult males vocalize more in bedded cages than in bare ones. Also, two experiments showed that adult rats emitted more 50-kHz calls when tickled on fresh bedding. Furthermore, intraperitoneal amphetamine led to more 50-kHz vocalization in activity boxes containing such bedding as compared to bare ones. The analysis of psychomotor activation did not yield such group differences in the case of locomotion and centre time, except for rearing duration in rats tested on bedding. Also, the temporal profile of vocalization did not parallel that of behavioural activation, since the effects on vocalization peaked and started to decline again before those on psychomotor activation. Therefore, 50-kHz calls are not a simple correlate of psychomotor activation. A final experiment with a choice procedure showed that rats prefer bedded conditions. Overall, we assume that bedded environments induce a positive affective state, which increases the likelihood of 50-kHz calling. Based on these findings, we recommend that contextual factors, like bedding, receive more research attention, since they can apparently decrease the aversiveness of a testing situation. We also recommend measuring rat ultrasonic vocalization more routinely, especially when studying emotion and motivation, since this analysis can provide information about the subject's state that may not be detected in its visible behaviour. Copyright 2010 Elsevier Inc. All rights reserved.

  5. Sparrowhawk movement, calling, and presence of dead conspecifics differentially impact blue tit (Cyanistes caeruleus) vocal and behavioral mobbing responses.

    PubMed

    Carlson, Nora V; Pargeter, Helen M; Templeton, Christopher N

    2017-01-01

Many animals alter their anti-predator behavior in accordance with the threat level of a predator. While much research has examined variation in mobbing responses to different predators, few studies have investigated how anti-predator behavior is affected by changes in a predator's own state or behavior. We examined the effect of sparrowhawk (Accipiter nisus) behavior on the mobbing response of wild blue tits (Cyanistes caeruleus) using robotic taxidermy sparrowhawks. We manipulated whether the simulated predator moved its head, produced vocalizations, or held a taxidermy blue tit in its talons. When any sparrowhawk model was present, blue tits decreased foraging and increased anti-predator behavior and vocalizations. Additionally, each manipulation of the model predator's state (moving, vocalizing, or the presence of a dead conspecific) affected different types of blue tit anti-predator behavior and vocalizations. These results indicate that different components of mobbing vary according to the specific state of a given predator, beyond its presence or absence, and suggest that each might play a different role in the overall mobbing response. Last, our results indicate that using more life-like predator stimuli (those featuring simple head movements and audio playback of vocalizations) changes how prey respond to the predator; these 'robo-raptor' models provide a powerful tool for increased realism in simulated predator encounters without sacrificing experimental control. Anti-predator behavior is often modulated by the threat level posed by a particular predator. While much research has tested how different types of predators change prey behavior, few experiments have examined how predator behavior affects the anti-predator responses of prey. By experimentally manipulating robotic predators, we show that blue tits not only respond to the presence of a sparrowhawk, by decreasing feeding and increasing anti-predator behavior and vocalizations, but also vary specific anti-predator behaviors when encountering differently behaving predators (moving, vocalizing, or those with captured prey), suggesting that prey pay attention to their predators' state and behavior.

  6. Dissociable Effects on Birdsong of Androgen Signaling in Cortex-Like Brain Regions of Canaries

    PubMed Central

    2017-01-01

The neural basis of how learned vocalizations change during development and in adulthood represents a major challenge facing cognitive neuroscience. This plasticity in the degree to which learned vocalizations can change in both humans and songbirds is linked to the actions of sex steroid hormones during ontogeny but also in adulthood, in the context of seasonal changes in birdsong. We investigated the role of steroid hormone signaling in the brain on distinct features of birdsong using adult male canaries (Serinus canaria), which show extensive seasonal vocal plasticity as adults. Specifically, we bilaterally implanted the potent androgen receptor antagonist flutamide in two key brain regions that control birdsong. We show that androgen signaling in the motor cortical-like brain region, the robust nucleus of the arcopallium (RA), controls syllable and trill bandwidth stereotypy, while not significantly affecting higher-order features of song such as syllable-type usage (i.e., how many times each syllable type is used) or syllable sequences. In contrast, androgen signaling in the premotor cortical-like brain region, HVC (proper name), controls song variability by increasing the variability of syllable-type usage and syllable sequences, while having no effect on syllable or trill bandwidth stereotypy. Other aspects of song, such as the duration of trills and the number of syllables per song, were also differentially affected by androgen signaling in HVC versus RA. These results implicate androgens in regulating distinct features of complex motor output in a precise and nonredundant manner. SIGNIFICANCE STATEMENT Vocal plasticity is linked to the actions of sex steroid hormones, but the precise mechanisms are unclear. We investigated this question in adult male canaries (Serinus canaria), which show extensive vocal plasticity throughout their life. We show that androgens in two cortex-like vocal control brain regions regulate distinct aspects of vocal plasticity. For example, in HVC (proper name), androgens regulate variability in syntax but not phonology, whereas androgens in the robust nucleus of the arcopallium (RA) regulate variability in phonology but not syntax. Temporal aspects of song were also differentially affected by androgen signaling in HVC versus RA. Thus, androgen signaling may reduce vocal plasticity by acting in a nonredundant and precise manner in the brain. PMID:28821656

  7. Call recognition and individual identification of fish vocalizations based on automatic speech recognition: An example with the Lusitanian toadfish.

    PubMed

    Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C

    2015-12-01

The study of acoustic communication in animals often requires not only the recognition of species-specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools for extracting the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented, inspired by successful results obtained with the most widely known and complex acoustic communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover, this method also proved to be a powerful tool for assessing signal durations in large data sets. However, the system failed to recognize other sound types.
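
    As a rough illustration of the identification step (detection of calls in continuous recordings is a separate stage), the sketch below (Python) trains one Gaussian-emission hidden Markov model per individual on MFCC sequences of its calls and attributes a new recording to the highest-scoring model. It assumes the third-party librosa and hmmlearn packages; file names, frame settings, and state counts are hypothetical and not taken from the paper.

        # Per-individual HMMs over MFCC sequences; an unknown call is assigned
        # to the individual whose model gives the highest log-likelihood.
        import librosa
        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        def mfcc_seq(path, sr=8000, n_mfcc=13):
            y, sr = librosa.load(path, sr=sr)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # (frames, coeffs)

        def train_individual_model(paths, n_states=5):
            seqs = [mfcc_seq(p) for p in paths]
            X, lengths = np.vstack(seqs), [len(s) for s in seqs]
            return GaussianHMM(n_components=n_states).fit(X, lengths)

        models = {   # hypothetical training recordings, one set per male
            "male_A": train_individual_model(["male_A_call1.wav", "male_A_call2.wav"]),
            "male_B": train_individual_model(["male_B_call1.wav", "male_B_call2.wav"]),
        }

        def identify(path):
            feats = mfcc_seq(path)
            return max(models, key=lambda k: models[k].score(feats))

        print(identify("unknown_boatwhistle.wav"))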

  8. Recognition of facial expressions and prosodic cues with graded emotional intensities in adults with Asperger syndrome.

    PubMed

    Doi, Hirokazu; Fujisawa, Takashi X; Kanai, Chieko; Ohta, Haruhisa; Yokoi, Hideki; Iwanami, Akira; Kato, Nobumasa; Shinohara, Kazuyuki

    2013-09-01

    This study investigated the ability of adults with Asperger syndrome to recognize emotional categories of facial expressions and emotional prosodies with graded emotional intensities. The individuals with Asperger syndrome showed poorer recognition performance for angry and sad expressions from both facial and vocal information. The group difference in facial expression recognition was prominent for stimuli with low or intermediate emotional intensities. In contrast to this, the individuals with Asperger syndrome exhibited lower recognition accuracy than typically-developed controls mainly for emotional prosody with high emotional intensity. In facial expression recognition, Asperger and control groups showed an inversion effect for all categories. The magnitude of this effect was less in the Asperger group for angry and sad expressions, presumably attributable to reduced recruitment of the configural mode of face processing. The individuals with Asperger syndrome outperformed the control participants in recognizing inverted sad expressions, indicating enhanced processing of local facial information representing sad emotion. These results suggest that the adults with Asperger syndrome rely on modality-specific strategies in emotion recognition from facial expression and prosodic information.

  9. Exploring vocal recovery after cranial nerve injury in Bengalese finches.

    PubMed

    Urbano, Catherine M; Peterson, Jennifer R; Cooper, Brenton G

    2013-02-08

    Songbirds and humans use auditory feedback to acquire and maintain their vocalizations. The Bengalese finch (Lonchura striata domestica) is a songbird species that rapidly modifies its vocal output to adhere to an internal song memory. In this species, the left side of the bipartite vocal organ is specialized for producing louder, higher frequencies (≥2.2kHz) and denervation of the left vocal muscles eliminates these notes. Thus, the return of higher frequency notes after cranial nerve injury can be used as a measure of vocal recovery. Either the left or right side of the syrinx was denervated by resection of the tracheosyringeal portion of the hypoglossal nerve. Histologic analyses of syringeal muscle tissue showed significant muscle atrophy in the denervated side. After left nerve resection, songs were mainly composed of lower frequency syllables, but three out of five birds recovered higher frequency syllables. Right nerve resection minimally affected phonology, but it did change song syntax; syllable sequence became abnormally stereotyped after right nerve resection. Therefore, damage to the neuromuscular control of sound production resulted in reduced motor variability, and Bengalese finches are a potential model for functional vocal recovery following cranial nerve injury. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. Ultrasonic vocalizations: a tool for behavioural phenotyping of mouse models of neurodevelopmental disorders

    PubMed Central

    Scattoni, Maria Luisa; Crawley, Jacqueline; Ricceri, Laura

    2009-01-01

In neonatal mice, ultrasonic vocalizations have been studied both as an early communicative behavior of the pup-mother dyad and as a sign of an aversive affective state. Adult mice of both sexes produce complex ultrasonic vocalization patterns in different experimental/social contexts. All these vocalizations are becoming an increasingly valuable assay for behavioral phenotyping throughout the mouse life-span, and alterations of the ultrasound patterns have been reported in several mouse models of neurodevelopmental disorders. Here we also show that the modulation of vocalizations by maternal cues (the maternal potentiation paradigm), originally identified and investigated in rats, can be measured in C57Bl/6 mouse pups with appropriate modifications of the rat protocol and can likely be applied to mouse behavioral phenotyping. In addition, we suggest that a detailed qualitative evaluation of neonatal calls, together with analysis of adult mouse vocalization patterns of both sexes in social settings, may lead to a greater understanding of the communicative value of vocalizations in mice. Importantly, altered patterns of both neonatal and adult USVs can be detected during the behavioral phenotyping of mouse models of human neurodevelopmental and neuropsychiatric disorders, starting with those in which deficits in communication are a primary symptom. PMID:18771687

  11. Do obesity and weight loss affect vocal function?

    PubMed

    Solomon, Nancy Pearl; Helou, Leah B; Dietrich-Burns, Katie; Stojadinovic, Alexander

    2011-02-01

Obesity may be associated with increased tissue bulk in the laryngeal airway, neck, and chest wall, and as such may affect vocal function. Eight obese and eight nonobese adults participated in this study; the obese participants underwent bariatric surgical procedures. This mixed-design study included cross-sectional analysis of group differences and longitudinal analysis of multidimensional changes in vocal function across four assessments collected over 6 months. No significant differences were detected between groups at the preoperative assessment. Further, no changes were detected over time for acoustic parameters, maximum phonation time, laryngeal airway resistance, or airflow during a sustained vowel for either group. Only minor differences were detected over time, but not between groups, for perceptions of strain, pitch, and loudness of the voice. Phonation threshold pressure (PTP) at comfortable and high pitches (30% and 80% of the F0 range) changed significantly over time, but not between groups. Examination of individual data revealed a trend for PTP at 30% F0 to decrease as body mass index decreased. PTP may be informative for assessing vocal function in clients who present with obesity and voice symptoms. © Thieme Medical Publishers.

  12. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

    PubMed

    Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

    2016-11-01

    This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group-known as in-group advantage-results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
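
    A lens-model analysis pairs, for each acoustic cue, its correlation with the expressed state (cue externalization) against its correlation with listeners' judgments (cue utilization); matched cue-weight profiles on the two sides indicate shared expression and perception schemas. The sketch below (Python) reduces this to a single continuous dimension with synthetic data, purely to show the bookkeeping; the study itself analyzed multiple emotions, cultural groups, and many acoustic cues.

        # Toy lens-model bookkeeping on synthetic single-dimension data.
        import numpy as np

        rng = np.random.default_rng(1)
        n = 200
        expressed = rng.normal(size=n)            # e.g., intended arousal
        cues = {
            "mean_f0":     0.8 * expressed + rng.normal(scale=0.5, size=n),
            "intensity":   0.6 * expressed + rng.normal(scale=0.7, size=n),
            "speech_rate": 0.2 * expressed + rng.normal(scale=0.9, size=n),
        }
        # Listeners weight the cues (imperfectly) to form their judgments.
        judged = (0.7 * cues["mean_f0"] + 0.4 * cues["intensity"]
                  + rng.normal(scale=0.5, size=n))

        for name, cue in cues.items():
            externalization = np.corrcoef(cue, expressed)[0, 1]  # cue <-> sender
            utilization = np.corrcoef(cue, judged)[0, 1]         # cue <-> listener
            print(f"{name:12s} externalization={externalization:+.2f} "
                  f"utilization={utilization:+.2f}")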

  13. From Mimicry to Language: A Neuroanatomically Based Evolutionary Model of the Emergence of Vocal Language

    PubMed Central

    Poliva, Oren

    2016-01-01

    The auditory cortex communicates with the frontal lobe via the middle temporal gyrus (auditory ventral stream; AVS) or the inferior parietal lobule (auditory dorsal stream; ADS). Whereas the AVS is ascribed only with sound recognition, the ADS is ascribed with sound localization, voice detection, prosodic perception/production, lip-speech integration, phoneme discrimination, articulation, repetition, phonological long-term memory and working memory. Previously, I interpreted the juxtaposition of sound localization, voice detection, audio-visual integration and prosodic analysis, as evidence that the behavioral precursor to human speech is the exchange of contact calls in non-human primates. Herein, I interpret the remaining ADS functions as evidence of additional stages in language evolution. According to this model, the role of the ADS in vocal control enabled early Homo (Hominans) to name objects using monosyllabic calls, and allowed children to learn their parents' calls by imitating their lip movements. Initially, the calls were forgotten quickly but gradually were remembered for longer periods. Once the representations of the calls became permanent, mimicry was limited to infancy, and older individuals encoded in the ADS a lexicon for the names of objects (phonological lexicon). Consequently, sound recognition in the AVS was sufficient for activating the phonological representations in the ADS and mimicry became independent of lip-reading. Later, by developing inhibitory connections between acoustic-syllabic representations in the AVS and phonological representations of subsequent syllables in the ADS, Hominans became capable of concatenating the monosyllabic calls for repeating polysyllabic words (i.e., developed working memory). Finally, due to strengthening of connections between phonological representations in the ADS, Hominans became capable of encoding several syllables as a single representation (chunking). Consequently, Hominans began vocalizing and mimicking/rehearsing lists of words (sentences). PMID:27445676

  14. New Ideas for Speech Recognition and Related Technologies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holzrichter, J F

The ideas relating to the use of organ motion sensors for the purposes of speech recognition were first described by the author in spring 1994. During the past year, a series of productive collaborations between the author, Tom McEwan, and Larry Ng ensued and have led to demonstrations, new sensor ideas, and algorithmic descriptions of a large number of speech recognition concepts. This document summarizes the basic concepts of recognizing speech once organ motions have been obtained. Micro-power radars and their uses for the measurement of body organ motions, such as those of the heart and lungs, have been demonstrated by Tom McEwan over the past two years. McEwan and I conducted a series of experiments, using these instruments, on vocal organ motions beginning in late spring, during which we observed motions of vocal folds (i.e., cords), tongue, jaw, and related organs that are very useful for speech recognition and other purposes. These will be reviewed in a separate paper. Since late summer 1994, Lawrence Ng and I have worked to make many of the initial recognition ideas more rigorous and to investigate the applications of these new ideas to new speech recognition algorithms, to speech coding, and to speech synthesis. I introduce some of those ideas in section IV of this document, and we describe them more completely in the document following this one, UCRL-UR-120311. For the design and operation of micro-power radars and their application to body organ motions, the reader may contact Tom McEwan directly. The capability for using EM sensors (i.e., radar units) to measure body organ motions and positions has been available for decades. Impediments to their use appear to have been size, excessive power, lack of resolution, and lack of understanding of the value of organ motion measurements, especially as applied to speech-related technologies. However, with the invention of very low-power, portable systems, as demonstrated by McEwan at LLNL, researchers have begun to think differently about practical applications of such radars. In particular, his demonstrations of heart and lung motions have opened up many new areas of application for human and animal measurements.

  15. Current Treatment Options for Bilateral Vocal Fold Paralysis: A State-of-the-Art Review

    PubMed Central

    Li, Yike; Garrett, Gaelyn; Zealear, David

    2017-01-01

    Vocal fold paralysis (VFP) refers to neurological causes of reduced or absent movement of one or both vocal folds. Bilateral VFP (BVFP) is characterized by inspiratory dyspnea due to narrowing of the airway at the glottic level with both vocal folds assuming a paramedian position. The primary objective of intervention for BVFP is to relieve patients’ dyspnea. Common clinical options for management include tracheostomy, arytenoidectomy and cordotomy. Other options that have been used with varying success include reinnervation techniques and botulinum toxin (Botox) injections into the vocal fold adductors. More recently, research has focused on neuromodulation, laryngeal pacing, gene therapy, and stem cell therapy. These newer approaches have the potential advantage of avoiding damage to the voicing mechanism of the larynx with an added goal of restoring some physiologic movement of the affected vocal folds. However, clinical data are scarce for these new treatment options (i.e., reinnervation and pacing), so more investigative work is needed. These areas of research are expected to provide dramatic improvements in the treatment of BVFP. PMID:28669149

  16. Interactions of hyaluronan grafted on protein surfaces studied using a quartz crystal microbalance and a surface force balance.

    PubMed

    Jiang, Lei; Han, Juan; Yang, Limin; Ma, Hongchao; Huang, Bo

    2015-10-07

Vocal folds have a complex, multilayered structure in which the main layer is largely composed of hyaluronan (HA). The viscoelasticity of HA is key to voice production in the vocal fold, as it affects the initiation and maintenance of phonation. In this study a simple layer-structured surface model was set up to mimic the structure of the vocal folds. The interactions between two opposing surfaces bearing HA were measured and characterised to analyse HA's response to normal and shear compression at a stress level similar to that in the vocal fold. From the measurements of the quartz crystal microbalance, atomic force microscopy and the surface force balance, the osmotic pressure, normal interactions, elasticity change, volume fraction, refractive index and friction of both HA and the supporting protein layer were obtained. These findings may shed light on the physical mechanism of HA function in the vocal fold and the specific role of HA as an important component in the effective treatment of vocal fold disease.

  17. Effect of pneumotach on measurement of vocal function

    NASA Astrophysics Data System (ADS)

    Walters, Gage; McPhail, Michael; Krane, Michael

    2017-11-01

Aerodynamic and acoustic measurements of vocal function were performed in a physical model of the human airway with and without a pneumotach (Rothenberg mask), used by clinicians to measure vocal volume flow. The purpose of these experiments was to assess whether the device alters acoustic and aerodynamic conditions sufficiently to change phonation behavior. The airway model, which mimics the acoustic behavior of an adult human airway from trachea to mouth, consists of a 31.5 cm long straight duct with a 2.54 cm square cross section. Model vocal folds made of molded silicone rubber were set into vibration by introducing airflow from a compressed air source. Measurements included the transglottal pressure difference, mean volume flow, vocal fold vibratory motion, and sound pressure at the mouth. The experiments show that while the pneumotach imparted measurable aerodynamic and acoustic loads on the system, measurement of mean glottal resistance was not affected. Acoustic pressure levels were attenuated, however, suggesting that clinical acoustic measurements of vocal function need correction when performed in conjunction with a pneumotach. The authors acknowledge support from NIH DC R01005642-11.
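
    For reference, the clinical quantity reported here as unaffected, mean glottal resistance, is simply the ratio of mean transglottal pressure to mean glottal airflow. A one-function sketch (Python) with hypothetical values in common clinical units:

        # Mean glottal (laryngeal airway) resistance: transglottal pressure
        # divided by mean glottal flow. The example values are hypothetical.
        def glottal_resistance(pressure_cmH2O, flow_L_per_s):
            """Mean glottal resistance in cmH2O/(L/s)."""
            return pressure_cmH2O / flow_L_per_s

        # e.g., 7 cmH2O of transglottal pressure driving 0.15 L/s of mean flow:
        print(glottal_resistance(7.0, 0.15))   # ~46.7 cmH2O/(L/s)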

  18. Non-invasive In vivo measurement of the shear modulus of human vocal fold tissue

    PubMed Central

    Kazemirad, Siavash; Bakhshaee, Hani; Mongeau, Luc; Kost, Karen

    2014-01-01

Voice is an essential part of singing and speech communication. Voice disorders significantly affect quality of life. The viscoelastic mechanical properties of the vocal fold mucosa determine the characteristics of vocal fold oscillations, and thereby voice quality. In the present study, a non-invasive method was developed to determine the shear modulus of human vocal fold tissue in vivo via measurements of the mucosal wave propagation speed during phonation. Images of four human subjects' vocal folds were captured using high-speed digital imaging (HSDI) and magnetic resonance imaging (MRI) for different phonation pitches, specifically fundamental frequencies between 110 and 440 Hz. The MRI images were used to obtain the morphometric dimensions of each subject's vocal folds in order to determine the pixel size in the high-speed images. The mucosal wave propagation speed was determined for each subject and at each pitch value using an automated image processing algorithm. The transverse shear modulus of the vocal fold mucosa was then calculated from a surface (Rayleigh) wave propagation dispersion equation using the measured wave speeds. It was found that the mucosal wave propagation speed, and therefore the shear modulus of the vocal fold tissue, was generally greater at higher pitches. The results were in good agreement with those from other studies obtained via in vitro measurements, supporting the validity of the proposed measurement method. This method offers the potential for in vivo clinical assessment of vocal fold viscoelasticity from HSDI. PMID:24433668
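
    The wave-speed-to-modulus conversion can be illustrated with the standard approximation for a nearly incompressible elastic solid: the Rayleigh surface-wave speed is roughly 0.955 times the bulk shear-wave speed sqrt(mu/rho), so mu is approximately rho*(c/0.955)^2. The sketch below (Python) uses a hypothetical tissue density and wave speeds; the study itself used a full dispersion equation.

        # Back-of-envelope conversion from measured mucosal wave speed to shear
        # modulus, assuming the Rayleigh-wave approximation for a nearly
        # incompressible solid (c_R ~ 0.955 * sqrt(mu/rho)). The density and
        # example wave speeds are hypothetical, not values from the paper.
        def shear_modulus_pa(wave_speed_m_s, density_kg_m3=1040.0):
            """Shear modulus (Pa) from mucosal wave propagation speed (m/s)."""
            return density_kg_m3 * (wave_speed_m_s / 0.955) ** 2

        for c in (1.0, 2.0):   # faster waves at higher pitch imply stiffer tissue
            print(f"wave speed {c:.1f} m/s -> shear modulus "
                  f"{shear_modulus_pa(c) / 1000:.1f} kPa")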

  19. Catecholaminergic contributions to vocal communication signals.

    PubMed

    Matheson, Laura E; Sakata, Jon T

    2015-05-01

    Social context affects behavioral displays across a variety of species. For example, social context acutely influences the acoustic and temporal structure of vocal communication signals such as speech and birdsong. Despite the prevalence and importance of such social influences, little is known about the neural mechanisms underlying the social modulation of communication. Catecholamines are implicated in the regulation of social behavior and motor control, but the degree to which catecholamines influence vocal communication signals remains largely unknown. Using a songbird, the Bengalese finch, we examined the extent to which the social context in which song is produced affected immediate early gene expression (EGR-1) in catecholamine-synthesising neurons in the midbrain. Further, we assessed the degree to which administration of amphetamine, which increases catecholamine concentrations in the brain, mimicked the effect of social context on vocal signals. We found that significantly more catecholaminergic neurons in the ventral tegmental area and substantia nigra (but not the central grey, locus coeruleus or subcoeruleus) expressed EGR-1 in birds that were exposed to females and produced courtship song than in birds that produced non-courtship song in isolation. Furthermore, we found that amphetamine administration mimicked the effects of social context and caused many aspects of non-courtship song to resemble courtship song. Specifically, amphetamine increased the stereotypy of syllable structure and sequencing, the repetition of vocal elements and the degree of sequence completions. Taken together, these data highlight the conserved role of catecholamines in vocal communication across species, including songbirds and humans. © 2015 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.

  20. Cingulo-opercular activity affects incidental memory encoding for speech in noise.

    PubMed

    Vaden, Kenneth I; Teubner-Rhodes, Susan; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A

    2017-08-15

    Correctly understood speech in difficult listening conditions is often difficult to remember. A long-standing hypothesis for this observation is that the engagement of cognitive resources to aid speech understanding can limit resources available for memory encoding. This hypothesis is consistent with evidence that speech presented in difficult conditions typically elicits greater activity throughout cingulo-opercular regions of frontal cortex that are proposed to optimize task performance through adaptive control of behavior and tonic attention. However, successful memory encoding of items for delayed recognition memory tasks is consistently associated with increased cingulo-opercular activity when perceptual difficulty is minimized. The current study used a delayed recognition memory task to test competing predictions that memory encoding for words is enhanced or limited by the engagement of cingulo-opercular activity during challenging listening conditions. An fMRI experiment was conducted with twenty healthy adult participants who performed a word identification in noise task that was immediately followed by a delayed recognition memory task. Consistent with previous findings, word identification trials in the poorer signal-to-noise ratio condition were associated with increased cingulo-opercular activity and poorer recognition memory scores on average. However, cingulo-opercular activity decreased for correctly identified words in noise that were not recognized in the delayed memory test. These results suggest that memory encoding in difficult listening conditions is poorer when elevated cingulo-opercular activity is not sustained. Although increased attention to speech when presented in difficult conditions may detract from more active forms of memory maintenance (e.g., sub-vocal rehearsal), we conclude that task performance monitoring and/or elevated tonic attention supports incidental memory encoding in challenging listening conditions. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Utterance Duration as It Relates to Communicative Variables in Infant Vocal Development

    ERIC Educational Resources Information Center

    Ramsdell-Hudock, Heather L.; Stuart, Andrew; Parham, Douglas F.

    2018-01-01

    Purpose: We aimed to provide novel information on utterance duration as it relates to vocal type, facial affect, gaze direction, and age in the prelinguistic/early linguistic infant. Method: Infant utterances were analyzed from longitudinal recordings of 15 infants at 8, 10, 12, 14, and 16 months of age. Utterance durations were measured and coded…

  2. Cross-cultural emotional prosody recognition: evidence from Chinese and British listeners.

    PubMed

    Paulmann, Silke; Uskul, Ayse K

    2014-01-01

This cross-cultural study of emotional tone-of-voice recognition tests the in-group advantage hypothesis (Elfenbein & Ambady, 2002) employing a quasi-balanced design. Individuals of Chinese and British background were asked to recognise pseudosentences produced by Chinese and British native speakers, displaying one of seven emotions (anger, disgust, fear, happy, neutral tone of voice, sad, and surprise). Findings reveal that emotional displays were recognised at rates higher than predicted by chance; however, members of each cultural group were more accurate in recognising displays communicated by a member of their own cultural group than displays communicated by a member of the other cultural group. Moreover, the evaluation of error matrices indicates that both cultural groups relied on similar mechanisms when recognising emotional displays from the voice. Overall, the study reveals evidence for both universal and culture-specific principles in vocal emotion recognition.

  3. A humanized version of Foxp2 does not affect ultrasonic vocalization in adult mice.

    PubMed

    Hammerschmidt, K; Schreiweis, C; Minge, C; Pääbo, S; Fischer, J; Enard, W

    2015-11-01

The transcription factor FOXP2 has been linked to severe speech and language impairments in humans. An analysis of the evolution of the FOXP2 gene has identified two amino acid substitutions that became fixed after the split of the human and chimpanzee lineages. Studying the functional consequences of these two substitutions in the endogenous Foxp2 gene of mice showed alterations in dopamine levels, striatal synaptic plasticity, neuronal morphology and cortico-striatal-dependent learning. In addition, ultrasonic vocalizations (USVs) of pups had a significantly lower average pitch than those of control littermates. The degree to which adult USVs would be affected in mice carrying the 'humanized' Foxp2 variant remained unclear. In this study, we analyzed USVs of 68 adult male mice uttered during repeated courtship encounters with different females. Mice carrying the Foxp2(hum/hum) allele did not differ significantly from control littermates in the number of call elements, their element structure, or their element composition. We conclude that neither the structure nor the usage of USVs in adult mice is affected by the two amino acid substitutions that occurred in FOXP2 during human evolution. The reported effect on pup vocalization thus appears to be transient. These results are in line with accumulating evidence that mouse USVs are hardly influenced by vocal learning. Hence, the function and evolution of genes that are necessary, but not sufficient, for vocal learning in humans must be studied either at a different phenotypic level in mice or in other organisms. © 2015 The Authors. Genes, Brain and Behavior published by International Behavioural and Neural Genetics Society and John Wiley & Sons Ltd.

  4. Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.

    PubMed

    Mitra, Vikramjit; Nam, Hosung; Espy-Wilson, Carol Y; Saltzman, Elliot; Goldstein, Louis

    2010-09-13

    Many different studies have claimed that articulatory information can be used to improve the performance of automatic speech recognition systems. Unfortunately, such articulatory information is not readily available in typical speaker-listener situations. Consequently, such information has to be estimated from the acoustic signal in a process which is usually termed "speech-inversion." This study aims to propose and compare various machine learning strategies for speech inversion: Trajectory mixture density networks (TMDNs), feedforward artificial neural networks (FF-ANN), support vector regression (SVR), autoregressive artificial neural network (AR-ANN), and distal supervised learning (DSL). Further, using a database generated by the Haskins Laboratories speech production model, we test the claim that information regarding constrictions produced by the distinct organs of the vocal tract (vocal tract variables) is superior to flesh-point information (articulatory pellet trajectories) for the inversion process.
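
    For a flavor of one of the compared strategies, the sketch below (Python, scikit-learn) fits a support vector regressor per tract variable on synthetic acoustic-to-articulatory pairs standing in for the Haskins model corpus. It illustrates the SVR baseline only; the other compared systems (TMDN, FF-ANN, AR-ANN, DSL) are not reproduced, and all data and hyperparameters are hypothetical.

        # Speech inversion sketch: map acoustic feature vectors to several
        # tract variables with one SVR per output dimension.
        import numpy as np
        from sklearn.multioutput import MultiOutputRegressor
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 13))                 # acoustic features (e.g., MFCCs)
        W = rng.normal(size=(13, 8))
        Y = np.tanh(X @ W) + 0.05 * rng.normal(size=(500, 8))  # 8 tract variables

        inverter = MultiOutputRegressor(SVR(C=10.0)).fit(X[:400], Y[:400])
        pred = inverter.predict(X[400:])
        rmse = np.sqrt(((pred - Y[400:]) ** 2).mean(axis=0))
        print("per-tract-variable RMSE:", np.round(rmse, 3))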

  5. Popular song and lyrics synchronization and its application to music information retrieval

    NASA Astrophysics Data System (ADS)

    Chen, Kai; Gao, Sheng; Zhu, Yongwei; Sun, Qibin

    2006-01-01

An automatic synchronization system for popular songs and their lyrics is presented in this paper. The system includes two main components: a) automatic detection of vocal/non-vocal segments in the audio signal and b) automatic alignment of the acoustic signal of the song with its lyrics using speech recognition techniques, positioning the boundaries of the lyrics in the acoustic realization at multiple levels simultaneously (e.g., the word/syllable level and the phrase level). GMMs and a set of HMM-based acoustic model units are carefully designed and trained for the detection and alignment. To eliminate the severe mismatch due to the diversity of musical signals and the sparse training data available, an unsupervised adaptation technique, maximum likelihood linear regression (MLLR), is exploited to tailor the models to the real environment, which improves the robustness of the synchronization system. To further reduce the effect of missed non-vocal music on alignment, a novel grammar net is built to direct the alignment. To our knowledge, this is the first automatic synchronization system based only on low-level acoustic features such as MFCCs. We evaluate the system on a Chinese song dataset collected from 3 popular singers. We obtain 76.1% boundary accuracy at the syllable level (BAS) and 81.5% boundary accuracy at the phrase level (BAP) using fully automatic vocal/non-vocal detection and alignment. The synchronization system has many applications, such as multi-modal (audio and textual) content-based popular song browsing and retrieval. Through this study, we would like to open up the discussion of some challenging problems in developing a robust synchronization system for a large-scale database.
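
    The first component, vocal/non-vocal detection, can be sketched with two Gaussian mixture models over MFCC frames, one trained on sung segments and one on accompaniment-only segments; each incoming frame takes the label of the higher-likelihood model. The sketch below (Python) assumes the third-party librosa and scikit-learn packages; the file names and component counts are hypothetical, not the paper's.

        # Frame-level vocal/non-vocal detection with two GMMs over MFCCs.
        import librosa
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def mfcc_frames(path, sr=16000):
            y, sr = librosa.load(path, sr=sr)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)

        vocal_gmm = GaussianMixture(n_components=16).fit(mfcc_frames("vocal_train.wav"))
        music_gmm = GaussianMixture(n_components=16).fit(mfcc_frames("nonvocal_train.wav"))

        def vocal_mask(path):
            """True for frames whose likelihood is higher under the vocal model."""
            frames = mfcc_frames(path)
            return vocal_gmm.score_samples(frames) > music_gmm.score_samples(frames)

        mask = vocal_mask("song.wav")
        print(f"{mask.mean():.0%} of frames classified as vocal")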

  6. Comparison of Voice Handicap Index Scores Between Female Students of Speech Therapy and Other Health Professions.

    PubMed

    Tafiadis, Dionysios; Chronopoulos, Spyridon K; Siafaka, Vassiliki; Drosos, Konstantinos; Kosma, Evangelia I; Toki, Eugenia I; Ziavra, Nausica

    2017-09-01

    Student groups preparing for voice-intensive professions (e.g., teachers, speech-language pathologists) are presumably at risk of developing a voice disorder due to misuse of their voice, which will affect their way of living. Multidisciplinary voice assessment of student populations is now widespread, along with the use of self-reported questionnaires. This study compared Voice Handicap Index domain and item scores between female students of speech and language therapy and female students of other health professions in Greece. We also examined the probability of speech-language therapy students developing any vocal symptom. Two hundred female non-dysphonic students (aged 18-31) were recruited. Participants answered the Voice Evaluation Form and the Greek adaptation of the Voice Handicap Index. Significant differences were observed between the two groups (students of speech therapy and of other health professions) in the Voice Handicap Index total score and its functional and physical domains, but not in the emotional domain. Furthermore, significant differences between subgroups were observed for specific Voice Handicap Index items. In conclusion, speech-language therapy students had higher Voice Handicap Index scores, which could serve as an early indicator for avoiding profession-related dysphonia at a later stage. The Voice Handicap Index could also serve as a first-pass assessment tool for recognizing potential voice disorder development in students. In turn, the results could be used for indirect therapy approaches, such as providing methods for maintaining vocal health in different student populations. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  7. Development of vocal tract length during early childhood: A magnetic resonance imaging study

    NASA Astrophysics Data System (ADS)

    Vorperian, Houri K.; Kent, Ray D.; Lindstrom, Mary J.; Kalina, Cliff M.; Gentry, Lindell R.; Yandell, Brian S.

    2005-01-01

    Speech development in children is predicated partly on the growth and anatomic restructuring of the vocal tract. This study examines the growth pattern of the various hard and soft tissue vocal tract structures as visualized by magnetic resonance imaging (MRI), and assesses their relational growth with vocal tract length (VTL). Measurements of lip thickness, hard- and soft-palate length, tongue length, naso-oro-pharyngeal length, mandibular length and depth, and distance of the hyoid bone and larynx from the posterior nasal spine were used from 63 pediatric cases (ages birth to 6 years and 9 months) and 12 adults. Results indicate (a) ongoing growth of all oral and pharyngeal vocal tract structures with no sexual dimorphism, and a period of accelerated growth between birth and 18 months; (b) a vocal tract structure's region (oral/anterior versus pharyngeal/posterior) and orientation (horizontal versus vertical) determine its growth pattern; and (c) the relational growth of the different structures with VTL changes with development: while the increase in VTL throughout development is predominantly due to growth of pharyngeal/posterior structures, VTL is also substantially affected by the growth of oral/anterior structures during the first 18 months of life. Findings provide normative data that can be used for modeling the development of the vocal tract.

  8. Feminization laryngoplasty: assessment of surgical pitch elevation.

    PubMed

    Thomas, James P; Macmillan, Cody

    2013-09-01

    The aim of this study is to analyze the change in pitch following feminization laryngoplasty, a technique to alter the vocal tract of male-to-female transgender patients. This is a retrospective review of 94 patients undergoing feminization laryngoplasty between June 2002 and April 2012, of whom 76 completed follow-up audio recordings. Feminization laryngoplasty is a procedure that removes the anterior thyroid cartilage, collapsing the diameter of the larynx as well as shortening and tensioning the vocal folds to raise the pitch. Changes in comfortable speaking pitch, lowest vocal pitch and highest vocal pitch were assessed before and after surgery. Acoustic parameters of speaking pitch and vocal range were compared between pre- and postoperative results. The average comfortable speaking pitch preoperatively, C#3 (139 Hz), was raised an average of six semitones to G3 (196 Hz) after surgical intervention. The lowest attainable pitch was raised an average of seven semitones and the highest attainable pitch decreased by an average of two semitones. One aspect of the procedure, thyrohyoid approximation (introduced in 2006 to alter resonance), did not affect pitch. Feminization laryngoplasty successfully increased the comfortable fundamental frequency of speech and removed the lowest notes from the patient's vocal range. It does not typically raise the upper limits of the vocal range.
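
    The semitone figures above follow from the standard conversion between a frequency ratio and semitones, which a minimal Python check makes concrete (the frequencies are the ones reported in the abstract):

        # Semitone distance between two frequencies: 12 * log2(f2 / f1).
        import math

        def semitones(f1_hz, f2_hz):
            return 12.0 * math.log2(f2_hz / f1_hz)

        # 139 Hz -> 196 Hz, the reported pre/post comfortable speaking pitches.
        print(round(semitones(139.0, 196.0), 2))  # ~5.95, i.e., about six semitones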

  9. The role of temporal call structure in species recognition of male Allobates talamancae (Cope, 1875): (Anura: Dendrobatidae).

    PubMed

    Kollarits, Dennis; Wappl, Christian; Ringler, Max

    2017-01-30

    Acoustic species recognition in anurans depends on spectral and temporal characteristics of the advertisement call. The recognition space of a species is shaped by the likelihood of heterospecific acoustic interference. The dendrobatid frogs Allobates talamancae (Cope, 1875) and Silverstoneia flotator (Dunn, 1931) occur syntopically in south-west Costa Rica. A previous study showed that these two species avoid acoustic interference by spectral stratification. In this study, the role of temporal call structure in the advertisement call of A. talamancae was analyzed, in particular whether internote-interval duration provides species-specific temporal cues. In playback trials, artificial advertisement calls with internote-intervals deviating by up to ±90% from the population mean were broadcast to vocally active territorial males. The phonotactic reactions of the males indicated that, unlike in closely related species, internote-interval duration is not a call property essential for species recognition in A. talamancae. However, temporal call structure may be used for species recognition when the likelihood of heterospecific interference is high. Also, the close-encounter courtship call of male A. talamancae is described.

  10. Response to displaced neighbours in a territorial songbird with a large repertoire

    NASA Astrophysics Data System (ADS)

    Briefer, Elodie; Aubin, Thierry; Rybak, Fanny

    2009-09-01

    Neighbour recognition allows territory owners to modulate their territorial response according to the threat posed by each neighbour and thus to reduce the costs associated with territorial defence. Individual acoustic recognition of neighbours has been shown in numerous bird species, but few of them had a large repertoire. Here, we tested individual vocal recognition in a songbird with a large repertoire, the skylark Alauda arvensis. We first examined the physical basis for recognition in the song, and we then experimentally tested recognition by playing back songs of adjacent neighbours and strangers. Males showed a lower territorial response to adjacent neighbours than to strangers when we broadcast songs from the shared boundary. However, when we broadcast songs from the opposite boundary, males showed a similar response to neighbours and strangers, indicating a spatial categorisation of adjacent neighbours’ songs. Acoustic analyses revealed that males could potentially use the syntactical arrangement of syllables in sequences to identify the songs of their neighbours. Neighbour interactions in skylarks are thus subtle relationships that can be modulated according to the spatial position of each neighbour.

  11. The role of vocal individuality in conservation

    PubMed Central

    Terry, Andrew MR; Peake, Tom M; McGregor, Peter K

    2005-01-01

    Identifying the individuals within a population can generate information on life history parameters, provide input data for conservation models, and highlight behavioural traits that may affect management decisions and error or bias within census methods. Individual animals can be discriminated by features of their vocalisations. This vocal individuality can be utilised as an alternative marking technique in situations where marks are difficult to detect or animals are sensitive to disturbance. Vocal individuality can also be used in cases where the capture and handling of an animal is either logistically or ethically problematic. Many studies have suggested that vocal individuality can be used to count and monitor populations over time; however, few have explicitly tested the method in this role. In this review we discuss methods for extracting individuality information from vocalisations and techniques for using this to count and monitor populations over time. We present case studies in birds where vocal individuality has been applied to conservation and we discuss its role in mammals. PMID:15960848

  12. Gender recognition from vocal source

    NASA Astrophysics Data System (ADS)

    Sorokin, V. N.; Makarov, I. S.

    2008-07-01

    The efficiency of automatic recognition of male and female voices based on solving the inverse problem for glottis area dynamics and for the waveform of the glottal airflow volume velocity pulse is studied. The inverse problem is regularized through the use of analytical models of the voice excitation pulse and of the dynamics of the glottis area, as well as a model of one-dimensional glottal airflow. Parameters of these models and spectral parameters of the volume velocity pulse are considered. The following parameters are found to be most promising: the instant of maximum glottis area, the maximum derivative of the area, the slope of the spectrum of the glottal airflow volume velocity pulse, the amplitude ratios of harmonics of this spectrum, and the pitch. On the plane of the first two principal components in the space of these parameters, an almost twofold decrease in the classification error relative to that for the pitch alone is attained. The male voice recognition probability is found to be 94.7%, and the female voice recognition probability is 95.9%.
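
    A schematic analogue of that classification step, in Python: project a handful of voice-source parameters onto the first two principal components and fit a linear classifier. The feature values below are fabricated placeholders standing in for the parameters listed above, not the study's measurements.

        # Hedged sketch: two principal components + logistic regression on
        # synthetic stand-ins for voice-source parameters.
        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(1)
        # Columns (placeholders): pitch, spectral slope, instant of maximum
        # glottis area, maximum area derivative, harmonic amplitude ratio.
        male = rng.normal([120.0, -12.0, 0.40, 1.0, 5.0], 1.0, size=(200, 5))
        female = rng.normal([210.0, -9.0, 0.35, 1.2, 8.0], 1.0, size=(200, 5))
        X = np.vstack([male, female])
        labels = np.array([0] * 200 + [1] * 200)

        clf = make_pipeline(PCA(n_components=2), LogisticRegression())
        clf.fit(X, labels)
        print("training accuracy:", clf.score(X, labels))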

  13. Measuring positive and negative affect in the voiced sounds of African elephants (Loxodonta africana).

    PubMed

    Soltis, Joseph; Blowers, Tracy E; Savage, Anne

    2011-02-01

    As in other mammals, there is evidence that the African elephant voice reflects affect intensity, but it is less clear if positive and negative affective states are differentially reflected in the voice. An acoustic comparison was made between African elephant "rumble" vocalizations produced in negative social contexts (dominance interactions), neutral social contexts (minimal social activity), and positive social contexts (affiliative interactions) by four adult females housed at Disney's Animal Kingdom®. Rumbles produced in the negative social context exhibited higher and more variable fundamental frequencies (F0) and amplitudes, longer durations, increased voice roughness, and higher first formant locations (F1), compared to the neutral social context. Rumbles produced in the positive social context exhibited similar shifts in most variables (F0 variation, amplitude, amplitude variation, duration, and F1), but the magnitude of response was generally less than that observed in the negative context. Voice roughness and F0 observed in the positive social context remained similar to that observed in the neutral context. These results are most consistent with the vocal expression of affect intensity, in which the negative social context elicited higher intensity levels than the positive context, but differential vocal expression of positive and negative affect cannot be ruled out.

  14. A Chinese alligator in heliox: formant frequencies in a crocodilian

    PubMed Central

    Reber, Stephan A.; Nishimura, Takeshi; Janisch, Judith; Robertson, Mark; Fitch, W. Tecumseh

    2015-01-01

    Crocodilians are among the most vocal non-avian reptiles. Adults of both sexes produce loud vocalizations known as ‘bellows’ year round, with the highest rate during the mating season. Although the specific function of these vocalizations remains unclear, they may advertise the caller's body size, because relative size differences strongly affect courtship and territorial behaviour in crocodilians. In mammals and birds, a common mechanism for producing honest acoustic signals of body size is via formant frequencies (vocal tract resonances). To our knowledge, formants have to date never been documented in any non-avian reptile, and formants do not seem to play a role in the vocalizations of anurans. We tested for formants in crocodilian vocalizations by using playbacks to induce a female Chinese alligator (Alligator sinensis) to bellow in an airtight chamber. During vocalizations, the animal inhaled either normal air or a helium/oxygen mixture (heliox) in which the velocity of sound is increased. Although heliox allows normal respiration, it alters the formant distribution of the sound spectrum. An acoustic analysis of the calls showed that the source signal components remained constant under both conditions, but an upward shift of high-energy frequency bands was observed in heliox. We conclude that these frequency bands represent formants. We suggest that crocodilian vocalizations could thus provide an acoustic indication of body size via formants. Because birds and crocodilians share a common ancestor with all dinosaurs, a better understanding of their vocal production systems may also provide insight into the communication of extinct Archosaurians. PMID:26246611
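
    The heliox logic admits a back-of-envelope check: vocal tract resonances scale with the speed of sound, while the source components do not. In the Python snippet below, both the sound speeds and the formant values are rough assumed numbers for illustration, not measurements from the study.

        # Resonance frequencies scale linearly with the speed of sound.
        c_air = 343.0     # m/s, approximate speed of sound in air
        c_heliox = 580.0  # m/s, rough assumed value for a helium/oxygen mix

        formants_air = [500.0, 1500.0, 2500.0]  # hypothetical resonances (Hz)
        formants_heliox = [f * c_heliox / c_air for f in formants_air]
        print([round(f) for f in formants_heliox])  # shifted up by ~69%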

  15. A Mutation Associated with Stuttering Alters Mouse Pup Ultrasonic Vocalizations.

    PubMed

    Barnes, Terra D; Wozniak, David F; Gutierrez, Joanne; Han, Tae-Un; Drayna, Dennis; Holy, Timothy E

    2016-04-13

    A promising approach to understanding the mechanistic basis of speech is to study disorders that affect speech without compromising other cognitive or motor functions. Stuttering, also known as stammering, has been linked to mutations in the lysosomal enzyme-targeting pathway, but how this remarkably specific speech deficit arises from mutations in a family of general "cellular housekeeping" genes is unknown. To address this question, we asked whether a missense mutation associated with human stuttering causes vocal or other abnormalities in mice. We compared vocalizations from mice engineered to carry a mutation in the Gnptab (N-acetylglucosamine-1-phosphotransferase subunits alpha/beta) gene with wild-type littermates. We found significant differences in the vocalizations of pups with the human Gnptab stuttering mutation compared to littermate controls. Specifically, we found that mice with the mutation emitted fewer vocalizations per unit time and had longer pauses between vocalizations and that the entropy of the temporal sequence was significantly reduced. Furthermore, Gnptab missense mice were similar to wild-type mice on an extensive battery of non-vocal behaviors. We then used the same language-agnostic metrics for auditory signal analysis of human speech. We analyzed speech from people who stutter with mutations in this pathway and compared it to control speech and found abnormalities similar to those found in the mouse vocalizations. These data show that mutations in the lysosomal enzyme-targeting pathway produce highly specific effects in mouse pup vocalizations and establish the mouse as an attractive model for studying this disorder. Copyright © 2016 Elsevier Ltd. All rights reserved.
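
    One of the metrics mentioned above, the entropy of a temporal sequence, is simple to make concrete. The Python sketch below computes Shannon entropy over a discretized sequence of call types; the symbols are hypothetical and the discretization is an assumption, not the authors' exact procedure.

        # Shannon entropy of a symbol sequence; lower values = more repetitive.
        import math
        from collections import Counter

        def shannon_entropy(sequence):
            counts = Counter(sequence)
            n = len(sequence)
            return -sum((c / n) * math.log2(c / n) for c in counts.values())

        varied = list("ABCABACBCA")      # more varied ordering -> higher entropy
        repetitive = list("AAABAAABAA")  # more repetitive ordering -> lower entropy
        print(shannon_entropy(varied), shannon_entropy(repetitive))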

  16. Housing conditions and sacrifice protocol affect neural activity and vocal behavior in a songbird species, the zebra finch (Taeniopygia guttata).

    PubMed

    Elie, Julie Estelle; Soula, Hédi Antoine; Trouvé, Colette; Mathevon, Nicolas; Vignal, Clémentine

    2015-12-01

    Individual cages represent a widely used housing condition in laboratories. For gregarious animals, this isolation represents an impoverished physical and social environment: it prevents animals from socializing, even when auditory and visual contact is maintained. Zebra finches are colonial songbirds that are widely used as laboratory animals for the study of vocal communication from brain to behavior. In this study, we investigated the effect of single housing on the vocal behavior and the brain activity of male zebra finches (Taeniopygia guttata): male birds housed in individual cages were compared to freely interacting male birds housed as a social group in a communal cage. We focused on the activity of septo-hypothalamic regions of the "social behavior network" (SBN), a set of limbic regions involved in several social behaviors in vertebrates. The activity of four structures of the SBN (BSTm, medial bed nucleus of the stria terminalis; POM, medial preoptic area; lateral septum; ventromedial hypothalamus) and one associated region (paraventricular nucleus of the hypothalamus) was assessed using the density of nuclei immunoreactive for the immediate early gene Zenk (egr-1). We further assessed the identity of active cell populations by labeling vasotocin (VT). Brain activity was related to behavioral activities of the birds, such as physical and vocal interactions. We showed that individual housing modifies vocal exchanges between birds compared to communal housing. This is of particular importance in the zebra finch, a model species for the study of vocal communication. In addition, a protocol in which one or two birds are removed daily from the group affects male zebra finches differently depending on their housing conditions: while communally housed males changed their vocal output, the brains of individually housed males showed increased Zenk labeling in non-VT cells of the BSTm and enhanced correlation of Zenk-revealed activity between the studied structures. These results show that housing conditions must receive attention in behavioral neuroscience protocols. Copyright © 2015. Published by Elsevier SAS.

  17. Seabird acoustic communication at sea: a new perspective using bio-logging devices.

    PubMed

    Thiebault, Andréa; Pistorius, Pierre; Mullers, Ralf; Tremblay, Yann

    2016-08-05

    Most seabirds are very noisy at their breeding colonies, when aggregated in high densities. Calls are used for individual recognition and are also emitted during agonistic interactions. When at sea, many seabirds aggregate over patchily distributed resources and may benefit from foraging in groups. Because these aggregations are so common, the question arises of whether seabirds use acoustic communication when foraging at sea. We deployed video-cameras with built-in microphones on 36 Cape gannets (Morus capensis) during the breeding season of 2010-2011 at Bird Island (Algoa Bay, South Africa) to study their foraging behaviour and vocal activity at sea. Group formation was derived from the camera footage. During ~42 h, calls were recorded on 72 occasions from 16 birds. Vocalization took place exclusively in the presence of conspecifics, and mostly in feeding aggregations (81% of the vocalizations). From observing the behaviours of birds associated with the emission of calls, we suggest that the calls were emitted to avoid collisions between birds. Our observations show that at least some seabirds use acoustic communication when foraging at sea. These findings open up new perspectives for research on seabird foraging ecology and interactions at sea.

  19. Monitoring of piglets' open field activity and choice behaviour during the replay of maternal vocalization: a comparison between Observer and PID technique.

    PubMed

    Puppe, B; Schön, P C; Wendland, K

    1999-07-01

    The paper presents a new system for the automatic monitoring of open field activity and choice behaviour of medium-sized animals. Passive infrared motion detectors (PIDs) were linked on-line via a digital I/O interface to a personal computer running self-developed analysis software based on LabVIEW (PID technique). The setup was used for testing 18 one-week-old piglets (Sus scrofa) for their approach to their mother's nursing vocalization replayed through loudspeakers. The results were validated by comparison with a conventional Observer technique, a computer-aided direct observation. In most cases, no differences were seen between the Observer and PID techniques regarding the percentage of stay in previously defined open field segments, the locomotor open field activity, and the choice behaviour. The results revealed that piglets are clearly attracted by their mother's nursing vocalization. The monitoring system presented in this study is thus suitable for detailed behavioural investigations of individual acoustic recognition. In general, the PID technique is a useful tool for research into the behaviour of individual animals in a restricted open field that does not rely on subjective analysis by a human observer.

  20. Vocal recognition of owners by domestic cats (Felis catus).

    PubMed

    Saito, Atsuko; Shinozuka, Kazutaka

    2013-07-01

    Domestic cats have had a 10,000-year history of cohabitation with humans and seem to have the ability to communicate with humans. However, this has not been widely examined. We studied 20 domestic cats to investigate whether they could recognize their owners by using voices that called out the subjects' names, with a habituation-dishabituation method. While the owner was out of the cat's sight, we played three different strangers' voices serially, followed by the owner's voice. We recorded the cat's reactions to the voices and categorized them into six behavioral categories. In addition, ten naive raters rated the cats' response magnitudes. The cats responded to human voices not by communicative behavior (vocalization and tail movement), but by orienting behavior (ear movement and head movement). This tendency did not change even when they were called by their owners. Of the 20 cats, 15 demonstrated a lower response magnitude to the third voice than to the first voice. These habituated cats showed a significant rebound in response to the subsequent presentation of their owners' voices. This result indicates that cats are able to use vocal cues alone to distinguish between humans.

  1. Magnetic Resonance Imaging of the Vocal Folds in Women with Congenital Adrenal Hyperplasia and Virilized Voices

    ERIC Educational Resources Information Center

    Nygren, Ulrika; Isberg, Bengt; Arver, Stefan; Hertegård, Stellan; Södersten, Maria; Nordenskjöld, Agneta

    2016-01-01

    Purpose: Women with congenital adrenal hyperplasia (CAH) may develop a virilized voice due to late diagnosis or suboptimal suppression of adrenal androgens. Changes in the vocal folds due to virilization have not been studied in vivo. The purpose was to investigate if the thyroarytenoid (TA) muscle is affected by virilization and correlate…

  2. Effects of Contingent and Noncontingent Maternal Stimulation on the Vocal Behaviour of Three- to Four-Month-Old Japanese Infants.

    ERIC Educational Resources Information Center

    Masataka, Nobuo

    1993-01-01

    Forty-eight male infants (tested at ages three and four months) experienced either conversational turn-taking or random responsiveness of their mothers. Contingency did not affect the infant's rate of vocalizing but influenced its quality and timing. Intervals at which mother delivered contingent responses were longer when infant was older.…

  3. Medications and Adverse Voice Effects.

    PubMed

    Nemr, Kátia; Di Carlos Silva, Ariana; Rodrigues, Danilo de Albuquerque; Zenari, Marcia Simões

    2017-08-16

    To identify the medications used by patients with dysphonia, describe the voice symptoms reported on initial speech-language pathology (SLP) examination, evaluate the possible direct and indirect effects of medications on voice production, and determine the association between direct and indirect adverse voice effects and self-reported voice symptoms, hydration and smoking habits, comorbidities, vocal assessment, and type and degree of dysphonia. This is a retrospective cross-sectional study. Fifty-five patients were evaluated, considering the vocal signs and symptoms indicated in the Dysphonia Risk Protocol as well as data on hydration, smoking and medication use. We analyzed the associations between type of side effect and self-reported vocal signs/symptoms, hydration, smoking, comorbidities, type of dysphonia, and auditory-perceptual and acoustic parameters. Sixty percent of the patients were women, the mean age was 51.8 years, 29 symptoms were reported at screening, and 73 active ingredients were identified, 8.2% directly and 91.8% indirectly affecting vocal function. There were associations between the use of drugs with direct adverse voice effects and self-reported symptoms, the general degree of vocal deviation, and pitch deviation. The symptoms of dry throat and shortness of breath were associated with direct vocal side effects of the medications, as well as with the general degree of vocal deviation and greater pitch deviation. Shortness of breath when speaking was also associated with the greatest degree of vocal deviation. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  4. Vibration stimulates vocal mucosa-like matrix expression by hydrogel-encapsulated fibroblasts.

    PubMed

    Kutty, Jaishankar K; Webb, Ken

    2010-01-01

    The composition and organization of the vocal fold extracellular matrix (ECM) provide the viscoelastic mechanical properties that are required to sustain high-frequency vibration during voice production. Although vocal injury and pathology are known to produce alterations in matrix physiology, the mechanisms responsible for the development and maintenance of vocal fold ECM are poorly understood. The objective of this study was to investigate the effect of physiologically relevant vibratory stimulation on ECM gene expression and synthesis by fibroblasts encapsulated within hyaluronic acid hydrogels that approximate the viscoelastic properties of vocal mucosa. Relative to static controls, samples exposed to vibration exhibited significant increases in mRNA expression levels of HA synthase 2, decorin, fibromodulin and MMP-1, while collagen and elastin expression were relatively unchanged. Expression levels exhibited a temporal response, with maximum increases observed after 3 and 5 days of vibratory stimulation and significant downregulation observed at 10 days. Quantitative assays of matrix accumulation confirmed significant increases in sulphated glycosaminoglycans and significant decreases in collagen after 5 and 10 days of vibratory culture, relative to static controls. Cellular remodelling and hydrogel viscosity were affected by vibratory stimulation and were influenced by varying the encapsulated cell density. These results indicate that vibration is a critical epigenetic factor regulating vocal fold ECM and suggest that rapid restoration of the phonatory microenvironment may provide a basis for reducing vocal scarring, restoring native matrix composition and improving vocal quality. 2009 John Wiley & Sons, Ltd.

  5. Functional flexibility of infant vocalization and the emergence of language

    PubMed Central

    Oller, D. Kimbrough; Buder, Eugene H.; Ramsdell, Heather L.; Warlaumont, Anne S.; Chorna, Lesya; Bakeman, Roger

    2013-01-01

    We report on the emergence of functional flexibility in vocalizations of human infants. This vastly underappreciated capability becomes apparent when prelinguistic vocalizations express a full range of emotional content—positive, neutral, and negative. The data show that at least three types of infant vocalizations (squeals, vowel-like sounds, and growls) occur with this full range of expression by 3–4 mo of age. In contrast, infant cry and laughter, which are species-specific signals apparently homologous to vocal calls in other primates, show functional stability, with cry overwhelmingly expressing negative and laughter positive emotional states. Functional flexibility is a sine qua non in spoken language, because all words or sentences can be produced as expressions of varying emotional states and because learning conventional “meanings” requires the ability to produce sounds that are free of any predetermined function. Functional flexibility is a defining characteristic of language, and empirically it appears before syntax, word learning, and even earlier-developing features presumed to be critical to language (e.g., joint attention, syllable imitation, and canonical babbling). The appearance of functional flexibility early in the first year of human life is a critical step in the development of vocal language and may have been a critical step in the evolution of human language, preceding protosyntax and even primitive single words. Such flexible affect expression of vocalizations has not yet been reported for any nonhuman primate but if found to occur would suggest deep roots for functional flexibility of vocalization in our primate heritage. PMID:23550164

  6. On Short-Time Estimation of Vocal Tract Length from Formant Frequencies

    PubMed Central

    Lammert, Adam C.; Narayanan, Shrikanth S.

    2015-01-01

    Vocal tract length is highly variable across speakers and determines many aspects of the acoustic speech signal, making it an essential parameter to consider for explaining behavioral variability. A method for accurate estimation of vocal tract length from formant frequencies would afford normalization of interspeaker variability and facilitate acoustic comparisons across speakers. A framework for considering estimation methods is developed from the basic principles of vocal tract acoustics, and an estimation method is proposed that follows naturally from this framework. The proposed method is evaluated using acoustic characteristics of simulated vocal tracts ranging from 14 to 19 cm in length, as well as real-time magnetic resonance imaging data with synchronous audio from five speakers whose vocal tracts range from 14.5 to 18.0 cm in length. Evaluations show improvements in accuracy over previously proposed methods, with 0.631 and 1.277 cm root mean square error on simulated and human speech data, respectively. Empirical results show that the effectiveness of the proposed method is based on emphasizing higher formant frequencies, which seem less affected by speech articulation. Theoretical predictions of formant sensitivity reinforce this empirical finding. Moreover, theoretical insights are explained regarding the reason for differences in formant sensitivity. PMID:26177102
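
    The framework above rests on uniform-tube acoustics, which a few lines of Python can make concrete. For a tube closed at the glottis, F_n = (2n - 1)c / 4L, so each measured formant yields a length estimate; weighting the higher formants more heavily mirrors the paper's empirical finding, though the weights and formant values below are illustrative assumptions, not the authors' method.

        # Per-formant vocal tract length estimates from the uniform-tube model.
        C = 35000.0  # speed of sound in cm/s (approximate)

        def vtl_estimates(formants_hz):
            # Closed-open tube: F_n = (2n - 1) * C / (4 * L), so
            # L = (2n - 1) * C / (4 * F_n) for each formant F_n.
            return [(2 * n - 1) * C / (4.0 * f) for n, f in enumerate(formants_hz, start=1)]

        formants = [500.0, 1500.0, 2500.0, 3500.0]  # idealized 17.5 cm neutral tract
        estimates = vtl_estimates(formants)
        weights = [1.0, 2.0, 3.0, 4.0]              # emphasize higher formants
        vtl = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
        print(estimates, round(vtl, 2))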

  7. Final Syllable Lengthening (FSL) in infant vocalizations.

    PubMed

    Nathani, Suneeti; Oller, D Kimbrough; Cobo-Lewis, Alan B

    2003-02-01

    Final Syllable Lengthening (FSL) has been extensively examined in infant vocalizations in order to determine whether its basis is biological or learned. Findings suggest there may be a U-shaped developmental trajectory for FSL. The present study sought to verify this pattern and to determine whether vocal maturity and deafness influence FSL. Eight normally hearing infants, aged 0;3 to 1;0, and eight deaf infants, aged 0;8 to 4;0, were examined at three levels of prelinguistic vocal development: precanonical, canonical, and postcanonical. FSL was found at all three levels suggesting a biological basis for this phenomenon. Individual variability was, however, considerable. Reduction in the magnitude of FSL across the three sessions provided some support for a downward trend for FSL in infancy. Findings further indicated that auditory deprivation can significantly affect temporal aspects of infant speech production.

  8. Face recognition with the Karhunen-Loeve transform

    NASA Astrophysics Data System (ADS)

    Suarez, Pedro F.

    1991-12-01

    The major goal of this research was to investigate machine recognition of faces. The approach taken to achieve this goal was to investigate the use of the Karhunen-Loève Transform (KLT) by implementing flexible and practical code. The KLT utilizes the eigenvectors of the covariance matrix as a basis set. Faces were projected onto the eigenvectors, called eigenfaces, and the resulting projection coefficients were used as features. Face recognition accuracies for the KLT coefficients were superior to Fourier-based techniques. Additionally, this thesis demonstrated the image compression and reconstruction capabilities of the KLT. The thesis also developed the use of the KLT as a facial feature detector; the ability to differentiate between facial features provides a computer communications interface for non-vocal people with cerebral palsy. Lastly, the thesis developed a KLT-based axis system for laser scanner data of human heads. The scanner data axis system provides the anthropometric community a more precise method of fitting custom helmets.
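
    The eigenface idea reduces to a few linear algebra steps. The Python sketch below derives the principal directions of a randomly generated placeholder face set via SVD and projects a probe image onto them; it follows the generic KLT/eigenface recipe rather than the thesis code.

        # Eigenfaces via SVD: principal directions of mean-centered face images.
        import numpy as np

        rng = np.random.default_rng(2)
        faces = rng.random((40, 64 * 64))   # 40 flattened 64x64 "faces" (placeholder)
        mean_face = faces.mean(axis=0)
        centered = faces - mean_face

        # Rows of Vt are the eigenfaces (eigenvectors of the covariance matrix).
        U, S, Vt = np.linalg.svd(centered, full_matrices=False)
        eigenfaces = Vt[:10]                # keep the 10 leading components

        # Feature vector for a probe image: coefficients in the eigenface basis.
        probe = rng.random(64 * 64)
        coeffs = eigenfaces @ (probe - mean_face)
        print(coeffs.shape)                 # (10,)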

  9. Assessment of vocal cord nodules: a case study in speech processing by using Hilbert-Huang Transform

    NASA Astrophysics Data System (ADS)

    Civera, M.; Filosi, C. M.; Pugno, N. M.; Silvestrini, M.; Surace, C.; Worden, K.

    2017-05-01

    Vocal cord nodules represent a pathological condition in which unnatural masses grow on the patient's vocal folds. Among other effects, the resulting changes in the vocal cords' overall mass and stiffness alter their vibratory behaviour, thus changing the vocal emission they generate. This causes dysphonia, i.e. abnormalities in the patient's voice, which can be analysed and inspected via audio signals. However, the evaluation of voice condition through speech processing is not a trivial task, as standard methods based on the Fourier transform fail to fit the non-stationary nature of vocal signals. In this study, four audio tracks, provided by a volunteer patient whose vocal fold nodules had been surgically removed, were analysed using a relatively new technique: the Hilbert-Huang Transform (HHT) via Empirical Mode Decomposition (EMD), specifically the CEEMDAN (Complete Ensemble EMD with Adaptive Noise) algorithm. This method was applied to speech signals recorded before removal surgery and during convalescence, to investigate specific trends. The possibilities offered by the HHT are presented, but some limitations of decomposing the signals into so-called intrinsic mode functions (IMFs) are also highlighted. The results of these preliminary studies are intended to be a basis for developing viable alternatives to the software currently used for the analysis and evaluation of pathological voices.
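
    The Hilbert half of the HHT can be sketched independently of the decomposition. Below, a synthetic chirp stands in for a single intrinsic mode function (in practice obtained by EMD/CEEMDAN, e.g., via a third-party package), and scipy's analytic signal yields instantaneous amplitude and frequency. Everything here is an illustrative assumption, not the study's pipeline.

        # Instantaneous amplitude/frequency of one (placeholder) IMF via the
        # analytic signal.
        import numpy as np
        from scipy.signal import hilbert

        fs = 8000
        t = np.arange(0, 1.0, 1.0 / fs)
        imf = np.sin(2 * np.pi * (100 * t + 50 * t ** 2))  # synthetic chirp "IMF"

        analytic = hilbert(imf)
        inst_amplitude = np.abs(analytic)
        inst_freq = np.diff(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
        print(inst_freq[:5])  # instantaneous frequency track in Hz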

  10. Epidemiology of Vocal Fold Paralyses After Total Thyroidectomy for Well-Differentiated Thyroid Cancer in Medicare Population

    PubMed Central

    Francis, David O.; Pearce, Elizabeth C.; Ni, Shenghua; Garrett, C. Gaelyn; Penson, David F.

    2014-01-01

    Objectives: Population-level incidence of vocal fold paralysis after thyroidectomy for well-differentiated thyroid carcinoma (WDTC) is not known. This study aimed to measure longitudinal incidence of post-operative vocal fold paralyses and need for directed interventions in the Medicare population undergoing total thyroidectomy for WDTC. Study Design: Retrospective cohort study. Setting: United States population. Subjects: Medicare beneficiaries. Methods: SEER-Medicare data (1991-2009) were used to identify beneficiaries who underwent total thyroidectomy for WDTC. Incident vocal fold paralyses and directed interventions were identified. Multivariate analyses were used to determine factors associated with odds of developing these surgical complications. Results: Of 5,670 total thyroidectomies for WDTC, 9.5% were complicated by vocal fold paralysis [8.2% unilateral vocal fold paralysis (UVFP); 1.3% bilateral vocal fold paralysis (BVFP)]. Rate of paralyses decreased 5% annually from 1991 to 2009 (OR 0.95, 95% CI 0.93-0.97; p<0.001). Overall, 22% of patients with vocal fold paralysis required surgical intervention (UVFP 21%, BVFP 28%). Multivariate logistic regression revealed odds of post-thyroidectomy paralysis increased with each additional year of age, with non-Caucasian race, particular histologic types, advanced stage, and in particular registry regions. Conclusions: Annual rates of post-thyroidectomy vocal fold paralyses are decreasing among Medicare beneficiaries with WDTC. High incidence in this aged population is likely due to a preponderance of temporary paralyses, which is supported by the need for directed intervention in less than a quarter of affected patients. Further population-based studies are needed to refine the population incidence and risk factors for paralyses in the aging population. PMID:24482349

  11. Speech perception and production in severe environments

    NASA Astrophysics Data System (ADS)

    Pisoni, David B.

    1990-09-01

    The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes in speech production under these adverse conditions were examined through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.

  12. Memory for vocal tempo and pitch.

    PubMed

    Boltz, Marilyn G

    2017-11-01

    Two experiments examined the ability to remember the vocal tempo and pitch of different individuals, and the way this information is encoded into the cognitive system. In both studies, participants engaged in an initial familiarisation phase while attending was systematically directed towards different aspects of speakers' voices. Afterwards, they received a tempo or pitch recognition task. Experiment 1 showed that tempo and pitch are both incidentally encoded into memory at levels comparable to intentional learning, and no performance deficit occurs with divided attending. Experiment 2 examined the ability to recognise pitch or tempo when the two dimensions co-varied and found that the presence of one influenced the other: performance was best when both dimensions were positively correlated with one another. As a set, these findings indicate that pitch and tempo are automatically processed in a holistic, integral fashion [Garner, W. R. (1974). The processing of information and structure. Potomac, MD: Erlbaum.] which has a number of cognitive implications.

  13. Speech recognition index of workers with tinnitus exposed to environmental or occupational noise: a comparative study

    PubMed Central

    2012-01-01

    Introduction: Tinnitus is considered the third worst symptom affecting humans. The aim of this article is to assess complaints by workers with tinnitus exposed to environmental and occupational noise. Methodology: 495 workers went through an epidemiological survey at the Audiology Department of the Center for Studies on Workers' Health and Human Ecology, from 2003 to 2007. The workers underwent tonal and vocal audiometry, preceded by a clinical and occupational history questionnaire. Two-factor ANOVA and Tukey tests were used for statistical analysis, with significance set at α=5%. Findings: There was a higher prevalence of occupational tinnitus (73.7%), a predominance of female domestic workers (65.4%) in cases of environmental exposure, and a predominance of male construction workers (71.5%) for occupational exposure. There was a significant difference in workers with hearing loss, who showed a mean speech recognition index (SRI) of 85%, as compared to healthy workers with a mean SRI greater than 93.5%. Comparisons of signs and symptoms, speech perception, and interference in sound localization by type of noise exposure (environmental versus occupational) found no significant differences. Conclusion: The studied group's high prevalence of tinnitus, major difficulties in speech recognition with hearing loss, and the presence of individuals with normal hearing under both types of exposure justify the importance of measures in health promotion, prevention, and hearing surveillance. The findings highlight the importance of valuing the patients' own perception as the first indication of tinnitus and hearing loss in order to help develop appropriate public policies within the Unified National Health System (SUS). PMID:23259813

  14. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar

    PubMed Central

    Shin, Young Hoon; Seo, Jiwon

    2016-01-01

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker’s vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse radio ultra-wide band (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing. PMID:27801867

  16. [Frequency of self-reported vocal problems and associated occupational factors in primary schoolteachers in Londrina, Paraná State, Brazil].

    PubMed

    Fillis, Michelle Moreira Abujamra; Andrade, Selma Maffei de; González, Alberto Durán; Melanda, Francine Nesello; Mesas, Arthur Eumann

    2016-01-01

    This study aimed to estimate the prevalence of self-reported vocal problems among primary schoolteachers and to identify associated occupational factors, using a cross-sectional design and face-to-face interviews with 967 teachers in 20 public schools in Londrina, Paraná State, Brazil. Prevalence of self-reported vocal problems was 25.7%. Adjusted analyses showed associations with characteristics of the employment relationship (workweek ≥ 40 hours and poor perception of salaries and health benefits), characteristics of the work environment (number of students per class and exposure to chalk dust and microorganisms), psychological factors (low job satisfaction, limited opportunities to express opinions, worse relationship with superiors, and poor balance between professional and personal life), and violence (insults and bullying). Vocal disorders affected one in four primary schoolteachers and were associated with various characteristics of the teaching profession (both structural and work-related).

  17. Annual variation in vocal performance and its relationship with bill morphology in Lincoln’s sparrows

    PubMed Central

    SOCKMAN, KEITH W.

    2009-01-01

    Morphology may affect behavioural performance through a direct, physical link or through indirect, secondary mechanisms. Although some evidence suggests that the bill morphology of songbirds directly constrains vocal performance, bill morphology may influence vocal performance through indirect mechanisms also, such as one in which morphology influences foraging and thus the ability to perform some types of vocal behaviour. This raises the possibility for ecologically induced variation in the relationship between morphology and behaviour. To investigate this, I used an information theoretic approach to examine the relationship between bill morphology and several measures of vocal performance in Lincoln’s sparrows (Melospiza lincolnii). I compared this relationship between two breeding seasons that differed markedly in ambient temperatures, phenology of habitat maturation, and food abundance. I found a strong curvilinear relationship between bill shape (height/width) and vocal performance in the seemingly less hospitable season but not in the other, leading to a difference between seasons in the population’s mean vocal performance. Currently, I do not know the cause of this annual variation. However, it could be due to the effects of bill shape on foraging and therefore on time budget, energy balance, or some other behavioural or physiological response that manifests mostly under difficult environmental conditions or, alternatively, to associations between male quality and both vocal performance and bill shape. Regardless of the cause, these results suggest the presence of an indirect, ecologically mediated link between morphology and behavioural performance, leading to annual variation in the prevailing environment of acoustic signals. PMID:20160859

  18. Experiences of a short vocal training course for call-centre customer service advisors.

    PubMed

    Lehto, Laura; Rantala, Leena; Vilkman, Erkki; Alku, Paavo; Bäckström, Tom

    2003-01-01

    It is commonly known that occupational voice users suffer from voice symptoms to varying extents. The purpose of this study was to find out the effects of a short (2-day) vocal training course on professional speakers' voices. The subjects were 38 female and 10 male customer advisors who mainly use the telephone during their working hours at a call centre. The findings showed that although the subjects did not suffer from severe voice problems, they reported that the short vocal training course had an effect on some of the vocal symptoms they had experienced. More than 50% of the females and males reported a decrease in the feeling of mucus and the consequent need to clear the throat, and less worsening of their voice. Over 60% thought that voice training had improved their vocal habits, and none reported a negative influence of the course on their voice. Females also reported a reduction in vocal fatigue. The subjects were further asked to respond to 23 statements on how they experienced the voice training in general. The statements 'I learned things that I didn't know about the use of voice in general' and 'I got useful and important knowledge concerning my work' were rated highly by both females and males. The results suggest that even a short vocal training course might positively affect the self-reported well-being of persons working in a vocally demanding occupation. However, to determine the long-term effects of a short training course, a follow-up study would need to be carried out. Copyright 2003 S. Karger AG, Basel

  19. Increased vocal intensity due to the Lombard effect in speakers with Parkinson's disease: simultaneous laryngeal and respiratory strategies.

    PubMed

    Stathopoulos, Elaine T; Huber, Jessica E; Richardson, Kelly; Kamphaus, Jennifer; DeCicco, Devan; Darling, Meghan; Fulcher, Katrina; Sussman, Joan E

    2014-01-01

    The objective of the present study was to investigate whether speakers with hypophonia secondary to Parkinson's disease (PD) would increase their vocal intensity when speaking in a noisy environment (Lombard effect). A second objective was to examine the underlying laryngeal and respiratory strategies used to increase vocal intensity. Thirty-three participants with PD were included in the study. Each participant was fitted with the SpeechVive™ device, which played multi-talker babble noise into one ear during speech. Using acoustic, aerodynamic and respiratory kinematic techniques, the simultaneous laryngeal and respiratory mechanisms used to regulate vocal intensity were examined. Group-level results showed that most speakers with PD (26/33) successfully increased their vocal intensity when speaking in the condition of multi-talker babble noise. They were able to support their increased vocal intensity and subglottal pressure with combined strategies from both the laryngeal and respiratory mechanisms. Individual speaker analysis indicated that the particular laryngeal and respiratory interactions differed among speakers. The SpeechVive™ device elicited higher vocal intensities from patients with PD. Speakers used different combinations of laryngeal and respiratory physiologic mechanisms to increase vocal intensity, suggesting that the disease process does not uniformly affect the speech subsystems. Readers will be able to: (1) identify speech characteristics of people with Parkinson's disease (PD); (2) identify typical respiratory strategies for increasing sound pressure level (SPL); (3) identify typical laryngeal strategies for increasing SPL; and (4) define the Lombard effect. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Biosimulation of Inflammation and Healing in Surgically Injured Vocal Folds

    PubMed Central

    Li, Nicole Y. K.; Vodovotz, Yoram; Hebda, Patricia A.; Abbott, Katherine Verdolini

    2010-01-01

    Objectives: The pathogenesis of vocal fold scarring is complex and remains to be deciphered. The current study is part of research endeavors aimed at applying systems biology approaches to address the complex biological processes involved in the pathogenesis of vocal fold scarring and other lesions affecting the larynx. Methods: We developed a computational agent-based model (ABM) to quantitatively characterize multiple cellular and molecular interactions involved in inflammation and healing in vocal fold mucosa after surgical trauma. The ABM was calibrated with empirical data on inflammatory mediators (eg, tumor necrosis factor) and extracellular matrix components (eg, hyaluronan) from published studies on surgical vocal fold injury in the rat population. Results: The simulation results reproduced and predicted trajectories seen in the empirical data from the animals. Moreover, the ABM studies suggested that hyaluronan fragments might be the clinical surrogate of tissue damage, a key variable that in these simulations both is enhanced by and further induces inflammation. Conclusions: A relatively simple ABM such as the one reported in this study can provide new understanding of laryngeal wound healing and generate working hypotheses for further wet-lab studies. PMID:20583741
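
    Agent-based models of this kind share a simple skeleton: discrete agents respond to, and update, shared mediator levels at each time step. The Python toy below illustrates that loop with one inflammatory mediator and one matrix component; the rules, rates, and names are invented for illustration and bear no quantitative relation to the calibrated ABM described above.

        # Toy agent-based loop: cells secrete a mediator that damps itself and
        # drives matrix turnover. Purely illustrative dynamics.
        import random

        random.seed(0)
        cells = [{"activated": True} for _ in range(50)]
        mediator, matrix = 1.0, 10.0  # arbitrary initial levels

        for step in range(20):
            for cell in cells:
                if cell["activated"]:
                    mediator += 0.05  # activated cells secrete mediator
                    # High mediator levels eventually deactivate cells.
                    if random.random() < min(1.0, 0.02 * mediator):
                        cell["activated"] = False
            mediator *= 0.9                    # clearance/decay
            matrix += 0.2 - 0.05 * mediator    # deposition minus degradation
            print(step, round(mediator, 2), round(matrix, 2))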

  2. Vocal fundamental and formant frequencies affect perceptions of speaker cooperativeness.

    PubMed

    Knowles, Kristen K; Little, Anthony C

    2016-01-01

    In recent years, the perception of social traits in faces and voices has received much attention. Facial and vocal masculinity are linked to perceptions of trustworthiness; however, while feminine faces are generally considered to be trustworthy, vocal trustworthiness is associated with masculinized vocal features. Vocal traits such as pitch and formants have previously been associated with perceived social traits such as trustworthiness and dominance, but the link between these measurements and perceptions of cooperativeness had yet to be examined. In Experiment 1, cooperativeness ratings of male and female voices were examined against four vocal measurements: fundamental frequency (F0), pitch variation (F0-SD), formant dispersion (Df), and formant position (Pf). Feminine pitch traits (F0 and F0-SD) and masculine formant traits (Df and Pf) were associated with higher cooperativeness ratings. In Experiment 2, manipulated voices with feminized F0 were rated as more cooperative than voices with masculinized F0, among both male and female speakers, confirming our results from Experiment 1. Feminine pitch qualities may indicate an individual who is friendly and non-threatening, while masculine formant qualities may reflect an individual who is socially dominant or prestigious, and the perception of these associated traits may influence the perceived cooperativeness of the speakers.

  3. Vocal communication between male Xenopus laevis.

    PubMed

    Tobias, Martha L; Barnard, Candace; O'Hagan, Robert; Horng, Sam H; Rand, Masha; Kelley, Darcy B

    2004-02-01

    This study focuses on the role of male-male vocal communication in the reproductive repertoire of the South African clawed frog, Xenopus laevis. Six male and two female call types were recorded from native ponds in the environs of Cape Town, South Africa. These include all call types previously recorded in the laboratory as well as one previously unidentified male call: chirping. The amount of calling and the number of call types increased as the breeding season progressed. Laboratory recordings indicated that all six male call types were directed to males; three of these were directed to both sexes and three were directed exclusively to males. Both female call types were directed exclusively to males. The predominant call type, in both field and laboratory recordings, was the male advertisement call. Sexual state affected male vocal behaviour. Male pairs in which at least one male was sexually active (gonadotropin injected) produced all call types, whereas pairs of uninjected males rarely called. Some call types were strongly associated with a specific behaviour and others were not. Clasped males always growled and clasping males typically produced amplectant calls or chirps; males not engaged in clasping most frequently advertised. The amount of advertising produced by one male was profoundly affected by the presence of another male. Pairing two sexually active males resulted in suppression of advertisement calling in one; suppression was released when males were isolated after pairing. Vocal dominance was achieved even in the absence of physical contact (clasping). We suggest that X. laevis males gain a reproductive advantage by competing for advertisement privileges and by vocally suppressing neighbouring males.

  4. Experience-dependent modulation of feedback integration during singing: role of the right anterior insula.

    PubMed

    Kleber, Boris; Zeitouni, Anthony G; Friberg, Anders; Zatorre, Robert J

    2013-04-03

    Somatosensation plays an important role in the motor control of vocal functions, yet its neural correlate and relation to vocal learning is not well understood. We used fMRI in 17 trained singers and 12 nonsingers to study the effects of vocal-fold anesthesia on the vocal-motor singing network as a function of singing expertise. Tasks required participants to sing musical target intervals under normal conditions and after anesthesia. At the behavioral level, anesthesia altered pitch accuracy in both groups, but singers were less affected than nonsingers, indicating an experience-dependent effect of the intervention. At the neural level, this difference was accompanied by distinct patterns of decreased activation in singers (cortical and subcortical sensory and motor areas) and nonsingers (subcortical motor areas only) respectively, suggesting that anesthesia affected the higher-level voluntary (explicit) motor and sensorimotor integration network more in experienced singers, and the lower-level (implicit) subcortical motor loops in nonsingers. The right anterior insular cortex (AIC) was identified as the principal area dissociating the effect of expertise as a function of anesthesia by three separate sources of evidence. First, it responded differently to anesthesia in singers (decreased activation) and nonsingers (increased activation). Second, functional connectivity between AIC and bilateral A1, M1, and S1 was reduced in singers but augmented in nonsingers. Third, increased BOLD activity in right AIC in singers was correlated with larger pitch deviation under anesthesia. We conclude that the right AIC and sensory-motor areas play a role in experience-dependent modulation of feedback integration for vocal motor control during singing.

  5. The effect of temperature on basal tension and thyroarytenoid muscle contraction in an isolated rat glottis model.

    PubMed

    Wang, Hsing-Won; Chu, Yueng-Hsiang; Chao, Pin-Zhir; Lee, Fei-Peng

    2014-10-01

    The pitch of voice is closely related to the vocal fold tension, which is the end result of coordinated movement of the intralaryngeal muscles, and especially the thyroarytenoid muscle. It is known that vocal quality may be affected by surrounding temperature; however, the effect of temperature on vocal fold tension is mostly unknown. Thus, the aim of this study was to evaluate the effect of temperature on isolated rat glottis and thyroarytenoid muscle contraction induced by electrical field stimulation. In vitro isometric tension of the glottis ring from 30 Sprague-Dawley rats was continuously recorded by the tissue bath method. Electrical field stimulation was applied to the glottis ring with two wire electrodes placed parallel to the glottis and connected to a direct-current stimulator. The tension changes of the rat glottis rings that were either untreated or treated with electrical field stimulation were recorded continuously at temperatures from 37 to 7 °C or from 7 to 37 °C. Warming from 7 to 37 °C increased the basal tension of the glottis rings and decreased the electrical field stimulation-induced glottis ring contraction, which was chiefly due to thyroarytenoid muscle contraction. In comparison, cooling from 37 to 7 °C decreased the basal tension and enhanced glottis ring contraction by electrical field stimulation. We concluded that warming increased the basal tension of the glottis in vitro and decreased the amplitude of electrical field stimulation-induced thyroarytenoid muscle contraction. Thus, vocal pitch and the fine tuning of vocal fold tension might be affected by temperature in vivo.

  6. Current knowledge on bioacoustics of the subfamily Lophyohylinae (Hylidae, Anura) and description of Ocellated treefrog Itapotihyla langsdorffii vocalizations.

    PubMed

    Forti, Lucas Rodriguez; Foratto, Roseli Maria; Márquez, Rafael; Pereira, Vânia Rosa; Toledo, Luís Felipe

    2018-01-01

    Anuran vocalizations, such as advertisement and release calls, are informative for taxonomy because species recognition can be based on those signals. Thus, a proper acoustic description of the calls may support taxonomic decisions and may contribute to knowledge about amphibian phylogeny. Here we present a perspective on advertisement call descriptions of the frog subfamily Lophyohylinae, through a literature review and a spatial analysis presenting bioacoustic coldspots (sites with high diversity of species lacking advertisement call descriptions) for this taxonomic group. Additionally, we describe the advertisement and release calls of the still poorly known treefrog, Itapotihyla langsdorffii. We analyzed recordings of six males using the software Raven Pro 1.4 and calculated the coefficient of variation for classifying static and dynamic acoustic properties. We found that more than half of the species within the subfamily do not have their vocalizations described yet. Most of these species are distributed in the western and northern Amazon, where recording sampling effort should be strengthened in order to fill these gaps. The advertisement call of I. langsdorffii is composed of 3-18 short unpulsed notes (mean of 13 ms long), presents harmonic structure, and has a peak dominant frequency of about 1.4 kHz. This call usually presents amplitude modulation, with decreasing intensity along the sequence of notes. The release call is a simple unpulsed note with an average duration of 9 ms, and peak dominant frequency around 1.8 kHz. Temporal properties presented higher variations than spectral properties at both intra- and inter-individual levels. However, only peak dominant frequency was static at the intra-individual level. High variability in temporal properties and lower variation in spectral properties is usual for anurans; the first set of properties is determined by social environment or temperature, while the second is usually related to species recognition. Here we review and expand the acoustic knowledge of the subfamily Lophyohylinae, highlighting areas and species for future research.
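
    The static/dynamic classification mentioned above rests on the coefficient of variation (CV = SD/mean). A minimal sketch follows, assuming the roughly 12% cutoff conventional in much anuran bioacoustics; the abstract does not state which threshold the authors used.

```python
# Classify acoustic properties as static (low CV) or dynamic (high CV).
# The 12% cutoff is an assumption, not a value reported by the authors.
import statistics

def classify_property(values, cutoff=12.0):
    cv = 100.0 * statistics.stdev(values) / statistics.mean(values)
    return cv, ("static" if cv < cutoff else "dynamic")

# Hypothetical per-call measurements for one male
note_durations_ms = [11, 14, 9, 16, 13, 15]        # temporal property
peak_freqs_khz = [1.41, 1.39, 1.42, 1.40, 1.38]    # spectral property
print("note duration:", classify_property(note_durations_ms))
print("peak dominant frequency:", classify_property(peak_freqs_khz))
```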

  7. Social experience affects neuronal responses to male calls in adult female zebra finches.

    PubMed

    Menardy, F; Touiki, K; Dutrieux, G; Bozon, B; Vignal, C; Mathevon, N; Del Negro, C

    2012-04-01

    Plasticity studies have consistently shown that behavioural relevance can change the neural representation of sounds in the auditory system, but what occurs in the context of natural acoustic communication where significance could be acquired through social interaction remains to be explored. The zebra finch, a highly social songbird species that forms lifelong pair bonds and uses a vocalization, the distance call, to identify its mate, offers an opportunity to address this issue. Here, we recorded spiking activity in females while presenting distance calls that differed in their degree of familiarity: calls produced by the mate, by a familiar male, or by an unfamiliar male. We focused on the caudomedial nidopallium (NCM), a secondary auditory forebrain region. Both the mate's call and the familiar call evoked responses that differed in magnitude from responses to the unfamiliar call. This distinction between responses was seen both in single unit recordings from anesthetized females and in multiunit recordings from awake freely moving females. In contrast, control females that had not heard them previously displayed responses of similar magnitudes to all three calls. In addition, more cells showed highly selective responses in mated than in control females, suggesting that experience-dependent plasticity in call-evoked responses resulted in enhanced discrimination of auditory stimuli. Our results as a whole demonstrate major changes in the representation of natural vocalizations in the NCM within the context of individual recognition. The functional properties of NCM neurons may thus change continuously to adapt to the social environment.

  8. A multimodal approach to emotion recognition ability in autism spectrum disorders.

    PubMed

    Jones, Catherine R G; Pickles, Andrew; Falcaro, Milena; Marsden, Anita J S; Happé, Francesca; Scott, Sophie K; Sauter, Disa; Tregay, Jenifer; Phillips, Rebecca J; Baird, Gillian; Simonoff, Emily; Charman, Tony

    2011-03-01

    Autism spectrum disorders (ASD) are characterised by social and communication difficulties in day-to-day life, including problems in recognising emotions. However, experimental investigations of emotion recognition ability in ASD have been equivocal, hampered by small sample sizes, narrow IQ range and over-focus on the visual modality. We tested 99 adolescents (mean age 15;6 years, mean IQ 85) with an ASD and 57 adolescents without an ASD (mean age 15;6 years, mean IQ 88) on a facial emotion recognition task and two vocal emotion recognition tasks (one verbal; one non-verbal). Recognition of happiness, sadness, fear, anger, surprise and disgust was tested. Using structural equation modelling, we conceptualised emotion recognition ability as a multimodal construct, measured by the three tasks. We examined how the mean levels of recognition of the six emotions differed by group (ASD vs. non-ASD) and IQ (≥ 80 vs. < 80). We found no evidence of a fundamental emotion recognition deficit in the ASD group, and analysis of error patterns suggested that the ASD group were vulnerable to the same pattern of confusions between emotions as the non-ASD group. However, recognition ability was significantly impaired in the ASD group for surprise. IQ had a strong and significant effect on performance for the recognition of all six emotions, with higher IQ adolescents outperforming lower IQ adolescents. The findings do not suggest a fundamental difficulty with the recognition of basic emotions in adolescents with ASD.

  9. Relationship Between Laryngeal Electromyography and Video Laryngostroboscopy in Vocal Fold Paralysis.

    PubMed

    Maamary, Joel A; Cole, Ian; Darveniza, Paul; Pemberton, Cecilia; Brake, Helen Mary; Tisch, Stephen

    2017-09-01

    The objective of this study was to better define the relationship between laryngeal electromyography and video laryngostroboscopy in the diagnosis of vocal fold paralysis. This was a retrospective diagnostic cohort study with cross-sectional data analysis. Data were obtained from 57 patients with unilateral vocal fold paralysis who attended a large tertiary voice referral center. Electromyographic findings were classified according to recurrent laryngeal nerve, superior laryngeal nerve, and high vagal/combined lesions. Video laryngostroboscopy recordings were classified according to the position of the immobile fold into median, paramedian, lateral, and a foreshortened/hooded vocal fold. The position of the paralyzed vocal fold was then analyzed according to the lesion as determined by electromyography. The recurrent laryngeal nerve was affected in the majority of cases, with left-sided lesions more common than right. Vocal fold position differed between recurrent laryngeal and combined vagal lesions. Recurrent laryngeal nerve lesions were more commonly associated with a laterally displaced immobile fold. No fold position was suggestive of a combined vagal lesion. The inter-rater reliability for determining fold position was high. Laryngeal electromyography is useful in diagnosing neuromuscular dysfunction of the larynx, and best practice recommends its continued implementation along with laryngostroboscopy. While recurrent laryngeal nerve lesions are more likely to present with a lateral vocal fold, this does not occur in all cases. Such findings indicate that further unknown mechanisms contribute to fold position in unilateral paralysis.

  10. Effects of dehydration on the viscoelastic properties of vocal folds in large deformations.

    PubMed

    Miri, Amir K; Barthelat, François; Mongeau, Luc

    2012-11-01

    Dehydration may alter vocal fold viscoelastic properties, thereby hampering phonation. The effects of water loss induced by an osmotic pressure potential on vocal fold tissue viscoelastic properties were investigated. Porcine vocal folds were dehydrated by immersion in a hypertonic solution, and quasi-static and low-frequency dynamic traction tests were performed for elongations of up to 50%. Digital image correlation was used to determine local strains from surface deformations. The elastic modulus and the loss factor were then determined for normal and dehydrated tissues. An eight-chain hyperelastic model was used to describe the observed nonlinear stress-stretch behavior. Contrary to expectations, the mass history indicated that the tissue absorbed water during cyclic extension when submerged in a hypertonic solution. During loading, the elastic modulus of dehydrated tissues increased as a function of strain. The response of dehydrated tissues was much less affected when the load was released. This observation suggests that hydration should be considered in micromechanical models of the vocal folds. The internal hysteresis, which is often linked to phonation effort, increased significantly with water loss. The effects of dehydration on the viscoelastic properties of vocal fold tissue were quantified in a systematic way. A better understanding of the role of hydration on the mechanical properties of vocal fold tissue may help to establish objective dehydration and phonotrauma criteria.
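
    The eight-chain model referred to here is commonly written, in its fifth-order series form, as the Arruda-Boyce strain-energy function below, where mu is the shear modulus, N the limiting chain extensibility, and I_1 the first invariant of stretch (I_1 = lambda_1^2 + lambda_2^2 + lambda_3^2). Whether the authors used this truncation or the full inverse-Langevin form is not stated in the abstract.

```latex
W = \mu \sum_{i=1}^{5} \frac{C_i}{N^{\,i-1}} \left( I_1^{\,i} - 3^{\,i} \right),
\qquad
C_1 = \tfrac{1}{2},\quad C_2 = \tfrac{1}{20},\quad C_3 = \tfrac{11}{1050},\quad
C_4 = \tfrac{19}{7000},\quad C_5 = \tfrac{519}{673750}
```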

  11. The Effects of Prenatal Stocking Densities on the Fear Responses and Sociality of Goat (Capra hircus) Kids

    PubMed Central

    Chojnacki, Rachel M.; Vas, Judit; Andersen, Inger Lise

    2014-01-01

    Prenatal stress (stress experienced by a pregnant mother) and its effects on offspring have been comprehensively studied, but relatively little research has been done on how prenatal social stress affects farm animals such as goats. Here, we use the operational description of ‘stress’ as “physical or perceived threats to homeostasis.” The aim of this study was to investigate the prenatal effects of different herd densities on the fear responses and sociality of goat kids. Pregnant Norwegian dairy goats were exposed to high, medium or low prenatal animal density treatments throughout gestation (1.0, 2.0 or 3.0 m2 per animal, respectively). One kid per litter was subjected to two behavioral tests at 5 weeks of age. The ‘social test’ was applied to assess the fear responses, sociality and social recognition skills when presented with a familiar and unfamiliar kid, and the ‘separation test’ assessed the behavioral coping skills when isolated. The results indicate that goat kids from the highest prenatal density of 1.0 m2 were more fearful than the kids from the lower prenatal densities (i.e., they made more escape attempts (separation test: P < 0.001) and more vocalizations (social test: P < 0.001; separation test: P < 0.001)). This effect was more pronounced in females than males in the high density (vocalizations; social test: P < 0.001; separation test: P = 0.001), and females were generally more social than males. However, goat kids did not differentiate between a familiar and an unfamiliar kid at 5 weeks of age, and sociality was not affected by the prenatal density treatment. We conclude that high animal densities during pregnancy in goats produce offspring that have a higher level of fear, particularly in females. Behavioral changes in offspring that occur as an effect of prenatal stress are of high importance, as many of the females are recruited to the breeding stock of dairy goats. PMID:24710177

  12. Auditory responses in the amygdala to social vocalizations

    NASA Astrophysics Data System (ADS)

    Gadziola, Marie A.

    The underlying goal of this dissertation is to understand how the amygdala, a brain region involved in establishing the emotional significance of sensory input, contributes to the processing of complex sounds. The general hypothesis is that communication calls of big brown bats (Eptesicus fuscus) transmit relevant information about social context that is reflected in the activity of amygdalar neurons. The first specific aim analyzed social vocalizations emitted under a variety of behavioral contexts, and related vocalizations to an objective measure of internal physiological state by monitoring the heart rate of vocalizing bats. These experiments revealed a complex acoustic communication system among big brown bats in which acoustic cues and call structure signal the emotional state of a sender. The second specific aim characterized the responsiveness of single neurons in the basolateral amygdala to a range of social syllables. Neurons typically respond to the majority of tested syllables, but effectively discriminate among vocalizations by varying the response duration. This novel coding strategy underscores the importance of persistent firing in the general functioning of the amygdala. The third specific aim examined the influence of acoustic context by characterizing both the behavioral and neurophysiological responses to natural vocal sequences. Vocal sequences differentially modify the internal affective state of a listening bat, with lower aggression vocalizations evoking the greatest change in heart rate. Amygdalar neurons employ two different coding strategies: low background neurons respond selectively to very few stimuli, whereas high background neurons respond broadly to stimuli but demonstrate variation in response magnitude and timing. Neurons appear to discriminate the valence of stimuli, with aggression sequences evoking robust population-level responses across all sound levels. Further, vocal sequences show improved discrimination among stimuli compared to isolated syllables, and this improved discrimination is expressed in part by the timing of action potentials. Taken together, these data support the hypothesis that big brown bat social vocalizations transmit relevant information about the social context that is encoded within the discharge pattern of amygdalar neurons ultimately responsible for coordinating appropriate social behaviors. I further propose that vocalization-evoked amygdalar activity will have significant impact on subsequent sensory processing and plasticity.

  13. Can blind persons accurately assess body size from the voice?

    PubMed

    Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-04-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences.
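
    The resonance-size link that listeners exploit can be illustrated with the standard uniform-tube approximation (a back-of-envelope sketch, not the authors' analysis): a vocal tract of length L, closed at the glottis, has resonances near F_k = (2k-1)c/4L, so adjacent formants are spaced c/2L apart and an apparent vocal tract length follows from the mean spacing. The formant values below are invented.

```python
# Estimate apparent vocal tract length (VTL) from mean formant spacing,
# assuming a uniform quarter-wave tube: spacing = c / (2 * L).
SPEED_OF_SOUND = 350.0  # m/s in warm, humid air (approximate)

def apparent_vtl(formants_hz):
    gaps = [b - a for a, b in zip(formants_hz, formants_hz[1:])]
    spacing = sum(gaps) / len(gaps)
    return SPEED_OF_SOUND / (2.0 * spacing)

# Hypothetical formants (Hz) of a taller vs. a shorter male speaker
print(f"taller:  {apparent_vtl([450, 1350, 2250, 3150]) * 100:.1f} cm")
print(f"shorter: {apparent_vtl([520, 1520, 2520, 3520]) * 100:.1f} cm")
```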

  14. Vocal individuality cues in the African penguin (Spheniscus demersus): a source-filter theory approach.

    PubMed

    Favaro, Livio; Gamba, Marco; Alfieri, Chiara; Pessani, Daniela; McElligott, Alan G

    2015-11-25

    The African penguin is a nesting seabird endemic to southern Africa. In penguins of the genus Spheniscus vocalisations are important for social recognition. However, it is not clear which acoustic features of calls can encode individual identity information. We recorded contact calls and ecstatic display songs of 12 adult birds from a captive colony. For each vocalisation, we measured 31 spectral and temporal acoustic parameters related to both source and filter components of calls. For each parameter, we calculated the Potential of Individual Coding (PIC). The acoustic parameters showing PIC ≥ 1.1 were used to perform a stepwise cross-validated discriminant function analysis (DFA). The DFA correctly classified 66.1% of the contact calls and 62.5% of display songs to the correct individual. The DFA also resulted in the further selection of 10 acoustic features for contact calls and 9 for display songs that were important for vocal individuality. Our results suggest that studying the anatomical constraints that influence nesting penguin vocalisations from a source-filter perspective can lead to a much better understanding of the acoustic cues of individuality contained in their calls. This approach could be further extended to study and understand vocal communication in other bird species.
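
    On the definition common in this literature, the PIC of a parameter is the ratio of its between-individual coefficient of variation to the mean within-individual coefficient of variation; values above 1 (here, >= 1.1) indicate that the parameter varies more across birds than within a bird. A sketch on invented data; the exact CV formula (e.g., any small-sample correction) is an assumption:

```python
# Potential of Individual Coding: CV across all calls pooled, divided by
# the mean within-individual CV. Sample data are invented.
import statistics

def cv(values):  # coefficient of variation, in percent
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def pic(calls_by_individual):
    within = statistics.mean(cv(v) for v in calls_by_individual.values())
    pooled = [x for v in calls_by_individual.values() for x in v]
    return cv(pooled) / within

calls = {  # hypothetical F0 measurements (Hz) per penguin
    "bird_A": [310, 305, 312, 308],
    "bird_B": [265, 270, 262, 268],
    "bird_C": [340, 336, 345, 342],
}
print(f"PIC = {pic(calls):.2f}")  # >= 1.1 would enter the stepwise DFA
```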

  15. Vocal individuality cues in the African penguin (Spheniscus demersus): a source-filter theory approach

    PubMed Central

    Favaro, Livio; Gamba, Marco; Alfieri, Chiara; Pessani, Daniela; McElligott, Alan G.

    2015-01-01

    The African penguin is a nesting seabird endemic to southern Africa. In penguins of the genus Spheniscus vocalisations are important for social recognition. However, it is not clear which acoustic features of calls can encode individual identity information. We recorded contact calls and ecstatic display songs of 12 adult birds from a captive colony. For each vocalisation, we measured 31 spectral and temporal acoustic parameters related to both source and filter components of calls. For each parameter, we calculated the Potential of Individual Coding (PIC). The acoustic parameters showing PIC ≥ 1.1 were used to perform a stepwise cross-validated discriminant function analysis (DFA). The DFA correctly classified 66.1% of the contact calls and 62.5% of display songs to the correct individual. The DFA also resulted in the further selection of 10 acoustic features for contact calls and 9 for display songs that were important for vocal individuality. Our results suggest that studying the anatomical constraints that influence nesting penguin vocalisations from a source-filter perspective can lead to a much better understanding of the acoustic cues of individuality contained in their calls. This approach could be further extended to study and understand vocal communication in other bird species. PMID:26602001

  16. Singing with yourself: evidence for an inverse modeling account of poor-pitch singing.

    PubMed

    Pfordresher, Peter Q; Mantell, James T

    2014-05-01

    Singing is a ubiquitous and culturally significant activity that humans engage in from an early age. Nevertheless, some individuals - termed poor-pitch singers - are unable to match target pitches within a musical semitone while singing. In the experiments reported here, we tested whether poor-pitch singing deficits would be reduced when individuals imitate recordings of themselves as opposed to recordings of other individuals. This prediction was based on the hypothesis that poor-pitch singers have not developed an abstract "inverse model" of the auditory-vocal system and instead must rely on sensorimotor associations that they have experienced directly, which is true for sequences an individual has already produced. In three experiments, participants, both accurate and poor-pitch singers, were better able to imitate sung recordings of themselves than sung recordings of other singers. However, this self-advantage was enhanced for poor-pitch singers. These effects were not a byproduct of self-recognition (Experiment 1), vocal timbre (Experiment 2), or the absolute pitch of target recordings (i.e., the advantage remains when recordings are transposed, Experiment 3). Results support the conceptualization of poor-pitch singing as an imitative deficit resulting from a deficient inverse model of the auditory-vocal system with respect to pitch.

  17. Can blind persons accurately assess body size from the voice?

    PubMed Central

    Oleszkiewicz, Anna; Sorokowska, Agnieszka

    2016-01-01

    Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20–65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. PMID:27095264

  18. Sex hormones and the female voice.

    PubMed

    Abitbol, J; Abitbol, P; Abitbol, B

    1999-09-01

    In the following, the authors examine the relationship between hormonal climate and the female voice through discussion of hormonal biochemistry and physiology and informal reporting on a study of 197 women with either premenstrual or menopausal voice syndrome. These facts are placed in a larger historical and cultural context, which is inextricably bound to the understanding of the female voice. The female voice evolves from childhood to menopause, under the varied influences of estrogens, progesterone, and testosterone. These hormones are the dominant factor in determining voice changes throughout life. For example, a woman's voice always develops masculine characteristics after an injection of testosterone. Such a change is irreversible. Conversely, male castrati had feminine voices because they lacked the physiologic changes associated with testosterone. The vocal instrument comprises the vibratory body, the respiratory power source and the oropharyngeal resonating chambers. Voice is characterized by its intensity, frequency, and harmonics. The harmonics are hormonally dependent. This is illustrated by the changes that occur during male and female puberty: In the female, the impact of estrogens at puberty, in concert with progesterone, produces the characteristics of the female voice, with a fundamental frequency one third lower than that of a child. In the male, androgens released at puberty are responsible for the male vocal frequency, an octave lower than that of a child. Premenstrual vocal syndrome is characterized by vocal fatigue, decreased range, a loss of power and loss of certain harmonics. The syndrome usually starts some 4-5 days before menstruation in some 33% of women. Vocal professionals are particularly affected. Dynamic vocal exploration by televideoendoscopy shows congestion, microvarices, edema of the posterior third of the vocal folds and a loss of its vibratory amplitude. The authors studied 97 premenstrual women who were prescribed a treatment of multivitamins, venous tone stimulants (phlebotonics), and anti-edematous drugs. We obtained symptomatic improvement in 84 patients. The menopausal vocal syndrome is characterized by lowered vocal intensity, vocal fatigue, a decreased range with loss of the high tones and a loss of vocal quality. In a study of 100 menopausal women, 17 presented with a menopausal vocal syndrome. To rehabilitate their voices, and thus their professional lives, patients were prescribed hormone replacement therapy and multi-vitamins. All 97 women showed signs of vocal muscle atrophy, reduction in the thickness of the mucosa and reduced mobility in the cricoarytenoid joint. Multi-factorial therapy (hormone replacement therapy and multi-vitamins) has to be individually adjusted to each case depending on body type, vocal needs, and other factors.

  19. Emotional memory and perception in temporal lobectomy patients with amygdala damage.

    PubMed

    Brierley, B; Medford, N; Shaw, P; David, A S

    2004-04-01

    The human amygdala is implicated in the formation of emotional memories and the perception of emotional stimuli--particularly fear--across various modalities. To discern the extent to which these functions are related, 28 patients who had anterior temporal lobectomy (13 left and 15 right) for intractable epilepsy were recruited. Structural magnetic resonance imaging showed that three of them had atrophy of their remaining amygdala. All participants were given tests of affect perception from facial and vocal expressions and of emotional memory, using a standard narrative test and a novel test of word recognition. The results were standardised against matched healthy controls. Performance on all emotion tasks in patients with unilateral lobectomy ranged from unimpaired to moderately impaired. Perception of emotions in faces and voices was (with exceptions) significantly positively correlated, indicating multimodal emotional processing. However, there was no correlation between the subjects' performance on tests of emotional memory and perception. Several subjects showed strong emotional memory enhancement but poor fear perception. Patients with bilateral amygdala damage had greater impairment, particularly on the narrative test of emotional memory, one showing superior fear recognition but absent memory enhancement. Bilateral amygdala damage is particularly disruptive of emotional memory processes in comparison with unilateral temporal lobectomy. On a cognitive level, the pattern of results implies that perception of emotional expressions and emotional memory are supported by separate processing systems or streams.

  20. The Effect of Hydration on Voice Quality in Adults: A Systematic Review.

    PubMed

    Alves, Maxine; Krüger, Esedra; Pillay, Bhavani; van Lierde, Kristiane; van der Linde, Jeannie

    2017-11-06

    We aimed to critically appraise scientific, peer-reviewed articles, published in the past 10 years, on the effects of hydration on voice quality in adults. This is a systematic review. Five databases were searched using the key words "vocal fold hydration", "voice quality", "vocal fold dehydration", and "hygienic voice therapy". The Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) guidelines were followed. The included studies were scored based on the American Speech-Language-Hearing Association's levels of evidence and quality indicators, as well as the Cochrane Collaboration's risk of bias tool. Systemic dehydration as a result of fasting and not ingesting fluids significantly negatively affected the parameters of noise-to-harmonics ratio (NHR), shimmer, jitter, frequency, and the s/z ratio. Water ingestion led to significant improvements in shimmer, jitter, frequency, and maximum phonation time values. Caffeine intake does not appear to negatively affect voice production. Laryngeal desiccation challenges by oral breathing led to surface dehydration, which negatively affected jitter, shimmer, NHR, phonation threshold pressure, and perceived phonatory effort. Steam inhalation significantly improved NHR, shimmer, and jitter. Only nebulization of isotonic solution decreased phonation threshold pressure, giving some indication of a potential positive effect of nebulized substances. Treatments in high-humidity environments proved effective, and adaptation of low-humidity environments should be encouraged. Recent literature regarding vocal hydration provides high-quality evidence. Systemic hydration is the easiest and most cost-effective solution to improve voice quality. Recent evidence therefore supports the inclusion of hydration in a vocal hygiene program.
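
    Two of the perturbation measures recurring in this review can be stated compactly: local jitter is the mean absolute difference between consecutive glottal periods divided by the mean period, and local shimmer is the analogous quantity for cycle peak amplitudes, both usually reported in percent. A sketch on invented values; clinical analyzers (e.g., Praat or MDVP) apply additional conditioning omitted here:

```python
# Relative perturbation: mean absolute cycle-to-cycle difference divided
# by the overall mean, in percent. Applied to periods it gives local
# jitter; applied to peak amplitudes it gives local shimmer.
def relative_perturbation(values):
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))

periods_ms = [8.10, 8.16, 8.05, 8.14, 8.09]   # ~123 Hz voice (invented)
amplitudes = [0.81, 0.79, 0.82, 0.80, 0.78]   # cycle peak amplitudes
print(f"jitter  = {relative_perturbation(periods_ms):.2f} %")
print(f"shimmer = {relative_perturbation(amplitudes):.2f} %")
```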

  1. The singing/acting mature adult--singing instruction perspective.

    PubMed

    Westerman Gregg, J

    1997-06-01

    Complete knowledge of anatomy and physiology of the vocal mechanism and tract is essential for the voice teacher to be maximally effective. Possible contributing factors to vocal attrition in the mature singer/actor are outlined: poor posture, inadequate respiratory function, lack of adequate hydration, phonatory hyperfunction, habitual speaking pitch at too low a frequency, lack of resonance, tongue tension affecting phonation, resonation, and articulation. Techniques for rehabilitation of the damaged voice are recommended.

  2. Discrimination of communication vocalizations by single neurons and groups of neurons in the auditory midbrain.

    PubMed

    Schneider, David M; Woolley, Sarah M N

    2010-06-01

    Many social animals including songbirds use communication vocalizations for individual recognition. The perception of vocalizations depends on the encoding of complex sounds by neurons in the ascending auditory system, each of which is tuned to a particular subset of acoustic features. Here, we examined how well the responses of single auditory neurons could be used to discriminate among bird songs and we compared discriminability to spectrotemporal tuning. We then used biologically realistic models of pooled neural responses to test whether the responses of groups of neurons discriminated among songs better than the responses of single neurons and whether discrimination by groups of neurons was related to spectrotemporal tuning and trial-to-trial response variability. The responses of single auditory midbrain neurons could be used to discriminate among vocalizations with a wide range of abilities, ranging from chance to 100%. The ability to discriminate among songs using single neuron responses was not correlated with spectrotemporal tuning. Pooling the responses of pairs of neurons generally led to better discrimination than the average of the two inputs and the most discriminating input. Pooling the responses of three to five single neurons continued to improve neural discrimination. The increase in discriminability was largest for groups of neurons with similar spectrotemporal tuning. Further, we found that groups of neurons with correlated spike trains achieved the largest gains in discriminability. We simulated neurons with varying levels of temporal precision and measured the discriminability of responses from single simulated neurons and groups of simulated neurons. Simulated neurons with biologically observed levels of temporal precision benefited more from pooling correlated inputs than did neurons with highly precise or imprecise spike trains. These findings suggest that pooling correlated neural responses with the levels of precision observed in the auditory midbrain increases neural discrimination of complex vocalizations.
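
    The pooling result can be illustrated with a toy version of the analysis: classify each held-out response by the nearest per-song mean template, then compare a single neuron against the summed response of a similarly tuned pair. The Poisson-like spike counts and firing rates below are synthetic, and nearest-template classification is an assumption standing in for the authors' decoder.

```python
# Toy neural discrimination: single neuron vs. pooled pair of neurons.
import random
random.seed(1)

def poissonish(rate):
    # crude Poisson-like spike count: sum of 100 Bernoulli draws
    return sum(random.random() < rate / 100.0 for _ in range(100))

def responses(rates, n=40):
    return {song: [poissonish(r) for _ in range(n)] for song, r in rates.items()}

def percent_correct(resp):
    # first 20 trials build mean templates; remaining trials are tested
    templates = {s: sum(v[:20]) / 20.0 for s, v in resp.items()}
    hits = total = 0
    for song, vals in resp.items():
        for v in vals[20:]:
            guess = min(templates, key=lambda s: abs(v - templates[s]))
            hits += guess == song
            total += 1
    return 100.0 * hits / total

rates_a = {"songA": 10, "songB": 14, "songC": 18}  # neuron 1 (spikes/trial)
rates_b = {"songA": 12, "songB": 16, "songC": 20}  # neuron 2, similar tuning
a, b = responses(rates_a), responses(rates_b)
pooled = {s: [x + y for x, y in zip(a[s], b[s])] for s in a}
print(f"single neuron: {percent_correct(a):.0f}% correct")
print(f"pooled pair:   {percent_correct(pooled):.0f}% correct")
```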

  3. Detection and Classification of Whale Acoustic Signals

    NASA Astrophysics Data System (ADS)

    Xian, Yin

    This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification. In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information. In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data. Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear. We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.
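
    As a point of reference for the detection problem, a plain spectrogram-style detector (the baseline the dissertation improves on, which ignores phase) can be as simple as thresholding in-band STFT energy against a median noise floor. The band, FFT size, and threshold below are illustrative, not the dissertation's settings:

```python
# Toy spectrogram (STFT magnitude) energy detector for transient calls.
import numpy as np

def stft_energy_detector(x, fs, band=(100.0, 800.0), nfft=256,
                         hop=128, threshold_db=10.0):
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    window = np.hanning(nfft)
    energies = []
    for start in range(0, len(x) - nfft + 1, hop):
        spectrum = np.fft.rfft(window * x[start:start + nfft])
        energies.append(np.sum(np.abs(spectrum[in_band]) ** 2))
    energies = np.asarray(energies)
    noise_floor = np.median(energies)  # robust noise estimate
    return 10 * np.log10(energies / noise_floor) > threshold_db

fs = 4000
t = np.arange(fs) / fs                                   # 1 s of data
x = np.random.randn(fs) * 0.1                            # background noise
x[1000:2000] += np.sin(2 * np.pi * 400 * t[1000:2000])   # a 400 Hz "call"
print(np.flatnonzero(stft_energy_detector(x, fs)))       # detected frames
```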

  4. Laryngeal muscle activity in unilateral vocal fold paralysis patients using electromyography and coronal reconstructed images.

    PubMed

    Sanuki, Tetsuji; Yumoto, Eiji; Nishimoto, Kohei; Minoda, Ryosei

    2014-04-01

    To assess laryngeal muscle activity in unilateral vocal fold paralysis (UVFP) patients using laryngeal electromyography (LEMG) and coronal images. This was a case series with chart review conducted at a university hospital. Twenty-one patients, diagnosed with UVFP of at least 6 months' duration and presenting with paralytic dysphonia, underwent LEMG, phonatory function tests, and coronal imaging. A 4-point scale was used to grade motor unit (MU) recruitment: absent = 4+, greatly decreased = 3+, moderately decreased = 2+, and mildly decreased = 1+. Maximum phonation time (MPT) and mean flow rate (MFR) were measured. Coronal images were assessed for differences in thickness and vertical position of the vocal folds during phonation and inhalation. MU recruitment in the thyroarytenoid/lateral cricoarytenoid (TA/LCA) muscle complex was 1+ in 4 patients, 2+ in 5, 3+ in 6, and 4+ in 6. MPT was positively correlated with MU recruitment. Thinning of the affected fold was evident during phonation in 19 of the 21 subjects. The affected fold was at an equal level with the healthy fold in all 9 subjects with MU recruitment of 1+ and 2+. Eleven of 12 subjects with MU recruitment of 3+ and 4+ showed the affected fold at a higher level than the healthy fold. There was a significant difference between MU recruitment and the vertical position of the affected fold. Synkinetic reinnervation may occur in some cases with UVFP. MU recruitment of the TA/LCA muscle complex in UVFP patients may be related to phonatory function and the vertical position of the affected fold.

  5. Vocal warm-up and breathing training for teachers: randomized clinical trial

    PubMed Central

    Pereira, Lílian Paternostro de Pina; Masson, Maria Lúcia Vaz; Carvalho, Fernando Martins

    2015-01-01

    OBJECTIVE: To compare the effectiveness of two speech therapy interventions, vocal warm-up and breathing training, focusing on teachers' voice quality. METHODS: A single-blind, randomized, parallel clinical trial was conducted. The research included 31 teachers, aged 20 to 60 years, from a public school in Salvador, BA, Northeastern Brazil, with minimum workloads of 20 hours a week, whether or not they reported vocal alterations. The exclusion criteria were the following: being a smoker, excessive alcohol consumption, receiving additional speech therapy assistance while taking part in the study, being affected by upper respiratory tract infections, professional use of the voice in another activity, neurological disorders, and history of cardiopulmonary pathologies. The subjects were distributed through simple randomization into the vocal warm-up (n = 14) and breathing training (n = 17) groups. The teachers' voice quality was subjectively evaluated through the Voice Handicap Index (Índice de Desvantagem Vocal, in the Brazilian version) and computerized voice analysis (average fundamental frequency, jitter, shimmer, noise, and glottal-to-noise excitation ratio) by speech therapists. RESULTS: Before the interventions, the groups were similar regarding sociodemographic characteristics, teaching activities, and vocal quality. The variations before and after the intervention in self-assessment and acoustic voice indicators did not significantly differ between the groups. In the comparison between groups before and after the six-week interventions, significant reductions in the Voice Handicap Index of subjects in both groups were observed, as well as reduced average fundamental frequencies in the vocal warm-up group and increased shimmer in the breathing training group. Subjects from the vocal warm-up group reported speaking more easily and having their voices improved in a general way as compared to the breathing training group. CONCLUSIONS: Both interventions were similar regarding their effects on the teachers' voice quality. However, each intervention individually contributed to improving the teachers' voice quality, especially the vocal warm-up. PMID:26465664

  6. Infant Cries Rattle Adult Cognition.

    PubMed

    Dudek, Joanna; Faress, Ahmed; Bornstein, Marc H; Haley, David W

    2016-01-01

    The attention-grabbing quality of the infant cry is well recognized, but how the emotional valence of infant vocal signals affects adult cognition and cortical activity has heretofore been unknown. We examined the effects of two contrasting infant vocalizations (cries vs. laughs) on adult performance on a Stroop task using a cross-modal distraction paradigm in which infant distractors were vocal and targets were visual. Infant vocalizations were presented before (Experiment 1) or during each Stroop trial (Experiment 2). To evaluate the influence of infant vocalizations on cognitive control, neural responses to the Stroop task were obtained by measuring electroencephalography (EEG) and event-related potentials (ERPs) in Experiment 1. Based on the previously demonstrated existence of negative arousal bias, we hypothesized that cry vocalizations would be more distracting and invoke greater conflict processing than laugh vocalizations. Similarly, we expected participants to have greater difficulty shifting attention from the vocal distractors to the target task after hearing cries vs. after hearing laughs. Behavioral results from both experiments showed a cry interference effect, in which task performance was slower with cry than with laugh distractors. Electrophysiology data further revealed that cries more than laughs reduced attention to the task (smaller P200) and increased conflict processing (larger N450), albeit differently for incongruent and congruent trials. Results from a correlation analysis showed that the amplitudes of P200 and N450 were inversely related, suggesting a reciprocal relationship between attention and conflict processing. The findings suggest that cognitive control processes contribute to an attention bias to infant signals, which is modulated in part by the valence of the infant vocalization and the demands of the cognitive task. The findings thus support the notion that infant cries elicit a negative arousal bias that is distracting; they also identify, for the first time, the neural dynamics underlying the unique influence that infant cries and laughs have on cognitive control.

  7. Cetacean Bioacoustics with Emphasis on Recording and Monitoring

    NASA Astrophysics Data System (ADS)

    Akamatsu, Tomonari

    More than 80 cetacean species live in oceans, lakes, and rivers. For underwater navigation and recognition, whales and dolphins have developed unique sensory systems using acoustic signals. Toothed whales, such as dolphins and porpoises, have a sonar system that uses ultrasonic pulse trains for echolocation (Au, 1993). As top predators in the water, dolphins and porpoises rely on accurate and long-range sensory systems for catching prey. Dolphins also produce another type of vocalization, the whistle, which is narrowband and of long duration.

  8. Amplification and spectral shifts of vocalizations inside burrows of the frog Eupsophus calcaratus (Leptodactylidae)

    NASA Astrophysics Data System (ADS)

    Penna, Mario

    2004-08-01

    A variety of animals that communicate by sound emit signals from sites favoring their propagation, thereby increasing the range over which these sounds convey information. A different significance of calling sites has been reported for burrowing frogs Eupsophus emiliopugini from southern Chile: the cavities from which these frogs vocalize amplify conspecific vocalizations generated externally, thus providing a means to enhance the reception of neighbor's vocalizations in chorusing aggregations. In the current study the amplification of vocalizations of a related species, E. calcaratus, is investigated, to explore the extent of sound enhancement reported previously. Advertisement calls broadcast through a loudspeaker placed in the vicinity of a burrow, monitored with small microphones, are amplified by up to 18 dB inside cavities relative to outside. The fundamental resonant frequency of burrows, measured with broadcast noise and pure tones, ranges from 842 to 1836 Hz and is significantly correlated with the burrow's length. Burrows change the spectral envelope of incoming calls by increasing the amplitude of lower relative to higher harmonics. The call amplification effect inside burrows of E. calcaratus parallels the effect reported previously for E. emiliopugini, and indicates that the acoustic properties of calling sites may affect signal reception by burrowing animals.
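
    The reported resonances are consistent with treating a burrow as a tube open at one end, for which the fundamental is roughly f0 = c/4L (end corrections ignored). This back-of-envelope sketch is not the author's model, and the burrow lengths are illustrative:

```python
# Quarter-wave resonance of a tube open at one end: f0 = c / (4 * L).
SPEED_OF_SOUND = 343.0  # m/s

def quarter_wave_resonance(length_m):
    return SPEED_OF_SOUND / (4.0 * length_m)

for L in (0.05, 0.07, 0.10):  # plausible burrow lengths in meters
    print(f"L = {L * 100:.0f} cm -> f0 = {quarter_wave_resonance(L):.0f} Hz")
```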

  9. Amplification and spectral shifts of vocalizations inside burrows of the frog Eupsophus calcaratus (Leptodactylidae).

    PubMed

    Penna, Mario

    2004-08-01

    A variety of animals that communicate by sound emit signals from sites favoring their propagation, thereby increasing the range over which these sounds convey information. A different significance of calling sites has been reported for burrowing frogs Eupsophus emiliopugini from southern Chile: the cavities from which these frogs vocalize amplify conspecific vocalizations generated externally, thus providing a means to enhance the reception of neighbor's vocalizations in chorusing aggregations. In the current study the amplification of vocalizations of a related species, E. calcaratus, is investigated, to explore the extent of sound enhancement reported previously. Advertisement calls broadcast through a loudspeaker placed in the vicinity of a burrow, monitored with small microphones, are amplified by up to 18 dB inside cavities relative to outside. The fundamental resonant frequency of burrows, measured with broadcast noise and pure tones, ranges from 842 to 1836 Hz and is significantly correlated with the burrow's length. Burrows change the spectral envelope of incoming calls by increasing the amplitude of lower relative to higher harmonics. The call amplification effect inside burrows of E. calcaratus parallels the effect reported previously for E. emiliopugini, and indicates that the acoustic properties of calling sites may affect signal reception by burrowing animals.

  10. An acoustic analysis of laughter produced by congenitally deaf and normally hearing college students.

    PubMed

    Makagon, Maja M; Funayama, E Sumie; Owren, Michael J

    2008-07-01

    Relatively few empirical data are available concerning the role of auditory experience in nonverbal human vocal behavior, such as laughter production. This study compared the acoustic properties of laughter in 19 congenitally, bilaterally, and profoundly deaf college students and in 23 normally hearing control participants. Analyses focused on degree of voicing, mouth position, air-flow direction, temporal features, relative amplitude, fundamental frequency, and formant frequencies. Results showed that laughter produced by the deaf participants was fundamentally similar to that produced by the normally hearing individuals, which in turn was consistent with previously reported findings. Finding comparable acoustic properties in the sounds produced by deaf and hearing vocalizers confirms the presumption that laughter is importantly grounded in human biology, and that auditory experience with this vocalization is not necessary for it to emerge in species-typical form. Some differences were found between the laughter of deaf and hearing groups; the most important being that the deaf participants produced lower-amplitude and longer-duration laughs. These discrepancies are likely due to a combination of the physiological and social factors that routinely affect profoundly deaf individuals, including low overall rates of vocal fold use and pressure from the hearing world to suppress spontaneous vocalizations.

  11. Reading in developmental prosopagnosia: Evidence for a dissociation between word and face recognition.

    PubMed

    Starrfelt, Randi; Klargaard, Solja K; Petersen, Anders; Gerlach, Christian

    2018-02-01

    Recent models suggest that face and word recognition may rely on overlapping cognitive processes and neural regions. In support of this notion, face recognition deficits have been demonstrated in developmental dyslexia. Here we test whether the opposite association can also be found, that is, impaired reading in developmental prosopagnosia. We tested 10 adults with developmental prosopagnosia and 20 matched controls. All participants completed the Cambridge Face Memory Test, the Cambridge Face Perception Test, and a face recognition questionnaire used to quantify everyday face recognition experience. Reading was measured in four experimental tasks, testing different levels of letter, word, and text reading: (a) single word reading with words of varying length, (b) vocal response times in single letter and short word naming, (c) recognition of single letters and short words at brief exposure durations (targeting the word superiority effect), and (d) text reading. Participants with developmental prosopagnosia performed strikingly similarly to controls across the four reading tasks. Formal analysis revealed a significant dissociation between word and face recognition, as the difference in performance with faces and words was significantly greater for participants with developmental prosopagnosia than for controls. Adult developmental prosopagnosics read as quickly and fluently as controls, while they are seemingly unable to learn efficient strategies for recognizing faces. We suggest that this is due to the differing demands that face and word recognition put on the perceptual system.

  12. [The autoimmune rheumatic disease and laryngeal pathology].

    PubMed

    Osipenko, E V; Kotel'nikova, N M

    Vocal disorders are among the manifestations of autoimmune pathological conditions characterized by multiple organ system dysfunction. Laryngeal pathology in these conditions has an autoimmune nature; it is highly diverse and poorly explored. The objective of the present work, based on an analysis of the relevant literature, was to study clinical manifestations of autoimmune rheumatic disease affecting the larynx. 'Bamboo nodes' on the vocal folds are a rare manifestation of laryngeal autoimmune disease. We found references to 49 cases of this condition in the available literature. All the patients were women presenting with autoimmune diseases. The present review highlights problems pertaining to the etiology of 'bamboo nodes' on the vocal folds and methods for the treatment of this condition.

  13. A bioreactor for the dynamic mechanical stimulation of vocal-fold fibroblasts based on vibro-acoustography

    NASA Astrophysics Data System (ADS)

    Chan, Roger W.; Rodriguez, Maritza

    2005-09-01

    During voice production, the vocal folds undergo airflow-induced self-sustained oscillation at a fundamental frequency of around 100-1000 Hz, with an amplitude of around 1-3 mm. The vocal-fold extracellular matrix (ECM), with appropriate tissue viscoelastic properties, is optimally tuned for such vibration. Vocal-fold fibroblasts regulate the gene expressions for key ECM proteins (e.g., collagen, fibronectin, fibromodulin, and hyaluronic acid), and these expressions are affected by the stress fields experienced by the fibroblasts. This study attempts to develop a bioreactor for cultivating cells under a micromechanical environment similar to that in vivo, based on the principle of vibro-acoustography. Vocal-fold fibroblasts from primary culture were grown in 3D, biodegradable scaffolds, and were excited dynamically by the radiation force generated by amplitude modulation of two confocal ultrasound beams of slightly different frequencies. Low-frequency acoustic radiation force was applied to the scaffold surface, and its vibratory response was imaged by videostroboscopy. A phantom tissue (standard viscoelastic material) with known elastic modulus was also excited and its vibratory frequency and amplitude were measured by videostroboscopy. Results showed that the bioreactor was capable of delivering mechanical stimuli to the tissue constructs in a physiological frequency range (100-1000 Hz), supporting its potential for vocal-fold tissue engineering applications. [Work supported by NIH Grant R01 DC006101.]
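
    The stimulation principle here is that two overlapping beams at carrier frequencies f1 and f2 produce a radiation force oscillating at the difference frequency |f1 - f2|, so a megahertz carrier pair can shake tissue at phonatory rates. A trivial sketch; the carrier values are illustrative, not the bioreactor's settings:

```python
# Difference-frequency stimulation: pick a second carrier so the beat
# lands in the phonatory range (~100-1000 Hz).
def difference_frequency(f1_hz, f2_hz):
    return abs(f1_hz - f2_hz)

f1 = 1_000_000.0                          # hypothetical 1 MHz carrier
for target in (100.0, 500.0, 1000.0):     # desired stimulation rates (Hz)
    f2 = f1 + target
    print(f"f2 = {f2:.0f} Hz -> force oscillates at "
          f"{difference_frequency(f1, f2):.0f} Hz")
```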

  14. Effects of Dehydration on the Viscoelastic Properties of Vocal Folds in Large Deformations

    PubMed Central

    Miri, Amir K.; Barthelat, François; Mongeau, Luc

    2012-01-01

    Dehydration may alter vocal fold viscoelastic properties, which may hamper phonation. The effects of water loss induced by an osmotic-pressure potential on vocal fold tissue viscoelastic properties were investigated. Porcine vocal folds were dehydrated by immersion in a hypertonic solution, and quasi-static and low-frequency dynamic traction tests were performed for elongations of up to 50%. Digital image correlation was used to determine local strains from surface deformations. The elastic modulus and the loss factor were then determined for normal and dehydrated tissues. An eight-chain hyperelastic model was used to describe the observed nonlinear stress-stretch behavior. Contrary to expectations, the mass history indicated that the tissue absorbed water during cyclic extension when submerged in a hypertonic solution. During loading, the elastic modulus of dehydrated tissues increased as a function of strain, whereas the response of dehydrated tissues was much less affected during unloading. This finding calls for greater attention to micromechanical modeling of the vocal folds. The internal hysteresis, which is often linked to phonation effort, increased significantly with water loss. The effects of dehydration on the viscoelastic properties of vocal fold tissue were quantified in a systematic way. The results will contribute to a better understanding of the basic biomechanics of voice production and ultimately will help establish objective dehydration and phonotrauma criteria. PMID:22483778
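
    For readers unfamiliar with the eight-chain model invoked above, the sketch below evaluates the Arruda-Boyce uniaxial Cauchy stress using Cohen's Pade approximation of the inverse Langevin function. The shear modulus and chain-segment parameter are hypothetical placeholders, not the paper's fitted values.

        import numpy as np

        def inv_langevin(x):
            # Cohen's Pade approximation to the inverse Langevin function.
            return x * (3.0 - x ** 2) / (1.0 - x ** 2)

        def eight_chain_stress(lam, mu, N):
            # Uniaxial Cauchy stress for the Arruda-Boyce (eight-chain) model.
            # lam: applied stretch (1.5 = 50% elongation); mu: initial shear
            # modulus (Pa); N: chain-segment parameter limiting extensibility.
            lam_chain = np.sqrt((lam ** 2 + 2.0 / lam) / 3.0)  # network stretch
            x = lam_chain / np.sqrt(N)
            return (mu / 3.0) * (np.sqrt(N) / lam_chain) \
                * inv_langevin(x) * (lam ** 2 - 1.0 / lam)

        # Strain stiffening up to 50% elongation (mu and N are hypothetical):
        for lam in (1.1, 1.3, 1.5):
            print("stretch %.1f -> stress %.1f Pa"
                  % (lam, eight_chain_stress(lam, mu=1e3, N=2.5)))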

  15. Communicative aspects and coping strategies in patients with Parkinson's disease.

    PubMed

    Costa, Flávia Pereira da; Diaféria, Giovana; Behlau, Mara

    2016-01-01

    To investigate, in patients with Parkinson's disease (PD), the coping strategies, the most reported vocal symptoms, and the relation between coping, voice symptoms, and communicative aspects. Seventy-three subjects were included in the sample: 33 participants in the experimental group (EG) with a diagnosis of PD and 40 control subjects, that is, healthy and without vocal complaints. They underwent the following procedures: application of the Voice Symptoms Scale (VoiSS), Brazilian version; the Voice Disability Coping Questionnaire (VDCQ), Brazilian version; and the questionnaire Living with Dysarthria (LwD). The EG presented deviations in all protocols: in the VDCQ, the most frequently used coping strategy was "self-control"; in the VoiSS, "Impairment" was the most prevalent domain; and the LwD showed changes in all sections. Vocal signs and symptoms and communicative aspects showed a fair correlation with coping. The correlation between vocal symptoms and communicative aspects was as follows: the greater the impairment in communication, the greater the VoiSS emotional scores and the more voice-related signs and symptoms were reported. Patients with PD use all kinds of coping strategies but prefer self-control. They present several vocal signs and symptoms, with "Impairment" as the most prevalent domain. There are difficulties in all aspects of communication. The higher the occurrence of vocal signs and symptoms, the more the patient reports the difficulties of living with dysarthria, particularly when deviations affect the emotional domain.

  16. Comparison of Effects Produced by Physiological Versus Traditional Vocal Warm-up in Contemporary Commercial Music Singers.

    PubMed

    Portillo, María Priscilla; Rojas, Sandra; Guzman, Marco; Quezada, Camilo

    2018-03-01

    The present study aimed to observe whether physiological warm-up and traditional singing warm-up differently affect aerodynamic, electroglottographic, acoustic, and self-perceived parameters of voice in Contemporary Commercial Music singers. Thirty subjects were asked to perform a 15-minute session of vocal warm-up. They were randomly assigned to one of two types of vocal warm-up: physiological (based on semi-occluded exercises) or traditional (singing warm-up based on open vowel [a:]). Aerodynamic, electroglottographic, acoustic, and self-perceived voice quality assessments were carried out before (pre) and after (post) warm-up. No significant differences were found when comparing both types of vocal warm-up methods, either in subjective or in objective measures. Furthermore, the main positive effect observed in both groups when comparing pre and post conditions was a better self-reported quality of voice. Additionally, significant differences were observed for sound pressure level (decrease), glottal airflow (increase), and aerodynamic efficiency (decrease) in the traditional warm-up group. Both traditional and physiological warm-ups produce favorable voice sensations. Moreover, there are no evident differences in aerodynamic and electroglottographic variables when comparing both types of vocal warm-ups. Some changes after traditional warm-up (decreased intensity, increased airflow, and decreased aerodynamic efficiency) could imply an early stage of vocal fatigue. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  17. Music recognition by Japanese children with cochlear implants.

    PubMed

    Nakata, Takayuki; Trehub, Sandra E; Mitani, Chisato; Kanda, Yukihiko; Shibasaki, Atsuko; Schellenberg, E Glenn

    2005-01-01

    Congenitally deaf Japanese children with cochlear implants were tested on their recognition of theme songs from television programs that they watched regularly. The children, who were 4-9 years of age, attempted to identify each song from a closed set of alternatives. Their song identification ability was examined in the context of the original commercial recordings (vocal plus instrumental), the original versions without the words (i.e., karaoke versions), and flute versions of the melody. The children succeeded in identifying the music only from the original versions, and their performance was related to their music listening habits. Children gave favorable appraisals of the music even when they were unable to recognize it. Further research is needed to find means of enhancing cochlear implant users' perception and appreciation of music.

  18. Agonistic character displacement in social cognition of advertisement signals.

    PubMed

    Pasch, Bret; Sanford, Rachel; Phelps, Steven M

    2017-03-01

    Interspecific aggression between sibling species may enhance discrimination of competitors when recognition errors are costly, but proximate mechanisms mediating increased discriminative ability are unclear. We studied behavioral and neural mechanisms underlying responses to conspecific and heterospecific vocalizations in Alston's singing mouse (Scotinomys teguina), a species in which males sing to repel rivals. We performed playback experiments using males in allopatry and sympatry with a dominant heterospecific (Scotinomys xerampelinus) and examined song-evoked induction of egr-1 in the auditory system to examine how neural tuning modulates species-specific responses. Heterospecific songs elicited stronger neural responses in sympatry than in allopatry, despite eliciting less singing in sympatry. Our results refute the traditional neuroethological concept of a matched filter and instead suggest expansion of sensory sensitivity to mediate competitor recognition in sympatry.

  19. Cultural in-group advantage: emotion recognition in African American and European American faces and voices.

    PubMed

    Wickline, Virginia B; Bailey, Wendy; Nowicki, Stephen

    2009-03-01

    The authors explored whether there were in-group advantages in emotion recognition of faces and voices by culture or geographic region. Participants were 72 African American students (33 men, 39 women), 102 European American students (30 men, 72 women), 30 African international students (16 men, 14 women), and 30 European international students (15 men, 15 women). The participants determined emotions in African American and European American faces and voices. Results showed an in-group advantage, sometimes by culture and less often by race, in recognizing facial and vocal emotional expressions. African international students were generally less accurate at interpreting American nonverbal stimuli than were European American, African American, and European international peers. Results suggest that, although partly universal, emotional expressions have subtle differences across cultures that persons must learn.

  20. Intraoperative handheld probe for 3D imaging of pediatric benign vocal fold lesions using optical coherence tomography (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Benboujja, Fouzi; Garcia, Jordan; Beaudette, Kathy; Strupler, Mathias; Hartnick, Christopher J.; Boudoux, Caroline

    2016-02-01

    Excessive and repetitive force applied on vocal fold tissue can induce benign vocal fold lesions. Children affected suffer from chronic hoarseness. In this instance, the vibratory ability of the folds, a complex layered microanatomy, becomes impaired. Histological findings have shown that lesions produce a remodeling of sub-epithelial vocal fold layers. However, our understanding of lesion features and development is still limited. Indeed, conventional imaging techniques do not allow a non-invasive assessment of the sub-epithelial integrity of the vocal fold. Furthermore, it remains challenging to differentiate these sub-epithelial lesions (such as bilateral nodules, polyps and cysts) from a clinical perspective, as their outer surfaces are relatively similar. As treatment strategy differs for each lesion type, it is critical to efficiently differentiate sub-epithelial alterations involved in benign lesions. In this study, we developed an optical coherence tomography (OCT)-based handheld probe suitable for pediatric laryngological imaging. The probe allows for rapid three-dimensional imaging of vocal fold lesions. The system is adapted to allow for high-resolution intra-operative imaging. We imaged 20 patients undergoing direct laryngoscopy, during which we examined different benign pediatric pathologies such as bilateral nodules, cysts and laryngeal papillomatosis and compared them to healthy tissue. We qualitatively and quantitatively characterized laryngeal pathologies and demonstrated the added advantage of using 3D OCT imaging for lesion discrimination and margin assessment. OCT evaluation of the integrity of the vocal cord could lead to better pediatric management of laryngeal diseases.

  1. Differences between vocalization evoked by social stimuli in feral cats and house cats.

    PubMed

    Yeon, Seong C; Kim, Young K; Park, Se J; Lee, Scott S; Lee, Seung Y; Suh, Euy H; Houpt, Katherine A; Chang, Hong H; Lee, Hee C; Yang, Byung G; Lee, Hyo J

    2011-06-01

    To investigate how socialization can affect the types and characteristics of vocalization produced by cats, feral cats (n=25) and house cats (n=13) were used as subjects, allowing a comparison between cats socialized to people and non-socialized cats. To record vocalization and assess the cats' responses to behavioural stimuli, five test situations were used: approach by a familiar caretaker, by a threatening stranger, by a large doll, by a stranger with a dog and by a stranger with a cat. Feral cats showed extremely aggressive and defensive behaviour in most test situations, and produced higher call rates than house cats, which could be attributed to less socialization to other animals and to more sensitivity to fearful situations. Differences were also observed in the acoustic parameters of feral cats in comparison to those of house cats. In agonistic test situations, feral cats produced growls and hisses with significantly higher fundamental frequency, peak frequency, 1st quartile frequency, and 3rd quartile frequency. In contrast, for meows, all acoustic parameters (fundamental frequency, first formant, peak frequency, 1st quartile frequency, and 3rd quartile frequency) were significantly higher in house cats than in feral cats. House cats also produced calls significantly shorter in duration than those of feral cats in agonistic test situations. These results support the conclusion that a lack of socialization may affect both the types of vocalizations used and their acoustic characteristics, so that proper socialization may be essential for a cat to become a suitable house companion. Copyright © 2011 Elsevier B.V. All rights reserved.
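
    As a concrete illustration of the quartile-frequency parameters reported above: the 1st- and 3rd-quartile frequencies are the frequencies below which 25% and 75% of a call's spectral energy lies. A minimal sketch, using a synthetic test tone rather than real cat recordings:

        import numpy as np

        def spectral_quartiles(signal, fs):
            # Frequencies below which 25%, 50%, and 75% of the spectral
            # energy lies (1st-quartile, median, and 3rd-quartile frequencies).
            power = np.abs(np.fft.rfft(signal)) ** 2
            freqs = np.fft.rfftfreq(signal.size, 1 / fs)
            cum = np.cumsum(power) / power.sum()
            return [float(freqs[np.searchsorted(cum, q)])
                    for q in (0.25, 0.5, 0.75)]

        # Hypothetical test signal standing in for a recorded call:
        fs = 44100
        t = np.arange(0, 1.0, 1 / fs)
        call = (np.sin(2 * np.pi * 600 * t)            # fundamental
                + 0.8 * np.sin(2 * np.pi * 1200 * t)   # harmonics
                + 0.6 * np.sin(2 * np.pi * 1800 * t))
        print(spectral_quartiles(call, fs))  # e.g. [600.0, 600.0, 1200.0]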

  2. Diminished FoxP2 levels affect dopaminergic modulation of corticostriatal signaling important to song variability.

    PubMed

    Murugan, Malavika; Harward, Stephen; Scharff, Constance; Mooney, Richard

    2013-12-18

    Mutations of the FOXP2 gene impair speech and language development in humans and shRNA-mediated suppression of the avian ortholog FoxP2 disrupts song learning in juvenile zebra finches. How diminished FoxP2 levels affect vocal control and alter the function of neural circuits important to learned vocalizations remains unclear. Here we show that FoxP2 knockdown in the songbird striatum disrupts developmental and social modulation of song variability. Recordings in anesthetized birds show that FoxP2 knockdown interferes with D1R-dependent modulation of activity propagation in a corticostriatal pathway important to song variability, an effect that may be partly attributable to reduced D1R and DARPP-32 protein levels. Furthermore, recordings in singing birds reveal that FoxP2 knockdown prevents social modulation of singing-related activity in this pathway. These findings show that reduced FoxP2 levels interfere with the dopaminergic modulation of vocal variability, which may impede song and speech development by disrupting reinforcement learning mechanisms. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Vocal Identity Recognition in Autism Spectrum Disorder

    PubMed Central

    Lin, I-Fan; Yamada, Takashi; Komine, Yoko; Kato, Nobumasa; Kato, Masaharu; Kashino, Makio

    2015-01-01

    Voices can convey information about a speaker. When forming an abstract representation of a speaker, it is important to extract relevant features from acoustic signals that are invariant to the modulation of these signals. This study investigated the way in which individuals with autism spectrum disorder (ASD) recognize and memorize vocal identity. The ASD group and control group performed similarly in a task when asked to choose the name of the newly-learned speaker based on his or her voice, and the ASD group outperformed the control group in a subsequent familiarity test when asked to discriminate the previously trained voices and untrained voices. These findings suggest that individuals with ASD recognized and memorized voices as well as the neurotypical individuals did, but they categorized voices in a different way: individuals with ASD categorized voices quantitatively based on the exact acoustic features, while neurotypical individuals categorized voices qualitatively based on the acoustic patterns correlated to the speakers' physical and mental properties. PMID:26070199

  6. A Verbal Guidance System for Severe Disabled People

    NASA Astrophysics Data System (ADS)

    Redjati, Abdelghani; Bousbia-Salah, Mounir

    2008-06-01

    Recent developments in rehabilitation technology significantly broaden the range of possible applications that support handicapped people in their daily lives. This paper presents a system offering moral and physical support for the disabled: a verbal guidance system based on the 'VD364' speech recognition development kit. The aid is intended to control a wheelchair and a manipulator arm for people with severe disabilities who can speak. The study and design conducted in the framework of this contribution enabled an adaptation for practical application and maximum exploitation of the words that the vocal module can recognize. The problem addressed is to allow the manipulator arm to mechanically compensate for impaired arm movements so that users can satisfy everyday needs (for instance, drinking a glass of water). The objective is thus a vocal command system that lets the arm move within a well-determined area to accomplish tasks specified by the user, in addition to controlling the displacement of the wheelchair.
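
    One way to picture the control layer described above is a simple word-to-action dispatcher that routes each recognized command to the wheelchair or the arm. The command vocabulary and actuator functions below are hypothetical illustrations, not the VD364 kit's actual API.

        # Minimal sketch of a vocal command dispatcher (hypothetical commands).
        def move_wheelchair(direction):
            print("wheelchair:", direction)       # stand-in for a motor call

        def move_arm(target):
            print("arm ->", target)               # stand-in for an arm call

        COMMANDS = {
            "forward": lambda: move_wheelchair("forward"),
            "stop":    lambda: move_wheelchair("stop"),
            "glass":   lambda: move_arm("glass position"),
            "mouth":   lambda: move_arm("mouth position"),
        }

        def on_word_recognized(word):
            # Called whenever the speech recognizer emits a word; ignore
            # anything outside the small, well-determined vocabulary.
            action = COMMANDS.get(word)
            if action is None:
                print("unrecognized command:", word)
            else:
                action()

        for w in ("forward", "glass", "mouth", "stop"):
            on_word_recognized(w)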

  7. Applications for Subvocal Speech

    NASA Technical Reports Server (NTRS)

    Jorgensen, Charles; Betts, Bradley

    2007-01-01

    A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous- material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.
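
    To make the recognition step above concrete, here is a minimal sketch of window-level EMG feature extraction feeding a classifier. The features, training data, and two-word vocabulary are hypothetical stand-ins, not the project's actual pipeline; scikit-learn supplies the classifier.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def emg_features(window):
            # Two classic surface-EMG descriptors per analysis window:
            # root-mean-square energy and zero-crossing count.
            rms = np.sqrt(np.mean(window ** 2))
            zero_crossings = np.sum(np.abs(np.diff(np.sign(window))) > 0)
            return [rms, zero_crossings]

        # Hypothetical training data: 200-sample EMG windows for a two-word
        # vocabulary, with different signal energy per word.
        rng = np.random.default_rng(0)
        X = np.array([emg_features(rng.normal(scale=s, size=200))
                      for s in (0.5, 2.0) for _ in range(100)])
        y = np.array([0] * 100 + [1] * 100)   # word labels, e.g. "stop"/"go"

        clf = LogisticRegression().fit(X, y)
        print("training accuracy:", clf.score(X, y))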

  8. Phonologically-based biomarkers for major depressive disorder

    NASA Astrophysics Data System (ADS)

    Trevino, Andrea Carolina; Quatieri, Thomas Francis; Malyska, Nicolas

    2011-12-01

    Of increasing importance in the civilian and military population is the recognition of major depressive disorder at its earliest stages and intervention before the onset of severe symptoms. Toward the goal of more effective monitoring of depression severity, we introduce vocal biomarkers that are derived automatically from phonologically-based measures of speech rate. To assess our measures, we use a 35-speaker free-response speech database of subjects treated for depression over a 6-week duration. We find that dissecting average measures of speech rate into phone-specific characteristics and, in particular, combined phone-duration measures uncovers stronger relationships between speech rate and depression severity than global measures previously reported for a speech-rate biomarker. Results of this study are supported by correlation of our measures with depression severity and classification of depression state with these vocal measures. Our approach provides a general framework for analyzing individual symptom categories through phonological units, and supports the premise that speaking rate can be an indicator of psychomotor retardation severity.
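
    The core analysis pattern here, correlating a phone-duration measure with clinician-rated severity, can be sketched as below. All numbers are hypothetical placeholder values for illustration only, not data from the 35-speaker database.

        import numpy as np
        from scipy.stats import pearsonr

        # Hypothetical per-session values: mean duration of one phone class
        # (seconds) and a clinician-rated depression severity score.
        mean_phone_dur = np.array([0.080, 0.095, 0.110, 0.125, 0.105, 0.140])
        severity = np.array([8, 14, 19, 25, 17, 30])

        r, p = pearsonr(mean_phone_dur, severity)
        print("Pearson r = %.2f (p = %.3f)" % (r, p))
        # Longer phone durations (slower speech) tracking higher severity
        # would be consistent with psychomotor retardation.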

  9. A Joint Prosodic Origin of Language and Music

    PubMed Central

    Brown, Steven

    2017-01-01

    Vocal theories of the origin of language rarely make a case for the precursor functions that underlay the evolution of speech. The vocal expression of emotion is unquestionably the best candidate for such a precursor, although most evolutionary models of both language and speech ignore emotion and prosody altogether. I present here a model for a joint prosodic precursor of language and music in which ritualized group-level vocalizations served as the ancestral state. This precursor combined not only affective and intonational aspects of prosody, but also holistic and combinatorial mechanisms of phrase generation. From this common stage, there was a bifurcation to form language and music as separate, though homologous, specializations. This separation of language and music was accompanied by their (re)unification in songs with words. PMID:29163276

  10. Distributed acoustic cues for caller identity in macaque vocalization.

    PubMed

    Fukushima, Makoto; Doyle, Alex M; Mullarkey, Matthew P; Mishkin, Mortimer; Averbeck, Bruno B

    2015-12-01

    Individual primates can be identified by the sound of their voice. Macaques have demonstrated an ability to discern conspecific identity from a harmonically structured 'coo' call. Voice recognition presumably requires the integrated perception of multiple acoustic features. However, it is unclear how this is achieved, given considerable variability across utterances. Specifically, the extent to which information about caller identity is distributed across multiple features remains elusive. We examined these issues by recording and analysing a large sample of calls from eight macaques. Single acoustic features, including fundamental frequency, duration and Wiener entropy, were informative but unreliable for the statistical classification of caller identity. A combination of multiple features, however, allowed for highly accurate caller identification. A regularized classifier that learned to identify callers from the modulation power spectrum of calls found that specific regions of spectral-temporal modulation were informative for caller identification. These ranges are related to acoustic features such as the call's fundamental frequency and FM sweep direction. We further found that the low-frequency spectrotemporal modulation component contained an indexical cue of the caller body size. Thus, cues for caller identity are distributed across identifiable spectrotemporal components corresponding to laryngeal and supralaryngeal components of vocalizations, and the integration of those cues can enable highly reliable caller identification. Our results demonstrate a clear acoustic basis by which individual macaque vocalizations can be recognized.
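
    A minimal sketch of the "regularized classifier" idea, assuming each call's modulation power spectrum has been flattened into a feature vector. The data are synthetic stand-ins, and L2-penalized logistic regression stands in for whatever regularized model the authors used.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        # Synthetic stand-in data: each call is summarized by a flattened
        # modulation power spectrum; labels are the eight caller identities.
        rng = np.random.default_rng(1)
        n_calls, n_features, n_callers = 160, 300, 8
        templates = rng.normal(size=(n_callers, n_features))  # per-caller MPS
        labels = rng.integers(0, n_callers, size=n_calls)
        X = templates[labels] + rng.normal(size=(n_calls, n_features))

        # L2-penalized multinomial logistic regression as the regularized
        # classifier; C sets the regularization strength.
        clf = LogisticRegression(C=0.1, max_iter=1000)
        print("cross-validated accuracy: %.2f"
              % cross_val_score(clf, X, labels, cv=5).mean())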

  12. Divergent Human Cortical Regions for Processing Distinct Acoustic-Semantic Categories of Natural Sounds: Animal Action Sounds vs. Vocalizations

    PubMed Central

    Webster, Paula J.; Skipper-Kallal, Laura M.; Frum, Chris A.; Still, Hayley N.; Ward, B. Douglas; Lewis, James W.

    2017-01-01

    A major gap in our understanding of natural sound processing is knowledge of where or how in a cortical hierarchy differential processing leads to categorical perception at a semantic level. Here, using functional magnetic resonance imaging (fMRI) we sought to determine if and where cortical pathways in humans might diverge for processing action sounds vs. vocalizations as distinct acoustic-semantic categories of real-world sound when matched for duration and intensity. This was tested by using relatively less semantically complex natural sounds produced by non-conspecific animals rather than humans. Our results revealed a striking double-dissociation of activated networks bilaterally. This included a previously well described pathway preferential for processing vocalization signals directed laterally from functionally defined primary auditory cortices to the anterior superior temporal gyri, and a less well-described pathway preferential for processing animal action sounds directed medially to the posterior insulae. We additionally found that some of these regions and associated cortical networks showed parametric sensitivity to high-order quantifiable acoustic signal attributes and/or to perceptual features of the natural stimuli, such as the degree of perceived recognition or intentional understanding. Overall, these results supported a neurobiological theoretical framework for how the mammalian brain may be fundamentally organized to process acoustically and acoustic-semantically distinct categories of ethologically valid, real-world sounds. PMID:28111538

  13. The Role of Auditory Feedback in the Encoding of Paralinguistic Responses.

    ERIC Educational Resources Information Center

    Plazewski, Joseph G.; Allen, Vernon L.

    Twenty college students participated in an examination of the role of auditory feedback in the encoding of paralinguistic affect by adults. A dependent measure indicating the accuracy of paralinguistic communication of affect was obtained by comparing the level of affect that encoders intended to produce with ratings of vocal intonations from…

  14. Are 50-kHz calls used as play signals in the playful interactions of rats? II. Evidence from the effects of devocalization.

    PubMed

    Kisko, Theresa M; Himmler, Brett T; Himmler, Stephanie M; Euston, David R; Pellis, Sergio M

    2015-02-01

    During playful interactions, juvenile rats emit many 50-kHz ultrasonic vocalizations, which are associated with a positive affective state. In addition, these calls may also serve a communicative role as play signals that promote playful contact. Consistent with this hypothesis, a previous study found that vocalizations are more frequent prior to playful contact than after contact is terminated. The present study uses devocalized rats to test three predictions arising from the play-signals hypothesis. First, if vocalizations are used to facilitate contact, then in pairs of rats in which one is devocalized, the higher frequency of pre-contact calling should only be present when the intact rat is initiating the approach. Second, when both partners in a playing pair are devocalized, the frequency of play should be reduced and the typical pattern of playful wrestling disrupted. Finally, when given a choice to play with a vocal and a non-vocal partner, rats should prefer to play with the one able to vocalize. The second prediction was supported in that the frequency of playful interactions as well as some typical patterns of play were disrupted. Even though the data for the other two predictions did not produce the expected findings, they support the conclusion that, in rats, 50-kHz calls likely function to maintain a playful mood and to allow partners to signal to one another during play fighting. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. The Effects of Hormonal Contraception on the Voice: History of Its Evolution in the Literature.

    PubMed

    Rodney, Jennifer P; Sataloff, Robert Thayer

    2016-11-01

    Women of reproductive age commonly use hormonal contraceptives, the vocal effects of which have been studied. Otolaryngologists should be aware of this relationship to make recommendations on hormonal contraception as it relates to each patient's voice requirements. A comprehensive literature review of PubMed was completed. The terms "contraception," "vocal folds," "vocal cords," and "voice" were searched in various combinations. Articles from 1971 to 2015 that addressed the effects of contraception on the vocal folds were included. In total, 24 articles were available for review. Historically, contraception was believed to affect the voice negatively. However, more recent studies using low-dose oral contraceptive pills (OCPs) show that they stabilize the voice, although stabilization generally occurs only during sustained vowel production; connected speech appears unaffected. Therefore, singers may be the only population that experiences clinically increased vocal stability as a result of taking hormonal contraceptives. Only combined OCPs have been studied; other forms of hormonal contraception have not been evaluated for effects on the voice. Significant variability exists between studies in the physical attributes of patients and parameters tested. Hormonal contraception likely has no clinically perceptible effects on the speaking voice. Singers may experience increased vocal stability with low-dose, combined OCP use. Other available forms of contraception have not been studied. Greater consistency in methodology is needed in future research, and other forms of hormonal contraception require study. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  16. Intraoperative laryngeal electromyography in children with vocal fold immobility: a simplified technique.

    PubMed

    Scott, Andrew R; Chong, Peter Siao Tick; Randolph, Gregory W; Hartnick, Christopher J

    2008-01-01

    The primary objective of this study was to determine whether a simplified technique for intraoperative laryngeal electromyography was feasible using standard nerve integrity monitoring electrodes and audiovisual digital recording equipment. Our secondary objective was to determine if laryngeal electromyography data provided any additional information that significantly influenced patient management. Between February 2006 and February 2007, 10 children referred to our institution with vocal fold immobility underwent intraoperative laryngeal electromyography of the thyroarytenoid muscles. A retrospective chart review of these 10 patients was performed after institutional review board approval. Standard nerve integrity monitoring electrodes can be used to perform intraoperative laryngeal electromyography of the thyroarytenoid muscles in children. In 5 of 10 cases reviewed, data from laryngeal electromyography recordings meaningfully influenced the care of children with vocal fold immobility and affected clinical decision-making, sometimes altering management strategies. In the remaining 5 children, data supported clinical impressions but did not alter treatment plans. Two children with idiopathic bilateral vocal fold paralysis initially presented with a lack of electrical activity on one or both sides but went on to develop motor unit action potentials that preceded recovery of motion in both vocal folds. Our findings suggest that standard nerve monitoring equipment can be used to perform intraoperative laryngeal electromyography and that electromyographic data can assist clinicians in the management of complex patients. Additionally, there may be a role for the use of serial intraoperative measurements in predicting recovery from vocal fold paralysis in the pediatric age group.

  17. Seasonal plasticity of auditory saccular sensitivity in the vocal plainfin midshipman fish, Porichthys notatus.

    PubMed

    Sisneros, Joseph A

    2009-08-01

    The plainfin midshipman fish, Porichthys notatus, is a seasonally breeding species of marine teleost fish that generates acoustic signals for intraspecific social and reproductive-related communication. Female midshipman use the inner ear saccule as the main acoustic endorgan for hearing to detect and locate vocalizing males that produce multiharmonic advertisement calls during the breeding season. Previous work showed that the frequency sensitivity of midshipman auditory saccular afferents changed seasonally with female reproductive state such that summer reproductive females became better suited than winter nonreproductive females to encode the dominant higher harmonics of the male advertisement calls. The focus of this study was to test the hypothesis that seasonal reproductive-dependent changes in saccular afferent tuning is paralleled by similar changes in saccular sensitivity at the level of the hair-cell receptor. Here, I examined the evoked response properties of midshipman saccular hair cells from winter nonreproductive and summer reproductive females to determine if reproductive state affects the frequency response and threshold of the saccule to behaviorally relevant single tone stimuli. Saccular potentials were recorded from populations of hair cells in vivo while sound was presented by an underwater speaker. Results indicate that saccular hair cells from reproductive females had thresholds that were approximately 8 to 13 dB lower than nonreproductive females across a broad range of frequencies that included the dominant higher harmonic components and the fundamental frequency of the male's advertisement call. These seasonal-reproductive-dependent changes in thresholds varied differentially across the three (rostral, middle, and caudal) regions of the saccule. Such reproductive-dependent changes in saccule sensitivity may represent an adaptive plasticity of the midshipman auditory sense to enhance mate detection, recognition, and localization during the breeding season.

  18. Sensory-motor interactions for vocal pitch monitoring in non-primary human auditory cortex.

    PubMed

    Greenlee, Jeremy D W; Behroozmand, Roozbeh; Larson, Charles R; Jackson, Adam W; Chen, Fangxiang; Hansen, Daniel R; Oya, Hiroyuki; Kawasaki, Hiroto; Howard, Matthew A

    2013-01-01

    The neural mechanisms underlying processing of auditory feedback during self-vocalization are poorly understood. One technique used to study the role of auditory feedback involves shifting the pitch of the feedback that a speaker receives, known as pitch-shifted feedback. We utilized a pitch shift self-vocalization and playback paradigm to investigate the underlying neural mechanisms of audio-vocal interaction. High-resolution electrocorticography (ECoG) signals were recorded directly from auditory cortex of 10 human subjects while they vocalized and received brief downward (-100 cents) pitch perturbations in their voice auditory feedback (speaking task). ECoG was also recorded when subjects passively listened to playback of their own pitch-shifted vocalizations. Feedback pitch perturbations elicited average evoked potential (AEP) and event-related band power (ERBP) responses, primarily in the high gamma (70-150 Hz) range, in focal areas of non-primary auditory cortex on superior temporal gyrus (STG). The AEPs and high gamma responses were both modulated by speaking compared with playback in a subset of STG contacts. From these contacts, a majority showed significant enhancement of high gamma power and AEP responses during speaking while the remaining contacts showed attenuated response amplitudes. The speaking-induced enhancement effect suggests that engaging the vocal motor system can modulate auditory cortical processing of self-produced sounds in such a way as to increase neural sensitivity for feedback pitch error detection. It is likely that mechanisms such as efference copies may be involved in this process, and modulation of AEP and high gamma responses implies that such modulatory effects may affect different cortical generators within distinctive functional networks that drive voice production and control.
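
    For readers unfamiliar with the cents scale used above: a shift of c cents corresponds to a frequency ratio of 2**(c/1200), so the -100 cent perturbation lowers the feedback pitch by one semitone, roughly 5.6%. A quick check (the 220 Hz voice pitch is a hypothetical example):

        # A shift of c cents maps to a frequency ratio of 2 ** (c / 1200).
        cents = -100                      # the perturbation used in the study
        ratio = 2 ** (cents / 1200)
        print("ratio = %.4f" % ratio)     # ~0.9439, about a 5.6% drop
        print("220.0 Hz -> %.1f Hz" % (220.0 * ratio))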

  20. JS-X syndrome: A multiple congenital malformation with vocal cord paralysis, ear deformity, hearing loss, shoulder musculature underdevelopment, and X-linked recessive inheritance.

    PubMed

    Hoeve, Hans L J; Brooks, Alice S; Smit, Liesbeth S

    2015-07-01

    We report on a family with a previously undescribed multiple congenital malformation. Several male family members suffer from laryngeal obstruction caused by bilateral vocal cord paralysis, outer and middle ear deformity with conductive and sensorineural hearing loss, facial dysmorphisms, and underdeveloped shoulder musculature. The affected female members have only middle ear deformity and hearing loss. The pedigree is suggestive of an X-linked recessive inheritance pattern. SNP-array analysis revealed a deletion and duplication on Xq28 in the affected family members. A possible aetiology is a neurocristopathy, with most symptoms expressed in structures derived from the branchial arches. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  1. Song evolution, speciation, and vocal learning in passerine birds.

    PubMed

    Mason, Nicholas A; Burns, Kevin J; Tobias, Joseph A; Claramunt, Santiago; Seddon, Nathalie; Derryberry, Elizabeth P

    2017-03-01

    Phenotypic divergence can promote reproductive isolation and speciation, suggesting a possible link between rates of phenotypic evolution and the tempo of speciation at multiple evolutionary scales. To date, most macroevolutionary studies of diversification have focused on morphological traits, whereas behavioral traits, including vocal signals, are rarely considered. Thus, although behavioral traits often mediate mate choice and gene flow, we have a limited understanding of how behavioral evolution contributes to diversification. Furthermore, the developmental mode by which behavioral traits are acquired may affect rates of behavioral evolution, although this hypothesis is seldom tested in a phylogenetic framework. Here, we examine evidence for rate shifts in vocal evolution and speciation across two major radiations of codistributed passerines: one oscine clade with learned songs (Thraupidae) and one suboscine clade with innate songs (Furnariidae). We find that evolutionary bursts in rates of speciation and song evolution are coincident in both thraupids and furnariids. Further, overall rates of vocal evolution are higher among taxa with learned rather than innate songs. Taken together, these findings suggest an association between macroevolutionary bursts in speciation and vocal evolution, and that the tempo of behavioral evolution can be influenced by variation in developmental modes among lineages. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  2. An acoustic analysis of laughter produced by congenitally deaf and normally hearing college students

    PubMed Central

    Makagon, Maja M.; Funayama, E. Sumie; Owren, Michael J.

    2008-01-01

    Relatively few empirical data are available concerning the role of auditory experience in nonverbal human vocal behavior, such as laughter production. This study compared the acoustic properties of laughter in 19 congenitally, bilaterally, and profoundly deaf college students and in 23 normally hearing control participants. Analyses focused on degree of voicing, mouth position, air-flow direction, temporal features, relative amplitude, fundamental frequency, and formant frequencies. Results showed that laughter produced by the deaf participants was fundamentally similar to that produced by the normally hearing individuals, which in turn was consistent with previously reported findings. Finding comparable acoustic properties in the sounds produced by deaf and hearing vocalizers confirms the presumption that laughter is importantly grounded in human biology, and that auditory experience with this vocalization is not necessary for it to emerge in species-typical form. Some differences were found between the laughter of deaf and hearing groups; the most important being that the deaf participants produced lower-amplitude and longer-duration laughs. These discrepancies are likely due to a combination of the physiological and social factors that routinely affect profoundly deaf individuals, including low overall rates of vocal fold use and pressure from the hearing world to suppress spontaneous vocalizations. PMID:18646991

  3. PubMed Central

    SCHINDLER, A.; MOZZANICA, F.; GINOCCHIO, D.; MARUZZI, P.; ATAC, M.; OTTAVIANI, F.

    2012-01-01

    Benign vocal fold lesions are common in the general population, and have important public health implications and impact on patient quality of life. Nowadays, phonomicrosurgery is the most common treatment for these lesions. Voice therapy is generally associated with surgery in order to minimize detrimental vocal behaviours that increase the stress at the mid-membranous vocal folds. Nonetheless, the most appropriate standard of care for treating benign vocal fold lesions has not been established. The aim of this study was to analyze voice changes in a group of dysphonic patients affected by benign vocal fold lesions, evaluated with a multidimensional protocol before and after voice therapy. Sixteen consecutive patients, 12 females and 4 males, with a mean age of 49.7 years were enrolled. Each subject had 10 voice therapy sessions with an experienced speech/language pathologist over a period of 1-2 months, and was evaluated before and at the end of voice therapy with a multidimensional protocol that included self-assessment measures and videostroboscopic, perceptual, aerodynamic and acoustic ratings. Videostroboscopic examination did not reveal resolution of the initial pathology in any case, and no improvement was observed in aerodynamic or perceptual ratings. A clear and significant improvement was visible on the Wilcoxon signed-rank test for the mean values of Jitt%, Noise-to-Harmonic Ratio (NHR) and Voice Handicap Index (VHI) scores. Even if it is possible that, for benign vocal fold lesions, only a minor improvement of voice quality can be achieved after voice therapy, rehabilitation treatment still seems useful, as demonstrated by the improvement in self-assessment measures. If voice therapy is provided as an initial treatment to patients with benign vocal fold lesions, it may lead to an improvement in perceived voice quality, making surgical intervention unnecessary. This is one of the first reports on the efficacy of voice therapy in the management of benign vocal fold lesions; further studies are needed to confirm these preliminary data. PMID:23326009

  4. Cultural relativity in perceiving emotion from vocalizations

    PubMed Central

    Gendron, Maria; Roberson, Debi; van der Vyver, Jacoba Marietta; Barrett, Lisa Feldman

    2014-01-01

    A central question in the study of human behavior is whether or not certain emotions, such as anger, fear and sadness, are recognized across cultures in non-verbal cues. We predicted and found that in a concept-free experimental task, participants from an isolated cultural context (the Himba ethnic group from Northwest Namibia) do not freely label Western vocalizations with expected emotion terms. Responses indicated Himba participants perceived more basic affective properties of valence (positivity or negativity) and to some extent arousal (high or low activation). In a second concept-embedded task, we manipulated whether a given trial could be solved using only affective content or discrete emotion content based on the foil choice. Above chance accuracy in Himba participants occurred only when foils differed from targets in valence, indicating that the voice can reliably convey affective meaning across cultures, but that perceptions of emotion from the voice are culturally variable. PMID:24501109

  5. The Lombard Effect in Choral Singing

    NASA Astrophysics Data System (ADS)

    Tonkinson, Steven E.

    The Lombard effect is a phenomenon in which a speaker or singer involuntarily raises his or her vocal intensity in the presence of high levels of sound. Many Lombard studies have been published in relation to the speaking voice but very little on the singing voice. A strong reliance upon auditory feedback seems to be a key factor in causing the Lombard effect, and research has suggested that singers with more experience and training, especially soloists, do not rely as much on auditory feedback. The purpose of this study was to compare selected vocal intensity response level readings of adult singers, with varying amounts of training, before and after verbal instructions to resist the Lombard effect in choral singing. Choral singers seem especially susceptible because of the nature of the choral environment, with its strong masking effect, and because of a relative lack of training in voice management. Twenty-seven subjects were asked to sing the national anthem along with a choir heard through headphones. After brief instructions to resist increasing vocal intensity as the choir's intensity increased, each subject sang once again. The performances were recorded, and vocal intensity (dB SPL) readings from selected places in the song were obtained from a graphic level recorder chart and analyzed for statistical significance. A 3 x 3 x 2 multiple analysis of variance procedure was performed on the scores, the main factors being experience level, pretest-posttest differences, and places in the song. The questions to be answered by the study were: (1) Do varying levels of experience significantly affect the Lombard effect in choral singing? and (2) Do instructions to maintain a constant level of vocal intensity significantly affect the Lombard effect in singers of varying levels of experience? No significant difference (.05 level) for levels of experience was observed. The effect of the instructions, however, was significant (p < .05) and suggested that choral singers can learn to resist the Lombard effect and consciously regulate their vocal intensity to some extent in the face of masking sound. Choral directors are encouraged to help singers in this regard and to inculcate principles of good voice management.

  6. Rhythm generation, coordination, and initiation in the vocal pathways of male African clawed frogs

    PubMed Central

    Cavin Barnes, Jessica; Appleby, Todd

    2016-01-01

    Central pattern generators (CPGs) in the brain stem are considered to underlie vocalizations in many vertebrate species, but the detailed mechanisms underlying how motor rhythms are generated, coordinated, and initiated remain unclear. We addressed these issues using isolated brain preparations of Xenopus laevis from which fictive vocalizations can be elicited. Advertisement calls of male X. laevis that consist of fast and slow trills are generated by vocal CPGs contained in the brain stem. Brain stem central vocal pathways consist of a premotor nucleus [dorsal tegmental area of medulla (DTAM)] and a laryngeal motor nucleus [a homologue of nucleus ambiguus (n.IX-X)] with extensive reciprocal connections between the nuclei. In addition, DTAM receives descending inputs from the extended amygdala. We found that unilateral transection of the projections between DTAM and n.IX-X eliminated premotor fictive fast trill patterns but did not affect fictive slow trills, suggesting that the fast and slow trill CPGs are distinct; the slow trill CPG is contained in n.IX-X, and the fast trill CPG spans DTAM and n.IX-X. Midline transections that eliminated the anterior, posterior, or both commissures caused no change in the temporal structure of fictive calls, but bilateral synchrony was lost, indicating that the vocal CPGs are contained in the lateral halves of the brain stem and that the commissures synchronize the two oscillators. Furthermore, the elimination of the inputs from extended amygdala to DTAM, in addition to the anterior commissure, resulted in autonomous initiation of fictive fast but not slow trills by each hemibrain stem, indicating that the extended amygdala provides a bilateral signal to initiate fast trills. NEW & NOTEWORTHY Central pattern generators (CPGs) are considered to underlie vocalizations in many vertebrate species, but the detailed mechanisms underlying their functions remain unclear. We addressed this question using an isolated brain preparation of African clawed frogs. We discovered that two vocal phases are mediated by anatomically distinct CPGs, that there are a pair of CPGs contained in the left and right half of the brain stem, and that mechanisms underlying initiation of the two vocal phases are distinct. PMID:27760822

  7. Competitive pressures affect sexual signal complexity in Kurixalus odontotarsus: insights into the evolution of compound calls

    PubMed Central

    2017-01-01

    Male-male vocal competition in anuran species is critical for mating success; however, it is also energetically demanding and highly time-consuming. Thus, we hypothesized that males may change signal elaboration in response to competition in real time. Male serrate-legged small treefrogs (Kurixalus odontotarsus) produce compound calls that contain two kinds of notes: harmonic sounds called 'A notes' and short broadband sounds called 'B notes'. Using male evoked vocal response experiments, we found that competition influences the temporal structure and complexity of vocal signals produced by males. Males produce calls with a higher note-to-call ratio, and more compound calls including more A notes but fewer B notes, as contests escalate. In doing so, males minimize the energy costs and maximize the benefits of competition when the level of competition is high. This means that the evolution of sexual signal complexity in frogs may be susceptible to selection for plasticity related to adjusting performance to the pressures of competition, and supports the idea that more complex social contexts can lead to greater vocal complexity. PMID:29175862

  8. Aerosol emission during human speech

    NASA Astrophysics Data System (ADS)

    Asadi, Sima; Wexler, Anthony S.; Cappa, Christopher D.; Bouvier, Nicole M.; Barreda-Castanon, Santiago; Ristenpart, William D.

    2017-11-01

    We show that the rate of aerosol particle emission during healthy human speech is strongly correlated with the loudness (amplitude) of vocalization. Emission rates range from approximately 1 to 50 particles per second for quiet to loud amplitudes, regardless of language spoken (English, Spanish, Mandarin, or Arabic). Intriguingly, a small fraction of individuals behave as "super emitters," consistently emitting an order of magnitude more aerosol particles than their peers. We interpret the results in terms of the egressive flow rate during vocalization, which is known to vary significantly for different types of vocalization and for different individuals. The results suggest that individual speech patterns could affect the probability of airborne disease transmission. The results also provide a possible explanation for the existence of "super spreaders" who transmit pathogens much more readily than average and who play a key role in the spread of epidemics.

  9. Applying Affect Recognition in Serious Games: The PlayMancer Project

    NASA Astrophysics Data System (ADS)

    Ben Moussa, Maher; Magnenat-Thalmann, Nadia

    This paper presents an overview of the state of the art in applying affect recognition in serious games for the support of patients in behavioral and mental disorder treatments and chronic pain rehabilitation, within the framework of the European project PlayMancer. Three key technologies are discussed, relating to facial affect recognition, fusion of different affect recognition methods, and the application of affect recognition in serious games.

  10. A Multidimensional Approach to the Study of Emotion Recognition in Autism Spectrum Disorders

    PubMed Central

    Xavier, Jean; Vignaud, Violaine; Ruggiero, Rosa; Bodeau, Nicolas; Cohen, David; Chaby, Laurence

    2015-01-01

    Although deficits in emotion recognition have been widely reported in autism spectrum disorder (ASD), experiments have been restricted to either facial or vocal expressions. Here, we explored multimodal emotion processing in children with ASD (N = 19) and with typical development (TD, N = 19), considering unimodal (faces or voices) and multimodal (faces and voices simultaneously) stimuli and developmental comorbidities (neuro-visual, language and motor impairments). Compared to TD controls, children with ASD had rather high and heterogeneous emotion recognition scores but also showed several significant differences: lower emotion recognition scores for visual stimuli, for neutral emotion, and a greater number of saccades during the visual task. Multivariate analyses showed that: (1) the difficulties they experienced with visual stimuli were partially alleviated with multimodal stimuli. (2) Developmental age was significantly associated with emotion recognition in TD children, whereas this was the case only for the multimodal task in children with ASD. (3) Language impairments tended to be associated with the emotion recognition scores of ASD children in the auditory modality. Conversely, in the visual or bimodal (visuo-auditory) tasks, no impact of developmental coordination disorder or neuro-visual impairments was found. We conclude that impaired emotion processing constitutes a dimension to explore in the field of ASD, as research has the potential to define more homogeneous subgroups and tailored interventions. However, it is clear that developmental age, the nature of the stimuli, and other developmental comorbidities must also be taken into account when studying this dimension. PMID:26733928

  11. Synthetic, multi-layer, self-oscillating vocal fold model fabrication.

    PubMed

    Murray, Preston R; Thomson, Scott L

    2011-12-02

    Sound for the human voice is produced via flow-induced vocal fold vibration. The vocal folds consist of several layers of tissue, each with differing material properties. Normal voice production relies on healthy tissue and vocal folds, and occurs as a result of complex coupling between aerodynamic, structural dynamic, and acoustic physical phenomena. Voice disorders affect up to 7.5 million people annually in the United States alone and often result in significant financial, social, and other quality-of-life difficulties. Understanding the physics of voice production has the potential to significantly benefit voice care, including clinical prevention, diagnosis, and treatment of voice disorders. Existing methods for studying voice production include in vivo experimentation using human and animal subjects, in vitro experimentation using excised larynges and synthetic models, and computational modeling. Owing to hazardous and difficult instrument access, in vivo experiments are severely limited in scope. Excised larynx experiments have the benefit of anatomical and some physiological realism, but parametric studies involving geometric and material property variables are limited. Further, excised larynges can typically be vibrated only for relatively short periods of time (on the order of minutes). Overcoming some of the limitations of excised larynx experiments, synthetic vocal fold models are emerging as a complementary tool for studying voice production. Synthetic models can be fabricated with systematic changes to geometry and material properties, allowing for the study of healthy and unhealthy human phonatory aerodynamics, structural dynamics, and acoustics. For example, they have been used to study left-right vocal fold asymmetry, clinical instrument development, laryngeal aerodynamics, vocal fold contact pressure, and subglottal acoustics (a more comprehensive list can be found in Kniesburges et al.). Existing synthetic vocal fold models, however, have either been homogeneous (one-layer models) or have been fabricated using two materials of differing stiffness (two-layer models). This approach does not allow for representation of the actual multi-layer structure of the human vocal folds, which plays a central role in governing vocal fold flow-induced vibratory response. Consequently, one- and two-layer synthetic vocal fold models have exhibited disadvantages such as higher onset pressures than are typical for human phonation (onset pressure is the minimum lung pressure required to initiate vibration), unnaturally large inferior-superior motion, and lack of a "mucosal wave" (a vertically-traveling wave that is characteristic of healthy human vocal fold vibration). In this paper, fabrication of a model with multiple layers of differing material properties is described. The model layers simulate the multi-layer structure of the human vocal folds, including epithelium, superficial lamina propria (SLP), intermediate and deep lamina propria (i.e., ligament; a fiber is included for anterior-posterior stiffness), and muscle (i.e., body) layers. Results are included that show that the model exhibits improved vibratory characteristics over prior one- and two-layer synthetic models, including onset pressure closer to human onset pressure, reduced inferior-superior motion, and evidence of a mucosal wave.

  12. Improving Speaker Recognition by Biometric Voice Deconstruction

    PubMed Central

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcibly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a database recorded under highly controlled acoustic conditions and on one recorded over a mobile phone network under non-controlled acoustic conditions. PMID:26442245
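
    The abstract does not give the authors' deconstruction algorithm, but one classical way to split a voice signal into vocal tract and glottal source estimates is linear-predictive (LPC) inverse filtering, sketched below as an illustration under that assumption. The frame here is random noise standing in for a voiced segment.

    ```python
    import numpy as np
    from scipy.linalg import solve_toeplitz
    from scipy.signal import lfilter

    # Hedged sketch of a source/filter split via autocorrelation-method LPC:
    # the LPC polynomial approximates the vocal tract, the inverse-filtered
    # residual approximates the glottal source. Not the paper's exact method.

    def lpc_coefficients(frame: np.ndarray, order: int) -> np.ndarray:
        """Autocorrelation-method LPC via a Toeplitz solve."""
        r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        a = solve_toeplitz((r[:order], r[:order]), r[1:order + 1])
        return np.concatenate(([1.0], -a))   # A(z) = 1 - sum a_k z^-k

    def source_filter_split(frame: np.ndarray, order: int = 16):
        """Return (vocal-tract LPC polynomial, glottal-source residual)."""
        a = lpc_coefficients(frame * np.hanning(len(frame)), order)
        residual = lfilter(a, [1.0], frame)  # inverse filter -> source estimate
        return a, residual

    rng = np.random.default_rng(0)
    frame = rng.standard_normal(512)         # stand-in for a 32 ms voiced frame
    a, residual = source_filter_split(frame)
    print(a.shape, residual.shape)
    ```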

  13. Voice recognition products - an occupational risk for users with ULDs?

    PubMed

    Williams, N R

    2003-10-01

    Voice recognition systems (VRS) allow speech to be converted directly into text, which appears on the screen of a computer, and to direct equipment to perform specific functions. Suggested applications are many and varied, including increasing efficiency in the reporting of radiographs, allowing directed surgery and enabling individuals with upper limb disorders (ULDs) who cannot use other input devices, such as keyboards and mice, to carry out word processing and other activities. Aim: This paper describes four cases of vocal dysfunction related to the use of such software, identified from the database of the Voice and Speech Laboratory of the Massachusetts Eye and Ear Infirmary (MEEI). The database was searched using the key words 'voice recognition' and four cases were identified from a total of 4800. In all cases, the VRS was supplied to assist individuals with ULDs who could not use conventional input devices. Case reports illustrate time of onset and symptoms experienced. The cases illustrate the need for risk assessment and consideration of the ergonomic aspects of voice use prior to such adaptations being used, particularly in those who already experience work-related ULDs.

  14. The recognition of female voice based on voice registers in singing techniques in real-time using hankel transform method and macdonald function

    NASA Astrophysics Data System (ADS)

    Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.

    2018-03-01

    A singer does not just recite the lyrics of a song; particular vocal techniques are also used to make it more beautiful. In singing technique, females use a more diverse set of voice registers than males. The human voice has many registers; those used while singing include chest voice, head voice, falsetto, and vocal fry. A system for recognizing the female voice registers used in singing technique, in real time, was built using Borland Delphi 7.0. Recognition was performed both on recorded voice samples and in real time. Each voice input yields weight energy values computed using the Hankel transform method and Macdonald functions. The results showed that the accuracy of the system depends on the quality of the recordings used for training and testing; recognition of voice registers averaged 48.75 percent on recorded samples and 57 percent in real time.
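
    The abstract names the Hankel transform method and Macdonald functions (the modified Bessel functions of the second kind, K_nu) but gives no formulas. The heavily hedged sketch below shows one plausible reading: projecting each frame onto a K_0 kernel and taking the squared projection as its energy weight. The kernel order, scale, and frame size are all assumptions, not the paper's parameters.

    ```python
    import numpy as np
    from scipy.special import kv  # Macdonald function K_nu

    # Speculative sketch of "weight energy values" from a Macdonald-function
    # kernel; one possible reading of the abstract, not the authors' formula.

    def macdonald_energy(frame: np.ndarray, scale: float = 0.01) -> float:
        n = np.arange(1, len(frame) + 1, dtype=float)
        kernel = kv(0, scale * n)       # K_0 decays rapidly with its argument
        return float(np.dot(frame, kernel) ** 2)

    def frame_energies(signal: np.ndarray, frame_len: int = 512) -> np.ndarray:
        usable = len(signal) // frame_len * frame_len
        frames = signal[:usable].reshape(-1, frame_len)
        return np.array([macdonald_energy(f) for f in frames])

    rng = np.random.default_rng(1)
    signal = rng.standard_normal(8000)  # stand-in for a sung vowel
    print(frame_energies(signal)[:5])
    ```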

  15. Improving Speaker Recognition by Biometric Voice Deconstruction.

    PubMed

    Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

    2015-01-01

    Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcibly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with a set of features derived from the components resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description of the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a database recorded under highly controlled acoustic conditions and on one recorded over a mobile phone network under non-controlled acoustic conditions.

  16. Tourette Syndrome

    MedlinePlus

    ... affects a person's central nervous system and causes tics (movements or sounds that a person can't ... over and over). There are two kinds of tics — motor tics and vocal tics . Motor tics are ...

  17. Source levels of social sounds in migrating humpback whales (Megaptera novaeangliae).

    PubMed

    Dunlop, Rebecca A; Cato, Douglas H; Noad, Michael J; Stokes, Dale M

    2013-07-01

    The source level of an animal sound is important in communication, since it affects the distance over which the sound is audible. Several measurements of source levels of whale sounds have been reported, but the accuracy of many is limited because the distance to the source and the acoustic transmission loss were estimated rather than measured. This paper presents measurements of source levels of social sounds (surface-generated and vocal sounds) of humpback whales from a sample of 998 sounds recorded from 49 migrating humpback whale groups. Sources were localized using a wide baseline five hydrophone array and transmission loss was measured for the site. Social vocalization source levels were found to range from 123 to 183 dB re 1 μPa @ 1 m with a median of 158 dB re 1 μPa @ 1 m. Source levels of surface-generated social sounds ("breaches" and "slaps") were narrower in range (133 to 171 dB re 1 μPa @ 1 m) but slightly higher in level (median of 162 dB re 1 μPa @ 1 m) compared to vocalizations. The data suggest that group composition has an effect on group vocalization source levels in that singletons and mother-calf-singing escort groups tend to vocalize at higher levels compared to other group compositions.
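
    The key step the abstract describes, recovering a source level once transmission loss is measured rather than assumed, follows the passive sonar equation SL = RL + TL (all in dB). The sketch below uses a generic logarithmic spreading model and hypothetical numbers; the study fitted a site-specific loss model instead.

    ```python
    import math

    # Minimal sketch of source-level recovery: SL = RL + TL. The k*log10(r)
    # spreading model and all numbers are illustrative assumptions, not the
    # site model fitted in the study.

    def transmission_loss(range_m: float, k: float = 15.0) -> float:
        """Geometric spreading loss in dB; k=20 spherical, k=10 cylindrical."""
        return k * math.log10(range_m)

    def source_level(received_db: float, range_m: float, k: float = 15.0) -> float:
        return received_db + transmission_loss(range_m, k)

    # Hypothetical: a vocalization received at 110 dB re 1 uPa from a whale
    # localized 1500 m from the hydrophone array.
    print(f"SL ~ {source_level(110.0, 1500.0):.0f} dB re 1 uPa @ 1 m")
    ```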

  18. Affective divergence: automatic responses to others' emotions depend on group membership.

    PubMed

    Weisbuch, Max; Ambady, Nalini

    2008-11-01

    Extant research suggests that targets' emotion expressions automatically evoke similar affect in perceivers. The authors hypothesized that the automatic impact of emotion expressions depends on group membership. In Experiments 1 and 2, an affective priming paradigm was used to measure immediate and preconscious affective responses to same-race or other-race emotion expressions. In Experiment 3, spontaneous vocal affect was measured as participants described the emotions of an ingroup or outgroup sports team fan. In these experiments, immediate and spontaneous affective responses depended on whether the emotional target was ingroup or outgroup. Positive responses to fear expressions and negative responses to joy expressions were observed in outgroup perceivers, relative to ingroup perceivers. In Experiments 4 and 5, discrete emotional responses were examined. In a lexical decision task (Experiment 4), facial expressions of joy elicited fear in outgroup perceivers, relative to ingroup perceivers. In contrast, facial expressions of fear elicited less fear in outgroup than in ingroup perceivers. In Experiment 5, felt dominance mediated emotional responses to ingroup and outgroup vocal emotion. These data support a signal-value model in which emotion expressions signal environmental conditions. (c) 2008 APA, all rights reserved.

  19. Cacna1c haploinsufficiency leads to pro-social 50-kHz ultrasonic communication deficits in rats.

    PubMed

    Kisko, Theresa M; Braun, Moria D; Michels, Susanne; Witt, Stephanie H; Rietschel, Marcella; Culmsee, Carsten; Schwarting, Rainer K W; Wöhr, Markus

    2018-06-20

    The cross-disorder risk gene CACNA1C is strongly implicated in multiple neuropsychiatric disorders, including autism spectrum disorder (ASD), bipolar disorder (BPD) and schizophrenia (SCZ), with deficits in social functioning being common for all major neuropsychiatric disorders. In the present study, we explored the role of Cacna1c in regulating disorder-relevant behavioral phenotypes, focusing on socio-affective communication after weaning during the critical developmental period of adolescence in rats. To this aim, we used a newly developed genetic Cacna1c rat model and applied a truly reciprocal approach for studying communication through ultrasonic vocalizations, including both sender and receiver. Our results show that a deletion of Cacna1c leads to deficits in social behavior and pro-social 50-kHz ultrasonic communication in rats. Reduced levels of 50-kHz ultrasonic vocalizations emitted during rough-and-tumble play may suggest that Cacna1c haploinsufficient rats derive less reward from playful social interactions. Besides the emission of fewer 50-kHz ultrasonic vocalizations in the sender, Cacna1c deletion reduced social approach behavior elicited by playback of 50-kHz ultrasonic vocalizations. This indicates that Cacna1c haploinsufficiency has detrimental effects on 50-kHz ultrasonic communication in both sender and receiver. Together, these data suggest that Cacna1c plays a prominent role in regulating socio-affective communication in rats with relevance for ASD, BPD and SCZ. This article has an associated First Person interview with the first author of the paper. © 2018. Published by The Company of Biologists Ltd.

  20. Perceptual Detection of Subtle Dysphonic Traits in Individuals with Cervical Spinal Cord Injury Using an Audience Response Systems Approach.

    PubMed

    Johansson, Kerstin; Strömbergsson, Sofia; Robieux, Camille; McAllister, Anita

    2017-01-01

    Reduced respiratory function following lower cervical spinal cord injuries (CSCIs) may indirectly result in vocal dysfunction. Although self-reports indicate voice change and limitations following CSCI, earlier efforts using global perceptual ratings to distinguish speakers with CSCI from noninjured speakers have not been very successful. We investigate the use of an audience response system-based approach to distinguish speakers with CSCI from noninjured speakers, and explore whether specific vocal traits can be identified as characteristic for speakers with CSCI. Fourteen speech-language pathologists participated in a web-based perceptual task, where their overt reactions to vocal dysfunction were registered during the continuous playback of recordings of 36 speakers (18 with CSCI, and 18 matched controls). Dysphonic events were identified through manual perceptual analysis, to allow the exploration of connections between dysphonic events and listener reactions. More dysphonic events, and more listener reactions, were registered for speakers with CSCI than for noninjured speakers. Strain (particularly in phrase-final position) and creak (particularly in nonphrase-final position) distinguish speakers with CSCI from noninjured speakers. For the identification of intermittent and subtle signs of vocal dysfunction, an approach where the temporal distribution of symptoms is registered offers a viable means to distinguish speakers affected by voice dysfunction from non-affected speakers. In speakers with CSCI, clinicians should listen for presence of final strain and nonfinal creak, and pay attention to self-reported voice function and voice problems, to identify individuals in need of clinical assessment and intervention. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  1. Emotion recognition in early Parkinson's disease patients undergoing deep brain stimulation or dopaminergic therapy: a comparison to healthy participants.

    PubMed

    McIntosh, Lindsey G; Mannava, Sishir; Camalier, Corrie R; Folley, Bradley S; Albritton, Aaron; Konrad, Peter E; Charles, David; Park, Sohee; Neimat, Joseph S

    2014-01-01

    Parkinson's disease (PD) is traditionally regarded as a neurodegenerative movement disorder, however, nigrostriatal dopaminergic degeneration is also thought to disrupt non-motor loops connecting basal ganglia to areas in frontal cortex involved in cognition and emotion processing. PD patients are impaired on tests of emotion recognition, but it is difficult to disentangle this deficit from the more general cognitive dysfunction that frequently accompanies disease progression. Testing for emotion recognition deficits early in the disease course, prior to cognitive decline, better assesses the sensitivity of these non-motor corticobasal ganglia-thalamocortical loops involved in emotion processing to early degenerative change in basal ganglia circuits. In addition, contrasting this with a group of healthy aging individuals demonstrates changes in emotion processing specific to the degeneration of basal ganglia circuitry in PD. Early PD patients (EPD) were recruited from a randomized clinical trial testing the safety and tolerability of deep brain stimulation (DBS) of the subthalamic nucleus (STN-DBS) in early-staged PD. EPD patients were previously randomized to receive optimal drug therapy only (ODT), or drug therapy plus STN-DBS (ODT + DBS). Matched healthy elderly controls (HEC) and young controls (HYC) also participated in this study. Participants completed two control tasks and three emotion recognition tests that varied in stimulus domain. EPD patients were impaired on all emotion recognition tasks compared to HEC. Neither therapy type (ODT or ODT + DBS) nor therapy state (ON/OFF) altered emotion recognition performance in this study. Finally, HEC were impaired on vocal emotion recognition relative to HYC, suggesting a decline related to healthy aging. This study supports the existence of impaired emotion recognition early in the PD course, implicating an early disruption of fronto-striatal loops mediating emotional function.

  2. Inconsistent emotion recognition deficits across stimulus modalities in Huntington's disease.

    PubMed

    Rees, Elin M; Farmer, Ruth; Cole, James H; Henley, Susie M D; Sprengelmeyer, Reiner; Frost, Chris; Scahill, Rachael I; Hobbs, Nicola Z; Tabrizi, Sarah J

    2014-11-01

    Recognition of negative emotions is impaired in Huntington's Disease (HD). It is unclear whether these emotion-specific problems are driven by dissociable cognitive deficits, emotion complexity, test cue difficulty, or visuoperceptual impairments. This study set out to further characterise emotion recognition in HD by comparing patterns of deficits across stimulus modalities; notably including for the first time in HD, the more ecologically and clinically relevant modality of film clips portraying dynamic facial expressions. Fifteen early HD and 17 control participants were tested on emotion recognition from static facial photographs, non-verbal vocal expressions and one-second dynamic film clips, all depicting different emotions. Statistically significant evidence of impairment of anger, disgust and fear recognition was seen in HD participants compared with healthy controls across multiple stimulus modalities. The extent of the impairment, as measured by the difference in the number of errors made between HD participants and controls, differed according to the combination of emotion and modality (p=0.013, interaction test). The largest between-group difference was seen in the recognition of anger from film clips. Consistent with previous reports, anger, disgust and fear were the most poorly recognised emotions by the HD group. This impairment did not appear to be due to task demands or expression complexity as the pattern of between-group differences did not correspond to the pattern of errors made by either group; implicating emotion-specific cognitive processing pathology. There was however evidence that the extent of emotion recognition deficits significantly differed between stimulus modalities. The implications in terms of designing future tests of emotion recognition and care giving are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. ChoiceKey: a real-time speech recognition program for psychology experiments with a small response set.

    PubMed

    Donkin, Christopher; Brown, Scott D; Heathcote, Andrew

    2009-02-01

    Psychological experiments often collect choice responses using buttonpresses. However, spoken responses are useful in many cases; for example, when working with special clinical populations, when a paradigm demands vocalization, or when accurate response time measurements are desired. In these cases, spoken responses are typically collected using a voice key, which usually involves manual coding by experimenters in a tedious and error-prone manner. We describe ChoiceKey, an open-source speech recognition package for MATLAB. It can be optimized by training for small response sets and different speakers. We show ChoiceKey to be reliable with minimal training for most participants in experiments with two different responses. Problems presented by individual differences, and occasional atypical responses, are examined, and extensions to larger response sets are explored. The ChoiceKey source files and instructions may be downloaded as supplemental materials for this article from brm.psychonomic-journals.org/content/supplemental.
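
    ChoiceKey itself is a MATLAB package and its algorithm is not described in the abstract; the sketch below only illustrates the two jobs such a tool performs, response-time detection from an amplitude threshold and classification of a small spoken response set, here via naive template correlation. All signals, thresholds, and templates are invented for illustration.

    ```python
    import numpy as np

    # Generic voice-key sketch: onset detection for reaction time, plus a
    # naive template classifier for a two-word response set. Not ChoiceKey's
    # actual MATLAB algorithm.

    def response_time(signal: np.ndarray, fs: int, thresh: float = 0.1,
                      win: int = 256) -> float:
        """Return onset time (s) where smoothed RMS first crosses thresh."""
        rms = np.sqrt(np.convolve(signal ** 2, np.ones(win) / win, mode="same"))
        onsets = np.flatnonzero(rms > thresh)
        return onsets[0] / fs if onsets.size else float("nan")

    def classify(signal: np.ndarray, templates: dict) -> str:
        """Pick the template with the highest normalized correlation."""
        def score(t):
            n = min(len(signal), len(t))
            a, b = signal[:n], t[:n]
            return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        return max(templates, key=lambda k: score(templates[k]))

    fs = 16000
    rng = np.random.default_rng(2)
    sig = np.concatenate([np.zeros(4000), rng.standard_normal(8000)])  # fake utterance
    templates = {"yes": rng.standard_normal(8000), "no": rng.standard_normal(8000)}
    print(response_time(sig, fs), classify(sig[4000:], templates))
    ```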

  4. Vocal fold proteoglycans and their influence on biomechanics.

    PubMed

    Gray, S D; Titze, I R; Chan, R; Hammond, T H

    1999-06-01

    To examine the interstitial proteins of the vocal fold and their influence on the biomechanical properties of that tissue. An anatomic study of the lamina propria of human cadaveric vocal folds combined with some viscosity testing. Identification of proteoglycans is performed with histochemical staining. Quantitative analysis is performed using an image analysis system. A rheometer is used for viscosity testing. A three-dimensional rendering program is used for the computer images. Proteoglycans play an important role in tissue biomechanics. Hyaluronic acid is a key molecule that affects viscosity. The proteoglycans of the lamina propria have important biological and biomechanical effects. The role of hyaluronic acid in determining tissue viscosity is emphasized. Viscosity, its effect on phonatory threshold pressure, and the energy expended due to phonation are discussed. Proteoglycans, particularly hyaluronic acid, play important roles in determining biomechanical properties of tissue oscillation. Future research will likely make these proteins of important therapeutic interest.

  5. Persistent dysphonia in two performers affecting the singing and projected speaking voice: a report on a collaborative approach to management.

    PubMed

    Baker, Janet

    2002-01-01

    The projected speaking voice and the singing voice are highly sensitive to external and internal influences, and teachers of spoken voice and singing are in a unique position to identify subtle and more serious vocal difficulties in their students. Persistent anomalies may herald early onset of changes in vocal fold structure, neurophysiological control, or emotional stability. Two cases are presented to illustrate the benefits of a collaborative approach to diagnosis and management. The first, a 21-year-old male drama and singing student with an abnormally high speaking voice and falsetto singing voice was found to have a psychogenic dysphonia referred to as "puberphonia" or "mutational falsetto". The second, a 34-year-old female alto with strained phonation and perceived stutter of the vocal folds was diagnosed with "adductor spasmodic dysphonia" or "focal laryngeal dystonia" of neurological origin.

  6. Recording vocalizations with Bluetooth technology.

    PubMed

    Gaona-González, Andrés; Santillán-Doherty, Ana María; Arenas-Rosas, Rita Virginia; Muñoz-Delgado, Jairo; Aguillón-Pantaleón, Miguel Angel; Ordoñez-Gómez, José Domingo; Márquez-Arias, Alejandra

    2011-06-01

    We propose a method for capturing vocalizations that is designed to avoid some of the limiting factors found in traditional bioacoustical methods, such as the impossibility of obtaining continuous long-term registers or analyzing amplitude due to the continuous change of distance between the subject and the position of the recording system. Using Bluetooth technology, vocalizations are captured and transmitted wirelessly into a receiving system without affecting the quality of the signal. The recordings of the proposed system were compared to reference recordings based on coding of the signal with the pulse-code modulation technique in WAV audio format without any compression. The evaluation showed p < .05 for the measured quantitative and qualitative parameters. We also describe how the transmitting system is encapsulated and fixed on the animal and a way to video record a spider monkey's behavior simultaneously with the audio recordings.

  7. In vitro experimental investigation of voice production

    PubMed Central

    Horáček, Jaromír; Brücker, Christoph; Becker, Stefan

    2012-01-01

    The process of human phonation involves a complex interaction between the physical domains of structural dynamics, fluid flow, and acoustic sound production and radiation. Given the high degree of nonlinearity of these processes, even small anatomical or physiological disturbances can significantly affect the voice signal. In the worst cases, patients can lose their voice and hence the normal mode of speech communication. To improve medical therapies and surgical techniques it is very important to understand better the physics of the human phonation process. Due to the limited experimental access to the human larynx, alternative strategies, including artificial vocal folds, have been developed. The following review gives an overview of experimental investigations of artificial vocal folds within the last 30 years. The models are sorted into three groups: static models, externally driven models, and self-oscillating models. The focus is on the different models of the human vocal folds and on the ways in which they have been applied. PMID:23181007

  8. Clinical dysphagia risk predictors after prolonged orotracheal intubation

    PubMed Central

    de Medeiros, Gisele Chagas; Sassi, Fernanda Chiarion; Mangilli, Laura Davison; Zilberstein, Bruno; de Andrade, Claudia Regina Furquim

    2014-01-01

    OBJECTIVES: To elucidate independent risk factors for dysphagia after prolonged orotracheal intubation. METHODS: The participants were 148 consecutive patients who underwent clinical bedside swallowing assessments from September 2009 to September 2011. All patients had received prolonged orotracheal intubations and were admitted to one of several intensive care units of a large Brazilian school hospital. The correlations between the conducted water swallow test results and dysphagia risk levels were analyzed for statistical significance. RESULTS: Of the 148 patients included in the study, 91 were male and 57 were female (mean age, 53.64 years). The univariate analysis results indicated that specific variables, including extraoral loss, multiple swallows, cervical auscultation, vocal quality, cough, choking, and other signs, were possible significant high-risk indicators of dysphagia onset. The multivariate analysis results indicated that cervical auscultation and coughing were independent predictive variables for high dysphagia risk. CONCLUSIONS: Patients displaying extraoral loss, multiple swallows, cervical auscultation, vocal quality, cough, choking and other signs should benefit from early swallowing evaluations. Additionally, early post-extubation dysfunction recognition is paramount in reducing the morbidity rate in this high-risk population. PMID:24473554

  9. Clinical dysphagia risk predictors after prolonged orotracheal intubation.

    PubMed

    Medeiros, Gisele Chagas de; Sassi, Fernanda Chiarion; Mangilli, Laura Davison; Zilberstein, Bruno; Andrade, Claudia Regina Furquim de

    2014-01-01

    To elucidate independent risk factors for dysphagia after prolonged orotracheal intubation. The participants were 148 consecutive patients who underwent clinical bedside swallowing assessments from September 2009 to September 2011. All patients had received prolonged orotracheal intubations and were admitted to one of several intensive care units of a large Brazilian school hospital. The correlations between the conducted water swallow test results and dysphagia risk levels were analyzed for statistical significance. Of the 148 patients included in the study, 91 were male and 57 were female (mean age, 53.64 years). The univariate analysis results indicated that specific variables, including extraoral loss, multiple swallows, cervical auscultation, vocal quality, cough, choking, and other signs, were possible significant high-risk indicators of dysphagia onset. The multivariate analysis results indicated that cervical auscultation and coughing were independent predictive variables for high dysphagia risk. Patients displaying extraoral loss, multiple swallows, cervical auscultation, vocal quality, cough, choking and other signs should benefit from early swallowing evaluations. Additionally, early post-extubation dysfunction recognition is paramount in reducing the morbidity rate in this high-risk population.

  10. Oral and vocal fold diadochokinesis in dysphonic women.

    PubMed

    Louzada, Talita; Beraldinelle, Roberta; Berretin-Felix, Giédre; Brasolotto, Alcione Ghedini

    2011-01-01

    The evaluation of oral and vocal fold diadochokinesis (DDK) in individuals with voice disorders may contribute to the understanding of factors that affect balanced vocal production. Scientific studies that make use of this assessment tool support the advance of knowledge in this area, reflecting the development of more appropriate therapeutic planning. To compare the results of oral and vocal fold DDK in dysphonic women and in women without vocal disorders. For this study, 28 voice recordings of women aged 19 to 54 years, diagnosed with dysphonia and submitted to voice assessment by a speech pathologist and an otorhinolaryngologist, were used. The control group included 30 nondysphonic women evaluated in prior research on normal adults. Analysis parameters such as the number and duration of emissions, as well as the regularity of the repetition of the syllables "pa", "ta", and "ka" and the vowels "a" and "i", were provided by the Advanced Motor Speech Profile program (MSP) Model-5141, version-2.5.2 (KayPentax). The DDK sequence "pataka" was analyzed quantitatively with the Sound Forge 7.0 program, as well as manually with the audiovisual help of sound waves. Average oral and vocal fold DDK values of dysphonic and nondysphonic women were compared using Student's t test and were considered significant when p<0.05. The findings showed no significant differences between populations; however, the coefficient of variation of period (CvP) and jitter of period (JittP) averages of the "ka," "a," and "i" emissions, respectively, were higher in dysphonic women (CvP=10.42%, 12.79%, 12.05%; JittP=2.05%, 6.05%, 3.63%) compared to the control group (CvP=8.86%, 10.95%, 11.20%; JittP=1.82%, 2.98%, 3.15%). Although the results do not indicate any difficulties in oral and laryngeal motor control in the dysphonic group, the greater instability of vocal fold DDK in the experimental group should be considered, and studies of this ability in individuals with communication disorders must be intensified.
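
    The two period-based measures reported above have common textbook definitions, sketched below: CvP is the coefficient of variation of the periods and JittP is the mean absolute period-to-period difference, both as percentages of the mean period. MSP's exact formulas are not given in the abstract, so these definitions and the example periods are assumptions.

    ```python
    import numpy as np

    # Common definitions of the DDK period measures (assumed, not MSP's spec):
    # CvP   = 100 * std(T) / mean(T)
    # JittP = 100 * mean(|T_i - T_{i+1}|) / mean(T)
    # where T are the inter-syllable periods of the "pa-ta-ka" train.

    def cvp(periods: np.ndarray) -> float:
        return 100.0 * periods.std() / periods.mean()

    def jitt_p(periods: np.ndarray) -> float:
        return 100.0 * np.abs(np.diff(periods)).mean() / periods.mean()

    periods = np.array([0.155, 0.162, 0.150, 0.171, 0.158])  # s, illustrative
    print(f"CvP = {cvp(periods):.2f}%  JittP = {jitt_p(periods):.2f}%")
    ```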

  11. Dynamic Spectral Structure Specifies Vowels for Adults and Children

    PubMed Central

    Nittrouer, Susan; Lowenstein, Joanna H.

    2014-01-01

    The dynamic specification account of vowel recognition suggests that formant movement between vowel targets and consonant margins is used by listeners to recognize vowels. This study tested that account by measuring contributions to vowel recognition of dynamic (i.e., time-varying) spectral structure and coarticulatory effects on stationary structure. Adults and children (four- and seven-year-olds) were tested with three kinds of consonant-vowel-consonant syllables: (1) unprocessed; (2) sine waves that preserved both stationary coarticulated and dynamic spectral structure; and (3) vocoded signals that primarily preserved the stationary, but not the dynamic, structure. Sections of two lengths were removed from syllable middles: (1) half the vocalic portion; and (2) all but the first and last three pitch periods. Adults performed accurately with unprocessed and sine-wave signals, as long as half the syllable remained; their recognition was poorer for vocoded signals, but above chance. Seven-year-olds performed more poorly than adults with both sorts of processed signals, but disproportionately worse with vocoded than sine-wave signals. Most four-year-olds were unable to recognize vowels at all with vocoded signals. Conclusions were that both dynamic and stationary coarticulated structures support vowel recognition for adults, but children attend to dynamic spectral structure more strongly because early phonological organization favors whole words. PMID:25536845
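
    Sine-wave speech of the kind used here is conventionally built by replacing each formant track with a single sinusoid that follows its frequency over time, so the dynamic spectral structure survives while ordinary speech cues do not. The sketch below shows that construction; the straight-line formant tracks and amplitudes are illustrative inventions, not the study's stimuli.

    ```python
    import numpy as np

    # Sketch of sine-wave synthesis: one sinusoid per formant track, with the
    # instantaneous phase obtained by integrating the frequency track.

    def sine_wave_speech(tracks_hz: np.ndarray, amps: np.ndarray, fs: int) -> np.ndarray:
        """tracks_hz, amps: (n_formants, n_samples) instantaneous values."""
        phase = 2 * np.pi * np.cumsum(tracks_hz, axis=1) / fs
        return (amps * np.sin(phase)).sum(axis=0)

    fs, dur = 16000, 0.3
    n = int(fs * dur)
    t = np.linspace(0, 1, n)
    tracks = np.stack([700 + 300 * t,       # F1 gliding upward (illustrative)
                       1200 + 600 * t,      # F2 gliding upward
                       2500 * np.ones(n)])  # F3 roughly steady
    amps = np.array([[1.0], [0.5], [0.25]]) * np.ones((3, n))
    y = sine_wave_speech(tracks, amps, fs)
    print(y.shape, float(np.max(np.abs(y))))
    ```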

  12. The Contribution of Brainstem and Cerebellar Pathways to Auditory Recognition

    PubMed Central

    McLachlan, Neil M.; Wilson, Sarah J.

    2017-01-01

    The cerebellum has been known to play an important role in motor functions for many years. More recently its role has been expanded to include a range of cognitive and sensory-motor processes, and substantial neuroimaging and clinical evidence now points to cerebellar involvement in most auditory processing tasks. In particular, an increase in the size of the cerebellum over recent human evolution has been attributed in part to the development of speech. Despite this, the auditory cognition literature has largely overlooked afferent auditory connections to the cerebellum that have been implicated in acoustically conditioned reflexes in animals, and could subserve speech and other auditory processing in humans. This review expands our understanding of auditory processing by incorporating cerebellar pathways into the anatomy and functions of the human auditory system. We reason that plasticity in the cerebellar pathways underpins implicit learning of spectrotemporal information necessary for sound and speech recognition. Once learnt, this information automatically recognizes incoming auditory signals and predicts likely subsequent information based on previous experience. Since sound recognition processes involving the brainstem and cerebellum initiate early in auditory processing, learnt information stored in cerebellar memory templates could then support a range of auditory processing functions such as streaming, habituation, the integration of auditory feature information such as pitch, and the recognition of vocal communications. PMID:28373850

  13. Minimal effects of visual memory training on the auditory performance of adult cochlear implant users

    PubMed Central

    Oba, Sandra I.; Galvin, John J.; Fu, Qian-Jie

    2014-01-01

    Auditory training has been shown to significantly improve cochlear implant (CI) users’ speech and music perception. However, it is unclear whether post-training gains in performance were due to improved auditory perception or to generally improved attention, memory and/or cognitive processing. In this study, speech and music perception, as well as auditory and visual memory were assessed in ten CI users before, during, and after training with a non-auditory task. A visual digit span (VDS) task was used for training, in which subjects recalled sequences of digits presented visually. After the VDS training, VDS performance significantly improved. However, there were no significant improvements for most auditory outcome measures (auditory digit span, phoneme recognition, sentence recognition in noise, digit recognition in noise), except for small (but significant) improvements in vocal emotion recognition and melodic contour identification. Post-training gains were much smaller with the non-auditory VDS training than observed in previous auditory training studies with CI users. The results suggest that post-training gains observed in previous studies were not solely attributable to improved attention or memory, and were more likely due to improved auditory perception. The results also suggest that CI users may require targeted auditory training to improve speech and music perception. PMID:23516087

  14. Early prediction of student goals and affect in narrative-centered learning environments

    NASA Astrophysics Data System (ADS)

    Lee, Sunyoung

    Recent years have seen a growing recognition of the role of goal and affect recognition in intelligent tutoring systems. Goal recognition is the task of inferring users' goals from a sequence of observations of their actions. Because of the uncertainty inherent in every facet of human computer interaction, goal recognition is challenging, particularly in contexts in which users can perform many actions in any order, as is the case with intelligent tutoring systems. Affect recognition is the task of identifying the emotional state of a user from a variety of physical cues, which are produced in response to affective changes in the individual. Accurately recognizing student goals and affect states could contribute to more effective and motivating interactions in intelligent tutoring systems. By exploiting knowledge of student goals and affect states, intelligent tutoring systems can dynamically modify their behavior to better support individual students. To create effective interactions in intelligent tutoring systems, goal and affect recognition models should satisfy two key requirements. First, because incorrectly predicted goals and affect states could significantly diminish the effectiveness of interactive systems, goal and affect recognition models should provide accurate predictions of user goals and affect states. When observations of users' activities become available, recognizers should make accurate "early" predictions. Second, goal and affect recognition models should be highly efficient so they can operate in real time. To address these issues, we present an inductive approach to recognizing student goals and affect states in intelligent tutoring systems by learning goal and affect recognition models. Our work focuses on goal and affect recognition in an important new class of intelligent tutoring systems, narrative-centered learning environments. We report the results of empirical studies of induced recognition models from observations of students' interactions in narrative-centered learning environments. Experimental results suggest that induced models can make accurate early predictions of student goals and affect states, and they are sufficiently efficient to meet the real-time performance requirements of interactive learning environments.
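
    The "early prediction" requirement is easiest to see as an evaluation protocol: train a recognizer on complete action sequences, then query it with progressively longer prefixes of a held-out sequence. The sketch below does this with a bag-of-actions naive Bayes model; the action names, goals, and feature choice are all invented for illustration and are not the dissertation's induced models.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Toy early-prediction protocol for goal recognition: predict the goal
    # after each new observed action. Data and features are illustrative only.
    train_seqs = ["open-door walk-lab talk-npc read-note",
                  "walk-lab inspect-slide talk-npc",
                  "open-door read-note read-note"]
    train_goals = ["find-cure", "find-cure", "explore"]

    vec = CountVectorizer().fit(train_seqs)
    clf = MultinomialNB().fit(vec.transform(train_seqs), train_goals)

    test_seq = "open-door walk-lab inspect-slide talk-npc".split()
    for k in range(1, len(test_seq) + 1):       # reveal one action at a time
        prefix = " ".join(test_seq[:k])
        print(f"after {k} actions -> {clf.predict(vec.transform([prefix]))[0]}")
    ```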

  15. Effects of mucosal loading on vocal fold vibration.

    PubMed

    Tao, Chao; Jiang, Jack J

    2009-06-01

    A chain model was proposed in this study to examine the effects of mucosal loading on vocal fold vibration. Mucosal loading was defined as the loading caused by the interaction between the vocal folds and the surrounding tissue. In the proposed model, the vocal folds and the surrounding tissue were represented by a series of oscillators connected by a coupling spring. The lumped masses, springs, and dampers of the oscillators modeled the tissue properties of mass, stiffness, and viscosity, respectively. The coupling spring exemplified the tissue interactions. By numerically solving this chain model, the effects of mucosal loading on the phonation threshold pressure, phonation instability pressure, and energy distribution in a voice production system were studied. It was found that when mucosal loading is small, phonation threshold pressure increases with the damping constant Rr, the mass constant Rm, and the coupling constant Rμ of mucosal loading but decreases with the stiffness constant Rk. Phonation instability pressure is also related to mucosal loading. It was found that phonation instability pressure increases with the coupling constant Rμ but decreases with the stiffness constant Rk of mucosal loading. Therefore, it was concluded that mucosal loading directly affects voice production.
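
    The chain idea, a line of mass-spring-damper oscillators joined by coupling springs, is straightforward to integrate numerically, as the sketch below shows. All parameter values and the sinusoidal stand-in for the driving pressure are illustrative assumptions; the paper's dimensionless constants Rr, Rm, Rμ, and Rk are not reproduced here.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    # Illustrative chain of coupled oscillators: index 0 plays the vocal fold,
    # indices 1..4 the surrounding-tissue "mucosal loading".
    N = 5
    m = np.full(N, 1e-3)    # kg, lumped masses
    k = np.full(N, 100.0)   # N/m, grounding springs
    c = np.full(N, 0.02)    # N s/m, dampers
    k_c = 20.0              # N/m, coupling springs between neighbors

    def rhs(t, y):
        x, v = y[:N], y[N:]
        f = -k * x - c * v
        f[:-1] += k_c * (x[1:] - x[:-1])   # pull from right neighbor
        f[1:] += k_c * (x[:-1] - x[1:])    # pull from left neighbor
        f[0] += 1e-2 * np.sin(2 * np.pi * 150 * t)  # stand-in driving pressure
        return np.concatenate([v, f / m])

    sol = solve_ivp(rhs, (0, 0.1), np.zeros(2 * N), max_step=1e-5)
    print(sol.y[0].min(), sol.y[0].max())  # displacement of the "vocal fold" mass
    ```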

  16. Effects of mucosal loading on vocal fold vibration

    NASA Astrophysics Data System (ADS)

    Tao, Chao; Jiang, Jack J.

    2009-06-01

    A chain model was proposed in this study to examine the effects of mucosal loading on vocal fold vibration. Mucosal loading was defined as the loading caused by the interaction between the vocal folds and the surrounding tissue. In the proposed model, the vocal folds and the surrounding tissue were represented by a series of oscillators connected by a coupling spring. The lumped masses, springs, and dampers of the oscillators modeled the tissue properties of mass, stiffness, and viscosity, respectively. The coupling spring exemplified the tissue interactions. By numerically solving this chain model, the effects of mucosal loading on the phonation threshold pressure, phonation instability pressure, and energy distribution in a voice production system were studied. It was found that when mucosal loading is small, phonation threshold pressure increases with the damping constant Rr, the mass constant Rm, and the coupling constant Rμ of mucosal loading but decreases with the stiffness constant Rk. Phonation instability pressure is also related to mucosal loading. It was found that phonation instability pressure increases with the coupling constant Rμ but decreases with the stiffness constant Rk of mucosal loading. Therefore, it was concluded that mucosal loading directly affects voice production.

  17. Paediatric vocal fold paralysis.

    PubMed

    Garcia-Lopez, Isabel; Peñorrocha-Teres, Julio; Perez-Ortin, Magdalena; Cerpa, Mauricio; Rabanal, Ignacio; Gavilan, Javier

    2013-01-01

    Vocal fold paralysis (VFP) is a relatively common cause of stridor and dysphonia in the paediatric population. This report summarises our experience with VFP in the paediatric age group. All patients presenting with vocal fold paralysis over a 12-month period were included. Medical charts were reviewed retrospectively. The diagnosis was performed by flexible endoscopic examination. The cases were evaluated with respect to aetiology of the paralysis, presenting symptoms, delay in diagnosis, affected side, vocal fold position, need for surgical treatment and outcome. The presenting symptoms were stridor and dysphonia. Iatrogenic causes formed the largest group, followed by idiopathic, neurological and obstetric VFP. Unilateral paralysis was found in most cases. The median delay in diagnosis was 1 month and it was significantly longer in the iatrogenic group. Surgical treatment was not necessary in most cases. The diagnosis of VFP may be suspected based on the patient's symptoms and confirmed by flexible endoscopy. Infants who develop stridor or dysphonia following a surgical procedure have to be examined without delay. The surgeon has to keep in mind that there is a possibility of late spontaneous recovery or compensation. Copyright © 2012 Elsevier España, S.L. All rights reserved.

  18. Joint Attention in Autism: Teaching Smiling Coordinated with Gaze to Respond to Joint Attention Bids

    ERIC Educational Resources Information Center

    Krstovska-Guerrero, Ivana; Jones, Emily A.

    2013-01-01

    Children with autism demonstrate early deficits in joint attention and expressions of affect. Interventions to teach joint attention have addressed gaze behavior, gestures, and vocalizations, but have not specifically taught an expression of positive affect such as smiling that tends to occur during joint attention interactions. Intervention was…

  19. Effects of human fatigue on speech signals

    NASA Astrophysics Data System (ADS)

    Stamoulis, Catherine

    2004-05-01

    Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.
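
    A monitoring system of the kind described has to reduce each stretch of speech to a small set of frame-level parameters before any fatigue-related change can be tracked. The sketch below extracts three generic candidates, intensity, spectral centroid, and a crude pause-based rhythm proxy; the abstract does not list the actual parameter set used, so these choices are assumptions.

    ```python
    import numpy as np

    # Generic frame-level speech features for fatigue monitoring (illustrative
    # choices, not the study's parameter set).

    def frame_features(signal: np.ndarray, fs: int, frame_len: int = 400):
        n_frames = len(signal) // frame_len
        frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
        rms = np.sqrt((frames ** 2).mean(axis=1))             # intensity
        spec = np.abs(np.fft.rfft(frames, axis=1))
        freqs = np.fft.rfftfreq(frame_len, 1.0 / fs)
        centroid = (spec * freqs).sum(axis=1) / (spec.sum(axis=1) + 1e-12)
        pause_ratio = (rms < 0.1 * rms.max()).mean()          # rhythm proxy
        return rms, centroid, pause_ratio

    rng = np.random.default_rng(3)
    sig = rng.standard_normal(16000)  # stand-in for 1 s of speech at 16 kHz
    rms, centroid, pause_ratio = frame_features(sig, 16000)
    print(rms.shape, centroid.mean(), pause_ratio)
    ```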

  20. IMRT for Image-Guided Single Vocal Cord Irradiation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osman, Sarah O.S., E-mail: s.osman@erasmusmc.nl; Astreinidou, Eleftheria; Boer, Hans C.J. de

    2012-02-01

    Purpose: We have been developing an image-guided single vocal cord irradiation technique to treat patients with stage T1a glottic carcinoma. In the present study, we compared the dose coverage to the affected vocal cord and the dose delivered to the organs at risk using conventional, intensity-modulated radiotherapy (IMRT) coplanar, and IMRT non-coplanar techniques. Methods and Materials: For 10 patients, conventional treatment plans using two laterally opposed wedged 6-MV photon beams were calculated in XiO (Elekta-CMS treatment planning system). An in-house IMRT/beam angle optimization algorithm was used to obtain the coplanar and non-coplanar optimized beam angles. Using these angles, the IMRT plans were generated in Monaco (IMRT treatment planning system, Elekta-CMS) with the implemented Monte Carlo dose calculation algorithm. The organs at risk included the contralateral vocal cord, arytenoids, swallowing muscles, carotid arteries, and spinal cord. The prescription dose was 66 Gy in 33 fractions. Results: For the conventional plans and coplanar and non-coplanar IMRT plans, the population-averaged mean dose ± standard deviation to the planning target volume was 67 ± 1 Gy. The contralateral vocal cord dose was reduced from 66 ± 1 Gy in the conventional plans to 39 ± 8 Gy and 36 ± 6 Gy in the coplanar and non-coplanar IMRT plans, respectively. IMRT consistently reduced the doses to the other organs at risk. Conclusions: Single vocal cord irradiation with IMRT resulted in good target coverage and provided significant sparing of the critical structures. This has the potential to improve the quality-of-life outcomes after RT and maintain the same local control rates.

  1. Are Vocal Alterations Caused by Smoking in Reinke's Edema in Women Entirely Reversible After Microsurgery and Smoking Cessation?

    PubMed

    Martins, Regina Helena Garcia; Tavares, Elaine Lara Mendes; Pessin, Adriana Bueno Benito

    2017-05-01

    Reinke's edema is a benign lesion of the vocal folds that affects chronic smokers, especially women. The voice becomes hoarse and virilized, and the treatment is microsurgery. However, even after surgery and smoking cessation, many patients remain with a deep and hoarse voice. The aim of the present study was to compare pre- and postoperative acoustic and perceptual-auditory vocal analyses of women with Reinke's edema and of women in the control group, who were non-smokers. A total of 20 women with videolaryngoscopy diagnosis of Reinke's edema who underwent laryngeal microsurgery were evaluated pre- and postoperatively (6 months) by videolaryngoscopy, acoustic voice, and perceptual-auditory analyses (General degree of dysphonia, Roughness, Breathiness, Asthenia, Strain, and Instability [GRBASI] scale), and the maximum phonation times were calculated. The pre- and postoperative parameters of the women with Reinke's edema were compared with those of the control group of women with no laryngeal lesions, smoking habit, or vocal symptoms. Acoustic and perceptual-auditory vocal analyses and the maximum phonation time of women with Reinke's edema improved significantly in the postoperative evaluations; nevertheless, 6 months after surgery, their voices remained worse than the voices of the women from the control group. Abnormalities caused by smoking in Reinke's edema in women are not fully reversible with surgery and smoking cessation. One explanation would be the presence of possible structural alterations in fibroblasts caused by the toxicity of cigarette components, resulting in the uncontrolled production of fibrous matrix in the lamina propria, and preventing complete vocal recovery. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  2. Dynamic Vibration Cooperates with Connective Tissue Growth Factor to Modulate Stem Cell Behaviors

    PubMed Central

    Tong, Zhixiang; Zerdoum, Aidan B.; Duncan, Randall L.

    2014-01-01

    Vocal fold disorders affect 3–9% of the U.S. population. Tissue engineering offers an alternative strategy for vocal fold repair. Successful engineering of vocal fold tissues requires a strategic combination of therapeutic cells, biomimetic scaffolds, and physiologically relevant mechanical and biochemical factors. Specifically, we aim to create a vocal fold-like microenvironment to coax stem cells to adopt the phenotype of vocal fold fibroblasts (VFFs). Herein, high frequency vibratory stimulations and soluble connective tissue growth factor (CTGF) were sequentially introduced to mesenchymal stem cells (MSCs) cultured on a poly(ɛ-caprolactone) (PCL)-derived microfibrous scaffold for a total of 6 days. The initial 3-day vibratory culture resulted in an increased production of hyaluronic acids (HA), tenascin-C (TNC), decorin (DCN), and matrix metalloproteinase-1 (MMP1). The subsequent 3-day CTGF treatment further enhanced the cellular production of TNC and DCN, whereas CTGF treatment alone without the vibratory preconditioning significantly promoted the synthesis of collagen I (Col 1) and sulfated glycosaminoglycans (sGAGs). The highest level of MMP1, TNC, Col III, and DCN production was found for cells being exposed to the combined vibration and CTGF treatment. Noteworthy, the vibration and CTGF elicited a differential stimulatory effect on elastin (ELN), HA synthase 1 (HAS1), and fibroblast-specific protein-1 (FSP-1). The mitogenic activity of CTGF was only elicited in naïve cells without the vibratory preconditioning. The combined treatment had profound, but opposite effects on mitogen-activated protein kinase (MAPK) pathways, Erk1/2 and p38, and the Erk1/2 pathway was critical for the observed mechano-biochemical responses. Collectively, vibratory stresses and CTGF signals cooperatively coaxed MSCs toward a VFF-like phenotype and accelerated the synthesis and remodeling of vocal fold matrices. PMID:24456068

  3. Relationship between patient-perceived vocal handicap and clinician-rated level of vocal dysfunction.

    PubMed

    Childs, Lesley F; Bielinski, Clifford; Toles, Laura; Hamilton, Amy; Deane, Janis; Mau, Ted

    2015-01-01

    The relationship between patient-reported vocal handicap and clinician-rated measures of vocal dysfunction is not understood. This study aimed to determine if a correlation exists between the Voice Handicap Index-10 (VHI-10) and the Voice Functional Communication Measure rating in the National Outcomes Measurement System (NOMS). Retrospective case series. Four hundred and nine voice evaluations over 12 months at a tertiary voice center were reviewed. The VHI-10 and NOMS scores, diagnoses, and potential comorbid factors were collected and analyzed. For the study population as a whole, there was a moderate negative correlation between the NOMS rating and the VHI-10 (Pearson r = -0.57). However, for a given NOMS level, there could be considerable spread in the VHI-10. In addition, as the NOMS decreased stepwise below level 4, there was a corresponding increase in the VHI-10. However, a similar trend in VHI-10 was not observed for NOMS above level 4, indicating the NOMS versus VHI-10 correlation was not linear. Among diagnostic groups, the strongest correlation was found for subjects with functional dysphonia. The NOMS versus VHI-10 correlation was not affected by gender or the coexistence of a psychiatric diagnosis. A simple relationship between VHI-10 and NOMS rating does not exist. Patients with mild vocal dysfunction have a less direct relationship between their NOMS ratings and the VHI-10. These findings provide insight into the interpretation of patient-perceived and clinician-rated measures of vocal function and may allow for better management of expectations and patient counseling in the treatment of voice disorders. © 2014 The American Laryngological, Rhinological and Otological Society, Inc.

  4. Food for song: expression of c-Fos and ZENK in the zebra finch song nuclei during food aversion learning.

    PubMed

    Tokarev, Kirill; Tiunova, Anna; Scharff, Constance; Anokhin, Konstantin

    2011-01-01

    Specialized neural pathways, the song system, are required for acquiring, producing, and perceiving learned avian vocalizations. Birds that do not learn to produce their vocalizations lack telencephalic song system components. It is not known whether the song system forebrain regions are exclusively evolved for song or whether they also process information not related to song that might reflect their 'evolutionary history'. To address this question we monitored the induction of two immediate-early genes (IEGs) c-Fos and ZENK in various regions of the song system in zebra finches (Taeniopygia guttata) in response to an aversive food learning paradigm; this involves the association of a food item with a noxious stimulus that affects the oropharyngeal-esophageal cavity and tongue, causing subsequent avoidance of that food item. The motor response results in beak and head movements but not vocalizations. IEGs have been extensively used to map neuro-molecular correlates of song motor production and auditory processing. As previously reported, neurons in two pallial vocal motor regions, HVC and RA, expressed IEGs after singing. Surprisingly, c-Fos was induced equivalently also after food aversion learning in the absence of singing. The density of c-Fos positive neurons was significantly higher than that of birds in control conditions. This was not the case in two other pallial song nuclei important for vocal plasticity, LMAN and Area X, although singing did induce IEGs in these structures, as reported previously. Our results are consistent with the possibility that some of the song nuclei may participate in non-vocal learning and the populations of neurons involved in the two tasks show partial overlap. These findings underscore the previously advanced notion that the specialized forebrain pre-motor nuclei controlling song evolved from circuits involved in behaviors related to feeding.

  5. Effects of the epilarynx area on vocal fold dynamics and the primary voice signal.

    PubMed

    Döllinger, Michael; Berry, David A; Luegmair, Georg; Hüttner, Björn; Bohr, Christopher

    2012-05-01

    For the analysis of vocal fold dynamics, sub- and supraglottal influences must be taken into account, as recent studies have shown. In this work, we analyze the influence of changes in the epilaryngeal area on vocal fold dynamics. We investigate two excised female larynges in a hemilarynx setup combined with a synthetic vocal tract consisting of hard plastic and simulating the vowel /a/. Eigenmodes, amplitudes, and velocities of the oscillations, the subglottal pressures (Psub), and sound pressure levels (SPLs) of the generated signal are investigated as a function of three distinct epilaryngeal areas (28.4 mm², 71.0 mm², and 205.9 mm²). The results showed that the SPL is independent of the epilarynx cross section and exhibits a nonlinear relation to the insufflated airflow. The Psub decreased with an increase in the epilaryngeal area and displayed linear relations to the airflow. The empirical eigenfunctions (EEFs) from the vocal fold dynamics exhibited lateral movement for the first EEF and rotational motion for the second EEF. In total, the first two EEFs covered a minimum of 60% of the energy, with an average of more than 50% for the first EEF. Correlations to the epilarynx areas were not found. Maximal values for amplitudes (up to 2.5 mm) and velocities (up to 1.57 mm/ms) changed with varying epilaryngeal area but did not show consistent behavior for both larynges. We conclude that the size of the epilaryngeal area has significant influence on vocal fold dynamics but does not significantly affect the resultant SPL. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
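
    The EEF decomposition referred to above is, in essence, a proper orthogonal decomposition of the recorded vocal fold displacements. The following minimal sketch shows how per-mode energy fractions can be computed with an SVD; the displacement matrix, dimensions, frequencies, and mode shapes are synthetic stand-ins, not data from the study.

      import numpy as np

      rng = np.random.default_rng(0)

      # Synthetic displacement matrix: rows = time samples, columns = tracked
      # points along the vocal fold edge (purely illustrative dimensions).
      t = np.linspace(0, 0.05, 500)[:, None]            # 50 ms of motion
      x = np.linspace(0, 1, 32)[None, :]                # normalized fold length
      lateral = np.sin(2 * np.pi * 200 * t) * np.sin(np.pi * x)             # mode 1
      rotation = 0.4 * np.cos(2 * np.pi * 200 * t) * np.sin(2 * np.pi * x)  # mode 2
      disp = lateral + rotation + 0.05 * rng.standard_normal((500, 32))

      # Subtract the temporal mean, then decompose; the right-singular vectors
      # are the empirical eigenfunctions, and s**2 gives the energy per mode.
      disp -= disp.mean(axis=0)
      _, s, eefs = np.linalg.svd(disp, full_matrices=False)
      energy = s**2 / np.sum(s**2)
      print(f"EEF 1 energy fraction: {energy[0]:.1%}")
      print(f"EEF 2 energy fraction: {energy[1]:.1%}")
      print(f"first two EEFs combined: {energy[:2].sum():.1%}")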

  6. Singing modulates parvalbumin interneurons throughout songbird forebrain vocal control circuitry

    PubMed Central

    Zengin-Toktas, Yildiz

    2017-01-01

    Across species, the performance of vocal signals can be modulated by the social environment. Zebra finches, for example, adjust their song performance when singing to females (‘female-directed’ or FD song) compared to when singing in isolation (‘undirected’ or UD song). These changes are salient, as females prefer the FD song over the UD song. Despite the importance of these performance changes, the neural mechanisms underlying this social modulation remain poorly understood. Previous work in finches has established that expression of the immediate early gene EGR1 is increased during singing and modulated by social context within the vocal control circuitry. Here, we examined whether particular neural subpopulations within those vocal control regions exhibit similar modulations of EGR1 expression. We compared EGR1 expression in neurons expressing parvalbumin (PV), a calcium buffer that modulates network plasticity and homeostasis, among males that performed FD song, males that produced UD song, or males that did not sing. We found that, overall, singing but not social context significantly affected EGR1 expression in PV neurons throughout the vocal control nuclei. We observed differences in EGR1 expression between two classes of PV interneurons in the basal ganglia nucleus Area X. Additionally, we found that singing altered the amount of PV expression in neurons in HVC and Area X and that distinct PV interneuron types in Area X exhibited different patterns of modulation by singing. These data indicate that throughout the vocal control circuitry the singing-related regulation of EGR1 expression in PV neurons may be less influenced by social context than in other neuron types and raise the possibility of cell-type specific differences in plasticity and calcium buffering. PMID:28235074

  7. Simulated Birdwatchers’ Playback Affects the Behavior of Two Tropical Birds

    PubMed Central

    Harris, J. Berton C.; Haskell, David G.

    2013-01-01

    Although recreational birdwatchers may benefit conservation by generating interest in birds, they may also have negative effects. One such potentially negative impact is the widespread use of recorded vocalizations, or “playback,” to attract birds of interest, including range-restricted and threatened species. Although playback has been widely used to test hypotheses about the evolution of behavior, no peer-reviewed study has examined the impacts of playback in a birdwatching context on avian behavior. We studied the effects of simulated birdwatchers’ playback on the vocal behavior of Plain-tailed Wrens Thryothorus euophrys and Rufous Antpittas Grallaria rufula in Ecuador. Study species’ vocal behavior was monitored for an hour after playing either a single bout of five minutes of song or a control treatment of background noise. We also studied the effects of daily five minute playback on five groups of wrens over 20 days. In single bout experiments, antpittas made more vocalizations of all types, except for trills, after playback compared to controls. Wrens sang more duets after playback, but did not produce more contact calls. In repeated playback experiments, wren responses were strong at first, but hardly detectable by day 12. During the study, one study group built a nest, apparently unperturbed, near a playback site. The playback-induced habituation and changes in vocal behavior we observed suggest that scientists should consider birdwatching activity when selecting research sites so that results are not biased by birdwatchers’ playback. Increased vocalizations after playback could be interpreted as a negative effect of playback if birds expend energy, become stressed, or divert time from other activities. In contrast, the habituation we documented suggests that frequent, regular birdwatchers’ playback may have minor effects on wren behavior. PMID:24147094

  9. The predictability of frequency-altered auditory feedback changes the weighting of feedback and feedforward input for speech motor control.

    PubMed

    Scheerer, Nichole E; Jones, Jeffery A

    2014-12-01

    Speech production requires the combined effort of a feedback control system driven by sensory feedback, and a feedforward control system driven by internal models. However, the factors that dictate the relative weighting of these feedback and feedforward control systems are unclear. In this event-related potential (ERP) study, participants produced vocalisations while being exposed to blocks of frequency-altered feedback (FAF) perturbations that were either predictable in magnitude (consistently either 50 or 100 cents) or unpredictable in magnitude (50- and 100-cent perturbations varying randomly within each vocalisation). Vocal and P1-N1-P2 ERP responses revealed decreases in the magnitude and trial-to-trial variability of vocal responses, smaller N1 amplitudes, and shorter vocal, P1 and N1 response latencies following predictable FAF perturbation magnitudes. In addition, vocal response magnitudes correlated with N1 amplitudes, vocal response latencies, and P2 latencies. This pattern of results suggests that after repeated exposure to predictable FAF perturbations, the contribution of the feedforward control system increases. Examination of the presentation order of the FAF perturbations revealed smaller compensatory responses, smaller P1 and P2 amplitudes, and shorter N1 latencies when the block of predictable 100-cent perturbations occurred prior to the block of predictable 50-cent perturbations. These results suggest that exposure to large perturbations modulates responses to subsequent perturbations of equal or smaller size. Similarly, exposure to a 100-cent perturbation prior to a 50-cent perturbation within a vocalisation decreased the magnitude of vocal and N1 responses, but increased P1 and P2 latencies. Thus, exposure to a single perturbation can affect responses to subsequent perturbations. © 2014 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
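
    For readers unfamiliar with the cent unit used for the FAF perturbations above: a cent is 1/100 of an equal-tempered semitone, so a shift of c cents multiplies frequency by 2^(c/1200). A minimal conversion sketch (the baseline frequency is illustrative, not from the study):

      # Convert a pitch perturbation in cents to a multiplicative frequency
      # ratio: ratio = 2 ** (cents / 1200). 1200 cents = one octave.
      def cents_to_ratio(cents: float) -> float:
          return 2.0 ** (cents / 1200.0)

      f0 = 220.0  # Hz, an illustrative speaking fundamental frequency
      for cents in (50, 100):
          shifted = f0 * cents_to_ratio(cents)
          print(f"{cents:>3}-cent shift: {f0:.1f} Hz -> {shifted:.2f} Hz")
      # 50 cents is about a 2.9% frequency increase; 100 cents about 5.9%.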

  10. Influence of sedation on onset and quality of euthanasia in sheep.

    PubMed

    Barletta, Michele; Hofmeister, Erik H; Peroni, John F; Thoresen, Merrilee; Scharf, Alexandra M; Quandt, Jane E

    2018-04-01

    The purpose of this study was to determine if dexmedetomidine administered IV prior to euthanasia in sheep affected the speed or quality of euthanasia. Twenty clinically healthy Dorset-cross adult ewes between 1 and 3 years of age were enrolled in a randomized blinded experimental trial. The subjects were randomly assigned to receive dexmedetomidine 5 μg/kg IV or an equivalent volume of saline. Five minutes later, euthanasia was accomplished with a pentobarbital/phenytoin overdose given IV. The times to apnea, asystole, cessation of audible heartbeat, and absence of corneal reflex were recorded by two blinded investigators. If any muscle spasms, contractions, vocalization, and/or dysrhythmias were noted, the time was recorded and the type of ECG abnormality was described. An overall score of the euthanasia event was assigned using a numeric rating scale (NRS) after the animal was declared dead. The time to loss of corneal reflex was significantly longer in sheep given dexmedetomidine compared with those that received saline (P=0.03). Although vocalization was observed only in some animals premedicated with dexmedetomidine, this difference was not statistically significant, and no other significant differences between groups were noted. Dexmedetomidine at 5 μg/kg IV 5 min prior to injection of pentobarbital/phenytoin for euthanasia did not substantially affect the progress of euthanasia. Dexmedetomidine may be given to sedate sheep prior to euthanasia without concern for it adversely affecting the progress of euthanasia; however, vocalization may occur. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Reinke's edema: investigations on the role of MIB-1 and hepatocyte growth factor.

    PubMed

    Artico, M; Bronzetti, E; Ionta, B; Bruno, M; Greco, A; Ruoppolo, G; De Virgilio, A; Longo, L; De Vincentiis, M

    2010-07-08

    Reinke's edema is a benign disease of the human vocal fold, which mainly affects the sub-epithelial layer of the vocal fold. Microscopic observations show a strongly oedematous epithelium with loosened intercellular junctions and a disruption of the extracellular connections between the mucosal epithelium and the connective tissue, which is closely adherent to the thyroarytenoid muscle. Thickening of the basal layer of the epithelium, known as Reinke's space, together with high deposition of fibronectin and chronic inflammatory infiltration, is also visible. We analyzed, together with the hepatocyte growth factor (HGF), the expression level of MIB-1 in samples harvested from patients affected by Reinke's edema, in order to define its biological role and consider it as a possible prognostic factor in the follow-up after surgical treatment. We observed a moderate expression of HGF in the lamina propria of the human vocal fold and in the basal membrane of the mucosal epithelium. Our finding suggests that this growth factor acts as an antifibrotic agent in Reinke's space and affects the fibronectin deposition in the lamina propria. MIB-1, on the contrary, showed a weak expression in the basement membrane of the mucosal epithelium and a total absence in the deep layer of the lamina propria, thus suggesting that only the superficial layer is actively involved in the reparatory process, with a high regenerative capacity and a high deposition of fibronectin. The latter is necessary for the reconstruction of cellular connections after the inflammatory infiltration.

  13. The Paralinguistic Encoding Capability of Children. Report from the Project on Studies of Instructional Programming for the Individual Student. Technical Report No. 441.

    ERIC Educational Resources Information Center

    Plazewski, Joseph G.; Allen, Vernon L.

    A study was conducted of the capacity of sixth-grade children to communicate accurately paralinguistic affect. A dependent measure indicating the accuracy of paralinguistic communication of affect was obtained by comparing the level of affect which children intended to encode with ratings of vocal inflections from adult judges. Four independent…

  14. Nordic rattle: the hoarse vocalization and the inflatable laryngeal air sac of reindeer (Rangifer tarandus)

    PubMed Central

    Frey, Roland; Gebler, Alban; Fritsch, Guido; Nygrén, Kaarlo; Weissengruber, Gerald E

    2007-01-01

    Laryngeal air sacs have evolved convergently in diverse mammalian lineages including insectivores, bats, rodents, pinnipeds, ungulates and primates, but their precise function has remained elusive. Among cervids, the vocal tract of reindeer has evolved an unpaired inflatable ventrorostral laryngeal air sac. This air sac is not present at birth but emerges during ontogenetic development. It protrudes from the laryngeal vestibulum via a short duct between the epiglottis and the thyroid cartilage. In the female the growth of the air sac stops at the age of 2–3 years, whereas in males it continues to grow up to the age of about 6 years, leading to a pronounced sexual dimorphism of the air sac. In adult females it is of moderate size (about 100 cm³), whereas in adult males it is large (3000–4000 cm³) and becomes asymmetric, extending either to the left or to the right side of the neck. In both adult females and males the ventral air sac walls touch the integument. In the adult male the air sac is laterally covered by the mandibular portion of the sternocephalic muscle and the skin. Both sexes of reindeer have a double stylohyoid muscle and a thyroepiglottic muscle. Possibly these muscles assist in inflation of the air sac. Head-and-neck specimens were subjected to macroscopic anatomical dissection, computed tomographic analysis and skeletonization. In addition, isolated larynges were studied for comparison. Acoustic recordings were made during an autumn round-up of semi-domestic reindeer in Finland and in a small zoo herd. Male reindeer adopt a specific posture when emitting their serial hoarse rutting calls. Head and neck are kept low and the throat region is extended. In the ventral neck region, roughly corresponding to the position of the large air sac, there is a mane of longer hairs. Neck swelling and mane spreading during vocalization may act as an optical signal to other males and females. The air sac, as a side branch of the vocal tract, can be considered as an additional acoustic filter. Individual acoustic recognition may have been the primary function in the evolution of a size-variable air sac, and this function is retained in mother–young communication. In males, sexual selection seems to have favoured a considerable size increase of the air sac and a switch to call series instead of single calls. Vocalization became restricted to the rutting period, serving to attract females. We propose two possibilities for the acoustic function of the air sac in vocalization that do not exclude each other. The first assumes a coupling between the air sac and the environment, resulting in an acoustic output that is a combination of the vocal tract resonance frequencies emitted via the mouth and nostrils and the resonance frequencies of the air sac transmitted via the neck skin. The second assumes a weak coupling, so that resonance frequencies of the air sac are lost to surrounding tissues by dissipation. In this case the resonance frequencies of the air sac solely influence the signal that is further filtered by the remaining vocal tract. According to our results, one acoustic effect of the air sac in adult reindeer might be to mask formants of the vocal tract proper. In other cervid species, however, formants of rutting calls convey essential information on the quality of the sender, related to its potential reproductive success, to conspecifics. Further studies are required to resolve this inconsistency. PMID:17310544

  15. Pairing Increases Activation of V1aR, but not OTR, in Auditory Regions of Zebra Finches: The Importance of Signal Modality in Nonapeptide-Social Behavior Relationships.

    PubMed

    Tomaszycki, Michelle L; Atchley, Derek

    2017-10-01

    Social relationships are complex, involving the production and comprehension of signals, individual recognition, and close coordination of behavior between two or more individuals. The nonapeptides oxytocin and vasopressin are widely believed to regulate social relationships. These findings come largely from prairie voles, in which nonapeptide receptors in olfactory neural circuits drive pair bonding. This research is assumed to apply to all species. Previous reviews have offered two competing hypotheses. The work of Sarah Newman has implicated a common neural network across species, the Social Behavior Network. In contrast, others have suggested that there are signal modality-specific networks that regulate social behavior. Our research focuses on evaluating these two competing hypotheses in the zebra finch, a species that relies heavily on vocal/auditory signals for communication, specifically the neural circuits underlying singing in males and song perception in females. We have demonstrated that the quality of vocal interactions is highly important for the formation of long-term monogamous bonds in zebra finches. Qualitative evidence at first suggests that nonapeptide receptor distributions are very different between monogamous rodents (olfactory species) and monogamous birds (vocal/auditory species). However, we have demonstrated that social bonding behaviors are not only correlated with activation of nonapeptide receptors in vocal and auditory circuits, but also involve regions of the common Social Behavior Network. Here, we show increased Vasopressin 1a receptor, but not oxytocin receptor, activation in two auditory regions following formation of a pair bond. To our knowledge, this is the first study to suggest a role of nonapeptides in the auditory circuit in pair bonding. Thus, we highlight converging mechanisms of social relationships and also point to the importance of studying multiple species to understand mechanisms of behavior. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.

  16. Exploring the Use of Isolated Expressions and Film Clips to Evaluate Emotion Recognition by People with Traumatic Brain Injury

    PubMed Central

    Zupan, Barbra; Neumann, Dawn

    2016-01-01

    The current study presented 60 people with traumatic brain injury (TBI) and 60 controls with isolated facial emotion expressions, isolated vocal emotion expressions, and multimodal (i.e., film clips) stimuli that included contextual cues. All stimuli were presented via computer. Participants were required to indicate how the person in each stimulus was feeling using a forced-choice format. Additionally, for the film clips, participants had to indicate how they felt in response to the stimulus, and the level of intensity with which they experienced that emotion. PMID:27213280

  17. Interactive Voice Technology: Variations in the Vocal Utterances of Speakers Performing a Stress-Inducing Task,

    DTIC Science & Technology

    1983-08-16

    34. " .. ,,,,.-j.Aid-is.. ;,,i . -i.t . "’" ’, V ,1 5- 4. 3- kHz 2-’ r 1 r s ’.:’ BOGEY 5D 0 S BOGEY 12D Figure 10. Spectrograms of two versions of the word...MF5852801B 0001 Reviewed by Approved and Released by Ashton Graybiel, M.D. Captain W. M. Houk , MC, USN Chief Scientific Advisor Commanding Officer 16 August...incorporating knowledge about these changes into speech recognition systems. i A J- I. . S , .4, ... ..’-° -- -iii l - - .- - i- . .. " •- - i ,f , i

  18. Effect of body position on vocal tract acoustics: Acoustic pharyngometry and vowel formants.

    PubMed

    Vorperian, Houri K; Kurtzweil, Sara L; Fourakis, Marios; Kent, Ray D; Tillman, Katelyn K; Austin, Diane

    2015-08-01

    The anatomic basis and articulatory features of speech production are often studied with imaging studies that are typically acquired in the supine body position. It is important to determine if changes in body orientation to the gravitational field alter vocal tract dimensions and speech acoustics. The purpose of this study was to assess the effect of body position (upright versus supine) on (1) oral and pharyngeal measurements derived from acoustic pharyngometry and (2) acoustic measurements of fundamental frequency (F0) and the first four formant frequencies (F1-F4) for the quadrilateral point vowels. Data were obtained for 27 male and female participants, aged 17 to 35 yrs. Acoustic pharyngometry showed a statistically significant effect of body position on volumetric measurements, with smaller values in the supine than upright position, but no changes in length measurements. Acoustic analyses of vowels showed significantly larger values in the supine than upright position for the variables of F0, F3, and the Euclidean distance from the centroid to each corner vowel in the F1-F2-F3 space. Changes in body position affected measurements of vocal tract volume but not length. Body position also affected the aforementioned acoustic variables, but the main vowel formants were preserved.
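
    One of the acoustic variables above, the Euclidean distance from the centroid to each corner vowel in F1-F2-F3 space, can be computed directly from formant measurements. A minimal sketch with invented, merely plausible formant values (Hz):

      import numpy as np

      # Illustrative F1, F2, F3 values (Hz) for the quadrilateral point vowels.
      vowels = {
          "i": (300, 2300, 3000),
          "a": (750, 1200, 2500),
          "u": (320, 800, 2300),
          "ae": (660, 1700, 2600),
      }

      formants = np.array(list(vowels.values()), dtype=float)
      centroid = formants.mean(axis=0)

      # Euclidean distance from the centroid to each corner vowel in the
      # three-dimensional F1-F2-F3 space.
      for name, row in zip(vowels, formants):
          dist = np.linalg.norm(row - centroid)
          print(f"/{name}/: {dist:.0f} Hz from centroid")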

  19. Variation in the emission rate of sounds in a captive group of false killer whales Pseudorca crassidens during feedings: possible food anticipatory vocal activity?

    NASA Astrophysics Data System (ADS)

    Platto, Sara; Wang, Ding; Wang, Kexiong

    2016-11-01

    This study examines whether a group of captive false killer whales (Pseudorca crassidens) showed variations in vocal rate around feeding times. The high level of motivation to express appetitive behaviors in captive animals may lead them to change their behavioral activities during the time prior to food deliveries, a phenomenon referred to as food anticipatory activity. False killer whales at Qingdao Polar Ocean World (Qingdao, China) showed significant variations in the rates of both total sounds and individual sound classes (whistles, clicks, and burst pulses) around feedings. Specifically, from the Transition interval, which recorded the lowest vocalization rate (3.40 s/m/d), the whales increased their acoustic emissions upon the trainers' arrival (13.08 s/m/d). The high rate was maintained or intensified throughout the food delivery (25.12 s/m/d) and then dropped immediately after the animals were fed (9.91 s/m/d). These changes in the false killer whales' sound production rates around feeding times support the hypothesis of a food anticipatory vocal activity. Although sound rates may not give detailed information regarding referential aspects of animal communication, they can still shed light on the arousal levels of individuals under different social or environmental conditions. Further experiments should be performed to assess whether variations in the timing of feeding routines affect the vocal activity of cetaceans in captivity, as well as their welfare.
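
    The rates above appear to be normalized counts. Assuming "s/m/d" denotes sounds per minute per dolphin (an interpretation, not stated explicitly in the record), a rate per interval can be computed as below; the group size, interval durations, and sound counts are back-calculated illustrations chosen to reproduce the reported figures, not study data.

      # Hypothetical counts of detected sounds per observation interval,
      # assuming "s/m/d" means sounds per minute per dolphin.
      N_DOLPHINS = 4          # illustrative group size
      sound_counts = {        # interval -> (total sounds, minutes observed)
          "transition": (408, 30),
          "trainer_arrival": (1570, 30),
          "feeding": (3014, 30),
          "post_feeding": (1189, 30),
      }

      for interval, (sounds, minutes) in sound_counts.items():
          rate = sounds / (minutes * N_DOLPHINS)
          print(f"{interval:>15}: {rate:.2f} sounds/min/dolphin")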

  20. Body height, immunity, facial and vocal attractiveness in young men.

    PubMed

    Skrinda, Ilona; Krama, Tatjana; Kecko, Sanita; Moore, Fhionna R; Kaasik, Ants; Meija, Laila; Lietuvietis, Vilnis; Rantala, Markus J; Krams, Indrikis

    2014-12-01

    Health, facial and vocal attributes and body height of men may affect a diverse range of social outcomes such as attractiveness to potential mates and competition for resources. Despite evidence that each parameter plays a role in mate choice, the relative role of each, and the inter-relationships between them, are still poorly understood. In this study, we tested relationships both between these parameters and with testosterone and immune function. We report positive relationships between testosterone and both facial masculinity and attractiveness, and we found that facial masculinity predicted facial attractiveness and antibody response to a vaccine. Moreover, the relationship between antibody response to a hepatitis B vaccine and body height was found to be non-linear, with a positive relationship up to a height of 188 cm, but an inverse relationship in taller men. We found that vocal attractiveness was dependent upon vocal masculinity. The relationship between vocal attractiveness and body height was also non-linear, with a positive relationship of up to 178 cm, which then decreased in taller men. We did not find a significant relationship between body height and the fundamental frequency of vowel sounds provided by young men, while body height negatively correlated with the frequency of the second formant. However, formant frequency was not associated with the strength of immune response. Our results demonstrate the potential of vaccination research to reveal costly traits that govern evolution of mate choice in humans and the importance of trade-offs among these traits.
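
    The nonlinear height relationships reported above (a positive association up to a threshold, then an inverse one) are the kind of pattern a simple quadratic fit can capture. A minimal sketch on simulated data (the study's actual model is not specified in this record; all values here are invented):

      import numpy as np

      rng = np.random.default_rng(1)

      # Simulated data: attractiveness rises with height up to ~178 cm,
      # then declines, plus noise (purely illustrative, not study data).
      height = rng.uniform(160, 200, 200)
      attract = -0.004 * (height - 178) ** 2 + 5 + 0.3 * rng.standard_normal(200)

      # Quadratic fit; the vertex estimates the turning point of the curve.
      b2, b1, b0 = np.polyfit(height, attract, deg=2)
      turning_point = -b1 / (2 * b2)
      print(f"estimated turning point: {turning_point:.1f} cm (true value 178)")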

  1. Pregnancy and the singing voice: reports from a case study.

    PubMed

    Lã, Filipa Martins Baptista; Sundberg, Johan

    2012-07-01

    Significant changes in body tissues occur during pregnancy; however, literature concerning the effects of pregnancy on the voice is sparse, especially concerning the professional classically trained voice. Hormonal variations and associated bodily changes during pregnancy affect phonatory conditions, such as vocal fold motility and glottal adduction. Longitudinal case study with a semiprofessional classically trained singer. Audio, electrolaryngograph, oral pressure, and air flow signals were recorded once a week during the last 12 weeks of pregnancy, 48 hours after birth and during the following consecutive 11 weeks. Vocal tasks included diminuendo sequences of the syllable /pae/ sung at various pitches, and performing a Lied. Phonation threshold pressures (PTPs) and collision threshold pressures (CTPs), normalized amplitude quotient (NAQ), alpha ratio, and the dominance of the voice source fundamental were determined. Concentrations of female sex steroid hormones were measured on three occasions. A listening test of timbral brightness and vocal fatigue was carried out. Results demonstrated significantly elevated concentrations of estrogen and progesterone during pregnancy, which were considerably reduced after birth. During pregnancy, CTPs and PTPs were high; and NAQ, alpha ratio, and dominance of the voice source fundamental suggested elevated glottal adduction. In addition, a perceptible decrease of vocal brightness was noted. The elevated CTPs and PTPs during pregnancy suggest reduced vocal fold motility and increased glottal adduction. These changes are compatible with expected effects of elevated concentrations of estrogen and progesterone on tissue viscosity and water retention. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  3. Vocal Fold Vibration Following Surgical Intervention in Three Vocal Pathologies: A Preliminary Study.

    PubMed

    Chen, Wenli; Woo, Peak; Murry, Thomas

    2017-09-01

    High-speed videoendoscopy captures the cycle-to-cycle vibratory motion of each individual vocal fold in normal and severely disordered phonation. Therefore, it provides a direct method to examine the specific vibratory changes following vocal fold surgery. The purpose of this study was to examine the vocal fold vibratory pattern changes in the surgically treated pathologic vocal fold and the contralateral vocal fold in three vocal pathologies: vocal polyp (n = 3), paresis or paralysis (n = 3), and scar (n = 3). Digital kymography was used to extract high-speed kymographic vocal fold images at the mid-membranous region of the vocal fold. Spectral analysis was subsequently applied to the digital kymography to quantify the cycle-to-cycle movements of each vocal fold, expressed as a spectrum. Surgical modification resulted in significantly improved spectral power of the treated pathologic vocal fold. Furthermore, the contralateral vocal fold also presented with improved spectral power irrespective of vocal pathology. In comparison with normal vocal fold spectrum, postsurgical vocal fold vibrations continued to demonstrate decreased vibratory amplitude in both vocal folds. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
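
    The spectral analysis step described above can be sketched as follows on a synthetic kymographic edge signal (sampling rate, phonation frequency, and noise level are illustrative assumptions): the power spectrum of the mid-membranous displacement is one way to express the "spectral power" of a fold's vibration.

      import numpy as np

      FS = 4000            # kymography line rate in Hz (illustrative)
      F0 = 180             # phonation frequency in Hz (illustrative)
      t = np.arange(0, 0.5, 1 / FS)

      # Synthetic mid-membranous edge displacement for one vocal fold:
      # periodic vibration plus noise standing in for irregular motion.
      rng = np.random.default_rng(2)
      edge = np.sin(2 * np.pi * F0 * t) + 0.3 * rng.standard_normal(t.size)

      # Power spectrum of the cycle-to-cycle movement; the peak near F0
      # quantifies how strongly the fold vibrates at the phonation frequency.
      spectrum = np.abs(np.fft.rfft(edge - edge.mean())) ** 2
      freqs = np.fft.rfftfreq(edge.size, d=1 / FS)
      peak = freqs[np.argmax(spectrum)]
      print(f"dominant vibratory frequency: {peak:.0f} Hz")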

  4. Neurobiological mechanisms associated with facial affect recognition deficits after traumatic brain injury.

    PubMed

    Neumann, Dawn; McDonald, Brenna C; West, John; Keiski, Michelle A; Wang, Yang

    2016-06-01

    The neurobiological mechanisms that underlie facial affect recognition deficits after traumatic brain injury (TBI) have not yet been identified. Using functional magnetic resonance imaging (fMRI), study aims were to 1) determine if there are differences in brain activation during facial affect processing in people with TBI who have facial affect recognition impairments (TBI-I) relative to people with TBI and healthy controls who do not have facial affect recognition impairments (TBI-N and HC, respectively); and 2) identify relationships between neural activity and facial affect recognition performance. A facial affect recognition screening task performed outside the scanner was used to determine group classification; TBI patients who performed greater than one standard deviation below normal performance scores were classified as TBI-I, while TBI patients with normal scores were classified as TBI-N. An fMRI facial recognition paradigm was then performed within the 3T environment. Results from 35 participants are reported (TBI-I = 11, TBI-N = 12, and HC = 12). For the fMRI task, TBI-I and TBI-N groups scored significantly lower than the HC group. Blood oxygenation level-dependent (BOLD) signals for facial affect recognition compared to a baseline condition of viewing a scrambled face, revealed lower neural activation in the right fusiform gyrus (FG) in the TBI-I group than the HC group. Right fusiform gyrus activity correlated with accuracy on the facial affect recognition tasks (both within and outside the scanner). Decreased FG activity suggests facial affect recognition deficits after TBI may be the result of impaired holistic face processing. Future directions and clinical implications are discussed.

  5. Affect Recognition in Adults with Attention-Deficit/Hyperactivity Disorder

    PubMed Central

    Miller, Meghan; Hanford, Russell B.; Fassbender, Catherine; Duke, Marshall; Schweitzer, Julie B.

    2014-01-01

    Objective This study compared affect recognition abilities between adults with and without Attention-Deficit/Hyperactivity Disorder (ADHD). Method The sample included 51 participants (34 men, 17 women) divided into 3 groups: ADHD-Combined Type (ADHD-C; n = 17), ADHD-Predominantly Inattentive Type (ADHD-I; n = 16), and controls (n = 18). The mean age was 34 years. Affect recognition abilities were assessed by the Diagnostic Analysis of Nonverbal Accuracy (DANVA). Results Analyses of Variance showed that the ADHD-I group made more fearful emotion errors relative to the control group. Inattentive symptoms were positively correlated while hyperactive-impulsive symptoms were negatively correlated with affect recognition errors. Conclusion These results suggest that affect recognition abilities may be impaired in adults with ADHD and that affect recognition abilities are more adversely affected by inattentive than hyperactive-impulsive symptoms. PMID:20555036

  6. Effect of Parkinson Disease on Emotion Perception Using the Persian Affective Voices Test.

    PubMed

    Saffarian, Arezoo; Shavaki, Yunes Amiri; Shahidi, Gholam Ali; Jafari, Zahra

    2018-05-04

    Emotion perception plays a major role in proper communication with people in different social interactions. Nonverbal affect bursts can be used to evaluate vocal emotion perception. The present study was a preliminary step toward establishing the psychometric properties of the Persian version of the Montreal Affective Voices (MAV) test, as well as investigating the effect of Parkinson disease (PD) on vocal emotion perception. The short, emotional sound made by pronouncing the vowel "a" in Persian was recorded by 22 actors and actresses to develop the Persian version of the MAV, the Persian Affective Voices (PAV), for emotions of happiness, sadness, pleasure, pain, anger, disgust, fear, surprise, and neutrality. Recordings from the five actresses and five actors who obtained the highest scores were used to generate the test. For convergent validity assessment, the correlation between the PAV and a speech prosody comprehension test was examined using a gender- and age-matched control group. To investigate the effect of PD on emotion perception, the PAV test was administered to 28 patients with mild PD between the ages of 50 and 70 years. The PAV showed a high internal consistency (Cronbach's α = 0.80). A significant positive correlation was observed between the PAV and the speech prosody comprehension test. The test-retest reliability also showed the high repeatability of the PAV (intraclass correlation coefficient = 0.815, P ≤ 0.001). A significant difference was observed between the patients with PD and the controls in all subtests. The PAV test is a useful psychometric tool for examining vocal emotion perception that can be used in both behavioral and neuroimaging studies. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
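
    The internal-consistency figure quoted above (Cronbach's α = 0.80) follows the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). A self-contained sketch on hypothetical item scores (not PAV data):

      import numpy as np

      def cronbach_alpha(items: np.ndarray) -> float:
          """items: 2-D array, rows = respondents, columns = test items."""
          k = items.shape[1]
          item_vars = items.var(axis=0, ddof=1)
          total_var = items.sum(axis=1).var(ddof=1)
          return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

      # Hypothetical scores: 6 respondents x 4 items.
      scores = np.array([
          [4, 5, 4, 5],
          [2, 3, 2, 3],
          [5, 5, 4, 4],
          [1, 2, 2, 1],
          [3, 3, 4, 3],
          [4, 4, 5, 5],
      ])
      print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")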

  7. Relation between facial affect recognition and configural face processing in antipsychotic-free schizophrenia.

    PubMed

    Fakra, Eric; Jouve, Elisabeth; Guillaume, Fabrice; Azorin, Jean-Michel; Blin, Olivier

    2015-03-01

    Deficit in facial affect recognition is a well-documented impairment in schizophrenia, closely connected to social outcome. This deficit could be related to psychopathology, but also to a broader dysfunction in processing facial information. In addition, patients with schizophrenia inadequately use configural information, a type of processing that relies on spatial relationships between facial features. To date, no study has specifically examined the link between symptoms and misuse of configural information in the deficit in facial affect recognition. Unmedicated schizophrenia patients (n = 30) and matched healthy controls (n = 30) performed a facial affect recognition task and a face inversion task, which tests aptitude to rely on configural information. In patients, regressions were carried out between facial affect recognition, symptom dimensions and inversion effect. Patients, compared with controls, showed a deficit in facial affect recognition and a lower inversion effect. Negative symptoms and lower inversion effect could account for 41.2% of the variance in facial affect recognition. This study confirms the presence of a deficit in facial affect recognition, and also of dysfunctional manipulation of configural information in antipsychotic-free patients. Negative symptoms and poor processing of configural information explained a substantial part of the deficient recognition of facial affect. We speculate that this deficit may be caused by several factors, among which independently stand psychopathology and failure in correctly manipulating configural information. PsycINFO Database Record (c) 2015 APA, all rights reserved.

  8. Major depressive disorder skews the recognition of emotional prosody.

    PubMed

    Péron, Julie; El Tamer, Sarah; Grandjean, Didier; Leray, Emmanuelle; Travers, David; Drapier, Dominique; Vérin, Marc; Millet, Bruno

    2011-06-01

    Major depressive disorder (MDD) is associated with abnormalities in the recognition of emotional stimuli. MDD patients ascribe more negative emotion but also less positive emotion to facial expressions, suggesting blunted responsiveness to positive emotional stimuli. To ascertain whether these emotional biases are modality-specific, we examined the effects of MDD on the recognition of emotions from voices using a paradigm designed to capture subtle effects of biases. Twenty-one MDD patients and 21 healthy controls (HC) underwent clinical and neuropsychological assessments, followed by a paradigm featuring pseudowords spoken by actors in five types of emotional prosody, rated on continuous scales. Overall, MDD patients performed more poorly than HC, displaying significantly impaired recognition of fear, happiness and sadness. Compared with HC, they rated fear significantly more highly when listening to anger stimuli. They also displayed a bias toward surprise, rating it far higher when they heard sad or fearful utterances. Furthermore, for happiness stimuli, MDD patients gave higher ratings for negative emotions (fear and sadness). A multiple regression model on recognition of emotional prosody in MDD patients showed that the best fit was achieved using the executive functioning (categorical fluency, number of errors in the MCST, and TMT B-A) and the total score of the Montgomery-Asberg Depression Rating Scale. Impaired recognition of emotions would appear not to be specific to the visual modality but to be present also when emotions are expressed vocally, this impairment being related to depression severity and dysexecutive syndrome. MDD seems to skew the recognition of emotional prosody toward negative emotional stimuli and the blunting of positive emotion appears not to be restricted to the visual modality. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogden, J.

    The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model-based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
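
    The core idea above, that articulators move slowly and that this continuity constraint can be exploited from acoustics alone, can be illustrated with a toy sketch. This is an illustration of the smoothness prior only, not the published Malcom algorithm: it projects synthetic "acoustic" frames to a low-dimensional latent space with PCA and compares frame-to-frame movement before and after imposing temporal smoothness.

      import numpy as np

      rng = np.random.default_rng(3)

      # Toy acoustic frames: a slow 2-D articulatory path observed through
      # a random linear map plus noise (stand-in for real speech features).
      T = 400
      path = np.stack([np.sin(np.linspace(0, 4, T)),
                       np.cos(np.linspace(0, 3, T))], axis=1)
      mix = rng.standard_normal((2, 12))
      acoustics = path @ mix + 0.2 * rng.standard_normal((T, 12))

      # Unconstrained latent estimate: plain PCA projection of the frames.
      centered = acoustics - acoustics.mean(axis=0)
      _, _, vt = np.linalg.svd(centered, full_matrices=False)
      latent = centered @ vt[:2].T

      # Continuity constraint (crude): temporal smoothing of the latent path,
      # reflecting the prior that articulators cannot jump between frames.
      kernel = np.ones(9) / 9
      smooth = np.column_stack([np.convolve(latent[:, j], kernel, mode="same")
                                for j in range(2)])

      def roughness(z):
          # Mean squared frame-to-frame jump; lower = more articulator-like.
          return np.mean(np.sum(np.diff(z, axis=0) ** 2, axis=1))

      print(f"roughness, unconstrained: {roughness(latent):.4f}")
      print(f"roughness, smoothed:      {roughness(smooth):.4f}")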

  10. Ultrasonic Vocalizations: evidence for an affective opponent process during cocaine self-administration

    PubMed Central

    Barker, David J.; Simmons, Steven J.; Servilio, Lisa C.; Bercovicz, Danielle; Ma, Sisi; Root, David H.; Pawlak, Anthony P.; West, Mark O.

    2013-01-01

    Rationale Preclinical models of cocaine addiction in the rodent have shown that cocaine induces both positive and negative affective states. These observations have led to the notion that the initial positive/euphoric state induced by cocaine administration may be followed by an opposing, negative process. In the rodent, one method for inferring positive and negative affective states involves measuring their ultrasonic vocalizations (USVs). Previous USV recordings from our laboratory suggested that the transition between positive and negative affect might involve decaying or sub-satiety levels of self-administered cocaine. Objectives In order to explicitly test the role of cocaine levels on these affective states, the present study examined USVs when calculated body levels of cocaine were clamped (i.e., held at a constant level via experimenter-controlled infusions) at, below, or above subjects' self-determined drug satiety thresholds. Results USVs indicated that 1) positive affect was predominantly observed during the drug loading period, but declined quickly to near zero during maintenance and exhibited little relation to calculated drug level, and 2) in contrast, negative affect was observed at sub-satiety cocaine levels, but was relatively absent when body levels of cocaine were clamped at or above subjects' satiety thresholds. Conclusions The results reinforce the opponent-process hypothesis of addiction and suggest that an understanding of the mechanisms underlying negative affect might serve to inform behavioral and pharmacological therapies. PMID:24197178

  11. Affect recognition across manic and euthymic phases of bipolar disorder in Han-Chinese patients.

    PubMed

    Pan, Yi-Ju; Tseng, Huai-Hsuan; Liu, Shi-Kai

    2013-11-01

    Patients with bipolar disorder (BD) have affect recognition deficits. Whether affect recognition deficits constitute a state or trait marker of BD has great etiopathological significance. The current study aims to explore the interrelationships between affect recognition and basic neurocognitive functions for patients with BD across different mood states, using the Diagnostic Analysis of Non-Verbal Accuracy-2, Taiwanese version (DANVA-2-TW) as the index measure for affect recognition. To our knowledge, this is the first study examining affect recognition deficits of BD across mood states in the Han Chinese population. Twenty-nine manic patients, 16 remitted patients with BD, and 40 control subjects are included in the study. Distinct association patterns between affect recognition and neurocognitive functions are demonstrated for patients with BD and control subjects, implicating alterations in emotion-associated neurocognitive processing. Compared to control subjects, manic patients but not remitted subjects perform significantly worse in the recognition of negative emotions as a whole and specifically anger, after adjusting for differences in general intellectual ability and basic neurocognitive functions. Affect recognition deficit may be a relatively independent impairment in BD rather than a consequence arising from deficits in other basic neurocognition. The impairments of manic patients in the recognition of negative emotions, specifically anger, may further our understanding of the core clinical psychopathology of BD and have implications for treating bipolar patients across distinct mood phases. © 2013 Elsevier B.V. All rights reserved.

  12. Neural activity related to discrimination and vocal production of consonant and dissonant musical intervals.

    PubMed

    González-García, Nadia; González, Martha A; Rendón, Pablo L

    2016-07-15

    Relationships between musical pitches are described as either consonant, when associated with a pleasant and harmonious sensation, or dissonant, when associated with an inharmonious feeling. The accurate singing of musical intervals requires communication between auditory feedback processing and vocal motor control (i.e. audio-vocal integration) to ensure that each note is produced correctly. The objective of this study is to investigate the neural mechanisms through which trained musicians produce consonant and dissonant intervals. We utilized 4 musical intervals (specifically, an octave, a major seventh, a fifth, and a tritone) as the main stimuli for auditory discrimination testing, and we used the same interval tasks to assess vocal accuracy in a group of musicians (11 subjects, all female vocal students at conservatory level). The intervals were chosen so as to test for differences in recognition and production of consonant and dissonant intervals, as well as narrow and wide intervals. The subjects were studied using fMRI during performance of the interval tasks; the control condition consisted of passive listening. Singing dissonant intervals as opposed to singing consonant intervals led to an increase in activation in several regions, most notably the primary auditory cortex, the primary somatosensory cortex, the amygdala, the left putamen, and the right insula. Singing wide intervals as opposed to singing narrow intervals resulted in the activation of the right anterior insula. Moreover, we also observed a correlation between singing in tune and brain activity in the premotor cortex, and a positive correlation between training and activation of primary somatosensory cortex, primary motor cortex, and premotor cortex during singing. When singing dissonant intervals, a higher degree of training correlated with the right thalamus and the left putamen. Our results indicate that singing dissonant intervals requires greater involvement of neural mechanisms associated with integrating external feedback from auditory and sensorimotor systems than singing consonant intervals, and it would then seem likely that dissonant intervals are intoned by adjusting the neural mechanisms used for the production of consonant intervals. Singing wide intervals requires a greater degree of control than singing narrow intervals, as it involves neural mechanisms which again involve the integration of internal and external feedback. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Cultural relativity in perceiving emotion from vocalizations.

    PubMed

    Gendron, Maria; Roberson, Debi; van der Vyver, Jacoba Marieta; Barrett, Lisa Feldman

    2014-04-01

    A central question in the study of human behavior is whether certain emotions, such as anger, fear, and sadness, are recognized in nonverbal cues across cultures. We predicted and found that in a concept-free experimental task, participants from an isolated cultural context (the Himba ethnic group from northwestern Namibia) did not freely label Western vocalizations with expected emotion terms. Responses indicate that Himba participants perceived more basic affective properties of valence (positivity or negativity) and to some extent arousal (high or low activation). In a second, concept-embedded task, we manipulated whether the target and foil on a given trial matched in both valence and arousal, neither valence nor arousal, valence only, or arousal only. Himba participants achieved above-chance accuracy only when foils differed from targets in valence only. Our results indicate that the voice can reliably convey affective meaning across cultures, but that perceptions of emotion from the voice are culturally variable.

  14. How do typically developing deaf children and deaf children with autism spectrum disorder use the face when comprehending emotional facial expressions in British sign language?

    PubMed

    Denmark, Tanya; Atkinson, Joanna; Campbell, Ruth; Swettenham, John

    2014-10-01

    Facial expressions in sign language carry a variety of communicative features. While emotion can modulate a spoken utterance through changes in intonation, duration and intensity, in sign language specific facial expressions presented concurrently with a manual sign perform this function. When deaf adult signers cannot see facial features, their ability to judge emotion in a signed utterance is impaired (Reilly et al. in Sign Lang Stud 75:113-118, 1992). We examined the role of the face in the comprehension of emotion in sign language in a group of typically developing (TD) deaf children and in a group of deaf children with autism spectrum disorder (ASD). We replicated Reilly et al.'s (Sign Lang Stud 75:113-118, 1992) adult results in the TD deaf signing children, confirming the importance of the face in understanding emotion in sign language. The ASD group performed more poorly on the emotion recognition task than the TD children. The deaf children with ASD showed a deficit in emotion recognition during sign language processing analogous to the deficit in vocal emotion recognition that has been observed in hearing children with ASD.

  15. CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset

    PubMed Central

    Cao, Houwei; Cooper, David G.; Keutmann, Michael K.; Gur, Ruben C.; Nenkova, Ani; Verma, Ragini

    2014-01-01

    People convey their emotional state in their face and voice. We present an audio-visual data set uniquely suited for the study of multi-modal emotion expression and perception. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-value intensity values for the perceived emotion were collected using crowd-sourcing from 2,443 raters. Human recognition rates of the intended emotion for the audio-only, visual-only, and audio-visual data are 40.9%, 58.2%, and 63.6%, respectively. Recognition rates are highest for neutral, followed by happy, anger, disgust, fear, and sad. Average intensity levels of emotion are rated highest for visual-only perception. The accurate recognition of disgust and fear requires simultaneous audio-visual cues, while anger and happiness can be well recognized based on evidence from a single modality. The large dataset we introduce can be used to probe other questions concerning the audio-visual perception of emotion. PMID:25653738

  16. Reading the mind in the infant eyes: paradoxical effects of oxytocin on neural activity and emotion recognition in watching pictures of infant faces.

    PubMed

    Voorthuis, Alexandra; Riem, Madelon M E; Van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J

    2014-09-11

    The neuropeptide oxytocin facilitates parental caregiving and is involved in the processing of infant vocal cues. In this randomized controlled trial with functional magnetic resonance imaging we examined the influence of intranasally administered oxytocin on neural activity during emotion recognition in infant faces. Blood oxygenation level dependent (BOLD) responses during emotion recognition were measured in 50 women who were administered 16 IU of oxytocin or a placebo. Participants performed an adapted version of the Infant Facial Expressions of Emotions from Looking at Pictures (IFEEL pictures), a task that has been developed to assess the perception and interpretation of infants' facial expressions. Experimentally induced oxytocin levels increased activation in the inferior frontal gyrus (IFG), the middle temporal gyrus (MTG) and the superior temporal gyrus (STG). However, oxytocin decreased performance on the IFEEL picture task. Our findings suggest that oxytocin enhances the processing of facial cues of the emotional state of infants on a neural level, but at the same time it may decrease the correct interpretation of infants' facial expressions on a behavioral level. This article is part of a Special Issue entitled Oxytocin and Social Behavior. © 2013 Published by Elsevier B.V.

  17. Artificially lengthened and constricted vocal tract in vocal training methods.

    PubMed

    Bele, Irene Velsvik

    2005-01-01

    It is common practice in vocal training to make use of vocal exercise techniques that involve partial occlusion of the vocal tract. Various techniques are used; some of them form an occlusion within the front part of the oral cavity or at the lips. Another vocal exercise technique involves lengthening the vocal tract; for example, the method of phonation into small tubes. This essay presents studies of the effects of various vocal training methods that involve an artificially lengthened and constricted vocal tract, and of the influence of sufficient acoustic impedance on vocal fold vibration and economical voice production.

  18. On Assisting a Visual-Facial Affect Recognition System with Keyboard-Stroke Pattern Information

    NASA Astrophysics Data System (ADS)

    Stathopoulou, I.-O.; Alepis, E.; Tsihrintzis, G. A.; Virvou, M.

    Towards realizing a multimodal affect recognition system, we are considering the advantages of assisting a visual-facial expression recognition system with keyboard-stroke pattern information. Our work is based on the assumption that the visual-facial and keyboard modalities are complementary to each other and that their combination can significantly improve the accuracy of affective user models. Specifically, we present and discuss the development and evaluation process of two corresponding affect recognition subsystems, with emphasis on the recognition of six basic emotional states, namely happiness, sadness, surprise, anger, and disgust, as well as the emotion-less state, which we refer to as neutral. We find that emotion recognition by the visual-facial modality can be aided greatly by keyboard-stroke pattern information, and that the combination of the two modalities can lead to better results towards building a multimodal affect recognition system.
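
    A common way to combine complementary modalities like these is decision-level (late) fusion, in which each subsystem outputs a per-emotion probability vector and the vectors are merged before the final decision. The sketch below shows one hedged version of this idea; the 0.6/0.4 weighting and the function names are illustrative assumptions, not the combination scheme reported by the authors.

      import numpy as np

      EMOTIONS = ["happiness", "sadness", "surprise", "anger", "disgust", "neutral"]

      def fuse_late(p_face: np.ndarray, p_keys: np.ndarray, w_face: float = 0.6) -> str:
          """Weighted late fusion of two per-emotion probability vectors."""
          p = w_face * p_face + (1.0 - w_face) * p_keys   # convex combination
          return EMOTIONS[int(np.argmax(p))]

      # Example: the face model leans 'happiness', the keystroke model 'neutral'.
      face = np.array([0.45, 0.05, 0.10, 0.05, 0.05, 0.30])
      keys = np.array([0.20, 0.05, 0.05, 0.05, 0.05, 0.60])
      print(fuse_late(face, keys))   # fused decision -> 'neutral' in this toy case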

  19. L1 literacy affects L2 pronunciation intake and text vocalization

    NASA Astrophysics Data System (ADS)

    Walton, Martin

    2005-04-01

    For both deaf and hearing learners, L1 acquisition calls on auditive, gestural and visual modes in progressive processes over longer stages imposed in strictly anatomical and social order from the earliest pre-lexical phase [Jusczyk (1993), Kuhl & Meltzoff (1996)] to ultimate literacy. By contrast, L2 learning will call on accelerating procedures but with restricted input, arbitrated by L1 literacy as can be traced in the English of French-speaking learners, whether observed in spontaneous speech or in text vocalization modes. An inventory of their predictable omissions, intrusions and substitutions at suprasegmental and syllabic levels, many of which they can actually hear while unable to vocalize in real-time, suggests that a photogenic segmentation of continuous speech into alphabetical units has eclipsed the indispensable earlier phonogenic module, filtering L2 intake and output. This competing mode analysis hypothesizes a critical effect on L2 pronunciation of L1 graphemic procedures acquired usually before puberty, informing data for any Critical Period Hypothesis or amounts of L1 activation influencing L2 accent [Flege (1997, 1998)] or any psychoacoustic French deafness with regard to English stress-timing [Dupoux (1997)]. A metaphonic model [Howell & Dean (1991)] adapted for French learners may remedially distance L1 from L2 vocalization procedures.

  20. Laryngoscopic, acoustic, perceptual, and functional assessment of voice in rock singers.

    PubMed

    Guzman, Marco; Barros, Macarena; Espinoza, Fernanda; Herrera, Alejandro; Parra, Daniela; Muñoz, Daniel; Lloyd, Adam

    2013-01-01

    The present study aimed to vocally assess a group of rock singers who use growl voice and reinforced falsetto. A group of 21 rock singers and a control group of 18 pop singers were included. Singing and speaking voices were assessed through acoustic, perceptual, functional, and laryngoscopic analyses. No significant differences were observed between groups in most of the analyses. Acoustic and perceptual analysis of the experimental group demonstrated normality of the speaking voice. Endoscopic evaluation showed that, during singing, most rock singers presented a high vertical laryngeal position, pharyngeal compression, and supraglottic laryngeal compression. Supraglottic activity during speaking voice tasks was also observed. However, overall vocal fold integrity was demonstrated in most of the participants. Slightly abnormal observations were demonstrated in a few of them. The singing Voice Handicap Index revealed that the most affected variable was the physical sphere, followed by the social and emotional spheres. Although growl voice and reinforced falsetto involve laryngeal and pharyngeal hyperfunctional activity, they did not seem to contribute to the presence of any major vocal fold disorder in our subjects. Nevertheless, we cannot rule out the possibility that more evident vocal fold disorders could be found in singers who use these techniques more often and over a longer period of time.

  1. How do you say 'hello'? Personality impressions from brief novel voices.

    PubMed

    McAleer, Phil; Todorov, Alexander; Belin, Pascal

    2014-01-01

    On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, studies have hitherto focussed on extended speech as opposed to analysing the instantaneous impressions we obtain from first experience. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word 'hello' on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional 'social voice space' with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
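
    A low-dimensional 'social voice space' of this kind is typically recovered by applying a dimension-reduction step such as principal component analysis to the voice-by-trait matrix of mean ratings. The sketch below illustrates that step under stated assumptions (random placeholder data, z-scored traits, PCA as the reduction method; the abstract does not name the authors' exact technique), and whether the first two components align with Valence and Dominance has to be checked against the trait loadings.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      ratings = rng.random((64, 10))                # placeholder: 64 voices x 10 trait means

      z = StandardScaler().fit_transform(ratings)   # z-score each trait across voices
      pca = PCA(n_components=2)
      scores = pca.fit_transform(z)                 # (64, 2): candidate Valence/Dominance axes
      print(pca.explained_variance_ratio_)          # variance captured by the two axes
      print(pca.components_)                        # trait loadings used to label the axes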

  2. Selective impairment of song learning following lesions of a forebrain nucleus in the juvenile zebra finch.

    PubMed

    Sohrabji, F; Nordeen, E J; Nordeen, K W

    1990-01-01

    Area X, a large sexually dimorphic nucleus in the avian ventral forebrain, is part of a highly discrete system of interconnected nuclei that have been implicated in either song learning or adult song production. Previously, this nucleus has been included in the song system because of its substantial connections with other vocal control nuclei, and because its volume is positively correlated with the capacity for song. In order to directly assess the role of Area X in song behavior, this nucleus was bilaterally lesioned in both juvenile and adult zebra finches, using ibotenic acid. We report here that lesioning Area X disrupts normal song development in juvenile birds, but does not affect the production of stereotyped song by adult birds. Although juvenile-lesioned birds were consistently judged as being in earlier stages of vocal development than age-matched controls, they continued to produce normal song-like vocalizations. Thus, unlike the lateral magnocellular nucleus of the anterior neostriatum, another avian forebrain nucleus implicated in song learning, Area X does not seem to be necessary for sustaining production of juvenile song. Rather, the behavioral results suggest Area X is important for either the acquisition of a song model or the improvement of song through vocal practice.

  3. Cognitive bias in rats evoked by ultrasonic vocalizations suggests emotional contagion.

    PubMed

    Saito, Yumi; Yuki, Shoko; Seki, Yoshimasa; Kagawa, Hiroko; Okanoya, Kazuo

    2016-11-01

    Emotional contagion occurs when an individual acquires the emotional state of another via social cues, and is an important component of empathy. Empathic responses seen in rodents are often explained by emotional contagion. Rats emit 50 kHz ultrasonic vocalizations (USVs) in positive contexts, and emit 22 kHz USVs in negative contexts. We tested whether rats show positive or negative emotional contagion after hearing conspecific USVs via a cognitive bias task. We hypothesized that animals in positive emotional states would perceive an ambiguous cue as being good (optimistic bias) whereas animals in negative states would perceive the same cue as being bad (pessimistic bias). Rats were trained to respond differently to two sounds with distinct pitches, each of which signaled either a positive or a negative outcome. An ambiguous cue with a frequency falling between the two stimuli tested whether rats interpreted it as positive or negative. Results showed that rats responded to ambiguous cues as positive when they heard the 50 kHz USV (positive vocalizations) and negative when they heard the 22 kHz USV (negative vocalizations). This suggests that conspecific USVs can evoke emotional contagion, both for positive and negative emotions, to change the affective states in receivers. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. The impact of perilaryngeal vibration on the self-perception of loudness and the Lombard effect.

    PubMed

    Brajot, François-Xavier; Nguyen, Don; DiGiovanni, Jeffrey; Gracco, Vincent L

    2018-04-05

    The role of somatosensory feedback in speech and the perception of loudness was assessed in adults without speech or hearing disorders. Participants completed two tasks: loudness magnitude estimation of a short vowel and oral reading of a standard passage. Both tasks were carried out in each of three conditions: no-masking, auditory masking alone, and mixed auditory masking plus vibration of the perilaryngeal area. A Lombard effect was elicited in both masking conditions: speakers unconsciously increased vocal intensity. Perilaryngeal vibration further increased vocal intensity above what was observed for auditory masking alone. Both masking conditions affected fundamental frequency and the first formant frequency as well, but only vibration was associated with a significant change in the second formant frequency. An additional analysis of pure-tone thresholds found no difference in auditory thresholds between masking conditions. Taken together, these findings indicate that perilaryngeal vibration effectively masked somatosensory feedback, resulting in an enhanced Lombard effect (increased vocal intensity) that did not alter speakers' self-perception of loudness. This implies that the Lombard effect results from a general sensorimotor process, rather than from a specific audio-vocal mechanism, and that the conscious self-monitoring of speech intensity is not directly based on either auditory or somatosensory feedback.

  5. The contribution of sound intensity in vocal emotion perception: behavioral and electrophysiological evidence.

    PubMed

    Chen, Xuhai; Yang, Jianfeng; Gan, Shuzhen; Yang, Yufang

    2012-01-01

    Although its role is frequently stressed in the acoustic profile of vocal emotion, sound intensity is frequently regarded as a control parameter in neurocognitive studies of vocal emotion, leaving its role and neural underpinnings unclear. To investigate these issues, we asked participants to rate the anger level of neutral and angry prosodies before and after sound intensity modification in Experiment 1, and recorded the electroencephalogram (EEG) for mismatching emotional prosodies with and without sound intensity modification and for matching emotional prosodies while participants performed emotional feature or sound intensity congruity judgment in Experiment 2. It was found that sound intensity modification had a significant effect on the rating of anger level for angry prosodies, but not for neutral ones. Moreover, mismatching emotional prosodies, relative to matching ones, induced an enhanced N2/P3 complex and theta band synchronization irrespective of sound intensity modification and task demands. However, mismatching emotional prosodies with reduced sound intensity showed prolonged peak latency and decreased amplitude in the N2/P3 complex and smaller theta band synchronization. These findings suggest that although it cannot categorically affect the emotionality conveyed in emotional prosodies, sound intensity contributes to emotional significance quantitatively, implying that sound intensity should not simply be taken as a control parameter and that its unique role needs to be specified in vocal emotion studies.

  6. The Contribution of Sound Intensity in Vocal Emotion Perception: Behavioral and Electrophysiological Evidence

    PubMed Central

    Chen, Xuhai; Yang, Jianfeng; Gan, Shuzhen; Yang, Yufang

    2012-01-01

    Although its role is frequently stressed in the acoustic profile of vocal emotion, sound intensity is frequently regarded as a control parameter in neurocognitive studies of vocal emotion, leaving its role and neural underpinnings unclear. To investigate these issues, we asked participants to rate the anger level of neutral and angry prosodies before and after sound intensity modification in Experiment 1, and recorded electroencephalogram (EEG) for mismatching emotional prosodies with and without sound intensity modification and for matching emotional prosodies while participants performed emotional feature or sound intensity congruity judgment in Experiment 2. It was found that sound intensity modification had a significant effect on the rating of anger level for angry prosodies, but not for neutral ones. Moreover, mismatching emotional prosodies, relative to matching ones, induced an enhanced N2/P3 complex and theta band synchronization irrespective of sound intensity modification and task demands. However, mismatching emotional prosodies with reduced sound intensity showed prolonged peak latency and decreased amplitude in the N2/P3 complex and smaller theta band synchronization. These findings suggest that although it cannot categorically affect the emotionality conveyed in emotional prosodies, sound intensity contributes to emotional significance quantitatively, implying that sound intensity should not simply be taken as a control parameter and that its unique role needs to be specified in vocal emotion studies. PMID:22291928

  7. Transmasculine People's Voice Function: A Review of the Currently Available Evidence.

    PubMed

    Azul, David; Nygren, Ulrika; Södersten, Maria; Neuschaefer-Rube, Christiane

    2017-03-01

    This study aims to evaluate the currently available discursive and empirical data relating to those aspects of transmasculine people's vocal situations that are not primarily gender-related, to identify restrictions to voice function that have been observed in this population, and to make suggestions for future voice research and clinical practice. We conducted a comprehensive review of the voice literature. Publications were identified by searching six electronic databases and the bibliographies of relevant articles. Twenty-two publications met the inclusion criteria. Discourses and empirical data were analyzed for factors and practices that impact on voice function and for indications of voice function-related problems in transmasculine people. The quality of the evidence was appraised. The extent and quality of studies investigating transmasculine people's voice function were found to be limited. There was mixed evidence to suggest that transmasculine people might experience restrictions to a range of domains of voice function, including vocal power, vocal control/stability, glottal function, pitch range/variability, vocal endurance, and voice quality. More research into the different factors and practices affecting transmasculine people's voice function, taking account of a range of parameters of voice function and considering participants' self-evaluations, is needed to establish how functional voice production can best be supported in this population. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Vocal Tract Discomfort and Voice-Related Quality of Life in Wind Instrumentalists.

    PubMed

    Cappellaro, Juliane; Beber, Bárbara Costa

    2018-05-01

    This study aimed to investigate vocal tract discomfort and voice-related quality of life in wind instrumentalists. It is a cross-sectional study. The sample was composed of 37 musicians of the orchestra of Caxias do Sul city, RS, Brazil. The participants answered a nonstandard questionnaire about demographic and professional information, the Voice-Related Quality of Life (V-RQOL), the Vocal Tract Discomfort (VTD) scale, and additional items about fatigue after playing the instrument and pain in the cervical muscles. Correlation analyses were performed using the Spearman correlation test. The symptoms most frequently mentioned by musicians in the VTD, for both frequency and intensity of occurrence, were dryness, ache, and irritability, in addition to cervical muscle pain and the frequent occurrence of fatigue after playing. The musicians showed high scores on the V-RQOL survey. Several symptoms evaluated by the VTD had a negative correlation with the musicians' years of orchestra membership and with V-RQOL scores. Symptoms of vocal tract discomfort are present in wind instrumentalists at low frequency and intensity of occurrence. However, these symptoms affect the musicians' voice-related quality of life, and they occur more in musicians with fewer years of orchestra membership. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Effects of melody and technique on acoustical and musical features of western operatic singing voices.

    PubMed

    Larrouy-Maestri, Pauline; Magis, David; Morsomme, Dominique

    2014-05-01

    The operatic singing technique is frequently used in classical music. Several acoustical parameters of this specific technique have been studied but how these parameters combine remains unclear. This study aims to further characterize the Western operatic singing technique by observing the effects of melody and technique on acoustical and musical parameters of the singing voice. Fifty professional singers performed two contrasting melodies (popular song and romantic melody) with two vocal techniques (with and without operatic singing technique). The common quality parameters (energy distribution, vibrato rate, and extent), perturbation parameters (standard deviation of the fundamental frequency, signal-to-noise ratio, jitter, and shimmer), and musical features (fundamental frequency of the starting note, average tempo, and sound pressure level) of the 200 sung performances were analyzed. The results regarding the effect of melody and technique on the acoustical and musical parameters show that the choice of melody had a limited impact on the parameters observed, whereas a particular vocal profile appeared depending on the vocal technique used. This study confirms that vocal technique affects most of the parameters examined. In addition, the observation of quality, perturbation, and musical parameters contributes to a better understanding of the Western operatic singing technique. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  10. Acoustic characteristics of simulated respiratory-induced vocal tremor.

    PubMed

    Lester, Rosemary A; Story, Brad H

    2013-05-01

    The purpose of this study was to investigate the relation of respiratory forced oscillation to the acoustic characteristics of vocal tremor. Acoustical analyses were performed to determine the characteristics of the intensity and fundamental frequency (F0) for speech samples obtained by Farinella, Hixon, Hoit, Story, and Jones (2006) using a respiratory forced oscillation paradigm with 5 healthy adult males to simulate vocal tremor involving respiratory pressure modulation. The analyzed conditions were sustained productions of /a/ with amplitudes of applied pressure of 0, 1, 2, and 4 cmH2O and a rate of 5 Hz. Forced oscillation of the respiratory system produced modulation of the intensity and F0 for all participants. Variability was observed between participants and conditions in the change in intensity and F0 per unit of pressure change, as well as in the mean intensity and F0. However, the extent of modulation of intensity and F0 generally increased as the applied pressure increased, as would be expected. These findings suggest that individuals develop idiosyncratic adaptations to pressure modulations, which are important to understanding aspects of variability in vocal tremor, and highlight the need to assess all components of the speech mechanism that may be directly or indirectly affected by tremor.
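
    The "extent of modulation" of F0 at a fixed forcing rate can be quantified by reading off the amplitude of the corresponding spectral component of the F0 contour. A minimal sketch, assuming a uniformly sampled F0 contour and the study's 5 Hz forcing rate (the authors' exact analysis method is not given in the abstract):

      import numpy as np

      def modulation_extent_percent(f0_hz, fs_hz, rate_hz=5.0):
          """Modulation extent (% of mean F0) at the forcing rate, via the
          amplitude of the nearest FFT bin of the mean-removed F0 contour."""
          f0 = np.asarray(f0_hz, dtype=float)
          w = np.hanning(len(f0))
          spectrum = np.fft.rfft((f0 - f0.mean()) * w)
          freqs = np.fft.rfftfreq(len(f0), d=1.0 / fs_hz)
          k = int(np.argmin(np.abs(freqs - rate_hz)))
          amplitude = 2.0 * np.abs(spectrum[k]) / w.sum()   # window-corrected amplitude
          return 100.0 * amplitude / f0.mean()

      # Example: a 120 Hz voice modulated +/- 3 Hz at 5 Hz, sampled at 100 Hz.
      t = np.arange(0, 2.0, 0.01)
      f0 = 120 + 3 * np.sin(2 * np.pi * 5 * t)
      print(modulation_extent_percent(f0, fs_hz=100.0))     # ~2.5 (= 3/120 * 100)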

  11. Theory of Mind (ToM) and counterfactuality deficits in schizophrenia: misperception or misinterpretation?

    PubMed

    Leitman, David I; Ziwich, Rachel; Pasternak, Roey; Javitt, Daniel C

    2006-08-01

    Theory of Mind (ToM) refers to the ability to infer another person's mental state based upon interactional information. ToM deficits have been suggested to underlie crucial aspects of social interaction failure in disorders such as autism and schizophrenia, although the development of paradigms for demonstrating such deficits remains an ongoing area of research. Recent studies have explored the use of sarcasm perception, in which subjects must infer an individual's sincerity or lack thereof, as a 'real-life' index of ToM ability, and as an index of functioning of specific right hemispheric structures. Sarcasm detection ability has not previously been studied in schizophrenia, although patients have been shown to have deficits in the ability to decode emotional information from speech ('affective prosody'). Twenty-two schizophrenia patients and 17 control subjects were tested on their ability to detect sarcasm from spoken speech as well as on measures of affective prosody and basic pitch perception. Despite normal overall intelligence, patients performed substantially worse than controls in the ability to detect sarcasm (d=2.2), showing both decreased sensitivity (A') in the detection of sincerity versus sarcasm and an increased bias (B'') toward sincerity. Correlations across groups revealed significant relationships between impairments in sarcasm recognition, affective prosody, and basic pitch perception. These findings demonstrate substantial deficits in the ability to infer an internal subjective state based upon vocal modulation among subjects with schizophrenia. Deficits were related to, but were significantly more severe than, more general forms of prosodic and sensorial misperception, and are consistent with both right hemispheric and 'bottom-up' theories of the disorder.
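
    The sensitivity and bias indices reported here are the standard non-parametric signal detection measures, computed from the hit rate (correctly calling sarcasm sarcastic) and the false-alarm rate (calling sincere speech sarcastic). A sketch of the conventional formulas follows; the exact variant the authors used is not stated in the abstract, so take this as the textbook definition rather than their precise computation.

      def a_prime(h: float, f: float) -> float:
          """Non-parametric sensitivity A' from hit rate h and false-alarm rate f.
          Assumes 0 < h < 1 and 0 < f < 1 (apply a correction for extreme rates)."""
          if h >= f:
              return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
          return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

      def b_double_prime(h: float, f: float) -> float:
          """Grier's response bias B''; positive values indicate a conservative
          criterion (here, a bias toward answering 'sincere')."""
          num = h * (1 - h) - f * (1 - f)
          den = h * (1 - h) + f * (1 - f)
          return num / den if den else 0.0

      print(a_prime(0.80, 0.20), b_double_prime(0.80, 0.20))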

  12. Changes in Peak Airflow Measurement During Maximal Cough After Vocal Fold Augmentation in Patients With Glottic Insufficiency.

    PubMed

    Dion, Gregory R; Achlatis, Efstratios; Teng, Stephanie; Fang, Yixin; Persky, Michael; Branski, Ryan C; Amin, Milan R

    2017-11-01

    Compromised cough effectiveness is correlated with dysphagia and aspiration. Glottic insufficiency likely yields decreased cough strength and effectiveness. Although vocal fold augmentation favorably affects voice and likely improves cough strength, few data exist to support this hypothesis. To assess whether vocal fold augmentation improves peak airflow measurements during maximal-effort cough following augmentation. This case series study was conducted in a tertiary, academic laryngology clinic. Participants included 14 consecutive individuals with glottic insufficiency due to vocal fold paralysis, which was diagnosed via videostrobolaryngoscopy as a component of routine clinical examination. All participants who chose to proceed with augmentation were considered for the study whether office-based or operative augmentation was planned. Postaugmentation data were collected only at the first follow-up visit, which was targeted for 14 days after augmentation but varied on the basis of participant availability. Data were collected from June 5, 2014, to October 1, 2015. Data analysis took place between October 2, 2015, and March 3, 2017. Peak airflow during maximal volitional cough was quantified before and after vocal fold augmentation. Participants performed maximal coughs, and peak expiratory flow during the maximal cough was captured according to American Thoracic Society guidelines. Among the 14 participants (7 men and 7 women), the mean (SD) age was 62 (18) years. Three types of injectable material were used for vocal fold augmentation: carboxymethylcellulose in 5 patients, hyaluronic acid in 5, and calcium hydroxylapatite in 4. Following augmentation, cough strength increased in 11 participants and decreased cough strength was observed in 3. Peak airflow measurements during maximal cough varied from a decrease of 40 L/min to an increase of 150 L/min following augmentation. When preaugmentation and postaugmentation peak airflow measurements were compared, the median improvement was 50 L/min (95% CI, 10-75 L/min; P = .01). Immediate peak airflow measurements during cough collected within 30 minutes of augmentation varied when compared with measurements collected at follow-up (103-380 vs 160-390 L/min). Peak airflow during maximal cough may improve with vocal fold augmentation. Additional assessment and measurements are needed to further delineate which patients will benefit most regarding their cough from vocal fold augmentation.

  13. Does a pneumotach accurately characterize voice function?

    NASA Astrophysics Data System (ADS)

    Walters, Gage; Krane, Michael

    2016-11-01

    A study is presented which addresses how a pneumotach might adversely affect clinical measurements of voice function. A pneumotach is a device, typically a mask worn over the mouth, used to measure time-varying glottal volume flow. By measuring the time-varying difference in pressure across a known aerodynamic resistance element in the mask, the glottal volume flow waveform is estimated. Because it adds aerodynamic resistance to the vocal system, there is some concern that using a pneumotach may not accurately portray the behavior of the voice. To test this hypothesis, experiments were performed in a simplified airway model with the principal dimensions of an adult human upper airway. A compliant constriction, fabricated from silicone rubber, modeled the vocal folds. Variations of transglottal pressure, time-averaged volume flow, model vocal fold vibration amplitude, and radiated sound with subglottal pressure were measured, with and without the pneumotach in place, and differences were noted. The authors acknowledge the support of NIH Grant 2R01DC005642-10A1.
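
    The flow estimate rests on a simple relation: for a resistive element operating in its linear range, volume flow is the measured pressure drop divided by the calibrated resistance. A minimal sketch, assuming a linear device whose resistance R comes from calibration (the variable names and units are illustrative):

      import numpy as np

      def glottal_flow(delta_p_pa: np.ndarray, r_pa_s_per_l: float) -> np.ndarray:
          """Estimate volume flow (L/s) from the pressure drop (Pa) across the
          pneumotach's resistive element: Q(t) = dp(t) / R (linear assumption;
          real devices are calibrated and may need a nonlinear correction)."""
          return np.asarray(delta_p_pa, dtype=float) / r_pa_s_per_l

      dp = np.array([12.0, 15.5, 9.8])      # example pressure-drop samples (Pa)
      print(glottal_flow(dp, r_pa_s_per_l=35.0))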

  14. Social-bond strength influences vocally mediated recruitment to mobbing

    PubMed Central

    2016-01-01

    Strong social bonds form between individuals in many group-living species, and these relationships can have important fitness benefits. When responding to vocalizations produced by groupmates, receivers are expected to adjust their behaviour depending on the nature of the bond they share with the signaller. Here we investigate whether the strength of the signaller–receiver social bond affects response to calls that attract others to help mob a predator. Using field-based playback experiments on a habituated population of wild dwarf mongooses (Helogale parvula), we first demonstrate that a particular vocalization given on detecting predatory snakes does act as a recruitment call; receivers were more likely to look, approach and engage in mobbing behaviour than in response to control close calls. We then show that individuals respond more strongly to these recruitment calls if they are from groupmates with whom they are more strongly bonded (those with whom they preferentially groom and forage). Our study, therefore, provides novel evidence about the anti-predator benefits of close bonds within social groups. PMID:27903776

  15. Foxp2 controls synaptic wiring of corticostriatal circuits and vocal communication by opposing Mef2c.

    PubMed

    Chen, Yi-Chuan; Kuo, Hsiao-Ying; Bornschein, Ulrich; Takahashi, Hiroshi; Chen, Shih-Yun; Lu, Kuan-Ming; Yang, Hao-Yu; Chen, Gui-May; Lin, Jing-Ruei; Lee, Yi-Hsin; Chou, Yun-Chia; Cheng, Sin-Jhong; Chien, Cheng-Ting; Enard, Wolfgang; Hevers, Wulf; Pääbo, Svante; Graybiel, Ann M; Liu, Fu-Chin

    2016-11-01

    Cortico-basal ganglia circuits are critical for speech and language and are implicated in autism spectrum disorder, in which language function can be severely affected. We demonstrate that in the mouse striatum, the gene Foxp2 negatively interacts with the synapse suppressor gene Mef2c. We present causal evidence that Mef2c inhibition by Foxp2 in neonatal mouse striatum controls synaptogenesis of corticostriatal inputs and vocalization in neonates. Mef2c suppresses corticostriatal synapse formation and striatal spinogenesis, but can itself be repressed by Foxp2 through direct DNA binding. Foxp2 deletion de-represses Mef2c, and both intrastriatal and global decrease of Mef2c rescue vocalization and striatal spinogenesis defects of Foxp2-deletion mutants. These findings suggest that Foxp2-Mef2C signaling is critical to corticostriatal circuit formation. If found in humans, such signaling defects could contribute to a range of neurologic and neuropsychiatric disorders.

  16. Foxp2 Controls Synaptic Wiring of Corticostriatal Circuits and Vocal Communication by Opposing Mef2C

    PubMed Central

    Chen, Yi-Chuan; Kuo, Hsiao-Ying; Bornschein, Ulrich; Takahashi, Hiroshi; Chen, Shih-Yun; Lu, Kuan-Ming; Yang, Hao-Yu; Chen, Gui-May; Lin, Jing-Ruei; Lee, Yi-Hsin; Chou, Yun-Chia; Cheng, Sin-Jhong; Chien, Cheng-Ting; Enard, Wolfgang; Hevers, Wulf; Pääbo, Svante; Graybiel, Ann M.; Liu, Fu-Chin

    2016-01-01

    Cortico-basal ganglia circuits are critical for speech and language and are implicated in autism spectrum disorder (ASD), in which language function can be severely affected. We demonstrate that in the striatum, the gene, Foxp2, negatively interacts with the synapse suppressor, Mef2C. We present causal evidence that Mef2C inhibition by Foxp2 in neonatal mouse striatum controls synaptogenesis of corticostriatal inputs and vocalization in neonates. Mef2C suppresses corticostriatal synapse formation and striatal spinogenesis, but can, itself, be repressed by Foxp2 through direct DNA binding. Foxp2 deletion de-represses Mef2C, and both intrastriatal and global decrease of Mef2C rescue vocalization and striatal spinogenesis defects of Foxp2-deletion mutants. These findings suggest that Foxp2-Mef2C signaling is critical to corticostriatal circuit formation. If found in humans, such signaling defects could contribute to a range of neurologic and neuropsychiatric disorders. PMID:27595386

  17. Neuromuscular control of fundamental frequency and glottal posture at phonation onset

    PubMed Central

    Chhetri, Dinesh K.; Neubauer, Juergen; Berry, David A.

    2012-01-01

    The laryngeal neuromuscular mechanisms for modulating glottal posture and fundamental frequency are of interest in understanding normal laryngeal physiology and treating vocal pathology. The intrinsic laryngeal muscles in an in vivo canine model were electrically activated in a graded fashion to investigate their effects on onset frequency, phonation onset pressure, vocal fold strain, and glottal distance at the vocal processes. Muscle activation plots for these laryngeal parameters were evaluated for the interaction of the following pairs of muscle activation conditions: (1) cricothyroid (CT) versus all laryngeal adductors (TA/LCA/IA), (2) CT versus LCA/IA, (3) CT versus thyroarytenoid (TA), and (4) TA versus LCA/IA (LCA: lateral cricoarytenoid muscle; IA: interarytenoid muscle). Increases in onset frequency and strain were primarily affected by CT activation. Onset pressure correlated with activation of all adductors in activation condition 1, but primarily with CT activation in conditions 2 and 3. TA and CT were antagonistic for strain. LCA/IA activation primarily closed the cartilaginous glottis, while TA activation closed the mid-membranous glottis. PMID:22352513

  18. Numerical analysis of effects of transglottal pressure change on fundamental frequency of phonation.

    PubMed

    Deguchi, Shinji; Matsuzaki, Yuji; Ikeda, Tadashige

    2007-02-01

    In humans, a decrease in transglottal pressure (Pt) causes an increase in the fundamental frequency of phonation (F0) only at a specific voice pitch within the modal register, the mechanism of which remains unclear. In the present study, numerical analyses were performed to investigate the mechanism of the voice pitch-dependent positive change of F0 due to Pt decrease. The airflow and the airway, including the vocal folds, were modeled in terms of mechanics of fluid and structure. Simulations of phonation using the numerical model indicated that Pt affects both the average position and the average amplitude magnitude of vocal fold self-excited oscillation in a non-monotonous manner. This effect results in voice pitch-dependent responses of F0 to Pt decreases, including the positive response of F0 as actually observed in humans. The findings of the present study highlight the importance of considering self-excited oscillation of the vocal folds in elucidation of the phonation mechanism.

  19. I feel your voice. Cultural differences in the multisensory perception of emotion.

    PubMed

    Tanaka, Akihiro; Koizumi, Ai; Imai, Hisato; Hiramatsu, Saori; Hiramoto, Eriko; de Gelder, Beatrice

    2010-09-01

    Cultural differences in emotion perception have been reported mainly for facial expressions and to a lesser extent for vocal expressions. However, the way in which the perceiver combines auditory and visual cues may itself be subject to cultural variability. Our study investigated cultural differences between Japanese and Dutch participants in the multisensory perception of emotion. A face and a voice, expressing either congruent or incongruent emotions, were presented on each trial. Participants were instructed to judge the emotion expressed in one of the two sources. The effect of to-be-ignored voice information on facial judgments was larger in Japanese than in Dutch participants, whereas the effect of to-be-ignored face information on vocal judgments was smaller in Japanese than in Dutch participants. This result indicates that Japanese people are more attuned than Dutch people to vocal processing in the multisensory perception of emotion. Our findings provide the first evidence that multisensory integration of affective information is modulated by perceivers' cultural background.

  20. Prevalence and risk factors for voice problems among telemarketers.

    PubMed

    Jones, Katherine; Sigmon, Jason; Hock, Lynette; Nelson, Eric; Sullivan, Marsha; Ogren, Frederic

    2002-05-01

    To investigate whether there is an increased prevalence of voice problems among telemarketers compared with the general population, and whether these voice problems affect productivity and are associated with the presence of known risk factors for voice problems. Cross-sectional survey study. One outbound telemarketing firm, 3 reservations firms, 1 messaging firm, 1 survey research firm, and 1 community college. Random and cluster sampling identified 373 employees of the 6 firms; 304 employees completed the survey. A convenience sample of 187 community college students similar in age, sex, education level, and smoking prevalence served as a control group. Demographic, vocational, personality, and biological risk factors for voice problems; symptoms of vocal attrition; and effects of symptoms on work. Telemarketers were twice as likely to report 1 or more symptoms of vocal attrition compared with controls after adjusting for age, sex, and smoking status (P<.001). Of those surveyed, 31% reported that their work was affected by an average of 5.0 symptoms. These respondents tended to be women (P<.001) and were more likely to smoke (P=.02); take drying medications (P<.001); have sinus problems (P=.04), frequent colds (P<.001), and dry mouth (P<.001); and be sedentary (P<.001). Telemarketers have a higher prevalence of voice problems than the control group. These problems affect productivity and are associated with modifiable risk factors. Evaluation of occupational voice disorders must encompass all of the determinants of health status, and treatment must focus on modifiable risk factors, not just the reduction of occupational vocal load.

  1. A new measure of child vocal reciprocity in children with autism spectrum disorder.

    PubMed

    Harbison, Amy L; Woynaroski, Tiffany G; Tapp, Jon; Wade, Joshua W; Warlaumont, Anne S; Yoder, Paul J

    2018-06-01

    Children's vocal development occurs in the context of reciprocal exchanges with a communication partner who models "speechlike" productions. We propose a new measure of child vocal reciprocity, which we define as the degree to which an adult vocal response increases the probability of an immediately following child vocal response. Vocal reciprocity is likely to be associated with the speechlikeness of vocal communication in young children with autism spectrum disorder (ASD). Two studies were conducted to test the utility of the new measure. The first used simulated vocal samples with randomly sequenced child and adult vocalizations to test the accuracy of the proposed index of child vocal reciprocity. The second was an empirical study of 21 children with ASD who were preverbal or in the early stages of language development. Daylong vocal samples collected in the natural environment were computer analyzed to derive the proposed index of child vocal reciprocity, which was highly stable when derived from two daylong vocal samples and was associated with speechlikeness of vocal communication. This association was significant even when controlling for chance probability of child vocalizations to adult vocal responses, probability of adult vocalizations, or probability of child vocalizations. A valid measure of children's vocal reciprocity might eventually improve our ability to predict which children are on track to develop useful speech and/or are most likely to respond to language intervention. A link to a free, publicly-available software program to derive the new measure of child vocal reciprocity is provided. Autism Res 2018, 11: 903-915. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. Children and adults often engage in back-and-forth vocal exchanges. The extent to which they do so is believed to support children's early speech and language development. Two studies tested a new measure of child vocal reciprocity using computer-generated and real-life vocal samples of young children with autism collected in natural settings. The results provide initial evidence of accuracy, test-retest reliability, and validity of the new measure of child vocal reciprocity. A sound measure of children's vocal reciprocity might improve our ability to predict which children are on track to develop useful speech and/or are most likely to respond to language intervention. A free, publicly-available software program and manuals are provided. © 2018 International Society for Autism Research, Wiley Periodicals, Inc.
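
    As defined here, the index asks whether a child vocalization is more likely immediately after an adult vocal response than would be expected by chance. A heavily hedged sketch of that comparison on a time-ordered event sequence follows; the authors' published index involves additional controls, so this is the conceptual skeleton only, and the event coding is an assumption.

      def reciprocity_sketch(events):
          """events: time-ordered labels, 'A' = adult vocalization, 'C' = child
          vocalization. Returns P(child follows adult) - P(child follows any
          event), i.e., the lift an adult response gives to a child follow-up."""
          pairs = list(zip(events, events[1:]))
          after_adult = [b == 'C' for a, b in pairs if a == 'A']
          after_any = [b == 'C' for _, b in pairs]
          if not after_adult or not after_any:
              return 0.0
          return sum(after_adult) / len(after_adult) - sum(after_any) / len(after_any)

      print(reciprocity_sketch(list("ACACACCA")))   # positive -> adult responses help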

  2. Oral and vocal fold diadochokinesis in dysphonic women

    PubMed Central

    LOUZADA, Talita; BERALDINELLE, Roberta; BERRETIN-FELIX, Giédre; BRASOLOTTO, Alcione Ghedini

    2011-01-01

    The evaluation of oral and vocal fold diadochokinesis (DDK) in individuals with voice disorders may contribute to the understanding of factors that affect balanced vocal production. Scientific studies that make use of this assessment tool support the advancement of knowledge in this area, reflecting the development of more appropriate therapeutic planning. Objective To compare the results of oral and vocal fold DDK in dysphonic women and in women without vocal disorders. Material and methods For this study, 28 voice recordings of women from 19 to 54 years old, diagnosed with dysphonia and submitted to voice assessment by a speech pathologist and an otorhinolaryngologist, were used. The control group included 30 nondysphonic women evaluated in prior research on normal adults. Analysis parameters such as the number and duration of emissions, as well as the regularity of the repetition of the syllables "pa", "ta", and "ka" and the vowels "a" and "i", were provided by the Advanced Motor Speech Profile program (MSP) Model-5141, version-2.5.2 (KayPentax). The DDK sequence "pataka" was analyzed quantitatively through the Sound Forge 7.0 program, as well as manually with the audio-visual help of sound waves. Average values of oral and vocal fold DDK in dysphonic and nondysphonic women were compared using Student's t test and were considered significant when p<0.05. Results The findings showed no significant differences between the populations; however, the coefficient of variation of period (CvP) and jitter of period (JittP) averages for the "ka," "a" and "i" emissions, respectively, were higher in dysphonic women (CvP=10.42%, 12.79%, 12.05%; JittP=2.05%, 6.05%, 3.63%) compared to the control group (CvP=8.86%, 10.95%, 11.20%; JittP=1.82%, 2.98%, 3.15%). Conclusion Although the results do not indicate any difficulties in oral and laryngeal motor control in the dysphonic group, the greater instability in vocal fold DDK in the experimental group should be considered, and studies of this ability in individuals with communication disorders must be intensified. PMID:22230989
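
    CvP and JittP are both normalized period-variability measures: CvP is the standard deviation of successive repetition periods relative to their mean, and JittP is the mean absolute difference between consecutive periods relative to the mean. A minimal sketch of the two quantities, assuming they follow the usual definitions (the MSP program's internals are not documented in the abstract):

      import numpy as np

      def cvp_jittp(periods_ms):
          """CvP and JittP (both %) from a sequence of repetition periods (ms)."""
          p = np.asarray(periods_ms, dtype=float)
          cvp = 100.0 * p.std(ddof=1) / p.mean()                  # variability of period
          jittp = 100.0 * np.mean(np.abs(np.diff(p))) / p.mean()  # cycle-to-cycle jitter
          return cvp, jittp

      print(cvp_jittp([165.0, 172.0, 160.0, 169.0, 171.0]))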

  3. Role of adolescent and maternal depressive symptoms on transactional emotion recognition: context and state affect matter.

    PubMed

    Luebbe, Aaron M; Fussner, Lauren M; Kiel, Elizabeth J; Early, Martha C; Bell, Debora J

    2013-12-01

    Depressive symptomatology is associated with impaired recognition of emotion. Previous investigations have predominantly focused on emotion recognition of static facial expressions, neglecting the influence of social interaction and critical contextual factors. In the current study, we investigated how youth and maternal symptoms of depression may be associated with emotion recognition biases during familial interactions across distinct contextual settings. Further, we explored whether an individual's current emotional state may account for youth and maternal emotion recognition biases. Mother-adolescent dyads (N = 128) completed measures of depressive symptomatology and participated in three family interactions, each designed to elicit distinct emotions. Mothers and youth completed state affect ratings pertaining to self and other at the conclusion of each interaction task. Using multiple regression, depressive symptoms in both mothers and adolescents were associated with biased recognition of both positive affect (i.e., happy, excited) and negative affect (i.e., sadness, anger, frustration); however, this bias emerged primarily in contexts with a weaker emotional signal. Using actor-partner interdependence models, results suggested that youth's own state affect accounted for depression-related biases in their recognition of maternal affect. State affect did not function similarly in explaining depression-related biases in maternal recognition of adolescent emotion. Together these findings suggest a similar negative bias in emotion recognition associated with depressive symptoms in both adolescents and mothers in real-life situations, albeit potentially driven by different mechanisms.

  4. Selective attention modulates early human evoked potentials during emotional face-voice processing.

    PubMed

    Ho, Hao Tam; Schröger, Erich; Kotz, Sonja A

    2015-04-01

    Recent findings on multisensory integration suggest that selective attention influences cross-sensory interactions from an early processing stage. Yet, in the field of emotional face-voice integration, the hypothesis prevails that facial and vocal emotional information interacts preattentively. Using ERPs, we investigated the influence of selective attention on the perception of congruent versus incongruent combinations of neutral and angry facial and vocal expressions. Attention was manipulated via four tasks that directed participants to (i) the facial expression, (ii) the vocal expression, (iii) the emotional congruence between the face and the voice, and (iv) the synchrony between lip movement and speech onset. Our results revealed early interactions between facial and vocal emotional expressions, manifested as modulations of the auditory N1 and P2 amplitude by incongruent emotional face-voice combinations. Although audiovisual emotional interactions within the N1 time window were affected by the attentional manipulations, interactions within the P2 modulation showed no such attentional influence. Thus, we propose that the N1 and P2 are functionally dissociated in terms of emotional face-voice processing and discuss evidence in support of the notion that the N1 is associated with cross-sensory prediction, whereas the P2 relates to the derivation of an emotional percept. Essentially, our findings put the integration of facial and vocal emotional expressions into a new perspective: one that regards the integration process as a composite of multiple, possibly independent subprocesses, some of which are susceptible to attentional modulation, whereas others may be influenced by additional factors.

  5. Vocal Control: Is It Susceptible to the Negative Effects of Self-Regulatory Depletion?

    PubMed

    Vinney, Lisa A; van Mersbergen, Miriam; Connor, Nadine P; Turkstra, Lyn S

    2016-09-01

    Self-regulation (SR) relies on the capacity to modify behavior. This capacity may diminish with use and result in self-regulatory depletion (SRD), or the reduced ability to engage in future SR efforts. If the SRD effect applies to vocal behavior, it may hinder success during behavioral voice treatment. Thus, this proof-of-concept study sought to determine whether SRD affects vocal behavior change and, if so, whether it can be repaired by an intervention meant to replenish SR resources. One hundred four women without voice disorders were randomized into groups that performed either (1) a high-SR writing task followed by a high-SR voice task; (2) a low-SR writing task followed by a high-SR voice task; or (3) a high-SR writing task followed by a relaxation intervention and a high-SR voice task. The high-SR voice tasks in all groups involved suppression of the Lombard effect during reading and free speech. The low-SR group suppressed the Lombard effect to a greater extent than the high-SR group and the high-SR-plus-relaxation group on the free speech task. There were no significant group differences on the reading task. Findings suggest that SRD may present challenges to vocal behavior modification during free speech but not reading. Furthermore, relaxation did not significantly replenish self-regulatory resources for vocal modification during free speech. Findings may highlight potential considerations for voice treatment and assessment and support the need for future research focusing on effective methods to test self-regulatory capacity and replenish self-regulatory resources in voice patients. Published by Elsevier Inc.

  6. Assessment of breathing patterns and respiratory muscle recruitment during singing and speech in quadriplegia.

    PubMed

    Tamplin, Jeanette; Brazzale, Danny J; Pretto, Jeffrey J; Ruehland, Warren R; Buttifant, Mary; Brown, Douglas J; Berlowitz, David J

    2011-02-01

    To explore how respiratory impairment after cervical spinal cord injury affects vocal function, and to explore muscle recruitment strategies used during vocal tasks after quadriplegia. It was hypothesized that to achieve the increased respiratory support required for singing and loud speech, people with quadriplegia use different patterns of muscle recruitment and control strategies compared with control subjects without spinal cord injury. Matched, parallel-group design. Large university-affiliated public hospital. Consenting participants with motor-complete C5-7 quadriplegia (n=6) and able-bodied age-matched controls (n=6) were assessed on physiologic and voice measures during vocal tasks. Not applicable. Standard respiratory function testing, surface electromyographic activity from accessory respiratory muscles, sound pressure levels during vocal tasks, the Voice Handicap Index, and the Perceptual Voice Profile. The group with quadriplegia had a reduced lung capacity (vital capacity, 71% vs 102% of predicted; P=.028), more perceived voice problems (Voice Handicap Index score, 22.5 vs 6.5; P=.046), and greater recruitment of accessory respiratory muscles during both loud and soft volumes (P=.028) than the able-bodied controls. The group with quadriplegia also demonstrated higher accessory muscle activation in changing from soft to loud speech (P=.028). People with quadriplegia have impaired vocal ability and use different muscle recruitment strategies during speech than the able-bodied. These findings will enable us to target specific measurements of respiratory physiology for assessing functional improvements in response to formal therapeutic singing training. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  7. The roles of vocal and visual interactions in social learning zebra finches: A video playback experiment.

    PubMed

    Guillette, Lauren M; Healy, Susan D

    2017-06-01

    The transmission of information from an experienced demonstrator to a naïve observer often depends on characteristics of the demonstrator, such as familiarity, success or dominance status. Whether or not the demonstrator pays attention to and/or interacts with the observer may also affect social information acquisition or use by the observer. Here we used a video-demonstrator paradigm first to test whether video demonstrators have the same effect as using live demonstrators in zebra finches, and second, to test the importance of visual and vocal interactions between the demonstrator and observer on social information use by the observer. We found that female zebra finches copied novel food choices of male demonstrators they saw via live-streaming video while they did not consistently copy from the demonstrators when they were seen in playbacks of the same videos. Although naive observers copied in the absence of vocalizations by the demonstrator, as they copied from playback of videos with the sound off, females did not copy where there was a mis-match between the visual information provided by the video and vocal information from a live male that was out of sight. Taken together these results suggest that video demonstration is a useful methodology for testing social information transfer, at least in a foraging context, but more importantly, that social information use varies according to the vocal interactions, or lack thereof, between the observer and the demonstrator. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  8. Can vocal conditioning trigger a semiotic ratchet in marmosets?

    PubMed

    Turesson, Hjalmar K; Ribeiro, Sidarta

    2015-01-01

    The complexity of human communication has often been taken as evidence that our language reflects a true evolutionary leap, bearing little resemblance to any other animal communication system. The putative uniqueness of the human language poses serious evolutionary and ethological challenges to a rational explanation of human communication. Here we review ethological, anatomical, molecular, and computational results across several species to set boundaries for these challenges. Results from animal behavior, cognitive psychology, neurobiology, and semiotics indicate that human language shares multiple features with other primate communication systems, such as specialized brain circuits for sensorimotor processing, the capability for indexical (pointing) and symbolic (referential) signaling, the importance of shared intentionality for associative learning, affective conditioning and parental scaffolding of vocal production. The most substantial differences lie in the higher human capacity for symbolic compositionality, fast vertical transmission of new symbols across generations, and irreversible accumulation of novel adaptive behaviors (cultural ratchet). We hypothesize that increasingly complex vocal conditioning of an appropriate animal model may be sufficient to trigger a semiotic ratchet, evidenced by progressive sign complexification, as spontaneous contact calls become indexes, then symbols and finally arguments (strings of symbols). To test this hypothesis, we outline a series of conditioning experiments in the common marmoset (Callithrix jacchus). The experiments are designed to probe the limits of vocal communication in a prosocial, highly vocal primate some 35 million years removed from the human lineage, so as to shed light on the mechanisms of semiotic complexification and cultural transmission, and to serve as a naturalistic behavioral setting for the investigation of language disorders.

  9. Can vocal conditioning trigger a semiotic ratchet in marmosets?

    PubMed Central

    Turesson, Hjalmar K.; Ribeiro, Sidarta

    2015-01-01

    The complexity of human communication has often been taken as evidence that our language reflects a true evolutionary leap, bearing little resemblance to any other animal communication system. The putative uniqueness of the human language poses serious evolutionary and ethological challenges to a rational explanation of human communication. Here we review ethological, anatomical, molecular, and computational results across several species to set boundaries for these challenges. Results from animal behavior, cognitive psychology, neurobiology, and semiotics indicate that human language shares multiple features with other primate communication systems, such as specialized brain circuits for sensorimotor processing, the capability for indexical (pointing) and symbolic (referential) signaling, the importance of shared intentionality for associative learning, affective conditioning and parental scaffolding of vocal production. The most substantial differences lie in the higher human capacity for symbolic compositionality, fast vertical transmission of new symbols across generations, and irreversible accumulation of novel adaptive behaviors (cultural ratchet). We hypothesize that increasingly complex vocal conditioning of an appropriate animal model may be sufficient to trigger a semiotic ratchet, evidenced by progressive sign complexification, as spontaneous contact calls become indexes, then symbols and finally arguments (strings of symbols). To test this hypothesis, we outline a series of conditioning experiments in the common marmoset (Callithrix jacchus). The experiments are designed to probe the limits of vocal communication in a prosocial, highly vocal primate some 35 million years removed from the human lineage, so as to shed light on the mechanisms of semiotic complexification and cultural transmission, and to serve as a naturalistic behavioral setting for the investigation of language disorders. PMID:26500583

  10. Plasma concentrations of substance P and cortisol in beef calves after castration or simulated castration.

    PubMed

    Coetzee, Johann F; Lubbers, Brian V; Toerber, Scott E; Gehring, Ronette; Thomson, Daniel U; White, Bradley J; Apley, Michael D

    2008-06-01

    To evaluate plasma concentrations of substance P (SP) and cortisol in calves after castration or simulated castration. 10 Angus-crossbred calves. Calves were acclimated for 5 days, assigned to a block on the basis of scrotal circumference, and randomly assigned to a castrated or simulated-castrated (control) group. Blood samples were collected twice before, at the time of (0 hours), and at several time points after castration or simulated castration. Vocalization and attitude scores were determined at the time of castration or simulated castration. Plasma concentrations of SP and cortisol were determined by use of competitive and chemiluminescent enzyme immunoassays, respectively. Data were analyzed by use of repeated-measures analysis with a mixed model. Mean ± SEM cortisol concentration in castrated calves (78.88 ± 10.07 nmol/L) was similar to that in uncastrated control calves (73.01 ± 10.07 nmol/L). However, mean SP concentration in castrated calves (506.43 ± 38.11 pg/mL) was significantly higher than the concentration in control calves (386.42 ± 40.09 pg/mL). Mean cortisol concentration in calves with vocalization scores of 0 was not significantly different from the concentration in calves with vocalization scores of 3. However, calves with vocalization scores of 3 had significantly higher SP concentrations, compared with SP concentrations for calves with vocalization scores of 0. Similar cortisol concentrations were measured in castrated and control calves. A significant increase in plasma concentrations of SP after castration suggested a likely association with nociception. These results may affect assessment of animal well-being in livestock production systems.

  11. Multidimensional assessment of vocal changes in benign vocal fold lesions after voice therapy.

    PubMed

    Schindler, Antonio; Mozzanica, Francesco; Maruzzi, Patrizia; Atac, Murat; De Cristofaro, Valeria; Ottaviani, Francesco

    2013-06-01

    To evaluate, through a multidimensional protocol, voice changes after voice therapy in patients with benign vocal fold lesions. 65 consecutive patients affected by benign vocal fold lesions were enrolled. On the basis of videolaryngostroboscopy, the patients were divided into 3 groups: 23 patients with Reinke's oedema, 22 patients with vocal fold cysts, and 20 patients with gelatinous polyps. Each subject received 10 voice therapy sessions and was evaluated, before and after voice therapy, through a multidimensional protocol including videolaryngostroboscopy, perceptual ratings, acoustics, aerodynamics, and patient self-rating. Data were compared using the Wilcoxon signed-rank test. The Kruskal-Wallis test was used to analyse the mean variation difference between the three groups of patients, and the Mann-Whitney test was used for post hoc analysis. In only 11 cases did videolaryngostroboscopy reveal an improvement of the initial pathology. However, a significant improvement was observed in perceptual, acoustic, and self-assessment ratings in the 3 groups of patients. In particular, the G, R, and A parameters of the GIRBAS scale, the noise-to-harmonic ratio, and the jitter and shimmer scores improved after rehabilitation. A significant improvement of all the parameters of the Voice Handicap Index after rehabilitation treatment was found. No significant difference among the three groups of patients was visible, except for self-assessment ratings. Voice therapy may provide a significant improvement in perceptual, acoustic, and self-assessed voice quality in patients with benign glottal lesions. Utilization of voice therapy may allow some patients to avoid surgical intervention. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  12. The effects of preventive vocal hygiene education on the vocal hygiene habits and perceptual vocal characteristics of training singers.

    PubMed

    Broaddus-Lawrence, P L; Treole, K; McCabe, R B; Allen, R L; Toppin, L

    2000-03-01

    The purpose of the present study was to determine the effects of vocal hygiene education on the vocal hygiene behaviors and perceptual vocal characteristics of untrained singers. Eleven adult untrained singers served as subjects. They attended four 1-hour class sessions on vocal hygiene, including anatomy and physiology of the phonatory mechanism, vocally abusive behaviors, voice disorders commonly seen in singers, and measures to prevent voice disorders. Pre- and postinstruction surveys were used to record subjects' vocal abuses and their perceptions of their speaking and singing voice. They also rated their perceived value of vocal hygiene education. Results revealed minimal changes in vocal hygiene behaviors and perceptual voice characteristics. The subjects did report a high degree of benefit and learning, however.

  13. Acoustic Analysis and Electroglottography in Elite Vocal Performers.

    PubMed

    Villafuerte-Gonzalez, Rocio; Valadez-Jimenez, Victor M; Sierra-Ramirez, Jose A; Ysunza, Pablo Antonio; Chavarria-Villafuerte, Karen; Hernandez-Lopez, Xochiquetzal

    2017-05-01

    Acoustic analysis of voice (AAV) and electroglottography (EGG) have been used for assessing vocal quality in patients with voice disorders. The effectiveness of these procedures for detecting mild disturbances in vocal quality in elite vocal performers has been controversial. To compare acoustic parameters obtained by AAV and EGG before and after vocal training to determine the effectiveness of these procedures for detecting vocal improvements in elite vocal performers. Thirty-three elite vocal performers were studied. The study group included 14 males and 19 females, ages 18-40 years, without a history of voice disorders. Acoustic parameters were obtained through AAV and EGG before and after vocal training using the Linklater method. Nonsignificant differences (P > 0.05) were found between values of fundamental frequency (F0), shimmer, and jitter obtained by both procedures before vocal training. Mean F0 was similar after vocal training. Jitter percentage as measured by AAV showed nonsignificant differences (P > 0.05) before and after vocal training. Shimmer percentage as measured by AAV demonstrated a significant reduction (P < 0.05) after vocal training. As measured by EGG after vocal training, shimmer and jitter were significantly reduced (P < 0.05); open quotient was significantly increased (P < 0.05); and irregularity was significantly reduced (P < 0.05). Both AAV and EGG were effective for detecting improvements in vocal function in male and female elite vocal performers undergoing vocal training. EGG demonstrated better efficacy for detecting improvements and provided additional parameters as compared to AAV. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  14. The First Call Note Plays a Crucial Role in Frog Vocal Communication.

    PubMed

    Yue, Xizi; Fan, Yanzhu; Xue, Fei; Brauth, Steven E; Tang, Yezhong; Fang, Guangzhan

    2017-08-31

    Vocal communication plays a crucial role in survival and reproductive success in most amphibian species. Although amphibian communication sounds are often complex, consisting of many temporal features, we know little about the biological significance of each temporal component. The present study examined the biological significance of the notes of the male advertisement call of the Emei music frog (Babina daunchina) using the optimized electroencephalogram (EEG) paradigm of mismatch negativity (MMN). Music frog calls generally contain four to six notes separated by intervals of approximately 150 milliseconds. A standard stimulus (white noise) and five deviant stimuli (the five notes from one advertisement call) were played back to each subject while multi-channel EEG signals were recorded simultaneously. The results showed that the MMN amplitude for the first call note was significantly larger than that for the others. Moreover, the MMN amplitudes evoked from the left forebrain and midbrain were typically larger than those from their right counterparts. These results are consistent with the ideas that the first call note conveys more information than the others for auditory recognition and that there is left-hemisphere dominance for processing information derived from conspecific calls in frogs.
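
    For readers unfamiliar with the paradigm, the sketch below shows how an MMN amplitude is conventionally derived from such a design: average the EEG epochs per stimulus, subtract the standard's evoked response from each deviant's, and take the negative peak of the difference wave within an analysis window. This is a generic Python/NumPy illustration, not the authors' analysis code; the array shapes and window bounds are assumptions.

```python
import numpy as np

def erp(epochs):
    """Average evoked response; epochs: (n_trials, n_channels, n_samples)."""
    return epochs.mean(axis=0)

def mmn_amplitude(standard_epochs, deviant_epochs, window):
    """Negative peak of the deviant-minus-standard difference wave.

    window: (start_sample, stop_sample) covering the expected MMN latency.
    Returns one amplitude per channel, so electrodes over the left vs.
    right forebrain/midbrain can be compared as in the study above.
    """
    diff = erp(deviant_epochs) - erp(standard_epochs)   # (n_channels, n_samples)
    start, stop = window
    return diff[:, start:stop].min(axis=1)              # MMN is a negativity
```

    Comparing this amplitude across the five note deviants, and across left- versus right-hemisphere channels, corresponds to the two contrasts reported in the record.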

  15. Correlation of vocals and lyrics with left temporal musicogenic epilepsy.

    PubMed

    Tseng, Wei-En J; Lim, Siew-Na; Chen, Lu-An; Jou, Shuo-Bin; Hsieh, Hsiang-Yao; Cheng, Mei-Yun; Chang, Chun-Wei; Li, Han-Tao; Chiang, Hsing-I; Wu, Tony

    2018-03-15

    Whether the cognitive processing of music and speech relies on shared or distinct neuronal mechanisms remains unclear. Music and language processing in the brain are right and left temporal functions, respectively. We studied patients with musicogenic epilepsy (ME) that was specifically triggered by popular songs to analyze brain hyperexcitability triggered by specific stimuli. The study included two men and one woman (all right-handed, aged 35-55 years). The patients had sound-triggered left temporal ME in response to popular songs with vocals, but not to instrumental, classical, or nonvocal piano solo versions of the same song. Sentimental lyrics, high-pitched singing, specificity/familiarity, and singing in the native language were the most significant triggering factors. We found that recognition of the human voice and analysis of lyrics are important causal factors in left temporal ME and provide observational evidence that sounds with speech structure are predominantly processed in the left temporal lobe. A literature review indicated that language-associated stimuli triggered ME in the left temporal epileptogenic zone at a nearly twofold higher rate compared with the right temporal region. Further research on ME may enhance understanding of the cognitive neuroscience of music. © 2018 New York Academy of Sciences.

  16. Yellow-bellied marmot and golden-mantled ground squirrel responses to heterospecific alarm calls

    PubMed

    Shriner

    1998-03-01

    When two species have predators in common, animals might be able to obtain important information about predation risk from the alarm calls produced by the other species. The behavioural responses of adult yellow-bellied marmots, Marmota flaviventris, and golden-mantled ground squirrels, Spermophilus lateralis, to conspecific and heterospecific alarm calls were studied to determine whether interspecific call recognition occurs in sympatric species that rarely interact. In a crossed design, marmot and squirrel alarm calls were broadcast to individuals of both species, using the song of a sympatric bird as a control. Individuals of both species responded similarly to conspecific and heterospecific anti-predator calls, and distinguished both types of alarms from the bird song. These results indicate that both marmots and squirrels recognized not only their own species' anti-predator vocalizations, but also the alarm calls of another species, and that these vocalizations were discriminated from an equally loud non-threatening sound. These findings suggest that researchers ought to think broadly when considering the sources of information available to animals in their natural environment. Copyright 1998 The Association for the Study of Animal Behaviour.

  17. Differential Expression of Glutamate Receptors in Avian Neural Pathways for Learned Vocalization

    PubMed Central

    WADA, KAZUHIRO; SAKAGUCHI, HIRONOBU; JARVIS, ERICH D.; HAGIWARA, MASATOSHI

    2008-01-01

    Learned vocalization, the substrate for human language, is a rare trait. It is found in three distantly related groups of birds: parrots, hummingbirds, and songbirds. These three groups contain cerebral vocal nuclei for learned vocalization not found in their more closely related vocal nonlearning relatives. Here, we cloned 21 receptor subunits/subtypes of all four glutamate receptor families (AMPA, kainate, NMDA, and metabotropic) and examined their expression in vocal nuclei of songbirds. We also examined expression of a subset of these receptors in vocal nuclei of hummingbirds and parrots, as well as in the brains of dove species as examples of close vocal nonlearning relatives. Among the 21 subunits/subtypes, 19 showed prominent differential expression (higher and/or lower) in songbird vocal nuclei relative to the surrounding brain subdivisions in which the vocal nuclei are located. This included relatively lower levels of all four AMPA subunits in lMAN; strikingly higher levels of the kainate subunit GluR5 in the robust nucleus of the arcopallium (RA); higher and lower levels, respectively, of the NMDA subunits NR2A and NR2B in most vocal nuclei; lower levels of the metabotropic group I subtypes (mGluR1 and -5) in most vocal nuclei; and a unique expression pattern of the group II subtype (mGluR2), with very low levels in RA and very high levels in HVC. The splice variants of AMPA subunits showed further differential expression in vocal nuclei. Some of the receptor subunits/subtypes also showed differential expression in hummingbird and parrot vocal nuclei. The magnitude of differential expression in vocal nuclei of all three vocal learners was unique compared with the smaller magnitude of differences found for nonvocal areas of vocal learners and vocal nonlearners. Our results suggest that the evolution of vocal learning was accompanied by differential expression of a conserved gene family for synaptic transmission and plasticity in vocal nuclei. They also suggest that neural activity and signal transduction in vocal nuclei of vocal learners will differ relative to the surrounding brain areas. PMID:15236466

  18. Multilingual vocal emotion recognition and classification using back propagation neural network

    NASA Astrophysics Data System (ADS)

    Kayal, Apoorva J.; Nirmal, Jagannath

    2016-03-01

    This work implements classification of different emotions in different languages using Artificial Neural Networks (ANNs). Mel Frequency Cepstral Coefficients (MFCC) and Short Term Energy (STE) were used to create the feature set. An emotional speech corpus consisting of 30 acted utterances per emotion was developed. The emotions portrayed in this work are anger, joy, and neutral, in each of three languages: English, Marathi, and Hindi. Different configurations of Artificial Neural Networks were employed for classification. The performance of the classifiers was evaluated by False Negative Rate (FNR), False Positive Rate (FPR), True Positive Rate (TPR), and True Negative Rate (TNR).
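
    To make the pipeline concrete, here is a minimal sketch of the feature extraction and classification stages, assuming Python with librosa and scikit-learn. The frame sizes, network width, and per-class metric computation are illustrative assumptions, not values taken from the record.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

def extract_features(path, n_mfcc=13):
    """Mean MFCCs plus mean short-term energy (STE) for one utterance."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # (n_mfcc, n_frames)
    frames = librosa.util.frame(y, frame_length=512, hop_length=256)
    ste = float(np.mean(np.sum(frames ** 2, axis=0)))         # mean energy per frame
    return np.concatenate([mfcc.mean(axis=1), [ste]])

def train_and_evaluate(wav_paths, labels):
    """Fit a small feed-forward ANN and report per-class TPR/TNR/FPR/FNR.

    Evaluated on the training data for brevity; a held-out split should
    be used in practice.
    """
    X = np.array([extract_features(p) for p in wav_paths])
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, labels)
    cm = confusion_matrix(labels, clf.predict(X))
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    return {"TPR": tp / (tp + fn), "TNR": tn / (tn + fp),
            "FPR": fp / (fp + tn), "FNR": fn / (fn + tp)}

# Hypothetical usage: train_and_evaluate(paths, ['anger', 'joy', 'neutral', ...])
```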

  19. Echolocation versus echo suppression in humans

    PubMed Central

    Wallmeier, Ludwig; Geßele, Nikodemus; Wiegrebe, Lutz

    2013-01-01

    Several studies have shown that blind humans can gather spatial information through echolocation. However, when localizing sound sources, the precedence effect suppresses spatial information of echoes, and thereby conflicts with effective echolocation. This study investigates the interaction of echolocation and echo suppression in terms of discrimination suppression in virtual acoustic space. In the ‘Listening’ experiment, sighted subjects discriminated between positions of a single sound source, or of the leading or the lagging of two sources, respectively. In the ‘Echolocation’ experiment, the sources were replaced by reflectors. Here, the same subjects evaluated echoes generated in real time from self-produced vocalizations and thereby discriminated between positions of a single reflector, or of the leading or the lagging of two reflectors. Two key results were observed. First, sighted subjects can learn to discriminate positions of reflective surfaces echo-acoustically with accuracy comparable to sound source discrimination. Second, in the Listening experiment, the presence of the leading source affected discrimination of lagging sources much more than vice versa. In the Echolocation experiment, however, the presence of both the lead and the lag strongly affected discrimination. These data show that the classically described asymmetry in the perception of leading and lagging sounds is strongly diminished in an echolocation task. Additional control experiments showed that the effect is owing both to the direct sound of the vocalization, which precedes the echoes, and to the fact that the subjects actively vocalize in the echolocation task. PMID:23986105

  20. Electrophysiological neural monitoring of the laryngeal nerves in thyroid surgery: review of the current literature

    PubMed Central

    Deniwar, Ahmed; Randolph, Gregory

    2015-01-01

    Recurrent laryngeal nerve (RLN) injury is one of the most common complications of thyroid surgery. RLN injury can cause vocal cord paralysis, affecting the patient's voice and quality of life. Injury of the external branch of the superior laryngeal nerve (EBSLN) can cause cricothyroid muscle denervation, affecting high vocal tones. Thus, securing the laryngeal nerves in these surgeries is of utmost importance. Visual identification of the nerves has long been the standard method for this precaution. Intraoperative neuromonitoring (IONM) has been introduced as a novel technology to improve the protection of the laryngeal nerves and reduce the rate of RLN injury. The aim of this article is to provide a brief description of the technique and review the literature to illustrate the value of IONM. IONM can provide early identification of anatomical variations and unusual nerve routes, which carry a higher risk of injury if not detected. IONM helps in prognosticating postoperative nerve function. Moreover, by detecting nerve injury intraoperatively, it aids in staging bilateral surgeries to avoid bilateral vocal cord paralysis and tracheostomy. The article also discusses the value of continuous IONM (C-IONM), which may prevent nerve injury by detecting EMG waveform changes that indicate impending nerve injury. Herein, we also discuss the anatomy of the laryngeal nerves and aspects of their injury. PMID:26425449

  1. Management of unilateral true vocal cord paralysis in children.

    PubMed

    Setlur, Jennifer; Hartnick, Christopher J

    2012-12-01

    Historically, the treatment of unilateral true vocal cord paralysis (UVCP) in children was guided by information gained from treating the condition in adults. Today, there is a growing body of literature aimed specifically at the treatment of this condition in children. It is an area of growing interest, as UVCP can significantly impact a child's quality of life. Children with UVCP may present with stridor, dysphonia, aspiration, feeding difficulties, or a combination of these symptoms. Diagnosis relies on laryngoscopy, but other adjuncts such as ultrasound and laryngeal electromyography may also be helpful in making the diagnosis and forming a treatment plan. In many instances, there is effective compensation by the contralateral vocal fold, making surgical intervention unnecessary. Children who cannot compensate for a unilateral defect may suffer from significant dysphonia that can affect their quality of life because their ability to be understood may be diminished. In these patients, treatment in the form of medialization or reinnervation of the affected recurrent laryngeal nerve may be warranted. UVCP is a well-recognized problem in pediatric patients with disordered voice and feeding problems. Some patients will spontaneously recover their laryngeal function. For those who do not, a variety of reliable techniques are available for rehabilitative treatment. Improved diagnostics and a growing understanding of prognosis can help guide therapy decisions along with the goals and desires of the patient and his or her family.

  2. Fuzzy approach for improved recognition of citric acid induced piglet coughing from continuous registration

    NASA Astrophysics Data System (ADS)

    Van Hirtum, A.; Berckmans, D.

    2003-09-01

    A natural acoustic indicator of animal welfare is the appearance (or absence) of coughing in the animal habitat. A sound database of 5319 individual sounds, including 2034 coughs and containing both animal vocalizations and background noises, was collected from six healthy piglets. Each of the test animals was repeatedly placed in a laboratory installation where coughing was induced by nebulization of citric acid. A two-class classification into 'cough' or 'other' was performed by applying a distance function to a fast Fourier spectral sound analysis. This resulted in a positive cough recognition of 92%. For the whole sound database, however, there was a misclassification of 21%. As spectral information up to 10000 Hz is available, an improved overall classification on the same database is obtained by applying the distance function to nine frequency ranges and combining the resulting distance values in fuzzy rules. For each frequency range, the clustering threshold is determined by fuzzy c-means clustering.
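
    The band-wise distance-plus-fuzzy-rule idea can be illustrated with a short sketch. This is a schematic reading of the method in Python/NumPy, not the authors' implementation; the band edges, the cough template, the triangular membership function, and its width are hypothetical stand-ins for the values the authors derived by fuzzy c-means clustering.

```python
import numpy as np

BAND_EDGES = np.linspace(0, 10_000, 10)      # nine contiguous ranges up to 10 kHz

def band_energies(signal, sr):
    """Normalised FFT power in each of the nine frequency ranges."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    e = np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                  for lo, hi in zip(BAND_EDGES[:-1], BAND_EDGES[1:])])
    return e / e.sum()

def cough_grade(signal, sr, template, width=0.05):
    """Fuzzy 'cough' membership in [0, 1].

    Per band: distance between the sound's energy profile and a cough
    template, mapped through a triangular membership function; the nine
    band memberships are then fused with min (fuzzy AND).
    """
    d = np.abs(band_energies(signal, sr) - template)
    mu = np.clip(1.0 - d / width, 0.0, 1.0)
    return float(mu.min())

# A sound would be labelled 'cough' when cough_grade(...) exceeds a threshold
# (e.g. 0.5); in the study, per-band thresholds came from fuzzy c-means.
```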

  3. Recognition and production of emotions in children with cochlear implants.

    PubMed

    Mildner, Vesna; Koska, Tena

    2014-01-01

    The aim of this study was to examine auditory recognition and vocal production of emotions in three prelingually bilaterally profoundly deaf children aged 6-7 who received cochlear implants before age 2, and compare them with age-matched normally hearing children. No consistent advantage was found for the normally hearing participants. In both groups, sadness was recognized best and disgust was the most difficult. Confusion matrices among other emotions (anger, happiness, and fear) showed that children with and without hearing impairment may rely on different cues. Both groups of children showed that perception is superior to production. Normally hearing children were more successful in the production of sadness, happiness, and fear, but not anger or disgust. The data set is too small to draw any definite conclusions, but it seems that a combination of early implantation and regular auditory-oral-based therapy enables children with cochlear implants to process and produce emotional content comparable with children with normal hearing.

  4. Neural Correlates of the Lombard Effect in Primate Auditory Cortex

    PubMed Central

    Eliades, Steven J.

    2012-01-01

    Speaking is a sensory-motor process that involves constant self-monitoring to ensure accurate vocal production. Self-monitoring of vocal feedback allows rapid adjustment to correct perceived differences between intended and produced vocalizations. One important behavior in vocal feedback control is a compensatory increase in vocal intensity in response to noise masking during vocal production, commonly referred to as the Lombard effect. This behavior requires mechanisms for continuously monitoring auditory feedback during speaking. However, the underlying neural mechanisms are poorly understood. Here we show that when marmoset monkeys vocalize in the presence of masking noise that disrupts vocal feedback, the compensatory increase in vocal intensity is accompanied by a shift in auditory cortex activity toward neural response patterns seen during vocalizations under normal feedback conditions. Furthermore, we show that neural activity in auditory cortex during a vocalization phrase predicts vocal intensity compensation in subsequent phrases. These observations demonstrate that the auditory cortex participates in self-monitoring during the Lombard effect, and may play a role in the compensation of noise masking during feedback-mediated vocal control. PMID:22855821

  5. Food for Song: Expression of C-Fos and ZENK in the Zebra Finch Song Nuclei during Food Aversion Learning

    PubMed Central

    Tokarev, Kirill; Tiunova, Anna

    2011-01-01

    Background: Specialized neural pathways, the song system, are required for acquiring, producing, and perceiving learned avian vocalizations. Birds that do not learn to produce their vocalizations lack telencephalic song system components. It is not known whether the song system forebrain regions are exclusively evolved for song or whether they also process information not related to song that might reflect their ‘evolutionary history’. Methodology/Principal Findings: To address this question we monitored the induction of two immediate-early genes (IEGs), c-Fos and ZENK, in various regions of the song system in zebra finches (Taeniopygia guttata) in response to an aversive food learning paradigm; this involves the association of a food item with a noxious stimulus that affects the oropharyngeal-esophageal cavity and tongue, causing subsequent avoidance of that food item. The motor response results in beak and head movements but not vocalizations. IEGs have been extensively used to map neuro-molecular correlates of song motor production and auditory processing. As previously reported, neurons in two pallial vocal motor regions, HVC and RA, expressed IEGs after singing. Surprisingly, c-Fos was induced equivalently after food aversion learning in the absence of singing. The density of c-Fos-positive neurons was significantly higher than that of birds in control conditions. This was not the case in two other pallial song nuclei important for vocal plasticity, LMAN and Area X, although singing did induce IEGs in these structures, as reported previously. Conclusions/Significance: Our results are consistent with the possibility that some of the song nuclei may participate in non-vocal learning and that the populations of neurons involved in the two tasks show partial overlap. These findings underscore the previously advanced notion that the specialized forebrain pre-motor nuclei controlling song evolved from circuits involved in behaviors related to feeding. PMID:21695176

  6. Chimpanzees (Pan troglodytes) Produce the Same Types of ‘Laugh Faces’ when They Emit Laughter and when They Are Silent

    PubMed Central

    Davila-Ross, Marina; Jesus, Goncalo; Osborne, Jade; Bard, Kim A.

    2015-01-01

    The ability to flexibly produce facial expressions and vocalizations has a strong impact on the way humans communicate, as it promotes more explicit and versatile forms of communication. Whereas facial expressions and vocalizations are unarguably closely linked in primates, the extent to which these expressions can be produced independently in nonhuman primates is unknown. The present work, thus, examined if chimpanzees produce the same types of facial expressions with and without accompanying vocalizations, as do humans. Forty-six chimpanzees (Pan troglodytes) were video-recorded during spontaneous play with conspecifics at the Chimfunshi Wildlife Orphanage. ChimpFACS was applied, a standardized coding system to measure chimpanzee facial movements, based on FACS developed for humans. Data showed that the chimpanzees produced the same 14 configurations of open-mouth faces when laugh sounds were present and when they were absent. Chimpanzees, thus, produce these facial expressions flexibly without being morphologically constrained by the accompanying vocalizations. Furthermore, the data indicated that the facial expression plus vocalization and the facial expression alone were used differently in social play, i.e., when in physical contact with the playmates and when matching the playmates’ open-mouth faces. These findings provide empirical evidence that chimpanzees produce distinctive facial expressions independently from a vocalization, and that their multimodal use affects communicative meaning, important traits for a more explicit and versatile way of communication. As it is still uncertain how human laugh faces evolved, the ChimpFACS data were also used to empirically examine the evolutionary relation between open-mouth faces with laugh sounds of chimpanzees and laugh faces of humans. The ChimpFACS results revealed that laugh faces of humans must have gradually emerged from laughing open-mouth faces of ancestral apes. This work examines the main evolutionary changes of laugh faces since the last common ancestor of chimpanzees and humans. PMID:26061420

  7. Chimpanzees (Pan troglodytes) Produce the Same Types of 'Laugh Faces' when They Emit Laughter and when They Are Silent.

    PubMed

    Davila-Ross, Marina; Jesus, Goncalo; Osborne, Jade; Bard, Kim A

    2015-01-01

    The ability to flexibly produce facial expressions and vocalizations has a strong impact on the way humans communicate, as it promotes more explicit and versatile forms of communication. Whereas facial expressions and vocalizations are unarguably closely linked in primates, the extent to which these expressions can be produced independently in nonhuman primates is unknown. The present work, thus, examined if chimpanzees produce the same types of facial expressions with and without accompanying vocalizations, as do humans. Forty-six chimpanzees (Pan troglodytes) were video-recorded during spontaneous play with conspecifics at the Chimfunshi Wildlife Orphanage. ChimpFACS was applied, a standardized coding system to measure chimpanzee facial movements, based on FACS developed for humans. Data showed that the chimpanzees produced the same 14 configurations of open-mouth faces when laugh sounds were present and when they were absent. Chimpanzees, thus, produce these facial expressions flexibly without being morphologically constrained by the accompanying vocalizations. Furthermore, the data indicated that the facial expression plus vocalization and the facial expression alone were used differently in social play, i.e., when in physical contact with the playmates and when matching the playmates' open-mouth faces. These findings provide empirical evidence that chimpanzees produce distinctive facial expressions independently from a vocalization, and that their multimodal use affects communicative meaning, important traits for a more explicit and versatile way of communication. As it is still uncertain how human laugh faces evolved, the ChimpFACS data were also used to empirically examine the evolutionary relation between open-mouth faces with laugh sounds of chimpanzees and laugh faces of humans. The ChimpFACS results revealed that laugh faces of humans must have gradually emerged from laughing open-mouth faces of ancestral apes. This work examines the main evolutionary changes of laugh faces since the last common ancestor of chimpanzees and humans.

  8. Reinforcement of Infant Vocalizations through Contingent Vocal Imitation

    ERIC Educational Resources Information Center

    Pelaez, Martha; Virues-Ortega, Javier; Gewirtz, Jacob L.

    2011-01-01

    Maternal vocal imitation of infant vocalizations is highly prevalent during face-to-face interactions of infants and their caregivers. Although maternal vocal imitation has been associated with later verbal development, its potentially reinforcing effect on infant vocalizations has not been explored experimentally. This study examined the…

  9. Perceptual fluency and affect without recognition.

    PubMed

    Anand, P; Sternthal, B

    1991-05-01

    A dichotic listening task was used to investigate the affect-without-recognition phenomenon. Subjects performed a distractor task by responding to the information presented in one ear while ignoring the target information presented in the other ear. The subjects' recognition of and affect toward the target information as well as toward foils was measured. The results offer evidence for the affect-without-recognition phenomenon. Furthermore, the data suggest that the subjects' affect toward the stimuli depended primarily on the extent to which the stimuli were perceived as familiar (i.e., subjective familiarity), and this perception was influenced by the ear in which the distractor or the target information was presented. These data are interpreted in terms of current models of recognition memory and hemispheric lateralization.

  10. The involvement of emotion recognition in affective theory of mind.

    PubMed

    Mier, Daniela; Lis, Stefanie; Neuthe, Kerstin; Sauer, Carina; Esslinger, Christine; Gallhofer, Bernd; Kirsch, Peter

    2010-11-01

    This study was conducted to explore the relationship between emotion recognition and affective Theory of Mind (ToM). Forty subjects performed a facial emotion recognition task and an emotional intention recognition task (affective ToM) in an event-related fMRI study. Conjunction analysis revealed overlapping activation during both tasks. Activation in some of these jointly activated regions was even stronger during affective ToM than during emotion recognition, namely in the inferior frontal gyrus, the superior temporal sulcus, the temporal pole, and the amygdala. In contrast to previous studies investigating ToM, we found no activation in the anterior cingulate, commonly assumed to be the key region for ToM. The results point to a close relationship between emotion recognition and affective ToM and can be interpreted as evidence for the assumption that at least basal forms of ToM occur by an embodied, non-cognitive process. Copyright © 2010 Society for Psychophysiological Research.

  11. Histopathologic study of human vocal fold mucosa unphonated over a decade.

    PubMed

    Sato, Kiminori; Umeno, Hirohito; Ono, Takeharu; Nakashima, Tadashi

    2011-12-01

    Mechanotransduction caused by vocal fold vibration could be an important factor in the maintenance of the extracellular matrices and layered structure of the adult human vocal fold mucosa as a vibrating tissue, once the layered structure has been completed. Vocal fold stellate cells (VFSCs) in the human maculae flavae of the vocal fold mucosa are inferred to be involved in the metabolism of extracellular matrices of the vocal fold mucosa. Maculae flavae are also considered to be an important structure in the growth and development of the human vocal fold mucosa. Tension caused by phonation (vocal fold vibration) is hypothesized to stimulate the VFSCs to accelerate production of extracellular matrices. The vocal fold mucosa of a 64-year-old male with cerebral hemorrhage, unphonated for 11 years and 2 months, was investigated histopathologically by light and electron microscopy. The vocal fold mucosae (including maculae flavae) were atrophic. The vocal fold mucosa did not have a vocal ligament, Reinke's space, or a layered structure. The lamina propria appeared as a uniform structure. Morphologically, the VFSCs synthesized fewer extracellular matrices, such as fibrous proteins and glycosaminoglycans. Consequently, VFSCs appeared to decrease their level of activity.

  12. Coos, booms, and hoots: The evolution of closed-mouth vocal behavior in birds.

    PubMed

    Riede, Tobias; Eliason, Chad M; Miller, Edward H; Goller, Franz; Clarke, Julia A

    2016-08-01

    Most birds vocalize with an open beak, but vocalization with a closed beak into an inflating cavity occurs in territorial or courtship displays in disparate species throughout birds. Closed-mouth vocalizations generate resonance conditions that favor low-frequency sounds. By contrast, open-mouth vocalizations cover a wider frequency range. Here we describe closed-mouth vocalizations of birds from functional and morphological perspectives and assess the distribution of closed-mouth vocalizations in birds and related outgroups. Ancestral-state optimizations of body size and vocal behavior indicate that closed-mouth vocalizations are unlikely to be ancestral in birds and have evolved independently at least 16 times within Aves, predominantly in large-bodied lineages. Closed-mouth vocalizations are rare in the small-bodied passerines. In light of these results and body size trends in nonavian dinosaurs, we suggest that the capacity for closed-mouth vocalization was present in at least some extinct nonavian dinosaurs. As in birds, this behavior may have been limited to sexually selected vocal displays, and hence would have co-occurred with open-mouthed vocalizations. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  13. Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

    PubMed

    Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

    2018-05-01

    Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners to enhance target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound with a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, electrodermal (skin conductance) responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase in listening effort when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

  14. A Vocal-Based Analytical Method for Goose Behaviour Recognition

    PubMed Central

    Steen, Kim Arild; Therkildsen, Ole Roland; Karstoft, Henrik; Green, Ole

    2012-01-01

    Since human-wildlife conflicts are increasing, the development of cost-effective methods for reducing damage or conflict levels is important in wildlife management. A wide range of devices to detect and deter animals causing conflict are used for this purpose, although their effectiveness is often highly variable, due to habituation to disruptive or disturbing stimuli. Automated recognition of behaviours could form a critical component of a system capable of altering the disruptive stimuli to avoid this. In this paper we present a novel method to automatically recognise goose behaviour based on vocalisations from flocks of free-living barnacle geese (Branta leucopsis). The geese were observed and recorded in a natural environment, using a shielded shotgun microphone. The classification used Support Vector Machines (SVMs), which had been trained with labeled data. Greenwood Function Cepstral Coefficients (GFCC) were used as features for the pattern recognition algorithm, as they can be adjusted to the hearing capabilities of different species. Three behaviours were classified with this approach; the method achieves good recognition of foraging behaviour (86–97% sensitivity, 89–98% precision) and reasonable recognition of flushing (79–86%, 66–80%) and landing behaviour (73–91%, 79–92%). The Support Vector Machine has proven to be a robust classifier for this kind of task, as generality and non-linear capabilities are important. We conclude that vocalisations can be used to automatically detect the behaviour of conflict wildlife species and, as such, may be used as an integrated part of a wildlife management system. PMID:22737037
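
    A brief sketch of the two components named above may help: a Greenwood-warped filter bank (the step that distinguishes GFCCs from mel-scaled MFCCs) and an RBF-kernel SVM. This is an illustration assuming Python with NumPy and scikit-learn; the Greenwood constants shown are the commonly quoted human-cochlea values, whereas the paper adjusts them to the species' hearing, and the feature matrix X and labels y are assumed to be precomputed GFCC vectors and behaviour labels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def greenwood_centres(n_filters, A=165.4, a=2.1, k=0.88):
    """Filter-bank centre frequencies (Hz) warped by the Greenwood function
    f(x) = A * (10**(a*x) - k). The constants here are human defaults and
    would be refit to the hearing range of the target species."""
    x = np.linspace(0.0, 1.0, n_filters)
    return A * (10.0 ** (a * x) - k)

def classify_behaviour(X, y):
    """RBF-kernel SVM over GFCC feature vectors, scored by 5-fold CV."""
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    return cross_val_score(clf, X, y, cv=5)

# X: (n_calls, n_gfcc) GFCC vectors per vocalisation window; y: labels such
# as 'foraging' / 'flushing' / 'landing', as in the study above.
```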

  15. The Vocal Repertoire of Adult and Neonate Giant Otters (Pteronura brasiliensis)

    PubMed Central

    Mumm, Christina A. S.; Knörnschild, Mirjam

    2014-01-01

    Animals use vocalizations to exchange information about external events, their own physical or motivational state, or about individuality and social affiliation. Infant babbling can enhance the development of the full adult vocal repertoire by providing ample opportunity for practice. Giant otters are very social and frequently vocalizing animals. They live in highly cohesive groups, generally including a reproductive pair and their offspring born in different years. This basic social structure may vary in the degree of relatedness of the group members. Individuals engage in shared group activities and different social roles and thus, the social organization of giant otters provides a basis for complex and long-term individual relationships. We recorded and analysed the vocalizations of adult and neonate giant otters from wild and captive groups. We classified the adult vocalizations according to their acoustic structure, and described their main behavioural context. Additionally, we present the first description of vocalizations uttered in babbling bouts of new born giant otters. We expected to find 1) a sophisticated vocal repertoire that would reflect the species’ complex social organisation, 2) that giant otter vocalizations have a clear relationship between signal structure and function, and 3) that the vocal repertoire of new born giant otters would comprise age-specific vocalizations as well as precursors of the adult repertoire. We found a vocal repertoire with 22 distinct vocalization types produced by adults and 11 vocalization types within the babbling bouts of the neonates. A comparison within the otter subfamily suggests a relation between vocal and social complexity, with the giant otters being the socially and vocally most complex species. PMID:25391142

  16. Variability of normal vocal fold dynamics for different vocal loading in one healthy subject investigated by phonovibrograms.

    PubMed

    Doellinger, Michael; Lohscheller, Joerg; McWhorter, Andrew; Kunduk, Melda

    2009-03-01

    We investigate the potential of the high-speed digital imaging technique (HSI) and phonovibrogram (PVG) analysis in normal vocal fold dynamics by studying the effects of continuous voice use (vocal loading) during the workday. One healthy subject was recorded at sustained phonation 13 times within 2 consecutive days, in the morning before and in the afternoon after vocal loading, respectively. Vocal fold dynamics were extracted and visualized by PVGs. The characteristic PVG patterns representing vocal fold vibration types were extracted. The parameter values were then analyzed statistically with respect to vocal load, left-right PVG asymmetries, anterior-posterior PVG asymmetries, and opening-closing differences. For the first time, the direct impact of vocal load could be determined by analyzing vocal fold dynamics. For identical vocal loading conditions, equivalent dynamical behavior of the vocal folds was confirmed. Comparison of the recordings made in the morning with the recordings after work revealed significant changes in vibration behavior, indicating the impact of accumulated vocal load. Left-right asymmetries in vocal fold dynamics were found, confirming earlier assumptions. Different dynamics between the opening and closing phases as well as between the anterior and posterior parts were found. Constant voice usage stresses the vocal folds even in healthy subjects and can be detected by applying the PVG technique. Furthermore, left-right PVG asymmetries do occur in healthy voices to a certain extent. HSI in combination with PVG analysis appears to be a promising tool for investigating vocal fold fatigue and pathologies that produce subtle dynamical changes.

  17. Ultrasonic Vocalizations as a Measure of Affect in Preclinical Models of Drug Abuse: A Review of Current Findings

    PubMed Central

    Barker, David J.; Simmons, Steven J.; West, Mark O.

    2015-01-01

    The present review describes ways in which ultrasonic vocalizations (USVs) have been used in studies of substance abuse. Accordingly, studies are reviewed which demonstrate roles for affective processing in response to the presentation of drug-related cues, experimenter- and self-administered drug, drug withdrawal, and during tests of relapse/reinstatement. The review focuses on data collected from studies using cocaine and amphetamine, where a large body of evidence has been collected. Data suggest that USVs capture animals’ initial positive reactions to psychostimulant administration and are capable of identifying individual differences in affective responding. Moreover, USVs have been used to demonstrate that positive affect becomes sensitized to psychostimulants over acute exposure before eventually exhibiting signs of tolerance. In the drug-dependent animal, a mixture of USVs suggesting positive and negative affect is observed, illustrating mixed responses to psychostimulants. This mixture is predominantly characterized by an initial bout of positive affect followed by an opponent negative emotional state, mirroring affective responses observed in human addicts. During drug withdrawal, USVs demonstrate the presence of negative affective withdrawal symptoms. Finally, it has been shown that drug-paired cues produce a learned, positive anticipatory response during training, and that presentation of drug-paired cues following abstinence produces both positive affect and reinstatement behavior. Thus, USVs are a useful tool for obtaining an objective measurement of affective states in animal models of substance abuse and can increase the information extracted from drug administration studies. USVs enable detection of subtle differences in a behavioral response that might otherwise be missed using traditional measures. PMID:26411762

  18. Applicability of Cone Beam Computed Tomography to the Assessment of the Vocal Tract before and after Vocal Exercises in Normal Subjects.

    PubMed

    Garcia, Elisângela Zacanti; Yamashita, Hélio Kiitiro; Garcia, Davi Sousa; Padovani, Marina Martins Pereira; Azevedo, Renata Rangel; Chiari, Brasília Maria

    2016-01-01

    Cone beam computed tomography (CBCT), which represents an alternative to traditional computed tomography and magnetic resonance imaging, may be a useful instrument for studying vocal tract physiology related to vocal exercises. This study aims to evaluate the applicability of CBCT to the assessment of variations in the vocal tract of healthy individuals before and after vocal exercises. Voice recordings and CBCT images before and after vocal exercises performed by 3 speech-language pathologists without vocal complaints were collected and compared. Each participant performed 1 of 3 exercise types: the Finnish resonance tube technique, the prolonged consonant "b" technique, or the chewing technique. The analysis consisted of acoustic analysis and tomographic imaging. Modifications of the vocal tract settings following vocal exercises were properly detected by CBCT, and changes in the acoustic parameters were, for the most part, compatible with the variations detected in the image measurements. CBCT was shown to be capable of properly assessing the changes in vocal tract settings promoted by vocal exercises. © 2017 S. Karger AG, Basel.

  19. Iconicity can ground the creation of vocal symbols.

    PubMed

    Perlman, Marcus; Dale, Rick; Lupyan, Gary

    2015-08-01

    Studies of gestural communication systems find that they originate from spontaneously created iconic gestures. Yet, we know little about how people create vocal communication systems, and many have suggested that vocalizations do not afford iconicity beyond trivial instances of onomatopoeia. It is unknown whether people can generate vocal communication systems through a process of iconic creation similar to gestural systems. Here, we examine the creation and development of a rudimentary vocal symbol system in a laboratory setting. Pairs of participants generated novel vocalizations for 18 different meanings in an iterative 'vocal' charades communication game. The communicators quickly converged on stable vocalizations, and naive listeners could correctly infer their meanings in subsequent playback experiments. People's ability to guess the meanings of these novel vocalizations was predicted by how close the vocalization was to an iconic 'meaning template' we derived from the production data. These results strongly suggest that the meaningfulness of these vocalizations derived from iconicity. Our findings illuminate a mechanism by which iconicity can ground the creation of vocal symbols, analogous to the function of iconicity in gestural communication systems.

  20. Dependence of phonation threshold pressure on vocal tract acoustics and vocal fold tissue mechanics.

    PubMed

    Chan, Roger W; Titze, Ingo R

    2006-04-01

    Analytical and computer simulation studies have shown that the acoustic impedance of the vocal tract as well as the viscoelastic properties of vocal fold tissues are critical for determining the dynamics and the energy transfer mechanism of vocal fold oscillation. In the present study, a linear, small-amplitude oscillation theory was revised by taking into account the propagation of a mucosal wave and the inertive reactance (inertance) of the supraglottal vocal tract as the major energy transfer mechanisms for flow-induced self-oscillation of the vocal fold. Specifically, analytical results predicted that phonation threshold pressure (Pth) increases with the viscous shear properties of the vocal fold, but decreases with vocal tract inertance. This theory was empirically tested using a physical model of the larynx, where biological materials (fat, hyaluronic acid, and fibronectin) were implanted into the vocal fold cover to investigate the effect of vocal fold tissue viscoelasticity on Pth. A uniform-tube supraglottal vocal tract was also introduced to examine the effect of vocal tract inertance on Pth. Results showed that Pth decreased with the inertive impedance of the vocal tract and increased with the viscous shear modulus (G″) or dynamic viscosity (η′) of the vocal fold cover, consistent with theoretical predictions. These findings supported the potential biomechanical benefits of hyaluronic acid as a surgical bioimplant for repairing voice disorders involving the superficial layer of the lamina propria, such as scarring, sulcus vocalis, atrophy, and Reinke's edema.
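
    The reported dependences can be stated compactly. The following is only a schematic restatement of the abstract's qualitative findings, not the paper's derived formula; symbols follow the abstract (Pth: phonation threshold pressure, G″: viscous shear modulus, η′: dynamic viscosity, I: supraglottal inertance):

```latex
\frac{\partial P_{\mathrm{th}}}{\partial G''} > 0, \qquad
\frac{\partial P_{\mathrm{th}}}{\partial \eta'} > 0, \qquad
\frac{\partial P_{\mathrm{th}}}{\partial I} < 0
```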
