Development of vocal tract length during early childhood: A magnetic resonance imaging study
NASA Astrophysics Data System (ADS)
Vorperian, Houri K.; Kent, Ray D.; Lindstrom, Mary J.; Kalina, Cliff M.; Gentry, Lindell R.; Yandell, Brian S.
2005-01-01
Speech development in children is predicated partly on the growth and anatomic restructuring of the vocal tract. This study examines the growth pattern of the various hard and soft tissue vocal tract structures as visualized by magnetic resonance imaging (MRI), and assesses their relational growth with vocal tract length (VTL). Measurements of lip thickness, hard- and soft-palate length, tongue length, naso-oro-pharyngeal length, mandibular length and depth, and distance of the hyoid bone and larynx from the posterior nasal spine were used from 63 pediatric cases (ages birth to 6 years and 9 months) and 12 adults. Results indicate (a) ongoing growth of all oral and pharyngeal vocal tract structures with no sexual dimorphism, and a period of accelerated growth between birth and 18 months; (b) a vocal tract structure's region (oral/anterior versus pharyngeal/posterior) and orientation (horizontal versus vertical) determine its growth pattern; and (c) the relational growth of the different structures with VTL changes with development: while the increase in VTL throughout development is predominantly due to growth of pharyngeal/posterior structures, VTL is also substantially affected by the growth of oral/anterior structures during the first 18 months of life. Findings provide normative data that can be used for modeling the development of the vocal tract.
NASA Astrophysics Data System (ADS)
Švancara, P.; Horáček, J.; Švec, J. G.
The study presents a three-dimensional (3D) finite element (FE) model of the flow-induced self-oscillation of the human vocal folds in interaction with the acoustics of simplified vocal tract models. The 3D vocal tract models of the acoustic spaces shaped for simulation of phonation of the Czech vowels [a:], [i:] and [u:] were created by converting data from magnetic resonance images (MRI). For modelling of the fluid-structure interaction, an explicit coupling scheme with separate solvers for the fluid and structure domains was used. The FE model comprises vocal fold pretension before the start of phonation, large deformations of the vocal fold tissue, vocal-fold collisions, fluid-structure interaction, morphing of the fluid mesh according to the vocal-fold motion (Arbitrary Lagrangian-Eulerian approach), unsteady viscous compressible airflow described by the Navier-Stokes equations, and airflow separation. The developed FE model makes it possible to study the relationship between flow-induced vibrations of the vocal folds and acoustic wave propagation in the vocal tract, and can also be used to simulate, for example, pathological changes in the vocal fold tissue and their influence on voice production.
Vocal production mechanisms in a non-human primate: morphological data and a model.
Riede, Tobias; Bronson, Ellen; Hatzikirou, Haralambos; Zuberbühler, Klaus
2005-01-01
Human beings are thought to be unique amongst the primates in their capacity to produce rapid changes in the shape of their vocal tracts during speech production. Acoustically, vocal tracts act as resonance chambers, whose geometry determines the position and bandwidth of the formants. Formants provide the acoustic basis for vowels, which enable speakers to refer to external events and to produce other kinds of meaningful communication. Formant-based referential communication is also present in non-human primates, most prominently in Diana monkey alarm calls. Previous work has suggested that the acoustic structure of these calls is the product of a non-uniform vocal tract capable of some degree of articulation. In this study we test this hypothesis by providing morphological measurements of the vocal tract of three adult Diana monkeys, using both radiography and dissection. We use these data to generate a vocal tract computational model capable of simulating the formant structures produced by wild individuals. The model performed best when it combined a non-uniform vocal tract consisting of three different tubes with a number of articulatory manoeuvres. We discuss the implications of these findings for evolutionary theories of human and non-human vocal production.
Vocal Tract Morphology in Inhaling Singing: An MRI-Based Study.
Moerman, Mieke; Vanhecke, Françoise; Van Assche, Lieven; Vercruysse, Johan; Daemers, Kristin; Leman, Marc
2016-07-01
Inhaling singing is a recently developed singing technique explored by the soprano singer Françoise Vanhecke. It is based on an inspiratory airflow instead of an expiratory airflow. This article describes the anatomical structural differences of the vocal tract between inhaling and exhaling singing. We hypothesize that the vocal tract alters significantly in inhaling singing, especially concerning the configuration of the anatomical structures in the oral cavity and the subglottal region. This is a prospective study. A professional singer (F.V.) performed sustained tones from F5 chromatically rising up to Bb5 on the vowel /a/. Vocal tract anatomy is assessed by magnetic resonance imaging. Wilcoxon directional testing demonstrates (1) that the vocal tract volume above the glottal region does not differ statistically in contrast to the subglottal region and (2) significant changes in the configuration of the tongue, the upright position of the epiglottis, the length of the floor of mouth, and the distance between the teeth. The narrowing of the subglottis is considered to be secondary to suction forces used in the inhaling singing technique. The changes in the anatomical structures above the vocal folds possibly suggest a valve-like function controlling the air inlet together with the regulator function of the resonator capacities of the vocal tract. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Vocal tract resonances in speech, singing, and playing musical instruments
Wolfe, Joe; Garnier, Maëva; Smith, John
2009-01-01
In both the voice and musical wind instruments, a valve (vocal folds, lips, or reed) lies between an upstream and downstream duct: trachea and vocal tract for the voice; vocal tract and bore for the instrument. Examining the structural similarities and functional differences gives insight into their operation and the duct-valve interactions. In speech and singing, vocal tract resonances usually determine the spectral envelope and usually have a smaller influence on the operating frequency. The resonances are important not only for the phonemic information they produce, but also because of their contribution to voice timbre, loudness, and efficiency. The role of the tract resonances is usually different in brass and some woodwind instruments, where they modify and to some extent compete or collaborate with resonances of the instrument to control the vibration of a reed or the player’s lips, and/or the spectrum of air flow into the instrument. We give a brief overview of oscillator mechanisms and vocal tract acoustics. We discuss recent and current research on how the acoustical resonances of the vocal tract are involved in singing and the playing of musical wind instruments. Finally, we compare techniques used in determining tract resonances and suggest some future developments. PMID:19649157
Flow-structure interaction simulation of voice production in a canine larynx
NASA Astrophysics Data System (ADS)
Jiang, Weili; Zheng, Xudong; Xue, Qian; Oren, Liran; Khosla, Sid
2017-11-01
Experimental measurements conducted on a hemi-larynx canine vocal fold showed that negative pressures formed in the glottis near the superior surface of the vocal fold in the closing phase even without a supra-glottal vocal tract. It was hypothesized that such negative pressures were due to intraglottal vortices caused by flow separation in a divergent vocal tract during the vocal fold closing phase. This work aims to test this hypothesis numerically. Flow-structure interaction simulations are performed in realistic canine laryngeal shapes. In the simulations, an incompressible flow solver based on a sharp-interface immersed boundary method is used to model the air flow; a finite-element-based solid mechanics solver is used to model the vocal fold vibration. The geometric structure of the vocal fold and vocal tract is based on MRI scans of a mongrel canine. The vocal fold tissue is modeled as a transversely isotropic nonlinear material with a vertical stiffness gradient. Numerical indentation is first performed and compared with the experimental data to obtain the material properties. The inlet and outlet pressure settings follow the experimental setup. Simulation results, including the fundamental frequency, air flow rate, and divergent angle, will be compared with the experimental data to validate the simulation approach. The relationship between flow separation, intra-glottal vortices, divergent angle and flow rate will be comprehensively analyzed.
Vocal tract characteristics in Parkinson's disease.
Gillivan-Murphy, Patricia; Carding, Paul; Miller, Nick
2016-06-01
Voice tremor is strongly linked to the Parkinson's disease speech-voice symptom complex. Little is known about the underlying anatomic source(s) of voice tremor when it occurs. We review recent literature addressing this issue. Additionally, we report findings from a study we conducted employing rating of vocal tract structures viewed using nasolaryngoscopy during vocal and nonspeech tasks. In Parkinson's disease, using laryngeal electromyography, tremor has not been identified in muscles in the vocal folds even when perceived auditorily. Preliminary findings using nasolaryngoscopy suggest that Parkinson's disease voice tremor is not associated with the vocal folds and may involve the palate, the global larynx, and the arytenoids. Tremor in the vertical larynx on /a/, and tremor in the arytenoid cartilages on /s/, differentiated patients with Parkinson's disease from neurologically healthy controls. Reliable visual detection of tremor is challenging when it is absent or only borderline present. Parkinson's disease voice tremor is likely to be related to oscillatory movement in structures across the vocal tract rather than just the vocal folds. To progress clinical practice, more refined tools for the visual rating of tremor would be beneficial. Establishing how far voice tremor represents a functionally significant factor for speakers would also add to the literature.
Monkey vocal tracts are speech-ready.
Fitch, W Tecumseh; de Boer, Bart; Mathur, Neil; Ghazanfar, Asif A
2016-12-01
For four decades, the inability of nonhuman primates to produce human speech sounds has been claimed to stem from limitations in their vocal tract anatomy, a conclusion based on plaster casts made from the vocal tract of a monkey cadaver. We used x-ray videos to quantify vocal tract dynamics in living macaques during vocalization, facial displays, and feeding. We demonstrate that the macaque vocal tract could easily produce an adequate range of speech sounds to support spoken language, showing that previous techniques based on postmortem samples drastically underestimated primate vocal capabilities. Our findings imply that the evolution of human speech capabilities required neural changes rather than modifications of vocal anatomy. Macaques have a speech-ready vocal tract but lack a speech-ready brain to control it.
Sielska-Badurek, Ewelina; Osuch-Wójcikiewicz, Ewa; Sobol, Maria; Kazanecka, Ewa; Niemczyk, Kazimierz
2017-01-01
This study investigated vocal function knowledge and vocal tract sensorimotor self-awareness and the impact of functional voice rehabilitation on vocal function knowledge and self-awareness. This is a prospective, randomized study. Twenty singers (study group [SG]) completed a questionnaire before and after functional voice rehabilitation. Twenty additional singers, representing the control group, also completed the questionnaire without functional voice rehabilitation at a 3-month interval. The questionnaire consisted of three parts. The first part evaluated the singers' attitude to the anatomical and physiological knowledge of the vocal tract and their self-esteem of the knowledge level. The second part assessed the theoretical knowledge of the singers' vocal tract physiology. The third part of the questionnaire assessed singers' sensorimotor self-awareness of the vocal tract. The results showed that most singers indicated that knowledge of the vocal tract's anatomy and physiology is useful (59% SG, 67% control group). However, 75% of all participants defined their knowledge of the vocal tract's anatomy and physiology as weak or inadequate. In the SG, vocal function knowledge at the first assessment was 45%. After rehabilitation, the level increased to 67.7%. Vocal tract sensorimotor self-awareness initially was 38.9% in SG but rose to 66.7%. Findings of the study suggest that classical singers lack knowledge about the physiology of the vocal mechanism, especially the breathing patterns. In addition, they have low sensorimotor self-awareness of their vocal tract. The results suggest that singers would benefit from receiving services from phoniatrists and speech-language pathologists during their voice training. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Artificially lengthened and constricted vocal tract in vocal training methods.
Bele, Irene Velsvik
2005-01-01
It is common practice in vocal training to make use of vocal exercise techniques that involve partial occlusion of the vocal tract. Various techniques are used; some of them form an occlusion within the front part of the oral cavity or at the lips. Another vocal exercise technique involves lengthening the vocal tract; for example, the method of phonation into small tubes. This essay presents some studies made on the effects of various vocal training methods that involve an artificially lengthened and constricted vocal tract. The influence of sufficient acoustic impedance on vocal fold vibration and economical voice production is presented.
Sielska-Badurek, Ewelina M; Sobol, Maria; Olszowska, Katarzyna; Niemczyk, Kazimierz
2017-10-03
The purpose of this study was to assess the voice quality and the vocal tract function in popular singing students at the beginning of their singing training at the High School of Music. This is a retrospective cross-sectional study. The study consisted of 45 popular singing students (35 females and 10 males, mean age: 19.9 ± 2.8 years). They were assessed in the first 2 months of their 4-year singing training at the High School of Music, between 2013 and 2016. Voice quality and vocal tract function were evaluated using videolaryngostroboscopy, palpation of the vocal tract structures, the perceptual speaking and singing voice assessment, acoustic analysis, maximal phonation time, the Voice Handicap Index, and the Singing Voice Handicap Index (SVHI). Twenty-two percent of Contemporary Commercial Music singing students began their education in the High School with vocal nodules. Palpation of the vocal tract structures showed correct motions and tension in 50% of students while speaking and in 39.3% while singing. Perceptual voice assessment showed proper speaking voice quality in 80% and proper singing voice quality in 82.4%. The mean vocal fundamental frequency while speaking was 214 Hz in females and 116 Hz in males. Dysphonia Severity Index was at the level of 2, and maximum phonation time was 17.7 seconds. The Voice Handicap Index and the SVHI remained within the normal range: 7.5 and 19, respectively. Perceptual singing voice assessment correlated with the SVHI (P = 0.006). Twenty-two percent of the Contemporary Commercial Music singing students began their education in the High School with organic vocal fold lesions. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Morphological Variation in the Adult Hard Palate and Posterior Pharyngeal Wall
Lammert, Adam; Proctor, Michael; Narayanan, Shrikanth
2013-01-01
Purpose: Adult human vocal tracts display considerable morphological variation across individuals, but the nature and extent of this variation has not been extensively studied for many vocal tract structures. There exists a need to analyze morphological variation and, even more basically, to develop a methodology for morphological analysis of the vocal tract. Such analysis will facilitate fundamental characterization of the speech production system, with broad implications from modeling to explaining inter-speaker variability. Method: A data-driven methodology to automatically analyze the extent and variety of morphological variation is proposed and applied to a diverse subject pool of 36 adults. Analysis is focused on two key aspects of vocal tract structure: the midsagittal shape of the hard palate and the posterior pharyngeal wall. Results: Palatal morphology varies widely in its degree of concavity, but also in anteriority and sharpness. Pharyngeal wall morphology, by contrast, varies mostly in terms of concavity alone. The distribution of morphological characteristics is complex, and analysis suggests that certain variations may be categorical in nature. Conclusion: Major modes of morphological variation are identified, including their relative magnitude, distribution and categorical nature. Implications of these findings for speech articulation strategies and speech acoustics are discussed. PMID:23690566
Vocal tract length and acoustics of vocalization in the domestic dog (Canis familiaris).
Riede, T; Fitch, T
1999-10-01
The physical nature of the vocal tract results in the production of formants during vocalisation. In some animals (including humans), receivers can derive information (such as body size) about sender characteristics on the basis of formant characteristics. Domestication and selective breeding have resulted in a high variability in head size and shape in the dog (Canis familiaris), suggesting that there might be large differences in the vocal tract length, which could cause formant behaviour to affect interbreed communication. Lateral radiographs were made of dogs from several breeds ranging in size from a Yorkshire terrier (2.5 kg) to a German shepherd (50 kg) and were used to measure vocal tract length. In addition, we recorded an acoustic signal (growling) from some dogs. Significant correlations were found between vocal tract length, body mass and formant dispersion, suggesting that formant dispersion can deliver information about the body size of the vocalizer. Because of the low correlation between vocal tract length and the first formant, we predict a non-uniform vocal tract shape.
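The abstract above does not state how formant dispersion is computed or how it relates to vocal tract length. As a minimal sketch, assuming the usual definition of formant dispersion as the mean spacing between successive formants and assuming a uniform tube closed at one end (for which neighbouring resonances are spaced c/(2L) apart), the relation could be illustrated as follows. The speed of sound (350 m/s), the function names, and the formant values are assumptions chosen for illustration and are not taken from the study.

```python
# Assumed speed of sound in warm, humid air inside a vocal tract (m/s).
SPEED_OF_SOUND_M_S = 350.0


def formant_dispersion(formants_hz):
    """Return the mean spacing (Hz) between successive formants."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    ordered = sorted(formants_hz)
    gaps = [hi - lo for lo, hi in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def vtl_from_dispersion(dispersion_hz, c=SPEED_OF_SOUND_M_S):
    """Return the length (m) of a uniform closed-open tube with this spacing."""
    if dispersion_hz <= 0:
        raise ValueError("dispersion must be positive")
    return c / (2.0 * dispersion_hz)


if __name__ == "__main__":
    # Hypothetical formant values, not measurements from the study.
    example_formants = [500.0, 1500.0, 2500.0, 3500.0]
    dispersion = formant_dispersion(example_formants)
    print(dispersion, vtl_from_dispersion(dispersion))
```

Under these assumptions a longer tract gives a smaller dispersion, which is the direction of the correlation the abstract reports. A real vocal tract is not a uniform tube, so an estimate of this kind is only approximate.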
Correlation between vocal tract symptoms and modern singing handicap index in church gospel singers.
Pinheiro, Joel; Silverio, Kelly Cristina Alves; Siqueira, Larissa Thaís Donalonso; Ramos, Janine Santos; Brasolotto, Alcione Ghedini; Zambon, Fabiana; Behlau, Mara
2017-08-24
To verify the correlation between vocal tract discomfort symptoms and perceived voice handicaps in gospel singers, analyzing possible differences according to gender. 100 gospel singers volunteered, 50 male and 50 female. All participants answered two questionnaires: Vocal Tract Discomfort (VTD) scale and the Modern Singing Handicap Index (MSHI) that investigates the vocal handicap perceived by singers, linking the results of both instruments (p<0.05). Women presented more perceived handicaps and also more frequent and higher intensity vocal tract discomfort. Furthermore, the more frequent and intense the vocal tract symptoms, the higher the vocal handicap for singing. Female gospel singers present higher frequency and intensity of vocal tract discomfort symptoms, as well as higher voice handicap for singing than male gospel singers. The higher the frequency and intensity of the laryngeal symptoms, the higher the vocal handicap will be.
Dependence of phonation threshold pressure on vocal tract acoustics and vocal fold tissue mechanics.
Chan, Roger W; Titze, Ingo R
2006-04-01
Analytical and computer simulation studies have shown that the acoustic impedance of the vocal tract as well as the viscoelastic properties of vocal fold tissues are critical for determining the dynamics and the energy transfer mechanism of vocal fold oscillation. In the present study, a linear, small-amplitude oscillation theory was revised by taking into account the propagation of a mucosal wave and the inertive reactance (inertance) of the supraglottal vocal tract as the major energy transfer mechanisms for flow-induced self-oscillation of the vocal fold. Specifically, analytical results predicted that phonation threshold pressure (Pth) increases with the viscous shear properties of the vocal fold, but decreases with vocal tract inertance. This theory was empirically tested using a physical model of the larynx, where biological materials (fat, hyaluronic acid, and fibronectin) were implanted into the vocal fold cover to investigate the effect of vocal fold tissue viscoelasticity on Pth. A uniform-tube supraglottal vocal tract was also introduced to examine the effect of vocal tract inertance on Pth. Results showed that Pth decreased with the inertive impedance of the vocal tract and increased with the viscous shear modulus (G″) or dynamic viscosity (η′) of the vocal fold cover, consistent with theoretical predictions. These findings supported the potential biomechanical benefits of hyaluronic acid as a surgical bioimplant for repairing voice disorders involving the superficial layer of the lamina propria, such as scarring, sulcus vocalis, atrophy, and Reinke's edema.
Computational model for vocal tract dynamics in a suboscine bird.
Assaneo, M F; Trevisan, M A
2010-09-01
In a recent work, active use of the vocal tract has been reported for singing oscines. The reconfiguration of the vocal tract during song serves to match its resonances to the syringeal fundamental frequency, demonstrating a precise coordination of the two main pieces of the avian vocal system for songbirds characterized by tonal songs. In this work we investigated the Great Kiskadee (Pitangus sulfuratus), a suboscine bird whose calls display a rich harmonic content. Using a recently developed mathematical model for the syrinx and a mobile vocal tract, we set up a computational model that provides a plausible reconstruction of the vocal tract movement using a few spectral features taken from the utterances. Moreover, synthetic calls were generated using the articulated vocal tract that accounts for all the acoustical features observed experimentally.
Relevance of the Implementation of Teeth in Three-Dimensional Vocal Tract Models
ERIC Educational Resources Information Center
Traser, Louisa; Birkholz, Peter; Flügge, Tabea Viktoria; Kamberger, Robert; Burdumy, Michael; Richter, Bernhard; Korvink, Jan Gerrit; Echternach, Matthias
2017-01-01
Purpose: Recently, efforts have been made to investigate the vocal tract using magnetic resonance imaging (MRI). Due to technical limitations, teeth were omitted in many previous studies on vocal tract acoustics. However, the knowledge of how teeth influence vocal tract acoustics might be important in order to estimate the necessity of…
Garcia, Elisângela Zacanti; Yamashita, Hélio Kiitiro; Garcia, Davi Sousa; Padovani, Marina Martins Pereira; Azevedo, Renata Rangel; Chiari, Brasília Maria
2016-01-01
Cone beam computed tomography (CBCT), which represents an alternative to traditional computed tomography and magnetic resonance imaging, may be a useful instrument to study vocal tract physiology related to vocal exercises. This study aims to evaluate the applicability of CBCT to the assessment of variations in the vocal tract of healthy individuals before and after vocal exercises. Voice recordings and CBCT images before and after vocal exercises performed by 3 speech-language pathologists without vocal complaints were collected and compared. Each participant performed 1 type of exercise, i.e., Finnish resonance tube technique, prolonged consonant "b" technique, or chewing technique. The analysis consisted of an acoustic analysis and tomographic imaging. Modifications of the vocal tract settings following vocal exercises were properly detected by CBCT, and changes in the acoustic parameters were, for the most part, compatible with the variations detected in image measurements. CBCT was shown to be capable of properly assessing the changes in vocal tract settings promoted by vocal exercises. © 2017 S. Karger AG, Basel.
FE Modelling of the Fluid-Structure-Acoustic Interaction for the Vocal Folds Self-Oscillation
NASA Astrophysics Data System (ADS)
Švancara, Pavel; Horáček, J.; Hrůza, V.
The flow-induced self-oscillation of the human vocal folds in interaction with acoustic processes in a simplified vocal tract model was explored using a three-dimensional (3D) finite element (FE) model. The developed FE model includes vocal fold pretension before phonation, large deformations of the vocal fold tissue, vocal fold contact, fluid-structure interaction, morphing of the fluid mesh according to the vocal fold motion (Arbitrary Lagrangian-Eulerian approach), unsteady viscous compressible airflow described by the Navier-Stokes equations, and airflow separation during glottis closure. An iterative partitioned approach is used for modelling the fluid-structure interaction. Computed results show that the developed model can be used for simulation of the vocal fold self-oscillation and the resulting acoustic waves. The developed model makes it possible to simulate numerically the influence of some pathological changes in the vocal fold tissue on voice production.
Ng, Manwa L; Yan, Nan; Chan, Venus; Chen, Yang; Lam, Paul K Y
2018-06-28
Previous studies of the laryngectomized vocal tract using formant frequencies reported contradictory findings. Imaging studies of the vocal tract in alaryngeal speakers are limited due to the possible radiation effect as well as the cost and time associated with the studies. The present study examined the vocal tract configuration of laryngectomized individuals using acoustic reflection technology. Thirty alaryngeal and 30 laryngeal male speakers of Cantonese participated in the study. A pharyngometer was used to obtain volumetric information of the vocal tract. All speakers were instructed to imitate the production of /a/ while the length and volume information of the oral cavity, pharyngeal cavity, and the entire vocal tract was obtained. The data of alaryngeal and laryngeal speakers were compared. Pharyngometric measurements revealed no significant difference in the vocal tract dimensions between laryngeal and alaryngeal speakers. Despite the removal of the larynx and a possible alteration in the pharyngeal cavity during total laryngectomy, the vocal tract configuration (length and volume) in laryngectomized individuals was not significantly different from that of laryngeal speakers. It is suggested that other factors might have affected formant measures in previous studies. © 2018 S. Karger AG, Basel.
ERIC Educational Resources Information Center
Neely, Kimberly D.; Bunton, Kate; Story, Brad H.
2016-01-01
Purpose: This study used a computational vocal tract model to investigate the relationship of diphthong duration and vocal tract movement magnitude to measures of the F2 trajectory in CV words. Method: Three words ("bough," "boy," and "buy") were simulated on the basis of an adult female vocal tract model, in which…
How small could a pup sound? The physical bases of signaling body size in harbor seals
Gross, Stephanie; Garcia, Maxime; Rubio-Garcia, Ana; de Boer, Bart
2017-01-01
Vocal communication is a crucial aspect of animal behavior. The mechanism which most mammals use to vocalize relies on three anatomical components. First, air overpressure is generated inside the lower vocal tract. Second, as the airstream goes through the glottis, sound is produced via vocal fold vibration. Third, this sound is further filtered by the geometry and length of the upper vocal tract. Evidence from mammalian anatomy and bioacoustics suggests that some of these three components may covary with an animal’s body size. The framework provided by acoustic allometry suggests that, because vocal tract length (VTL) is more strongly constrained by the growth of the body than vocal fold length (VFL), VTL generates more reliable acoustic cues to an animal’s size. This hypothesis is often tested acoustically but rarely anatomically, especially in pinnipeds. Here, we test the anatomical bases of the acoustic allometry hypothesis in harbor seal pups Phoca vitulina. We dissected and measured vocal tract, vocal folds, and other anatomical features of 15 harbor seals post-mortem. We found that, while VTL correlates with body size, VFL does not. This suggests that, while body growth puts anatomical constraints on how vocalizations are filtered by harbor seals’ vocal tract, no such constraints appear to exist on vocal folds, at least during puppyhood. It is particularly interesting to find anatomical constraints on harbor seals’ vocal tracts, the same anatomical region partially enabling pups to produce individually distinctive vocalizations. PMID:29492005
Visualizing Sound Emission of Elephant Vocalizations: Evidence for Two Rumble Production Types
Stoeger, Angela S.; Heilmann, Gunnar; Zeppelzauer, Matthias; Ganswindt, André; Hensman, Sean; Charlton, Benjamin D.
2012-01-01
Recent comparative data reveal that formant frequencies are cues to body size in animals, due to a close relationship between formant frequency spacing, vocal tract length and overall body size. Accordingly, intriguing morphological adaptations to elongate the vocal tract in order to lower formants occur in several species, with the size exaggeration hypothesis being proposed to justify most of these observations. While the elephant trunk is strongly implicated to account for the low formants of elephant rumbles, it is unknown whether elephants emit these vocalizations exclusively through the trunk, or whether the mouth is also involved in rumble production. In this study we used a sound visualization method (an acoustic camera) to record rumbles of five captive African elephants during spatial separation and subsequent bonding situations. Our results showed that the female elephants in our analysis produced two distinct types of rumble vocalizations based on vocal path differences: a nasally- and an orally-emitted rumble. Interestingly, nasal rumbles predominated during contact calling, whereas oral rumbles were mainly produced in bonding situations. In addition, nasal and oral rumbles varied considerably in their acoustic structure. In particular, the values of the first two formants reflected the estimated lengths of the vocal paths, corresponding to a vocal tract length of around 2 meters for nasal, and around 0.7 meters for oral rumbles. These results suggest that African elephants may be switching vocal paths to actively vary vocal tract length (with considerable variation in formants) according to context, and call for further research investigating the function of formant modulation in elephant vocalizations. 
Furthermore, by confirming the use of the elephant trunk in long distance rumble production, our findings provide an explanation for the extremely low formants in these calls, and may also indicate that formant lowering functions to increase call propagation distances in this species. PMID:23155427
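As a rough illustration of why a longer vocal path lowers formants, the sketch below computes the resonances of a uniform tube closed at one end, a standard textbook idealization and not the estimation procedure used in the study. The function name, the assumed speed of sound (350 m/s), and the uniform-tube geometry are all assumptions made for illustration.

```python
# Idealized quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L).
# The speed of sound below is an assumed value for warm, humid air.
SPEED_OF_SOUND_M_S = 350.0


def uniform_tube_formants(length_m, n_formants=3, c=SPEED_OF_SOUND_M_S):
    """Return the first resonances (Hz) of a uniform tube, closed at one end."""
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_formants + 1)]


# Vocal path lengths quoted in the abstract (about 2 m nasal, about 0.7 m oral).
nasal_formants = uniform_tube_formants(2.0)
oral_formants = uniform_tube_formants(0.7)
```

Under these assumptions the 2 m path gives a lowest resonance near 44 Hz and the 0.7 m path one near 125 Hz. A real trunk or mouth is not a uniform tube, so these figures show only the direction and rough size of the effect.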
Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques.
Fitch, W T
1997-08-01
Body weight, length, and vocal tract length were measured for 23 rhesus macaques (Macaca mulatta) of various sizes using radiographs and computer graphic techniques. Linear predictive coding analysis of tape-recorded threat vocalizations was used to determine vocal tract resonance frequencies ("formants") for the same animals. A new acoustic variable is proposed, "formant dispersion," which should theoretically depend upon vocal tract length. Formant dispersion is the averaged difference between successive formant frequencies, and was found to be closely tied to both vocal tract length and body size. Despite the common claim that voice fundamental frequency (F0) provides an acoustic indication of body size, repeated investigations have failed to support such a relationship in many vertebrate species including humans. Formant dispersion, unlike voice pitch, is proposed to be a reliable predictor of body size in macaques, and probably many other species.
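The definition of formant dispersion given above (the averaged difference between successive formant frequencies) can be written out as a minimal sketch. The function name and the example formant values are hypothetical, and the input is assumed to be in ascending order (F1, F2, ...).

```python
def formant_dispersion(formants_hz):
    """Average difference between successive formant frequencies, in Hz.

    Expects formants in ascending order (F1, F2, ...). For N formants this
    equals (F_N - F_1) / (N - 1).
    """
    if len(formants_hz) < 2:
        raise ValueError("at least two formants are required")
    diffs = [upper - lower for lower, upper in zip(formants_hz, formants_hz[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical formant values, chosen only to illustrate the arithmetic.
example_dispersion = formant_dispersion([500.0, 1500.0, 2500.0])
```

Because the successive differences telescope, only the lowest and highest measured formants and the number of formants affect the result.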
Voice classification and vocal tract of singers: a study of x-ray images and morphology.
Roers, Friederike; Mürbe, Dirk; Sundberg, Johan
2009-01-01
This investigation compares vocal tract dimensions and the classification of singer voices by examining x-ray material assembled between 1959 and 1991 of students admitted to the solo singing education at the University of Music, Dresden, Germany. A total of 132 images were available for analysis. The lengths of the total vocal tract, the pharynx, and the mouth cavity, as well as the relative position of the larynx, the height of the palatal arch, and the estimated vocal fold length, were compared statistically across voice classifications, and some significant differences were found. The length of the pharynx cavity seemed particularly influential on the total vocal tract length, which varied systematically with classification. Also studied were the relationships between voice classification and the body height and weight and the body mass index. The data support the hypothesis that there are consistent morphological vocal tract differences between singers of different voice classifications.
NASA Astrophysics Data System (ADS)
Vorperian, Houri K.; Chung, Moo K.; Gentry, Lindell R.; Kent, Ray D.; Choih, Celia S.; Durtschi, Reid B.; Ziegert, Andrew J.
2005-09-01
As the vocal tract length (VTL) increases more than twofold from infancy to adulthood, its geometric proportions change. This study assesses the developmental changes of the various hard and soft tissue structures in the vicinity of the vocal tract (VT), and evaluates the relational growth of the various structures with VTL. Magnetic resonance images from 327 cases, ages birth to age 20, were used to secure quantitative measurements of the various soft, cartilaginous and bony structures in the oral and pharyngeal regions using established procedures [Vorperian et al. (1999), (2005)]. Structures measured include: lip thickness, hard- and soft-palate length, tongue length, naso-oro-pharyngeal length, mandibular length and depth, and distance of the hyoid bone and larynx from the posterior nasal spine. Findings indicate: (a) ongoing growth of all oral and pharyngeal structures with changes in growth rate as a function of age; (b) a strong interdependency between structure orientation and its growth curve; and (c) developmental changes in the relational growth of the different VT structures with VTL. Findings provide normative data on the anatomic changes of the supra-laryngeal speech apparatus, and can be used to model the development of the VT. [Work supported by NIH-NIDCD Grants R03-DC4362, R01-DC006282, and NIH-NICHHD P30-HK03352.]
Vocal tract shapes in different singing functions used in musical theater singing-a pilot study.
Echternach, Matthias; Popeil, Lisa; Traser, Louisa; Wienhausen, Sascha; Richter, Bernhard
2014-09-01
Singing styles in Musical Theater singing might differ in many ways from Western Classical singing. However, vocal tract adjustments are not understood in detail. Vocal tract shapes of a single professional Music Theater female subject were analyzed concerning different aspects of singing styles using dynamic real-time magnetic resonance imaging technology with a frame rate of 8 fps. The different tasks include register differences, belting, and vibrato strategies. Articulatory differences were found between head register, modal register, and belting. Also, some vibrato strategies ("jazzy" vibrato) do involve vocal tract adjustments, whereas others (classical vibrato) do not. Vocal tract shaping might contribute to the establishment of different singing functions in Musical Theater singing. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Hoarseness and vocal tract discomfort and associated risk factors in air traffic controllers.
Korn, Gustavo Polacow; Villar, Anna Carolina; Azevedo, Renata Rangel
2018-04-05
An air traffic controller is a professional who performs air traffic control functions in air traffic control units and is responsible for controlling the various stages of a flight. This study aimed to compare hoarseness and vocal tract discomfort and their risk factors among air traffic controllers in the approach control of São Paulo. In a cross-sectional survey, a voice self-evaluation adapted from the self-evaluation prepared by the Brazilian Ministry of Labor for teachers was administered to 76 air traffic controllers at approach control of São Paulo, Brazil. The percentage of hoarseness and vocal tract discomfort was 19.7% and 38.2%, respectively. In relation to air pollution, the percentages of hoarseness and vocal tract discomfort were higher among those who consider their working environment to be intolerable than among those in a comfortable or disturbing environment. The percentage of hoarseness was higher among those who seek medical advice due to vocal complaints and among those who experience difficulty using their voice at work than among those who experience mild or no difficulty. The percentage of vocal tract discomfort was higher among those in a very tense and stressful environment than among those who consider their work environment to be mild or moderately tense and stressful. The percentage of vocal tract discomfort was higher among those who describe themselves as very tense and stressed or tense and stressed than among those who describe themselves as calm. Additionally, the percentage of vocal tract discomfort was higher among those who care about their health. Among air traffic controllers, the percentage of vocal tract discomfort was almost twice that of hoarseness. Both symptoms are prevalent among air traffic controllers who considered their workplace intolerable in terms of air pollution. Vocal tract discomfort was related to a tense and stressful environment, and hoarseness was related to difficulty using the voice at work. 
Copyright © 2018 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Effect of artificially lengthened vocal tract on vocal fold oscillation's fundamental frequency.
Hanamitsu, Masakazu; Kataoka, Hideyuki
2004-06-01
The fundamental frequency of vocal fold oscillation (F(0)) is controlled by laryngeal mechanics and aerodynamic properties. F(0) change per unit change of transglottal pressure (dF/dP) using a shutter valve has been studied and found to have a nonlinear, V-shaped relationship with F(0). On the other hand, the vocal tract is also known to affect vocal fold oscillation. This study examined the effect of an artificially lengthened vocal tract on dF/dP. dF/dP was measured in six men using two mouthpieces of different lengths. The dF/dP graph for the longer vocal tract was shifted leftward relative to the shorter one. Using the one-mass model, the nadir of the "V" on the dF/dP graph was strongly influenced by the resonance around the first formant frequency. However, a more precise model is needed to account for the effects of viscosity and turbulence.
On Short-Time Estimation of Vocal Tract Length from Formant Frequencies
Lammert, Adam C.; Narayanan, Shrikanth S.
2015-01-01
Vocal tract length is highly variable across speakers and determines many aspects of the acoustic speech signal, making it an essential parameter to consider for explaining behavioral variability. A method for accurate estimation of vocal tract length from formant frequencies would afford normalization of interspeaker variability and facilitate acoustic comparisons across speakers. A framework for considering estimation methods is developed from the basic principles of vocal tract acoustics, and an estimation method is proposed that follows naturally from this framework. The proposed method is evaluated using acoustic characteristics of simulated vocal tracts ranging from 14 to 19 cm in length, as well as real-time magnetic resonance imaging data with synchronous audio from five speakers whose vocal tracts range from 14.5 to 18.0 cm in length. Evaluations show improvements in accuracy over previously proposed methods, with 0.631 and 1.277 cm root mean square error on simulated and human speech data, respectively. Empirical results show that the effectiveness of the proposed method is based on emphasizing higher formant frequencies, which seem less affected by speech articulation. Theoretical predictions of formant sensitivity reinforce this empirical finding. Moreover, theoretical insights are explained regarding the reason for differences in formant sensitivity. PMID:26177102
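For orientation only, the sketch below shows the simplest kind of formant-based length estimate: each formant is inverted through the uniform-tube (quarter-wavelength) relation and the per-formant estimates are averaged with equal weight. This is not the method proposed in the paper, which according to the abstract emphasizes higher formants. The function name, the assumed speed of sound (35,000 cm/s), and the example formants are illustrative assumptions.

```python
def estimate_vtl_cm(formants_hz, c_cm_s=35000.0):
    """Equal-weight vocal tract length estimate (cm) from ordered formants.

    Each formant F_n (n = 1, 2, ...) is inverted through the uniform-tube
    relation L = (2n - 1) * c / (4 * F_n), and the estimates are averaged.
    """
    if not formants_hz:
        raise ValueError("at least one formant is required")
    if any(f <= 0 for f in formants_hz):
        raise ValueError("formant frequencies must be positive")
    estimates = [
        (2 * n - 1) * c_cm_s / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Formants of an ideal 17.5 cm uniform tube under the assumed speed of sound.
example_length_cm = estimate_vtl_cm([500.0, 1500.0, 2500.0])
```

A weighted average that trusts higher formants more would be closer in spirit to the finding reported above, but the abstract does not give the weights, so none are assumed here.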
ERIC Educational Resources Information Center
Meerschman, Iris; Van Lierde, Kristiane; Peeters, Karen; Meersman, Eline; Claeys, Sofie; D'haeseleer, Evelien
2017-01-01
Purpose: The purpose of this study was to determine the short-term effect of 2 semi-occluded vocal tract training programs, "resonant voice training using nasal consonants" versus "straw phonation," on the vocal quality of vocally healthy future occupational voice users. Method: A multigroup pretest--posttest randomized control…
Multimodal modeling and validation of simplified vocal tract acoustics for sibilant /s/
NASA Astrophysics Data System (ADS)
Yoshinaga, T.; Van Hirtum, A.; Wada, S.
2017-12-01
To investigate the acoustic characteristics of sibilant /s/, multimodal theory is applied to a simplified vocal tract geometry derived from a CT scan of a single speaker for whom the sound spectrum was gathered. The vocal tract was represented by a concatenation of waveguides with rectangular cross-sections and constant width, and a sound source was placed either at the inlet of the vocal tract or downstream from the constriction representing the sibilant groove. The modeled pressure amplitude was validated experimentally using an acoustic driver or airflow supply at the vocal tract inlet. Results showed that the spectrum predicted with the source at the inlet and including higher-order modes matched the spectrum measured with the acoustic driver at the inlet. Spectra modeled with the source downstream from the constriction captured the first characteristic peak observed for the speaker at 4 kHz. By positioning the source near the upper teeth wall, the higher frequency peak observed for the speaker at 8 kHz was predicted with the inclusion of higher-order modes. At the frequencies of the characteristic peaks, nodes and antinodes of the pressure amplitude were observed in the simplified vocal tract when the source was placed downstream from the constriction. These results indicate that the multimodal approach makes it possible to capture the amplitude and frequency of the peaks in the spectrum as well as the nodes and antinodes of the pressure distribution due to /s/ inside the vocal tract.
The Vocal Tract Organ: A New Musical Instrument Using 3-D Printed Vocal Tracts.
Howard, David M
2017-10-27
The advent and now increasingly widespread availability of 3-D printers is transforming our understanding of the natural world by enabling observations to be made in a tangible manner. This paper describes the use of 3-D printed models of the vocal tract for different vowels that are used to create an acoustic output when stimulated with an appropriate sound source in a new musical instrument: the Vocal Tract Organ. The shape of each printed vocal tract is recovered from magnetic resonance imaging. It sits atop a loudspeaker to which is provided an acoustic L-F model larynx input signal that is controlled by the notes played on a musical instrument digital interface device such as a keyboard. The larynx input is subject to vibrato with extent and frequency adjustable as desired within the ranges usually found for human singing. Polyphonic inputs for choral singing textures can be applied via a single loudspeaker and vocal tract, invoking the approximation of linearity in the voice production system, thereby making multiple vowel stops a possibility while keeping the complexity of the instrument in reasonable check. The Vocal Tract Organ offers a much more human and natural sounding result than the traditional Vox Humana stops found in larger pipe organs, offering the possibility of enhancing pipe organs of the future as well as becoming the basis for a "multi-vowel" chamber organ in its own right. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Vocal effort modulates the motor planning of short speech structures
NASA Astrophysics Data System (ADS)
Taitz, Alan; Shalom, Diego E.; Trevisan, Marcos A.
2018-05-01
Speech requires programming the sequence of vocal gestures that produce the sounds of words. Here we explored the timing of this program by asking our participants to pronounce, as quickly as possible, a sequence of consonant-consonant-vowel (CCV) structures appearing on screen. We measured the delay between visual presentation and voice onset. In the case of plosive consonants, produced by sharp and well defined movements of the vocal tract, we found that delays are positively correlated with the duration of the transition between consonants. We then used a battery of statistical tests and mathematical vocal models to show that delays reflect the motor planning of CCVs and transitions are proxy indicators of the vocal effort needed to produce them. These results support that the effort required to produce the sequence of movements of a vocal gesture modulates the onset of the motor plan.
Plotsky, K.; Rendall, D.; Riede, T.; Chase, K.
2013-09-01
Body size is an important determinant of resource and mate competition in many species. Competition is often mediated by conspicuous vocal displays, which may help to intimidate rivals and attract mates by providing honest cues to signaler size. Fitch proposed that vocal tract resonances (or formants) should provide particularly good, or honest, acoustic cues to signaler size because they are determined by the length of the vocal tract, which in turn, is hypothesized to scale reliably with overall body size. There is some empirical support for this hypothesis, but to date, many of the effects have been either mixed for males compared with females, weaker than expected in one or the other sex, or complicated by sampling issues. In this paper, we undertake a direct test of Fitch’s hypothesis in two canid species using large samples that control for age- and sex-related variation. The samples involved radiographic images of 120 Portuguese water dogs Canis lupus familiaris and 121 Russian silver foxes Vulpes vulpes. Direct measurements were made of vocal tract length from X-ray images and compared against independent measures of body size. In adults of both species, and within both sexes, overall vocal tract length was strongly and significantly correlated with body size. Effects were strongest for the oral component of the vocal tract. By contrast, the length of the pharyngeal component was not as consistently related to body size. These outcomes are some of the clearest evidence to date in support of Fitch’s hypothesis. At the same time, they highlight the potential for elements of both honest and deceptive body signaling to occur simultaneously via differential acoustic cues provided by the oral versus pharyngeal components of the vocal tract. PMID:24363497
Samlan, Robin A.; Story, Brad H.
2011-01-01
Purpose To relate vocal fold structure and kinematics to two acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). Method A computational, kinematic model of the medial surfaces of the vocal folds was used to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow and acoustic waveforms and related to the underlying vocal fold kinematics. Results CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1–1.5 mm and decreased with separation > 1.5 mm. Conclusions CPP is generally a function of vocal process separation. H1*-H2* will increase or decrease with vocal process separation based on vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures. PMID:21498582
An acoustic glottal source for vocal tract physical models
NASA Astrophysics Data System (ADS)
Hannukainen, Antti; Kuortti, Juha; Malinen, Jarmo; Ojalammi, Antti
2017-11-01
A sound source is proposed for the acoustic measurement of physical models of the human vocal tract. The physical models are produced by fast prototyping, based on magnetic resonance imaging during prolonged vowel production. The sound source, accompanied by custom signal processing algorithms, is used for two kinds of measurements from physical models of the vocal tract: (i) amplitude frequency response and resonant frequency measurements, and (ii) signal reconstructions at the source output according to a target pressure waveform with measurements at the mouth position. The proposed source and the software are validated by computational acoustics experiments and measurements on a physical model of the vocal tract corresponding to the vowels [] of a male speaker.
Frey, Roland; Volodin, Ilya; Volodina, Elena; Soldatova, Natalia V; Juldaschev, Erkin T
2011-01-01
Similar to male humans, Homo sapiens, the males of a few polygynous ruminants – red deer Cervus elaphus, fallow deer Dama dama and Mongolian gazelle Procapra gutturosa – have a more or less enlarged, low-resting larynx and are capable of additional dynamic vocal tract elongation by larynx retraction during their rutting calls. The vocal correlates of a large larynx and an elongated vocal tract, a low fundamental frequency and low vocal tract resonance frequencies, deter rival males and attract receptive females. The males of the polygynous goitred gazelle, Gazella subgutturosa, provide another, independently evolved, example of an enlarged and low-resting larynx of high mobility. Relevant aspects of the rutting behaviour of territorial wild male goitred gazelles are described. Video and audio recordings served to study the acoustic effects of the enlarged larynx and vocal tract elongation on male rutting calls. Three call types were discriminated: roars, growls and grunts. In addition, the adult male vocal anatomy during the emission of rutting calls is described and functionally discussed using a 2D-model of larynx retraction. The combined morphological, behavioural and acoustic data are discussed in relation to the hypothesis of sexual selection for male-specific deep voices, resulting in convergent features of vocal anatomy in a few polygynous ruminants and in human males. PMID:21413987
Laukkanen, Anne-Maria; Pulakka, Hannu; Alku, Paavo; Vilkman, Erkki; Hertegård, Stellan; Lindestad, Per-Ake; Larsson, Hans; Granqvist, Svante
2007-01-01
Vocal exercises that increase the vocal tract impedance are widely used in voice training and therapy. The present study applies a versatile methodology to investigate phonation during varying artificial extension of the vocal tract. Two males and one female phonated into a hard-walled plastic tube (diameter 2 cm), whose physical length was randomly pair-wise changed between 30 cm, 60 cm and 100 cm. High-speed image (1900 f/sec) sequences of the vocal folds were obtained via a rigid endoscope. Acoustic and electroglottographic signals (EGG) were recorded. Oral pressure during shuttering of the tube was used to give an estimate of subglottic pressure (Psub). The only trend observed was that with the two longer tubes compared to the shortest one, fundamental frequency was lower, open time of the glottis shorter, and Psub higher. The results may partly reflect increased vocal tract impedance as such and partly the increased vocal effort to compensate for it. In other parameters there were individual differences in tube length-related changes, suggesting complexity of the coupling between supraglottic space and the glottis.
A theoretical study of F0-F1 interaction with application to resonant speaking and singing voice.
Titze, Ingo R
2004-09-01
An interactive source-filter system, consisting of a three-mass body-cover model of the vocal folds and a wave reflection model of the vocal tract, was used to test the dependence of vocal fold vibration on the vocal tract. The degree of interaction is governed by the epilarynx tube, which raises the vocal tract impedance to match the impedance of the glottis. The key component of the impedance is inertive reactance. Whenever there is inertive reactance, the vocal tract assists the vocal folds in vibration. The amplitude of vibration and the glottal flow can more than double, and the oral radiated power can increase up to 10 dB. As F0 approaches F1, the first formant frequency, the interactive source-filter system loses its advantage (because inertive reactance changes to compliant reactance) and the noninteractive system produces greater vocal output. Thus, from a voice training and control standpoint, there may be reasons to operate the system in either interactive or noninteractive mode. The harmonics 2F0 and 3F0 can also benefit from being positioned slightly below F1.
Lester, Rosemary A.; Story, Brad H.
2015-01-01
The purpose of this study was to determine if adjustments to the voice source [i.e., fundamental frequency (F0), degree of vocal fold adduction] or vocal tract filter (i.e., vocal tract shape for vowels) reduce the perception of simulated laryngeal vocal tremor and to determine if listener perception could be explained by characteristics of the acoustical modulations. This research was carried out using a computational model of speech production that allowed for precise control and manipulation of the glottal and vocal tract configurations. Forty-two healthy adults participated in a perceptual study involving pair-comparisons of the magnitude of “shakiness” with simulated samples of laryngeal vocal tremor. Results revealed that listeners perceived a higher magnitude of voice modulation when simulated samples had a higher mean F0, greater degree of vocal fold adduction, and vocal tract shape for /i/ vs /ɑ/. However, the effect of F0 was significant only when glottal noise was not present in the acoustic signal. Acoustical analyses were performed with the simulated samples to determine the features that affected listeners' judgments. Based on regression analyses, listeners' judgments were predicted to some extent by modulation information present in both low and high frequency bands. PMID:26328711
Two-dimensional vocal tracts with three-dimensional behavior in the numerical generation of vowels.
Arnela, Marc; Guasch, Oriol
2014-01-01
Two-dimensional (2D) numerical simulations of vocal tract acoustics may provide a good balance between the high quality of three-dimensional (3D) finite element approaches and the low computational cost of one-dimensional (1D) techniques. However, 2D models are usually generated by considering the 2D vocal tract as a midsagittal cut of a 3D version, i.e., using the same radius function, wall impedance, glottal flow, and radiation losses as in 3D, which leads to strong discrepancies in the resulting vocal tract transfer functions. In this work, a four-step methodology is proposed to match the behavior of 2D simulations with that of 3D vocal tracts with circular cross-sections. First, the 2D vocal tract profile is modified to tune the formant locations. Second, the 2D wall impedance is adjusted to fit the formant bandwidths. Third, the 2D glottal flow is scaled to recover 3D pressure levels. Fourth and last, the 2D radiation model is tuned to match the 3D model following an optimization process. The procedure is tested for vowels /a/, /i/, and /u/ and the obtained results are compared with those of a full 3D simulation, a conventional 2D approach, and a 1D chain matrix model.
Vasconcelos, Maria J M; Ventura, Sandra M R; Freitas, Diamantino R S; Tavares, João Manuel R S
2012-03-01
The morphological and dynamic characterisation of the vocal tract during speech production has been gaining attention, motivated by recent improvements in magnetic resonance (MR) imaging, namely the use of higher magnetic fields such as 3.0 Tesla. In this work, the automatic study of the vocal tract from 3.0 Tesla MR images was assessed through the application of statistical deformable models. The primary goal was the analysis of the shape of the vocal tract during the articulation of European Portuguese sounds, followed by the evaluation of the results concerning the automatic segmentation, i.e. identification of the vocal tract in new MR images. As far as speech production is concerned, this is the first attempt to automatically characterise and reconstruct the vocal tract shape from 3.0 Tesla MR images by using deformable models; particularly, by using active and appearance shape models. The results clearly show that these deformable models are adequate and advantageous for the automatic analysis of 3.0 Tesla MR images in order to extract the vocal tract shape and assess the articulatory movements involved. Such results are needed, for example, for a better knowledge of speech production, mainly in patients suffering from articulatory disorders, and to build enhanced speech synthesizer models.
Guzman, Marco; Miranda, Gonzalo; Olavarria, Christian; Madrid, Sofia; Muñoz, Daniel; Leiva, Miguel; Lopez, Lorena; Bortnem, Cori
2017-01-01
The present study aimed to observe the effect of two types of tubes on vocal tract bidimensional and tridimensional images. Ten participants with hyperfunctional dysphonia were included. Computerized tomography was performed during production of sustained [a:], followed by sustained phonation into a drinking straw, and then repetition of sustained [a:]. A similar procedure was performed with a stirring straw after 15 minutes of vocal rest. Anatomic distances and area measures were obtained from computerized tomography midsagittal and transversal images. Vocal tract total volume was also calculated. During tube phonation, increases were measured in the vertical length of the vocal tract, oropharyngeal area, hypopharyngeal area, outlet of the epilaryngeal tube, and inlet to the lower pharynx. Also, the larynx was lower, and more closure was noted between the velum and the nasal passage. Tube phonation causes an increased total vocal tract volume, mostly because of the increased cross-sectional areas in the pharyngeal region. This change is more prominent when the tube offers more airflow resistance (stirring straw) compared with less airflow resistance (drinking straw). Based on our data and previous studies, it seems that vocal tract changes are not dependent on the voice condition (vocally trained, untrained, or disordered voices), but on the exercise itself and the type of instructions given to subjects. Tube phonation is a good option to reach therapeutic goals (eg, wide pharynx and low larynx) without giving biomechanical instructions, but only asking patients to feel easy voice and vibratory sensations. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Mainka, Alexander; Kürbis, Steffen; Birkholz, Peter
2018-01-01
Recently, 3D printing has been increasingly used to create physical models of the vocal tract with geometries obtained from magnetic resonance imaging. These printed models allow measuring the vocal tract transfer function, which cannot be measured reliably in vivo. The transfer functions enable the detailed examination of the acoustic effects of specific articulatory strategies in speaking and singing, and the validation of acoustic plane-wave models for realistic vocal tract geometries in articulatory speech synthesis. To measure the acoustic transfer function of 3D-printed models, two techniques have been described: (1) excitation of the models with a broadband sound source at the glottis and measurement of the sound pressure radiated from the lips, and (2) excitation of the models with an external source in front of the lips and measurement of the sound pressure inside the models at the glottal end. The former method is more frequently used and more intuitive due to its similarity to speech production. However, the latter method avoids the intricate problem of constructing a suitable broadband glottal source and is therefore more effective. It has been shown to yield a transfer function similar, but not exactly equal, to the volume velocity transfer function between the glottis and the lips, which is usually used to characterize vocal tract acoustics. Here, we revisit this method and show, both theoretically and experimentally, how it can be extended to yield the precise volume velocity transfer function of the vocal tract. PMID:29543829
Modulation of voice related to tremor and vibrato
NASA Astrophysics Data System (ADS)
Lester, Rosemary Anne
Modulation of voice is a result of physiologic oscillation within one or more components of the vocal system including the breathing apparatus (i.e., pressure supply), the larynx (i.e. sound source), and the vocal tract (i.e., sound filter). These oscillations may be caused by pathological tremor associated with neurological disorders like essential tremor or by volitional production of vibrato in singers. Because the acoustical characteristics of voice modulation specific to each component of the vocal system and the effect of these characteristics on perception are not well-understood, it is difficult to assess individuals with vocal tremor and to determine the most effective interventions for reducing the perceptual severity of the disorder. The purpose of the present studies was to determine how the acoustical characteristics associated with laryngeal-based vocal tremor affect the perception of the magnitude of voice modulation, and to determine if adjustments could be made to the voice source and vocal tract filter to alter the acoustic output and reduce the perception of modulation. This research was carried out using both a computational model of speech production and trained singers producing vibrato to simulate laryngeal-based vocal tremor with different voice source characteristics (i.e., vocal fold length and degree of vocal fold adduction) and different vocal tract filter characteristics (i.e., vowel shapes). It was expected that, by making adjustments to the voice source and vocal tract filter that reduce the amplitude of the higher harmonics, the perception of magnitude of voice modulation would be reduced. The results of this study revealed that listeners' perception of the magnitude of modulation of voice was affected by the degree of vocal fold adduction and the vocal tract shape with the computational model, but only by the vocal quality (corresponding to the degree of vocal fold adduction) with the female singer. 
Based on regression analyses, listeners' judgments were predicted by modulation information in both low and high frequency bands. The findings from these studies indicate that production of a breathy vocal quality might be a useful compensatory strategy for reducing the perceptual severity of modulation of voice for individuals with tremor affecting the larynx.
Lopes, Leonardo Wanderley; de Oliveira Florencio, Vanessa; Silva, Priscila Oliveira Costa; da Nóbrega E Ugulino, Ana Celiane; Almeida, Anna Alice
2018-01-04
We aimed to correlate the Vocal Tract Discomfort Scale (VTDS) with the Voice Symptom Scale (VoiSS) for evaluation of patients with dysphonia. In addition, we aimed to compare vocal tract discomfort symptoms in patients with and without self-reported voice problems. This is a descriptive, cross-sectional, and retrospective study. We analyzed 143 women and 62 men with voice disorders, as confirmed by endoscopic larynx examination. All patients completed the VTDS and VoiSS at vocal evaluation. Descriptive statistics and the Spearman correlation test were applied to all variables. The degree of covariance of variables was noted. The Mann-Whitney U test was used to compare the average number of discomfort symptoms among patients with and without self-reported voice problems. A weak to moderate positive correlation was observed between the average number, frequency, and intensity of discomfort symptoms and the total score, physical domain score, and limitation domain score of the VoiSS. The vocal tract discomfort symptoms and the emotional domain score of the VoiSS were weakly correlated. Patients with self-reported voice problems had a higher number, frequency, and intensity of vocal tract discomfort symptoms. There is a correlation between the VTDS and VoiSS scales, with more frequent reports of vocal tract discomfort symptoms in patients with self-reported voice problems. Therefore, the discomfort symptoms seem to influence the perception of the impact of a voice problem. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
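The abstract above relies on the Spearman correlation test for questionnaire scores. As a rough illustration only, the following sketch computes Spearman's rho as the Pearson correlation of average ranks, which handles tied scores. The function names and the sample numbers are hypothetical and are not data or code from the study.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up scores for illustration only (not study data).
discomfort_scores = [3, 5, 5, 8, 10]
symptom_scores = [12, 20, 18, 30, 41]
print(spearman_rho(discomfort_scores, symptom_scores))
```

This sketch returns only the coefficient. A significance test, as reported in the study, would need an additional step that is not shown here.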
Riede, Tobias; Goller, Franz
2010-10-01
Song production in songbirds is a model system for studying learned vocal behavior. As in humans, bird phonation involves three main motor systems (respiration, vocal organ and vocal tract). The avian respiratory mechanism uses pressure regulation in air sacs to ventilate a rigid lung. In songbirds sound is generated with two independently controlled sound sources, which reside in a uniquely avian vocal organ, the syrinx. However, the physical sound generation mechanism in the syrinx shows strong analogies to that in the human larynx, such that both can be characterized as myoelastic-aerodynamic sound sources. Similarities include active adduction and abduction, oscillating tissue masses which modulate flow rate through the organ and a layered structure of the oscillating tissue masses giving rise to complex viscoelastic properties. Differences in the functional morphology of the sound producing system between birds and humans require specific motor control patterns. The songbird vocal apparatus is adapted for high speed, suggesting that temporal patterns and fast modulation of sound features are important in acoustic communication. Rapid respiratory patterns determine the coarse temporal structure of song and maintain gas exchange even during very long songs. The respiratory system also contributes to the fine control of airflow. Muscular control of the vocal organ regulates airflow and acoustic features. The upper vocal tract of birds filters the sounds generated in the syrinx, and filter properties are actively adjusted. Nonlinear source-filter interactions may also play a role. The unique morphology and biomechanical system for sound production in birds presents an interesting model for exploring parallels in control mechanisms that give rise to highly convergent physical patterns of sound generation. More comparative work should provide a rich source for our understanding of the evolution of complex sound producing systems. Copyright © 2009 Elsevier Inc. 
Effect of the losses in the vocal tract on determination of the area function.
Gülmezoğlu, M Bilginer; Barkana, Atalay
2003-01-01
In this work, the cross-sectional areas of the vocal tract are determined for the lossy and lossless cases by using the pole-zero models obtained from the electrical equivalent circuit model of the vocal tract and the system identification method. The cross-sectional areas are used to compare the lossy and lossless cases. In the lossy case, the internal losses due to wall vibration, heat conduction, air friction and viscosity are considered, that is, the complex poles and zeros obtained from the models are used directly. Whereas, in the lossless case, only the imaginary parts of these poles and zeros are used. The vocal tract shapes obtained for the lossy case are close to the actual ones.
Combined Functional Voice Therapy in Singers With Muscle Tension Dysphonia in Singing.
Sielska-Badurek, Ewelina; Osuch-Wójcikiewicz, Ewa; Sobol, Maria; Kazanecka, Ewa; Rzepakowska, Anna; Niemczyk, Kazimierz
2017-07-01
The purpose of this study was to evaluate vocal tract function and the voice quality in singers with muscle tension dysphonia (MTD) after undergoing combined functional voice therapy of the singing voice. This is a prospective, randomized study. Forty singers (29 females and 11 males, mean age: 24.6 ± 8.8 years) with MTD were enrolled in the study. The study group consisted of 20 singers who underwent combined functional voice therapy (10-15 individual sessions, 30-40 minutes each). Singers who did not opt for vocal rehabilitation constituted the control group. Effects of rehabilitation were assessed with videolaryngostroboscopy, palpation of the vocal tract structures, flexible fiberoptic evaluation of the pharynx and the larynx, perceptual speaking and singing voice assessment, acoustic analysis, maximal phonation time, and the Voice Handicap Index. After combined functional voice therapy in the study group, marked improvement was observed in palpation of the vocal tract structures (P < 0.001), perceptual voice assessment (P < 0.001), phonetograms (P = 0.002), and singing range obtained from acoustic analysis of glissando (P < 0.001). In the control group, no statistically significant differences were found between the first and the second assessments. Combined functional voice therapy proved to be an efficacious treatment method in singers with MTD in singing. Development of palpation and perceptual singing voice examination protocols enables one to compare results before and after rehabilitation in clinics. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Neuronal Control of Mammalian Vocalization, with Special Reference to the Squirrel Monkey
NASA Astrophysics Data System (ADS)
Jürgens, Uwe
Squirrel monkey vocalization can be considered a suitable model for the study in humans of the neurobiological basis of nonverbal emotional vocal utterances, such as laughing, crying, and groaning. Evaluation of electrical and chemical brain stimulation data, lesioning studies, single-neurone recordings, and neuroanatomical tracing work leads to the following conclusions: The periaqueductal gray and laterally bordering tegmentum of the midbrain represent a crucial area for the production of vocalization. This area collects the various vocalization-triggering stimuli, such as auditory, visual, and somatosensory input from diverse sensory-processing structures, motivation-controlling input from some limbic structures, and volitional impulses from the anterior cingulate cortex. Destruction of this area causes mutism. It is still under dispute whether the periaqueductal region harbors the vocal pattern generator or merely couples vocalization-triggering information to motor-coordinating structures further downward in the brainstem. The periaqueductal region is connected with the phonatory motoneuron pools indirectly via one or several interneurons. The nucleus retroambiguus represents a crucial relay station for the laryngeal and expiratory component of vocalization. The articulatory component reaches the orofacial motoneuron pools via the parvocellular reticular formation. Essential proprioceptive feedback from the larynx and lungs enters the vocal-controlling network via the solitary tract nucleus.
Samlan, Robin A; Story, Brad H
2011-10-01
To relate vocal fold structure and kinematics to 2 acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). The authors used a computational, kinematic model of the medial surfaces of the vocal folds to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow, and acoustic waveforms and were related to the underlying vocal fold kinematics. CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1.0 mm to 1.5 mm and decreased with separation > 1.5 mm. CPP is generally a function of vocal process separation. H1*-H2* (see paragraph 6 of article text for an explanation of the asterisks) will increase or decrease with vocal process separation on the basis of vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures.
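To make the H1-H2 measure above concrete, here is a minimal sketch that estimates the amplitudes of the first two harmonics with a single-bin discrete Fourier sum and reports their difference in decibels. It assumes the fundamental frequency is already known and that the analysed segment spans a whole number of periods. It does not apply the formant correction implied by the starred H1*-H2* notation, and it is not the authors' analysis code. All function names are hypothetical.

```python
import math

def harmonic_amplitude(signal, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(step * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(step * i) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def h1_minus_h2_db(signal, sample_rate, f0_hz):
    """Uncorrected H1-H2: level of the first harmonic relative to the second."""
    h1 = harmonic_amplitude(signal, sample_rate, f0_hz)
    h2 = harmonic_amplitude(signal, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)

def synth_two_harmonics(sample_rate, f0_hz, a1, a2, n_samples):
    """Hypothetical test signal: two harmonics with known amplitudes."""
    return [
        a1 * math.sin(2.0 * math.pi * f0_hz * i / sample_rate)
        + a2 * math.sin(2.0 * math.pi * 2.0 * f0_hz * i / sample_rate)
        for i in range(n_samples)
    ]

# 800 samples at 8 kHz hold exactly 10 periods of a 100 Hz fundamental.
demo = synth_two_harmonics(8000, 100.0, 1.0, 0.5, 800)
print(h1_minus_h2_db(demo, 8000, 100.0))
```

With a second harmonic at half the amplitude of the first, the result is 20·log10(2), about 6 dB. Real speech would also need f0 tracking and windowing, which are omitted here.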
Rezaei, Fariba; Omrani, Mohammad Reza; Abnavi, Fateme; Mojiri, Fariba; Golabbakhsh, Marzieh; Barati, Sohrab; Mahaki, Behzad
2015-01-01
Acoustic analysis of sounds produced during speech provides significant information about the physiology of larynx and vocal tract. The analysis of voice power spectrum is a fundamental sensitive method of acoustic assessment that provides valuable information about the voice source and characteristics of vocal tract resonance cavities. The changes in long-term average spectrum (LTAS) spectral tilt and harmonics-to-noise ratio (HNR) were analyzed to assess the voice quality before and after functional rhinoplasty in patients with internal nasal valve collapse. Before and 3 months after functional rhinoplasty, 12 participants were evaluated and HNR and LTAS spectral tilt in /a/ and /i/ vowels were estimated. HNR increased and LTAS spectral tilt decreased after surgery. Mean LTAS spectral tilt in vowel /a/ decreased from 2.37 ± 1.04 to 2.28 ± 1.17 (P = 0.388), and it decreased from 4.16 ± 1.65 to 2.73 ± 0.69 in vowel /i/ (P = 0.008). Mean HNR in the vowel /a/ increased from 20.71 ± 3.93 to 25.06 ± 2.67 (P = 0.002), and it increased from 21.28 ± 4.11 to 25.26 ± 3.94 in vowel /i/ (P = 0.002). Modification of the vocal tract caused the vocal cords to close sufficiently, and this showed that although rhinoplasty did not affect the larynx directly, it changes the structure of the vocal tract and consequently the resonance of voice production. The aim of this study was to investigate the changes in voice parameters after functional rhinoplasty in patients with internal nasal valve collapse by computerized analysis of acoustic characteristics. PMID:26955564
NASA Astrophysics Data System (ADS)
Irino, Toshio; Patterson, Roy
2005-04-01
We hear vowels produced by men, women, and children as approximately the same although there is considerable variability in glottal pulse rate and vocal tract length. At the same time, we can identify the speaker group. Recent experiments show that it is possible to identify vowels even when the glottal pulse rate and vocal tract length are condensed or expanded beyond the range of natural vocalization. This suggests that the auditory system has an automatic process to segregate information about shape and size of the vocal tract. Recently we proposed that the auditory system uses some form of Stabilized, Wavelet-Mellin Transform (SWMT) to analyze scale information in bio-acoustic sounds as a general framework for auditory processing from cochlea to cortex. This talk explains the theoretical background of the model and how the vocal information is normalized in the representation. [Work supported by GASR(B)(2) No. 15300061, JSPS.]
Traser, Louisa; Burdumy, Michael; Richter, Bernhard; Vicari, Marco; Echternach, Matthias
2014-01-01
Magnetic Resonance Imaging (MRI) of subjects in a supine position can be used to evaluate the configuration of the vocal tract during phonation. However, studies of speech phonation have shown that gravity can affect vocal tract shape and bias measurements. This is one of the reasons that MRI studies of singing phonation have used professionally trained singers as subjects, because they are generally considered to be less affected by the supine body position and environmental distractions. A study of untrained singers might not only contribute to the understanding of intuitive singing function and aid the evaluation of potential hazards for vocal health, but also provide insights into the effect of the supine position on singers in general. In the present study, an open configuration 0.25 T MRI system with a rotatable examination bed was used to study the effect of body position in 20 vocally untrained subjects. The subjects were asked to sing sustained tones in both supine and upright body positions on different pitches and in different register conditions. Morphometric measurements were taken from the acquired images of a sagittal slice depicting the vocal tract. The analysis concerning the vocal tract configuration in the two body positions revealed differences in 5 out of 10 measured articulatory parameters. In the upright position the jaw was less protruded, the uvula was elongated, the larynx more tilted and the tongue was positioned more to the front of the mouth than in the supine position. The findings presented are in agreement with several studies on gravitational effects in speech phonation, but contrast with the results of a previous study on professional singers of our group where only minor differences between upright and supine body posture were observed. The present study demonstrates that imaging of the vocal tract using weight-bearing MR imaging is a feasible tool for the study of sustained phonation in singing for vocally untrained subjects. 
PMID:25379885
Halwani, Gus F; Loui, Psyche; Rüber, Theodor; Schlaug, Gottfried
2011-01-01
Structure and function of the human brain are affected by training in both linguistic and musical domains. Individuals with intensive vocal musical training provide a useful model for investigating neural adaptations of learning in the vocal-motor domain and can be compared with learning in a more general musical domain. Here we confirm general differences in macrostructure (tract volume) and microstructure (fractional anisotropy, FA) of the arcuate fasciculus (AF), a prominent white-matter tract connecting temporal and frontal brain regions, between singers, instrumentalists, and non-musicians. Both groups of musicians differed from non-musicians in having larger tract volume and higher FA values of the right and left AF. The AF was then subdivided in a dorsal (superior) branch connecting the superior temporal gyrus and the inferior frontal gyrus (STG ↔ IFG), and ventral (inferior) branch connecting the middle temporal gyrus and the inferior frontal gyrus (MTG ↔ IFG). Relative to instrumental musicians, singers had a larger tract volume but lower FA values in the left dorsal AF (STG ↔ IFG), and a similar trend in the left ventral AF (MTG ↔ IFG). This between-group comparison controls for the general effects of musical training, although FA was still higher in singers compared to non-musicians. Both musician groups had higher tract volumes in the right dorsal and ventral tracts compared to non-musicians, but did not show a significant difference between each other. Furthermore, in the singers' group, FA in the left dorsal branch of the AF was inversely correlated with the number of years of participants' vocal training. Our findings suggest that long-term vocal-motor training might lead to an increase in volume and microstructural complexity of specific white-matter tracts connecting regions that are fundamental to sound perception, production, and its feedforward and feedback control which can be differentiated from a more general musician effect.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bouchard, Kristofer E.; Conant, David F.; Anumanchipalli, Gopala K.
A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial, especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics.
Anumanchipalli, Gopala K.; Dichter, Benjamin; Chaisanguanthum, Kris S.; Johnson, Keith; Chang, Edward F.
2016-01-01
A complete neurobiological understanding of speech motor control requires determination of the relationship between simultaneously recorded neural activity and the kinematics of the lips, jaw, tongue, and larynx. Many speech articulators are internal to the vocal tract, and therefore simultaneously tracking the kinematics of all articulators is nontrivial—especially in the context of human electrophysiology recordings. Here, we describe a noninvasive, multi-modal imaging system to monitor vocal tract kinematics, demonstrate this system in six speakers during production of nine American English vowels, and provide new analysis of such data. Classification and regression analysis revealed considerable variability in the articulator-to-acoustic relationship across speakers. Non-negative matrix factorization extracted basis sets capturing vocal tract shapes allowing for higher vowel classification accuracy than traditional methods. Statistical speech synthesis generated speech from vocal tract measurements, and we demonstrate perceptual identification. We demonstrate the capacity to predict lip kinematics from ventral sensorimotor cortical activity. These results demonstrate a multi-modal system to non-invasively monitor articulator kinematics during speech production, describe novel analytic methods for relating kinematic data to speech acoustics, and provide the first decoding of speech kinematics from electrocorticography. These advances will be critical for understanding the cortical basis of speech production and the creation of vocal prosthetics. PMID:27019106
Phonation Threshold Pressure Measurement with a Semi-Occluded Vocal Tract
ERIC Educational Resources Information Center
Titze, Ingo R.
2009-01-01
Purpose: The purpose of this article was to determine if a semi-occluded vocal tract could be used to measure phonation threshold pressure. This is in contrast to the shutter technique, where an alternation between a fully occluded tract and an unoccluded tract is used. Method: Five male and 5 female volunteers phonated through a thin straw held…
Lower Vocal Tract Morphologic Adjustments Are Relevant for Voice Timbre in Singing
Mainka, Alexander; Poznyakovskiy, Anton; Platzek, Ivan; Fleischer, Mario; Sundberg, Johan; Mürbe, Dirk
2015-01-01
The vocal tract shape is crucial to voice production. Its lower part seems particularly relevant for voice timbre. This study analyzes the detailed morphology of parts of the epilaryngeal tube and the hypopharynx for the sustained German vowels /a/, /e/, /i/, /o/, and /u/ by thirteen male singer subjects who were at the beginning of their academic singing studies. Analysis was based on two different phonatory conditions: a natural, speech-like phonation and a singing phonation, as in classical singing. 3D models of the vocal tract were derived from magnetic resonance imaging and compared with long-term average spectrum analysis of audio recordings from the same subjects. Comparison of singing to the speech-like phonation, which served as reference, showed significant adjustments of the lower vocal tract: an average lowering of the larynx by 8 mm and an increase of the hypopharyngeal cross-sectional area (+21.9%) and volume (+16.8%). Changes in the analyzed epilaryngeal portion of the vocal tract were not significant. Consequently, lower larynx-to-hypopharynx area and volume ratios were found in singing compared to the speech-like phonation. All evaluated measures of the lower vocal tract varied significantly with vowel quality. Acoustically, an increase of high frequency energy in singing correlated with a wider hypopharyngeal area. The findings offer an explanation of how classical male singers might succeed in producing a voice timbre with increased high frequency energy, creating a singer's formant cluster. PMID:26186691
NASA Astrophysics Data System (ADS)
Zhang, Lucy; Yu, Feimi; Krane, Michael
2017-11-01
A control volume analysis of power flow during sustained phonation is performed using results of a fully-coupled aeroelastic-aeroacoustic simulation. Two control volumes are used: the laryngeal region alone, and the larynx together with the vocal tract. Two cases are considered: an effectively infinite length vocal tract, where sound produced in the larynx radiates away and is not reflected back, and a constant-area vocal tract of normal adult human dimensions, in which phonatory sound resonates before radiating from the mouth opening. In both cases the lungs are modeled to absorb all incident sound, while providing a constant volume flow toward the larynx. Control of the acoustic boundary conditions is accomplished using perfectly matched layers, and flow from the lungs is provided by a source distribution near the entrance to the trachea region. For both cases the power flow for the larynx and larynx plus vocal tract control volumes is computed using the integral form of the mechanical energy equation, expanded to consider power exchanges between slightly compressible flow in the larynx and the acoustic fields in the vocal tract and trachea. The funding from NIH 2R01DC005642-10A1 is gratefully acknowledged.
Effect of the signal measured from the glottis on determination of the vocal tract shape.
Gülmezoğlu, M B; Barkana, A
1998-01-01
All-pole and pole-zero models for the vocal tract are developed. First an impulse train, then the pressure signal measured from the glottis, is used as the input in the models. The models for eight Turkish vowels produced by one male subject are studied to determine the effects of the presumed impulse train and the pressure signal measured from the glottis on the estimation of the vocal tract shape. The motion of the tongue is also examined for a whole word.
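As a rough illustration of the all-pole case described above, the following minimal sketch drives an all-pole filter with an impulse train. It is not the authors' method: the formant frequencies, bandwidths, sampling rate, and all function names are hypothetical values chosen only for illustration, and the measured glottal pressure signal used in the study is not modelled here.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, fs):
    """Denominator [1, a1, a2] of a two-pole resonator (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius, < 1 so the filter is stable
    theta = 2.0 * math.pi * freq_hz / fs         # pole angle
    return [1.0, -2.0 * r * math.cos(theta), r * r]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def all_pole_filter(a, x):
    """Apply y[n] = x[n] - sum_{k>=1} a[k] * y[n-k], assuming a[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def impulse_train(n_samples, period):
    """Unit impulses every `period` samples, standing in for glottal pulses."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

fs = 8000  # assumed sampling rate in Hz
# Hypothetical formant (frequency, bandwidth) pairs in Hz, not taken from the study.
formants = [(700.0, 80.0), (1100.0, 90.0), (2500.0, 120.0)]

a = [1.0]
for f, bw in formants:
    a = poly_mul(a, resonator_coeffs(f, bw, fs))  # cascade of resonators

x = impulse_train(400, period=80)  # 100 Hz pulse rate at fs = 8000 Hz
y = all_pole_filter(a, x)
print(f"filter order: {len(a) - 1}, output samples: {len(y)}")
```

Replacing `x` with a measured source signal would correspond to the second input condition the abstract mentions; a pole-zero model would additionally need a numerator polynomial.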
Vampola, Tomáš; Horáček, Jaromír; Laukkanen, Anne-Maria; Švec, Jan G
2015-04-01
Resonance frequencies of the vocal tract have traditionally been modelled using one-dimensional models. These cannot accurately represent the events in the frequency region of the formant cluster around 2.5-4.5 kHz, however. Here, the vocal tract resonance frequencies and their mode shapes are studied using a three-dimensional finite element model obtained from computed tomography measurements of a subject phonating on vowel [a:]. Instead of the traditional five, up to eight resonance frequencies of the vocal tract were found below the prominent antiresonance around 4.7 kHz. The three extra resonances were found to correspond to modes which were axially asymmetric and involved the piriform sinuses, valleculae, and transverse vibrations in the oral cavity. The results therefore suggest that the phenomenon of speaker's and singer's formant clustering may be more complex than originally thought.
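For context, the simplest one-dimensional baseline is a uniform tube closed at the glottis and open at the lips, whose resonances are the odd quarter-wavelength frequencies (2n - 1)c / (4L). The sketch below evaluates that formula; the tube length of 0.175 m and the speed of sound of 350 m/s are assumed textbook-style values, not measurements from this study, and the function name is hypothetical.

```python
def uniform_tube_resonances(length_m, n_resonances=5, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses F_n = (2n - 1) * c / (4 * L) for n = 1, 2, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_resonances + 1)]

# Assumed values: 17.5 cm tract length, c = 350 m/s.
for i, f in enumerate(uniform_tube_resonances(0.175), start=1):
    print(f"R{i}: {f:.0f} Hz")
```

Under these assumed values the idealized tube gives five resonances below 4.7 kHz. A one-dimensional model of this kind cannot represent transverse or side-cavity modes, which is the limitation the three-dimensional model addresses.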
The larynx of roaring and non-roaring cats.
Hast, M H
1989-04-01
Dissections were made of the larynges of 14 species of the cat family, with representative specimens from all genera. It was found that the vocal folds of the larynx of genus Panthera (with the exception of the snow leopard) form the basic structure of a sound generator well-designed to produce a high acoustical energy. Combined with an efficient sound radiator (vocal tract) that can be adjusted in length, a Panthera can use its vocal instrument literally to blow its own horn with a 'roar'. Also, it is proposed that laryngeal morphology can be used as an anatomical character in mammalian taxonomy.
Influence of the ventricular folds on a voice source with specified vocal fold motion
McGowan, Richard S.; Howe, Michael S.
2010-01-01
The unsteady drag on the vocal folds is the major source of sound during voiced speech. The drag force is caused by vortex shedding from the vocal folds. The influence of the ventricular folds (i.e., the “false” vocal folds that protrude into the vocal tract a short distance downstream of the glottis) on the drag and the voice source are examined in this paper by means of a theoretical model involving vortex sheets in a two-dimensional geometry. The effect of the ventricular folds on the output acoustic pressure is found to be small when the movement of the vocal folds is prescribed. It is argued that the effect remains small when fluid-structure interactions account for vocal fold movement. These conclusions can be justified mathematically when the characteristic time scale for change in the velocity of the glottal jet is large compared to the time it takes for a vortex disturbance to be convected through the vocal fold and ventricular fold region. PMID:20329852
Carey, Daniel; Miquel, Marc E.; Evans, Bronwen G.; Adank, Patti; McGettigan, Carolyn
2017-01-01
Imitating speech necessitates the transformation from sensory targets to vocal tract motor output, yet little is known about the representational basis of this process in the human brain. Here, we address this question by using real-time MR imaging (rtMRI) of the vocal tract and functional MRI (fMRI) of the brain in a speech imitation paradigm. Participants trained on imitating a native vowel and a similar nonnative vowel that required lip rounding. Later, participants imitated these vowels and an untrained vowel pair during separate fMRI and rtMRI runs. Univariate fMRI analyses revealed that regions including left inferior frontal gyrus were more active during sensorimotor transformation (ST) and production of nonnative vowels, compared with native vowels; further, ST for nonnative vowels activated somatomotor cortex bilaterally, compared with ST of native vowels. Using test representational similarity analysis (RSA) models constructed from participants’ vocal tract images and from stimulus formant distances, we found that RSA searchlight analyses of fMRI data showed either type of model could be represented in somatomotor, temporal, cerebellar, and hippocampal neural activation patterns during ST. We thus provide the first evidence of widespread and robust cortical and subcortical neural representation of vocal tract and/or formant parameters, during prearticulatory ST. PMID:28334401
Carey, Daniel; Miquel, Marc E; Evans, Bronwen G; Adank, Patti; McGettigan, Carolyn
2017-05-01
Imitating speech necessitates the transformation from sensory targets to vocal tract motor output, yet little is known about the representational basis of this process in the human brain. Here, we address this question by using real-time MR imaging (rtMRI) of the vocal tract and functional MRI (fMRI) of the brain in a speech imitation paradigm. Participants trained on imitating a native vowel and a similar nonnative vowel that required lip rounding. Later, participants imitated these vowels and an untrained vowel pair during separate fMRI and rtMRI runs. Univariate fMRI analyses revealed that regions including left inferior frontal gyrus were more active during sensorimotor transformation (ST) and production of nonnative vowels, compared with native vowels; further, ST for nonnative vowels activated somatomotor cortex bilaterally, compared with ST of native vowels. Using test representational similarity analysis (RSA) models constructed from participants' vocal tract images and from stimulus formant distances, we found that RSA searchlight analyses of fMRI data showed either type of model could be represented in somatomotor, temporal, cerebellar, and hippocampal neural activation patterns during ST. We thus provide the first evidence of widespread and robust cortical and subcortical neural representation of vocal tract and/or formant parameters, during prearticulatory ST. © The Author 2017. Published by Oxford University Press.
Acoustic signatures of sound source-tract coupling.
Arneodo, Ezequiel M; Perl, Yonatan Sanz; Mindlin, Gabriel B
2011-04-01
Birdsong is a complex behavior, which results from the interaction between a nervous system and a biomechanical peripheral device. While much has been learned about how complex sounds are generated in the vocal organ, little has been learned about the signature on the vocalizations of the nonlinear effects introduced by the acoustic interactions between a sound source and the vocal tract. The variety of morphologies among bird species makes birdsong a most suitable model to study phenomena associated to the production of complex vocalizations. Inspired by the sound production mechanisms of songbirds, in this work we study a mathematical model of a vocal organ, in which a simple sound source interacts with a tract, leading to a delay differential equation. We explore the system numerically, and by taking it to the weakly nonlinear limit, we are able to examine its periodic solutions analytically. By these means we are able to explore the dynamics of oscillatory solutions of a sound source-tract coupled system, which are qualitatively different from those of a sound source-filter model of a vocal organ. Nonlinear features of the solutions are proposed as the underlying mechanisms of observed phenomena in birdsong, such as unilaterally produced "frequency jumps," enhancement of resonances, and the shift of the fundamental frequency observed in heliox experiments. ©2011 American Physical Society
Acoustic signatures of sound source-tract coupling
Arneodo, Ezequiel M.; Perl, Yonatan Sanz; Mindlin, Gabriel B.
2014-01-01
Birdsong is a complex behavior, which results from the interaction between a nervous system and a biomechanical peripheral device. While much has been learned about how complex sounds are generated in the vocal organ, little has been learned about the signature on the vocalizations of the nonlinear effects introduced by the acoustic interactions between a sound source and the vocal tract. The variety of morphologies among bird species makes birdsong a most suitable model to study phenomena associated to the production of complex vocalizations. Inspired by the sound production mechanisms of songbirds, in this work we study a mathematical model of a vocal organ, in which a simple sound source interacts with a tract, leading to a delay differential equation. We explore the system numerically, and by taking it to the weakly nonlinear limit, we are able to examine its periodic solutions analytically. By these means we are able to explore the dynamics of oscillatory solutions of a sound source-tract coupled system, which are qualitatively different from those of a sound source-filter model of a vocal organ. Nonlinear features of the solutions are proposed as the underlying mechanisms of observed phenomena in birdsong, such as unilaterally produced “frequency jumps,” enhancement of resonances, and the shift of the fundamental frequency observed in heliox experiments. PMID:21599213
Laukkanen, Anne-Maria; Titze, Ingo R.; Hoffman, Henry; Finnegan, Eileen
2015-01-01
Voice training exploits semiocclusives, which increase vocal tract interaction with the source. Modeling results suggest that vocal economy (maximum flow declination rate divided by maximum area declination rate, MADR) is improved by matching the glottal and vocal tract impedances. Changes in MADR may be correlated with thyroarytenoid (TA) muscle activity. Here the effects of impedance matching are studied for laryngeal muscle activity and glottal resistance. One female repeated [pa:p:a] before and immediately after (a) phonation into different-sized tubes and (b) voiced bilabial fricative [β:]. To allow estimation of subglottic pressure from the oral pressure, [p] was inserted also in the repetitions of the semiocclusions. Airflow was registered using a flow mask. EMG was registered from TA, cricothyroid (CT) and lateral cricoarytenoid (LCA) muscles. Phonation was simulated using a 7 × 5 × 5 point-mass model of the vocal folds, allowing inputs of simulated laryngeal muscle activation. The variables were TA, CT and LCA activities. Increased vocal tract impedance caused the subject to raise TA activity compared to CT and LCA activities. Computer simulation showed that higher glottal economy and efficiency (oral radiated power divided by aerodynamic power) were obtained with a higher TA/CT ratio when LCA activity was tuned for ideal adduction. PMID:19011306
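The two ratios defined in words in the abstract above can be written compactly. The symbols below are notation introduced here for illustration only; the abstract itself gives just the verbal definitions.

```latex
% Vocal economy: maximum flow declination rate over maximum area declination rate
\[
  \text{economy} \;=\; \frac{\mathrm{MFDR}}{\mathrm{MADR}}
\]
% Vocal efficiency: oral radiated power over aerodynamic power
\[
  \text{efficiency} \;=\; \frac{P_{\text{rad}}}{P_{\text{aero}}}
\]
```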
Yu, Chengzhu; Hansen, John H L
2017-03-01
Human physiology has evolved to accommodate environmental conditions, including temperature, pressure, and air chemistry unique to Earth. However, the environment in space varies significantly compared to that on Earth and, therefore, variability is expected in astronauts' speech production mechanism. In this study, the variations of astronaut voice characteristics during the NASA Apollo 11 mission are analyzed. Specifically, acoustical features such as fundamental frequency and phoneme formant structure that are closely related to the speech production system are studied. For a further understanding of astronauts' vocal tract spectrum variation in space, a maximum likelihood frequency warping based analysis is proposed to detect the vocal tract spectrum displacement during space conditions. The results from fundamental frequency, formant structure, as well as vocal spectrum displacement indicate that astronauts change their speech production mechanism when in space. Moreover, the experimental results for astronaut voice identification tasks indicate that current speaker recognition solutions are highly vulnerable to astronaut voice production variations in space conditions. Future recommendations from this study suggest that successful applications of speaker recognition during extended space missions require robust speaker modeling techniques that could effectively adapt to voice production variation caused by diverse space conditions.
An investigation of articulatory setting using real-time magnetic resonance imaging
Ramanarayanan, Vikram; Goldstein, Louis; Byrd, Dani; Narayanan, Shrikanth S.
2013-01-01
This paper presents an automatic procedure to analyze articulatory setting in speech production using real-time magnetic resonance imaging of the moving human vocal tract. The procedure extracts frames corresponding to inter-speech pauses, speech-ready intervals and absolute rest intervals from magnetic resonance imaging sequences of read and spontaneous speech elicited from five healthy speakers of American English and uses automatically extracted image features to quantify vocal tract posture during these intervals. Statistical analyses show significant differences between vocal tract postures adopted during inter-speech pauses and those at absolute rest before speech; the latter also exhibits a greater variability in the adopted postures. In addition, the articulatory settings adopted during inter-speech pauses in read and spontaneous speech are distinct. The results suggest that adopted vocal tract postures differ on average during rest positions, ready positions and inter-speech pauses, and might, in that order, involve an increasing degree of active control by the cognitive speech planning mechanism. PMID:23862826
The larynx of roaring and non-roaring cats.
Hast, M H
1989-01-01
Dissections were made of the larynges of 14 species of the cat family, with representative specimens from all genera. It was found that the vocal folds of the larynx of genus Panthera (with the exception of the snow leopard) form the basic structure of a sound generator well-designed to produce a high acoustical energy. Combined with an efficient sound radiator (vocal tract) that can be adjusted in length, a Panthera can use its vocal instrument literally to blow its own horn with a 'roar'. Also, it is proposed that laryngeal morphology can be used as an anatomical character in mammalian taxonomy. PMID:2606766
Improved vocal tract reconstruction and modeling using an image super-resolution technique.
Zhou, Xinhui; Woo, Jonghye; Stone, Maureen; Prince, Jerry L; Espy-Wilson, Carol Y
2013-06-01
Magnetic resonance imaging has been widely used in speech production research. Often only one image stack (sagittal, axial, or coronal) is used for vocal tract modeling. As a result, complementary information from other available stacks is not utilized. To overcome this, a recently developed super-resolution technique was applied to integrate three orthogonal low-resolution stacks into one isotropic volume. The results on vowels show that the super-resolution volume produces better vocal tract visualization than any of the low-resolution stacks. Its derived area functions generally produce formant predictions closer to the ground truth, particularly for those formants sensitive to area perturbations at constrictions.
A Vowel-Based Method for Vocal Tract Control in Clarinet Pedagogy
ERIC Educational Resources Information Center
González, Darleny; Payri, Blas
2017-01-01
Our review of scientific literature shows that the activity inside the clarinetist's vocal tract (VT) affects pitch and timbre, while also facilitating technical exercises. Clarinetists adapt their VT intuitively and, in some cases, may compensate an inadequate VT configuration through unnecessary pressure, resulting in technical blockage,…
What can vortices tell us about vocal fold vibration and voice production.
Khosla, Sid; Murugappan, Shanmugam; Gutmark, Ephraim
2008-06-01
Much clinical research on laryngeal airflow has assumed that airflow is unidirectional. This review will summarize what additional knowledge can be obtained about vocal fold vibration and voice production by studying rotational motion, or vortices, in laryngeal airflow. Recent work suggests two types of vortices that may strongly contribute to voice quality. The first kind forms just above the vocal folds during glottal closing, and is formed by flow separation in the glottis; these flow separation vortices significantly contribute to rapid closing of the glottis, and hence, to producing loudness and high frequency harmonics in the acoustic spectrum. The second is a group of highly three-dimensional and coherent supraglottal vortices, which can produce sound by interaction with structures in the vocal tract. Present work is also described that suggests that certain laryngeal pathologies, such as asymmetric vocal fold tension, will significantly modify both types of vortices, with adverse impact on sound production: decreased rate of glottal closure, increased broadband noise, and a decreased signal to noise ratio. Recent research supports the hypothesis that glottal airflow contains certain vortical structures that significantly contribute to voice quality.
Phonation Threshold Pressure Measurement With a Semi-Occluded Vocal Tract
Titze, Ingo R.
2015-01-01
Purpose: The purpose of this article was to determine if a semi-occluded vocal tract could be used to measure phonation threshold pressure. This is in contrast to the shutter technique, where an alternation between a fully occluded tract and an unoccluded tract is used. Method: Five male and 5 female volunteers phonated through a thin straw held between the lips. Oral pressure behind the lips was measured. Mathematical predictions of phonation threshold pressures were compared to the measured ones over a range of frequencies. Results: It was shown that, for a 2.5-mm diameter straw, phonation threshold pressures were obtainable over a 2-octave range of fundamental frequency by all volunteers. In magnitude, the pressures agreed with the 0.2–0.5 kPa values obtained in previous investigations. Sensitivity to viscoelastic and geometric properties of the vocal folds was generally not compromised with greater oral impedance, but some differences were predicted theoretically in contrast to an open mouth configuration. Conclusion: Because phonation threshold pressure is always dependent on vocal tract interaction, it may be advantageous to choose an exact and fixed oral semi-occlusion for the measurement and interpret the results in light of the known acoustic load. PMID:19641082
Contribution of the supraglottic larynx to the vocal product: imaging and acoustic analysis
NASA Astrophysics Data System (ADS)
Gracco, L. Carol
1996-04-01
Horizontal supraglottic laryngectomy is a surgical procedure to remove a mass lesion located in the region of the pharynx superior to the true vocal folds. In contrast to full or partial laryngectomy, patients who undergo horizontal supraglottic laryngectomy often present with little or no involvement of the true vocal folds. This population provides an opportunity to examine the acoustic consequences of altering the pharynx while sparing the laryngeal sound source. Acoustic and magnetic resonance imaging (MRI) data were acquired in a group of four patients before and after supraglottic laryngectomy. Acoustic measures included the identification of vocal tract resonances and the fundamental frequency of the vocal fold vibration. 3D reconstructions of the pharyngeal portion of each subject's vocal tract were made from MRIs taken during phonation, and volume measures were obtained. These measures reveal a variable, but often dramatic, difference in the surgically altered area of the pharynx and changes in the formant frequencies of the vowel /i/ postsurgically. In some cases the presence of the tumor created a deviation from the expected formant values preoperatively, with postoperative values approaching normal. Patients who also underwent radiation treatment postsurgically tended to have greater constriction in the pharyngeal area of the vocal tract.
Vocal tract and glottal function during and after vocal exercising with resonance tube and straw.
Guzman, Marco; Laukkanen, Anne-Maria; Krupa, Petr; Horáček, Jaromir; Švec, Jan G; Geneid, Ahmed
2013-07-01
The present study aimed to investigate the vocal tract and glottal function during and after phonation into a tube and a stirring straw. A male classically trained singer was assessed. Computerized tomography (CT) was performed when the subject produced [a:] at comfortable speaking pitch, phonated into the resonance tube, and repeated [a:] after the exercise. A similar procedure was performed with a narrow straw after 15 minutes of silence. Anatomic distances and area measures were obtained from CT midsagittal and transversal images. Acoustic, perceptual, electroglottographic (EGG), and subglottic pressure measures were also obtained. During and after phonation into the tube or straw, the velum closed the nasal passage better, the larynx position lowered, and the hypopharynx area widened. Moreover, the ratio between the inlet of the lower pharynx and the outlet of the epilaryngeal tube became larger during and after tube/straw phonation. Acoustic results revealed a stronger spectral prominence in the singer's/speaker's formant cluster region after exercising. A listening test demonstrated better voice quality after straw/tube phonation than before. Contact quotient derived from EGG decreased during both tube and straw phonation and remained lower after exercising. Subglottic pressure increased during straw phonation and remained somewhat higher after it. CT and acoustic results indicated that vocal exercises with increased vocal tract impedance lead to increased vocal efficiency and economy. One of the major changes was the more prominent singer's/speaker's formant cluster. Vocal tract and glottal modifications were more prominent during and after straw exercising compared with tube phonation. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Narayanan, Shrikanth
2009-01-01
We describe a method for unsupervised region segmentation of an image using its spatial frequency domain representation. The algorithm was designed to process large sequences of real-time magnetic resonance (MR) images containing the 2-D midsagittal view of a human vocal tract airway. The segmentation algorithm uses an anatomically informed object model, whose fit to the observed image data is hierarchically optimized using a gradient descent procedure. The goal of the algorithm is to automatically extract the time-varying vocal tract outline and the position of the articulators to facilitate the study of the shaping of the vocal tract during speech production. PMID:19244005
Vocal Tract Representation in the Recognition of Cerebral Palsied Speech
ERIC Educational Resources Information Center
Rudzicz, Frank; Hirst, Graeme; van Lieshout, Pascal
2012-01-01
Purpose: In this study, the authors explored articulatory information as a means of improving the recognition of dysarthric speech by machine. Method: Data were derived chiefly from the TORGO database of dysarthric articulation (Rudzicz, Namasivayam, & Wolff, 2011) in which motions of various points in the vocal tract are measured during speech.…
Social Communication and Vocal Recognition in Free-Ranging Rhesus Monkeys
NASA Astrophysics Data System (ADS)
Rendall, Christopher Andrew
Kinship and individual identity are key determinants of primate sociality, and the capacity for vocal recognition of individuals and kin is hypothesized to be an important adaptation facilitating intra-group social communication. Research was conducted on adult female rhesus monkeys on Cayo Santiago, Puerto Rico to test this hypothesis for three acoustically distinct calls characterized by varying selective pressures on communicating identity: coos (contact calls), grunts (close range social calls), and noisy screams (agonistic recruitment calls). Vocalization playback experiments confirmed a capacity for both individual and kin recognition of coos, but not screams (grunts were not tested). Acoustic analyses, using traditional spectrographic methods as well as linear predictive coding techniques, indicated that coos (but not grunts or screams) were highly distinctive, and that the effects of vocal tract filtering (formants) contributed more to statistical discriminations of both individuals and kin groups than did temporal or laryngeal source features. Formants were identified from very short (23 ms) segments of coos and were stable within calls, indicating that formant cues to individual and kin identity were available throughout a call. This aspect of formant cues is predicted to be an especially important design feature for signaling identity efficiently in complex acoustic environments. Results of playback experiments involving manipulated coo stimuli provided preliminary perceptual support for the statistical inference that formant cues take precedence in facilitating vocal recognition. The similarity of formants among female kin suggested a mechanism for the development of matrilineal vocal signatures from the genetic and environmental determinants of vocal tract morphology shared among relatives.
The fact that screams (calls strongly expected to communicate identity) were neither individually distinctive nor recognized suggested the possibility that their acoustic structure and role in signaling identity might be constrained by functional or morphological design requirements associated with their role in signaling submission.
Fluid-acoustic interactions and their impact on pathological voiced speech
NASA Astrophysics Data System (ADS)
Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.
2011-11-01
Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically-realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented into a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
Howe, M S; McGowan, R S
2009-11-01
An analysis is made of the nonlinear interactions between flow in the subglottal vocal tract and glottis, sound waves in the subglottal system and a mechanical model of the vocal folds. The mean flow through the system is produced by a nominally steady contraction of the lungs, and mechanical experiments frequently involve a 'lung cavity' coupled to an experimental subglottal tube of arbitrary or ill-defined effective length L, on the basis that the actual value of L has little or no influence on excitation of the vocal folds. A simple, self-exciting single mass mathematical model of the vocal folds is used to investigate the sound generated within the subglottal domain and the unsteady volume flux from the glottis for experiments where it is required to suppress feedback of sound from the supraglottal vocal tract. In experiments where the assumed absorption of sound within the sponge-like interior of the lungs is small, the influence of changes in L can be very significant: when the subglottal tube behaves as an open-ended resonator (when L is as large as half the acoustic wavelength) there is predicted to be a mild increase in volume flux magnitude and a small change in waveform. However, the strong appearance of second harmonics of the acoustic field is predicted at intermediate lengths, when L is roughly one quarter of the acoustic wavelength. In cases of large lung damping, however, only modest changes in the volume flux are predicted to occur with variations in L.
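The length conditions mentioned above (L near one quarter or one half of the acoustic wavelength) translate into physical tube lengths once a frequency is chosen. The sketch below does that conversion; the frequency of 125 Hz and the speed of sound of 350 m/s are assumptions for illustration, since the abstract does not state which values were used, and the function name is hypothetical.

```python
def tube_length_for_fraction(freq_hz, fraction, speed_of_sound=350.0):
    """Tube length (m) equal to `fraction` of the acoustic wavelength at freq_hz."""
    wavelength = speed_of_sound / freq_hz  # lambda = c / f
    return fraction * wavelength

f0 = 125.0  # assumed oscillation frequency in Hz
quarter = tube_length_for_fraction(f0, 0.25)
half = tube_length_for_fraction(f0, 0.5)
print(f"quarter wavelength: {quarter:.2f} m, half wavelength: {half:.2f} m")
```

At these assumed values the lengths are on the order of a metre, which may help explain why the effective length of an experimental subglottal tube matters when lung damping is small.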
Articulatory speech synthesis and speech production modelling
NASA Astrophysics Data System (ADS)
Huang, Jun
This dissertation addresses the problem of speech synthesis and speech production modelling based on the fundamental principles of human speech production. Unlike the conventional source-filter model, which assumes the independence of the excitation and the acoustic filter, we treat the entire vocal apparatus as one system consisting of a fluid dynamic aspect and a mechanical part. We model the vocal tract by a three-dimensional moving geometry. We also model the sound propagation inside the vocal apparatus as a three-dimensional nonplane-wave propagation inside a viscous fluid described by Navier-Stokes equations. In our work, we first propose a combined minimum energy and minimum jerk criterion to estimate the dynamic vocal tract movements during speech production. Both theoretical error bound analysis and experimental results show that this method can achieve very close match at the target points and avoid the abrupt change in articulatory trajectory at the same time. Second, a mechanical vocal fold model is used to compute the excitation signal of the vocal tract. The advantage of this model is that it is closely coupled with the vocal tract system based on fundamental aerodynamics. As a result, we can obtain an excitation signal with much more detail than the conventional parametric vocal fold excitation model. Furthermore, strong evidence of source-tract interaction is observed. Finally, we propose a computational model of the fricative and stop types of sounds based on the physical principles of speech production. The advantage of this model is that it uses an exogenous process to model the additional nonsteady and nonlinear effects due to the flow mode, which are ignored by the conventional source-filter speech production model. A recursive algorithm is used to estimate the model parameters. Experimental results show that this model is able to synthesize good quality fricative and stop types of sounds.
Based on our dissertation work, we carefully argue that the articulatory speech production model has the potential to flexibly synthesize natural-quality speech sounds and to provide a compact computational model for speech production that can be beneficial to a wide range of areas in speech signal processing.
Pitch bending and glissandi on the clarinet: roles of the vocal tract and partial tone hole closure.
Chen, Jer-Ming; Smith, John; Wolfe, Joe
2009-09-01
Clarinettists combine non-standard fingerings with particular vocal tract configurations to achieve pitch bending, i.e., sounding pitches that can deviate substantially from those of standard fingerings. Impedance spectra were measured in the mouth of expert clarinettists while they played normally and during pitch bending, using a measurement head incorporated within a functioning clarinet mouthpiece. These were compared with the input impedance spectra of the clarinet for the fingerings used. Partially uncovering a tone hole by sliding a finger raises the frequency of clarinet impedance peaks, thereby allowing smooth increases in sounding pitch over some of the range. To bend notes in the second register and higher, however, clarinettists produce vocal tract resonances whose impedance maxima have magnitudes comparable with those of the bore resonance, which then may influence or determine the sounding frequency. It is much easier to bend notes down than up because of the phase relations of the bore and tract resonances, and the compliance of the reed. Expert clarinettists performed the glissando opening of Gershwin's 'Rhapsody in Blue'. Here, players coordinate the two effects: They slide their fingers gradually over open tone holes, while simultaneously adjusting a strong vocal tract resonance to the desired pitch.
The Human Voice in Speech and Singing
NASA Astrophysics Data System (ADS)
Lindblom, Björn; Sundberg, Johan
This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this air stream to an intermittent or pulsating air stream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the air stream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter and how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7.
The impressive control of all these acoustic characteristics of vocal signals is discussed in Sect. 16.8, while Sect. 16.9 considers expressive aspects of vocal communication.
Duke, Emily; Plexico, Laura W; Sandage, Mary J; Hoch, Matthew
2015-11-01
This study investigated the effect of traditional vocal warm-up versus semioccluded vocal tract exercises on the acoustic parameters of voice through three questions: Does vocal warm-up condition significantly alter the singing power ratio of the singing voice? Is singing power ratio dependent upon vowel? Is perceived phonatory effort affected by warm-up condition? Hypotheses were that vocal warm-up would alter the singing power ratio, that semioccluded vocal tract warm-up would affect the singing power ratio more than no warm-up or traditional warm-up, that singing power ratio would vary across vowels, and that perceived phonatory effort would vary with warm-up condition. This study was a within-participant repeated measures design with counterbalanced conditions. Thirteen male singers were recorded under three different conditions: no warm-up, traditional warm-up, and semioccluded vocal tract exercise warm-up. Recordings were made of these singers performing the Star Spangled Banner, and singing power ratio (SPR) was calculated from four vowels. Singers rated their perceived phonatory effort (PPE) singing the Star Spangled Banner after each warm-up condition. Warm-up condition did not significantly affect SPR. SPR was significantly different for /i/ and /e/. PPE was not significantly different between warm-up conditions. The present study did not find significant differences in SPR between warm-up conditions. SPR differences for /i/ support previous findings. PPE did not differ significantly across warm-up conditions despite the expectation that traditional or semioccluded warm-up would cause a decrease. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Vorperian, Houri K.; Wang, Shubing; Schimek, E. Michael; Durtschi, Reid B.; Kent, Ray D.; Gentry, Lindell R.; Chung, Moo K.
2011-01-01
Purpose: The anatomic origin for prepubertal vowel acoustic differences between male and female subjects remains unknown. The purpose of this study is to examine developmental sex differences in vocal tract (VT) length and its oral and pharyngeal portions. Method: Nine VT variables were measured from 605 imaging studies (magnetic resonance imaging…
Voice Training and Therapy with a Semi-Occluded Vocal Tract: Rationale and Scientific Underpinnings
ERIC Educational Resources Information Center
Titze, Ingo R.
2006-01-01
Purpose: Voice therapy with a semi-occluded vocal tract has a long history. The use of lip trills, tongue trills, bilabial fricatives, humming, and phonation into tubes or straws has been hailed by clinicians, singing teachers, and voice coaches as efficacious for training and rehabilitation. Little has been done, however, to provide the…
Miller, Nicola A; Gregory, Jennifer S; Aspden, Richard M; Stollery, Peter J; Gilbert, Fiona J
2014-09-01
The shape of the vocal tract and associated structures (eg, tongue and velum) is complicated and varies according to development and function. This variability challenges interpretation of voice experiments. Quantifying differences between shapes and understanding how vocal structures move in relation to each other is difficult using traditional linear and angle measurements. With statistical shape models, shape can be characterized in terms of independent modes of variation. Here, we build an active shape model (ASM) to assess morphologic and pitch-related functional changes affecting vocal structures and the airway. Using a cross-sectional study design, we obtained six midsagittal magnetic resonance images from 10 healthy adults (five men and five women) at rest, while breathing out, and while listening to, and humming low and high notes. Eighty landmark points were chosen to define the shape of interest and an ASM was built using these (60) images. Principal component analysis was used to identify independent modes of variation, and statistical analysis was performed using one-way repeated-measures analysis of variance. Twenty modes of variation were identified with modes 1 and 2 accounting for half the total variance. Modes 1 and 9 were significantly associated with humming low and high notes (P < 0.001) and showed coordinated changes affecting the cervical spine, vocal structures, and airway. Mode 2 highlighted wide structural variations between subjects. This study highlights the potential of active shape modeling to advance understanding of factors underlying morphologic and pitch-related functional variations affecting vocal structures and the airway in health and disease. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
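The principal component step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the landmark sets have already been aligned (a full active shape model would normally apply Procrustes alignment first); the function name `shape_modes` and the array layout are hypothetical and not taken from the study.

```python
import numpy as np

def shape_modes(shapes):
    """Return mean shape, modes of variation, and explained-variance fractions.

    shapes: array-like of shape (n_shapes, n_landmarks, 2), assumed pre-aligned.
    """
    X = np.asarray(shapes, dtype=float).reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal modes as rows of Vt.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (len(X) - 1)
    explained = variances / variances.sum()
    return mean, Vt, explained
```

With 60 images of 80 landmarks each, `explained[:2].sum()` would give the share of total variance carried by the first two modes, which is the kind of quantity the abstract reports.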
Zhou, Xinhui; Espy-Wilson, Carol Y.; Boyce, Suzanne; Tiede, Mark; Holland, Christy; Choe, Ann
2008-01-01
Speakers of rhotic dialects of North American English show a range of different tongue configurations for ∕r∕. These variants produce acoustic profiles that are indistinguishable for the first three formants [Delattre, P., and Freeman, D. C., (1968). “A dialect study of American English r’s by x-ray motion picture,” Linguistics 44, 28–69; Westbury, J. R. et al. (1998), “Differences among speakers in lingual articulation for American English ∕r∕,” Speech Commun. 26, 203–206]. It is puzzling why this should be so, given the very different vocal tract configurations involved. In this paper, two subjects whose productions of “retroflex” ∕r∕ and “bunched” ∕r∕ show similar patterns of F1–F3 but very different spacing between F4 and F5 are contrasted. Using finite element analysis and area functions based on magnetic resonance images of the vocal tract for sustained productions, the results of computer vocal tract models are compared to actual speech recordings. In particular, formant-cavity affiliations are explored using formant sensitivity functions and vocal tract simple-tube models. The difference in F4∕F5 patterns between the subjects is confirmed for several additional subjects with retroflex and bunched vocal tract configurations. The results suggest that the F4∕F5 differences between the variants can be largely explained by differences in whether the long cavity behind the palatal constriction acts as a half- or a quarter-wavelength resonator. PMID:18537397
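The half- versus quarter-wavelength distinction in this abstract can be illustrated with idealized uniform tubes. The sketch below uses the textbook formulas for a tube closed at one end (quarter-wavelength) and a tube with matching end conditions (half-wavelength); the function names, the speed of sound of 350 m/s, and the example cavity length are assumptions for illustration, not values from the paper.

```python
def quarter_wave_resonances(length_m, n_modes=3, c=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [(2 * k - 1) * c / (4.0 * length_m) for k in range(1, n_modes + 1)]

def half_wave_resonances(length_m, n_modes=3, c=350.0):
    """Resonances (Hz) of a uniform tube with the same condition at both ends."""
    return [k * c / (2.0 * length_m) for k in range(1, n_modes + 1)]

# Hypothetical 10 cm back cavity: the lowest resonance doubles when the cavity
# behaves as a half-wavelength rather than a quarter-wavelength resonator.
back_cavity_m = 0.10
print(quarter_wave_resonances(back_cavity_m, 2))
print(half_wave_resonances(back_cavity_m, 2))
```

For the same cavity length, the two resonator types place their resonances at different frequencies and spacings, which is the sort of mechanism the abstract invokes for the F4/F5 differences.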
Using statistical deformable models to reconstruct vocal tract shape from magnetic resonance images.
Vasconcelos, M J M; Rua Ventura, S M; Freitas, D R S; Tavares, J M R S
2010-10-01
The mechanisms involved in speech production are complex and have thus been subject to growing attention by the scientific community. It has been demonstrated that magnetic resonance imaging (MRI) is a powerful means of understanding the morphology of the vocal tract. Over the last few years, statistical deformable models have been successfully used to identify and characterize bones and organs in medical images, and point distribution models (PDMs) have gained particular relevance. In this work, the suitability of these models has been studied to characterize and further reconstruct the shape of the vocal tract in the articulation of speech sounds of European Portuguese (EP), one of the most spoken languages worldwide, with the aid of MR images. Therefore, a PDM has been built from a set of MR images acquired during the artificially sustained articulation of 25 EP speech sounds. Following this, the capacity of this statistical model to characterize the shape deformation of the vocal tract during the production of sounds was analysed. Next, the model was used to reconstruct five EP oral vowels and the EP fricative consonants. As far as a study on speech production is concerned, this study is considered to be the first approach to characterize and reconstruct the vocal tract shape from MR images by using PDMs. In addition, the findings achieved permit one to conclude that this modelling technique enables an enhanced understanding of the dynamic speech events involved in sustained articulations based on MRI, which are of particular interest for speech rehabilitation and simulation.
A Computerized Tomography Study of Vocal Tract Setting in Hyperfunctional Dysphonia and in Belting.
Saldias, Marcelo; Guzman, Marco; Miranda, Gonzalo; Laukkanen, Anne-Maria
2018-04-03
Vocal tract setting in hyperfunctional patients is characterized by a high larynx and narrowing of the epilaryngeal and pharyngeal region. Similar observations have been made for various singing styles, eg, belting. The voice quality in belting has been described to be loud, speech like, and high pitched. It is also often described as sounding "pressed" or "tense". The above mentioned has led to the hypothesis that belting may be strenuous to the vocal folds. However, singers and teachers of belting do not regard belting as particularly strenuous. This study investigates possible similarities and differences between hyperfunctional voice production and belting. This study concerns vocal tract setting. Four male patients with hyperfunctional dysphonia and one male contemporary commercial music singer were registered with computerized tomography while phonating on [a:] in their habitual speaking pitch. Additionally, the singer used the pitch G4 in belting. The scannings were studied in sagittal and transversal dimensions by measuring lengths, widths, and areas. Various similarities were found between belting and hyperfunction: high vertical larynx position, small hypopharyngeal width, and epilaryngeal outlet. On the other hand, belting differed from dysphonia (in addition to higher pitch) by a wider lip and jaw opening, and larger volumes of the oral cavity. Belting takes advantage of "megaphone shape" of the vocal tract. Future studies should focus on modeling and simulation to address sound energy transfer. Also, they should consider aerodynamic variables and vocal fold vibration to evaluate the "price of decibels" in these phonation types. Copyright © 2018. Published by Elsevier Inc.
ERIC Educational Resources Information Center
Zajac, David J.; Weissler, Mark C.
2004-01-01
Two studies were conducted to evaluate short-latency vocal tract air pressure responses to sudden pressure bleeds during production of voiceless bilabial stop consonants. It was hypothesized that the occurrence of respiratory reflexes would be indicated by distinct patterns of responses as a function of bleed magnitude. In Study 1, 19 adults…
A Randomized Controlled Trial of Two Semi-Occluded Vocal Tract Voice Therapy Protocols
ERIC Educational Resources Information Center
Kapsner-Smith, Mara R.; Hunter, Eric J.; Kirkham, Kimberly; Cox, Karin; Titze, Ingo R.
2015-01-01
Purpose: Although there is a long history of use of semi-occluded vocal tract gestures in voice therapy, including phonation through thin tubes or straws, the efficacy of phonation through tubes has not been established. This study compares results from a therapy program on the basis of phonation through a flow-resistant tube (FRT) with Vocal…
NASA Astrophysics Data System (ADS)
Yoshinaga, Tsukasa; Nozaki, Kazunori; Wada, Shigeo
2018-03-01
The sound generation mechanisms of sibilant fricatives were investigated with experimental measurements and large-eddy simulations using a simplified vocal tract model. The vocal tract geometry was simplified to a three-dimensional rectangular channel, and differences in the geometries while pronouncing fricatives /s/ and /ʃ/ were expressed by shifting the position of the tongue and its constricted flow channel. Experimental results showed that the characteristic peak frequency of the fricatives decreased when the distance between the tongue and teeth increased. Numerical simulations revealed that the jet flow generated from the constriction impinged on the upper teeth wall and caused the main sound source upstream and downstream from the gap between the teeth. While magnitudes of the sound source decreased with increments of the frequency, amplitudes of the pressure downstream from the constriction increased at the peak frequencies of the corresponding tongue position. These results indicate that the sound pressures at the peak frequencies increased by acoustic resonance in the channel downstream from the constriction, and the different frequency characteristics between /s/ and /ʃ/ were produced by changing the constriction and the acoustic node positions inside the vocal tract.
Toward dynamic magnetic resonance imaging of the vocal tract during speech production.
Ventura, Sandra M Rua; Freitas, Diamantino Rui S; Tavares, João Manuel R S
2011-07-01
The most recent and significant magnetic resonance imaging (MRI) improvements allow for the visualization of the vocal tract during speech production, which has been revealed to be a powerful tool in dynamic speech research. However, a synchronization technique with enhanced temporal resolution is still required. The study design was transversal in nature. Throughout this work, a technique for the dynamic study of the vocal tract with MRI by using the heart's signal to synchronize and trigger the imaging-acquisition process is presented and described. The technique in question is then used in the measurement of four speech articulatory parameters to assess three different syllables (articulatory gestures) of European Portuguese Language. The acquired MR images are automatically reconstructed so as to result in a variable sequence of images (slices) of different vocal tract shapes in articulatory positions associated with Portuguese speech sounds. The knowledge obtained as a result of the proposed technique represents a direct contribution to the improvement of speech synthesis algorithms, thereby allowing for novel perceptions in coarticulation studies, in addition to providing further efficient clinical guidelines in the pursuit of more proficient speech rehabilitation processes. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
ERIC Educational Resources Information Center
Riede, Tobias; Goller, Franz
2010-01-01
Song production in songbirds is a model system for studying learned vocal behavior. As in humans, bird phonation involves three main motor systems (respiration, vocal organ and vocal tract). The avian respiratory mechanism uses pressure regulation in air sacs to ventilate a rigid lung. In songbirds sound is generated with two independently…
A real-time LPC-based vocal tract area display for voice development.
Rossiter, D; Howard, D M; Downes, M
1994-12-01
This article reports the design and implementation of a graphical display that presents an approximation to vocal tract area in real time for voiced vowel articulation. The acoustic signal is digitally sampled by the system. From these data a set of reflection coefficients is derived using linear predictive coding. A matrix of area coefficients is then determined that approximates the vocal tract area of the user. From this information a graphical display is then generated. The complete cycle of analysis and display is repeated at approximately 20 times/s. Synchronised audio and visual sequences can be recorded and used as dynamic targets for articulatory development. Use of the system is illustrated by diagrams of system output for spoken cardinal vowels and for vowels sung in a trained and untrained style.
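The analysis chain in this abstract (sampled signal, then reflection coefficients via linear predictive coding, then area coefficients) can be sketched roughly as below. This is a generic autocorrelation-method Levinson-Durbin recursion followed by the lossless-tube area ratio, not the authors' implementation; the function names, the Hamming window, and the sign and direction convention for the area ratio are assumptions, and conventions differ between texts.

```python
import numpy as np

def reflection_coefficients(frame, order):
    """Reflection coefficients of one frame via the Levinson-Durbin recursion."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation lags 0..order.
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    ks = np.zeros(order)
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        ks[i - 1] = k
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return ks

def area_function(ks, first_area=1.0):
    """Relative tube-section areas from reflection coefficients (assumed convention)."""
    areas = [first_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return np.array(areas)
```

Only area ratios are recoverable this way, so `first_area` is an arbitrary normalization; a display like the one described would repeat the two calls on successive frames.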
Watts, Christopher; Barnes-Burroughs, Kathryn; Estis, Julie; Blanton, Debra
2006-03-01
A growing body of contemporary research has investigated differences between trained and untrained singing voices. However, few studies have separated untrained singers into those who do and do not express abilities related to singing talent, including accurate pitch control and production of a pleasant timbre (voice quality). This investigation studied measures of the singing power ratio (SPR), which is a quantitative measure of the resonant quality of the singing voice. SPR reflects the amplification or suppression in the vocal tract of the harmonics produced by the sound source. This measure was acquired from the voices of untrained talented and nontalented singers as a means to objectively investigate voice quality differences. Measures of SPR were acquired from vocal samples with fast Fourier transform (FFT) power spectra to analyze the amplitude level of the partials in the acoustic spectrum. Long-term average spectra (LTAS) were also analyzed. Results indicated significant differences in SPR between groups, which suggest that vocal tract resonance, and its effect on perceived vocal timbre or quality, may be an important variable related to the perception of singing talent. LTAS confirmed group differences in the tuning of vocal tract harmonics.
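A minimal sketch of an SPR computation from an FFT power spectrum follows. It assumes the commonly used definition, the level of the strongest spectral peak between 2 and 4 kHz relative to the strongest peak between 0 and 2 kHz, expressed in dB; the abstract does not state the exact band limits or sign convention used in this study, so those, the Hann window, and the function name are assumptions.

```python
import numpy as np

def singing_power_ratio(signal, sample_rate):
    """SPR in dB: strongest 2-4 kHz peak relative to strongest 0-2 kHz peak."""
    x = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    low_peak = power[(freqs > 0) & (freqs < 2000)].max()
    high_peak = power[(freqs >= 2000) & (freqs <= 4000)].max()
    return 10.0 * np.log10(high_peak / low_peak)
```

Under this convention a value closer to zero indicates relatively more energy in the 2-4 kHz region.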
Two-dimensional model of vocal fold vibration for sound synthesis of voice and soprano singing
NASA Astrophysics Data System (ADS)
Adachi, Seiji; Yu, Jason
2005-05-01
Voiced sounds were simulated with a computer model of the vocal fold composed of a single mass vibrating both parallel and perpendicular to the airflow. Similarities with the two-mass model are found in the amplitudes of the glottal area and the glottal volume flow velocity, the variation in the volume flow waveform with the vocal tract shape, and the dependence of the oscillation amplitude upon the average opening area of the glottis, among other similar features. A few dissimilarities are also found in the more symmetric glottal and volume flow waveforms in the rising and falling phases. The major improvement of the present model over the two-mass model is that it yields a smooth transition between oscillations with an inductive load and a capacitive load of the vocal tract with no sudden jumps in the vibration frequency. Self-excitation is possible both below and above the first formant frequency of the vocal tract. By taking advantage of the wider continuous frequency range, the two-dimensional model can successfully be applied to the sound synthesis of high-pitched soprano singing, where the fundamental frequency sometimes exceeds the first formant frequency.
Acoustic correlates of body size and individual identity in banded penguins
Gamba, Marco; Gili, Claudia; Pessani, Daniela
2017-01-01
Animal vocalisations play a role in individual recognition and mate choice. In nesting penguins, acoustic variation in vocalisations originates from distinctiveness in the morphology of the vocal apparatus. Using the source-filter theory approach, we investigated vocal individuality cues and correlates of body size and mass in the ecstatic display songs of the Humboldt and Magellanic penguins. We demonstrate that both fundamental frequency (f0) and formants (F1-F4) are essential vocal features to discriminate among individuals. However, we show that only duration and f0 are honest indicators of the body size and mass, respectively. We did not find any effect of body dimension on formants, formant dispersion, or estimated vocal tract length of the emitters. Overall, our findings provide the first evidence that the resonant frequencies of the vocal tract do not correlate with body size in penguins. Our results add important information to a growing body of literature on the role of the different vocal parameters in conveying biologically meaningful information in bird vocalisations. PMID:28199318
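Formant dispersion and an estimated vocal tract length, both mentioned in this abstract, are often derived as in the sketch below. It assumes the widely used definitions (dispersion as the mean spacing between adjacent formants, and length from a uniform-tube approximation as c / (2 × dispersion)); the abstract does not say which estimator the authors used, and the function names and the speed of sound of 350 m/s are illustrative assumptions.

```python
def formant_dispersion(formants_hz):
    """Mean spacing (Hz) between adjacent formants."""
    f = sorted(formants_hz)
    return (f[-1] - f[0]) / (len(f) - 1)

def estimated_vocal_tract_length(formants_hz, c=350.0):
    """Apparent vocal tract length (m) under a uniform-tube approximation."""
    return c / (2.0 * formant_dispersion(formants_hz))

# Hypothetical formant values F1-F4, not measurements from the study.
print(estimated_vocal_tract_length([500.0, 1500.0, 2500.0, 3500.0]))
```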
Immediate effects of the semi-occluded vocal tract exercise with LaxVox® tube in singers.
Fadel, Congeta Bruniere Xavier; Dassie-Leite, Ana Paula; Santos, Rosane Sampaio; Santos, Celso Gonçalves Dos; Dias, Cláudio Antônio Sorondo; Sartori, Denise Jussara
The purpose of this study was to analyze the immediate effects of the semi-occluded vocal tract exercise (SOVTE) using the LaxVox® tube in singers. Participants were 23 singers, classical singing students, aged 18 to 47 years (mean age = 27.2 years). First, data were collected through the application of a demographic questionnaire and the recording of sustained emission - vowel /ε/, counting 1-10, and a music section from the participants' current repertoire. After that, the participants were instructed and performed the SOVTE using the LaxVox® tube for three minutes. Finally, the same vocal samples were collected immediately after SOVTE performance and the singers responded to a questionnaire on their perception regarding vocal changes after the exercise. The vocal samples were analyzed by referees (speech-language pathologists and singing teachers) and by means of acoustic analysis. Most of the singers reported improved voice post-exercise in both tasks - speech and singing. Regarding the perceptual assessment (sustained vowel, speech, and singing), the referees found no difference between pre- and post-exercise emissions. The acoustic analysis of the sustained vowel showed increased Fundamental Frequency (F0) and reduction of the Glottal to Noise Excitation (GNE) ratio post-exercise. The semi-occluded vocal tract exercise with LaxVox® tube promotes immediate positive effects on the self-assessment and acoustic analysis of voice in professional singers without vocal complaints. No immediate significant changes were observed with respect to auditory-perceptual evaluation of speech and singing.
ERIC Educational Resources Information Center
Pa, Judy; Hickok, Gregory
2008-01-01
Several sensory-motor integration regions have been identified in parietal cortex, which appear to be organized around motor-effectors (e.g., eyes, hands). We investigated whether a sensory-motor integration area might exist for the human vocal tract. Speech requires extensive sensory-motor integration, as do other abilities such as vocal…
A model of acoustic interspeaker variability based on the concept of formant-cavity affiliation
NASA Astrophysics Data System (ADS)
Apostol, Lian; Perrier, Pascal; Bailly, Gérard
2004-01-01
A method is proposed to model the interspeaker variability of formant patterns for oral vowels. It is assumed that this variability originates in the differences existing among speakers in the respective lengths of their front and back vocal-tract cavities. In order to characterize, from the spectral description of the acoustic speech signal, these vocal-tract differences between speakers, each formant is interpreted, according to the concept of formant-cavity affiliation, as a resonance of a specific vocal-tract cavity. Its frequency can thus be directly related to the corresponding cavity length, and a transformation model can be proposed from a speaker A to a speaker B on the basis of the frequency ratios of the formants corresponding to the same resonances. In order to minimize the number of sounds to be recorded for each speaker in order to carry out this speaker transformation, the frequency ratios are exactly computed only for the three extreme cardinal vowels [i, a, u] and they are approximated for the remaining vowels through an interpolation function. The method is evaluated through its capacity to transform the (F1,F2) formant patterns of eight oral vowels pronounced by five male speakers into the (F1,F2) patterns of the corresponding vowels generated by an articulatory model of the vocal tract. The resulting formant patterns are compared to those provided by normalization techniques published in the literature. The proposed method is found to be efficient, but a number of limitations are also observed and discussed. These limitations can be associated with the formant-cavity affiliation model itself or with a possible influence of speaker-specific vocal-tract geometry in the cross-sectional direction, which the model might not have taken into account.
Acoustic Characteristics in Epiglottic Cyst.
Lee, YeonWoo; Kim, GeunHyo; Wang, SooGeun; Jang, JeonYeob; Cha, Wonjae; Choi, HongSik; Kim, HyangHee
2018-05-03
The purpose of this study was to analyze the acoustic characteristics associated with alteration and deformation of the vocal tract due to a large epiglottic cyst, and to confirm the relation between the anatomical change and resonant function of the vocal tract. Eight men with epiglottic cyst were enrolled in this study. The jitter, shimmer, noise-to-harmonic ratio, and first two formants were analyzed in vowels /a:/, /e:/, /i:/, /o:/, and /u:/. These values were analyzed before and after laryngeal microsurgery. The F1 value of /a:/ was significantly raised after surgery. No significant differences were observed in the formant frequencies of the other vowels, jitter, shimmer, or noise-to-harmonic ratio. The results of this study could be used to analyze changes in the resonance of vocal tracts due to epiglottic cysts. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Vocal Tract Discomfort and Voice-Related Quality of Life in Wind Instrumentalists.
Cappellaro, Juliane; Beber, Bárbara Costa
2018-05-01
This study aimed to investigate vocal tract discomfort and quality of life in the voice of wind instrumentalists. It is a cross-sectional study. The sample was composed of 37 musicians of the orchestra of Caxias do Sul city, RS, Brazil. The participants answered a nonstandard questionnaire about demographic and professional information, the Voice-Related Quality of Life (V-RQOL), the Vocal Tract Discomfort (VTD) scale, and additional items about fatigue after playing the instrument and pain in the cervical muscles. Correlation analyses were performed using Spearman correlation test. The most frequent symptoms mentioned by musicians in the VTD, for both frequency and intensity of occurrence, were dryness, ache, irritability, and cervical muscle pain, in addition to the frequency of occurrence of fatigue after playing. The musicians showed high scores in the V-RQOL survey. Several symptoms evaluated by the VTD had a negative correlation with the musicians' years of orchestra membership and with V-RQOL scores. Symptoms of vocal tract discomfort are present in wind instrumentalists in low frequency and intensity of occurrence. However, these symptoms affect the musicians' voice-related quality of life, and they occur more in musicians with fewer years of orchestra membership. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Tilsen, Sam; Spincemaille, Pascal; Xu, Bo; Doerschuk, Peter; Luh, Wen-Ming; Feldman, Elana; Wang, Yi
2016-01-01
Models of speech production typically assume that control over the timing of speech movements is governed by the selection of higher-level linguistic units, such as segments or syllables. This study used real-time magnetic resonance imaging of the vocal tract to investigate the anticipatory movements speakers make prior to producing a vocal response. Two factors were varied: preparation (whether or not speakers had foreknowledge of the target response) and pre-response constraint (whether or not speakers were required to maintain a specific vocal tract posture prior to the response). In prepared responses, many speakers were observed to produce pre-response anticipatory movements with a variety of articulators, showing that speech movements can be readily dissociated from higher-level linguistic units. Substantial variation was observed across speakers with regard to the articulators used for anticipatory posturing and the contexts in which anticipatory movements occurred. The findings of this study have important consequences for models of speech production and for our understanding of the normal range of variation in anticipatory speech behaviors. PMID:26760511
Study of human phonation in a full-body domain
NASA Astrophysics Data System (ADS)
Saurabh, Shakti; Bodony, Daniel
2015-11-01
The generation and propagation of the human voice is studied in two dimensions by direct numerical simulation in a full-body domain. The air in the vocal tract is modeled as a compressible and viscous fluid interacting with the non-linear, viscoelastic vocal folds (VF). The VF tissue material properties are multi-layered, with varying stiffness, and a finite-strain model is utilized and implemented in a quadratic finite element code. The fluid-solid domains are coupled through a boundary-fitted interface and utilize a Poisson equation-based mesh deformation method. The full-body domain includes the near VF region, the vocal tract, a simplified model of the soft palate and mouth, and extends out into the acoustic far-field. A new kind of inflow boundary condition based upon a quasi-one-dimensional formulation with constant sub-glottal volume velocity, which is linked to the VF movement, has been adopted. The sound pressure levels (SPL) measured are realistic and we analyze their connection to the VF dynamics and glottal and vocal tract geometries. Supported by the National Science Foundation (CAREER award number 1150439).
Computational Modeling of Fluid–Structure–Acoustics Interaction during Voice Production
Jiang, Weili; Zheng, Xudong; Xue, Qian
2017-01-01
The paper presented a three-dimensional, first-principle based fluid–structure–acoustics interaction computer model of voice production, which employed more realistic human laryngeal and vocal tract geometries. Self-sustained vibrations, the important convergent–divergent vibration pattern of the vocal folds, and entrainment of the two dominant vibratory modes were captured. Voice quality-associated parameters including the frequency, open quotient, skewness quotient, and flow rate of the glottal flow waveform were found to be well within the normal physiological ranges. The analogy between the vocal tract and a quarter-wave resonator was demonstrated. The acoustic perturbed flux and pressure inside the glottis were found to be of the same order as their incompressible counterparts, suggesting strong source–filter interactions during voice production. Such a high-fidelity computational model will be useful for investigating a variety of pathological conditions that involve complex vibrations, such as vocal fold paralysis, vocal nodules, and vocal polyps. The model is also an important step toward a patient-specific surgical planning tool that can serve as a no-risk trial and error platform for different procedures, such as injection of biomaterials and thyroplastic medialization. PMID:28243588
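As a rough illustration of the quarter-wave analogy mentioned in the abstract above, the resonances of an ideal uniform tube closed at one end (the glottis) and open at the other (the lips) fall at odd multiples of c/(4L). The sketch below is not the authors' model; the 17.5 cm tube length, the 350 m/s sound speed, and the function name are assumptions chosen only for illustration.

```python
def quarter_wave_resonances(length_m, sound_speed_m_s=350.0, count=3):
    """Resonance frequencies (Hz) of an ideal uniform tube closed at one end.

    Uses F_n = (2n - 1) * c / (4 * L) for n = 1..count.
    length_m and sound_speed_m_s are illustrative assumptions, not values
    taken from the paper.
    """
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return [
        (2 * n - 1) * sound_speed_m_s / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Assumed adult-like tube length of 17.5 cm.
resonances = quarter_wave_resonances(0.175)
print([round(f) for f in resonances])  # -> [500, 1500, 2500]
```

A real vocal tract is not a uniform tube, so these values should be read only as a first-order reference against which a simulated spectrum might be compared.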
Titze, Ingo R
2014-04-01
The origin of vocal registers has generally been attributed to differential activation of cricothyroid and thyroarytenoid muscles in the larynx. Register shifts, however, have also been shown to be affected by glottal pressures exerted on vocal fold surfaces, which can change with loudness, pitch, and vowel. Here it is shown computationally and with empirical data that intraglottal pressures can change abruptly when glottal adductory geometry is changed relatively smoothly from convergent to divergent. An intermediate shape between large convergence and large divergence, namely, a nearly rectangular glottal shape with almost parallel vocal fold surfaces, is associated with mixed registration. It can be less stable than either of the highly angular shapes unless transglottal pressure is reduced and upper stiffness of vocal fold tissues is balanced with lower stiffness. This intermediate state of adduction is desirable because it leads to a low phonation threshold pressure with moderate vocal fold collision. Achieving mixed registration consistently across wide ranges of F0, lung pressure, and vocal tract shapes appears to be a balancing act of coordinating laryngeal muscle activation with vocal tract pressures. Surprisingly, a large transglottal pressure is not facilitative in this process, exacerbating the bi-stable condition and the associated register contrast.
Bi-stable vocal fold adduction: A mechanism of modal-falsetto register shifts and mixed registration
Titze, Ingo R.
2014-01-01
The origin of vocal registers has generally been attributed to differential activation of cricothyroid and thyroarytenoid muscles in the larynx. Register shifts, however, have also been shown to be affected by glottal pressures exerted on vocal fold surfaces, which can change with loudness, pitch, and vowel. Here it is shown computationally and with empirical data that intraglottal pressures can change abruptly when glottal adductory geometry is changed relatively smoothly from convergent to divergent. An intermediate shape between large convergence and large divergence, namely, a nearly rectangular glottal shape with almost parallel vocal fold surfaces, is associated with mixed registration. It can be less stable than either of the highly angular shapes unless transglottal pressure is reduced and upper stiffness of vocal fold tissues is balanced with lower stiffness. This intermediate state of adduction is desirable because it leads to a low phonation threshold pressure with moderate vocal fold collision. Achieving mixed registration consistently across wide ranges of F0, lung pressure, and vocal tract shapes appears to be a balancing act of coordinating laryngeal muscle activation with vocal tract pressures. Surprisingly, a large transglottal pressure is not facilitative in this process, exacerbating the bi-stable condition and the associated register contrast. PMID:25235006
Aerodynamically and acoustically driven modes of vibration in a physical model of the vocal folds.
Zhang, Zhaoyan; Neubauer, Juergen; Berry, David A
2006-11-01
In a single-layered, isotropic, physical model of the vocal folds, distinct phonation types were identified based on the medial surface dynamics of the vocal fold. For acoustically driven phonation, a single, in-phase, x-10 like eigenmode captured the essential dynamics, and coupled with one of the acoustic resonances of the subglottal tract. Thus, the fundamental frequency appeared to be determined primarily by a subglottal acoustic resonance. In contrast, aerodynamically driven phonation did not naturally appear in the single-layered model, but was facilitated by the introduction of a vertical constraint. For this phonation type, fundamental frequency was relatively independent of the acoustic resonances, and two eigenmodes were required to capture the essential dynamics of the vocal fold, including an out-of-phase x-11 like eigenmode and an in-phase x-10 like eigenmode, as described in earlier theoretical work. The two eigenmodes entrained to the same frequency, and were decoupled from subglottal acoustic resonances. With this independence from the acoustic resonances, vocal fold dynamics appeared to be determined primarily by near-field, fluid-structure interactions.
Simulation of singing qualities governed by lower vocal tract adjustments
NASA Astrophysics Data System (ADS)
Titze, Ingo R.
2003-04-01
In previous meetings, voice qualities such as pressed, ring, yawn, and twang were discussed in a speech context. It was shown that these qualities have unique spectral characteristics brought about by combinations of glottal and lower vocal tract adjustments (the epilarynx tube and the pharynx). Yawn has a wide glottis, a wide epilarynx tube, and a wide pharynx. On the contrary, twang has a general narrowing of all these airway sections. Ring has a wide pharynx and a relatively narrow epilarynx tube. A pressed voice is primarily laryngeal, with a narrowed glottis. In this presentation, similar adjustments are made for singing with a voice simulator that controls vocal tract area functions and glottal flow pulses by rules. Results suggest that various singing styles, such as country-western, opera, or pop, may in part be characterized by these unique combinations of source and filter adjustments.
Carey, Daniel; McGettigan, Carolyn
2017-04-01
The human vocal system is highly plastic, allowing for the flexible expression of language, mood and intentions. However, this plasticity is not stable throughout the life span, and it is well documented that adult learners encounter greater difficulty than children in acquiring the sounds of foreign languages. Researchers have used magnetic resonance imaging (MRI) to interrogate the neural substrates of vocal imitation and learning, and the correlates of individual differences in phonetic "talent". In parallel, a growing body of work using MR technology to directly image the vocal tract in real time during speech has offered primarily descriptive accounts of phonetic variation within and across languages. In this paper, we review the contribution of neural MRI to our understanding of vocal learning, and give an overview of vocal tract imaging and its potential to inform the field. We propose methods by which our understanding of speech production and learning could be advanced through the combined measurement of articulation and brain activity using MRI. Specifically, we describe a novel paradigm, developed in our laboratory, that uses both MRI techniques to map directly, for the first time, between neural, articulatory and acoustic data in the investigation of vocalisation. This non-invasive, multimodal imaging method could be used to track central and peripheral correlates of spoken language learning, and speech recovery in clinical settings, as well as provide insights into potential sites for targeted neural interventions. Copyright © 2016 Elsevier Ltd. All rights reserved.
High-frame-rate full-vocal-tract 3D dynamic speech imaging.
Fu, Maojing; Barlaz, Marissa S; Holtrop, Joseph L; Perry, Jamie L; Kuehn, David P; Shosted, Ryan K; Liang, Zhi-Pei; Sutton, Bradley P
2017-04-01
To achieve high temporal frame rate, high spatial resolution and full-vocal-tract coverage for three-dimensional dynamic speech MRI by using low-rank modeling and sparse sampling. Three-dimensional dynamic speech MRI is enabled by integrating a novel data acquisition strategy and an image reconstruction method with the partial separability model: (a) a self-navigated sparse sampling strategy that accelerates data acquisition by collecting high-nominal-frame-rate cone navigators and imaging data within a single repetition time, and (b) a reconstruction method that recovers high-quality speech dynamics from sparse (k,t)-space data by enforcing joint low-rank and spatiotemporal total variation constraints. The proposed method has been evaluated through in vivo experiments. A nominal temporal frame rate of 166 frames per second (defined based on a repetition time of 5.99 ms) was achieved for an imaging volume covering the entire vocal tract with a spatial resolution of 2.2 × 2.2 × 5.0 mm³. Practical utility of the proposed method was demonstrated via both validation experiments and a phonetics investigation. Three-dimensional dynamic speech imaging is possible with full-vocal-tract coverage, high spatial resolution and high nominal frame rate to provide dynamic speech data useful for phonetic studies. Magn Reson Med 77:1619-1629, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
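The nominal frame rate quoted in the abstract above is defined from the repetition time, so it can be checked with one line of arithmetic. The sketch below assumes the simplest reading, one nominal frame per repetition time; the function name is hypothetical.

```python
def nominal_frame_rate(repetition_time_s):
    """Nominal frames per second, assuming one frame per repetition time."""
    if repetition_time_s <= 0:
        raise ValueError("repetition_time_s must be positive")
    return 1.0 / repetition_time_s


# Repetition time of 5.99 ms, as stated in the abstract.
rate = nominal_frame_rate(5.99e-3)
print(f"{rate:.1f} frames per second")  # -> 166.9 frames per second
```

Under this reading the result is about 166.9 frames per second, which matches the reported 166 if that figure was truncated rather than rounded.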
Embodied, Embedded Language Use
Fowler, Carol A.
2011-01-01
Language use has a public face that is as important to study as the private faces under intensive psycholinguistic study. In the domain of phonology, public use of speech must meet an interpersonal “parity” constraint if it is to serve to communicate. That is, spoken language forms must reliably be identified by listeners. To that end, language forms are embodied, at the lowest level of description, as phonetic gestures of the vocal tract that lawfully structure informational media such as air and light. Over time, under the parity constraint, sound inventories emerge over communicative exchanges that have the property of sufficient identifiability. Communicative activities involve more than vocal tract actions. Talkers gesture and use facial expressions and eye gaze to communicate. Listeners embody their language understandings, exhibiting dispositions to behave in ways related to language understanding. Moreover, linguistic interchanges are embedded in the larger context of language use. Talkers recruit the environment in their communicative activities, for example, in using deictic points. Moreover, in using language as a “coordination device,” interlocutors mutually entrain. PMID:21243080
Zimmer-Nowicka, Joanna; Januszewska-Stańczyk, Henryka
2011-07-01
Upper respiratory tract infections (URTI) are among the major causes of dysphonia. There are only scarce data available on the incidence and predisposing factors of URTI in young singers, in particular, during a period of intense voice training. The data were obtained from medical records and a 43-item questionnaire distributed among 94 students of the vocal faculty (66 females and 28 males; age: 23.5±3.7 years) at all levels of their studies. The questions were divided into several categories, that is, personal, anthropometric, demographic, history of vocal education, and both general and singer-specific health risk factors. The rate of URTI showed a steady decrease during vocal studies. The strongest factor predisposing to infections in the multivariate regression model was nonadherence to vocal hygiene. There was also a weak protective effect of a regular holiday rest and a negative effect of allergy. The prevalence of several recognized risk factors of URTI was exceptionally high in the group of vocal students, for example, passive smoking (42.5%), poor dental status (39.4%), frequent gastric complaints (44.7%), and allergy (50%). Despite the persistence of many risk factors throughout the vocal studies, the frequency of URTI significantly decreases, most likely because of vocal hygiene education and growing professional experience. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Meerschman, Iris; Van Lierde, Kristiane; Peeters, Karen; Meersman, Eline; Claeys, Sofie; D'haeseleer, Evelien
2017-09-18
The purpose of this study was to determine the short-term effect of 2 semi-occluded vocal tract training programs, "resonant voice training using nasal consonants" versus "straw phonation," on the vocal quality of vocally healthy future occupational voice users. A multigroup pretest-posttest randomized control group design was used. Thirty healthy speech-language pathology students with a mean age of 19 years (range: 17-22 years) were randomly assigned into a resonant voice training group (practicing resonant exercises across 6 weeks, n = 10), a straw phonation group (practicing straw phonation across 6 weeks, n = 10), or a control group (receiving no voice training, n = 10). A voice assessment protocol consisting of both subjective (questionnaire, participant's self-report, auditory-perceptual evaluation) and objective (maximum performance task, aerodynamic assessment, voice range profile, acoustic analysis, acoustic voice quality index, dysphonia severity index) measurements and determinations was used to evaluate the participants' voice pre- and posttraining. Groups were compared over time using linear mixed models and generalized linear mixed models. Within-group effects of time were determined using post hoc pairwise comparisons. No significant time × group interactions were found for any of the outcome measures, indicating no differences in evolution over time among the 3 groups. Within-group effects of time showed a significant improvement in dysphonia severity index in the resonant voice training group, and a significant improvement in the intensity range in the straw phonation group. Results suggest that the semi-occluded vocal tract training programs using resonant voice training and straw phonation may have a positive impact on the vocal quality and vocal capacities of future occupational voice users. 
The resonant voice training caused an improved dysphonia severity index, and the straw phonation training caused an expansion of the intensity range in this population.
Primate feedstock for the evolution of consonants.
Lameira, Adriano R; Maddieson, Ian; Zuberbühler, Klaus
2014-02-01
The evolution of speech remains an elusive scientific problem. A widespread notion is that vocal learning, underlined by vocal-fold control, is a key prerequisite for speech evolution. Although present in birds and non-primate mammals, vocal learning is ostensibly absent in non-human primates. Here we argue that the main road to speech evolution has been through controlling the supralaryngeal vocal tract, for which we find evidence for evolutionary continuity within the great apes. Copyright © 2013 Elsevier Ltd. All rights reserved.
Modeling source-filter interaction in belting and high-pitched operatic male singing
Titze, Ingo R.; Worley, Albert S.
2009-01-01
Nonlinear source-filter theory is applied to explain some acoustic differences between two contrasting male singing productions at high pitches: operatic style versus jazz belt or theater belt. Several stylized vocal tract shapes (caricatures) are discussed that form the bases of these styles. It is hypothesized that operatic singing uses vowels that are modified toward an inverted megaphone mouth shape for transitioning into the high-pitch range. This allows all the harmonics except the fundamental to be “lifted” over the first formant. Belting, on the other hand, uses vowels that are consistently modified toward the megaphone (trumpet-like) mouth shape. Both the fundamental and the second harmonic are then kept below the first formant. The vocal tract shapes provide collective reinforcement to multiple harmonics in the form of inertive supraglottal reactance and compliant subglottal reactance. Examples of lip openings from four well-known artists are used to infer vocal tract area functions and the corresponding reactances. PMID:19739766
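The harmonic placement described in the abstract above can be made concrete by listing which harmonics of a sung pitch lie below the first formant. This is only a bookkeeping sketch, not the nonlinear source-filter model itself; the function name and the example values of F0 and F1 are assumptions chosen for illustration.

```python
def harmonics_below_first_formant(f0_hz, f1_hz, max_harmonic=10):
    """Return the harmonic numbers k for which k * F0 lies below F1."""
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    return [k for k in range(1, max_harmonic + 1) if k * f0_hz < f1_hz]


# Assumed pitch of 400 Hz with two hypothetical first-formant settings.
print(harmonics_below_first_formant(400.0, 600.0))   # -> [1]
print(harmonics_below_first_formant(400.0, 1000.0))  # -> [1, 2]
```

The first case mirrors the operatic configuration in the abstract (only the fundamental below the first formant), and the second mirrors the belt configuration (fundamental and second harmonic below it).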
Acoustic analysis of trill sounds.
Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri
2012-04-01
In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
Acoustic changes in voice after tonsillectomy.
Saida, H; Hirose, H
1996-01-01
The vocal tract from the glottis to the lips is considered to be a resonator and the voice is changeable depending upon the shape of the vocal tract. In this report, we examined the change in pharyngeal size and acoustic feature of voice after tonsillectomy. Subjects were 20 patients. The distance between both anterior pillars (glossopalatine arches), and between both posterior pillars (pharyngopalatine arches) was measured weekly. For acoustic measurements, the five Japanese vowels and Japanese conversational sentences were recorded and analyzed. The distance between both anterior pillars became wider 2 weeks postoperatively, and tended to become narrower thereafter. The distance between both posterior pillars became wider even after 4 weeks postoperatively. No consistent changes in F0, F1 and F2 were found after surgery. Although there was a tendency for a decrease in F3, tonsillectomy did not appear to change the acoustical features of the Japanese vowels remarkably. It was assumed that the subject may adjust the shape of the vocal tract to produce consistent speech sounds after the surgery using auditory feedback.
Vocal mechanics in Darwin's finches: correlation of beak gape and song frequency.
Podos, Jeffrey; Southall, Joel A; Rossi-Santos, Marcos R
2004-02-01
Recent studies of vocal mechanics in songbirds have identified a functional role for the beak in sound production. The vocal tract (trachea and beak) filters harmonic overtones from sounds produced by the syrinx, and birds can fine-tune vocal tract resonance properties through changes in beak gape. In this study, we examine patterns of beak gape during song production in seven species of Darwin's finches of the Galápagos Islands. Our principal goals were to characterize the relationship between beak gape and vocal frequency during song production and to explore the possible influence therein of diversity in beak morphology and body size. Birds were audio and video recorded (at 30 frames s⁻¹) as they sang in the field, and 164 song sequences were analyzed. We found that song frequency regressed significantly and positively on beak gape for 38 of 56 individuals and for all seven species examined. This finding provides broad support for a resonance model of vocal tract function in Darwin's finches. Comparison among species revealed significant variation in regression y-intercept values. Body size correlated negatively with y-intercept values, although not at a statistically significant level. We failed to detect variation in regression slopes among finch species, although the regression slopes of Darwin's finch and two North American sparrow species were found to differ. Analysis within one species (Geospiza fortis) revealed significant inter-individual variation in regression parameters; these parameters did not correlate with song frequency features or plumage scores. Our results suggest that patterns of beak use during song production were conserved during the Darwin's finch adaptive radiation, despite the evolution of substantial variation in beak morphology and body size.
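The abstract above reports per-individual regressions of song frequency on beak gape and compares their slopes and y-intercepts. A minimal ordinary least-squares sketch of that kind of fit is given below; the study's actual statistical procedure is not described here, and the gape and frequency values are synthetic numbers invented purely to make the example run.

```python
def least_squares_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic beak gape (arbitrary units) and song frequency (kHz); not real data.
gape = [1.0, 2.0, 3.0, 4.0, 5.0]
frequency_khz = [2.1, 2.6, 3.2, 3.5, 4.1]
slope, intercept = least_squares_line(gape, frequency_khz)
print(f"slope={slope:.3f} kHz per unit gape, intercept={intercept:.3f} kHz")
```

A positive slope corresponds to the reported pattern of higher song frequencies at wider gapes, while differences in intercept are what the among-species comparison in the abstract refers to.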
Darawsheh, Wesam B; Natour, Yaser S; Sada, Eve G
2018-07-01
This pilot study aimed to evaluate the internal consistency, convergent construct validity and criterion validity of the Arabic version of the Vocal Tract Discomfort Scale (VTDS), and to investigate the correlation between the scores of the VTDS, the Voice Handicap Index (VHI) and the acoustic measures of fundamental frequency (F0), shimmer, jitter and signal-to-noise ratio (SNR). This was a cross-sectional study of 97 participants (47 males and 50 females; mean age 20.5 ± 2.1 years; 31 student singers and 66 other non-professional voice user students). Participants had no self-perceived voice disorders; they completed the VTDS-Arab scale and the VHI-Arab, and recorded a vocal sample of /a:/ at a comfortable level. A positive internal consistency that signifies reliability was confirmed by Cronbach's α = 0.884 and 0.874 for the VTDS-Arab frequency and severity subscales, respectively. A moderate positive correlation was found between the VTDS-Arab (frequency, severity, total) and the VHI-Arab total, where values of Pearson's correlation coefficient were r = 0.459, 0.430 and 0.451, respectively. Weak correlations were found between all of the acoustic measures and the scores of the VTDS-Arab and VHI-Arab (total and subscales). The area under the curve for the VTDS was AUC = 0.824, 0.804 and 0.817 for the VTDS frequency, VTDS severity and VTDS total, respectively. The VTDS-Arab is a valid and reliable tool in measuring vocal tract sensations and predicting the perception of vocal handicap in student singers and can be used to predict the vocal load among professional voice users.
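The internal-consistency figures in the abstract above are Cronbach's α values. For k items, α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The sketch below applies that standard formula using sample variances; the function name and the small score matrix are assumptions, and the scores are synthetic rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Synthetic ratings (4 respondents x 3 items), invented purely for illustration.
synthetic_scores = [[0, 1, 1], [2, 2, 3], [4, 3, 4], [5, 5, 6]]
print(round(cronbach_alpha(synthetic_scores), 3))
```

Values near 1 indicate that the items vary together across respondents, which is the sense in which the reported 0.884 and 0.874 are read as good reliability.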
NASA Astrophysics Data System (ADS)
Rendall, Drew; Kollias, Sophie; Ney, Christina; Lloyd, Peter
2005-02-01
Key voice features, namely fundamental frequency (F0) and formant frequencies, can vary extensively between individuals. Much of the variation can be traced to differences in the size of the larynx and vocal-tract cavities, but whether these differences in turn simply reflect differences in speaker body size (i.e., neutral vocal allometry) remains unclear. Quantitative analyses were therefore undertaken to test the relationship between speaker body size and voice F0 and formant frequencies for human vowels. To test the taxonomic generality of the relationships, the same analyses were conducted on the vowel-like grunts of baboons, whose phylogenetic proximity to humans and similar vocal production biology and voice acoustic patterns recommend them for such comparative research. For adults of both species, males were larger than females and had lower mean voice F0 and formant frequencies. However, beyond this, F0 variation did not track body-size variation between the sexes in either species, nor within sexes in humans. In humans, formant variation correlated significantly with speaker height but only in males and not in females. Implications for general vocal allometry are discussed as are implications for speech origins theories, and challenges to them, related to laryngeal position and vocal tract length.
Vocal development in a Waddington landscape
Teramoto, Yayoi; Takahashi, Daniel Y; Holmes, Philip; Ghazanfar, Asif A
2017-01-01
Vocal development is the adaptive coordination of the vocal apparatus, muscles, the nervous system, and social interaction. Here, we use a quantitative framework based on optimal control theory and Waddington’s landscape metaphor to provide an integrated view of this process. With a biomechanical model of the marmoset monkey vocal apparatus and behavioral developmental data, we show that only the combination of the developing vocal tract, vocal apparatus muscles and nervous system can fully account for the patterns of vocal development. Together, these elements influence the shape of the monkeys’ vocal developmental landscape, tilting, rotating or shifting it in different ways. We can thus use this framework to make quantitative predictions regarding how interfering factors or experimental perturbations can change the landscape within a species, or to explain comparative differences in vocal development across species. DOI: http://dx.doi.org/10.7554/eLife.20782.001 PMID:28092262
Resonance strategies used in Bulgarian women's singing style: a pilot study.
Henrich, Nathalie; Kiek, Mara; Smith, John; Wolfe, Joe
2007-01-01
Are the characteristic timbre and loudness of Bulgarian women's singing related to tuning of resonances of the vocal tract? We studied an Australian female singer, who practises and teaches Bulgarian singing technique. Two different vocal qualities of this style were studied. The louder teshka is characterized by a sonorous voice production. The less loud leka has a smoother timbre that is closer to that of the head voice register. Six vowels in each of teshka, leka and the subject's 'normal' (i.e. Western rather than Bulgarian) style were studied. The acoustic resonances of the singer's vocal tract were measured directly during singing by injecting a synthesized, broad-band acoustic current. This singer does not use resonance tuning consistently in her classical Western style. However, in both teshka and leka, she tunes the first tract resonance close to the second harmonic of the voice for most vowels. This tuning boosts the power output in the radiation field for that harmonic. This tuning also contributes to the very strong second harmonic which is a characteristic of the timbre identified as the Bulgarian style.
Morphometric Differences of Vocal Tract Articulators in Different Loudness Conditions in Singing.
Echternach, Matthias; Burk, Fabian; Burdumy, Michael; Traser, Louisa; Richter, Bernhard
2016-01-01
Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data addressing the extent to which articulation is dependent on loudness. Twelve professional singers of different voice classifications were analysed with respect to vocal tract profiles recorded with dynamic real-time MRI at 25 fps in different pitch and loudness conditions. The subjects were asked to sing ascending scales on the vowel /a/ in three loudness conditions (comfortable=mf, very soft=pp, very loud=ff, respectively). Furthermore, fundamental frequency and sound pressure level were analysed from the simultaneously recorded optical audio signal after noise cancellation. The data show articulatory differences with respect to changes of both pitch and loudness. Here, lip opening and pharynx width were increased. While the vertical larynx position rose with pitch, it was lower for greater loudness. In particular, lip opening and pharynx width were more strongly correlated with the sound pressure level than with pitch. For the vowel /a/, loudness has an effect on articulation during singing, which should be considered when articulatory vocal tract data are interpreted.
Maurer, D; Hess, M; Gross, M
1996-12-01
Theoretic investigations of the "source-filter" model have indicated a pronounced acoustic interaction of glottal source and vocal tract. Empirical investigations of formant pattern variations apart from changes in vowel identity have demonstrated a direct relationship between the fundamental frequency and the patterns. As a consequence of both findings, independence of phonation and articulation may be limited in the speech process. Within the present study, possible interdependence of phonation and phoneme was investigated: vocal fold vibrations and larynx position for vocalizations of different vowels in a healthy man and woman were examined by high-speed light-intensified digital imaging. We found 1) different movements of the vocal folds for vocalizations of different vowel identities within one speaker and at similar fundamental frequency, and 2) constant larynx position within vocalization of one vowel identity, but different positions for vocalizations of different vowel identities. A possible relationship between the vocal fold vibrations and the phoneme is discussed.
The respiratory-vocal system of songbirds: anatomy, physiology, and neural control.
Schmidt, Marc F; Martin Wild, J
2014-01-01
This wide-ranging review presents an overview of the respiratory-vocal system in songbirds, which are the only other vertebrate group known to display a degree of respiratory control during song rivalling that of humans during speech; this despite the fact that the peripheral components of both the respiratory and vocal systems differ substantially in the two groups. We first provide a brief description of these peripheral components in songbirds (lungs, air sacs and respiratory muscles, vocal organ (syrinx), upper vocal tract) and then proceed to a review of the organization of central respiratory-related neurons in the spinal cord and brainstem, the latter having an organization fundamentally similar to that of the ventral respiratory group of mammals. The second half of the review describes the nature of the motor commands generated in a specialized "cortical" song control circuit and how these might engage brainstem respiratory networks to shape the temporal structure of song. We also discuss a bilaterally projecting "respiratory-thalamic" pathway that links the respiratory system to "cortical" song control nuclei. This necessary pathway for song originates in the brainstem's primary inspiratory center and is hypothesized to play a vital role in synchronizing song motor commands both within and across hemispheres. © 2014 Elsevier B.V. All rights reserved.
The respiratory-vocal system of songbirds: Anatomy, physiology, and neural control
Schmidt, Marc F.; Wild, J. Martin
2015-01-01
This wide-ranging review presents an overview of the respiratory-vocal system in songbirds, which are the only other vertebrate group known to display a degree of respiratory control during song rivalling that of humans during speech; this despite the fact that the peripheral components of both the respiratory and vocal systems differ substantially in the two groups. We first provide a brief description of these peripheral components in songbirds (lungs, air sacs and respiratory muscles, vocal organ (syrinx), upper vocal tract) and then proceed to a review of the organization of central respiratory-related neurons in the spinal cord and brainstem, the latter having an organization fundamentally similar to that of the ventral respiratory group of mammals. The second half of the review describes the nature of the motor commands generated in a specialized “cortical” song control circuit and how these might engage brainstem respiratory networks to shape the temporal structure of song. We also discuss a bilaterally projecting “respiratory-thalamic” pathway that links the respiratory system to “cortical” song control nuclei. This necessary pathway for song originates in the brainstem’s primary inspiratory center and is hypothesized to play a vital role in synchronizing song motor commands both within and across hemispheres. PMID:25194204
A robotic voice simulator and the interactive training for hearing-impaired people.
Sawada, Hideyuki; Kitani, Mitsuki; Hayashi, Yasumori
2008-01-01
A talking and singing robot which adaptively learns the vocalization skill by means of an auditory feedback learning algorithm is being developed. The robot consists of motor-controlled vocal organs such as vocal cords, a vocal tract and a nasal cavity to generate a natural voice imitating a human vocalization. In this study, the robot is applied to the training system of speech articulation for the hearing-impaired, because the robot is able to reproduce their vocalization and to teach them how it is to be improved to generate clear speech. The paper briefly introduces the mechanical construction of the robot and how it autonomously acquires the vocalization skill in the auditory feedback learning by listening to human speech. Then the training system is described, together with the evaluation of the speech training by auditory impaired people.
Evaluation of Synthetic Self-Oscillating Models of the Vocal Folds
NASA Astrophysics Data System (ADS)
Hubler, Elizabeth P.; Weiland, Kelley S.; Hancock, Adrienne B.; Plesniak, Michael W.
2013-11-01
Approximately 30% of people will suffer from a voice disorder at some point in their lives. The probability doubles for those who rely heavily on their voice, such as teachers and singers. Synthetic vocal fold (VF) models are fabricated and evaluated experimentally in a vocal tract simulator to replicate physiological conditions. Pressure measurements are acquired along the vocal tract and high-speed images are captured at varying flow rates during VF oscillation to facilitate understanding of the characteristics of healthy and damaged VFs. The images are analyzed using a videokymography line-scan technique that has been used to examine VF motion and mucosal wave dynamics in vivo. Clinically relevant parameters calculated from the volume-velocity output of a circumferentially-vented mask (Rothenberg mask) are compared to patient data. This study integrates speech science with engineering and flow physics to overcome current limitations of synthetic VF models to properly replicate normal phonation in order to advance the understanding of resulting flow features, progression of pathological conditions, and medical techniques. Supported by the GW Institute for Biomedical Engineering (GWIBE) and GW Center for Biomimetics and Bioinspired Engineering (COBRE).
NASA Astrophysics Data System (ADS)
Erath, Byron D.; Plesniak, Michael W.
2005-09-01
In speech, sound production arises from fluid-structure interactions within the larynx as well as viscous flow phenomena that are most likely to occur during the divergent orientation of the vocal folds. Of particular interest are the flow mechanisms that influence the location of flow separation points on the vocal fold walls. Physiologically scaled pulsatile flow fields in 7.5 times real size static divergent glottal models were investigated. Three divergence angles were investigated using phase-averaged particle image velocimetry (PIV). The pulsatile glottal jet exhibited a bi-modal stability toward both glottal walls, although there was a significant amount of variance in the angle the jet deflected from the midline. Attachment of the jet to the glottal model walls through the Coanda effect occurred when the pulsatile velocity was at its maximum and the acceleration of the waveform was zero. The location of the separation and reattachment points of the flow from the glottal models was a function of the velocity waveform and divergence angle. Acoustic analogies show that a dipole sound source contribution arising from the fluid interaction (Coanda jet) with the vocal fold walls is expected. [Work funded by NIH Grant RO1 DC03577.]
Factors limiting vocal-tract length discrimination in cochlear implant simulations.
Gaudrain, Etienne; Başkent, Deniz
2015-03-01
Perception of voice characteristics allows normal hearing listeners to identify the gender of a speaker, and to better segregate speakers from each other in cocktail party situations. This benefit is largely driven by the perception of two vocal characteristics of the speaker: The fundamental frequency (F0) and the vocal-tract length (VTL). Previous studies have suggested that cochlear implant (CI) users have difficulties in perceiving these cues. The aim of the present study was to investigate possible causes for limited sensitivity to VTL differences in CI users. Different acoustic simulations of CI stimulation were implemented to characterize the role of spectral resolution on VTL, both in terms of number of channels and amount of channel interaction. The results indicate that with 12 channels, channel interaction caused by current spread is likely to prevent CI users from perceiving VTL differences typically found between male and female speakers.
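As a rough illustration of why VTL is an audible speaker cue, the sketch below uses the textbook uniform-tube approximation (a tube closed at the glottis and open at the lips), in which the n-th resonance is (2n - 1)c / (4L). This is not the method of the study above; the function name, the speed of sound (350 m/s) and the two tract lengths are assumptions chosen only for illustration.

```python
def uniform_tube_formants(vtl_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    vtl_m is the tube (vocal tract) length in metres; the n-th resonance
    is (2n - 1) * c / (4 * L), the quarter-wavelength rule.
    """
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * vtl_m)
        for n in range(1, n_formants + 1)
    ]


if __name__ == "__main__":
    # Hypothetical lengths, for illustration only (not data from the study).
    for label, vtl in (("longer tract", 0.170), ("shorter tract", 0.145)):
        formants = uniform_tube_formants(vtl)
        print(label, [round(f) for f in formants])
```

Under this approximation every resonance scales inversely with length, so a shorter tract shifts the whole formant pattern upward by the same ratio; limited spectral resolution would blur exactly this kind of uniform shift.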
NASA Astrophysics Data System (ADS)
Volodin, Ilya A.; Volodina, Elena V.; Frey, Roland; Kirilyuk, Vadim E.; Naidenko, Sergey V.
2017-06-01
In neonate ruminants, the acoustic structure of vocalizations may depend on sex, vocal anatomy, hormonal profiles and body mass and on environmental factors. In neonate wild-living Mongolian gazelles Procapra gutturosa, hand-captured during biomedical monitoring in the Daurian steppes at the Russian-Mongolian border, we spectrographically analysed distress calls and measured body mass of 22 individuals (6 males, 16 females). For 20 (5 male, 15 female) of these individuals, serum testosterone levels were also analysed. In addition, we measured relevant dimensions of the vocal apparatus (larynx, vocal folds, vocal tract) in one stillborn male Mongolian gazelle specimen. Neonate distress calls of either sex were high in maximum fundamental frequency (800-900 Hz), but the beginning and minimum fundamental frequencies were significantly lower in males than in females. Body mass was larger in males than in females. The levels of serum testosterone were marginally higher in males. No correlations were found between either body mass or serum testosterone values and any acoustic variable for males and females analysed together or separately. We discuss that the high-frequency calls of neonate Mongolian gazelles are more typical for closed-habitat neonate ruminants, whereas other open-habitat neonate ruminants (goitred gazelle Gazella subgutturosa, saiga antelope Saiga tatarica and reindeer Rangifer tarandus) produce low-frequency (<200 Hz) distress calls. Proximate cause for the high fundamental frequency of distress calls of neonate Mongolian gazelles is their very short, atypical vocal folds (4 mm) compared to the 7-mm vocal folds of neonate goitred gazelles, producing distress calls as low as 120 Hz.
Transcultural Adaptation and Validation of the German Version of the Vocal Tract Discomfort Scale.
Lukaschyk, Julia; Brockmann-Bauser, Meike; Beushausen, Ulla
2017-03-01
Currently, there is no standardized German questionnaire to assess vocal tract discomfort in voice patients. The aim of this study was to evaluate the internal consistency, reliability, and validity of the German version of the Vocal Tract Discomfort (VTD) Scale. This is a cross-sectional study. First, a cross-cultural translation and adaptation from English to German was performed. One hundred seven patients between the ages of 18 and 76 with voice disorders were divided into two different diagnosis-related groups (organic and functional voice disorder) and 50 vocally healthy adults were included. All participants completed the VTD Scale and the Voice Handicap Index (VHI). The internal consistency of the VTD Scale was analyzed through Cronbach's α coefficient. Pearson correlation between the VTD Scale and VHI total scores was used to determine criterion validity. The VTD Scale score differences related to diagnosis groups were assessed with analysis of variance. Excellent internal consistency was found (α = 0.919, P < 0.05), and criterion validity was confirmed by a high correlation between the total VTD Scale and VHI (r = 0.674). There was a significant difference between the diagnosis groups' total VTD Scale score (F[4.135] = 15.114, P = 0.000). Furthermore, the vocally healthy adults had significantly lower values than the two diagnosis groups (x̄: 11.48, s = 8.340). The German version of the VTD Scale has an excellent internal consistency and reliability, and shows high clinical validity. Thus, it is a useful instrument in voice diagnostics. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
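The two statistics named in this abstract, Cronbach's α for internal consistency and Pearson's r for criterion validity, can be sketched with the standard textbook formulas as follows. The function names and the tiny score table are hypothetical and are not data from the study; only the Python standard library is assumed.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of k item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical questionnaire: 4 respondents x 3 items (illustration only).
    table = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 5, 4]]
    print("alpha:", cronbach_alpha(table))
    print("r:", pearson_r([sum(r) for r in table], [10, 22, 6, 25]))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is read as a measure of internal consistency.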
Analysis of Measured and Simulated Supraglottal Acoustic Waves.
Fraile, Rubén; Evdokimova, Vera V; Evgrafova, Karina V; Godino-Llorente, Juan I; Skrelin, Pavel A
2016-09-01
To date, although much attention has been paid to the estimation and modeling of the voice source (ie, the glottal airflow volume velocity), the measurement and characterization of the supraglottal pressure wave have been much less studied. Some previous results have unveiled that the supraglottal pressure wave has some spectral resonances similar to those of the voice pressure wave. This makes the supraglottal wave partially intelligible. Although the explanation for such effect seems to be clearly related to the reflected pressure wave traveling upstream along the vocal tract, the influence that nonlinear source-filter interaction has on it is not as clear. This article provides an insight into this issue by comparing the acoustic analyses of measured and simulated supraglottal and voice waves. Simulations have been performed using a high-dimensional discrete vocal fold model. Results of such comparative analysis indicate that spectral resonances in the supraglottal wave are mainly caused by the regressive pressure wave that travels upstream along the vocal tract and not by source-tract interaction. On the contrary and according to simulation results, source-tract interaction has a role in the loss of intelligibility that happens in the supraglottal wave with respect to the voice wave. This loss of intelligibility mainly corresponds to spectral differences for frequencies above 1500 Hz. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Frey, Roland; Gebler, Alban; Fritsch, Guido; Nygrén, Kaarlo; Weissengruber, Gerald E
2007-01-01
Laryngeal air sacs have evolved convergently in diverse mammalian lineages including insectivores, bats, rodents, pinnipeds, ungulates and primates, but their precise function has remained elusive. Among cervids, the vocal tract of reindeer has evolved an unpaired inflatable ventrorostral laryngeal air sac. This air sac is not present at birth but emerges during ontogenetic development. It protrudes from the laryngeal vestibulum via a short duct between the epiglottis and the thyroid cartilage. In the female the growth of the air sac stops at the age of 2–3 years, whereas in males it continues to grow up to the age of about 6 years, leading to a pronounced sexual dimorphism of the air sac. In adult females it is of moderate size (about 100 cm3), whereas in adult males it is large (3000–4000 cm3) and becomes asymmetric extending either to the left or to the right side of the neck. In both adult females and males the ventral air sac walls touch the integument. In the adult male the air sac is laterally covered by the mandibular portion of the sternocephalic muscle and the skin. Both sexes of reindeer have a double stylohyoid muscle and a thyroepiglottic muscle. Possibly these muscles assist in inflation of the air sac. Head-and-neck specimens were subjected to macroscopic anatomical dissection, computer tomographic analysis and skeletonization. In addition, isolated larynges were studied for comparison. Acoustic recordings were made during an autumn round-up of semi-domestic reindeer in Finland and in a small zoo herd. Male reindeer adopt a specific posture when emitting their serial hoarse rutting calls. Head and neck are kept low and the throat region is extended. In the ventral neck region, roughly corresponding to the position of the large air sac, there is a mane of longer hairs. Neck swelling and mane spreading during vocalization may act as an optical signal to other males and females. 
The air sac, as a side branch of the vocal tract, can be considered as an additional acoustic filter. Individual acoustic recognition may have been the primary function in the evolution of a size-variable air sac, and this function is retained in mother–young communication. In males sexual selection seems to have favoured a considerable size increase of the air sac and a switch to call series instead of single calls. Vocalization became restricted to the rutting period serving the attraction of females. We propose two possibilities for the acoustic function of the air sac in vocalization that do not exclude each other. The first assumes a coupling between air sac and the environment, resulting in an acoustic output that is a combination of the vocal tract resonance frequencies emitted via mouth and nostrils and the resonance frequencies of the air sac transmitted via the neck skin. The second assumes a weak coupling so that resonance frequencies of the air sac are lost to surrounding tissues by dissipation. In this case the resonance frequencies of the air sac solely influence the signal that is further filtered by the remaining vocal tract. According to our results one acoustic effect of the air sac in adult reindeer might be to mask formants of the vocal tract proper. In other cervid species, however, formants of rutting calls convey essential information on the quality of the sender, related to its potential reproductive success, to conspecifics. Further studies are required to solve this inconsistency. PMID:17310544
Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.
Mitra, Vikramjit; Nam, Hosung; Espy-Wilson, Carol Y; Saltzman, Elliot; Goldstein, Louis
2010-09-13
Many different studies have claimed that articulatory information can be used to improve the performance of automatic speech recognition systems. Unfortunately, such articulatory information is not readily available in typical speaker-listener situations. Consequently, such information has to be estimated from the acoustic signal in a process which is usually termed "speech-inversion." This study aims to propose and compare various machine learning strategies for speech inversion: Trajectory mixture density networks (TMDNs), feedforward artificial neural networks (FF-ANN), support vector regression (SVR), autoregressive artificial neural network (AR-ANN), and distal supervised learning (DSL). Further, using a database generated by the Haskins Laboratories speech production model, we test the claim that information regarding constrictions produced by the distinct organs of the vocal tract (vocal tract variables) is superior to flesh-point information (articulatory pellet trajectories) for the inversion process.
Furuyama, Takafumi; Kobayasi, Kohta I; Riquimaroux, Hiroshi
2016-08-23
The Japanese macaque (Macaca fuscata) exhibits a species-specific communication sound called the "coo call" to locate group members and maintain within-group contact. Monkeys have been demonstrated to be capable of discriminating between individuals based only on their voices, but there is still debate regarding how the fundamental frequencies (F0) and filter properties of the vocal tract characteristics (VTC) contribute to individual discrimination in nonhuman primates. This study was performed to investigate the acoustic keys used by Japanese macaques in individual discrimination. Two animals were trained with standard Go/NoGo operant conditioning to distinguish the coo calls of two unfamiliar monkeys. The subjects were required to continue depressing a lever until the stimulus changed from one monkey to the other. The test stimuli were synthesized by combining the F0s and VTC from each individual. Both subjects released the lever when the VTC changed, whereas they did not when the F0 changed. The reaction times to the test stimuli were not significantly different from that to the training stimuli that shared the same VTC. Our data suggest that vocal tract characteristics are important for the identification of individuals by Japanese macaques.
Morphometric Differences of Vocal Tract Articulators in Different Loudness Conditions in Singing
Echternach, Matthias; Burk, Fabian; Burdumy, Michael; Traser, Louisa; Richter, Bernhard
2016-01-01
Introduction: Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data addressing the extent to which articulation is dependent on loudness. Material and Methods: 12 professional singer subjects of different voice classifications were analysed concerning the vocal tract profiles recorded with dynamic real-time MRI with 25 fps in different pitch and loudness conditions. The subjects were asked to sing ascending scales on the vowel /a/ in three loudness conditions (comfortable = mf, very soft = pp, very loud = ff, respectively). Furthermore, fundamental frequency and sound pressure level were analysed from the simultaneously recorded optical audio signal after noise cancellation. Results: The data show articulatory differences with respect to changes of both pitch and loudness. Here, lip opening and pharynx width were increased. While the vertical larynx position was rising with pitch it was lower for greater loudness. Especially, the lip opening and pharynx width were more strongly correlated with the sound pressure level than with pitch. Conclusion: For the vowel /a/ loudness has an effect on articulation during singing which should be considered when articulatory vocal tract data are interpreted. PMID:27096935
Fundamental frequency, phonation maximum time and vocal complaints in morbidly obese women
de SOUZA, Lourdes Bernadete Rocha; PEREIRA, Rayane Medeiros; dos SANTOS, Marquiony Marques; GODOY, Cynthia Meida de Almeida
2014-01-01
Background: Obese people have abnormal deposition of fat in the vocal tract that can interfere with the acoustic voice. Aim: To relate the fundamental frequency, the maximum phonation time and voice complaints from a group of morbidly obese women. Methods: Observational, cross-sectional and descriptive study that included 44 morbidly obese women (observational group; mean age 42.45 ± 10.31 years) and 30 women without obesity (control group; mean age 33.79 ± 4.51 years). The voice recording was done in a quiet environment, on a laptop using the program ANAGRAF for acoustic analysis of speech sounds. To extract the values of fundamental frequency, the subjects were asked to produce the vowel [a] at usual intensity for an average of three seconds. After the voice recording, participants were prompted to produce the sustained vowels [a], [i] and [u] at usual intensity and pitch, using a stopwatch to measure the time that each participant could hold each vowel. Results: The majority, 31 (70.5%), had vocal complaints, with a higher percentage for complaints of vocal fatigue 20 (64.51%) and voice failures 19 (61.29%), followed by dryness of the throat in 15 (48.38%) and effort to speak 13 (41.93%). There was no statistically significant difference between the two groups in mean fundamental frequency of the voice, but there was a significant difference between them in maximum phonation time. Conclusion: Increased adipose tissue in the vocal tract interfered with the vocal parameters. PMID:24676298
Automatic detection of obstructive sleep apnea using speech signals.
Goldshtein, Evgenia; Tarasiuk, Ariel; Zigel, Yaniv
2011-05-01
Obstructive sleep apnea (OSA) is a common disorder associated with anatomical abnormalities of the upper airways that affects 5% of the population. Acoustic parameters may be influenced by the vocal tract structure and soft tissue properties. We hypothesize that speech signal properties of OSA patients will be different than those of control subjects not having OSA. Using speech signal processing techniques, we explored acoustic speech features of 93 subjects who were recorded using a text-dependent speech protocol and a digital audio recorder immediately prior to polysomnography study. Following analysis of the study, subjects were divided into OSA (n=67) and non-OSA (n=26) groups. A Gaussian mixture model-based system was developed to model and classify between the groups; discriminative features such as vocal tract length and linear prediction coefficients were selected using feature selection technique. Specificity and sensitivity of 83% and 79% were achieved for the male OSA and 86% and 84% for the female OSA patients, respectively. We conclude that acoustic features from speech signals during wakefulness can detect OSA patients with good specificity and sensitivity. Such a system can be used as a basis for future development of a tool for OSA screening. © 2011 IEEE
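For readers unfamiliar with the two figures of merit quoted above, the following minimal sketch computes sensitivity and specificity from confusion-matrix counts using their standard definitions. The function name and the counts in the example are hypothetical and do not reproduce the study's per-sex results.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of patients correctly flagged.
    specificity = TN / (TN + FP): fraction of controls correctly cleared.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sens, spec = sensitivity_specificity(true_pos=40, false_neg=10,
                                         true_neg=18, false_pos=2)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A screening tool generally favours high sensitivity, since a missed patient is costlier than a control who is referred for a confirmatory sleep study.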
A three-dimensional study of the glottal jet
NASA Astrophysics Data System (ADS)
Krebs, F.; Silva, F.; Sciamarella, D.; Artana, G.
2012-05-01
This work builds upon the efforts to characterize the three-dimensional features of the glottal jet during vocal fold vibration. The study uses a Stereoscopic Particle Image Velocimetry setup on a self-oscillating physical model of the vocal folds with a uniform vocal tract. Time averages are documented and analyzed within the framework given by observations reported for jets exiting elongated nozzles. Phase averages are locked to the audio signal and used to obtain a volumetric reconstruction of the jet. From this reconstruction, the intra-cycle dynamics of the jet axis switching is disclosed.
A rare case of a sharp foreign body on the vocal cord.
Nor Hisyam, C I; Misron, K; Mohamad, I
2017-01-01
A foreign body (FB) in the upper aerodigestive tract is a common clinical problem that presents as an acute emergency. A sharp FB, such as a fish bone or chicken bone, commonly lodges in the tonsil, base of tongue, vallecula or pyriform fossa. Dislodgement of a FB into the laryngopharynx is very rare, and lodgement specifically on the vocal cord is extremely uncommon. This case report illustrates a rare case of a sharp FB that was dislodged into the airway and stuck on the right vocal cord, and which was removed under local anaesthesia.
ERIC Educational Resources Information Center
Titze, Ingo R.
2006-01-01
Purpose: Maximum flow declination rate (MFDR) in the glottis is known to correlate strongly with vocal intensity in voicing. This declination, or negative slope on the glottal airflow waveform, is in part attributable to the maximum area declination rate (MADR) and in part to the overall inertia of the air column of the vocal tract (lungs to…
2011-01-01
Vocal production requires complex planning and coordination of respiratory, laryngeal, and vocal tract movements, which are incompletely understood in most mammals. Rats produce a variety of whistles in the ultrasonic range that are of communicative relevance and of importance as a model system, but the sources of acoustic variability were mostly unknown. The goal was to identify sources of fundamental frequency variability. Subglottal pressure, tracheal airflow, and electromyographic (EMG) data from two intrinsic laryngeal muscles were measured during 22-kHz and 50-kHz call production in awake, spontaneously behaving adult male rats. During ultrasound vocalization, subglottal pressure ranged between 0.8 and 1.9 kPa. Pressure differences between call types were not significant. The relation between fundamental frequency and subglottal pressure within call types was inconsistent. Experimental manipulations of subglottal pressure had only small effects on fundamental frequency. Tracheal airflow patterns were also inconsistently associated with frequency. Pressure and flow seem to play a small role in regulation of fundamental frequency. Muscle activity, however, is precisely regulated and very sensitive to alterations, presumably because of effects on resonance properties in the vocal tract. EMG activity of cricothyroid and thyroarytenoid muscle was tonic in calls with slow or no fundamental frequency modulations, like 22-kHz and flat 50-kHz calls. Both muscles showed brief high-amplitude, alternating bursts at rates up to 150 Hz during production of frequency-modulated 50-kHz calls. A differentiated and fine regulation of intrinsic laryngeal muscles is critical for normal ultrasound vocalization. Many features of the laryngeal muscle activation pattern during ultrasound vocalization in rats are shared with other mammals. PMID:21832032
The effect of a voiced lip trill on estimated glottal closed quotient.
Gaskill, Christopher S; Erickson, Molly L
2008-11-01
The use of lip trills has been advocated for both vocal habilitation and rehabilitation. A voiced lip trill requires continuous vibration of the lips while simultaneously maintaining phonation. The mechanism of any effects of a lip trill on vocal fold vibration is still unknown. While other techniques that either constrict or artificially lengthen the vocal tract have been investigated, no studies thus far have systematically examined the effect of lip trills on vocal fold vibration. Classically trained singers and vocally untrained participants produced a lip trill for approximately 1 minute, and vocal fold closed quotient (CQ) was calculated both during the lip trill and on a sustained spoken vowel before and after the trill. Data are reported for both a group design and a single-subject design. Most participants showed a tendency for a reduction in CQ during the lip trill, with a more pronounced change in the untrained participants.
Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.
Rothenberg, Martin; Schutte, Harm K
2016-11-01
In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction mechanism that some sopranos, singing in their high range, can use to reduce the total airflow, allowing the note to be held longer, and simultaneously to enrich the quality of the voice without straining it. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Riecker, A; Ackermann, H; Wildgruber, D; Dogil, G; Grodd, W
2000-06-26
Aside from spoken language, singing represents a second mode of acoustic (auditory-vocal) communication in humans. As a new aspect of brain lateralization, functional magnetic resonance imaging (fMRI) revealed two complementary cerebral networks subserving singing and speaking. Reproduction of a non-lyrical tune elicited activation predominantly in the right motor cortex, the right anterior insula, and the left cerebellum whereas the opposite response pattern emerged during a speech task. In contrast to the hemodynamic responses within motor cortex and cerebellum, activation of the intrasylvian cortex turned out to be bound to overt task performance. These findings corroborate the assumption that the left insula supports the coordination of speech articulation. Similarly, the right insula might mediate temporo-spatial control of vocal tract musculature during overt singing. Both speech and melody production require the integration of sound structure or tonal patterns, respectively, with a speaker's emotions and attitudes. Considering the widespread interconnections with premotor cortex and limbic structures, the insula is especially suited for this task.
NASA Astrophysics Data System (ADS)
Burnett, Gregory Clell
1999-10-01
The definition, use, and physiological basis of Glottal Electromagnetic Micropower Sensors (GEMS) is presented. These sensors are a new type of low power (<20 milliwatts radiated) microwave regime (900 MHz to 2.5 GHz) multi-purpose motion sensor developed at the Lawrence Livermore National Laboratory. The GEMS are sensitive to movement in an adjustable field of view (FOV) surrounding the antennae. In this thesis, the GEMS has been utilized for speech research, targeted to receive motion signals from the subglottal region of the trachea. The GEMS signal is analyzed to determine the physiological source of the signal, and this information is used to calculate the subglottal pressure, effectively an excitation function for the human vocal tract. For the first time, an excitation function may be calculated in near real time using a noninvasive procedure. Several experiments and models are presented to demonstrate that the GEMS signal is representative of the motion of the subglottal posterior wall of the trachea as it vibrates in response to the pressure changes caused by the folds as they modulate the airflow supplied by the lungs. The vibrational properties of the tracheal wall are modeled using a lumped-element circuit model. Taking the output of the vocal tract to be the audio pressure captured by a microphone and the input to be the subglottal pressure, the transfer function of the vocal tract (including the nasal cavities) can be approximated every 10-30 milliseconds using an autoregressive moving-average model. Unlike the currently utilized method of transfer function approximation, this new method only involves noninvasive GEMS measurements and digital signal processing and does not demand the difficult task of obtaining precise physical measurements of the tract and subsequent estimation of the transfer function using its cross-sectional area. 
The ability to measure the physical motion of the trachea enables a significant number of potential applications, ranging from very accurate pitch detection to speech synthesis, speaker verification, and speech recognition.
Effect of body position on vocal tract acoustics: Acoustic pharyngometry and vowel formants.
Vorperian, Houri K; Kurtzweil, Sara L; Fourakis, Marios; Kent, Ray D; Tillman, Katelyn K; Austin, Diane
2015-08-01
The anatomic basis and articulatory features of speech production are often studied with imaging studies that are typically acquired in the supine body position. It is important to determine if changes in body orientation to the gravitational field alter vocal tract dimensions and speech acoustics. The purpose of this study was to assess the effect of body position (upright versus supine) on (1) oral and pharyngeal measurements derived from acoustic pharyngometry and (2) acoustic measurements of fundamental frequency (F0) and the first four formant frequencies (F1-F4) for the quadrilateral point vowels. Data were obtained for 27 male and female participants, aged 17 to 35 yrs. Acoustic pharyngometry showed a statistically significant effect of body position on volumetric measurements, with smaller values in the supine than upright position, but no changes in length measurements. Acoustic analyses of vowels showed significantly larger values in the supine than upright position for the variables of F0, F3, and the Euclidean distance from the centroid to each corner vowel in the F1-F2-F3 space. Changes in body position affected measurements of vocal tract volume but not length. Body position also affected the aforementioned acoustic variables, but the main vowel formants were preserved.
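The vowel-space measure mentioned above, the Euclidean distance from the centroid to each corner vowel in F1-F2-F3 space, can be sketched as below. This is only one plausible reading of the measure; the function name and the formant values are hypothetical and are not taken from the study.

```python
from math import dist


def centroid_distances(vowels):
    """Distance from the vowel-space centroid to each vowel.

    vowels: dict mapping a vowel label to its (F1, F2, F3) tuple in Hz.
    The centroid is the per-axis mean over all vowels supplied.
    """
    points = list(vowels.values())
    centroid = tuple(sum(axis) / len(points) for axis in zip(*points))
    return {name: dist(point, centroid) for name, point in vowels.items()}


if __name__ == "__main__":
    # Hypothetical corner-vowel formants in Hz (illustration only).
    corner_vowels = {
        "i": (300.0, 2300.0, 3000.0),
        "ae": (700.0, 1800.0, 2600.0),
        "a": (750.0, 1100.0, 2500.0),
        "u": (320.0, 900.0, 2300.0),
    }
    for name, d in centroid_distances(corner_vowels).items():
        print(name, round(d, 1))
```

Larger distances indicate a more expanded vowel space, so comparing the per-vowel distances between two conditions shows whether the corner vowels moved away from or toward the centre.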
Vocal tract resonances in singing: The soprano voice
NASA Astrophysics Data System (ADS)
Joliveau, Elodie; Smith, John; Wolfe, Joe
2004-10-01
The vocal tract resonances of trained soprano singers were measured while they sang a range of vowels softly at different pitches. The measurements were made by broad band acoustic excitation at the mouth, which allowed the resonances of the tract to be measured simultaneously with and independently from the harmonics of the voice. At low pitch, when the lowest resonance frequency R1 exceeded f0, the values of the first two resonances R1 and R2 varied little with frequency and had values consistent with normal speech. At higher pitches, however, when f0 exceeded the value of R1 observed at low pitch, R1 increased with f0 so that R1 was approximately equal to f0. R2 also increased over this high pitch range, probably as an incidental consequence of the tuning of R1. R3 increased slightly but systematically, across the whole pitch range measured. There was no evidence that any resonances are tuned close to harmonics of the pitch frequency except for R1 at high pitch. The variations in R1 and R2 at high pitch mean that vowels move, converge, and overlap their positions on the vocal plane (R2,R1) to an extent that implies loss of intelligibility.
Vocal power and pressure–flow relationships in excised tiger larynges
Titze, Ingo R.; Fitch, W. Tecumseh; Hunter, Eric J.; Alipour, Fariborz; Montequin, Douglas; Armstrong, Douglas L.; McGee, JoAnn; Walsh, Edward J.
2010-01-01
Despite the functional importance of loud, low-pitched vocalizations in big cats of the genus Panthera, little is known about the physics and physiology of the mechanisms producing such calls. We investigated laryngeal sound production in the laboratory using an excised-larynx setup combined with sound-level measurements and pressure–flow instrumentation. The larynges of five tigers (three Siberian or Amur, one generic non-pedigreed tiger with Bengal ancestry and one Sumatran), which had died of natural causes, were provided by Omaha's Henry Doorly Zoo over a five-year period. Anatomical investigation indicated the presence of both a rigid cartilaginous plate in the arytenoid portion of the glottis, and a vocal fold fused with a ventricular fold. Both of these features have been confusingly termed ‘vocal pads’ in the previous literature. We successfully induced phonation in all of these larynges. Our results showed that aerodynamic power in the glottis was of the order of 1.0 W for all specimens, acoustic power radiated (without a vocal tract) was of the order of 0.1 mW, and fundamental frequency ranged between 20 and 100 Hz when a lung pressure in the range of 0–2.0 kPa was applied. The mean glottal airflow increased to the order of 1.0 l s⁻¹ per 1.0 kPa of pressure, which is predictable from scaling human and canine larynges by glottal length and vibrational amplitude. Phonation threshold pressure was remarkably low, on the order of 0.3 kPa, which is lower than for human and canine larynges phonated without a vocal tract. Our results indicate that a vocal fold length approximately three times greater than that of humans is predictive of the low fundamental frequency, and the extraordinarily flat and broad medial surface of the vocal folds is predictive of the low phonation threshold pressure. PMID:21037066
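The orders of magnitude quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes the usual estimate of aerodynamic power as mean pressure times mean flow, and defines efficiency as radiated acoustic power divided by aerodynamic power; the function names are illustrative and the inputs are the rounded figures from the abstract, not measured data.

```python
def aerodynamic_power_w(pressure_kpa, flow_l_per_s):
    """Aerodynamic power in watts, estimated as mean pressure times mean flow."""
    pressure_pa = pressure_kpa * 1000.0      # kPa -> Pa
    flow_m3_per_s = flow_l_per_s / 1000.0    # l/s -> m^3/s
    return pressure_pa * flow_m3_per_s


def glottal_efficiency(acoustic_power_w, aero_power_w):
    """Ratio of radiated acoustic power to aerodynamic power (dimensionless)."""
    return acoustic_power_w / aero_power_w


# Rounded figures from the abstract: about 1.0 l/s of flow per 1.0 kPa of pressure,
# and about 0.1 mW of radiated acoustic power.
aero = aerodynamic_power_w(1.0, 1.0)
efficiency = glottal_efficiency(0.1e-3, aero)
```

With these inputs the aerodynamic power comes out at about 1 W, consistent with the reported order of magnitude, and the implied efficiency is on the order of one part in ten thousand.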
Gaskill, Christopher S; Quinney, Dana M
2012-05-01
Phonation into narrow tubes or straws has been used as a voice training and voice therapy technique and belongs to a group of techniques known as semi-occluded vocal tract exercises. The use of what are called resonance tubes has received renewed attention in the voice research literature, in both theoretical and empirical studies. The assumption is that the partially occluded and lengthened vocal tract alters supraglottal acoustics in such a way as to allow phonation near a lowered first vocal tract formant, which has been suggested as a way to bring about a more efficient glottal closure pattern for sustained oscillation. In this study, two groups of male participants, 10 with no vocal training and 10 with classical vocal training, phonated into a resonance tube for approximately 1 minute. Electroglottography was used to estimate glottal contact quotient (CQ) during spoken /a/ vowels before tube phonation, during tube phonation, and again during spoken /a/ vowels after tube phonation. Half of each group of participants was made to keep pitch and loudness consistent for all phases of the experiment, replicating the method of a previous study by this author. The other half was instructed to practice phonating into the resonance tube before collecting data and was encouraged to find a pitch and loudness combination that maximized ease of phonation and a sense of forward oral resonance. Glottal CQ altered considerably from baseline for almost all participants during tube phonation, with a larger variability than that during vowel production. Small differences in glottal CQ were found as a function of training and instruction, with most participants' CQ increasing during tube phonation. A small post-tube phonation effect was found primarily for the trained and instructed group. Secondary single-subject analyses revealed large intersubject variation, highlighting the highly individualized response to the resonance tube task. 
Continued study of resonance tubes is recommended, comparing both male and female as well as vocally trained and untrained participants. Future studies should continue to examine systematic variations in task instruction, length of practice, and resonance tube dimensions. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Acoustic passaggio pedagogy for the male voice.
Bozeman, Kenneth Wood
2013-07-01
Awareness of interactions between the lower harmonics of the voice source and the first formant of the vocal tract, and of the passive vowel modifications that accompany them, can assist in working out a smooth transition through the passaggio of the male voice. A stable vocal tract length establishes the general location of all formants, including the higher formants that form the singer's formant cluster. Untrained males instinctively shorten the tube to preserve the strong F1/H2 acoustic coupling of voce aperta, resulting in 'yell' timbre. If tube length and shape are kept stable during pitch ascent, the yell can be avoided by allowing the second harmonic to rise above the first formant, creating the balanced timbre of voce chiusa.
Formant characteristics of human laughter.
Szameitat, Diana P; Darwin, Chris J; Szameitat, André J; Wildgruber, Dirk; Alter, Kai
2011-01-01
Although laughter is an important aspect of nonverbal vocalization, its acoustic properties are still not fully understood. Extreme articulation during laughter production, such as wide jaw opening, suggests that laughter can have very high first formant (F1) frequencies. We measured fundamental frequency and formant frequencies of the vowels produced in the vocalic segments of laughter. Vocalic segments showed higher average F1 frequencies than those previously reported and individual values could be as high as 1100 Hz for male speakers and 1500 Hz for female speakers. To our knowledge, these are the highest F1 frequencies reported to date for human vocalizations, exceeding even the F1 frequencies reported for trained soprano singers. These exceptionally high F1 values are likely to be based on the extreme positions adopted by the vocal tract during laughter in combination with physiological constraints accompanying the production of a "pressed" voice. Copyright © 2011 The Voice Foundation. All rights reserved.
Volodin, Ilya A; Matrosova, Vera A; Frey, Roland; Kozhevnikova, Julia D; Isaeva, Inna L; Volodina, Elena V
2018-06-11
Non-hibernating pikas collect winter food reserves and store them in hay piles. Individualization of alarm calls might allow discrimination between colony members and conspecifics trying to steal food items from a colony pile. We investigated vocal posture, vocal tract length, and individual acoustic variation of alarm calls, emitted by wild-living Altai pikas Ochotona alpina toward a researcher. Recording started when a pika started calling and lasted as long as possible. The alarm call series of 442 individual callers from different colonies consisted of discrete short (0.073-0.157 s), high-frequency (7.31-15.46 kHz), and frequency-modulated calls separated by irregular intervals. Analysis of 442 discrete calls, the second of each series, revealed that 44.34% calls lacked nonlinear phenomena, in 7.02% nonlinear phenomena covered less than half of call duration, and in 48.64% nonlinear phenomena covered more than half of call duration. Peak frequencies varied among individuals but always fitted one of three maxima corresponding to the vocal tract resonance frequencies (formants) calculated for an estimated 45-mm oral vocal tract. Discriminant analysis using variables of 8 calls per series of 36 different callers, each from a different colony, correctly assigned over 90% of the calls to individuals. Consequently, Altai pika alarm calls are individualistic and nonlinear phenomena might further increase this acoustic individualization. Additionally, video analysis revealed a call-synchronous, very fast (0.13-0.23 s) folding, depression, and subsequent re-expansion of the pinna confirming an earlier report of this behavior that apparently contributes to protecting the hearing apparatus from damage by the self-generated high-intensity alarm calls.
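The abstract does not say how the resonance frequencies were calculated for the 45-mm tract. One common first approximation, assumed here purely for illustration, treats the vocal tract as a uniform tube closed at the source and open at the mouth, whose resonances are odd multiples of c/(4L). The function name and the speed of sound (350 m/s) are assumptions and are not taken from the study.

```python
def uniform_tube_formants(length_m, n_formants=4, speed_of_sound=350.0):
    """Quarter-wavelength resonances (Hz) of a uniform tube closed at one end.

    F_n = (2n - 1) * c / (4 * L), for n = 1, 2, ...
    The speed of sound default (350 m/s) is an assumed value for warm, humid air.
    """
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_formants + 1)
    ]


# 45 mm is the estimated oral vocal tract length quoted in the abstract.
formants = uniform_tube_formants(0.045)
```

Under these assumptions the first resonance falls near 1.9 kHz and the next three near 5.8, 9.7 and 13.6 kHz; whether these correspond to the three maxima reported in the study cannot be confirmed from the abstract alone.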
Experiments on Analysing Voice Production: Excised (Human, Animal) and In Vivo (Animal) Approaches
Döllinger, Michael; Kobler, James; Berry, David A.; Mehta, Daryush D.; Luegmair, Georg; Bohr, Christopher
2015-01-01
Experiments on human and on animal excised specimens as well as in vivo animal preparations are so far the most realistic approaches to simulate the in vivo process of human phonation. These experiments do not have the disadvantage of limited space within the neck and enable studies of the actual organ necessary for phonation, i.e., the larynx. The studies additionally allow the analysis of flow, vocal fold dynamics, and resulting acoustics in relation to well-defined laryngeal alterations. Purpose of Review: This paper provides an overview of the applications and usefulness of excised (human/animal) specimen and in vivo animal experiments in voice research. These experiments have enabled visualization and analysis of dehydration effects, vocal fold scarring, bifurcation and chaotic vibrations, three-dimensional vibrations, aerodynamic effects, and mucosal wave propagation along the medial surface. Quantitative data will be shown to give an overview of measured laryngeal parameter values. As yet, a full understanding of all existing interactions in voice production has not been achieved, and thus, where possible, we try to indicate areas needing further study. Recent Findings: A further motivation behind this review is to highlight recent findings and technologies related to the study of vocal fold dynamics and its applications. For example, studies of interactions between vocal tract airflow and generation of acoustics have recently shown that airflow superior to the glottis is governed by not only vocal fold dynamics but also by subglottal and supraglottal structures. In addition, promising new methods to investigate kinematics and dynamics have been reported recently, including dynamic optical coherence tomography, X-ray stroboscopy and three-dimensional reconstruction with laser projection systems. Finally, we touch on the relevance of vocal fold dynamics to clinical laryngology and to clinically-oriented research. PMID:26581597
How do clarinet players adjust the resonances of their vocal tracts for different playing effects?
NASA Astrophysics Data System (ADS)
Fritz, Claudia; Wolfe, Joe
2005-11-01
In a simple model, the reed of the clarinet is mechanically loaded by the series combination of the acoustical impedances of the instrument itself and of the player's airway. Here we measure the complex impedance spectrum of players' airways using an impedance head adapted to fit inside a clarinet mouthpiece. A direct current shunt with high acoustical resistance allows players to blow normally, so the players can simulate the tract condition under playing conditions. The reproducibility of the results suggests that the players' "muscle memory" is reliable for this task. Most players use a single, highly stable vocal tract configuration over most of the playing range, except for the altissimo register. However, this "normal" configuration varies substantially among musicians. All musicians change the configuration, often drastically, for "special effects" such as glissandi and slurs: the tongue is lowered and the impedance magnitude reduced when the player intends to lower the pitch or to slur downwards, and vice versa.
A fast and flexible MRI system for the study of dynamic vocal tract shaping.
Lingala, Sajan Goud; Zhu, Yinghua; Kim, Yoon-Chul; Toutios, Asterios; Narayanan, Shrikanth; Nayak, Krishna S
2017-01-01
The aim of this work was to develop and evaluate an MRI-based system for study of dynamic vocal tract shaping during speech production, which provides high spatial and temporal resolution. The proposed system utilizes (a) custom eight-channel upper airway coils that have high sensitivity to upper airway regions of interest, (b) two-dimensional golden angle spiral gradient echo acquisition, (c) on-the-fly view-sharing reconstruction, and (d) off-line temporal finite difference constrained reconstruction. The system also provides simultaneous noise-cancelled and temporally aligned audio. The system is evaluated in 3 healthy volunteers, and 1 tongue cancer patient, with a broad range of speech tasks. We report spatiotemporal resolutions of 2.4 × 2.4 mm² every 12 ms for single-slice imaging, and 2.4 × 2.4 mm² every 36 ms for three-slice imaging, which reflects roughly 7-fold acceleration over Nyquist sampling. This system demonstrates improved temporal fidelity in capturing rapid vocal tract shaping for tasks, such as producing consonant clusters in speech, and beat-boxing sounds. Novel acoustic-articulatory analysis was also demonstrated. A synergistic combination of custom coils, spiral acquisitions, and constrained reconstruction enables visualization of rapid speech with high spatiotemporal resolution in multiple planes. Magn Reson Med 77:112-125, 2017. © 2016 Wiley Periodicals, Inc.
The singing/acting mature adult--singing instruction perspective.
Westerman Gregg, J
1997-06-01
Complete knowledge of anatomy and physiology of the vocal mechanism and tract is essential for the voice teacher to be maximally effective. Possible contributing factors to vocal attrition in the mature singer/actor are outlined: poor posture, inadequate respiratory function, lack of adequate hydration, phonatory hyperfunction, habitual speaking pitch at too low a frequency, lack of resonance, tongue tension affecting phonation, resonation, and articulation. Techniques for rehabilitation of the damaged voice are recommended.
Tokuda, Isao T; Shimamura, Ryo
2017-08-01
As an alternative factor to produce asymmetry between left and right vocal folds, the present study focuses on level difference, which is defined as the distance between the upper surfaces of the bilateral vocal folds in the inferior-superior direction. Physical models of the vocal folds were utilized to study the effect of the level difference on the phonation threshold pressure. A vocal tract model was also attached to the vocal fold model. For two types of different models, experiments revealed that the phonation threshold pressure tended to increase as the level difference was extended. Based upon a small amplitude approximation of the vocal fold oscillations, a theoretical formula was derived for the phonation threshold pressure. This theory agrees with the experiments, especially when the phase difference between the left and right vocal folds is not extensive. Furthermore, an asymmetric two-mass model was simulated with a level difference to validate the experiments as well as the theory. The primary conclusion is that the level difference has a potential effect on voice production especially for patients with an extended level of vertical difference in the vocal folds, which might be taken into account for the diagnosis of voice disorders.
Reby, D; Wyman, M T; Frey, R; Passilongo, D; Gilbert, J; Locatelli, Y; Charlton, B D
2016-04-15
With an average male body mass of 320 kg, the wapiti, Cervus canadensis, is the largest extant species of Old World deer (Cervinae). Despite this large body size, male wapiti produce whistle-like sexual calls called bugles characterised by an extremely high fundamental frequency. Investigations of the biometry and physiology of the male wapiti's relatively large larynx have so far failed to account for the production of such a high fundamental frequency. Our examination of spectrograms of male bugles suggested that the complex harmonic structure is best explained by a dual-source model (biphonation), with one source oscillating at a mean of 145 Hz (F0) and the other oscillating independently at an average of 1426 Hz (G0). A combination of anatomical investigations and acoustical modelling indicated that the F0 of male bugles is consistent with the vocal fold dimensions reported in this species, whereas the secondary, much higher source at G0 is more consistent with an aerodynamic whistle produced as air flows rapidly through a narrow supraglottic constriction. We also report a possible interaction between the higher frequency G0 and vocal tract resonances, as G0 transiently locks onto individual formants as the vocal tract is extended. We speculate that male wapiti have evolved such a dual-source phonation to advertise body size at close range (with a relatively low-frequency F0 providing a dense spectrum to highlight size-related information contained in formants) while simultaneously advertising their presence over greater distances using the very high-amplitude G0 whistle component. © 2016. Published by The Company of Biologists Ltd.
Elie, Julie E.; Theunissen, Frédéric E.
2018-01-01
Although a universal code for the acoustic features of animal vocal communication calls may not exist, the thorough analysis of the distinctive acoustical features of vocalization categories is important not only to decipher the acoustical code for a specific species but also to understand the evolution of communication signals and the mechanisms used to produce and understand them. Here, we recorded more than 8,000 examples of almost all the vocalizations of the domesticated zebra finch, Taeniopygia guttata: vocalizations produced to establish contact, to form and maintain pair bonds, to sound an alarm, to communicate distress or to advertise hunger or aggressive intents. We characterized each vocalization type using complete representations that avoided any a priori assumptions on the acoustic code, as well as classical bioacoustics measures that could provide more intuitive interpretations. We then used these acoustical features to rigorously determine the potential information-bearing acoustical features for each vocalization type using both a novel regularized classifier and an unsupervised clustering algorithm. Vocalization categories are discriminated by the shape of their frequency spectrum and by their pitch saliency (noisy to tonal vocalizations) but not particularly by their fundamental frequency. Notably, the spectral shape of zebra finch vocalizations contains peaks or formants that vary systematically across categories and that would be generated by active control of both the vocal organ (source) and the upper vocal tract (filter). PMID:26581377
Scheerer, N E; Jacobson, D S; Jones, J A
2016-02-09
Auditory feedback plays an important role in the acquisition of fluent speech; however, this role may change once speech is acquired and individuals no longer experience persistent developmental changes to the brain and vocal tract. For this reason, we investigated whether the role of auditory feedback in sensorimotor learning differs across children and adult speakers. Participants produced vocalizations while they heard their vocal pitch predictably or unpredictably shifted downward one semitone. The participants' vocal pitches were measured at the beginning of each vocalization, before auditory feedback was available, to assess the extent to which the deviant auditory feedback modified subsequent speech motor commands. Sensorimotor learning was observed in both children and adults, with participants' initial vocal pitch increasing following trials where they were exposed to predictable, but not unpredictable, frequency-altered feedback. Participants' vocal pitch was also measured across each vocalization, to index the extent to which the deviant auditory feedback was used to modify ongoing vocalizations. While both children and adults were found to increase their vocal pitch following predictable and unpredictable changes to their auditory feedback, adults produced larger compensatory responses. The results of the current study demonstrate that both children and adults rapidly integrate information derived from their auditory feedback to modify subsequent speech motor commands. However, these results also demonstrate that children and adults differ in their ability to use auditory feedback to generate compensatory vocal responses during ongoing vocalization. Since vocal variability also differed across the children and adult groups, these results also suggest that compensatory vocal responses to frequency-altered feedback manipulations initiated at vocalization onset may be modulated by vocal variability. Copyright © 2015 IBRO. Published by Elsevier Ltd. 
All rights reserved.
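The one-semitone downward shift used as the feedback manipulation in this abstract corresponds to a fixed frequency ratio. The sketch below assumes equal-tempered semitones (a ratio of 2^(1/12) per semitone, 100 cents each); the function names and the 220 Hz example pitch are illustrative and do not come from the study.

```python
import math


def shift_semitones(f0_hz, semitones):
    """Return f0 shifted by a number of equal-tempered semitones (negative = down)."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def cents_between(f_ref_hz, f_hz):
    """Return the interval from f_ref_hz to f_hz in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


# Hypothetical produced pitch of 220 Hz, heard shifted down by one semitone.
produced = 220.0
heard = shift_semitones(produced, -1)
shift_cents = cents_between(produced, heard)
```

Expressing responses in cents rather than hertz makes compensation magnitudes comparable across speakers with different baseline pitches, which matters when comparing children and adults.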
NASA Astrophysics Data System (ADS)
1992-06-01
Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.
Nouraei, S A R; Allen, J; Kaddour, H; Middleton, S E; Aylin, P; Darzi, A; Tolley, N S
2017-12-01
Thyroidectomy is the commonest operation that places normally functioning laryngeal nerves at risk of injury. Vocal palsy is a major risk factor for dysphonia, dysphagia, and less commonly, airway obstruction. We investigated the association between post-thyroidectomy vocal palsy and long-term risks of pneumonia and laryngeal failure. An N=near-all analysis of the English administrative dataset using a previously validated informatics algorithm to identify young and otherwise low-risk patients undergoing first-time elective thyroidectomy for benign disease. Information about age, sex, morbidities, social deprivation and post-operative and late complications were derived. Between 2004 and 2012, 43 515 patients between the ages of 20 and 69 who had no history of cancer, neurological, or respiratory disease underwent elective total or hemithyroidectomy without concomitant or late neck dissection, parathyroidectomy or laryngotracheal surgery for benign thyroid disease for the first and only time. Information about age, sex, morbidities and in-hospital and late complications was recorded. Mean age at surgery was 46±12. There was a strong female preponderance (85%), and most patients (89%) had no recorded Charlson comorbidities. Most patients (65%) underwent hemithyroidectomy. Late vocal palsy was recorded in 449 (1.03%) patients, and its occurrence was an independent risk factor for emergency hospital readmission (n=7113; Hazard Ratio 1.52; 95% confidence interval 1.21-1.91), hospitalisation for lower respiratory tract infection (n=944; HR 2.04; 95% CI 1.07-3.75), dysphagia (n=564; HR 3.47; 95% CI 1.57-7.65) and gastrostomy/tracheostomy placement (n=80; HR 20.8; 95% CI 2.5-171.2). Independent risk factors for late vocal palsy were age, burden of morbidities, total thyroidectomy, post-operative bleeding, male sex, and annual surgeon volume <30.
There is a significant association between post-thyroidectomy vocal palsy and long-term risks of hospital readmission, dysphagia, hospitalisation for lower respiratory tract infection, and gastrostomy/tracheostomy tube placement. This adds weight to the need, from a thyroid surgical perspective, to undertake universal post-thyroidectomy laryngeal surveillance as a minimum standard of care, with a focus on post-operative dysphagia and aspiration, and from a medical/respiratory perspective, to initiate investigations to identify occult vocal palsy in patients who present with pneumonia, who have a history of thyroid surgery. © 2017 John Wiley & Sons Ltd.
The impact of rate reduction and increased vocal intensity on coarticulation in dysarthria
NASA Astrophysics Data System (ADS)
Tjaden, Kris
2003-04-01
The dysarthrias are a group of speech disorders resulting from impairment to nervous system structures important for the motor execution of speech. Although numerous studies have examined how dysarthria impacts articulatory movements or changes in vocal tract shape, few studies of dysarthria consider that articulatory events and their acoustic consequences overlap or are coarticulated in connected speech. The impact of rate, loudness, and clarity on coarticulatory patterns in dysarthria also are poorly understood, although these prosodic manipulations frequently are employed as therapy strategies to improve intelligibility in dysarthria and also are known to affect coarticulatory patterns for at least some neurologically healthy speakers. The current study examined the effects of slowed rate and increased vocal intensity on anticipatory coarticulation for speakers with dysarthria secondary to Multiple Sclerosis (MS), as inferred from the acoustic signal. Healthy speakers were studied for comparison purposes. Three repetitions of twelve target words embedded in the carrier phrase "It's a -- again" were produced in habitual, loud, and slow speaking conditions. F2 frequencies and first moment coefficients were used to infer coarticulation. Both group and individual speaker trends will be examined in the data analyses.
Adapted to Roar: Functional Morphology of Tiger and Lion Vocal Folds
Klemuk, Sarah A.; Riede, Tobias; Walsh, Edward J.; Titze, Ingo R.
2011-01-01
Vocal production requires active control of the respiratory system, larynx and vocal tract. Vocal sounds in mammals are produced by flow-induced vocal fold oscillation, which requires vocal fold tissue that can sustain the mechanical stress during phonation. Our understanding of the relationship between morphology and vocal function of vocal folds is very limited. Here we tested the hypothesis that vocal fold morphology and viscoelastic properties allow a prediction of fundamental frequency range of sounds that can be produced, and minimal lung pressure necessary to initiate phonation. We tested the hypothesis in lions and tigers who are well-known for producing low frequency and very loud roaring sounds that expose vocal folds to large stresses. In histological sections, we found that the Panthera vocal fold lamina propria consists of a lateral region with adipocytes embedded in a network of collagen and elastin fibers and hyaluronan. There is also a medial region that contains only fibrous proteins and hyaluronan but no fat cells. Young's moduli range between 10 and 2000 kPa for strains up to 60%. Shear moduli ranged between 0.1 and 2 kPa and differed between layers. Biomechanical and morphological data were used to make predictions of fundamental frequency and subglottal pressure ranges. Such predictions agreed well with measurements from natural phonation and phonation of excised larynges, respectively. We assume that fat shapes Panthera vocal folds into an advantageous geometry for phonation and it protects vocal folds. Its primary function is probably not to increase vocal fold mass as suggested previously. The large square-shaped Panthera vocal fold eases phonation onset and thereby extends the dynamic range of the voice. PMID:22073246
A Randomized Controlled Trial of Two Semi-Occluded Vocal Tract Voice Therapy Protocols
Hunter, Eric J.; Kirkham, Kimberly; Cox, Karin; Titze, Ingo R.
2015-01-01
Purpose: Although there is a long history of use of semi-occluded vocal tract gestures in voice therapy, including phonation through thin tubes or straws, the efficacy of phonation through tubes has not been established. This study compares results from a therapy program on the basis of phonation through a flow-resistant tube (FRT) with Vocal Function Exercises (VFE), an established set of exercises that utilize oral semi-occlusions. Method: Twenty subjects (16 women, 4 men) with dysphonia and/or vocal fatigue were randomly assigned to 1 of 4 treatment conditions: (a) immediate FRT therapy, (b) immediate VFE therapy, (c) delayed FRT therapy, or (d) delayed VFE therapy. Subjects receiving delayed therapy served as a no-treatment control group. Results: Voice Handicap Index (Jacobson et al., 1997) scores showed significant improvement for both treatment groups relative to the no-treatment group. Comparison of the effect sizes suggests FRT therapy is noninferior to VFE in terms of reduction in Voice Handicap Index scores. Significant reductions in Roughness on the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were found for the FRT subjects, with no other significant voice quality findings. Conclusions: VFE and FRT therapy may improve voice quality of life in some individuals with dysphonia. FRT therapy was noninferior to VFE in improving voice quality of life in this study. PMID:25675335
Endoscopic laterofixation in bilateral vocal cords paralysis in children.
Lidia, Zawadzka-Glos; Magdalena, Frackiewicz; Mieczyslaw, Chmielik
2010-06-01
Vocal cord paralysis is the second most frequent cause of laryngeal stridor in children. Symptoms of congenital vocal cord paralysis can occur shortly after birth or later. Vocal cord paralysis can be unilateral or bilateral. Symptoms of unilateral paralysis include a hoarse cry or stridor during deep inhalation. In children, unilateral vocal cord paralysis often resolves spontaneously or can be completely compensated. Children with bilateral vocal cord paralysis present mainly with breathing disorders, while phonation is normal. Symptoms vary, ranging from complete airway obstruction to mild symptoms related to poor exercise tolerance. When symptoms are severe, patients in this group require a tracheotomy. Failure to recover normal vocal cord function, or incomplete compensation with persistent symptoms, is an indication for surgical treatment. The aim of this study is to present the results of treating bilateral vocal cord paralysis in children using endoscopic laterofixation of the vocal cords. In the Pediatric ENT Department between 1998 and 2009, sixty-four children with dyspnoea and/or phonation disorders caused by vocal cord paralysis were treated. In ten cases laterofixation of the vocal cords was performed, in most cases with a good result. In this article the authors present the method of endoscopic laterofixation and the results achieved. Endoscopic laterofixation of the vocal cords in children is a safe and easy method of surgical treatment of bilateral vocal cord paralysis. This method can be used as a first and often as a one-stage treatment of vocal cord paralysis. In some cases this procedure is insufficient and has to be supplemented with other methods. Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.
The evolution of speech: a comparative review.
Fitch
2000-07-01
The evolution of speech can be studied independently of the evolution of language, with the advantage that most aspects of speech acoustics, physiology and neural control are shared with animals, and thus open to empirical investigation. At least two changes were necessary prerequisites for modern human speech abilities: (1) modification of vocal tract morphology, and (2) development of vocal imitative ability. Despite an extensive literature, attempts to pinpoint the timing of these changes using fossil data have proven inconclusive. However, recent comparative data from nonhuman primates have shed light on the ancestral use of formants (a crucial cue in human speech) to identify individuals and gauge body size. Second, comparative analysis of the diverse vertebrates that have evolved vocal imitation (humans, cetaceans, seals and birds) provides several distinct, testable hypotheses about the adaptive function of vocal mimicry. These developments suggest that, for understanding the evolution of speech, comparative analysis of living species provides a viable alternative to fossil data. However, the neural basis for vocal mimicry and for mimesis in general remains unknown.
Přibil, Jiří; Přibilová, Anna; Frollo, Ivan
2018-04-05
This article compares open-air and whole-body magnetic resonance imaging (MRI) equipment working with a weak magnetic field as regards the methods of its generation, the spectral properties of mechanical vibration and acoustic noise produced by gradient coils during the scanning process, and the measured noise intensity. These devices are used for non-invasive MRI reconstruction of the human vocal tract during phonation with simultaneous speech recording. In this case, the vibration and noise have a negative influence on the quality of the speech signal. Two basic measurement experiments were performed: mapping sound pressure levels in the vicinity of the MRI device and picking up vibration and noise signals in the MRI scanning area. Spectral characteristics of these signals are then analyzed statistically and compared visually and numerically.
Histopathologic study of human vocal fold mucosa unphonated over a decade.
Sato, Kiminori; Umeno, Hirohito; Ono, Takeharu; Nakashima, Tadashi
2011-12-01
Mechanotransduction caused by vocal fold vibration could possibly be an important factor in the maintenance of extracellular matrices and layered structure of the human adult vocal fold mucosa as a vibrating tissue after the layered structure has been completed. Vocal fold stellate cells (VFSCs) in the human maculae flavae of the vocal fold mucosa are inferred to be involved in the metabolism of extracellular matrices of the vocal fold mucosa. Maculae flavae are also considered to be an important structure in the growth and development of the human vocal fold mucosa. Tension caused by phonation (vocal fold vibration) is hypothesized to stimulate the VFSCs to accelerate production of extracellular matrices. A human adult vocal fold mucosa unphonated over a decade was investigated histopathologically. Vocal fold mucosa unphonated for 11 years and 2 months of a 64-year-old male with cerebral hemorrhage was investigated by light and electron microscopy. The vocal fold mucosae (including maculae flavae) were atrophic. The vocal fold mucosa did not have a vocal ligament, Reinke's space or a layered structure. The lamina propria appeared as a uniform structure. Morphologically, the VFSCs synthesized fewer extracellular matrices, such as fibrous protein and glycosaminoglycan. Consequently, VFSCs appeared to decrease their level of activity.
Acoustics of the trained versus untrained singing voice.
Howard, David M
2009-06-01
Acoustic voice analysis is now widely available on today's multimedia computers, and knowledge of the acoustics of the trained and untrained singing voice has advanced dramatically in recent years. New techniques have emerged that are providing clearer representations of aspects of the physiology of voice function and a greater understanding of the differences between the voices of untrained and trained singers. Improvements in endoscope technology are changing understanding of vocal fold function, and videokymography provides a new way of interpreting the output; some new and interesting possibilities are emerging. Larynx height variation is a feature of untrained singing and of singing in different styles, and its measurement has been inaccurate hitherto; perhaps the laryngoaltimeter will provide a solution. Magnetic resonance imaging is now a vital tool for vocal tract shape measurement, but a new bio-inspired computing approach is offering a possible alternative. Differences between an untrained and trained singing voice lie in one or more of breathing technique, larynx settings or vocal tract settings. Measurement techniques in each of these areas are important to provide data on the singing voice, and accurate data are essential for natural personalized electronic voice synthesis in the future.
Temporal processing of speech in a time-feature space
NASA Astrophysics Data System (ADS)
Avendano, Carlos
1997-09-01
The performance of speech communication systems often degrades under realistic environmental conditions. Adverse environmental factors include additive noise sources, room reverberation, and transmission channel distortions. This work studies the processing of speech in the temporal-feature or modulation spectrum domain, aiming for alleviation of the effects of such disturbances. Speech reflects the geometry of the vocal organs, and the linguistically dominant component is in the shape of the vocal tract. At any given point in time, the shape of the vocal tract is reflected in the short-time spectral envelope of the speech signal. The rate of change of the vocal tract shape appears to be important for the identification of linguistic components. This rate of change, or the rate of change of the short-time spectral envelope can be described by the modulation spectrum, i.e. the spectrum of the time trajectories described by the short-time spectral envelope. For a wide range of frequency bands, the modulation spectrum of speech exhibits a maximum at about 4 Hz, the average syllabic rate. Disturbances often have modulation frequency components outside the speech range, and could in principle be attenuated without significantly affecting the range with relevant linguistic information. Early efforts for exploiting the modulation spectrum domain (temporal processing), such as the dynamic cepstrum or the RASTA processing, used ad hoc designed processing and appear to be suboptimal. As a major contribution, in this dissertation we aim for a systematic data-driven design of temporal processing. First we analytically derive and discuss some properties and merits of temporal processing for speech signals. We attempt to formalize the concept and provide a theoretical background which has been lacking in the field. 
In the experimental part we apply temporal processing to a number of problems including adaptive noise reduction in cellular telephone environments, reduction of reverberation for speech enhancement, and improvements on automatic recognition of speech degraded by linear distortions and reverberation.
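The modulation spectrum described above can be illustrated with a minimal sketch. This is not the dissertation's data-driven method: it only takes one time trajectory of a band energy (already sampled at the frame rate), removes its mean, and computes the magnitude spectrum of that trajectory. The function name, the 100 Hz frame rate, and the synthetic 4 Hz trajectory are all assumptions chosen for illustration.

```python
import cmath
import math


def modulation_spectrum(trajectory, frame_rate_hz):
    """Return (modulation_frequency_hz, magnitude) pairs for one envelope trajectory.

    `trajectory` is a hypothetical sequence of band energies, one value per
    analysis frame. A plain DFT is used so the sketch needs only the stdlib.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    centered = [x - mean for x in trajectory]  # drop the DC component
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        spectrum.append((k * frame_rate_hz / n, abs(coeff) / n))
    return spectrum


# Synthetic band-energy trajectory: 2 s at an assumed 100 frames/s,
# fluctuating at 4 Hz (roughly the syllabic rate mentioned in the abstract).
frame_rate_hz = 100.0
trajectory = [1.0 + 0.5 * math.cos(2 * math.pi * 4.0 * i / frame_rate_hz)
              for i in range(200)]

spectrum = modulation_spectrum(trajectory, frame_rate_hz)
peak_hz, peak_mag = max(spectrum[1:], key=lambda pair: pair[1])
print(peak_hz)
```

In a real system the trajectory would come from a short-time spectral analysis of speech, and temporal processing would then filter each trajectory to keep the modulation frequencies that carry linguistic information.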
Vocal specialization through tracheal elongation in an extinct Miocene pheasant from China.
Li, Zhiheng; Clarke, Julia A; Eliason, Chad M; Stidham, Thomas A; Deng, Tao; Zhou, Zhonghe
2018-05-25
Modifications to the upper vocal tract involving hyper-elongated tracheae have evolved many times within crown birds, and their evolution has been linked to a 'size exaggeration' hypothesis in acoustic signaling and communication, whereby smaller-sized birds can produce louder sounds. A fossil skeleton of a new extinct species of wildfowl (Galliformes: Phasianidae) from the late Miocene of China, preserves an elongated, coiled trachea that represents the oldest fossil record of this vocal modification in birds and the first documentation of its evolution within pheasants. The phylogenetic position of this species within Phasianidae has not been fully resolved, but appears to document a separate independent origination of this vocal modification within Galliformes. The fossil preserves a coiled section of the trachea and other remains supporting a tracheal length longer than the bird's body. This extinct species likely produced vocalizations with a lower fundamental frequency and reduced harmonics compared to similarly-sized pheasants. The independent evolution of this vocal feature in galliforms living in both open and closed habitats does not appear to be correlated with other factors of biology or its open savanna-like habitat. Features present in the fossil that are typically associated with sexual dimorphism suggest that sexual selection may have resulted in the evolution of both the morphology and vocalization mechanism in this extinct species.
Feminization laryngoplasty: assessment of surgical pitch elevation.
Thomas, James P; Macmillan, Cody
2013-09-01
The aim of this study is to analyze change in pitch following feminization laryngoplasty, a technique to alter the vocal tract of male-to-female transgender patients. This is a retrospective review of 94 patients undergoing feminization laryngoplasty between June 2002 and April 2012, of whom 76 completed follow-up audio recordings. Feminization laryngoplasty is a procedure that removes the anterior thyroid cartilage, collapsing the diameter of the larynx as well as shortening and tensioning the vocal folds to raise the pitch. Changes in comfortable speaking pitch, lowest vocal pitch and highest vocal pitch are assessed before and after surgery. Acoustic parameters of speaking pitch and vocal range were compared between pre- and postoperative results. The average comfortable speaking pitch preoperatively, C#3 (139 Hz), was raised an average of six semitones to G3 (196 Hz) after surgical intervention. The lowest attainable pitch was raised an average of seven semitones and the highest attainable pitch decreased by an average of two semitones. One aspect of the procedure, thyrohyoid approximation (introduced in 2006 to alter resonance), did not affect pitch. Feminization laryngoplasty successfully increased the comfortable fundamental frequency of speech and removed the lowest notes from the patient's vocal range. It does not typically raise the upper limits of the vocal range.
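The reported six-semitone rise from 139 Hz to 196 Hz can be checked with the standard equal-temperament relation, in which an interval in semitones is 12 times the base-2 logarithm of the frequency ratio. The sketch below is only a worked check of that arithmetic; the function names are illustrative and do not come from the study.

```python
import math


def semitones_between(f_from_hz, f_to_hz):
    """Signed interval, in equal-tempered semitones, from f_from_hz to f_to_hz."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


def shift_by_semitones(f_hz, semitones):
    """Frequency reached by moving `semitones` up (or down, if negative) from f_hz."""
    return f_hz * 2.0 ** (semitones / 12.0)


# Mean comfortable speaking pitch before and after surgery, as reported above.
pre_hz, post_hz = 139.0, 196.0
interval = semitones_between(pre_hz, post_hz)
print(round(interval))  # rounds to the six semitones quoted in the abstract
```

An exact six-semitone shift multiplies the frequency by the square root of two, so `shift_by_semitones(139.0, 6)` lands within about 1 Hz of the reported 196 Hz.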
NASA Astrophysics Data System (ADS)
Lucero, Jorge C.; Koenig, Laura L.
2005-03-01
In this study we use a low-dimensional laryngeal model to reproduce temporal variations in oral airflow produced by speakers in the vicinity of an abduction gesture. The study attempts to characterize these temporal patterns in terms of biomechanical parameters such as glottal area, vocal fold stiffness, subglottal pressure, and gender differences in laryngeal dimensions. A two-mass model of the vocal folds coupled to a two-tube approximation of the vocal tract is fitted to oral airflow records measured in men and women during the production of /aha/ utterances, using the subglottal pressure, glottal width, and Q factor as control parameters. The results show that the model is capable of reproducing the airflow records with good approximation. A nonlinear damping characteristic is needed to reproduce the flow variation at glottal abduction. Devoicing is achieved by the combined action of vocal fold abduction, the decrease of subglottal pressure, and the increase of vocal fold tension. In general, the female larynx has a more restricted region of vocal fold oscillation than the male one. This would explain the more frequent devoicing in glottal abduction-adduction gestures for /h/ in running speech by women, compared to men.
Acoustic Properties of the Voice Source and the Vocal Tract: Are They Perceptually Independent?
Erickson, Molly L
2016-11-01
This study sought to determine whether the properties of the voice source and vocal tract are perceptually independent. Within-subjects design. This study employed a paired-comparison paradigm where listeners heard synthetic voices and rated them as same or different using a visual analog scale. Stimuli were synthesized using three different source slopes and two different formant patterns (mezzo-soprano and soprano) on the vowel /a/ at four pitches: A3, C4, B4, and F5. Whereas formant pattern was the strongest effect, difference in source slope also affected perceived quality difference. Source slope and formant pattern were not independently perceived. These results suggest that when judging laryngeal adduction using perceptual information, judgments may not be accurate when the stimuli are of differing formant patterns. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Toutios, Asterios; Narayanan, Shrikanth S.
2016-01-01
Real-time magnetic resonance imaging (rtMRI) of the moving vocal tract during running speech production is an important emerging tool for speech production research providing dynamic information of a speaker's upper airway from the entire mid-sagittal plane or any other scan plane of interest. There have been several advances in the development of speech rtMRI and corresponding analysis tools, and their application to domains such as phonetics and phonological theory, articulatory modeling, and speaker characterization. An important recent development has been the open release of a database that includes speech rtMRI data from five male and five female speakers of American English each producing 460 phonetically balanced sentences. The purpose of the present paper is to give an overview and outlook of the advances in rtMRI as a tool for speech research and technology development. PMID:27833745
Multi-disciplinary clinical protocol for the diagnosis of bulbar amyotrophic lateral sclerosis.
Chiaramonte, Rita; Di Luciano, Carmela; Chiaramonte, Ignazio; Serra, Agostino; Bonfiglio, Marco
2018-04-23
The objective of this study was to examine the role of different specialists in the diagnosis of amyotrophic lateral sclerosis (ALS), to understand changes in verbal expression and phonation, respiratory dynamics and swallowing that occurred rapidly over a short period of time. 22 patients with bulbar ALS were submitted for voice assessment, ENT evaluation, Multi-Dimensional Voice Program (MDVP), spectrogram, electroglottography, fiberoptic endoscopic evaluation of swallowing. In the early stage of the disease, the oral tract and velopharyngeal port were involved. Three months after the initial symptoms, most of the patients presented hoarseness, breathy voice, dysarthria, pitch modulation problems and difficulties in pronunciation of explosive, velar and lingual consonants. Values of MDVP were altered. Spectrogram showed an additional formant, due to nasal resonance. Electroglottography showed periodic oscillation of the vocal folds only during short vocal cycle. Swallowing was characterized by weakness and incoordination of oro-pharyngeal muscles with penetration or aspiration. A specific multidisciplinary clinical protocol was designed to report vocal parameters and swallowing disorders that changed more quickly in bulbar ALS patients. Furthermore, the patients were stratified according to involvement of pharyngeal structures, and severity index. Copyright © 2018 Sociedad Española de Otorrinolaringología y Cirugía de Cabeza y Cuello. Publicado por Elsevier España, S.L.U. All rights reserved.
HUMAN SPEECH: A RESTRICTED USE OF THE MAMMALIAN LARYNX
Titze, Ingo R.
2016-01-01
Purpose Speech has been hailed as unique to human evolution. While the inventory of distinct sounds producible with vocal tract articulators is a great advantage in human oral communication, it is argued here that the larynx as a sound source in speech is limited in its range and capability because a low fundamental frequency is ideal for phonemic intelligibility and source-filter independence. Method Four existing data sets were combined to make an argument regarding exclusive use of the larynx for speech: (1) range of fundamental frequency, (2) laryngeal muscle activation, (3) vocal fold length in relation to sarcomere length of the major laryngeal muscles, and (4) vocal fold morphological development. Results Limited data support the notion that speech tends to produce a contracture of the larynx. The morphological design of the human vocal folds, like that of primates and other mammals, is optimized for vocal communication over distances for which higher fundamental frequency, higher intensity, and fewer unvoiced segments are utilized than in conversational speech. Conclusion The positive message is that raising one’s voice to call, shout, or sing, or executing pitch glides to stretch the vocal folds, can counteract this trend toward a contracted state. PMID:27397113
Ontogeny of individual and litter identity signaling in grunts of piglets.
Syrová, Michaela; Policht, Richard; Linhart, Pavel; Špinka, Marek
2017-11-01
Many studies have shown that animal vocalizations can signal individual identity and group/family membership. However, much less is known about the ontogeny of identity information-when and how this individual/group distinctiveness in vocalizations arises and how it changes during the animal's life. Recent findings suggest that even species that were thought to have limited vocal plasticity could adjust their calls to sound more similar to each other within a group. It has already been shown that sows can acoustically distinguish their own offspring from alien piglets and that litters differ in their calls. Surprisingly, individual identity in piglet calls has not been reported yet. In this paper, this gap is filled, and it is shown that there is information about piglet identity. Information about litter identity is confirmed as well. Individual identity increased with age, but litter vocal identity did not increase with age. The results were robust as a similar pattern was apparent in two situations differing in arousal: isolation and back-test. This paper argues that, in piglets, increased individual discrimination results from the rapid growth of piglets, which is likely to be associated with growth and diversification of the vocal tract rather than from social effects and vocal plasticity.
The Distribution and Severity of Tremor in Speech Structures of Persons with Vocal Tremor.
Hemmerich, Abby L; Finnegan, Eileen M; Hoffman, Henry T
2017-05-01
Vocal tremor may be associated with cyclic oscillations in the pulmonary, laryngeal, velopharyngeal, or oral regions. This study aimed to correlate the overall severity of vocal tremor with the distribution and severity of tremor in structures involved. Endoscopic and clinical examinations were completed on 20 adults with vocal tremor and two age-matched controls during sustained phonation. Two judges rated the severity of vocal tremor and the severity of tremor affecting each of 13 structures. Participants with mild vocal tremor typically presented with tremor in three laryngeal structures, moderate vocal tremor in five structures (laryngeal and another region), and severe vocal tremor in eight structures affecting all regions. The severity of tremor was lowest (mean = 1.2 out of 3) in persons with mild vocal tremor and greater in persons with moderate (mean = 1.5) and severe vocal tremor (mean = 1.4). Laryngeal structures were most frequently (95%) and severely (1.7 out of 3) affected, followed by velopharynx (40% occurrence, 1.3 severity), pulmonary (40% occurrence, 1.1 severity), and oral (40% occurrence, 1.0 severity) regions. Regression analyses indicated tremor severity of the supraglottic structures, and vertical laryngeal movement contributed most to vocal tremor severity during sustained phonation (r = 0.77, F = 16.17, P < 0.0001). A strong positive correlation (r = 0.72) was found between the Tremor Index and the severity of the vocal tremor during sustained phonation. It is useful to obtain a wide endoscopic view of the larynx to visualize tremor, which is rarely isolated to the true vocal folds alone. Published by Elsevier Inc.
Wild, J M; Krützfeldt, N E O
2012-02-15
During singing in songbirds, the extent of beak opening, like the extent of mouth opening in human singers, is partially correlated with the fundamental frequency of the sounds emitted. Since song in songbirds is under the control of "the song system" (a collection of interconnected forebrain nuclei dedicated to the learning and production of song), it might be expected that beak movements during singing would also be controlled by this system. However, direct neural connections between the telencephalic output of the song system and beak muscle motor neurons in the brainstem are conspicuous by their absence, leaving unresolved the question of how beak movements are affected during singing. By using standard tract tracing methods, we sought to answer this question by defining beak premotor neurons and examining their afferent projections. In the caudal medulla, jaw premotor cell bodies were located adjacent to the terminal field of the output of the song system, into which many premotor neurons extended their dendrites. The premotor neurons also received a novel input from the trigeminal ganglion and an overlapping input from a lateral arcopallial component of a trigeminal sensorimotor circuit that traverses the forebrain. The ganglionic input in songbirds, which is not present in doves and pigeons that vocalize with a closed beak, may modulate the activity of beak premotor neurons in concert with the output of the song system. These inputs to jaw premotor neurons could, together, affect beak movements as a means of modulating filter properties of the upper vocal tract during singing. Copyright © 2011 Wiley-Liss, Inc.
Maxfield, Lynn; Palaparthi, Anil; Titze, Ingo
2017-03-01
The traditional source-filter theory of voice production describes a linear relationship between the source (glottal flow pulse) and the filter (vocal tract). Such a linear relationship does not allow for nor explain how changes in the filter may impact the stability and regularity of the source. The objective of this experiment was to examine what effect unpredictable changes to vocal tract dimensions could have on fo stability and individual harmonic intensities in situations in which low frequency harmonics cross formants in a fundamental frequency glide. To determine these effects, eight human subjects (five male, three female) were recorded producing fo glides while their vocal tracts were artificially lengthened by a section of vinyl tubing inserted into the mouth. It was hypothesized that if the source and filter operated as a purely linear system, harmonic intensities would increase and decrease at nearly the same rates as they passed through a formant bandwidth, resulting in a relatively symmetric peak on an intensity-time contour. Additionally, fo stability should not be predictably perturbed by formant/harmonic crossings in a linear system. Acoustic analysis of these recordings, however, revealed that harmonic intensity peaks were asymmetric in 76% of cases, and that 85% of fo instabilities aligned with a crossing of one of the first four harmonics with the first three formants. These results provide further evidence that nonlinear dynamics in the source-filter relationship can impact fo stability as well as harmonic intensities as harmonics cross through formant bandwidths. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
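To make the notion of a harmonic crossing a formant concrete, the sketch below finds the moments in a linear fo glide at which one of the first four harmonics equals one of three formant frequencies. It is a hedged illustration only: the linear glide shape, the glide range (100 to 400 Hz over 3 s), the formant values (500, 1500, 2500 Hz), and the function name are all assumptions, not values or methods from the experiment.

```python
def harmonic_formant_crossings(f0_start_hz, f0_end_hz, duration_s,
                               formants_hz, max_harmonic=4):
    """List (time_s, harmonic_number, formant_hz) for a linear fo glide.

    Harmonic k crosses a formant F when k * fo(t) == F, where
    fo(t) = f0_start_hz + slope * t is the assumed linear glide.
    """
    if f0_end_hz == f0_start_hz:
        raise ValueError("a glide needs different start and end frequencies")
    slope = (f0_end_hz - f0_start_hz) / duration_s  # Hz per second
    crossings = []
    for k in range(1, max_harmonic + 1):
        for formant in formants_hz:
            t = (formant / k - f0_start_hz) / slope
            if 0.0 <= t <= duration_s:  # keep crossings inside the glide
                crossings.append((t, k, formant))
    return sorted(crossings)


# Hypothetical glide and formant values, chosen only for illustration.
crossings = harmonic_formant_crossings(100.0, 400.0, 3.0,
                                       [500.0, 1500.0, 2500.0])
for time_s, harmonic, formant in crossings:
    print(f"t={time_s:.2f} s: harmonic {harmonic} crosses {formant:.0f} Hz")
```

In an analysis like the one described, such predicted crossing times could be compared against the times at which fo instabilities or asymmetric harmonic intensity peaks are observed.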
Gultekin, Yasemin B; Hage, Steffen R
2018-04-01
Human vocal development is dependent on learning by imitation through social feedback between infants and caregivers. Recent studies have revealed that vocal development is also influenced by parental feedback in marmoset monkeys, suggesting vocal learning mechanisms in nonhuman primates. Marmoset infants that experience more contingent vocal feedback than their littermates develop vocalizations more rapidly, and infant marmosets with limited parental interaction exhibit immature vocal behavior beyond infancy. However, it is yet unclear whether direct parental interaction is an obligate requirement for proper vocal development because all monkeys in the aforementioned studies were able to produce the adult call repertoire after infancy. Using quantitative measures to compare distinct call parameters and vocal sequence structure, we show that social interaction has a direct impact not only on the maturation of the vocal behavior but also on acoustic call structures during vocal development. Monkeys with limited parental interaction during development show systematic differences in call entropy, a measure for maturity, compared with their normally raised siblings. In addition, different call types were occasionally uttered in motif-like sequences similar to those exhibited by vocal learners, such as birds and humans, in early vocal development. These results indicate that a lack of parental interaction leads to long-term disturbances in the acoustic structure of marmoset vocalizations, suggesting an imperative role for social interaction in proper primate vocal development. PMID:29651461
Exploring the anatomical encoding of voice with a mathematical model of the vocal system.
Assaneo, M Florencia; Sitt, Jacobo; Varoquaux, Gael; Sigman, Mariano; Cohen, Laurent; Trevisan, Marcos A
2016-11-01
The faculty of language depends on the interplay between the production and perception of speech sounds. A relevant open question is whether the dimensions that organize voice perception in the brain are acoustical or depend on properties of the vocal system that produced it. One of the main empirical difficulties in answering this question is to generate sounds that vary along a continuum according to the anatomical properties of the vocal apparatus that produced them. Here we use a mathematical model that offers the unique possibility of synthesizing vocal sounds by controlling a small set of anatomically based parameters. In a first stage the quality of the synthetic voice was evaluated. Using specific time traces for sub-glottal pressure and tension of the vocal folds, the synthetic voices generated perceptual responses that are indistinguishable from those of real speech. The synthesizer was then used to investigate how the auditory cortex responds to the perception of voice depending on the anatomy of the vocal apparatus. Our fMRI results show that sounds are perceived as human vocalizations when produced by a vocal system that follows a simple relationship between the size of the vocal folds and the vocal tract. We found that these anatomical parameters encode the perceptual vocal identity (male, female, child) and show that the brain areas that respond to human speech also encode vocal identity. On the basis of these results, we propose that this low-dimensional model of the vocal system is capable of generating realistic voices and represents a novel tool to explore voice perception with precise control of the anatomical variables that generate speech. Furthermore, the model provides an explanation of how auditory cortices encode voices in terms of the anatomical parameters of the vocal system. Copyright © 2016 Elsevier Inc. All rights reserved.
HOCUS: The Haskins optically-corrected ultrasound system for measuring speech articulation
NASA Astrophysics Data System (ADS)
Whalen, D. H.; Iskarous, Khalil; Tiede, Mark K.; Ostry, David J.
2004-05-01
The tongue is the most important supralaryngeal articulator for speech, yet, because it is typically out of view, its movements have been difficult to quantify. A new combination of techniques is described here, involving ultrasound in conjunction with an optoelectric motion measurement system (Optotrak). Combining these, the movements of the tongue are imaged and simultaneously corrected for motion of the head and of the ultrasound transceiver. Optotrak's infrared-emitting diodes are placed on the transceiver and the speaker's head in order to localize the ultrasound image of the tongue relative to the hard palate. The palate can be imaged with ultrasound by having the ultrasound signal penetrate a water bolus held against the palate by the tongue. This trace is coregistered with the head and potentially with the same talker's sagittal MR image, to provide additional information on the unimaged remainder of the tract. The tongue surface, from the larynx to near the tip, can then be localized in relationship to the hard palate. The result is a fairly complete view of the tongue within the vocal tract at sampling rates appropriate for running speech. A comparison with other vocal tract imaging systems will be presented. [Work supported by NIH Grant DC-02717.]
Multilevel Analysis in Analyzing Speech Data
ERIC Educational Resources Information Center
Guddattu, Vasudeva; Krishna, Y.
2011-01-01
The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
Can blind persons accurately assess body size from the voice?
Pisanski, Katarzyna; Oleszkiewicz, Anna; Sorokowska, Agnieszka
2016-04-01
Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20-65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. © 2016 The Author(s).
Volitional exaggeration of body size through fundamental and formant frequency modulation in humans
Pisanski, Katarzyna; Mora, Emanuel C.; Pisanski, Annette; Reby, David; Sorokowski, Piotr; Frackowiak, Tomasz; Feinberg, David R.
2016-01-01
Several mammalian species scale their voice fundamental frequency (F0) and formant frequencies in competitive and mating contexts, reducing vocal tract and laryngeal allometry thereby exaggerating apparent body size. Although humans’ rare capacity to volitionally modulate these same frequencies is thought to subserve articulated speech, the potential function of voice frequency modulation in human nonverbal communication remains largely unexplored. Here, the voices of 167 men and women from Canada, Cuba, and Poland were recorded in a baseline condition and while volitionally imitating a physically small and large body size. Modulation of F0, formant spacing (∆F), and apparent vocal tract length (VTL) were measured using Praat. Our results indicate that men and women spontaneously and systemically increased VTL and decreased F0 to imitate a large body size, and reduced VTL and increased F0 to imitate small size. These voice modulations did not differ substantially across cultures, indicating potentially universal sound-size correspondences or anatomical and biomechanical constraints on voice modulation. In each culture, men generally modulated their voices (particularly formants) more than did women. This latter finding could help to explain sexual dimorphism in F0 and formants that is currently unaccounted for by sexual dimorphism in human vocal anatomy and body size. PMID:27687571
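The relationship between formant spacing (∆F) and apparent vocal tract length mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in the study: it assumes the vocal tract behaves like a uniform tube closed at one end, so that the i-th formant is F_i = (2i − 1)·c / (4·VTL), and it estimates ∆F as the slope of a least-squares line through the origin. The function names, the example formant values, and the speed of sound (35,000 cm/s) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: apparent vocal tract length (VTL) from formants,
# assuming a uniform tube closed at one end (glottis) and open at the other (lips).

SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, humid air in the vocal tract


def formant_spacing(formants_hz):
    """Estimate formant spacing (delta-F) in Hz.

    Under the uniform-tube assumption, F_i = delta_F * (2i - 1) / 2,
    so delta_F is the slope of a line through the origin fitted to
    the points ((2i - 1) / 2, F_i).
    """
    if not formants_hz:
        raise ValueError("at least one formant is required")
    xs = [(2 * i - 1) / 2.0 for i in range(1, len(formants_hz) + 1)]
    numerator = sum(x * f for x, f in zip(xs, formants_hz))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def apparent_vtl_cm(formants_hz, speed_of_sound=SPEED_OF_SOUND_CM_S):
    """Apparent VTL in cm from VTL = c / (2 * delta_F)."""
    return speed_of_sound / (2.0 * formant_spacing(formants_hz))


if __name__ == "__main__":
    # Hypothetical formant values for an idealized neutral vowel.
    example = [500.0, 1500.0, 2500.0, 3500.0]
    print(formant_spacing(example), apparent_vtl_cm(example))
```

Under these assumptions, lowering all formants (a smaller ∆F) yields a longer apparent VTL, which is the direction of change reported for imitation of a large body size.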
Can blind persons accurately assess body size from the voice?
Oleszkiewicz, Anna; Sorokowska, Agnieszka
2016-01-01
Vocal tract resonances provide reliable information about a speaker's body size that human listeners use for biosocial judgements as well as speech recognition. Although humans can accurately assess men's relative body size from the voice alone, how this ability is acquired remains unknown. In this study, we test the prediction that accurate voice-based size estimation is possible without prior audiovisual experience linking low frequencies to large bodies. Ninety-one healthy congenitally or early blind, late blind and sighted adults (aged 20–65) participated in the study. On the basis of vowel sounds alone, participants assessed the relative body sizes of male pairs of varying heights. Accuracy of voice-based body size assessments significantly exceeded chance and did not differ among participants who were sighted, or congenitally blind or who had lost their sight later in life. Accuracy increased significantly with relative differences in physical height between men, suggesting that both blind and sighted participants used reliable vocal cues to size (i.e. vocal tract resonances). Our findings demonstrate that prior visual experience is not necessary for accurate body size estimation. This capacity, integral to both nonverbal communication and speech perception, may be present at birth or may generalize from broader cross-modal correspondences. PMID:27095264
Acoustic characteristics used by Japanese macaques for individual discrimination.
Furuyama, Takafumi; Kobayasi, Kohta I; Riquimaroux, Hiroshi
2017-10-01
The vocalizations of primates contain information about speaker individuality. Many primates, including humans, are able to distinguish conspecifics based solely on vocalizations. The purpose of this study was to investigate the acoustic characteristics used by Japanese macaques in individual vocal discrimination. Furthermore, we tested human subjects using monkey vocalizations to evaluate species specificity with respect to such discriminations. Two monkeys and five humans were trained to discriminate the coo calls of two unfamiliar monkeys. We created a stimulus continuum between the vocalizations of the two monkeys as a set of probe stimuli (whole morph). We also created two sets of continua in which only one acoustic parameter, fundamental frequency (f0) or vocal tract characteristic (VTC), was changed from the coo call of one monkey to that of another while the other acoustic feature remained the same (f0 morph and VTC morph, respectively). According to the results, the reaction times of both monkeys and humans were correlated with the morph proportion under the whole morph and f0 morph conditions. The reaction time to the VTC morph was correlated with the morph proportion in both monkeys, whereas the reaction time in humans, on average, was not correlated with morph proportion. Japanese monkeys relied more consistently on VTC than did humans for discriminating monkey vocalizations. Our results support the idea that the auditory system of primates is specialized for processing conspecific vocalizations and suggest that VTC is a significant acoustic feature used by Japanese macaques to discriminate conspecific vocalizations. © 2017. Published by The Company of Biologists Ltd.
A Chinese alligator in heliox: formant frequencies in a crocodilian
Reber, Stephan A.; Nishimura, Takeshi; Janisch, Judith; Robertson, Mark; Fitch, W. Tecumseh
2015-01-01
Crocodilians are among the most vocal non-avian reptiles. Adults of both sexes produce loud vocalizations known as ‘bellows’ year round, with the highest rate during the mating season. Although the specific function of these vocalizations remains unclear, they may advertise the caller's body size, because relative size differences strongly affect courtship and territorial behaviour in crocodilians. In mammals and birds, a common mechanism for producing honest acoustic signals of body size is via formant frequencies (vocal tract resonances). To our knowledge, formants have to date never been documented in any non-avian reptile, and formants do not seem to play a role in the vocalizations of anurans. We tested for formants in crocodilian vocalizations by using playbacks to induce a female Chinese alligator (Alligator sinensis) to bellow in an airtight chamber. During vocalizations, the animal inhaled either normal air or a helium/oxygen mixture (heliox) in which the velocity of sound is increased. Although heliox allows normal respiration, it alters the formant distribution of the sound spectrum. An acoustic analysis of the calls showed that the source signal components remained constant under both conditions, but an upward shift of high-energy frequency bands was observed in heliox. We conclude that these frequency bands represent formants. We suggest that crocodilian vocalizations could thus provide an acoustic indication of body size via formants. Because birds and crocodilians share a common ancestor with all dinosaurs, a better understanding of their vocal production systems may also provide insight into the communication of extinct Archosaurians. PMID:26246611
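The logic of the heliox test can be made concrete with a small calculation. The sketch below is not taken from the study: it assumes a uniform tube closed at one end, whose resonances are f_k = (2k − 1)·c / (4·L), so every resonance scales in proportion to the speed of sound c, while a source frequency set by vibrating tissue would not change. The tract length and both sound speeds are placeholder values; the actual speed of sound in heliox depends on the gas mixture and temperature.

```python
# Illustrative sketch only: why raising the speed of sound shifts resonances
# (formants) upward while leaving a tissue-driven source frequency unchanged.

C_AIR = 343.0       # m/s, approximate speed of sound in room-temperature air
C_HELIOX = 580.0    # m/s, assumed placeholder; depends on mixture and temperature
TRACT_LENGTH = 0.5  # m, hypothetical tract length chosen for illustration


def tube_resonances(length_m, sound_speed, count=3):
    """Resonances (Hz) of a uniform tube closed at one end: (2k - 1) * c / (4 * L)."""
    if length_m <= 0:
        raise ValueError("length must be positive")
    return [(2 * k - 1) * sound_speed / (4.0 * length_m) for k in range(1, count + 1)]


if __name__ == "__main__":
    in_air = tube_resonances(TRACT_LENGTH, C_AIR)
    in_heliox = tube_resonances(TRACT_LENGTH, C_HELIOX)
    for k, (f_air, f_he) in enumerate(zip(in_air, in_heliox), start=1):
        # Each resonance is scaled by the same factor, C_HELIOX / C_AIR.
        print(f"resonance {k}: air {f_air:.1f} Hz, heliox {f_he:.1f} Hz")
```

In this idealized picture, frequency bands that move upward by a common factor when the gas changes behave like resonances of the air column, whereas components that stay put behave like source components.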
[Study of the supra-glottic pressure during partial constriction of the vocal tract].
Suares, M; Cayrayre, F; Ouaknine, M; de la Brèteque, B Amy; Giovanni, A
2004-01-01
Phonation in a small plastic tube 22 cm in length and 5 mm in diameter (the basic exercise of Dr Amy de la Brèteque's method) is current practice in vocal rehabilitation in France. This work aims to show the effects of this method on the glottic vibration. The hypothesis was that during phonation in the tube with a strong flow, as recommended in the method, the vocal cords vibrate without contact. This limits the mechanical trauma at this level. We analyzed the sound production in a tube in 11 trained and untrained subjects. We simultaneously collected the intra-oral air pressure and the vocal signal, which was subjected to a spectral analysis. Spectral analysis confirmed that the signal was produced correctly, i.e. with a strong flow and without interruption, and that the sound was less rich in harmonics. We interpreted these results in the light of our previous work on the glottic vibration, and we show that this vocal production was of the sinusoidal type; this implies the absence of physical contact between the vocal cords, which validates our hypothesis. Further work is necessary to better understand the physical relations between the supra-glottic aerodynamic phenomena and the vibratory functioning of the vocal cords, and also to analyze the therapeutic potential of the method within speech therapy rehabilitation.
Yamasaki, Rosiane; Murano, Emi Z; Gebrim, Eloisa; Hachiya, Adriana; Montagnoli, Arlindo; Behlau, Mara; Tsuji, Domingos
2017-07-01
To compare vocal tract (VT) adjustments of dysphonic and non-dysphonic women before and after flexible resonance tube in water exercise (FRTWE) at rest and during phonation using magnetic resonance imaging. Prospective study. Twenty women, aged 20-40 years, 10 dysphonic with vocal nodules (VNG) and 10 controls (CG), underwent four sets of sagittal VT MRI: two pre-FRTWE, at rest and during phonation, and two post-FRTWE, during phonation and at rest. The subjects performed 3 minutes of exercise. Nine parameters at rest and 21 during phonation were performed. Pre-FRTWE, eight significant differences were found, three at rest and five during phonation: at rest - laryngeal vestibule area, distance from epiglottis to pharyngeal posterior wall (PPW) and interarytenoid complex length were smaller in the VNG; during phonation - laryngeal vestibule area, angle between PPW and vocal fold (VF), epiglottis to PPW, and anterior commissure of the larynx to laryngeal posterior wall were smaller in the VNG; tongue area was larger in the VNG. Post-FRTWE, only three significant differences were found, two during phonation and one at rest: during phonation - angle between PPW and VF and the membranous portion of the VF length were smaller in the VNG; at rest - distance from epiglottis to PPW was smaller in the VNG. Results suggest that the habitual VT adjustments of dysphonic and non-dysphonic women are different at rest and during phonation. The FRTWE promoted positive VT changes in the VNG, reducing the intergroup differences. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Vocal Qualities in Music Theater Voice: Perceptions of Expert Pedagogues.
Bourne, Tracy; Kenny, Dianna
2016-01-01
To gather qualitative descriptions of music theater vocal qualities including belt, legit, and mix from expert pedagogues to better define this voice type. This is a prospective, semistructured interview. Twelve expert teachers from United States, United Kingdom, Asia, and Australia were interviewed by Skype and asked to identify characteristics of music theater vocal qualities including vocal production, physiology, esthetics, pitch range, and pedagogical techniques. Responses were compared with published studies on music theater voice. Belt and legit were generally described as distinct sounds with differing physiological and technical requirements. Teachers were concerned that belt should be taught "safely" to minimize vocal health risks. There was consensus between teachers and published research on the physiology of the glottis and vocal tract; however, teachers were not in agreement about breathing techniques. Neither were teachers in agreement about the meaning of "mix." Most participants described belt as heavily weighted, thick folds, thyroarytenoid-dominant, or chest register; however, there was no consensus on an appropriate term. Belt substyles were named and generally categorized by weightedness or tone color. Descriptions of male belt were less clear than for female belt. This survey provides an overview of expert pedagogical perspectives on the characteristics of belt, legit, and mix qualities in the music theater voice. Although teacher responses are generally in agreement with published research, there are still many controversial issues and gaps in knowledge and understanding of this vocal technique. Breathing techniques, vocal range, mix, male belt, and vocal registers require continuing investigation so that we can learn more about efficient and healthy vocal function in music theater singing. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Voice disorders in residual paracoccidioidomycosis in upper airways and digestive tract.
da Costa, Ananda Dutra; Vargas, Amanda Pereira; Lucena, Marcia Mendonça; Ruas, Ana Cristina Nunes; Braga, Fernanda da Silva Santos; Bom-Braga, Mateus Pereira; Bom-Braga, Frederico Pereira; do Valle, Antonio Carlos Francesconi; Igreja, Ricardo Pereira; Valete-Rosalino, Cláudia Maria
Paracoccidioidomycosis (PCM) is a systemic mycosis of acute and chronic evolution, caused by species belonging to the genus Paracoccidioides. It is considered the most prevalent systemic endemic mycosis in Latin America, with cases in the tropical and subtropical regions. Residual PCM refers to the fibrotic scar sequelae resulting from the disease treatment which, when associated with collagen accumulation, leads to functional and anatomic alterations in the organs. The aim of this study was to evaluate the vocal function of patients with residual PCM in upper airways and digestive tract. We performed a cross-sectional study in 2010 in a cohort of 21 patients with residual PCM in upper airways and digestive tract. The average age was 49.48±9.1 years, and only two (9.5%) patients were female. The study was performed in the 1-113 month-period (median 27) after the end of drug treatment. Five (23.8%) patients had alterations in the larynx as a sequela of the disease. However, all patients had vocal changes in vocal auditory perceptual analysis by GRBASI scale. The computerized acoustic analysis using the software Vox Metria, showed that 11 patients (52.4%) presented alterations in jitter, 15 (71.4%) in shimmer, 8 (38.1%) in F0, 4 (19%) in glottal to noise excitation (GNE), 7 (33.3%) in the presence of noise and 12 (57.1%) in the presence of vibratory irregularity. The great frequency of alterations in residual PCM suggests that the patients in such phase could benefit from a multidisciplinary treatment, offering them integral monitoring of the disease, including speech rehabilitation after the PCM is healed. Copyright © 2017 Asociación Española de Micología. Publicado por Elsevier España, S.L.U. All rights reserved.
Fantini, Marco; Succo, Giovanni; Crosetti, Erika; Borragán Torre, Alfonso; Demo, Roberto; Fussi, Franco
2017-05-01
The current study aimed at investigating the immediate effects of a semi-occluded vocal tract exercise with a ventilation mask in a group of contemporary commercial singers. A randomized controlled study was carried out. Thirty professional or semi-professional singers with no voice complaints were randomly divided into two groups on recruitment: an experimental group and a control group. The same warm-up exercise was performed by the experimental group with an occluded ventilation mask placed over the nose and the mouth and by the control group without the ventilation mask. Voice was recorded before and after the exercise. Acoustic and self-assessment analysis were accomplished. The acoustic parameters of the voice samples recorded before and after training were compared, as well as the parameters' variations between the experimental and the control group. Self-assessment results of the experimental and the control group were compared too. Significant changes after the warm-up exercise included jitter, shimmer, and singing power ratio (SPR) in the experimental group. No significant changes were recorded in the control group. Significant differences between the experimental and the control group were found for ΔShimmer and ΔSPR. Self-assessment analysis confirmed a significantly higher phonatory comfort and voice quality perception for the experimental group. The results of the present study support the immediate advantageous effects on singing voice of a semi-occluded vocal tract exercise with a ventilation mask in terms of acoustic quality, phonatory comfort, and voice quality perception in contemporary commercial singers. Long-term effects still remain to be studied. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
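Jitter and shimmer, two of the acoustic parameters reported above, are cycle-to-cycle perturbation measures. The following is a minimal sketch using one common textbook ("local") definition: the mean absolute difference between consecutive values divided by the mean value, expressed as a percentage. It is applied to glottal cycle periods for jitter and to cycle peak amplitudes for shimmer. This is not necessarily the definition implemented by the analysis software used in the study, and the function name and sample numbers are hypothetical.

```python
# Illustrative sketch only: "local" jitter and shimmer as relative
# cycle-to-cycle perturbation, expressed in percent.


def local_perturbation_percent(values):
    """Mean absolute difference of consecutive values, relative to the mean value.

    Pass glottal cycle periods to obtain local jitter, or per-cycle peak
    amplitudes to obtain local shimmer.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


if __name__ == "__main__":
    # Hypothetical cycle periods (ms) and peak amplitudes (arbitrary units).
    periods_ms = [5.00, 5.02, 4.99, 5.01, 5.00]
    amplitudes = [0.80, 0.78, 0.81, 0.79, 0.80]
    print("jitter (%):", local_perturbation_percent(periods_ms))
    print("shimmer (%):", local_perturbation_percent(amplitudes))
```

A before/after comparison of the kind described above would apply the same function to recordings from each condition and compare the resulting values.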
Schneiderová, Irena; Zouhar, Jan
2014-01-01
Shrews have rich vocal repertoires that include vocalizations within the human audible frequency range and ultrasonic vocalizations. Here, we recorded and analyzed in detail the acoustic structure of a vocalization with unclear functional significance that was spontaneously produced by 15 adult, captive Asian house shrews (Suncus murinus) while they were lying motionless and resting in their nests. This vocalization was usually emitted repeatedly in a long series with regular intervals. It showed some structural variability; however, the shrews most frequently emitted a tonal, low-frequency vocalization with minimal frequency modulation and a low, non-vocal click that was clearly noticeable at its beginning. There was no effect of sex, but the acoustic structure of the analyzed vocalizations differed significantly between individual shrews. The encoded individuality was low, but it cannot be excluded that this individuality would allow discrimination of family members, i.e., a male and female with their young, collectively resting in a common nest. The question remains whether the Asian house shrews indeed perceive the presence of their mates, parents or young resting in a common nest via the resting-associated vocalization and whether they use it to discriminate among their family members. Additional studies are needed to explain the possible functional significance of resting-associated vocalizations emitted by captive Asian house shrews. Our study highlights that the acoustic communication of shrews is a relatively understudied topic, particularly considering that they are highly vocal mammals. PMID:25390304
Direct numerical simulation of human phonation
NASA Astrophysics Data System (ADS)
Bodony, Daniel; Saurabh, Shakti
2017-11-01
The generation and propagation of the human voice in three dimensions is studied using direct numerical simulation. A full body domain is employed for the purpose of directly computing the sound in the region past the speaker's mouth. The air in the vocal tract is modeled as a compressible and viscous fluid interacting with the elastic vocal folds. The vocal fold tissue material properties are multi-layered, with varying stiffness, and a linear elastic transversely isotropic model is utilized and implemented in a quadratic finite element code. The fluid-solid domains are coupled through a boundary-fitted interface and utilize a Poisson equation-based mesh deformation method. A kinematic constraint based on a specified minimum gap between the vocal folds is applied to prevent collision during glottal closure. Both near-vocal-fold flow dynamics and far-field acoustics have been studied. A comparison is drawn to current two-dimensional simulations as well as to data from the literature. Near-field vocal fold dynamics and glottal flow results are in good agreement with previous three-dimensional phonation studies. Far-field acoustic characteristics, when compared to their two-dimensional counterpart, are shown to be sensitive to the dimensionality. Supported by the National Science Foundation (CAREER Award Number 1150439).
Experiments on the Acoustics of Whistling.
ERIC Educational Resources Information Center
Shadle, Christine H.
1983-01-01
The acoustics of speech production allows the prediction of resonances for a given vocal tract configuration. Combining these predictions with aerodynamic theory developed for mechanical whistles makes theories about human whistling more complete. Several experiments involving human whistling are reported which support the theory and indicate new…
Theoretical Aspects of Speech Production.
ERIC Educational Resources Information Center
Stevens, Kenneth N.
1992-01-01
This paper on speech production in children and youth with hearing impairments summarizes theoretical aspects, including the speech production process, sound sources in the vocal tract, vowel production, and consonant production. Examples of spectra for several classes of vowel and consonant sounds in simple syllables are given. (DB)
Novel 16-channel receive coil array for accelerated upper airway MRI at 3 Tesla.
Kim, Yoon-Chul; Hayes, Cecil E; Narayanan, Shrikanth S; Nayak, Krishna S
2011-06-01
Upper airway MRI can provide a noninvasive assessment of speech and swallowing disorders and sleep apnea. Recent work has demonstrated the value of high-resolution three-dimensional imaging and dynamic two-dimensional imaging and the importance of further improvements in spatio-temporal resolution. The purpose of the study was to describe a novel 16-channel 3 Tesla receive coil that is highly sensitive to the human upper airway and investigate the performance of accelerated upper airway MRI with the coil. In three-dimensional imaging of the upper airway during static posture, 6-fold acceleration is demonstrated using parallel imaging, potentially leading to capturing a whole three-dimensional vocal tract with 1.25 mm isotropic resolution within 9 sec of sustained sound production. Midsagittal spiral parallel imaging of vocal tract dynamics during natural speech production is demonstrated with 2 × 2 mm² in-plane spatial and 84 ms temporal resolution. Copyright © 2010 Wiley-Liss, Inc.
Structure and dynamics of human communication at the beginning of life.
Papousek, H; Papousek, M
1986-01-01
Although the beginning of postpartum social integration and communication has been long viewed as relevant to psychiatric theories, early parent-infant communication has become a matter of scientific investigation only recently. The present survey explains the significance of an approach based upon the general systems theory and explores to what extent the early parent-infant interaction can function as a didactic system to support the development of thought and speech. Evidence of this function has been found in those forms of parental behavior that escape the parent's conscious awareness and control, as exemplified in the vocal communication with presyllabic infants. Parents unknowingly adjust the structure and dynamics of speech to the constraints of infant capacities, detach prosodic musicality from lexical structure, and use it in particularly expressive forms for the delivery of the first prototypical messages. In this and other similar ways, parents offer an abundance of learning situations in which infants can try out various integrative operations. A biological rather than cultural provenience of the support of communicative development indicates a potential relevance for the interpretation of speech evolution. In addition to qualities of the vocal tract and to complex symbolic capacities in humans, the early intuitive support of communicative development and its playful character are suggested as species-specific determinants of speech evolution. Implications for clinical research are suggested.
Aeroelastic Model of Vocal-Fold Vibrating Element for Studying the Phonation Threshold
NASA Astrophysics Data System (ADS)
Horáček, J.; Švec, J. G.
2002-10-01
An original theoretical model for vibration onset of the vocal folds in the air-flow coming from the human subglottal tract is designed, which allows studying the influence of the physical properties of the vocal folds (e.g., geometrical shape, mass, viscosity) on their vibration characteristics (such as the natural frequencies, mode shapes of vibration and the thresholds of instability). The mathematical model of the vocal fold is designed as a simplified dynamic system of two degrees of freedom (rotation and translation) vibrating on an elastic foundation in the wall of a channel conveying air. An approximate unsteady one-dimensional flow theory for the inviscid incompressible fluid is presented for the phonatory air-flow. A generally defined shape of the vocal-fold surface is considered for expressing the unsteady aerodynamic forces in the glottis. The parameters of the mechanical part of the model, i.e., the mass, stiffness and damping matrices, are related to the geometry and material density of the vocal folds as well as to the fundamental natural frequency and damping known from experiments. The coupled numerical solution yields the vibration characteristics (natural frequencies, damping and mode shapes of vibration), including the instability thresholds of the aeroelastic system. The vibration characteristics obtained from the coupled numerical solution of the system appear to be in reasonable qualitative agreement with the physiological data and clinical observations. The model is particularly suitable for studying the phonation threshold, i.e., the onset of vibration of the vocal folds.
Roers, Friederike; Mürbe, Dirk; Sundberg, Johan
2009-07-01
Students admitted to the solo singing education at the University of Music Dresden, Germany have been submitted to a detailed physical examination of a variety of factors with relevance to voice function since 1959. In the years 1959-1991, this scheme of examinations included X-ray profiles of the singers' vocal tracts. This material of 132 X-rays of voice professionals was used to investigate different laryngeal morphological measures and their relation to vocal fold length. Further, the study aimed to investigate if there are consistent anatomical differences between singers of different voice classifications. The study design used was a retrospective analysis. Vocal fold length could be measured in 29 of these singer subjects directly. These data showed a strong correlation with the anterior-posterior diameter of the subglottis and the trachea as well as with the distance from the anterior contour of the thyroid cartilage to the anterior contour of the spine. These relations were used in an attempt to predict the 132 singers' vocal fold lengths. The results revealed a clear covariation between predicted vocal fold length and voice classification. Anterior-posterior subglottic-tracheal diameter yielded mean vocal fold lengths of 14.9, 16.0, 16.6, 18.4, 19.5, and 20.9 mm for sopranos, mezzo-sopranos, altos, tenors, baritones, and basses, respectively. The data support the assumption that there are consistent anatomical laryngeal differences between singers of different voice classifications, which are of relevance to pitch range and timbre of the voice.
Characterizing the graded structure of false killer whale (Pseudorca crassidens) vocalizations.
Murray, S O; Mercado, E; Roitblat, H L
1998-09-01
The vocalizations from two captive false killer whales (Pseudorca crassidens) were analyzed. The structure of the vocalizations was best modeled as lying along a continuum with trains of discrete, exponentially damped sinusoidal pulses at one end and continuous sinusoidal signals at the other end. Pulse trains were graded as a function of the interval between pulses, where the minimum interval between pulses could be zero milliseconds. The transition from a pulse train with no inter-pulse interval to a whistle could be modeled by gradations in the degree of damping. There were many examples of vocalizations that were gradually modulated from pulse trains to whistles. There were also vocalizations that showed rapid shifts in signal type, for example, switching immediately from a whistle to a pulse train. These data have implications when considering both the possible function(s) of the vocalizations and the potential sound production mechanism(s). A short-time duty cycle measure was developed to characterize the graded structure of the vocalizations. A random sample of 500 vocalizations was characterized by combining the duty cycle measure with peak frequency measurements. The analysis method proved to be an effective metric for describing the graded structure of false killer whale vocalizations.
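The idea of a continuum from damped pulse trains to whistles can be sketched numerically. The example below is a hypothetical illustration, not the duty cycle measure developed in the study: it treats each pulse as a sinusoid with envelope exp(−d·t), repeated every T seconds, and defines the duty cycle as the fraction of each repetition period during which the envelope stays above a chosen threshold. The function name, the threshold, and the parameter values are all assumptions.

```python
import math

# Illustrative sketch only: an envelope-based duty cycle for a train of
# exponentially damped pulses. As damping goes to zero the value approaches 1,
# which corresponds to a continuous (whistle-like) signal.


def envelope_duty_cycle(damping_per_s, repetition_period_s, threshold=0.1):
    """Fraction of each repetition period in which exp(-d * t) exceeds `threshold`.

    damping_per_s       -- decay rate d of the pulse envelope (1/s); 0 means undamped
    repetition_period_s -- time from the start of one pulse to the start of the next (s)
    threshold           -- envelope level, relative to the pulse peak, counted as "on"
    """
    if repetition_period_s <= 0:
        raise ValueError("repetition period must be positive")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if damping_per_s <= 0:
        return 1.0  # undamped sinusoid: the signal never decays
    on_time = math.log(1.0 / threshold) / damping_per_s
    return min(1.0, on_time / repetition_period_s)


if __name__ == "__main__":
    period = 0.01  # hypothetical 10 ms between pulse onsets
    for damping in (5000.0, 1000.0, 200.0, 0.0):
        print(damping, round(envelope_duty_cycle(damping, period), 3))
```

With these placeholder values, heavily damped pulses give a small duty cycle (click-like), and progressively weaker damping moves the value toward 1, mirroring the graded transition described in the abstract.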
Histopathologic investigations of the unphonated human child vocal fold mucosa.
Sato, Kiminori; Umeno, Hirohito; Nakashima, Tadashi; Nonaka, Satoshi; Harabuchi, Yasuaki
2012-01-01
Vocal fold stellate cells (VFSCs) in the maculae flavae (MFe) located at both ends of the vocal fold mucosa are inferred to be involved in the metabolism of extracellular matrices. MFe are also considered to be an important structure in the growth and development of the human vocal fold mucosa. Tension caused by phonation (vocal fold vibration) is hypothesized to stimulate VFSCs to accelerate production of extracellular matrices. Human child vocal fold mucosae unphonated since birth were investigated histologically. Histologic analysis of human child vocal fold mucosa. Vocal fold mucosae, which have remained unphonated since birth, of two children (7 and 12 years old) with cerebral palsy were investigated by light and electron microscopy and compared with normal subjects. Vocal fold mucosae and MFe were hypoplastic and rudimentary and did not have a vocal ligament, Reinke's space, or the layered structure. The lamina propria appeared as a uniform structure. Some VFSCs in the MFe showed degeneration and not many vesicles were present at the periphery of the cytoplasm. The VFSCs synthesized fewer extracellular matrices, such as fibrous protein and glycosaminoglycan. The VFSCs appeared to have decreased activity. Vocal fold vibration (phonation) after birth is an important factor in the growth and development of the human vocal fold mucosa. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
ERIC Educational Resources Information Center
Samlan, Robin A.; Story, Brad H.
2011-01-01
Purpose: To relate vocal fold structure and kinematics to 2 acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). Method: The authors used a computational, kinematic model of the medial surfaces of the vocal folds to specify features of vocal fold structure and vibration in a…
The Vocal Repertoire of Adult and Neonate Giant Otters (Pteronura brasiliensis)
Mumm, Christina A. S.; Knörnschild, Mirjam
2014-01-01
Animals use vocalizations to exchange information about external events, their own physical or motivational state, or about individuality and social affiliation. Infant babbling can enhance the development of the full adult vocal repertoire by providing ample opportunity for practice. Giant otters are very social and frequently vocalizing animals. They live in highly cohesive groups, generally including a reproductive pair and their offspring born in different years. This basic social structure may vary in the degree of relatedness of the group members. Individuals engage in shared group activities and different social roles and thus, the social organization of giant otters provides a basis for complex and long-term individual relationships. We recorded and analysed the vocalizations of adult and neonate giant otters from wild and captive groups. We classified the adult vocalizations according to their acoustic structure, and described their main behavioural context. Additionally, we present the first description of vocalizations uttered in babbling bouts of new born giant otters. We expected to find 1) a sophisticated vocal repertoire that would reflect the species’ complex social organisation, 2) that giant otter vocalizations have a clear relationship between signal structure and function, and 3) that the vocal repertoire of new born giant otters would comprise age-specific vocalizations as well as precursors of the adult repertoire. We found a vocal repertoire with 22 distinct vocalization types produced by adults and 11 vocalization types within the babbling bouts of the neonates. A comparison within the otter subfamily suggests a relation between vocal and social complexity, with the giant otters being the socially and vocally most complex species. PMID:25391142
Different Vocal Parameters Predict Perceptions of Dominance and Attractiveness.
Hodges-Simeon, Carolyn R; Gaulin, Steven J C; Puts, David A
2010-12-01
Low mean fundamental frequency (F(0)) in men's voices has been found to positively influence perceptions of dominance by men and attractiveness by women using standardized speech. Using natural speech obtained during an ecologically valid social interaction, we examined relationships between multiple vocal parameters and dominance and attractiveness judgments. Male voices from an unscripted dating game were judged by men for physical and social dominance and by women in fertile and non-fertile menstrual cycle phases for desirability in short-term and long-term relationships. Five vocal parameters were analyzed: mean F(0) (an acoustic correlate of vocal fold size), F(0) variation, intensity (loudness), utterance duration, and formant dispersion (D(f), an acoustic correlate of vocal tract length). Parallel but separate ratings of speech transcripts served as controls for content. Multiple regression analyses were used to examine the independent contributions of each of the predictors. Physical dominance was predicted by low F(0) variation and physically dominant word content. Social dominance was predicted only by socially dominant word content. Ratings of attractiveness by women were predicted by low mean F(0), low D(f), high intensity, and attractive word content across cycle phase and mating context. Low D(f) was perceived as attractive by fertile-phase women only. We hypothesize that competitors and potential mates may attend more strongly to different components of men's voices because of the different types of information these vocal parameters provide.
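The link between formant dispersion and vocal tract length mentioned in this abstract can be sketched numerically. The example below uses one common definition of D(f), the mean spacing between adjacent formants, and assumes an idealized uniform tube closed at the glottis and open at the lips, for which adjacent resonances are spaced c/(2L) apart. The formant values and the speed of sound (350 m/s) are illustrative assumptions, not data from the study, and the study's exact computation of D(f) may differ.

```python
def formant_dispersion(formants_hz):
    """Mean spacing between adjacent formants, in Hz.
    Equivalent to (F_n - F_1) / (n - 1) for formants sorted by frequency."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    ordered = sorted(formants_hz)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def vtl_from_dispersion(dispersion_hz, speed_of_sound=350.0):
    """Vocal tract length in metres under the uniform-tube assumption,
    where adjacent resonances are spaced c / (2 L) apart."""
    return speed_of_sound / (2.0 * dispersion_hz)


# Hypothetical formants of a neutral vowel, in Hz.
example_formants = [500.0, 1500.0, 2500.0, 3500.0]
df = formant_dispersion(example_formants)
print(f"D(f) = {df:.0f} Hz, estimated VTL = {100 * vtl_from_dispersion(df):.1f} cm")
```

Under these assumptions a lower D(f) implies a longer vocal tract, which is why the measure is treated as an acoustic correlate of vocal tract length.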
Interspeaker Variability in Hard Palate Morphology and Vowel Production
ERIC Educational Resources Information Center
Lammert, Adam; Proctor, Michael; Narayanan, Shrikanth
2013-01-01
Purpose: Differences in vocal tract morphology have the potential to explain interspeaker variability in speech production. The potential acoustic impact of hard palate shape was examined in simulation, in addition to the interplay among morphology, articulation, and acoustics in real vowel production data. Method: High-front vowel production from…
Effects of Long-Term Tracheostomy on Spectral Characteristics of Vowel Production.
ERIC Educational Resources Information Center
Kamen, Ruth Saletsky; Watson, Ben C.
1991-01-01
Eight preschool children who underwent tracheotomy during the prelingual period were compared to matched controls on a variety of speech measures. Children with tracheotomies showed reduced acoustic vowel space, suggesting they were limited in their ability to produce extreme vocal tract configurations for vowels postdecannulation. Oral motor…
On the Evolution of Human Language.
ERIC Educational Resources Information Center
Lieberman, Philip
Human linguistic ability depends, in part, on the gradual evolution of man's supralaryngeal vocal tract. The anatomic basis of human speech production is the result of a long evolutionary process in which the Darwinian process of natural selection acted to retain mutations. For auditory perception, the listener operates in terms of the acoustic…
Speaking Tongues Are Actively Braced
ERIC Educational Resources Information Center
Gick, Bryan; Allen, Blake; Roewer-Després, François; Stavness, Ian
2017-01-01
Purpose: Bracing of the tongue against opposing vocal-tract surfaces such as the teeth or palate has long been discussed in the context of biomechanical, somatosensory, and aeroacoustic aspects of tongue movement. However, previous studies have tended to describe bracing only in terms of contact (rather than mechanical support), and only in…
ERIC Educational Resources Information Center
Provine, Robert R.; Emmorey, Karen
2006-01-01
The placement of laughter in the speech of hearing individuals is not random but "punctuates" speech, occurring during pauses and at phrase boundaries where punctuation would be placed in a transcript of a conversation. For speakers, language is dominant in the competition for the vocal tract since laughter seldom interrupts spoken phrases. For…
The Role of the Listener's State in Speech Perception
ERIC Educational Resources Information Center
Viswanathan, Navin
2009-01-01
Accounts of speech perception disagree on whether listeners perceive the acoustic signal (Diehl, Lotto, & Holt, 2004) or the vocal tract gestures that produce the signal (e.g., Fowler, 1986). In this dissertation, I outline a research program using a phenomenon called "perceptual compensation for coarticulation" (Mann, 1980) to examine this…
Laukkanen, Anne-Maria; Horáček, Jaromir; Havlík, Radan
2012-07-01
Vocal warm-up (WU)-related changes were studied in one male musical singer and one female speech trainer. They sustained vowels before and after WU in a magnetic resonance imaging (MRI) device. Acoustic recordings were made in a studio. The vocal tract area increased after WU, a formant cluster appeared between 2 and 4.5 kHz, and SPL increased. Evidence of larynx lowering was only found for the male. The pharyngeal inlet over the epilaryngeal outlet ratio (A(ph)/A(e)) increased by 10%-28%, being 3-4 for the male and 5-7 for the female. The results seem to represent different voice training traditions. A singer's formant cluster may be achievable without a high A(ph)/A(e) (≥ 6), but limitations of the 2D method should be taken into account.
1980-01-01
Bell3 Donald Hailey Steven Eady Fredericka Bell-Berti* Terry Halwes Jo Estill Catherine Best+ Sabina D. Koroluk Laurie Feldman Gloria J. Borden* Agnes M...tract (Fant, 1971; Stevens & House, 1955, 1961). For oral phonemes, the vocal tract may simply be viewed as a tube consisting of the pharyngeal and...coupling of the nasal and oral cavities. In experiments with synthesized speech, House and Stevens (1956) varied the ratio of the driving point impedance of
Dowdall, Jayme R.; Sadow, Peter M.; Hartnick, Christopher; Vinarsky, Vladimir; Mou, Hongmei; Zhao, Rui; Song, Phillip C.; Franco, Ramon A.; Rajagopal, Jayaraj
2016-01-01
Objectives/Hypothesis A precise molecular schema for classifying the different cell types of the normal human vocal fold epithelium is lacking. We hypothesize that the true vocal fold epithelium has a cellular architecture and organization similar to that of other stratified squamous epithelia including the skin, cornea, oral mucosa, and esophagus. In analogy to disorders of the skin and gastrointestinal tract, a molecular definition of the normal cell types within the human vocal fold epithelium and a description of their geometric relationships should serve as a foundation for characterizing cellular changes associated with metaplasia, dysplasia, and cancer. Study Design Qualitative study with adult human larynges. Methods Histologic sections of normal human laryngeal tissue were analyzed for morphology (hematoxylin and eosin) and immunohistochemical protein expression profile, including cytokeratins (CK13 and CK14), cornified envelope proteins (involucrin), basal cells (NGFR/p75), and proliferation markers (Ki67). Results We demonstrated that three distinct cell strata with unique marker profiles are present within the stratified squamous epithelium of the true vocal fold. We used these definitions to establish that cell proliferation is restricted to certain cell types and layers within the epithelium. These distinct cell types are reproducible across five normal adult larynges. Conclusion We have established that three layers of cells are present within the normal adult stratified squamous epithelium of the true vocal fold. Furthermore, replicating cell populations are largely restricted to the parabasal strata within the epithelium. This delineation of distinct cell populations will facilitate future studies of vocal fold regeneration and cancer. Level of Evidence N/A. PMID:25988619
Granqvist, Svante; Simberg, Susanna; Hertegård, Stellan; Holmqvist, Sofia; Larsson, Hans; Lindestad, Per-Åke; Södersten, Maria; Hammarberg, Britta
2015-10-01
Phonation into glass tubes ('resonance tubes'), keeping the free end of the tube in water, has been a frequently used voice therapy method in Finland and more recently also in other countries. The purpose of this exploratory study was to investigate what effects tube phonation with and without water has on the larynx. Two participants were included in the study. The methods used were high-speed imaging, electroglottographic observations of vocal fold vibrations, and measurements of oral pressure during tube phonation. Results showed that the fluctuation in the back pressure during tube phonation in water altered the vocal fold vibrations. In the high-speed imaging, effects were found in the open quotient and amplitude variation of the glottal opening. The open quotient increased with increasing water depth (from 2 cm to 6 cm). A modulation effect by the water bubbles on the vocal fold vibrations was seen both in the high-speed glottal area tracings and in the electroglottography signal. A second experiment revealed that the increased average oral pressure was largely determined by the water depth. The increased open quotient can possibly be explained by an increased abduction of the vocal folds and/or a reduced transglottal pressure. The back pressure of the bubbles also modulates glottal vibrations with a possible 'massage' effect on the vocal folds. This effect and the well-defined average pressure increase due to the known water depth are different from those of other methods using a semi-occluded vocal tract.
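One way to see why the average oral pressure would follow water depth: air can only leave the submerged tube end once the pressure behind it exceeds the hydrostatic pressure at that depth, roughly ρ·g·h. The sketch below evaluates this for the 2 cm and 6 cm depths mentioned in the abstract. The water density and the assumption that the static component dominates are simplifications made here, not measurements from the study.

```python
WATER_DENSITY_KG_M3 = 1000.0   # approximate density of water (assumption)
GRAVITY_M_S2 = 9.80665         # standard gravity


def hydrostatic_pressure_pa(depth_cm):
    """Static pressure (Pa) at a given depth below the water surface."""
    return WATER_DENSITY_KG_M3 * GRAVITY_M_S2 * (depth_cm / 100.0)


for depth_cm in (2.0, 6.0):
    pressure = hydrostatic_pressure_pa(depth_cm)
    print(f"{depth_cm:.0f} cm of water -> about {pressure:.0f} Pa")
```

In this idealization the static back pressure scales linearly with depth, so tripling the depth from 2 cm to 6 cm triples the pressure the speaker has to overcome.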
Dowdall, Jayme R; Sadow, Peter M; Hartnick, Christopher; Vinarsky, Vladimir; Mou, Hongmei; Zhao, Rui; Song, Phillip C; Franco, Ramon A; Rajagopal, Jayaraj
2015-09-01
A precise molecular schema for classifying the different cell types of the normal human vocal fold epithelium is lacking. We hypothesize that the true vocal fold epithelium has a cellular architecture and organization similar to that of other stratified squamous epithelia including the skin, cornea, oral mucosa, and esophagus. In analogy to disorders of the skin and gastrointestinal tract, a molecular definition of the normal cell types within the human vocal fold epithelium and a description of their geometric relationships should serve as a foundation for characterizing cellular changes associated with metaplasia, dysplasia, and cancer. Qualitative study with adult human larynges. Histologic sections of normal human laryngeal tissue were analyzed for morphology (hematoxylin and eosin) and immunohistochemical protein expression profile, including cytokeratins (CK13 and CK14), cornified envelope proteins (involucrin), basal cells (NGFR/p75), and proliferation markers (Ki67). We demonstrated that three distinct cell strata with unique marker profiles are present within the stratified squamous epithelium of the true vocal fold. We used these definitions to establish that cell proliferation is restricted to certain cell types and layers within the epithelium. These distinct cell types are reproducible across five normal adult larynges. We have established that three layers of cells are present within the normal adult stratified squamous epithelium of the true vocal fold. Furthermore, replicating cell populations are largely restricted to the parabasal strata within the epithelium. This delineation of distinct cell populations will facilitate future studies of vocal fold regeneration and cancer. N/A. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
Rehabilitation of a patient with complete mandibulectomy and partial glossectomy.
Meyerson, M D; Johnson, B H; Weitzman, R S
1980-05-01
Following a number of radiologic and surgical procedures for the treatment of oral cancer, a patient with severe facial disfigurement and alteration of the vocal tract acquired acceptable speech. Consultation among referring physicians and speech pathologists can aid such a patient by facilitating the rehabilitative process through improvement of communicative skills.
Tissue-Point Motion Tracking in the Tongue from Cine MRI and Tagged MRI
ERIC Educational Resources Information Center
Woo, Jonghye; Stone, Maureen; Suo, Yuanming; Murano, Emi Z.; Prince, Jerry L.
2014-01-01
Purpose: Accurate tissue motion tracking within the tongue can help professionals diagnose and treat vocal tract--related disorders, evaluate speech quality before and after surgery, and conduct various scientific studies. The authors compared tissue tracking results from 4 widely used deformable registration (DR) methods applied to cine magnetic…
NASA Astrophysics Data System (ADS)
Smith, David R. R.; Patterson, Roy D.
2005-11-01
Glottal-pulse rate (GPR) and vocal-tract length (VTL) are related to the size, sex, and age of the speaker but it is not clear how the two factors combine to influence our perception of speaker size, sex, and age. This paper describes experiments designed to measure the effect of the interaction of GPR and VTL upon judgements of speaker size, sex, and age. Vowels were scaled to represent people with a wide range of GPRs and VTLs, including many well beyond the normal range of the population, and listeners were asked to judge the size and sex/age of the speaker. The judgements of speaker size show that VTL has a strong influence upon perceived speaker size. The results for the sex and age categorization (man, woman, boy, or girl) show that, for vowels with GPR and VTL values in the normal range, judgements of speaker sex and age are influenced about equally by GPR and VTL. For vowels with abnormal combinations of low GPRs and short VTLs, the VTL information appears to decide the sex/age judgement.
Story, Brad H.
2008-01-01
A new set of area functions for vowels has been obtained with Magnetic Resonance Imaging (MRI) from the same speaker as that previously reported in 1996 [Story, Titze, & Hoffman, JASA, 100, 537–554 (1996)]. The new area functions were derived from image data collected in 2002, whereas the previously reported area functions were based on MR images obtained in 1994. When compared, the new area function sets indicated a tendency toward a constricted pharyngeal region and expanded oral cavity relative to the previous set. Based on calculated formant frequencies and sensitivity functions, these morphological differences were shown to have the primary acoustic effect of systematically shifting the second formant (F2) downward in frequency. Multiple instances of target vocal tract shapes from a specific speaker provide additional sampling of the possible area functions that may be produced during speech production. This may be of benefit for understanding intra-speaker variability in vowel production and for further development of speech synthesizers and speech models that utilize area function information. PMID:18177162
Major depressive disorder discrimination using vocal acoustic features.
Taguchi, Takaya; Tachikawa, Hirokazu; Nemoto, Kiyotaka; Suzuki, Masayuki; Nagano, Toru; Tachibana, Ryuki; Nishimura, Masafumi; Arai, Tetsuaki
2018-01-01
The voice carries various information produced by vibrations of the vocal cords and the vocal tract. Though many studies have reported a relationship between depression and vocal acoustic features, including the mel-frequency cepstrum coefficients (MFCCs) that are applied in speech recognition, there have been few studies in which acoustic features allowed discrimination of patients with depressive disorder. Vocal acoustic features, as a biomarker of depression, could support the differential diagnosis of patients in a depressive state. In order to achieve differential diagnosis of depression, in this preliminary study, we examined whether vocal acoustic features could allow discrimination between depressive patients and healthy controls. Subjects were 36 patients who met the criteria for major depressive disorder and 36 healthy controls with no current or past psychiatric disorders. Voices were recorded while subjects read out digits before and after a verbal fluency task. Voices were analyzed using OpenSMILE. The extracted acoustic features, including MFCCs, were used for group comparison and discriminant analysis between patients and controls. The second dimension of MFCC (MFCC 2) was significantly different between groups and allowed the discrimination between patients and controls with a sensitivity of 77.8% and a specificity of 86.1%. The difference in MFCC 2 between the two groups reflected an energy difference at frequencies around 2000-3000 Hz. The MFCC 2 was significantly different between depressive patients and controls. This feature could be a useful biomarker to detect major depressive disorder. Sample size was relatively small. Psychotropics could have a confounding effect on voice. Copyright © 2017 Elsevier B.V. All rights reserved.
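The sensitivity and specificity reported in this abstract can be related to group sizes with a short calculation. The sketch below assumes that 28 of the 36 patients and 31 of the 36 controls were classified correctly; these counts are inferred here because they reproduce the reported percentages, and they are not stated in the abstract.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of patients correctly identified as patients."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of controls correctly identified as controls."""
    return true_negatives / (true_negatives + false_positives)


# Inferred (hypothetical) counts for 36 patients and 36 controls.
PATIENTS_DETECTED, PATIENTS_MISSED = 28, 8
CONTROLS_CLEARED, CONTROLS_FLAGGED = 31, 5

print(f"sensitivity = {100 * sensitivity(PATIENTS_DETECTED, PATIENTS_MISSED):.1f}%")
print(f"specificity = {100 * specificity(CONTROLS_CLEARED, CONTROLS_FLAGGED):.1f}%")
```

With groups of 36, each misclassified subject shifts either figure by about 2.8 percentage points, which illustrates the abstract's caveat about the small sample size.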
Tafiadis, Dionysios; Kosma, Evangelia I; Chronopoulos, Spyridon K; Papadopoulos, Aggelos; Drosos, Konstantinos; Siafaka, Vassiliki; Toki, Eugenia I; Ziavra, Nausica
2018-01-01
The relationship between smoking and alterations of the vocal tract and larynx is well known. This pathology leads to the degradation of voice performance in daily living. Multiple assessment methods of the vocal tract and larynx have been developed, and in recent years they were enriched with self-reported questionnaires such as the Voice Handicap Index (VHI). This study determined the cutoff points of the VHI's total score and its three domains for young female smokers in Greece. These estimated cutoff points could be used by voice specialists as an indicator for further clinical evaluation (foreseeing a potential risk of developing a vocal symptom because of smoking habits). A sample of 120 female nondysphonic smokers (aged 18-31) was recruited. Participants filled out the VHI and Voice Evaluation Form. Using receiver operating characteristic analysis, the cutoff point of the VHI total score was calculated at 19.50 (sensitivity: 0.780, 1-specificity: 0.133). The cutoff point for the functional domain was 7.50 (sensitivity: 0.900, 1-specificity: 0.217), for the physical domain 8.50 (sensitivity: 0.867, 1-specificity: 0.483), and for the emotional domain 7.50 (sensitivity: 0.833, 1-specificity: 0.200). Furthermore, the VHI could be used as a monitoring tool for smokers and as feedback for smoking cessation. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
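The operating points reported in this abstract are given as sensitivity and 1-specificity pairs. A minimal sketch of how such pairs can be compared is shown below, using Youden's J (sensitivity + specificity - 1) as a summary. Whether the authors selected their cutoffs with Youden's J or with another criterion is not stated in the abstract, so the index is used here purely as an illustration.

```python
def youden_index(sensitivity, false_positive_rate):
    """Youden's J for one ROC operating point.
    false_positive_rate is the reported '1-specificity' value."""
    specificity = 1.0 - false_positive_rate
    return sensitivity + specificity - 1.0


# (sensitivity, 1-specificity) pairs as reported in the abstract.
REPORTED_POINTS = {
    "total": (0.780, 0.133),
    "functional": (0.900, 0.217),
    "physical": (0.867, 0.483),
    "emotional": (0.833, 0.200),
}

for scale, (sens, fpr) in REPORTED_POINTS.items():
    print(f"{scale:<10} specificity = {1.0 - fpr:.3f}  J = {youden_index(sens, fpr):.3f}")
```

By this summary the physical domain separates the groups least well, since its high sensitivity comes with a false-positive rate of nearly one half.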
Vocal warm-up and breathing training for teachers: randomized clinical trial
Pereira, Lílian Paternostro de Pina; Masson, Maria Lúcia Vaz; Carvalho, Fernando Martins
2015-01-01
OBJECTIVE To compare the effectiveness of two speech therapy interventions, vocal warm-up and breathing training, focusing on teachers’ voice quality. METHODS A single-blind, randomized, parallel clinical trial was conducted. The research included 31 teachers aged 20 to 60 years from a public school in Salvador, BA, Northeastern Brazil, with minimum workloads of 20 hours a week, who have or have not reported having vocal alterations. The exclusion criteria were the following: being a smoker, excessive alcohol consumption, receiving additional speech therapy assistance while taking part in the study, being affected by upper respiratory tract infections, professional use of the voice in another activity, neurological disorders, and history of cardiopulmonary pathologies. The subjects were distributed through simple randomization into vocal warm-up (n = 14) and breathing training (n = 17) groups. The teachers’ voice quality was subjectively evaluated through the Voice Handicap Index (Índice de Desvantagem Vocal, in the Brazilian version) and computerized voice analysis (average fundamental frequency, jitter, shimmer, noise, and glottal-to-noise excitation ratio) by speech therapists. RESULTS Before the interventions, the groups were similar regarding sociodemographic characteristics, teaching activities, and vocal quality. The variations before and after the intervention in self-assessment and acoustic voice indicators have not significantly differed between the groups. In the comparison between groups before and after the six-week interventions, significant reductions in the Voice Handicap Index of subjects in both groups were observed, as well as reduced average fundamental frequencies in the vocal warm-up group and increased shimmer in the breathing training group. Subjects from the vocal warm-up group reported speaking more easily and having their voices more improved in a general way as compared to the breathing training group. 
CONCLUSIONS Both interventions were similar regarding their effects on the teachers’ voice quality. However, each intervention individually contributed to improving the teachers’ voice quality, especially the vocal warm-up. PMID:26465664
Torabi, Hadi; Khoddami, Seyyedeh Maryam; Ansari, Noureddin Nakhostin; Dabirmoghaddam, Payman
2016-11-01
To cross-culturally adapt the Persian Vocal Tract Discomfort (VTDp) scale and evaluate its validity and reliability in the assessment of patients with muscle tension dysphonia (MTD). A cross-sectional and prospective cohort design was used to psychometrically test the VTDp. The VTD scale was cross-culturally adapted into the Persian language following standard forward-backward translations. The VTDp scale was administered to 100 patients with MTD (54 men and 46 women; mean age: 38.05 ± 10.02 years) and 50 healthy volunteers (26 men and 24 women; mean age: 36.50 ± 12.27 years). Forty-five patients with MTD completed the VTDp 7 days later for test-retest reliability. Patients also completed the Persian Voice Handicap Index (VHIp) to assess construct validity. The results of discriminative validity demonstrated that the VTDp was able to discriminate between patients with MTD and healthy participants. The internal consistency was confirmed with Cronbach α of 0.77 and 0.73 for the VTDp frequency and severity subscales, respectively. The test-retest reliability was excellent with an intraclass correlation coefficient (ICC agreement) of 0.93 for the frequency subscale and 0.91 for the severity subscale. Construct validity of the VTDp was shown with significant correlations between the VTDp frequency and severity subscales and the VHIp total scores (0.36 and 0.37, respectively). The standard error of measurement and smallest detectable change values for VTDp frequency (2.11 and 5.85, respectively) and severity (2.25 and 6.23, respectively) were acceptable. The Bland-Altman analysis for assessing the agreement between test and retest measurements showed no systematic bias. The VTDp is a valid and reliable self-administered scale to measure patients' vocal tract sensations in the Persian-speaking population. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
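The standard error of measurement (SEM) and smallest detectable change (SDC) values in this abstract appear consistent with the widely used relation SDC = 1.96 × √2 × SEM. The sketch below applies that relation to the reported SEM values. That the authors used exactly this formula is an assumption made here, and small differences from the reported SDC values would be expected from rounding of the published SEMs.

```python
import math


def smallest_detectable_change(sem, z=1.96):
    """SDC at the 95% level from the standard error of measurement.
    The sqrt(2) factor accounts for error in both test and retest scores."""
    return z * math.sqrt(2.0) * sem


# SEM values as reported in the abstract: (SEM, reported SDC).
REPORTED = {"frequency": (2.11, 5.85), "severity": (2.25, 6.23)}

for subscale, (sem, reported_sdc) in REPORTED.items():
    computed = smallest_detectable_change(sem)
    print(f"{subscale}: computed SDC = {computed:.2f}, reported = {reported_sdc:.2f}")
```

Read this way, a change in a patient's subscale score smaller than about 6 points could not be distinguished from measurement error.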
Maciej, Peter; Ndao, Ibrahima; Hammerschmidt, Kurt; Fischer, Julia
2013-09-23
To understand the evolution of acoustic communication in animals, it is important to distinguish between the structure and the usage of vocal signals, since both aspects are subject to different constraints. In terrestrial mammals, the structure of calls is largely innate, while individuals have a greater ability to actively initiate or withhold calls. In closely related taxa, one would therefore predict a higher flexibility in call usage compared to call structure. In the present study, we investigated the vocal repertoire of free living Guinea baboons (Papio papio) and examined the structure and usage of the animals' vocal signals. Guinea baboons live in a complex multi-level social organization and exhibit a largely tolerant and affiliative social style, contrary to most other baboon taxa. To classify the vocal repertoire of male and female Guinea baboons, cluster analyses were used and focal observations were conducted to assess the usage of vocal signals in the particular contexts. In general, the vocal repertoire of Guinea baboons largely corresponded to the vocal repertoire of other baboon taxa. The usage of calls, however, differed considerably from other baboon taxa and corresponded with the specific characteristics of the Guinea baboons' social behaviour. While Guinea baboons showed a diminished usage of contest and display vocalizations (a common pattern observed in chacma baboons), they frequently used vocal signals during affiliative and greeting interactions. Our study shows that the call structure of primates is largely unaffected by the species' social system (including grouping patterns and social interactions), while the usage of calls can be more flexibly adjusted, reflecting the quality of social interactions of the individuals. 
Our results support the view that the primary function of social signals is to regulate social interactions, and therefore the degree of competition and cooperation may be more important to explain variation in call usage than grouping patterns or group size.
2013-01-01
Background To understand the evolution of acoustic communication in animals, it is important to distinguish between the structure and the usage of vocal signals, since both aspects are subject to different constraints. In terrestrial mammals, the structure of calls is largely innate, while individuals have a greater ability to actively initiate or withhold calls. In closely related taxa, one would therefore predict a higher flexibility in call usage compared to call structure. In the present study, we investigated the vocal repertoire of free living Guinea baboons (Papio papio) and examined the structure and usage of the animals’ vocal signals. Guinea baboons live in a complex multi-level social organization and exhibit a largely tolerant and affiliative social style, contrary to most other baboon taxa. To classify the vocal repertoire of male and female Guinea baboons, cluster analyses were used and focal observations were conducted to assess the usage of vocal signals in the particular contexts. Results In general, the vocal repertoire of Guinea baboons largely corresponded to the vocal repertoire of other baboon taxa. The usage of calls, however, differed considerably from other baboon taxa and corresponded with the specific characteristics of the Guinea baboons’ social behaviour. While Guinea baboons showed a diminished usage of contest and display vocalizations (a common pattern observed in chacma baboons), they frequently used vocal signals during affiliative and greeting interactions. Conclusions Our study shows that the call structure of primates is largely unaffected by the species’ social system (including grouping patterns and social interactions), while the usage of calls can be more flexibly adjusted, reflecting the quality of social interactions of the individuals. 
Our results support the view that the primary function of social signals is to regulate social interactions, and therefore the degree of competition and cooperation may be more important to explain variation in call usage than grouping patterns or group size. PMID:24059742
Direct numerical simulation of human phonation
NASA Astrophysics Data System (ADS)
Saurabh, Shakti; Bodony, Daniel
2016-11-01
A direct numerical simulation study of the generation and propagation of the human voice in a full-body domain is conducted. A fully compressible fluid flow model, anatomically representative vocal tract geometry, finite deformation model for vocal fold (VF) motion and a fully coupled fluid-structure interaction model are employed. The dynamics of the multi-layered VF tissue with varying stiffness are solved using a quadratic finite element code. The fluid-solid domains are coupled through a boundary-fitted interface and utilize a Poisson equation-based mesh deformation method. A new inflow boundary condition, based upon a quasi-1D formulation with constant sub-glottal volume velocity, linked to the VF movement, has been adopted. Simulations for both child and adult phonation were performed. Acoustic characteristics obtained from these simulations are consistent with expected values. A sensitivity analysis based on VF stiffness variation is undertaken and sound pressure level/fundamental frequency trends are established. An evaluation of the data against the commonly used quasi-1D equations suggests that the latter are not sufficient to model phonation. Phonation threshold pressures are measured for several VF stiffness variations and comparisons to clinical data are carried out. Supported by the National Science Foundation (CAREER Award Number 1150439).
Brainstem origins for cortical 'what' and 'where' pathways in the auditory system.
Kraus, Nina; Nicol, Trent
2005-04-01
We have developed a data-driven conceptual framework that links two areas of science: the source-filter model of acoustics and cortical sensory processing streams. The source-filter model describes the mechanics behind speech production: the identity of the speaker is carried largely in the vocal cord source and the message is shaped by the ever-changing filters of the vocal tract. Sensory processing streams, popularly called 'what' and 'where' pathways, are well established in the visual system as a neural scheme for separately carrying different facets of visual objects, namely their identity and their position/motion, to the cortex. A similar functional organization has been postulated in the auditory system. Both speaker identity and the spoken message, which are simultaneously conveyed in the acoustic structure of speech, can be disentangled into discrete brainstem response components. We argue that these two response classes are early manifestations of auditory 'what' and 'where' streams in the cortex. This brainstem link forges a new understanding of the relationship between the acoustics of speech and cortical processing streams, unites two hitherto separate areas in science, and provides a model for future investigations of auditory function.
Collagen Content Limits Optical Coherence Tomography Image Depth in Porcine Vocal Fold Tissue.
Garcia, Jordan A; Benboujja, Fouzi; Beaudette, Kathy; Rogers, Derek; Maurer, Rie; Boudoux, Caroline; Hartnick, Christopher J
2016-11-01
Vocal fold scarring, a condition defined by increased collagen content, is challenging to treat without a method of noninvasively assessing vocal fold structure in vivo. The goal of this study was to observe the effects of vocal fold collagen content on optical coherence tomography imaging to develop a quantifiable marker of disease. Excised specimen study. Massachusetts Eye and Ear Infirmary. Porcine vocal folds were injected with collagenase to remove collagen from the lamina propria. Optical coherence tomography imaging was performed preinjection and at 0, 45, 90, and 180 minutes postinjection. Mean pixel intensity (or image brightness) was extracted from images of collagenase- and control-treated hemilarynges. Texture analysis of the lamina propria at each injection site was performed to extract image contrast. Two-factor repeated measure analysis of variance and t tests were used to determine statistical significance. Picrosirius red staining was performed to confirm collagenase activity. Mean pixel intensity was higher at injection sites of collagenase-treated vocal folds than control vocal folds (P < .0001). Fold change in image contrast was significantly greater in collagenase-treated vocal folds than in control vocal folds (P = .002). Picrosirius red staining in control specimens revealed collagen fibrils most prominent in the subepithelium and above the thyroarytenoid muscle. Specimens treated with collagenase exhibited a loss of these structures. Collagen removal from vocal fold tissue increases image brightness of underlying structures. This inverse relationship may be useful in treating vocal fold scarring in patients. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2016.
Current Understanding and Future Directions for Vocal Fold Mechanobiology
Li, Nicole Y.K.; Heris, Hossein K.; Mongeau, Luc
2013-01-01
The vocal folds, which are located in the larynx, are the main organ of voice production for human communication. The vocal folds are under continuous biomechanical stress similar to other mechanically active organs, such as the heart, lungs, tendons and muscles. During speech and singing, the vocal folds oscillate at frequencies ranging from 20 Hz to 3 kHz with amplitudes of a few millimeters. The biomechanical stress associated with accumulated phonation is believed to alter vocal fold cell activity and tissue structure in many ways. Excessive phonatory stress can damage tissue structure and induce a cell-mediated inflammatory response, resulting in a pathological vocal fold lesion. On the other hand, phonatory stress is one major factor in the maturation of the vocal folds into a specialized tri-layer structure. One specific form of vocal fold oscillation, which involves low impact and large amplitude excursion, is prescribed therapeutically for patients with mild vocal fold injuries. Although biomechanical forces affect vocal fold physiology and pathology, there is little understanding of how mechanical forces regulate these processes at the cellular and molecular level. Research into vocal fold mechanobiology has burgeoned over the past several years. Vocal fold bioreactors are being developed in several laboratories to provide a biomimic environment that allows the systematic manipulation of physical and biological factors on the cells of interest in vitro. Computer models have been used to simulate the integrated response of cells and proteins as a function of phonation stress. The purpose of this paper is to review current research on the mechanobiology of the vocal folds as it relates to growth, pathogenesis and treatment as well as to propose specific research directions that will advance our understanding of this subject. PMID:24812638
Insights Into the Role of Collagen in Vocal Fold Health and Disease.
Tang, Sharon S; Mohad, Vidisha; Gowda, Madhu; Thibeault, Susan L
2017-09-01
As one of the key fibrous proteins in the extracellular matrix, collagen plays a significant role in the structural and biomechanical characteristics of the vocal fold. Anchored fibrils of collagen create secure structural regions within the vocal folds and are strong enough to sustain vibratory impact and stretch during phonation. This contributes tensile strength, density, and organization to the vocal folds and influences health and pathogenesis. This review offers a comprehensive summary for a current understanding of collagen within normal vocal fold tissues throughout the life span as well as vocal pathology and wound repair. Further, collagen's molecular structure and biosynthesis are discussed. Finally, collagen alterations in tissue injury and repair and the incorporation of collagen-based biomaterials as a method of treating voice disorders are reviewed. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Soderstrom, Ken; Wilson, Ashley R
2013-11-01
Zebra finch song is a learned behavior dependent upon successful progress through a sensitive period of late-postnatal development. This learning is associated with maturation of distinct brain nuclei and the fiber tract interconnections between them. We have previously found remarkably distinct and dense CB1 cannabinoid receptor expression within many of these song control brain regions, implying a normal role for endocannabinoid signaling in vocal learning. Activation of CB1 receptors via daily treatments with exogenous agonist during sensorimotor stages of song learning (but not in adulthood) results in persistent alteration of song patterns. Now we are working to understand physiological changes responsible for this cannabinoid-altered vocal learning. We have found that song-altering developmental treatments are associated with changes in expression of endocannabinoid signaling elements, including CB1 receptors and the principal CNS endogenous agonist, 2-AG. Within CNS, 2-AG is produced largely through activity of the α isoform of the enzyme diacylglycerol lipase (DAGLα). To better appreciate the role of 2-AG production in normal vocal development we have determined the spatial distribution of DAGLα expression within zebra finch CNS during vocal development. Early during vocal development at 25 days, DAGLα staining is typically light and of fibroid processes. Staining peaks late in the sensorimotor stage of song learning at 75 days and is characterized by fiber, neuropil and some staining of both small and large cell somata. Results provide insight to the normal role for endocannabinoid signaling in the maturation of brain regions responsible for song learning and vocal-motor output, and suggest mechanisms by which exogenous cannabinoid exposure alters acquisition of this form of vocal communication. Copyright © 2013 Elsevier B.V. All rights reserved.
An Acoustic Study of Vowels Produced by Alaryngeal Speakers in Taiwan.
Liao, Jia-Shiou
2016-11-01
This study investigated the acoustic properties of 6 Taiwan Southern Min vowels produced by 10 laryngeal speakers (LA), 10 speakers with a pneumatic artificial larynx (PA), and 8 esophageal speakers (ES). Each of the 6 monophthongs of Taiwan Southern Min (/i, e, a, ɔ, u, ə/) was represented by a Taiwan Southern Min character and appeared randomly on a list 3 times (6 Taiwan Southern Min characters × 3 repetitions = 18 tokens). Each Taiwan Southern Min character in this study has the same syllable structure, /V/, and all were read with tone 1 (high and level). Acoustic measurements of the 1st formant, 2nd formant, and 3rd formant were taken for each vowel. Then, vowel space areas (VSAs) enclosed by /i, a, u/ were calculated for each group of speakers. The Euclidean distance between vowels in the pairs /i, a/, /i, u/, and /a, u/ was also calculated and compared across the groups. PA and ES have higher 1st or 2nd formant values than LA for each vowel. The distance is significantly shorter between vowels in the corner vowel pairs /i, a/ and /i, u/. PA and ES have a significantly smaller VSA compared with LA. In accordance with previous studies, alaryngeal speakers have higher formant frequency values than LA because they have a shortened vocal tract as a result of their total laryngectomy. Furthermore, the resonance frequencies are inversely related to the length of the vocal tract (on the basis of the assumption of the source filter theory). PA and ES have a smaller VSA and shorter distances between corner vowels compared with LA, which may be related to speech intelligibility. This hypothesis needs further support from future study.
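The two geometric measures named in the abstract above, the vowel space area enclosed by /i, a, u/ and the Euclidean distance between corner vowels, can be sketched as follows. This is only an illustration: it assumes both measures are taken in the F1-F2 plane (the abstract does not say whether F3 entered the calculation), and the formant values are hypothetical placeholders, not data from the study.

```python
import math


def euclidean_distance(v1, v2):
    """Distance between two vowels given as (F1, F2) pairs in Hz."""
    return math.hypot(v1[0] - v2[0], v1[1] - v2[1])


def vowel_space_area(i, a, u):
    """Area (Hz^2) of the /i, a, u/ triangle in the F1-F2 plane, via the shoelace formula."""
    return abs(
        i[0] * (a[1] - u[1]) + a[0] * (u[1] - i[1]) + u[0] * (i[1] - a[1])
    ) / 2.0


# Hypothetical (F1, F2) values in Hz, for illustration only.
corner_vowels = {"i": (300.0, 2300.0), "a": (800.0, 1300.0), "u": (350.0, 800.0)}

area = vowel_space_area(corner_vowels["i"], corner_vowels["a"], corner_vowels["u"])
dist_i_a = euclidean_distance(corner_vowels["i"], corner_vowels["a"])
print(f"VSA = {area:.0f} Hz^2, d(/i/, /a/) = {dist_i_a:.1f} Hz")
```

A smaller triangle area and shorter pairwise distances would correspond to the compressed vowel space reported for the PA and ES groups.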
Human Language Technology: Opportunities and Challenges
2005-01-01
because of the connections to and reliance on signal processing. Audio diarization critically includes indexing of speakers [12], since speaker ...to reduce inter-speaker variability in training. Standard techniques include vocal-tract length normalization, adaptation of acoustic models using...maximum likelihood linear regression (MLLR), and speaker-adaptive training based on MLLR. The acoustic models are mixtures of Gaussians, typically with
Analysis of 3-D Tongue Motion from Tagged and Cine Magnetic Resonance Images
ERIC Educational Resources Information Center
Xing, Fangxu; Woo, Jonghye; Lee, Junghoon; Murano, Emi Z.; Stone, Maureen; Prince, Jerry L.
2016-01-01
Purpose: Measuring tongue deformation and internal muscle motion during speech has been a challenging task because the tongue deforms in 3 dimensions, contains interdigitated muscles, and is largely hidden within the vocal tract. In this article, a new method is proposed to analyze tagged and cine magnetic resonance images of the tongue during…
Vowel Acoustic Space Development in Children: A Synthesis of Acoustic and Anatomic Data
ERIC Educational Resources Information Center
Vorperian, Houri K.; Kent, Ray D.
2007-01-01
Purpose: This article integrates published acoustic data on the development of vowel production. Age specific data on formant frequencies are considered in the light of information on the development of the vocal tract (VT) to create an anatomic-acoustic description of the maturation of the vowel acoustic space for English. Method: Literature…
Analysis of a Digital Technique for Frequency Transposition of Speech.
1985-09-01
scaled excitation function drives the vocal tract model. In a phone interview with James Kaiser of Bell Laboratories, he mentioned that current thinking...is processed using the Fast Fourier Transform (FFT) and then low pass filtered if desired. ...Flow Chart for
Study of Airflow Out of the Mouth During Speech.
ERIC Educational Resources Information Center
Catford, J.C.; And Others
Airflow outside the mouth is diagnostic of articulatory activities in the vocal tract, both total volume-velocity and the distribution of particle velocities over the flow-front being useful for this purpose. A system for recording and displaying both these types of information is described. This consists of a matrix of 16 hot-wire anemometer flow…
Characteristics of the Lax Vowel Space in Dysarthria
ERIC Educational Resources Information Center
Tjaden, Kris; Rivera, Deanna; Wilding, Gregory; Turner, Greg S.
2005-01-01
It has been hypothesized that lax vowels may be relatively unaffected by dysarthria, owing to the reduced vocal tract shapes required for these phonetic events (G. S. Turner, K. Tjaden, & G. Weismer, 1995). It also has been suggested that lax vowels may be especially susceptible to speech mode effects (M. A. Picheny, N. I. Durlach, & L. D. Braida,…
Using leap motion to investigate the emergence of structure in speech and language.
Eryilmaz, Kerem; Little, Hannah
2017-10-01
In evolutionary linguistics, experiments using artificial signal spaces are being used to investigate the emergence of speech structure. These signal spaces need to be continuous, non-discretized spaces from which discrete units and patterns can emerge. They need to be dissimilar from, but comparable with, the vocal tract, in order to minimize interference from pre-existing linguistic knowledge, while informing us about language. This is a hard balance to strike. This article outlines a new approach that uses the Leap Motion, an infrared controller that can convert manual movement in 3D space into sound. The signal space using this approach is more flexible than signal spaces in previous attempts. Further, output data using this approach is simpler to arrange and analyze. The experimental interface was built using free, and mostly open-source, libraries in Python. We provide our source code for other researchers as open source.
The Use of Voice Cues for Speaker Gender Recognition in Cochlear Implant Recipients
ERIC Educational Resources Information Center
Meister, Hartmut; Fürsen, Katrin; Streicher, Barbara; Lang-Roth, Ruth; Walger, Martin
2016-01-01
Purpose: The focus of this study was to examine the influence of fundamental frequency (F0) and vocal tract length (VTL) modifications on speaker gender recognition in cochlear implant (CI) recipients for different stimulus types. Method: Single words and sentences were manipulated using isolated or combined F0 and VTL cues. Using an 11-point…
Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.
2002-01-01
Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.
ERIC Educational Resources Information Center
Menard, Lucie; Schwartz, Jean-Luc; Boe, Louise-Jean
2004-01-01
The development of speech from infancy to adulthood results from the interaction of neurocognitive factors, by which phonological representations and motor control abilities are gradually acquired, and physical factors, involving the complex changes in the morphology of the articulatory system. In this article, an articulatory-to-acoustic model,…
SDI Software Technology Program Plan Version 1.5
1987-06-01
computer generation of auditory communication of meaningful speech. Most speech synthesizers are based on mathematical models of the human vocal tract, but...oral/auditory and multimodal communications. Although such state-of-the-art interaction technology has not fully matured, user experience has...superior pattern matching capabilities and the subliminal intuitive deduction capability. The error performance of humans can be helped by careful
The Contribution of the Insula to Motor Aspects of Speech Production: A Review and a Hypothesis
ERIC Educational Resources Information Center
Ackermann, Hermann; Riecker, Axel
2004-01-01
Based on clinical and functional imaging data, the left anterior insula has been assumed to support prearticulatory functions of speech motor control such as the ''programming'' of vocal tract gestures. In order to further elucidate this model, a recent functional magnetic resonance imaging (fMRI) study of our group (Riecker, Ackermann,…
Smith, David R R; Walters, Thomas C; Patterson, Roy D
2007-12-01
A recent study [Smith and Patterson, J. Acoust. Soc. Am. 118, 3177-3186 (2005)] demonstrated that both the glottal-pulse rate (GPR) and the vocal-tract length (VTL) of vowel sounds have a large effect on the perceived sex and age (or size) of a speaker. The vowels for all of the "different" speakers in that study were synthesized from recordings of the sustained vowels of one adult male speaker. This paper presents a follow-up study in which a range of vowels were synthesized from recordings of four different speakers--an adult man, an adult woman, a young boy, and a young girl--to determine whether the sex and age of the original speaker would have an effect upon listeners' judgments of whether a vowel was spoken by a man, woman, boy, or girl, after they were equated for GPR and VTL. The sustained vowels of the four speakers were scaled to produce the same combinations of GPR and VTL, which covered the entire range normally encountered in everyday life. The results show that listeners readily distinguish children from adults based on their sustained vowels but that they struggle to distinguish the sex of the speaker.
Chaos tool implementation for non-singer and singer voice comparison (preliminary study)
NASA Astrophysics Data System (ADS)
Dajer, Me; Pereira, Jc; Maciel, Cd
2007-11-01
The voice waveform is linked to how the vocal tract is stretched, shortened, widened, or constricted. The articulation effects of the singer's vocal tract modify the acoustical characteristics of the voice, which differ from those of non-singer voices. In the last decades, Chaos Theory has shown the possibility to explore the dynamic nature of voice signals from a different point of view. The purpose of this paper is to apply the chaos technique of phase space reconstruction to analyze non-singer and singer voices in order to explore the nonlinear dynamics of the signal and correlate them with traditional acoustic parameters. Eight voice samples of sustained vowel /i/ from non-singers and eight from singers were analyzed with "ANL" software. The samples were also acoustically analyzed with "Analise de Voz 5.0" in order to extract the acoustic perturbation measures jitter and shimmer, and the coefficient of excess (EX). The results showed different visual patterns for the two groups, correlated with different jitter, shimmer, and coefficient of excess values. We conclude that these results clearly indicate the potential of the phase space reconstruction technique for analysis and comparison of non-singer and singer voices. They also suggest a promising tool for voice-training applications.
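Phase space reconstruction, as mentioned in the abstract above, is commonly done by time-delay embedding, and jitter is commonly defined as the mean absolute difference between consecutive glottal periods relative to the mean period. The sketch below illustrates both under those assumptions; it is not the algorithm of the "ANL" or "Analise de Voz 5.0" packages, whose internals the abstract does not describe, and the function names, embedding dimension, delay, and test signal are hypothetical choices.

```python
import math


def delay_embed(signal, dim=2, tau=1):
    """Return delay vectors (x[n], x[n + tau], ..., x[n + (dim - 1) * tau])."""
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for the requested embedding")
    return [tuple(signal[n + k * tau] for k in range(dim)) for n in range(n_vectors)]


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods, as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


# Synthetic stand-in for a sustained vowel: a 200 Hz sinusoid sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]

# A delay of 10 samples is a quarter of the 40-sample period, so a pure tone traces a circle.
trajectory = delay_embed(signal, dim=2, tau=10)
print(len(trajectory), trajectory[0])
```

Plotting the pairs in `trajectory` gives the kind of visual pattern the study compares between groups: a perfectly periodic signal yields a closed curve, and irregular vibration spreads it out.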
Vocal Fold Epithelial Barrier in Health and Injury A Research Review
Levendoski, Elizabeth Erickson; Leydon, Ciara; Thibeault, Susan L.
2015-01-01
Purpose: Vocal fold epithelium is composed of layers of individual epithelial cells joined by junctional complexes constituting a unique interface with the external environment. This barrier provides structural stability to the vocal folds and protects underlying connective tissue from injury while being nearly continuously exposed to potentially hazardous insults including environmental or systemic-based irritants such as pollutants and reflux, surgical procedures, and vibratory trauma. Small disruptions in the epithelial barrier may have a large impact on susceptibility to injury and overall vocal health. The purpose of this article is to provide a broad-based review of our current knowledge of the vocal fold epithelial barrier. Methods: A comprehensive review of the literature was conducted. Details of the structure of the vocal fold epithelial barrier are presented and evaluated in the context of function in injury and pathology. The importance of the epithelial-associated vocal fold mucus barrier is also introduced. Results/Conclusions: Information presented in this review is valuable for clinicians and researchers as it highlights the importance of this understudied portion of the vocal folds to overall vocal health and disease. Prevention and treatment of injury to the epithelial barrier is a significant area awaiting further investigation. PMID:24686981
Yang, Jubiao; Wang, Xingshi; Krane, Michael; Zhang, Lucy T.
2017-01-01
In this study, a fully-coupled fluid–structure interaction model is developed for studying dynamic interactions between compressible fluid and aeroelastic structures. The technique is built on the modified Immersed Finite Element Method (mIFEM), a robust numerical technique for simulating fluid–structure interactions that can simulate high Reynolds number flows and handle large density disparities between the fluid and the solid. For accurate assessment of this intricate dynamic process between a compressible fluid, such as air, and aeroelastic structures, we included in the model the fluid compressibility in an isentropic process and a solid contact model. The accuracy of the compressible fluid solver is verified by examining acoustic wave propagation in a closed and an open duct, respectively. The fully-coupled fluid–structure interaction model is then used to simulate and analyze vocal fold vibrations using compressible air interacting with vocal folds that are represented as layered viscoelastic structures. Using a physiological geometric and parametric setup, we are able to obtain a self-sustained vocal fold vibration with a constant inflow pressure. Parametric studies are also performed to study the effects of lung pressure and vocal fold tissue stiffness on vocal fold vibrations. All the case studies produce expected airflow behavior and a sustained vibration, which provide verification and confidence in our future studies of realistic acoustical studies of the phonation process. PMID:29527067
Koda, Hiroki; Tokuda, Isao T; Wakita, Masumi; Ito, Tsuyoshi; Nishimura, Takeshi
2015-06-01
Whistle-like high-pitched "phee" calls are often used as long-distance vocal advertisements by small-bodied marmosets and tamarins in the dense forests of South America. While the source-filter theory proposes that vibration of the vocal fold is modified independently from the resonance of the supralaryngeal vocal tract (SVT) in human speech, a source-filter coupling that constrains the vibration frequency to SVT resonance effectively produces loud tonal sounds in some musical instruments. Here, a combined approach of acoustic analyses and simulation with helium-modulated voices was used to show that phee calls are produced principally with the same mechanism as in human speech. The animal keeps the fundamental frequency (f0) close to the first formant (F1) of the SVT, to amplify f0. Although f0 and F1 are primarily independent, the degree of their tuning can be strengthened further by a flexible source-filter interaction, the variable strength of which depends upon the cross-sectional area of the laryngeal cavity. The results highlight the evolutionary antiquity and universality of the source-filter model in primates, but the study can also explore the diversification of vocal physiology, including source-filter interaction and its anatomical basis in non-human primates.
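Why helium is informative in studies like the one above can be illustrated with the textbook idealization of a vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). Under that idealization, raising the speed of sound c shifts every resonance upward by the same factor and lengthening the tube lowers them, while a purely source-driven f0 would be expected to change little. The tube length and sound speeds below are rough illustrative values, not measurements from the study, and real helium-air mixtures fall between the two speeds.

```python
def tube_resonances(length_m, speed_of_sound_m_s, count=3):
    """Resonances (Hz) of an idealized uniform tube closed at one end: (2n - 1) c / (4 L)."""
    return [
        (2 * n - 1) * speed_of_sound_m_s / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Illustrative assumptions: a 17.5 cm tube, about 350 m/s in warm humid air,
# and roughly 1000 m/s in pure helium.
TUBE_LENGTH_M = 0.175
C_AIR = 350.0
C_HELIUM = 1000.0

in_air = tube_resonances(TUBE_LENGTH_M, C_AIR)
in_helium = tube_resonances(TUBE_LENGTH_M, C_HELIUM)
for f_air, f_he in zip(in_air, in_helium):
    print(f"{f_air:7.1f} Hz in air -> {f_he:7.1f} Hz in helium")
```

The same formula shows the inverse relation between resonance frequency and tract length: halving `length_m` doubles every resonance.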
Voicing produced by a constant velocity lung source
Howe, M. S.; McGowan, R. S.
2013-01-01
An investigation is made of the influence of subglottal boundary conditions on the prediction of voiced sounds. It is generally assumed in mathematical models of voicing that vibrations of the vocal folds are maintained by a constant subglottal mean pressure pI, whereas voicing is actually initiated by contraction of the chest cavity until the subglottal pressure becomes large enough to separate the vocal folds. The problem is reformulated to determine voicing characteristics in terms of a prescribed volumetric flow rate Qo of air from the lungs—the evolution of the resulting time-dependent subglottal mean pressure p¯_(t) is then governed by glottal mechanics, the aeroacoustics of the vocal tract, and the influence of continued contraction of the lungs. The new problem is analyzed in detail for an idealized mechanical vocal system that permits precise specification of all boundary conditions. Predictions of the glottal volume velocity pulse shape are found to be in good general agreement with the traditional constant-pI theory when pI is set equal to the time averaged value of p¯_(t). But, in all cases examined the constant-pI approximation yields values of the mean flow rates Qo and sound pressure levels that are smaller by as much as 10%. PMID:23556600
High speed MRI of laryngeal gestures during speech production
NASA Astrophysics Data System (ADS)
Nissenbaum, Jon; Hillman, Robert E.; Kobler, James B.; Curtin, Hugh D.; Halle, Morris; Kirsch, John E.
2002-05-01
Dynamic sequences of magnetic resonance images (MRI) of the vocal tract were obtained with a frame rate of 144 frames/second. Changes in vertical position and length of the vocal folds, both observable in the mid-sagittal plane, have been argued to play a role in consonant production in addition to their primary function in the control of vocal fundamental frequency (F0) [W. G. Ewan and R. Krones, J. Phonet. 2, 327-335 (1974); A. Lofqvist et al., Haskins Lab. Status Report Speech Res., SR-97/98, pp. 25-40, 1989], but temporal resolution of available techniques has hindered direct imaging of these articulations. A novel data acquisition sequence was used to circumvent the imaging time imposed by standard MRI (typically 100-500 ms). Images were constructed by having subjects rhythmically repeat short utterances 256 times using the same F0 contour. Sixty-four lines of MR data were sampled during each repetition, at 7 millisecond increments, yielding partial raw data sets for 64 time points. After all repetitions were completed, one frame per time point was constructed by combining raw data from the corresponding time point during every repetition. Preliminary results indicate vocal fold shortening and lowering only during voiced consonants and in production of lower F0.
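The acquisition scheme described above, in which each repetition of the utterance contributes one piece of raw data to every time point and the frames are assembled afterwards, can be sketched as a simple regrouping of data by time point. The counts (256 repetitions, 64 time points, 7 ms increments) come from the abstract; the function name and the placeholder "lines" are hypothetical, and a real reconstruction would also need an inverse Fourier transform of each assembled raw-data set, which is omitted here.

```python
def assemble_frames(repetitions):
    """Regroup raw data so that frames[t] holds the line sampled at time point t
    in every repetition; repetitions[r][t] is the line from repetition r."""
    n_time_points = len(repetitions[0])
    if any(len(rep) != n_time_points for rep in repetitions):
        raise ValueError("every repetition must cover the same time points")
    return [[rep[t] for rep in repetitions] for t in range(n_time_points)]


N_REPETITIONS = 256   # utterance repetitions (from the abstract)
N_TIME_POINTS = 64    # samples per repetition (from the abstract)
INCREMENT_S = 0.007   # 7 ms between samples (from the abstract)

# Placeholder raw data: each "line" is just a (repetition, time point) label.
repetitions = [[(r, t) for t in range(N_TIME_POINTS)] for r in range(N_REPETITIONS)]
frames = assemble_frames(repetitions)

effective_frame_rate = 1.0 / INCREMENT_S          # about 143 frames per second
covered_window_s = N_TIME_POINTS * INCREMENT_S    # about 0.45 s of the utterance
print(len(frames), len(frames[0]), round(effective_frame_rate), covered_window_s)
```

The effective rate of about 143 frames per second that follows from a 7 ms increment is close to the 144 frames/second quoted in the abstract; the small difference is presumably rounding of the increment.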
Effects of a Straw Phonation Protocol on Acoustic and Perceptual Measures of an SATB Chorus.
Manternach, Jeremy N; Daugherty, James F
2017-12-29
Recent scholarship has suggested that semi-occluded vocal tract (SOVT) exercises may increase vocal economy of individuals by reducing vocal effort while maintaining or increasing acoustic output. Choral singers, however, may use different resonance techniques or change voicing behaviors in an effort to hear their own sound in relation to others. One investigation revealed significant increases in a choir's mean spectral energy after participating in a straw phonation protocol. However, that study reported only acoustic measures and did not include choristers' perceptions of the choral sound and their own voicing efficiency. The purpose of this study was to measure the effect of a straw phonation protocol on acoustic (long-term average spectrum) and perceptual (self-report) measures of the choral sound of an intact soprano, alto, tenor, and bass (SATB) choir. This is a quasi-experimental, one-group, pretest-posttest design. An SATB choir (N = 48 singers) performed a Renaissance motet, participated in a 4-minute voicing protocol with a small straw, and then sang the motet a second time. They completed the same procedure later in the rehearsal. Long-term average spectrum results indicated no statistically significant mean changes in spectral energy after the SOVT protocols. Most participants, however, perceived that the choir sounded better (78.26%) and that their own vocal production was more efficient or comfortable (73.91%) following the protocol. Choristers perceived less vocal effort while maintaining vocal output after straw phonation, which may feasibly align with extant solo research. More research may determine whether this result is due specifically to SOVTs. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Laryngeal evidence for the first and second passaggio in professionally trained sopranos
Burk, Fabian; Köberlein, Marie; Selamtzis, Andreas; Döllinger, Michael; Burdumy, Michael; Richter, Bernhard
2017-01-01
Introduction: Due to a lack of empirical data, the current understanding of the laryngeal mechanics in the passaggio regions (i.e., the fundamental frequency ranges where vocal registration events usually occur) of the female singing voice is still limited. Material and methods: In this study the first and second passaggio regions of 10 professionally trained female classical soprano singers were analyzed. The sopranos performed pitch glides from A3 (ƒo = 220 Hz) to A4 (ƒo = 440 Hz) and from A4 (ƒo = 440 Hz) to A5 (ƒo = 880 Hz) on the vowel [iː]. Vocal fold vibration was assessed with trans-nasal high speed videoendoscopy at 20,000 fps, complemented by simultaneous electroglottographic (EGG) and acoustic recordings. Register breaks were perceptually rated by 12 voice experts. Voice stability was documented with the EGG-based sample entropy. Glottal opening and closing patterns during the passaggi were analyzed, supplemented with open quotient data extracted from the glottal area waveform. Results: In both the first and the second passaggio, variations of vocal fold vibration patterns were found. Four distinct patterns emerged: smooth transitions with either increasing or decreasing durations of glottal closure, abrupt register transitions, and intermediate loss of vocal fold contact. Audible register transitions (in both the first and second passaggi) generally coincided with higher sample entropy values and higher open quotient variance through the respective passaggi. Conclusions: Noteworthy vocal fold oscillatory registration events occur in both the first and the second passaggio even in professional sopranos. The respective transitions are hypothesized to be caused by either (a) a change of laryngeal biomechanical properties; or by (b) vocal tract resonance effects, constituting level 2 source-filter interactions. PMID:28467509
Effects of the epilarynx area on vocal fold dynamics and the primary voice signal.
Döllinger, Michael; Berry, David A; Luegmair, Georg; Hüttner, Björn; Bohr, Christopher
2012-05-01
For the analysis of vocal fold dynamics, sub- and supraglottal influences must be taken into account, as recent studies have shown. In this work, we analyze the influence of changes in the epilaryngeal area on vocal fold dynamics. We investigate two excised female larynges in a hemilarynx setup combined with a synthetic vocal tract consisting of hard plastic and simulating the vowel /a/. Eigenmodes, amplitudes, and velocities of the oscillations, the subglottal pressures (P_sub), and sound pressure levels (SPLs) of the generated signal are investigated as a function of three distinctive epilaryngeal areas (28.4 mm², 71.0 mm², and 205.9 mm²). The results showed that the SPL is independent of the epilarynx cross section and exhibits a nonlinear relation to the insufflated airflow. The P_sub decreased with an increase in the epilaryngeal area and displayed linear relations to the airflow. The principal eigenfunctions (EEFs) from the vocal fold dynamics exhibited lateral movement for the first EEF and rotational motion for the second EEF. In total, the first two EEFs covered a minimum of 60% of the energy, with an average of more than 50% for the first EEF. Correlations to the epilarynx areas were not found. Maximal values for amplitudes (up to 2.5 mm) and velocities (up to 1.57 mm/ms) changed with varying epilaryngeal area but did not show consistent behavior for both larynges. We conclude that the size of the epilaryngeal area has significant influence on vocal fold dynamics but does not significantly affect the resultant SPL. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Jiang, Lei; Han, Juan; Yang, Limin; Ma, Hongchao; Huang, Bo
2015-10-07
Vocal folds have a complex multilayer structure in which the main layer is largely composed of hyaluronan (HA). The viscoelasticity of HA is key to voice production in the vocal fold as it affects the initiation and maintenance of phonation. In this study a simple layer-structured surface model was set up to mimic the structure of the vocal folds. The interactions between two opposing surfaces bearing HA were measured and characterised to analyse HA's response to normal and shear compression at a stress level similar to that in the vocal fold. From the measurements of the quartz crystal microbalance, atomic force microscopy and the surface force balance, the osmotic pressure, normal interactions, elasticity change, volume fraction, refractive index and friction of both HA and the supporting protein layer were obtained. These findings may shed light on the physical mechanism of HA function in the vocal fold and the specific role of HA as an important component in the effective treatment of vocal fold disease.
Influence of Left-Right Asymmetries on Voice Quality in Simulated Paramedian Vocal Fold Paralysis
ERIC Educational Resources Information Center
Samlan, Robin A.; Story, Brad H.
2017-01-01
Purpose: The purpose of this study was to determine the vocal fold structural and vibratory symmetries that are important to vocal function and voice quality in a simulated paramedian vocal fold paralysis. Method: A computational kinematic speech production model was used to simulate an exemplar "voice" on the basis of asymmetric…
ERIC Educational Resources Information Center
Xuan, Yue; Zhang, Zhaoyan
2014-01-01
Purpose: The purpose of this study was to explore the possible structural and material property features that may facilitate complete glottal closure in an otherwise isotropic physical vocal fold model. Method: Seven vocal fold models with different structural features were used in this study. An isotropic model was used as the baseline model, and…
Garland, Ellen C; Goldizen, Anne W; Lilley, Matthew S; Rekdahl, Melinda L; Garrigue, Claire; Constantine, Rochelle; Hauser, Nan Daeschler; Poole, M Michael; Robbins, Jooke; Noad, Michael J
2015-08-01
For cetaceans, population structure is traditionally determined by molecular genetics or photographically identified individuals. Acoustic data, however, has provided information on movement and population structure with less effort and cost than traditional methods in an array of taxa. Male humpback whales (Megaptera novaeangliae) produce a continually evolving vocal sexual display, or song, that is similar among all males in a population. The rapid cultural transmission (the transfer of information or behavior between conspecifics through social learning) of different versions of this display between distinct but interconnected populations in the western and central South Pacific region presents a unique way to investigate population structure based on the movement dynamics of a song (acoustic) display. Using 11 years of data, we investigated an acoustically based population structure for the region by comparing stereotyped song sequences among populations and years. We used the Levenshtein distance technique to group previously defined populations into (vocally based) clusters based on the overall similarity of their song display in space and time. We identified the following distinct vocal clusters: western cluster, 1 population off eastern Australia; central cluster, populations around New Caledonia, Tonga, and American Samoa; and eastern region, either a single cluster or 2 clusters, one around the Cook Islands and the other off French Polynesia. These results are consistent with the hypothesis that each breeding aggregation represents a distinct population (each occupied a single, terminal node) in a metapopulation, similar to the current understanding of population structure based on genetic and photo-identification studies. However, the central vocal cluster had higher levels of song-sharing among populations than the other clusters, indicating that levels of vocal connectivity varied within the region. 
Our results demonstrate the utility and value of using culturally transmitted vocal patterns as a way of defining connectivity to infer population structure. We suggest vocal patterns be incorporated by the International Whaling Commission in conjunction with traditional methods in the assessment of structure. © 2015, Society for Conservation Biology.
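The Levenshtein distance comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the theme labels in `song_a` and `song_b` are hypothetical, and normalizing by the length of the longer sequence is one common way to turn the distance into a similarity score, assumed here for illustration.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty sequence costs i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]: 1 minus distance over the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical song sequences, each element standing for one song theme.
song_a = ["A", "B", "C", "D"]
song_b = ["A", "C", "D", "E"]

print(levenshtein(song_a, song_b), similarity(song_a, song_b))
```

A pairwise similarity matrix built this way over populations and years could then be passed to a standard hierarchical clustering routine to obtain groupings of the kind the abstract describes.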
Status Report on Speech Research
1992-06-01
fronts following pn. -ieme of English). As for phonetic coarticulation, Keating proposes... Phonological and Articulatory Characteristics of Spoken Language ...activated motoneurons will produce the functionally-specific... tic and antagonistic muscles, and the fixed boundaries of the vocal tract (the immobile...00121. tative methods will be presented. 45 46 Gracca INTERPRETATION OF MOVEMENT respective acoustic signals, it can be seen that there are marked
Nocturnal "humming" vocalizations: adding a piece to the puzzle of giraffe vocal communication.
Baotic, Anton; Sicks, Florian; Stoeger, Angela S
2015-09-09
Recent research reveals that giraffes (Giraffa camelopardalis sp.) exhibit a socially structured, fission-fusion system. In other species possessing this kind of society, information exchange is important and vocal communication is usually well developed. But is this true for giraffes? Giraffes are known to produce sounds, but there is no evidence that they use vocalizations for communication. Reports on giraffe vocalizations are mainly anecdotal and the missing acoustic descriptions make it difficult to establish a call nomenclature. Despite inconclusive evidence to date, it is widely assumed that giraffes produce infrasonic vocalizations similar to elephants. In order to initiate a more detailed investigation of the vocal communication in giraffes, we collected data of captive individuals during day and night. We particularly focussed on detecting tonal, infrasonic or sustained vocalizations. We collected over 947 h of audio material in three European zoos and quantified the spectral and temporal components of acoustic signals to obtain an accurate set of acoustic parameters. Besides the known bursts, snorts and grunts, we detected harmonic, sustained and frequency-modulated "humming" vocalizations during night recordings. None of the recorded vocalizations were within the infrasonic range. These results show that giraffes do produce vocalizations, which, based on their acoustic structure, might have the potential to function as communicative signals to convey information about the physical and motivational attributes of the caller. The data further reveal that the assumption of infrasonic communication in giraffes needs to be considered with caution and requires further investigations in future studies.
Helium Speech: An Application of Standing Waves
NASA Astrophysics Data System (ADS)
Wentworth, Christopher D.
2011-04-01
Taking a breath of helium gas and then speaking or singing to the class is a favorite demonstration for an introductory physics course, as it usually elicits appreciative laughter, which serves to energize the class session. Students will usually report that the helium speech "raises the frequency" of the voice. A more accurate description of the phenomenon requires that we distinguish between the frequencies of sound produced by the larynx and the filtering of those frequencies by the vocal tract. We will describe here an experiment done by introductory physics students that uses helium speech as a context for learning about the human vocal system and as an application of the standing sound-wave concept. Modern acoustic analysis software easily obtained by instructors for student use allows data to be obtained and analyzed quickly.
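The source/filter distinction drawn in this abstract can be illustrated with a rough standing-wave estimate. The sketch below assumes the vocal tract behaves like a uniform tube closed at the glottis and open at the lips, so its resonances are f_n = (2n - 1) c / (4 L). The tract length (0.17 m) and the sound speeds (about 343 m/s in air, about 1007 m/s in pure helium at room temperature) are textbook approximations chosen for illustration, not values taken from the article; a real speaker's tract holds a helium-air mixture, so the actual shift is smaller.

```python
def tube_resonances(length_m, sound_speed, count=3):
    """Resonances (Hz) of a uniform closed-open tube: f_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * sound_speed / (4 * length_m) for n in range(1, count + 1)]


TRACT_LENGTH_M = 0.17   # assumed adult vocal tract length
C_AIR = 343.0           # approximate speed of sound in air, m/s
C_HELIUM = 1007.0       # approximate speed of sound in pure helium, m/s

air = tube_resonances(TRACT_LENGTH_M, C_AIR)
helium = tube_resonances(TRACT_LENGTH_M, C_HELIUM)

for f_air, f_he in zip(air, helium):
    # Each resonance scales by the ratio of sound speeds; the tube length is unchanged.
    print(f"{f_air:7.0f} Hz in air -> {f_he:7.0f} Hz in helium")
```

In this simple model every resonance is multiplied by the same factor (the ratio of the sound speeds), while the vibration rate of the vocal folds, which sets the pitch, does not depend on the gas in the tube. That is why the timbre changes far more than the fundamental frequency.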
Limiting parental feedback disrupts vocal development in marmoset monkeys
Gultekin, Yasemin B.; Hage, Steffen R.
2017-01-01
Vocalizations of human infants undergo dramatic changes across the first year by becoming increasingly mature and speech-like. Human vocal development is partially dependent on learning by imitation through social feedback between infants and caregivers. Recent studies revealed similar developmental processes being influenced by parental feedback in marmoset monkeys for apparently innate vocalizations. Marmosets produce infant-specific vocalizations that disappear after the first postnatal months. However, it is yet unclear whether parental feedback is an obligate requirement for proper vocal development. Using quantitative measures to compare call parameters and vocal sequence structure we show that, in contrast to normally raised marmosets, marmosets that were separated from parents after the third postnatal month still produced infant-specific vocal behaviour at subadult stages. These findings suggest a significant role of social feedback on primate vocal development until the subadult stages and further show that marmoset monkeys are a compelling model system for early human vocal development. PMID:28090084
Zarate, Jean Mary
2013-01-01
Singing provides a unique opportunity to examine music performance—the musical instrument is contained wholly within the body, thus eliminating the need for creating artificial instruments or tasks in neuroimaging experiments. Here, more than two decades of voice and singing research will be reviewed to give an overview of the sensory-motor control of the singing voice, starting from the vocal tract and leading up to the brain regions involved in singing. Additionally, to demonstrate how sensory feedback is integrated with vocal motor control, recent functional magnetic resonance imaging (fMRI) research on somatosensory and auditory feedback processing during singing will be presented. The relationship between the brain and singing behavior will be explored also by examining: (1) neuroplasticity as a function of various lengths and types of training, (2) vocal amusia due to a compromised singing network, and (3) singing performance in individuals with congenital amusia. Finally, the auditory-motor control network for singing will be considered alongside dual-stream models of auditory processing in music and speech to refine both these theoretical models and the singing network itself. PMID:23761746
Provine, Robert R.; Emmorey, Karen
2008-01-01
The placement of laughter in the speech of hearing individuals is not random but “punctuates” speech, occurring during pauses and at phrase boundaries where punctuation would be placed in a transcript of a conversation. For speakers, language is dominant in the competition for the vocal tract since laughter seldom interrupts spoken phrases. For users of American Sign Language, however, laughter and language do not compete in the same way for a single output channel. This study investigated whether laughter occurs simultaneously with signing, or punctuates signing, as it does speech, in 11 signed conversations (with two to five participants) that had at least one instance of audible, vocal laughter. Laughter occurred 2.7 times more often during pauses and at phrase boundaries than simultaneously with a signed utterance. Thus, the production of laughter involves higher order cognitive or linguistic processes rather than the low-level regulation of motor processes competing for a single vocal channel. In an examination of other variables, the social dynamics of deaf and hearing people were similar, with “speakers” (those signing) laughing more than their audiences and females laughing more than males. PMID:16891353
Provine, Robert R; Emmorey, Karen
2006-01-01
The placement of laughter in the speech of hearing individuals is not random but "punctuates" speech, occurring during pauses and at phrase boundaries where punctuation would be placed in a transcript of a conversation. For speakers, language is dominant in the competition for the vocal tract since laughter seldom interrupts spoken phrases. For users of American Sign Language, however, laughter and language do not compete in the same way for a single output channel. This study investigated whether laughter occurs simultaneously with signing, or punctuates signing, as it does speech, in 11 signed conversations (with two to five participants) that had at least one instance of audible, vocal laughter. Laughter occurred 2.7 times more often during pauses and at phrase boundaries than simultaneously with a signed utterance. Thus, the production of laughter involves higher order cognitive or linguistic processes rather than the low-level regulation of motor processes competing for a single vocal channel. In an examination of other variables, the social dynamics of deaf and hearing people were similar, with "speakers" (those signing) laughing more than their audiences and females laughing more than males.
A nose that roars: anatomical specializations and behavioural features of rutting male saiga
Frey, Roland; Volodin, Ilya; Volodina, Elena
2007-01-01
The involvement of the unique saiga nose in vocal production has been neglected so far. Rutting male saigas produce loud nasal roars. Prior to roaring, they tense and extend their noses in a highly stereotypic manner. This change of nose configuration includes dorsal folding and convex curving of the nasal vestibulum and is maintained until the roar ends. Red and fallow deer males that orally roar achieve a temporary increase of vocal tract length (vtl) by larynx retraction. Saiga males attain a similar effect by pulling their flexible nasal vestibulum rostrally, allowing for a temporary elongation of the nasal vocal tract by about 20%. Decrease of formant frequencies and formant dispersion, as acoustic effects of an increase of vtl, are assumed to convey important information on the quality of a dominant male to conspecifics, e.g. on body size and fighting ability. Nasal roaring in saiga may equally serve to deter rival males and to attract females. Anatomical constraints might have set a limit to the rostral pulling of the nasal vestibulum. It seems likely that the sexual dimorphism of the saiga nose was induced by sexual selection. Adult males of many mammalian species, after sniffing or licking female urine or genital secretions, raise their head and strongly retract their upper lip and small nasal vestibulum while inhaling orally. This flehmen behaviour is assumed to promote transport of non-volatile substances via the incisive ducts into the vomeronasal organs for pheromone detection. The flehmen aspect in saiga involves the extensive flexible walls of the greatly enlarged nasal vestibulum and is characterized by a distinctly concave configuration of the nose region, the reverse of that observed in nasal roaring. A step-by-step model for the gradual evolution of the saiga nose is presented here. PMID:17971116
Disrupting vagal feedback affects birdsong motor control.
Méndez, Jorge M; Dall'asén, Analía G; Goller, Franz
2010-12-15
Coordination of different motor systems for sound production involves the use of feedback mechanisms. Song production in oscines is a well-established animal model for studying learned vocal behavior. Whereas the online use of auditory feedback has been studied in the songbird model, very little is known about the role of other feedback mechanisms. Auditory feedback is required for the maintenance of stereotyped adult song. In addition, the use of somatosensory feedback to maintain pressure during song has been demonstrated with experimentally induced fluctuations in air sac pressure. Feedback information mediating this response is thought to be routed to the central nervous system via afferent fibers of the vagus nerve. Here, we tested the effects of unilateral vagotomy on the peripheral motor patterns of song production and the acoustic features. Unilateral vagotomy caused a variety of disruptions and alterations to the respiratory pattern of song, some of which affected the acoustic structure of vocalizations. These changes were most pronounced a few days after nerve resection and varied between individuals. In the most extreme cases, the motor gestures of respiration were so severely disrupted that individual song syllables or the song motif were atypically terminated. Acoustic changes also suggest altered use of the two sound generators and upper vocal tract filtering, indicating that the disruption of vagal feedback caused changes to the motor program of all motor systems involved in song production and modification. This evidence for the use of vagal feedback by the song system with disruption of song during the first days after nerve cut provides a contrast to the longer-term effects of auditory feedback disruption. It suggests a significant role for somatosensory feedback that differs from that of auditory feedback.
Disrupting vagal feedback affects birdsong motor control
Méndez, Jorge M.; Dall'Asén, Analía G.; Goller, Franz
2010-01-01
Coordination of different motor systems for sound production involves the use of feedback mechanisms. Song production in oscines is a well-established animal model for studying learned vocal behavior. Whereas the online use of auditory feedback has been studied in the songbird model, very little is known about the role of other feedback mechanisms. Auditory feedback is required for the maintenance of stereotyped adult song. In addition, the use of somatosensory feedback to maintain pressure during song has been demonstrated with experimentally induced fluctuations in air sac pressure. Feedback information mediating this response is thought to be routed to the central nervous system via afferent fibers of the vagus nerve. Here, we tested the effects of unilateral vagotomy on the peripheral motor patterns of song production and the acoustic features. Unilateral vagotomy caused a variety of disruptions and alterations to the respiratory pattern of song, some of which affected the acoustic structure of vocalizations. These changes were most pronounced a few days after nerve resection and varied between individuals. In the most extreme cases, the motor gestures of respiration were so severely disrupted that individual song syllables or the song motif were atypically terminated. Acoustic changes also suggest altered use of the two sound generators and upper vocal tract filtering, indicating that the disruption of vagal feedback caused changes to the motor program of all motor systems involved in song production and modification. This evidence for the use of vagal feedback by the song system with disruption of song during the first days after nerve cut provides a contrast to the longer-term effects of auditory feedback disruption. It suggests a significant role for somatosensory feedback that differs from that of auditory feedback. PMID:21113000
Subglottal Impedance-Based Inverse Filtering of Voiced Sounds Using Neck Surface Acceleration
Zañartu, Matías; Ho, Julio C.; Mehta, Daryush D.; Hillman, Robert E.; Wodicka, George R.
2014-01-01
A model-based inverse filtering scheme is proposed for an accurate, non-invasive estimation of the aerodynamic source of voiced sounds at the glottis. The approach, referred to as subglottal impedance-based inverse filtering (IBIF), takes as input the signal from a lightweight accelerometer placed on the skin over the extrathoracic trachea and yields estimates of glottal airflow and its time derivative, offering important advantages over traditional methods that deal with the supraglottal vocal tract. The proposed scheme is based on mechano-acoustic impedance representations from a physiologically-based transmission line model and a lumped skin surface representation. A subject-specific calibration protocol is used to account for individual adjustments of subglottal impedance parameters and mechanical properties of the skin. Preliminary results for sustained vowels with various voice qualities show that the subglottal IBIF scheme yields comparable estimates with respect to current aerodynamics-based methods of clinical vocal assessment. A mean absolute error of less than 10% was observed for two glottal airflow measures (maximum flow declination rate and amplitude of the modulation component) that have been associated with the pathophysiology of some common voice disorders caused by faulty and/or abusive patterns of vocal behavior (i.e., vocal hyperfunction). The proposed method further advances the ambulatory assessment of vocal function based on the neck acceleration signal, which previously has been limited to the estimation of phonation duration, loudness, and pitch. Subglottal IBIF is also suitable for other ambulatory applications in speech communication, in which further evaluation is underway. PMID:25400531
ERIC Educational Resources Information Center
Cleland, Joanne; Mccron, Caitlin; Scobbie, James M.
2013-01-01
Speakers possess a natural capacity for lip reading; analogous to this, there may be an intuitive ability to "tongue-read." Although the ability of untrained participants to perceive aspects of the speech signal has been explored for some visual representations of the vocal tract (e.g. talking heads), it is not yet known to what extent…
Imaging for understanding speech communication: Advances and challenges
NASA Astrophysics Data System (ADS)
Narayanan, Shrikanth
2005-04-01
Research in speech communication has relied on a variety of instrumentation methods to illuminate details of speech production and perception. One longstanding challenge has been the ability to examine real-time changes in the shaping of the vocal tract; a goal that has been furthered by imaging techniques such as ultrasound, movement tracking, and magnetic resonance imaging. The spatial and temporal resolution afforded by these techniques, however, has limited the scope of the investigations that could be carried out. In this talk, we focus on some recent advances in magnetic resonance imaging that allow us to perform near real-time investigations on the dynamics of vocal tract shaping during speech. Examples include Demolin et al. (2000) (4-5 images/second, ultra-fast turbo spin echo) and Mady et al. (2001, 2002) (8 images/second, T1 fast gradient echo). A recent study by Narayanan et al. (2004) that used a spiral readout scheme to accelerate image acquisition has allowed for image reconstruction rates of 24 images/second. While these developments offer exciting prospects, a number of challenges lie ahead, including: (1) improving image acquisition protocols, hardware for enhancing signal-to-noise ratio, and optimizing spatial sampling; (2) acquiring quality synchronized audio; and (3) analyzing and modeling image data including cross-modality registration. [Work supported by NIH and NSF.]
Altieri, Nicholas; Pisoni, David B.; Townsend, James T.
2012-01-01
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield’s feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration. PMID:21968081
Altieri, Nicholas; Pisoni, David B; Townsend, James T
2011-01-01
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration.
Mackersie, Carol L.; Dewey, James; Guthrie, Lesli A.
2011-01-01
The purpose was to determine the effect of hearing loss on the ability to separate competing talkers using talker differences in fundamental frequency (F0) and apparent vocal-tract length (VTL). Performance of 13 adults with hearing loss and 6 adults with normal hearing was measured using the Coordinate Response Measure. For listeners with hearing loss, the speech was amplified and filtered according to the NAL-RP hearing aid prescription. Target-to-competition ratios varied from 0 to 9 dB. The target sentence was randomly assigned to the higher or lower values of F0 or VTL on each trial. Performance improved for F0 differences up to 9 and 6 semitones for people with normal hearing and hearing loss, respectively, but only when the target talker had the higher F0. Recognition for the lower F0 target improved when trial-to-trial uncertainty was removed (9-semitone condition). Scores improved with increasing differences in VTL for the normal-hearing group. On average, hearing-impaired listeners did not benefit from VTL cues, but substantial inter-subject variability was observed. The amount of benefit from VTL cues was related to the average hearing loss in the 1–3-kHz region when the target talker had the shorter VTL. PMID:21877813
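The F0 separations in this abstract are given in semitones, which map to frequency ratios through the equal-tempered relation ratio = 2^(n/12). The sketch below shows the conversion in both directions; the 100 Hz base F0 is a hypothetical value used only for illustration and is not taken from the study.

```python
import math


def f0_ratio(semitones):
    """Frequency ratio corresponding to a separation of `semitones`."""
    return 2.0 ** (semitones / 12.0)


def semitones_between(f_low_hz, f_high_hz):
    """Separation in semitones between two fundamental frequencies."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


BASE_F0_HZ = 100.0  # hypothetical lower-talker F0

for n in (6, 9):  # the separations named in the abstract
    print(f"{n} semitones above {BASE_F0_HZ:.0f} Hz is {BASE_F0_HZ * f0_ratio(n):.1f} Hz")
```

Under this relation a 6-semitone separation corresponds to a ratio of the square root of 2 (roughly 1.41) and a 9-semitone separation to a ratio of roughly 1.68.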
NASA Astrophysics Data System (ADS)
Wolfe, Joe; Smith, John; Tann, John; France, Ryan
2002-11-01
Acoustic pressures may generally be measured with much greater sensitivity, dynamic range, and frequency response than acoustic currents. Consequently, most measurements of acoustic impedance consist of comparison with standard impedances. The method reported here uses a semi-infinite waveguide as the reference because its impedance is purely resistive, frequency independent and accurately known, independent of theories of the boundary layer. Waveguides are effectively infinite for pulses shorter than the echo return time, or if the attenuation due to wall losses (typically 80 dB) exceeds the dynamic range of the experiment. The measurement signal from a high output impedance source is calibrated to have Fourier components proportional to f^n, where n may be 1 for convenience or chosen to improve the signal-to-noise ratio. The method has been used on diverse systems over the range 50 Hz to 13 kHz. When applied to systems with simple geometries, the technique yields results with slightly higher wall losses than those expected from the calculations of Rayleigh and Benade. Discontinuities introduce further losses as well as the expected departures from simple one-dimensional models. Measurements on musical wind instruments and on the human vocal tract are reported. [Work supported by the Australian Research Council.]
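The two properties of the reference waveguide named in this abstract can be sketched numerically. Under the standard lossless plane-wave idealization (a textbook simplification, not necessarily the authors' calibration), a duct of cross-sectional area S has characteristic impedance Z0 = ρc/S, which is real and independent of frequency, and a pulse sent into a duct of length L returns after 2L/c. The air density, sound speed, and pipe dimensions below are assumed values chosen for illustration.

```python
import math

RHO_AIR = 1.2    # assumed air density, kg/m^3
C_AIR = 343.0    # assumed speed of sound in air, m/s


def characteristic_impedance(radius_m, rho=RHO_AIR, c=C_AIR):
    """Plane-wave characteristic impedance Z0 = rho * c / S of a lossless circular duct (Pa*s/m^3)."""
    area = math.pi * radius_m ** 2
    return rho * c / area


def echo_return_time(length_m, c=C_AIR):
    """Round-trip time (s) for a pulse reflected from the far end of the duct."""
    return 2.0 * length_m / c


# Hypothetical reference pipe: 7.5 mm radius, 100 m long.
radius, length = 0.0075, 100.0
print(f"Z0 = {characteristic_impedance(radius):.3e} Pa*s/m^3")
print(f"echo returns after {echo_return_time(length):.3f} s")
```

Any measurement pulse shorter than the echo return time sees the pipe as if it were infinitely long, which is what lets a finite pipe serve as a purely resistive reference.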
Rodent ultrasonic vocalizations are bound to active sniffing behavior
Sirotin, Yevgeniy B.; Costa, Martín Elias; Laplagne, Diego A.
2014-01-01
During rodent active behavior, multiple orofacial sensorimotor behaviors, including sniffing and whisking, display rhythmicity in the theta range (~5–10 Hz). During specific behaviors, these rhythmic patterns interlock, such that execution of individual motor programs becomes dependent on the state of the others. Here we performed simultaneous recordings of the respiratory cycle and ultrasonic vocalization emission by adult rats and mice in social settings. We used automated analysis to examine the relationship between breathing patterns and vocalization over long time periods. Rat ultrasonic vocalizations (USVs, “50 kHz”) were emitted within stretches of active sniffing (5–10 Hz) and were largely absent during periods of passive breathing (1–4 Hz). Because ultrasound was tightly linked to the exhalation phase, the sniffing cycle segmented vocal production into discrete calls and imposed its theta rhythmicity on their timing. In turn, calls briefly prolonged exhalations, causing an immediate drop in sniffing rate. Similar results were obtained in mice. Our results show that ultrasonic vocalizations are an integral part of the rhythmic orofacial behavioral ensemble. This complex behavioral program is thus involved not only in active sensing but also in the temporal structuring of social communication signals. Many other social signals of mammals, including monkey calls and human speech, show structure in the theta range. Our work points to a mechanism for such structuring in rodent ultrasonic vocalizations. PMID:25477796
Lewis, James W.; Talkington, William J.; Walker, Nathan A.; Spirou, George A.; Jajosky, Audrey; Frum, Chris
2009-01-01
The ability to detect and rapidly process harmonic sounds, which in nature are typical of animal vocalizations and speech, can be critical for communication among conspecifics and for survival. Single-unit studies have reported neurons in auditory cortex sensitive to specific combinations of frequencies (e.g. harmonics), theorized to rapidly abstract or filter for specific structures of incoming sounds, where large ensembles of such neurons may constitute spectral templates. We studied the contribution of harmonic structure to activation of putative spectral templates in human auditory cortex by using a wide variety of animal vocalizations, as well as artificially constructed iterated rippled noises (IRNs). Both the IRNs and vocalization sounds were quantitatively characterized by calculating a global harmonics-to-noise ratio (HNR). Using fMRI we identified HNR-sensitive regions when presenting either artificial IRNs and/or recordings of natural animal vocalizations. This activation included regions situated between functionally defined primary auditory cortices and regions preferential for processing human non-verbal vocalizations or speech sounds. These results demonstrate that the HNR of sound reflects an important second-order acoustic signal attribute that parametrically activates distinct pathways of human auditory cortex. Thus, these results provide novel support for putative spectral templates, which may subserve a major role in the hierarchical processing of vocalizations as a distinct category of behaviorally relevant sound. PMID:19228981
Optical coherence tomography monitoring of vocal fold femtosecond laser microsurgery
NASA Astrophysics Data System (ADS)
Wisweh, Henning; Merkel, Ulrich; Hüller, Ann-Kristin; Lüerßen, Kathrin; Lubatschowski, Holger
2007-07-01
Surgery of benign pathological alterations of the vocal folds results in permanent dysphonia if the boundaries of the vocal fold layers are disregarded. Precise cutting with a femtosecond laser (fs-laser) combined with simultaneous imaging of the layered structure enables accurate resections with respect to the layer boundaries. Earlier works demonstrated the capability of optical coherence tomography (OCT) for utilization on vocal folds. The layered structure can be imaged with a spatial resolution of 10-20 μm up to a depth of 1.5 mm. The performance of fs-laser cutting was analyzed on extracted porcine vocal folds with OCT monitoring. Histopathological sections of the same processed samples could be well correlated with the OCT images. With adequate laser parameters thermal effects induced only negligible damage to the processed tissue. The dimensions of the thermal necrosis were determined to be smaller than 1 μm. OCT-controlled fs-laser cutting of porcine vocal fold tissue in the μm range with minimal tissue damage is presented.
Seagraves, Kelly M.; Arthur, Ben J.; Egnor, S. E. Roian
2016-01-01
Mice (Mus musculus) form large and dynamic social groups and emit ultrasonic vocalizations in a variety of social contexts. Surprisingly, these vocalizations have been studied almost exclusively in the context of cues from only one social partner, despite the observation that in many social species the presence of additional listeners changes the structure of communication signals. Here, we show that male vocal behavior elicited by female odor is affected by the presence of a male audience, with changes in vocalization count, acoustic structure and syllable complexity. We further show that single sensory cues are not sufficient to elicit this audience effect, indicating that multiple cues may be necessary for an audience to be apparent. Together, these experiments reveal that some features of mouse vocal behavior are only expressed in more complex social situations, and introduce a powerful new assay for measuring detection of the presence of social partners in mice. PMID:27207951
Seagraves, Kelly M; Arthur, Ben J; Egnor, S E Roian
2016-05-15
Mice (Mus musculus) form large and dynamic social groups and emit ultrasonic vocalizations in a variety of social contexts. Surprisingly, these vocalizations have been studied almost exclusively in the context of cues from only one social partner, despite the observation that in many social species the presence of additional listeners changes the structure of communication signals. Here, we show that male vocal behavior elicited by female odor is affected by the presence of a male audience - with changes in vocalization count, acoustic structure and syllable complexity. We further show that single sensory cues are not sufficient to elicit this audience effect, indicating that multiple cues may be necessary for an audience to be apparent. Together, these experiments reveal that some features of mouse vocal behavior are only expressed in more complex social situations, and introduce a powerful new assay for measuring detection of the presence of social partners in mice. © 2016. Published by The Company of Biologists Ltd.
Hodges-Simeon, Carolyn R; Gaulin, Steven J C; Puts, David A
2011-06-01
Men's copulatory success can often be predicted by measuring traits involved in male contests and female choice. Previous research has demonstrated relationships between one such vocal trait in men, mean fundamental frequency (F(0)), and the outcomes and indicators of sexual success with women. The present study investigated the role of another vocal parameter, F(0) variation (the within-subject SD in F(0) across the utterance, F(0)-SD), in predicting men's reported number of female sexual partners in the last year. Male participants (N = 111) competed with another man for a date with a woman. Recorded interactions with the competitor ("competitive recording") and the woman ("courtship recording") were analyzed for five non-linguistic vocal parameters: F(0)-SD, mean F(0), intensity, duration, and formant dispersion (D( f ), an acoustic correlate of vocal tract length), as well as dominant and attractive linguistic content. After controlling for age and attitudes toward uncommitted sex (SOI), lower F(0)-SD (i.e., a more monotone voice) and more dominant linguistic content were strong predictors of the number of past-year sexual partners, whereas mean F(0) and D( f ) did not significantly predict past-year partners. These contrasts have implications for the relative importance of male contests and female choice in shaping men's mating success and hence the origins and maintenance of sexually dimorphic traits in humans.
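The two acoustic measures defined in this abstract are simple to compute once an F(0) track and formant estimates are available. The sketch below is a minimal illustration under stated assumptions: unvoiced frames are assumed to be marked as 0 or NaN, formant dispersion is taken as the mean spacing between adjacent formants, and the function names are hypothetical rather than taken from the study.

```python
import numpy as np

def f0_statistics(f0_track_hz):
    """Return (mean F0, F0-SD) over the voiced frames of one utterance.

    Assumption: unvoiced frames are encoded as 0 or NaN and are excluded.
    """
    f0 = np.asarray(f0_track_hz, dtype=float)
    voiced = f0[np.isfinite(f0) & (f0 > 0)]
    if voiced.size < 2:
        raise ValueError("need at least two voiced frames")
    # ddof=1 gives the sample standard deviation within the speaker.
    return voiced.mean(), voiced.std(ddof=1)

def formant_dispersion(formants_hz):
    """Mean spacing between adjacent formants, (F_n - F_1) / (n - 1)."""
    f = np.sort(np.asarray(formants_hz, dtype=float))
    if f.size < 2:
        raise ValueError("need at least two formants")
    return (f[-1] - f[0]) / (f.size - 1)
```

A lower F(0)-SD corresponds to the more monotone voice described above. For an idealized uniform tube closed at one end, adjacent formants are spaced c/(2L) apart, which is why D(f) serves as a correlate of vocal tract length; that tube model is a textbook simplification, not a claim from the abstract.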
Fulcher, Lewis P.; Scherer, Ronald C.
2011-01-01
In an important paper on the physics of small amplitude oscillations, Titze showed that the essence of the vertical phase difference, which allows energy to be transferred from the flowing air to the motion of the vocal folds, could be captured in a surface wave model, and he derived a formula for the phonation threshold pressure with an explicit dependence on the geometrical and biomechanical properties of the vocal folds. The formula inspired a series of experiments [e.g., R. Chan and I. Titze, J. Acoust. Soc. Am 119, 2351–2362 (2006)]. Although the experiments support many aspects of Titze’s formula, including a linear dependence on the glottal half-width, the behavior of the experiments at the smallest values of this parameter is not consistent with the formula. It is shown that a key element for removing this discrepancy lies in a careful examination of the properties of the entrance loss coefficient. In particular, measurements of the entrance loss coefficient at small widths done with a physical model of the glottis (M5) show that this coefficient varies inversely with the glottal width. A numerical solution of the time-dependent equations of the surface wave model shows that adding a supraglottal vocal tract lowers the phonation threshold pressure by an amount approximately consistent with Chan and Titze’s experiments. PMID:21895097
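The reasoning about small glottal widths can be sketched schematically. The symbols here are assumptions introduced for illustration and are not quoted from the paper: $\xi_0$ is the glottal half-width, $k$ is the pressure-loss coefficient, $C$ lumps together the geometric and biomechanical factors of the vocal folds, and $a$, $b$ are empirical constants.

```latex
P_{\mathrm{th}} = C\,k\,\xi_0 ,
\qquad
k(\xi_0) \approx \frac{a}{\xi_0} + b
\quad\Longrightarrow\quad
P_{\mathrm{th}} \approx C\,\bigl(a + b\,\xi_0\bigr)
```

With a constant coefficient, the threshold pressure would fall to zero as $\xi_0 \to 0$. If the coefficient instead grows inversely with width, the product stays finite and the threshold approaches a nonzero intercept $C\,a$, which is one way to read the discrepancy at the smallest widths that the abstract describes.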
Fulcher, Lewis P; Scherer, Ronald C
2011-09-01
In an important paper on the physics of small amplitude oscillations, Titze showed that the essence of the vertical phase difference, which allows energy to be transferred from the flowing air to the motion of the vocal folds, could be captured in a surface wave model, and he derived a formula for the phonation threshold pressure with an explicit dependence on the geometrical and biomechanical properties of the vocal folds. The formula inspired a series of experiments [e.g., R. Chan and I. Titze, J. Acoust. Soc. Am 119, 2351-2362 (2006)]. Although the experiments support many aspects of Titze's formula, including a linear dependence on the glottal half-width, the behavior of the experiments at the smallest values of this parameter is not consistent with the formula. It is shown that a key element for removing this discrepancy lies in a careful examination of the properties of the entrance loss coefficient. In particular, measurements of the entrance loss coefficient at small widths done with a physical model of the glottis (M5) show that this coefficient varies inversely with the glottal width. A numerical solution of the time-dependent equations of the surface wave model shows that adding a supraglottal vocal tract lowers the phonation threshold pressure by an amount approximately consistent with Chan and Titze's experiments. © 2011 Acoustical Society of America
Charlton, Benjamin D.; Ellis, William A. H.; McKinnon, Allan J.; Brumm, Jacqui; Nilsson, Karen; Fitch, W. Tecumseh
2011-01-01
The ability to signal individual identity using vocal signals and distinguish between conspecifics based on vocal cues is important in several mammal species. Furthermore, it can be important for receivers to differentiate between callers in reproductive contexts. In this study, we used acoustic analyses to determine whether male koala bellows are individually distinctive and to investigate the relative importance of different acoustic features for coding individuality. We then used a habituation-discrimination paradigm to investigate whether koalas discriminate between the bellow vocalisations of different male callers. Our results show that male koala bellows are highly individualized, and indicate that cues related to vocal tract filtering contribute the most to vocal identity. In addition, we found that male and female koalas habituated to the bellows of a specific male showed a significant dishabituation when they were presented with bellows from a novel male. The significant reduction in behavioural response to a final rehabituation playback shows this was not a chance rebound in response levels. Our findings indicate that male koala bellows are highly individually distinctive and that the identity of male callers is functionally relevant to male and female koalas during the breeding season. We go on to discuss the biological relevance of signalling identity in this species' sexual communication and the potential practical implications of our findings for acoustic monitoring of male population levels. PMID:21633499
Social Vocalizations of Big Brown Bats Vary with Behavioral Context
Gadziola, Marie A.; Grimsley, Jasmine M. S.; Faure, Paul A.; Wenstrup, Jeffrey J.
2012-01-01
Bats are among the most gregarious and vocal mammals, with some species demonstrating a diverse repertoire of syllables under a variety of behavioral contexts. Despite extensive characterization of big brown bat (Eptesicus fuscus) biosonar signals, there have been no detailed studies of adult social vocalizations. We recorded and analyzed social vocalizations and associated behaviors of captive big brown bats under four behavioral contexts: low aggression, medium aggression, high aggression, and appeasement. Even limited to these contexts, big brown bats possess a rich repertoire of social vocalizations, with 18 distinct syllable types automatically classified using a spectrogram cross-correlation procedure. For each behavioral context, we describe vocalizations in terms of syllable acoustics, temporal emission patterns, and typical syllable sequences. Emotion-related acoustic cues are evident within the call structure by context-specific syllable types or variations in the temporal emission pattern. We designed a paradigm that could evoke aggressive vocalizations while monitoring heart rate as an objective measure of internal physiological state. Changes in the magnitude and duration of elevated heart rate scaled to the level of evoked aggression, confirming the behavioral state classifications assessed by vocalizations and behavioral displays. These results reveal a complex acoustic communication system among big brown bats in which acoustic cues and call structure signal the emotional state of a caller. PMID:22970247
Hahn, Allison H; Campbell, Kimberley A; Congdon, Jenna V; Hoang, John; McMillan, Neil; Scully, Erin N; Yong, Joshua J H; Elie, Julie E; Sturdy, Christopher B
2017-07-01
Chickadees produce a multi-note chick-a-dee call in multiple socially relevant contexts. One component of this call is the D note, which is a low-frequency and acoustically complex note with a harmonic-like structure. In the current study, we tested black-capped chickadees on a between-category operant discrimination task using vocalizations with acoustic structures similar to black-capped chickadee D notes, but produced by various songbird species, in order to examine the role that phylogenetic distance plays in acoustic perception of vocal signals. We assessed the extent to which discrimination performance was influenced by the phylogenetic relatedness among the species producing the vocalizations and by the phylogenetic relatedness between the subjects' species (black-capped chickadees) and the vocalizers' species. We also conducted a bioacoustic analysis and discriminant function analysis in order to examine the acoustic similarities among the discrimination stimuli. A previous study has shown that neural activation in black-capped chickadee auditory and perceptual brain regions is similar following the presentation of these vocalization categories. However, we found that chickadees had difficulty discriminating between forward and reversed black-capped chickadee D notes, a result that directly corresponded to the bioacoustic analysis indicating that these stimulus categories were acoustically similar. In addition, our results suggest that the discrimination between vocalizations produced by two parid species (chestnut-backed chickadees and tufted titmice) is perceptually difficult for black-capped chickadees, a finding that is likely in part because these vocalizations contain acoustic similarities. Overall, our results provide evidence that black-capped chickadees' perceptual abilities are influenced by both phylogenetic relatedness and acoustic structure.
In Vivo Measurement of Pediatric Vocal Fold Motion Using Structured Light Laser Projection
Patel, Rita R.; Donohue, Kevin D.; Lau, Daniel; Unnikrishnan, Harikrishnan
2013-01-01
Objective: The aim of the study was to present the development of a miniature structured light laser projection endoscope and to quantify vocal fold length and vibratory features related to impact stress of the pediatric glottis using high-speed imaging. Study Design: The custom-developed laser projection system consists of a green laser with a 4-mm diameter optics module at the tip of the endoscope, projecting 20 vertical laser lines on the glottis. Measurements of absolute phonatory vocal fold length, membranous vocal fold length, peak amplitude, amplitude-to-length ratio, average closing velocity, and impact velocity were obtained in five children (6–9 years), two adult male and three adult female participants without voice disorders, and one child (10 years) with bilateral vocal fold nodules during modal phonation. Results: Independent measurements made on the glottal length of a vocal fold phantom demonstrated a 0.13 mm bias error with a standard deviation of 0.23 mm, indicating adequate precision and accuracy for measuring vocal fold structures and displacement. First, in vivo measurements of amplitude-to-length ratio, peak closing velocity, and impact velocity during phonation in pediatric population and a child with vocal fold nodules are reported. Conclusion: The proposed laser projection system can be used to obtain in vivo measurements of absolute length and vibratory features in children and adults. Children have a large amplitude-to-length ratio compared with typically developing adults, whereas nodules result in larger peak amplitude, amplitude-to-length ratio, average closing velocity, and impact velocity compared with typically developing children. PMID:23809569
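The phantom validation figures (a bias error and a standard deviation) and the amplitude-to-length ratio are simple summary statistics. The sketch below shows how such values could be computed from repeated measurements; the function names and the choice of length used as the denominator are assumptions for illustration, since the abstract does not spell them out.

```python
import numpy as np

def bias_and_precision(measured_mm, true_mm):
    """Return (bias, standard deviation) of repeated length measurements.

    Bias is the mean signed error against the known phantom length;
    the sample standard deviation of the errors describes precision.
    """
    errors = np.asarray(measured_mm, dtype=float) - float(true_mm)
    if errors.size < 2:
        raise ValueError("need at least two measurements")
    return errors.mean(), errors.std(ddof=1)

def amplitude_to_length_ratio(peak_amplitude_mm, vocal_fold_length_mm):
    """Dimensionless ratio of peak vibratory amplitude to vocal fold length.

    Assumption: both quantities are expressed in the same absolute units.
    """
    if vocal_fold_length_mm <= 0:
        raise ValueError("length must be positive")
    return peak_amplitude_mm / vocal_fold_length_mm
```

Because the ratio is dimensionless, it allows vibratory amplitude to be compared between children and adults despite their different vocal fold lengths, provided both inputs come from calibrated absolute measurements such as those the laser projection provides.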
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dornfeld, Ken; Simmons, Joel R.; Karnell, Lucy
Purpose: To test the hypothesis that radiation dose to key sites in the upper aerodigestive tract is associated with long-term functional outcome after (chemo)radiotherapy for head-and-neck cancers. Methods and Materials: This study examined the outcome for 27 patients treated with intensity-modulated radiotherapy for definitive management of their head-and-neck cancer who were disease free for at least 1 year after treatment. Head-and-neck cancer-specific quality of life (QoL) was assessed before treatment and at 1 year after treatment. Type of diet tolerated, presence of a feeding tube, and degree of weight loss 1 year after treatment were also used as outcome measures. Radiation doses delivered to various points along the upper aerodigestive tract, including base of tongue, lateral pharyngeal walls, and laryngeal structures, were determined from each treatment plan. Radiation doses for each of these points were tested for correlation with outcome measures. Results: Higher doses delivered to the aryepiglottic folds, false vocal cords, and lateral pharyngeal walls near the false cords correlated with a more restrictive diet, and higher doses to the aryepiglottic folds correlated with greater weight loss (p < 0.05) 1 year after therapy. Better posttreatment speech QoL scores were associated with lower doses delivered to structures within and surrounding the larynx. Conclusion: Our data show an inverse relationship between radiation dose delivered to laryngeal structures and speech and diet and QoL outcomes after definitive (chemo)radiation treatment. These findings suggest that efforts to deliver lower doses to laryngeal structures may improve outcomes after definitive (chemo)radiation therapy.
North Indian Classical Vocal Music for the Classroom
ERIC Educational Resources Information Center
Arya, Divya D.
2015-01-01
This article offers information that will allow music educators to incorporate North Indian classical vocal music into a multicultural music education curriculum. Obstacles to teaching North Indian classical vocal music are acknowledged, including lack of familiarity with the cultural/structural elements and challenges in teaching ear training and…
Vocal Fold Epithelial Barrier in Health and Injury: A Research Review
ERIC Educational Resources Information Center
Levendoski, Elizabeth Erickson; Leydon, Ciara; Thibeault, Susan L.
2014-01-01
Purpose: Vocal fold epithelium is composed of layers of individual epithelial cells joined by junctional complexes constituting a unique interface with the external environment. This barrier provides structural stability to the vocal folds and protects underlying connective tissue from injury while being nearly continuously exposed to potentially…
Fluid-Structure Interactions with Flexible and Rigid Bodies
NASA Astrophysics Data System (ADS)
Daily, David Jesse
Fluid-structure interactions occur to some extent in nearly every type of fluid flow. Understanding how structures interact with fluids and vice versa is of vital importance in many engineering applications. The purpose of this research is to explore how fluids interact with flexible and rigid structures. A computational model was used to model the fluid-structure interactions of vibrating synthetic vocal folds. The model simulated the coupling of the fluid and solid domains using a fluid-structure interface boundary condition. The fluid domain used a slightly compressible flow solver to allow for the possibility of acoustic coupling with the subglottal geometry and vibration of the vocal fold model. As the subglottis lengthened, the frequency of vibration decreased until a new acoustic mode could form in the subglottis. Synthetic aperture particle image velocimetry (SAPIV) is a three-dimensional particle tracking technique. SAPIV was used to image the jet of air that emerges from vibrating human vocal folds (glottal jet) during phonation. The three-dimensional reconstruction of the glottal jet found faint evidence of flow characteristics seen in previous research, such as axis-switching, but did not have sufficient resolution to detect small features. SAPIV was further applied to reconstruct the smaller flow characteristics of the glottal jet of vibrating synthetic vocal folds. Two- and four-layer synthetic vocal fold models were used to determine how the glottal jet from the synthetic models compared to the glottal jet from excised human vocal folds. The two- and four-layer models clearly exhibited axis-switching, which has been seen in other 3D analyses of the glottal jet. Cavitation in a quiescent fluid can break a rigid structure such as a glass bottle. A new cavitation number was derived to include acceleration and pressure head at cavitation onset. 
A cavitation stick was used to validate the cavitation number by filling it with different depths and hitting the stick to cause fluid cavitation. Acceleration was measured using an accelerometer and cavitation bubbles were detected using a high-speed camera. Cavitation in an accelerating fluid occurred at a cavitation number of 1. Keywords: Fluid structure interaction, vocal folds, acoustics, SAPIV, cavitation, slightly compressible
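The abstract does not give the expression for the new cavitation number. As a hedged sketch, one plausible form compares the pressure margin above vapor pressure with the pressure drop that an accelerating liquid column can generate, i.e. (p_ref - p_vapor) / (rho * a * h). That form, the function name, and the parameter names below are assumptions for illustration only.

```python
def acceleration_cavitation_number(p_ref_pa, p_vapor_pa, density_kg_m3,
                                   accel_m_s2, depth_m):
    """Assumed acceleration-based cavitation number.

    Numerator: how far the reference pressure sits above vapor pressure.
    Denominator: pressure change produced by accelerating a liquid column
    of the given depth (rho * a * h). Values near or below 1 would then
    indicate that the acceleration can pull the liquid down to vapor pressure.
    """
    denominator = density_kg_m3 * accel_m_s2 * depth_m
    if denominator <= 0:
        raise ValueError("density, acceleration and depth must be positive")
    return (p_ref_pa - p_vapor_pa) / denominator
```

Under this assumed definition, a deeper fill or a harder impact lowers the number, which is consistent with the reported procedure of varying fill depth and measuring acceleration to find onset at a value of about 1.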
Vocal learning in the functionally referential food grunts of chimpanzees.
Watson, Stuart K; Townsend, Simon W; Schel, Anne M; Wilke, Claudia; Wallace, Emma K; Cheng, Leveda; West, Victoria; Slocombe, Katie E
2015-02-16
One standout feature of human language is our ability to reference external objects and events with socially learned symbols, or words. Exploring the phylogenetic origins of this capacity is therefore key to a comprehensive understanding of the evolution of language. While non-human primates can produce vocalizations that refer to external objects in the environment, it is generally accepted that their acoustic structure is fixed and a product of arousal states. Indeed, it has been argued that the apparent lack of flexible control over the structure of referential vocalizations represents a key discontinuity with language. Here, we demonstrate vocal learning in the acoustic structure of referential food grunts in captive chimpanzees. We found that, following the integration of two groups of adult chimpanzees, the acoustic structure of referential food grunts produced for a specific food converged over 3 years. Acoustic convergence arose independently of preference for the food, and social network analyses indicated this only occurred after strong affiliative relationships were established between the original subgroups. We argue that these data represent the first evidence of non-human animals actively modifying and socially learning the structure of a meaningful referential vocalization from conspecifics. Our findings indicate that primate referential call structure is not simply determined by arousal and that the socially learned nature of referential words in humans likely has ancient evolutionary origins. Copyright © 2015 Elsevier Ltd. All rights reserved.
Youngsters do not pay attention to conversational rules: is this so for nonhuman primates?
Lemasson, A; Glas, L; Barbu, S; Lacroix, A; Guilloux, M; Remeuf, K; Koda, H
2011-01-01
The potentiality to find precursors of human language in nonhuman primates is questioned because of differences related to the genetic determinism of human and nonhuman primate acoustic structures. Limiting the debate to production and acoustic plasticity might have led to underestimating parallels between human and nonhuman primates. Adult-young differences concerning vocal usage have been reported in various primate species. A key feature of language is the ability to converse, respecting turn-taking rules. Turn-taking structures some nonhuman primates' adult vocal exchanges, but the development and the cognitive relevancy of this rule have never been investigated in monkeys. Our observations of Campbell's monkeys' spontaneous vocal utterances revealed that juveniles broke the turn-taking rule more often than did experienced adults. Only adults displayed different levels of interest when hearing playbacks of vocal exchanges respecting or not the turn-taking rule. This study strengthens parallels between human conversations and nonhuman primate vocal exchanges.
Fournet, Michelle E; Szabo, Andy; Mellinger, David K
2015-01-01
On low-latitude breeding grounds, humpback whales produce complex and highly stereotyped songs as well as a range of non-song sounds associated with breeding behaviors. While on their Southeast Alaskan foraging grounds, humpback whales produce a range of previously unclassified non-song vocalizations. This study investigates the vocal repertoire of Southeast Alaskan humpback whales from a sample of 299 non-song vocalizations collected over a 3-month period on foraging grounds in Frederick Sound, Southeast Alaska. Three classification systems were used, including aural spectrogram analysis, statistical cluster analysis, and discriminant function analysis, to describe and classify vocalizations. A hierarchical acoustic structure was identified; vocalizations were classified into 16 individual call types nested within four vocal classes. The combined classification method shows promise for identifying variability in call stereotypy between vocal groupings and is recommended for future classification of broad vocal repertoires.
Chen, Cheryl Chia-Hui; Wu, Kuo-Hsiang; Ku, Shih-Chi; Chan, Ding-Cheng; Lee, Jang-Jaer; Wang, Tyng-Guey; Hsiao, Tzu-Yu
2018-06-01
To describe the sequelae of oral endotracheal intubation by evaluating prevalence rates of structural injury, hyposalivation, and impaired vocal production over 14 days following extubation. Consecutive adults (≥20 years, N=114) with prolonged (≥48 h) endotracheal intubation were enrolled from medical intensive care units at a university hospital. Participants were assessed by trained nurses at 2, 7, and 14 days after extubation, using a standardized bedside screening protocol. Within 48 hours postextubation, structural injuries were common, with 51% having restricted mouth opening. Unstimulated salivary flow was reduced in 43%. For vocal production, 51% had inadequate breathing support for phonation, dysphonia was common (94% had hoarseness and 36% showed reduced efficiency of vocal fold closure), and >40% had impaired articulatory precision. By 14 days postextubation, recovery was noted in most conditions, but reduced efficiency of vocal fold closure persisted. Restricted mouth opening (39%) and reduced salivary flow (34%) remained highly prevalent. After extubation, restricted mouth opening, reduced salivary flow, and dysphonia were common and prolonged in recovery. Reduced efficiency of vocal cord closure persisted at 14 days postextubation. The extent and duration of these sequelae remind clinicians to screen for them up to 2 weeks after extubation. Copyright © 2017 Elsevier Inc. All rights reserved.
Expression and distribution of hyaluronic acid and CD44 in unphonated human vocal fold mucosa.
Sato, Kiminori; Umeno, Hirohito; Nakashima, Tadashi; Nonaka, Satoshi; Harabuchi, Yasuaki
2009-11-01
The tension caused by phonation (vocal fold vibration) is hypothesized to stimulate vocal fold stellate cells (VFSCs) in the maculae flavae (MFe) to accelerate production of extracellular matrices. The distribution of hyaluronic acid (HA) and expression of CD44 (a cell surface receptor for HA) were examined in human vocal fold mucosae (VFMe) that had remained unphonated since birth. Five specimens of VFMe (3 adults, 2 children) that had remained unphonated since birth were investigated with Alcian blue staining, hyaluronidase digestion, and immunohistochemistry for CD44. The VFMe containing MFe were hypoplastic and rudimentary. The VFMe did not have a vocal ligament, Reinke's space, or a layered structure, and the lamina propria appeared as a uniform structure. In the children, HA was distributed in the VFMe containing MFe. In the adults, HA had decreased in the VFMe containing MFe. In both groups, the VFSCs in the MFe and the fibroblasts in the lamina propria expressed little CD44. This study supports the hypothesis that the tensions caused by vocal fold vibration stimulate the VFSCs in the MFe to accelerate production of extracellular matrices and form the layered structure. Phonation after birth is one of the important factors in the growth and development of the human VFMe.
Speech production following partial glossectomy.
Fletcher, S G
1988-08-01
Changes in the dimensions and patterns of articulation used by three speakers to compensate for different amounts of tongue tissue excised during partial glossectomy were investigated. Place of articulation was shifted to parts of the vocal tract congruent with the speakers' surgically altered lingual morphology. Certain metrical properties of the articulatory gestures, such as width of the sibilant groove, were maintained. Intelligibility data indicated that perceptually acceptable substitute sounds could be produced by such transposed gestures.
NASA Astrophysics Data System (ADS)
Saidi, Hiba; Erath, Byron D.
2015-11-01
The vocal folds play a major role in human communication by initiating voiced sound production. During voiced speech, the vocal folds are set into sustained vibrations. Synthetic self-oscillating vocal fold models are regularly employed to gain insight into flow-structure interactions governing the phonation process. Commonly, a fixed boundary condition is applied to the lateral, anterior, and posterior sides of the synthetic vocal fold models. However, physiological observations reveal the presence of adipose tissue on the lateral surface between the thyroid cartilage and the vocal folds. The goal of this study is to investigate the influence of including this substrate layer of adipose tissue on the dynamics of phonation. For a more realistic representation of the human vocal folds, synthetic multi-layer vocal fold models have been fabricated and tested while including a soft lateral layer representative of adipose tissue. Phonation parameters have been collected and are compared to those of the standard vocal fold models. Results show that vocal fold kinematics are affected by adding the adipose tissue layer as a new boundary condition.
In Vivo measurement of pediatric vocal fold motion using structured light laser projection.
Patel, Rita R; Donohue, Kevin D; Lau, Daniel; Unnikrishnan, Harikrishnan
2013-07-01
The aim of the study was to present the development of a miniature structured light laser projection endoscope and to quantify vocal fold length and vibratory features related to impact stress of the pediatric glottis using high-speed imaging. The custom-developed laser projection system consists of a green laser with a 4-mm diameter optics module at the tip of the endoscope, projecting 20 vertical laser lines on the glottis. Measurements of absolute phonatory vocal fold length, membranous vocal fold length, peak amplitude, amplitude-to-length ratio, average closing velocity, and impact velocity were obtained in five children (6-9 years), two adult male and three adult female participants without voice disorders, and one child (10 years) with bilateral vocal fold nodules during modal phonation. Independent measurements made on the glottal length of a vocal fold phantom demonstrated a 0.13 mm bias error with a standard deviation of 0.23 mm, indicating adequate precision and accuracy for measuring vocal fold structures and displacement. First, in vivo measurements of amplitude-to-length ratio, peak closing velocity, and impact velocity during phonation in pediatric population and a child with vocal fold nodules are reported. The proposed laser projection system can be used to obtain in vivo measurements of absolute length and vibratory features in children and adults. Children have a large amplitude-to-length ratio compared with typically developing adults, whereas nodules result in larger peak amplitude, amplitude-to-length ratio, average closing velocity, and impact velocity compared with typically developing children. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Cazau, Dorian; Adam, Olivier; Aubin, Thierry; Laitman, Jeffrey T; Reidenberg, Joy S
2016-10-10
Although mammalian vocalizations are predominantly harmonically structured, they can exhibit an acoustic complexity with nonlinear vocal sounds, including deterministic chaos and frequency jumps. Such sounds are normative events in mammalian vocalizations, and can be directly traceable to the nonlinear nature of vocal-fold dynamics underlying typical mammalian sound production. In this study, we give qualitative descriptions and quantitative analyses of nonlinearities in the song repertoire of humpback whales from the Ste Marie channel (Madagascar) to provide more insight into the potential communication functions and underlying production mechanisms of these features. A low-dimensional biomechanical modeling of the whale's U-fold (vocal folds homolog) is used to relate specific vocal mechanisms to nonlinear vocal features. Recordings of living humpback whales were searched for occurrences of vocal nonlinearities (instabilities). Temporal distributions of nonlinearities were assessed within sound units, and between different songs. The anatomical production sources of vocal nonlinearities and the communication context of their occurrences in recordings are discussed. Our results show that vocal nonlinearities may be a communication strategy that conveys information about the whale's body size and physical fitness, and thus may be an important component of humpback whale songs.
NASA Astrophysics Data System (ADS)
Cazau, Dorian; Adam, Olivier; Aubin, Thierry; Laitman, Jeffrey T.; Reidenberg, Joy S.
2016-10-01
Although mammalian vocalizations are predominantly harmonically structured, they can exhibit an acoustic complexity with nonlinear vocal sounds, including deterministic chaos and frequency jumps. Such sounds are normative events in mammalian vocalizations, and can be directly traceable to the nonlinear nature of vocal-fold dynamics underlying typical mammalian sound production. In this study, we give qualitative descriptions and quantitative analyses of nonlinearities in the song repertoire of humpback whales from the Ste Marie channel (Madagascar) to provide more insight into the potential communication functions and underlying production mechanisms of these features. A low-dimensional biomechanical modeling of the whale’s U-fold (vocal folds homolog) is used to relate specific vocal mechanisms to nonlinear vocal features. Recordings of living humpback whales were searched for occurrences of vocal nonlinearities (instabilities). Temporal distributions of nonlinearities were assessed within sound units, and between different songs. The anatomical production sources of vocal nonlinearities and the communication context of their occurrences in recordings are discussed. Our results show that vocal nonlinearities may be a communication strategy that conveys information about the whale’s body size and physical fitness, and thus may be an important component of humpback whale songs.
Ultrasonic Vocalizations Emitted by Flying Squirrels
Murrant, Meghan N.; Bowman, Jeff; Garroway, Colin J.; Prinzen, Brian; Mayberry, Heather; Faure, Paul A.
2013-01-01
Anecdotal reports of ultrasound use by flying squirrels have existed for decades, yet there has been little detailed analysis of their vocalizations. Here we demonstrate that two species of flying squirrel emit ultrasonic vocalizations. We recorded vocalizations from northern (Glaucomys sabrinus) and southern (G. volans) flying squirrels calling in both the laboratory and at a field site in central Ontario, Canada. We demonstrate that flying squirrels produce ultrasonic emissions through recorded bursts of broadband noise and time-frequency structured frequency modulated (FM) vocalizations, some of which were purely ultrasonic. Squirrels emitted three types of ultrasonic calls in laboratory recordings and one type in the field. The variety of signals that were recorded suggests that flying squirrels may use ultrasonic vocalizations to transfer information. Thus, vocalizations may be an important, although still poorly understood, aspect of flying squirrel social biology. PMID:24009728
Harmonic template neurons in primate auditory cortex underlying complex sound processing
Feng, Lei
2017-01-01
Harmonicity is a fundamental element of music, speech, and animal vocalizations. How the auditory system extracts harmonic structures embedded in complex sounds and uses them to form a coherent unitary entity is not fully understood. Despite the prevalence of sounds rich in harmonic structures in our everyday hearing environment, it has remained largely unknown what neural mechanisms are used by the primate auditory cortex to extract these biologically important acoustic structures. In this study, we discovered a unique class of harmonic template neurons in the core region of auditory cortex of a highly vocal New World primate, the common marmoset (Callithrix jacchus), across the entire hearing frequency range. Marmosets have a rich vocal repertoire and a similar hearing range to that of humans. Responses of these neurons show nonlinear facilitation to harmonic complex sounds over inharmonic sounds, selectivity for particular harmonic structures beyond two-tone combinations, and sensitivity to harmonic number and spectral regularity. Our findings suggest that the harmonic template neurons in auditory cortex may play an important role in processing sounds with harmonic structures, such as animal vocalizations, human speech, and music. PMID:28096341
Bjørgesaeter, Anders; Ugland, Karl Inne; Bjørge, Arne
2004-10-01
The male harbor seal (Phoca vitulina) produces broadband nonharmonic vocalizations underwater during the breeding season. In total, 120 vocalizations from six colonies were analyzed to provide a description of the acoustic structure and for the presence of geographic variation. The complex harbor seal vocalizations may be described by how the frequency bandwidth varies over time. An algorithm that identifies the boundaries between noise and signal from digital spectrograms was developed in order to extract a frequency bandwidth contour. The contours were used as inputs for multivariate analysis. The vocalizations' sound types (e.g., pulsed sound, whistle, and broadband nonharmonic sound) were determined by comparing the vocalizations' spectrographic representations with sound waves produced by known sound sources. Comparison between colonies revealed differences in the frequency contours, as well as some geographical variation in use of sound types. The vocal differences may reflect a limited exchange of individuals between the six colonies due to long distances and strong site fidelity. Geographically different vocal repertoires have potential for identifying discrete breeding colonies of harbor seals, but more information is needed on the nature and extent of early movements of young, the degree of learning, and the stability of the vocal repertoire. A characteristic feature of many vocalizations in this study was the presence of tonal-like introductory phrases that fit into the categories pulsed sound and whistles. The functions of these phrases are unknown but may be important in distance perception and localization of the sound source. The potential behavioral consequences of the observed variability may be indicative of adaptations to different environmental properties influencing determination of distance and direction and plausible different male mating tactics.
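The contour-extraction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a spectrogram already expressed in dB, uses the global median as a crude noise-floor estimate, and takes a hypothetical `threshold_db` margin to separate signal from noise. The function name and parameters are illustrative only.

```python
import numpy as np

def bandwidth_contour(spec_db, freqs, threshold_db=10.0):
    """Return per-frame (low, high) frequency bounds of the signal.

    spec_db      : 2-D array (n_freqs, n_frames), spectrogram in dB.
    freqs        : 1-D array of bin centre frequencies in Hz.
    threshold_db : hypothetical margin above the noise floor (assumption).
    """
    noise_floor = np.median(spec_db)             # crude global noise estimate
    mask = spec_db > noise_floor + threshold_db  # True where signal exceeds noise
    n_frames = spec_db.shape[1]
    low = np.full(n_frames, np.nan)              # NaN marks frames with no signal
    high = np.full(n_frames, np.nan)
    for t in range(n_frames):
        bins = np.flatnonzero(mask[:, t])
        if bins.size:
            low[t] = freqs[bins[0]]              # lowest bin above threshold
            high[t] = freqs[bins[-1]]            # highest bin above threshold
    return low, high
```

The difference `high - low` gives a bandwidth-over-time contour of the kind that could then be passed to a multivariate analysis.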
Anatomical study of minor alterations in neonate vocal folds.
Silva, Adriano Rezende; Machado, Almiro José; Crespo, Agrício Nubiato
2014-01-01
Minor structural alterations of the vocal fold cover are frequent causes of voice abnormalities. They may be difficult to diagnose, and are expressed in different manners. Cases of intracordal cysts, sulcus vocalis, mucosal bridge, and laryngeal micro-diaphragm form the group of minor structural alterations of the vocal fold cover investigated in the present study. The etiopathogenesis and epidemiology of these alterations are poorly known. To evaluate the existence and anatomical characterization of minor structural alterations in the vocal folds of newborns. 56 larynxes excised from neonates of both genders were studied. They were examined fresh, or defrosted after conservation via freezing, under a microscope at magnifications of 25× and 40×. The vocal folds were inspected and palpated by two examiners, with the aim of finding minor structural alterations similar to those described classically, and other undetermined minor structural alterations. Larynges presenting abnormalities were submitted to histological examination. Six cases of abnormalities were found in different larynges: one (1.79%) compatible with a sulcus vocalis and five (8.93%) compatible with a laryngeal micro-diaphragm. No cases of cysts or mucosal bridges were found. The observed abnormalities had characteristics similar to those described in other age groups. Abnormalities similar to sulcus vocalis or micro-diaphragm may be present at birth. Copyright © 2014 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Petekkaya, Emine; Yücel, Ahmet Hilmi; Sürmelioğlu, Özgür
2017-12-28
Opera and chant singers learn to effectively use aerodynamic components by breathing exercises during their education. Aerodynamic components, including subglottic air pressure and airflow, deteriorate in voice disorders. This study aimed to evaluate the changes in aerodynamic parameters and supraglottic structures of men and women with different vocal registers who are in an opera and chant education program. Vocal acoustic characteristics, aerodynamic components, and supraglottic structures were evaluated in 40 opera and chant art branch students. The majority of female students were sopranos, and the male students were baritone or tenor vocalists. The acoustic analyses revealed that the mean fundamental frequency was 152.33 Hz in the males and 218.77 Hz in the females. The estimated mean subglottal pressures were similar in females (14.99 cmH2O) and in males (14.48 cmH2O). Estimated mean airflow rates were also similar in both groups. The supraglottic structure compression analyses revealed partial anterior-posterior compressions in 2 tenors and 2 sopranos, and false vocal fold compression in 2 sopranos. Opera music is sung in high-pitched sounds. Attempts to sing high-pitched notes and frequently using register transitions overstrain the vocal structures. This intense muscular effort eventually traumatizes the vocal structures and causes supraglottic activity. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Dynamic spectral structure specifies vowels for children and adults
Nittrouer, Susan
2008-01-01
When it comes to making decisions regarding vowel quality, adults seem to weight dynamic syllable structure more strongly than static structure, although disagreement exists over the nature of the most relevant kind of dynamic structure: spectral change intrinsic to the vowel or structure arising from movements between consonant and vowel constrictions. Results have been even less clear regarding the signal components children use in making vowel judgments. In this experiment, listeners of four different ages (adults, and 3-, 5-, and 7-year-old children) were asked to label stimuli that sounded either like steady-state vowels or like CVC syllables which sometimes had middle sections masked by coughs. Four vowel contrasts were used, crossed for type (front/back or closed/open) and consonant context (strongly or only slightly constraining of vowel tongue position). All listeners recognized vowel quality with high levels of accuracy in all conditions, but children were disproportionately hampered by strong coarticulatory effects when only steady-state formants were available. Results clarified past studies, showing that dynamic structure is critical to vowel perception for all aged listeners, but particularly for young children, and that it is the dynamic structure arising from vocal-tract movement between consonant and vowel constrictions that is most important. PMID:17902868
Three-Dimensional Flow Separation Induced by a Model Vocal Fold Polyp
NASA Astrophysics Data System (ADS)
Stewart, Kelley C.; Erath, Byron D.; Plesniak, Michael W.
2012-11-01
The fluid-structure energy exchange process for normal speech has been studied extensively, but it is not well understood for pathological conditions. Polyps and nodules, which are geometric abnormalities that form on the medial surface of the vocal folds, can disrupt vocal fold dynamics and thus can have devastating consequences on a patient's ability to communicate. A recent in-vitro investigation of a model polyp in a driven vocal fold apparatus demonstrated that such a geometric abnormality considerably disrupts the glottal jet behavior and that this flow field adjustment was a likely reason for the severe degradation of the vocal quality in patients. Understanding of the formation and propagation of vortical structures from a geometric protuberance, and their subsequent impact on the aerodynamic loadings that drive vocal fold dynamics, is a critical component in advancing the treatment of this pathological condition. The present investigation concerns the three-dimensional flow separation induced by a wall-mounted prolate hemispheroid with a 2:1 aspect ratio in cross flow, i.e. a model vocal fold polyp. Unsteady three-dimensional flow separation and its impact on the wall pressure loading are examined using skin friction line visualization and wall pressure measurements. Supported by the National Science Foundation, Grant No. CBET-1236351 and GW Center for Biomimetics and Bioinspired Engineering (COBRE).
Synthetic, multi-layer, self-oscillating vocal fold model fabrication.
Murray, Preston R; Thomson, Scott L
2011-12-02
Sound for the human voice is produced via flow-induced vocal fold vibration. The vocal folds consist of several layers of tissue, each with differing material properties. Normal voice production relies on healthy tissue and vocal folds, and occurs as a result of complex coupling between aerodynamic, structural dynamic, and acoustic physical phenomena. Voice disorders affect up to 7.5 million annually in the United States alone and often result in significant financial, social, and other quality-of-life difficulties. Understanding the physics of voice production has the potential to significantly benefit voice care, including clinical prevention, diagnosis, and treatment of voice disorders. Existing methods for studying voice production include in vivo experimentation using human and animal subjects, in vitro experimentation using excised larynges and synthetic models, and computational modeling. Owing to hazardous and difficult instrument access, in vivo experiments are severely limited in scope. Excised larynx experiments have the benefit of anatomical and some physiological realism, but parametric studies involving geometric and material property variables are limited. Further, they are typically only able to be vibrated for relatively short periods of time (typically on the order of minutes). Overcoming some of the limitations of excised larynx experiments, synthetic vocal fold models are emerging as a complementary tool for studying voice production. Synthetic models can be fabricated with systematic changes to geometry and material properties, allowing for the study of healthy and unhealthy human phonatory aerodynamics, structural dynamics, and acoustics. For example, they have been used to study left-right vocal fold asymmetry, clinical instrument development, laryngeal aerodynamics, vocal fold contact pressure, and subglottal acoustics (a more comprehensive list can be found in Kniesburges et al.).
Existing synthetic vocal fold models, however, have either been homogenous (one-layer models) or have been fabricated using two materials of differing stiffness (two-layer models). This approach does not allow for representation of the actual multi-layer structure of the human vocal folds that plays a central role in governing vocal fold flow-induced vibratory response. Consequently, one- and two-layer synthetic vocal fold models have exhibited disadvantages such as higher onset pressures than what are typical for human phonation (onset pressure is the minimum lung pressure required to initiate vibration), unnaturally large inferior-superior motion, and lack of a "mucosal wave" (a vertically-traveling wave that is characteristic of healthy human vocal fold vibration). In this paper, fabrication of a model with multiple layers of differing material properties is described. The model layers simulate the multi-layer structure of the human vocal folds, including epithelium, superficial lamina propria (SLP), intermediate and deep lamina propria (i.e., ligament; a fiber is included for anterior-posterior stiffness), and muscle (i.e., body) layers. Results are included that show that the model exhibits improved vibratory characteristics over prior one- and two-layer synthetic models, including onset pressure closer to human onset pressure, reduced inferior-superior motion, and evidence of a mucosal wave.
A Novel Marker Based Method to Teeth Alignment in MRI
NASA Astrophysics Data System (ADS)
Luukinen, Jean-Marc; Aalto, Daniel; Malinen, Jarmo; Niikuni, Naoko; Saunavaara, Jani; Jääsaari, Päivi; Ojalammi, Antti; Parkkola, Riitta; Soukka, Tero; Happonen, Risto-Pekka
2018-04-01
Magnetic resonance imaging (MRI) can precisely capture the anatomy of the vocal tract. However, the crowns of teeth are not visible in standard MRI scans. In this study, a marker-based teeth alignment method is presented and evaluated. Ten patients undergoing orthognathic surgery were enrolled. Supraglottal airways were imaged preoperatively using structural MRI. MRI visible markers were developed, and they were attached to maxillary teeth and corresponding locations on the dental casts. Repeated measurements of intermarker distances in MRI and in a replica model were compared using linear regression analysis. Dental cast MRI and corresponding caliper measurements did not differ significantly. In contrast, the marker locations in vivo differed somewhat from the dental cast measurements, likely due to marker placement inaccuracies. The markers were clearly visible in MRI and allowed for dental models to be aligned to head and neck MRI scans.
Zhang, Lucy T.; Yang, Jubiao
2017-01-01
In this work we explore the aerodynamic flow characteristics of a coupled fluid-structure interaction system using a generalized Bernoulli equation derived directly from the Cauchy momentum equations. Unlike the conventional Bernoulli equation where incompressible, inviscid, and steady flow conditions are assumed, this generalized Bernoulli equation includes the contributions from compressibility, viscosity, and unsteadiness, which could be essential in defining aerodynamic characteristics. The derived Bernoulli’s principle is applied to a fully-coupled fluid-structure interaction simulation of vocal fold vibration. The coupled system is simulated using the immersed finite element method where compressible Navier-Stokes equations are used to describe the air and an elastic pliable structure to describe the vocal fold. The vibration of the vocal fold works to open and close the glottal flow. The aerodynamic flow characteristics are evaluated using the derived Bernoulli’s principles for a vibration cycle in a carefully partitioned control volume based on the moving structure. The results agree very well with experimental observations, which validates the strategy and its use in other types of flow characteristics that involve coupled fluid-structure interactions. PMID:29527541
Zhang, Lucy T; Yang, Jubiao
2016-12-01
In this work we explore the aerodynamic flow characteristics of a coupled fluid-structure interaction system using a generalized Bernoulli equation derived directly from the Cauchy momentum equations. Unlike the conventional Bernoulli equation where incompressible, inviscid, and steady flow conditions are assumed, this generalized Bernoulli equation includes the contributions from compressibility, viscosity, and unsteadiness, which could be essential in defining aerodynamic characteristics. The derived Bernoulli's principle is applied to a fully-coupled fluid-structure interaction simulation of vocal fold vibration. The coupled system is simulated using the immersed finite element method where compressible Navier-Stokes equations are used to describe the air and an elastic pliable structure to describe the vocal fold. The vibration of the vocal fold works to open and close the glottal flow. The aerodynamic flow characteristics are evaluated using the derived Bernoulli's principles for a vibration cycle in a carefully partitioned control volume based on the moving structure. The results agree very well with experimental observations, which validates the strategy and its use in other types of flow characteristics that involve coupled fluid-structure interactions.
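For orientation, the two textbook relations this abstract refers to can be written out. The generalized equation derived in the paper is not reproduced here; the notation below (density \rho, velocity \mathbf{u}, stress tensor \boldsymbol{\sigma}, body force \mathbf{f}, pressure p, height z) is standard but is an assumption, not the authors' own.

```latex
% Cauchy momentum equation (general continuum form)
\rho \frac{D\mathbf{u}}{Dt} = \nabla \cdot \boldsymbol{\sigma} + \rho\,\mathbf{f}

% Conventional Bernoulli equation along a streamline,
% valid only for steady, incompressible, inviscid flow
p + \tfrac{1}{2}\,\rho\,\lvert\mathbf{u}\rvert^{2} + \rho g z = \text{const}
```

The three restrictions on the second relation are exactly those the abstract says the generalized form relaxes.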
Henry, Laurence; Craig, Adrian J. F. K.; Lemasson, Alban; Hausberger, Martine
2015-01-01
Turn-taking in conversation appears to be a common feature in various human cultures and this universality raises questions about its biological basis and evolutionary trajectory. Functional convergence is a widespread phenomenon in evolution, revealing sometimes striking functional similarities between very distant species even though the mechanisms involved may be different. Studies on mammals (including non-human primates) and bird species with different levels of social coordination reveal that temporal and structural regularities in vocal interactions may depend on the species' social structure. Here we test the hypothesis that turn-taking and associated rules of conversations may be an adaptive response to the requirements of social life, by testing the applicability of turn-taking rules to an animal model, the European starling. Birdsong has for many decades been considered as one of the best models of human language and starling songs have been well described in terms of vocal production and perception. Starlings do have vocal interactions where alternating patterns predominate. Observational and experimental data on vocal interactions reveal that (1) there are indeed clear temporal and structural regularities, (2) the temporal and structural patterning is influenced by the immediate social context, the general social situation, the individual history, and the internal state of the emitter. Comparison of phylogenetically close species of Sturnids reveals that the alternating pattern of vocal interactions varies greatly according to the species' social structure, suggesting that interactional regularities may have evolved together with social systems. These findings provide a solid basis for discussion on the evolution of communication rules in relation to social evolution. They will also be discussed in terms of processes, in light of recent neurobiological findings. PMID:26441787
Pulse Vector-Excitation Speech Encoder
NASA Technical Reports Server (NTRS)
Davidson, Grant; Gersho, Allen
1989-01-01
Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.
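Both MPLPC and CELP rest on linear prediction, in which each speech sample is approximated by a weighted sum of previous samples and the weights act as an all-pole model of the vocal tract. The sketch below shows only that generic step using the autocorrelation method. It is not the PVXC encoder described above; the function name and the direct solve of the normal equations are illustrative choices.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate linear-prediction weights by the autocorrelation method.

    Returns a such that x[n] is approximated by sum_k a[k] * x[n - 1 - k].
    """
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    # Symmetric Toeplitz matrix of the normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])
```

For example, a decaying sequence `0.9 ** np.arange(200)` analysed with `order=1` yields a single weight very close to 0.9, since each sample is 0.9 times the one before it.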
Evaluation of articulation simulation system using artificial maxillectomy models.
Elbashti, M E; Hattori, M; Sumita, Y I; Taniguchi, H
2015-09-01
Acoustic evaluation is valuable for guiding the treatment of maxillofacial defects and determining the effectiveness of rehabilitation with an obturator prosthesis. Model simulations are important in terms of pre-surgical planning and pre- and post-operative speech function. This study aimed to evaluate the acoustic characteristics of voice generated by an articulation simulation system using a vocal tract model with or without artificial maxillectomy defects. More specifically, we aimed to establish a speech simulation system for maxillectomy defect models that both surgeons and maxillofacial prosthodontists can use in guiding treatment planning. Artificially simulated maxillectomy defects were prepared according to Aramany's classification (Classes I-VI) in a three-dimensional vocal tract plaster model of a subject uttering the vowel /a/. Formant and nasalance acoustic data were analysed using Computerized Speech Lab and the Nasometer, respectively. Formants and nasalance of simulated /a/ sounds were successfully detected and analysed. Values of Formants 1 and 2 for the non-defect model were 675.43 and 976.64 Hz, respectively. Median values of Formants 1 and 2 for the defect models were 634.36 and 1026.84 Hz, respectively. Nasalance was 11% in the non-defect model, whereas median nasalance was 28% in the defect models. The results suggest that an articulation simulation system can be used to help surgeons and maxillofacial prosthodontists to plan post-surgical defects that will facilitate maxillofacial rehabilitation. © 2015 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Meltzner, Geoffrey S.; Kobler, James B.; Hillman, Robert E.
2003-08-01
Measurements of the neck frequency response function (NFRF), defined as the ratio of the spectrum of the estimated volume velocity that excites the vocal tract to the spectrum of the acceleration delivered to the neck wall, were made at three different positions on the necks of nine laryngectomized subjects (five males and four females) and four normal laryngeal speakers (two males and two females). A minishaker driven by broadband noise provided excitation to the necks of subjects as they configured their vocal tracts to mimic the production of the vowels /aye/, /æ/, and /I/. The sound pressure at the lips was measured with a microphone and an impedance head mounted on the shaker measured the acceleration. The neck wall passed low-frequency sound energy better than high-frequency sound energy, and thus the NFRF was accurately modeled as a low-pass filter. The NFRFs of the different subject groups (female laryngeal, male laryngeal speakers, laryngectomized males, and laryngectomized females) differed from each other in terms of corner frequency and gain, with both types of male subjects presenting NFRFs with larger overall gains. In addition, there was a notable amount of intersubject variability within groups. Because the NFRF is an estimate of how sound energy passes through the neck wall, these results should aid in the design of improved neck-type electrolarynx devices.
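The two parameters the abstract compares across groups, corner frequency and gain, can be made concrete with a small sketch. The abstract does not state the filter order, so a first-order low-pass response is assumed here purely for illustration; the function name and any numeric values passed to it are hypothetical.

```python
import math

def lowpass_gain_db(freq_hz, corner_hz, gain_db=0.0):
    """Magnitude in dB of a first-order low-pass filter (assumed order).

    |H(f)| = G / sqrt(1 + (f / fc)^2), expressed in decibels.
    """
    return gain_db - 10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** 2)
```

In this model the response sits at `gain_db` well below the corner frequency, is about 3 dB lower at the corner itself, and falls off steadily above it, which matches the qualitative description of the neck wall passing low frequencies better than high ones.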
Everyday bat vocalizations contain information about emitter, addressee, context, and behavior
Prat, Yosef; Taub, Mor; Yovel, Yossi
2016-01-01
Animal vocal communication is often diverse and structured. Yet, the information concealed in animal vocalizations remains elusive. Several studies have shown that animal calls convey information about their emitter and the context. Often, these studies focus on specific types of calls, as it is rarely possible to probe an entire vocal repertoire at once. In this study, we continuously monitored Egyptian fruit bats for months, recording audio and video around-the-clock. We analyzed almost 15,000 vocalizations, which accompanied the everyday interactions of the bats, and were all directed toward specific individuals, rather than broadcast. We found that bat vocalizations carry ample information about the identity of the emitter, the context of the call, the behavioral response to the call, and even the call’s addressee. Our results underline the importance of studying the mundane, pairwise, directed, vocal interactions of animals. PMID:28005079
Knockout of Foxp2 disrupts vocal development in mice.
Castellucci, Gregg A; McGinley, Matthew J; McCormick, David A
2016-03-16
The FOXP2 gene is important for the development of proper speech motor control in humans. However, the role of the gene in general vocal behavior in other mammals, including mice, is unclear. Here, we track the vocal development of Foxp2 heterozygous knockout (Foxp2+/-) mice and their wildtype (WT) littermates from juvenile to adult ages, and observe severe abnormalities in the courtship song of Foxp2+/- mice. In comparison to their WT littermates, Foxp2+/- mice vocalized less, produced shorter syllable sequences, and possessed an abnormal syllable inventory. In addition, Foxp2+/- song also exhibited irregular rhythmic structure, and its development did not follow the consistent trajectories observed in WT vocalizations. These results demonstrate that the Foxp2 gene is critical for normal vocal behavior in juvenile and adult mice, and that Foxp2 mutant mice may provide a tractable model system for the study of the gene's role in general vocal motor control.
The voice conveys specific emotions: evidence from vocal burst displays.
Simon-Thomas, Emiliana R; Keltner, Dacher J; Sauter, Disa; Sinicropi-Yao, Lara; Abramson, Anna
2009-12-01
Studies of emotion signaling inform claims about the taxonomic structure, evolutionary origins, and physiological correlates of emotions. Emotion vocalization research has tended to focus on a limited set of emotions: anger, disgust, fear, sadness, surprise, happiness, and for the voice, also tenderness. Here, we examine how well brief vocal bursts can communicate 22 different emotions: 9 negative (Study 1) and 13 positive (Study 2), and whether prototypical vocal bursts convey emotions more reliably than heterogeneous vocal bursts (Study 3). Results show that vocal bursts communicate emotions like anger, fear, and sadness, as well as seldom-studied states like awe, compassion, interest, and embarrassment. Ancillary analyses reveal family-wise patterns of vocal burst expression. Errors in classification were more common within emotion families (e.g., 'self-conscious,' 'pro-social') than between emotion families. The three studies reported highlight the voice as a rich modality for emotion display that can inform fundamental constructs about emotion.
Samlan, Robin A.; Story, Brad H.; Bunton, Kate
2014-01-01
Purpose: To determine 1) how specific vocal fold structural and vibratory features relate to breathy voice quality and 2) the relation of perceived breathiness to four acoustic correlates of breathiness. Method: A computational, kinematic model of the vocal fold medial surfaces was used to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: vocal process separation, surface bulging, vibratory nodal point, and epilaryngeal constriction. Twelve naïve listeners rated breathiness of 364 samples relative to a reference. The degree of breathiness was then compared to 1) the underlying kinematic profile and 2) four acoustic measures: cepstral peak prominence (CPP), harmonics-to-noise ratio, and two measures of spectral slope. Results: Vocal process separation alone accounted for 61.4% of the variance in perceptual rating. Adding nodal point ratio and bulging to the equation increased the explained variance to 88.7%. The acoustic measure CPP accounted for 86.7% of the variance in perceived breathiness, and explained variance increased to 92.6% with the addition of one spectral slope measure. Conclusions: Breathiness ratings were best explained kinematically by the degree of vocal process separation and acoustically by CPP. PMID:23785184
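The "variance accounted for" figures in this abstract are the kind of quantity usually reported as R² from a regression. The sketch below shows how such a value could be computed with ordinary least squares; it assumes a linear model with an intercept, which the abstract does not specify, and the function name and inputs are hypothetical.

```python
import numpy as np

def explained_variance(predictors, ratings):
    """R^2 of a least-squares fit of ratings on predictors (with intercept).

    predictors : 1-D array (one predictor) or 2-D array (n_samples, n_predictors).
    ratings    : 1-D array of the quantity being explained.
    """
    y = np.asarray(ratings, dtype=float)
    X = np.column_stack([np.ones(len(y)), predictors])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    return 1.0 - residual.var() / y.var()
```

Adding a predictor column can only keep R² the same or raise it, which is consistent with the reported rise from 61.4% to 88.7% when further model parameters were added to the equation.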
Luo, Haoxiang; Mittal, Rajat; Zheng, Xudong; Bielamowicz, Steven A.; Walsh, Raymond J.; Hahn, James K.
2008-01-01
A new numerical approach for modeling a class of flow–structure interaction problems typically encountered in biological systems is presented. In this approach, a previously developed, sharp-interface, immersed-boundary method for incompressible flows is used to model the fluid flow and a new, sharp-interface Cartesian grid, immersed boundary method is devised to solve the equations of linear viscoelasticity that govern the solid. The two solvers are coupled to model flow–structure interaction. This coupled solver has the advantage of simple grid generation and efficient computation on simple, single-block structured grids. The accuracy of the solid-mechanics solver is examined by applying it to a canonical problem. The solution methodology is then applied to the problem of laryngeal aerodynamics and vocal fold vibration during human phonation. This includes a three-dimensional eigen analysis for a multi-layered vocal fold prototype as well as two-dimensional, flow-induced vocal fold vibration in a modeled larynx. Several salient features of the aerodynamics as well as vocal-fold dynamics are presented. PMID:19936017
Fernández-Vargas, Marcela; Johnston, Robert E
2015-01-01
Vocal signaling is one of many behaviors that animals perform during social interactions. Vocalizations produced by both sexes before mating can communicate sex, identity and condition of the caller. Adult golden hamsters produce ultrasonic vocalizations (USV) after intersexual contact. To determine whether these vocalizations are sexually dimorphic, we analyzed the vocal repertoire for sex differences in: 1) calling rates, 2) composition (structural complexity, call types and nonlinear phenomena) and 3) acoustic structure. In addition, we examined it for individual variation in the calls. The vocal repertoire was mainly composed of 1-note simple calls and at least half of them presented some degree of deterministic chaos. The prevalence of this nonlinear phenomenon was confirmed by low values of harmonic-to-noise ratio for most calls. We found modest sexual differences between repertoires. Males were more likely than females to produce tonal and less chaotic calls, as well as call types with frequency jumps. Multivariate analysis of the acoustic features of 1-note simple calls revealed significant sex differences in the second axis represented mostly by entropy and bandwidth parameters. Male calls showed lower entropy and inter-quartile bandwidth than female calls. Because the variation of acoustic structure within individuals was higher than among individuals, USV could not be reliably assigned to the correct individual. Interestingly, however, this high variability, augmented by the prevalence of chaos and frequency jumps, could be the result of increased vocal effort. Hamsters motivated to produce high calling rates also produced longer calls of broader bandwidth. Thus, the sex differences found could be the result of different sex preferences but also of a sex difference in calling motivation or condition. 
We suggest that variable and complex USV may have been selected to increase responsiveness of a potential mate by communicating sexual arousal and preventing habituation to the caller.
Lanovaz, Marc J; Fletcher, Sarah E; Rapp, John T
2009-09-01
We used a three-component multiple-schedule with a brief reversal design to evaluate the effects of structurally unmatched and matched stimuli on immediate and subsequent vocal stereotypy that was displayed by three children with autism spectrum disorders. For 2 of the 3 participants, access to matched stimuli, unmatched stimuli, and music decreased immediate levels of vocal stereotypy; however, with the exception of matched stimuli for one participant, none of the stimuli produced a clear abolishing operation for subsequent vocal stereotypy. That is, vocal stereotypy typically increased to baseline levels shortly after alternative stimulation was removed. Detection of motivating operations for each participant's vocal stereotypy was aided by the analysis of component distributions. The results are discussed in terms of immediate and subsequent effects of preferred stimuli on automatically reinforced problem behavior.
NASA Astrophysics Data System (ADS)
Westervelt, Andrea; Erath, Byron
2013-11-01
Voiced speech is produced by fluid-structure interactions that drive vocal fold motion. Viscous flow features influence the pressure in the gap between the vocal folds (i.e. glottis), thereby altering vocal fold dynamics and the sound that is produced. During the closing phases of the phonatory cycle, vortices form as a result of flow separation as air passes through the divergent glottis. It is hypothesized that the reduced pressure within a vortex core will alter the pressure distribution along the vocal fold surface, thereby aiding in vocal fold closure. The objective of this study is to determine the impact of intraglottal vortices on the fluid-structure interactions of voiced speech by investigating how the dynamics of a flexible plate are influenced by a vortex ring passing tangentially over it. A flexible plate, which models the medial vocal fold surface, is placed in a water-filled tank and positioned parallel to the exit of a vortex generator. The physical parameters of plate stiffness and vortex circulation are scaled with physiological values. As vortices propagate over the plate, particle image velocimetry measurements are captured to analyze the energy exchange between the fluid and flexible plate. The investigations are performed over a range of vortex formation numbers, and lateral displacements of the plate from the centerline of the vortex trajectory. Observations show plate oscillations with displacements directly correlated with the vortex core location.
Discriminating Simulated Vocal Tremor Source Using Amplitude Modulation Spectra
Carbonell, Kathy M.; Lester, Rosemary A.; Story, Brad H.; Lotto, Andrew J.
2014-01-01
Objectives/Hypothesis: Sources of vocal tremor are difficult to categorize perceptually and acoustically. This paper describes a preliminary attempt to discriminate vocal tremor sources through the use of spectral measures of the amplitude envelope. The hypothesis is that different vocal tremor sources are associated with distinct patterns of acoustic amplitude modulations. Study Design: Statistical categorization methods (discriminant function analysis) were used to discriminate signals from simulated vocal tremor with different sources using only acoustic measures derived from the amplitude envelopes. Methods: Simulations of vocal tremor were created by modulating parameters of a vocal fold model corresponding to oscillations of respiratory driving pressure (respiratory tremor), degree of vocal fold adduction (adductory tremor) and fundamental frequency of vocal fold vibration (F0 tremor). The acoustic measures were based on spectral analyses of the amplitude envelope computed across the entire signal and within select frequency bands. Results: The signals could be categorized (with accuracy well above chance) in terms of the simulated tremor source using only measures of the amplitude envelope spectrum even when multiple sources of tremor were included. Conclusions: These results supply initial support for an amplitude-envelope based approach to identify the source of vocal tremor and provide further evidence for the rich information about talker characteristics present in the temporal structure of the amplitude envelope. PMID:25532813
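A minimal sketch of an amplitude-envelope modulation spectrum, added here for illustration only. The abstract does not say how the envelope or its spectrum was computed, so everything below is an assumption: the envelope is taken by full-wave rectification plus a short moving average, the spectrum is a direct DFT evaluated at a few low modulation frequencies, and the test signal (a 200 Hz carrier with a 5 Hz amplitude tremor) is synthetic. The function names are hypothetical.

```python
import cmath
import math


def amplitude_envelope(signal, window):
    """Crude envelope: full-wave rectification followed by a moving average."""
    rectified = [abs(x) for x in signal]
    return [sum(rectified[i:i + window]) / window
            for i in range(len(rectified) - window + 1)]


def modulation_spectrum(envelope, fs, freqs_hz):
    """Magnitude of the mean-removed envelope at each modulation frequency (Hz)."""
    mean_level = sum(envelope) / len(envelope)
    centered = [e - mean_level for e in envelope]
    spectrum = {}
    for f in freqs_hz:
        acc = sum(x * cmath.exp(-2j * math.pi * f * n / fs)
                  for n, x in enumerate(centered))
        spectrum[f] = abs(acc) / len(centered)
    return spectrum


# Synthetic example (assumed values): 2 s at 1 kHz, 200 Hz carrier, 5 Hz tremor.
fs = 1000
t = [n / fs for n in range(2 * fs)]
signal = [(1 + 0.5 * math.sin(2 * math.pi * 5 * x)) * math.sin(2 * math.pi * 200 * x)
          for x in t]

envelope = amplitude_envelope(signal, window=5)  # 5 samples = one carrier period
spectrum = modulation_spectrum(envelope, fs, range(1, 21))
peak_hz = max(spectrum, key=spectrum.get)
print("dominant modulation frequency (Hz):", peak_hz)
```

In a setup like the one described, values of such a spectrum (overall and within frequency bands) would serve as the features passed to a classifier; the classification step itself is not sketched here.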
Fischer, J; Hammerschmidt, K
2011-01-01
Comparative analyses used to reconstruct the evolution of traits associated with the human language faculty, including its socio-cognitive underpinnings, highlight the importance of evolutionary constraints limiting vocal learning in non-human primates. After a brief overview of this field of research and the neural basis of primate vocalizations, we review studies that have addressed the genetic basis of usage and structure of ultrasonic communication in mice, with a focus on the gene FOXP2 involved in specific language impairments and neuroligin genes (NL-3 and NL-4) involved in autism spectrum disorders. Knockout of FoxP2 leads to reduced vocal behavior and eventually premature death. Introducing the human variant of FoxP2 protein into mice, in contrast, results in shifts in frequency and modulation of pup ultrasonic vocalizations. Knockout of NL-3 and NL-4 in mice diminishes social behavior and vocalizations. Although such studies may provide insights into the molecular and neural basis of social and communicative behavior, the structure of mouse vocalizations is largely innate, limiting the suitability of the mouse model to study human speech, a learned mode of production. Although knockout or replacement of single genes has perceptible effects on behavior, these genes are part of larger networks whose functions remain poorly understood. In humans, for instance, deficiencies in NL-4 can lead to a broad spectrum of disorders, suggesting that further factors (experiential and/or genetic) contribute to the variation in clinical symptoms. The precise nature as well as the interaction of these factors is yet to be determined. PMID:20579107
Functional assessment of the ex vivo vocal folds through biomechanical testing: A review
Dion, Gregory R.; Jeswani, Seema; Roof, Scott; Fritz, Mark; Coelho, Paulo; Sobieraj, Michael; Amin, Milan R.; Branski, Ryan C.
2016-01-01
The human vocal folds are complex structures made up of distinct layers that vary in cellular and extracellular composition. The mechanical properties of vocal fold tissue are fundamental to the study of both the acoustics and biomechanics of voice production. To date, quantitative methods have been applied to characterize the vocal fold tissue in both normal and pathologic conditions. This review describes, summarizes, and discusses the most commonly employed methods for vocal fold biomechanical testing. Force-elongation, torsional parallel plate rheometry, simple-shear parallel plate rheometry, linear skin rheometry, and indentation are the most frequently employed biomechanical tests for vocal fold tissues, and each provides material properties data that can be used to compare native tissue versus diseased or treated tissue. Force-elongation testing is clinically useful, as it allows for functional unit testing, while rheometry provides physiologically relevant shear data, and nanoindentation permits micrometer scale testing across different areas of the vocal fold as well as whole organ testing. Thoughtful selection of the testing technique during experimental design to evaluate a hypothesis is important to optimizing biomechanical testing of vocal fold tissues. PMID:27127075
Mechanomimetic hydrogels for vocal fold lamina propria regeneration.
Kutty, Jaishankar K; Webb, Ken
2009-01-01
Vocal fold injury commonly leads to reduced vocal quality due to scarring-induced alterations in matrix composition and tissue biomechanics. The long-term hypothesis motivating our work is that rapid restoration of phonation and the associated dynamic mechanical environment will reduce scarring and promote regenerative healing. Toward this end, the objective of this study was to develop mechanomimetic, degradable hydrogels approximating the viscoelastic properties of the vocal ligament and mucosa that may be photopolymerized in situ to restore structural integrity to vocal fold tissues. The tensile and rheological properties of hydrogels (targeting the vocal ligament and mucosa, respectively) were varied as a function of macromer concentration. PEG diacrylate-based hydrogels exhibited linear stress-strain response and elastic modulus consistent with the properties of the vocal ligament at low strains (0-15%), but did not replicate the non-linear behavior observed in native tissue at higher strains. Methacrylated hyaluronic acid hydrogels displayed dynamic viscosity consistent with native vocal mucosa, while elastic shear moduli values were several-fold higher. Cell culture studies indicated that both hydrogels supported spreading, proliferation and collagen/proteoglycan matrix deposition by encapsulated fibroblasts throughout the 3D network.
Elias-Costa, Agustin J; Montesinos, Rachel; Grant, Taran; Faivovich, Julián
2017-11-01
Anuran vocal sacs are elastic chambers that recycle exhaled air during vocalizations and are present in males of most species of frogs. Most knowledge of the diversity of vocal sacs relates to external morphology; detailed information on internal anatomy is available for few groups of frogs. Frogs of the family Hylodidae, which is endemic to the Atlantic Forest of Brazil and adjacent Argentina and Paraguay, have three patterns of vocal sac morphology-that is, single, subgular; paired, lateral; and absent. The submandibular musculature and structure of the vocal sac mucosa (the internal wall of the vocal sac) of exemplar species of this family and relatives were studied. In contrast to previous accounts, we found that all species of Crossodactylus and Hylodes possess paired, lateral vocal sacs, with the internal mucosa of each sac being separate from the contralateral one. Unlike all other frogs for which data are available, the mucosa of the vocal sacs in these genera is not supported externally by the mm. intermandibularis and interhyoideus. Rather, the vocal sac mucosa projects through the musculature and is free in the submandibular lymphatic sac. The presence of paired, lateral vocal sacs, the internal separation of the sac mucosae, and their projection through the m. interhyoideus are synapomorphies of the family. Furthermore, the specific configuration of the m. interhyoideus allows asymmetric inflation of paired vocal sacs, a feature only reported in species of these diurnal, stream-dwelling frogs. © 2017 Wiley Periodicals, Inc.
Musical Structure Modulates Semantic Priming in Vocal Music
ERIC Educational Resources Information Center
Poulin-Charronnat, Benedicte; Bigand, Emmanuel; Madurell, Francois; Peereman, Ronald
2005-01-01
It has been shown that harmonic structure may influence the processing of phonemes whatever the extent of participants' musical expertise [Bigand, E., Tillmann, B., Poulin, B., D'Adamo, D. A., & Madurell, F. (2001). The effect of harmonic context on phoneme monitoring in vocal music. "Cognition," 81, B11-B20]. The present study goes a step further…
Knockout of Foxp2 disrupts vocal development in mice
Castellucci, Gregg A.; McGinley, Matthew J.; McCormick, David A.
2016-01-01
The FOXP2 gene is important for the development of proper speech motor control in humans. However, the role of the gene in general vocal behavior in other mammals, including mice, is unclear. Here, we track the vocal development of Foxp2 heterozygous knockout (Foxp2+/−) mice and their wildtype (WT) littermates from juvenile to adult ages, and observe severe abnormalities in the courtship song of Foxp2+/− mice. In comparison to their WT littermates, Foxp2+/− mice vocalized less, produced shorter syllable sequences, and possessed an abnormal syllable inventory. In addition, Foxp2+/− song also exhibited irregular rhythmic structure, and its development did not follow the consistent trajectories observed in WT vocalizations. These results demonstrate that the Foxp2 gene is critical for normal vocal behavior in juvenile and adult mice, and that Foxp2 mutant mice may provide a tractable model system for the study of the gene’s role in general vocal motor control. PMID:26980647
Tissue engineering therapies for the vocal fold lamina propria.
Kutty, Jaishankar K; Webb, Ken
2009-09-01
The vocal folds are laryngeal connective tissues with complex matrix composition/organization that provide the viscoelastic mechanical properties required for voice production. Vocal fold injury results in alterations in tissue structure and corresponding changes in tissue biomechanics that reduce vocal quality. Recent work has begun to elucidate the biochemical changes underlying injury-induced pathology and to apply tissue engineering principles to the prevention and reversal of vocal fold scarring. Based on the extensive history of injectable biomaterials in laryngeal surgery, a major focus of regenerative therapies has been the development of novel scaffolds with controlled in vivo residence time and viscoelastic properties approximating the native tissue. Additional strategies have included cell transplantation and delivery of the antifibrotic cytokine hepatocyte growth factor, as well as investigation of the effects of the unique vocal fold vibratory microenvironment using in vitro dynamic culture systems. Recent achievements of significant reductions in fibrosis and improved recovery of native tissue viscoelasticity and vibratory/functional performance in animal models are rapidly moving vocal fold tissue engineering toward clinical application.
Gaskill, Christopher S; Erickson, Molly L
2010-01-01
The use of hard-walled narrow tubes, often called resonance tubes, for the purpose of voice therapy and voice training has a historical precedent and some theoretical support, but the mechanism of any potential benefit from the application of this technique is not well understood. Fifteen vocally untrained male participants produced a series of spoken /a/ vowels at a modal pitch and constant loudness, before and after a minute of repeated phonation into a 50-cm hard-walled glass tube at the same pitch and loudness targets. Electroglottography was used to measure the glottal contact quotient (CQ) during each phase of the experiment. Single-subject analysis revealed statistically significant changes in CQ during tube phonation, but with no discernable pattern across the 15 participants. These results indicate that the use of resonance tubes can have a distinct effect on glottal closure, but the mechanism behind this change remains unclear. The implication is that vocal loading techniques such as this need to be studied further with specific attention paid to the underlying mechanism of any measured changes in glottal behavior, and especially to the role of instruction and feedback in the therapeutic and pedagogical application of these techniques. Copyright 2010 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Frey, Roland; Volodin, Ilya; Volodina, Elena; Carranza, Juan; Torres-Porras, Jerónimo
2012-01-01
Roaring in rutting Iberian red deer stags Cervus elaphus hispanicus is unusual compared to other subspecies of red deer, which radiated from the Iberian refugium after the last glacial maximum. In all red deer stags, the larynx occupies a permanent low mid-neck resting position and is momentarily retracted almost down to the rostral end of the sternum during the production of rutting calls. Simultaneous with the retraction of the larynx, male Iberian red deer pronouncedly protrude the tongue during most of their rutting roars. This poses a mechanical challenge for the vocal tract (vt) and for the hyoid apparatus, as tongue and larynx are strongly pulled in opposite directions. This study (i) examines the vocal anatomy and the acoustics of the rutting roars in free-ranging male C. e. hispanicus; (ii) establishes a potential mechanism of simultaneous tongue protrusion and larynx retraction by applying a two-dimensional model based on graphic reconstructions in single video frames of unrestrained animals; and (iii) advances a hypothesis of evaporative cooling by tongue protrusion in the males of a subspecies of red deer constrained to perform all of the exhausting rutting activities, including acoustic display, in a hot and arid season. PMID:22257361
Effects of vocal training in a musicophile with congenital amusia.
Wilbiks, Jonathan M P; Vuvan, Dominique T; Girard, Pier-Yves; Peretz, Isabelle; Russo, Frank A
2016-12-01
Congenital amusia is a condition in which an individual suffers from a deficit of musical pitch perception and production. Individuals suffering from congenital amusia generally tend to abstain from musical activities. Here, we present the unique case of Tim Falconer, a self-described musicophile who also suffers from congenital amusia. We describe and assess Tim's attempts to train himself out of amusia through a self-imposed 18-month program of formal vocal training and practice. We tested Tim with respect to music perception and vocal production across seven sessions including pre- and post-training assessments. We also obtained diffusion-weighted images of his brain to assess connectivity between auditory and motor planning areas via the arcuate fasciculus (AF). Tim's behavioral and brain data were compared to that of normal and amusic controls. While Tim showed temporary gains in his singing ability, he did not reach normal levels, and these gains faded when he was not engaged in regular lessons and practice. Tim did show some sustained gains with respect to the perception of musical rhythm and meter. We propose that Tim's lack of improvement in pitch perception and production tasks is due to long-standing and likely irreversible reduction in connectivity along the AF fiber tract.
Acoustic analysis of voice in children with cleft palate and velopharyngeal insufficiency.
Villafuerte-Gonzalez, Rocio; Valadez-Jimenez, Victor M; Hernandez-Lopez, Xochiquetzal; Ysunza, Pablo Antonio
2015-07-01
Acoustic analysis of voice can provide instrumental data concerning vocal abnormalities. These findings can be used for monitoring clinical course in cases of voice disorders. Cleft palate severely affects the structure of the vocal tract. Hence, voice quality can also be affected. The aim was to study whether the main acoustic parameters of voice, including fundamental frequency, shimmer and jitter, are significantly different in patients with a repaired cleft palate, as compared with normal children without speech, language and voice disorders. Fourteen patients with repaired unilateral cleft lip and palate and persistent or residual velopharyngeal insufficiency (VPI) were studied. A control group was assembled with healthy volunteer subjects matched by age and gender. Hypernasality and nasal emission were perceptually assessed in patients with VPI. Size of the gap as assessed by videonasopharyngoscopy was classified in patients with VPI. Acoustic parameters of voice, including fundamental frequency (F0), shimmer and jitter, were compared between patients with VPI and control subjects. F0 was significantly higher in male patients as compared with male controls. Shimmer was significantly higher in patients with VPI regardless of gender. Moreover, patients with moderate VPI showed a significantly higher shimmer perturbation, regardless of gender. Although future research regarding voice disorders in patients with VPI is needed, at the present time it seems reasonable to include strategies for voice therapy in the speech and language pathology intervention plan for patients with VPI. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
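For readers unfamiliar with the three parameters named above, the following is a hedged sketch of how they are commonly computed from per-cycle measurements. The abstract does not state which jitter or shimmer definitions were used; the "local" definitions below (mean absolute difference between consecutive cycles, relative to the mean) are an assumption, and the cycle values are invented for illustration, not data from the study.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical per-cycle measurements from a sustained vowel.
periods_ms = [4.0, 4.1, 3.9, 4.0]   # glottal cycle durations
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitude of each cycle

f0_hz = 1000.0 / mean(periods_ms)         # fundamental frequency from the mean period
jitter_pct = local_perturbation(periods_ms)   # cycle-to-cycle period variation
shimmer_pct = local_perturbation(amplitudes)  # cycle-to-cycle amplitude variation

print(round(f0_hz, 1), round(jitter_pct, 2), round(shimmer_pct, 2))
```

Jitter and shimmer share one formula here and differ only in the input: cycle durations for jitter, cycle amplitudes for shimmer.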
Vocal Fold Pathologies and Three-Dimensional Flow Separation Phenomena
NASA Astrophysics Data System (ADS)
Apostoli, Adam G.; Weiland, Kelley S.; Plesniak, Michael W.
2013-11-01
Polyps and nodules are two different pathologies, which are geometric abnormalities that form on the medial surface of the vocal folds, and have been shown to significantly disrupt a person's ability to communicate. Although the mechanism by which the vocal folds self-oscillate and the three-dimensional nature of the glottal jet have been studied, the effect of irregularities caused by pathologies is not fully understood. Examining the formation and evolution of vortical structures created by a geometric protuberance is important, not only for understanding the aerodynamic forces exerted by these structures on the vocal folds, but also in the treatment of the above-mentioned pathological conditions. Using a wall-mounted prolate hemispheroid with a 2:1 aspect ratio in cross flow, the present investigation considers three-dimensional flow separation induced by a model vocal fold polyp. Building on previous work using skin friction line visualization, both the velocity flow field and wall pressure measurements around the model polyp are presented and compared. Supported by the National Science Foundation, Grant No. CBET-1236351 and GW Center for Biomimetics and Bioinspired Engineering (COBRE).
Correlation of phonatory behavior with vocal fold structure, observed in a physical model
NASA Astrophysics Data System (ADS)
Krane, Michael; Walters, Gage; McPhail, Michael
2017-11-01
The effect of vocal fold shape and internal structure on phonation was studied experimentally using a physical model of the human airway. Model folds used an "M5" or a swept ellipse coronal cross-section shape. Models were molded in either two or three layers. Two-layer models included a stiffer "body" layer and a much softer "cover" layer, while the three-layer models also incorporated an additional, thin, "ligament/conus" layer stiffer than the body layer. The elliptical section models were all molded in three such layers. Measurements of transglottal pressure, volume flow, mouth sound pressure, and high-speed imaging of vocal fold vibration were performed. These show that models with the "ligament" layer experienced much attenuated vertical deformation, that glottal closure was more likely, and that phonation was much easier to initiate. These findings suggest that the combination of the vocal ligament and the conus elasticus stabilizes the vocal fold for efficient phonation by limiting vertical deformation, while allowing transverse deformations to occur. Acknowledge support from NIH DC R01005642-11.
The acoustic structure of male giant panda bleats varies according to intersexual context.
Charlton, Benjamin D; Keating, Jennifer L; Rengui, Li; Huang, Yan; Swaisgood, Ronald R
2015-09-01
Although the acoustic structure of mammal vocal signals often varies according to the social context of emission, relatively few mammal studies have examined acoustic variation during intersexual advertisement. In the current study male giant panda bleats were recorded during the breeding season in three behavioural contexts: vocalising alone, during vocal interactions with females outside of peak oestrus, and during vocal interactions with peak-oestrous females. Male bleats produced during vocal interactions with peak-oestrous females were longer in duration and had higher mean fundamental frequency than those produced when males were either involved in a vocal interaction with a female outside of peak oestrus or vocalising alone. In addition, males produced bleats with higher rates of fundamental frequency modulation when they were vocalising alone than when they were interacting with females. These results show that acoustic features of male giant panda bleats have the potential to signal the caller's motivational state, and suggest that males increase the rate of fundamental frequency modulation in bleats when they are alone to maximally broadcast their quality and promote close-range contact with receptive females during the breeding season.
Neck Circumference and Vocal Parameters in Women Before and After Bariatric Surgery.
de Souza, Lourdes Bernadete Rocha; Pernambuco, Leandro de Araújo; dos Santos, Marquiony Marques; Pereira, Rayane Medeiros
2016-03-01
Morbidly obese patients may suffer from vocal disorders, as vocal production is directly related to the volume of the vocal tract, and the large-scale accumulation of fat in this region may interfere with voice production. The aim of this study was to analyze the neck circumference, fundamental frequency, and maximum phonation time of a group of morbidly obese women before and after bariatric surgery. An observational, longitudinal, and descriptive study was performed with patients of the Obesity and Related Diseases Surgery Unit of a university hospital. A total of 21 morbidly obese women aged 28-68 years, with a mean age of 41.33 years, participated in the study. Neck circumference was measured using a tape measure. To obtain fundamental frequency values, the patient was asked to produce the vowel [a] at normal intensity and pitch for an average period of 3 s. After recording, the participants were asked to produce the sustained vowels [a], [i], and [u] at normal intensity and pitch, with a stopwatch used to measure maximum phonation time. Eight months after surgery, patients were reassessed using the same data collecting procedures as were carried out prior to surgery. After surgery, there was an increase in the average value of fundamental frequency and maximum phonation time for all the vowels and a reduction in neck circumference. The differences were statistically significant. Weight reduction and a consequent decrease in neck circumference affected the changes in maximum phonation time and fundamental frequency values in the voices of these patients, after weight loss.
Unsteady flow motions in the supraglottal region during phonation
NASA Astrophysics Data System (ADS)
Luo, Haoxiang; Dai, Hu
2008-11-01
The highly unsteady flow motions in the larynx are not only responsible for producing the fundamental frequency tone in phonation, but also contribute significantly to the broadband noise in the human voice. In this work, the laryngeal flow is modeled either as an incompressible pulsatile jet confined in a two-dimensional channel, or a pressure-driven flow modulated by a pair of viscoelastic vocal folds through the flow-structure interaction. The flow in the supraglottal region is found to be dominated by large-scale vortices whose unsteady motions significantly deflect the glottal jet. In the flow-structure interaction, a hybrid model based on the immersed-boundary method is developed to simulate the flow-induced vocal fold vibration, which involves a three-dimensional vocal fold prototype and a two-dimensional viscous flow. Both the flow behavior and the vibratory characteristics of the vocal folds will be presented.
Hierarchical Diagnosis of Vocal Fold Disorders
NASA Astrophysics Data System (ADS)
Nikkhah-Bahrami, Mansour; Ahmadi-Noubari, Hossein; Seyed Aghazadeh, Babak; Khadivi Heris, Hossein
This paper explores the use of a hierarchical structure for the diagnosis of vocal fold disorders. The hierarchical structure is initially used to train different second-level classifiers. At the first level, normal and pathological signals are distinguished. Next, pathological signals are classified into neurogenic and organic vocal fold disorders. At the final level, vocal fold nodules are distinguished from polyps within the organic disorders category. For feature selection at each level of the hierarchy, the reconstructed signal at each wavelet packet decomposition sub-band, using 5 levels of decomposition with the db10 mother wavelet, is used to extract the nonlinear features of self-similarity and approximate entropy. Also, wavelet packet coefficients are used to measure energy and Shannon entropy features at different spectral sub-bands. The Davies-Bouldin criterion has been employed to find the most discriminant features. Finally, support vector machines have been adopted as classifiers at each level of the hierarchy, resulting in a diagnostic accuracy of 92%.
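Approximate entropy, one of the nonlinear features named above, can be illustrated with a short sketch. This follows the standard textbook definition (template length m, tolerance r as a fraction of the standard deviation, self-matches counted); the parameter values m = 2 and r = 0.2 are common defaults assumed here, not values reported in the abstract, and the wavelet packet decomposition that would precede this step in the described pipeline is omitted. The two test series are synthetic.

```python
import math
import random
from statistics import pstdev


def approximate_entropy(series, m=2, r_factor=0.2):
    """Approximate entropy: lower for regular series, higher for irregular ones."""
    n = len(series)
    r = r_factor * pstdev(series)  # tolerance scaled to the series' spread

    def phi(dim):
        # All overlapping templates of length `dim`.
        vectors = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for v in vectors:
            # Count templates within Chebyshev distance r (self-match included).
            matches = sum(1 for w in vectors
                          if max(abs(a - b) for a, b in zip(v, w)) <= r)
            total += math.log(matches / len(vectors))
        return total / len(vectors)

    return phi(m) - phi(m + 1)


# Synthetic comparison: a periodic signal versus uniform noise.
rng = random.Random(0)
sine = [math.sin(2 * math.pi * k / 20) for k in range(200)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]

apen_sine = approximate_entropy(sine)
apen_noise = approximate_entropy(noise)
print("regular:", apen_sine, "irregular:", apen_noise)
```

The implementation is quadratic in the series length, which is acceptable for short frames but would need a faster neighbor search for long recordings.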
Experimental analysis of the characteristics of artificial vocal folds.
Misun, Vojtech; Svancara, Pavel; Vasek, Martin
2011-05-01
Specialized literature presents a number of models describing the function of the vocal folds. In most of those models, an emphasis is placed on the air flowing through the glottis and, further, on the effect of the parameters of the air alone (its mass, speed, and so forth). The article focuses on the constructional definition of artificial vocal folds and their experimental analysis. The analysis is conducted for voiced source voice phonation and for the changing mean value of the subglottal pressure. The article further deals with the analysis of the pressure of the airflow through the vocal folds, which is cut (separated) into individual pulses by the vibrating vocal folds. The analysis results show that air pulse characteristics are relevant to voice generation, as they are produced by the flowing air and vibrating vocal folds. A number of artificial vocal folds have been constructed to date, and the aforementioned view of their phonation is confirmed by their analysis. The experiments have confirmed that man is able to consciously affect only two parameters of the source voice, that is, its fundamental frequency and voice intensity. The main forces acting on the vocal folds during phonation are as follows: subglottal air pressure and elastic and inertia forces of the vocal folds' structure. The correctness of the function of the artificial vocal folds is documented by the experimental verification of the spectra of several types of artificial vocal folds. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Choi, Jeong-Seok; Kim, Nahn Ju; Klemuk, Sarah; Jang, Yun Ho; Park, In Suh; Ahn, Kyung Hyun; Lim, Jae-Yol; Kim, Young-Mo
2012-09-01
To compare the rheological characteristics of structurally different hyaluronic acid (HA)-based biomaterials that are presently used for phonosurgery and to investigate their influence on the viscoelastic properties of vocal folds after implantation in an in vivo rabbit model. In vitro and in vivo rheometric investigation. Experimental laboratory, Inha and Seoul National Universities. Viscoelastic shear properties of 3 HA-based biomaterials (Rofilan, Restylane, and Reviderm) were measured with a strain-controlled rheometer. These biomaterials were injected into the deep layers of rabbit vocal folds, and viscoelastic moduli of the injected vocal folds were determined 2 months after the injection. The vocal fold specimens were observed using a light microscope and a transmission electron microscope. All HA-based biomaterials showed similar levels of shear viscosity, which were slightly higher than that of human vocal folds reported in previous studies. Compared with noninjected control vocal folds, there were no significant differences in the magnitudes of both elastic shear modulus (G') and viscous modulus (G") of injected vocal folds among all of the materials. Light microscopic images showed that all materials were observed in the deep layers of vocal folds and electron scanning images revealed that injected HA particles were homogeneously distributed in regions of collagenous fibers. HA-based biomaterials could preserve the viscoelastic properties of the vocal folds, when they were injected into vocal folds in an in vivo rabbit model. However, further studies on the influence of the biomaterials on the viscoelasticity of human vocal folds in ECM surroundings are still needed.
Neural coding of syntactic structure in learned vocalizations in the songbird.
Fujimoto, Hisataka; Hasegawa, Taku; Watanabe, Dai
2011-07-06
Although vocal signals including human languages are composed of a finite number of acoustic elements, complex and diverse vocal patterns can be created from combinations of these elements, linked together by syntactic rules. To enable such syntactic vocal behaviors, neural systems must extract the sequence patterns from auditory information and establish syntactic rules to generate motor commands for vocal organs. However, the neural basis of syntactic processing of learned vocal signals remains largely unknown. Here we report that the basal ganglia projecting premotor neurons (HVC(X) neurons) in Bengalese finches represent syntactic rules that generate variable song sequences. When vocalizing an alternative transition segment between song elements called syllables, sparse burst spikes of HVC(X) neurons code the identity of a specific syllable type or a specific transition direction among the alternative trajectories. When vocalizing a variable repetition sequence of the same syllable, HVC(X) neurons not only signal the initiation and termination of the repetition sequence but also indicate the progress and state-of-completeness of the repetition. These different types of syntactic information are frequently integrated within the activity of single HVC(X) neurons, suggesting that syntactic attributes of the individual neurons are not programmed as a basic cellular subtype in advance but acquired in the course of vocal learning and maturation. Furthermore, some auditory-vocal mirroring type HVC(X) neurons display transition selectivity in the auditory phase, much as they do in the vocal phase, suggesting that these songbirds may extract syntactic rules from auditory experience and apply them to form their own vocal behaviors.
Precise Motor Control Enables Rapid Flexibility in Vocal Behavior of Marmoset Monkeys.
Pomberger, Thomas; Risueno-Segovia, Cristina; Löschner, Julia; Hage, Steffen R
2018-03-05
Investigating the evolution of human speech is difficult and controversial because human speech surpasses nonhuman primate vocal communication in scope and flexibility [1-3]. Monkey vocalizations have been assumed to be largely innate, highly affective, and stereotyped for over 50 years [4, 5]. Recently, this perception has dramatically changed. Current studies have revealed distinct learning mechanisms during vocal development [6-8] and vocal flexibility, allowing monkeys to cognitively control when [9, 10], where [11], and what to vocalize [10, 12, 13]. However, specific call features (e.g., duration, frequency) remain surprisingly robust and stable in adult monkeys, resulting in rather stereotyped and discrete call patterns [14]. Additionally, monkeys seem to be unable to modulate their acoustic call structure under reinforced conditions beyond natural constraints [15, 16]. Behavioral experiments have shown that monkeys can stop sequences of calls immediately after acoustic perturbation but cannot interrupt ongoing vocalizations, suggesting that calls consist of single impartible pulses [17, 18]. Using acoustic perturbation triggered by the vocal behavior itself and quantitative measures of resulting vocal adjustments, we show that marmoset monkeys are capable of producing calls with durations beyond the natural boundaries of their repertoire by interrupting ongoing vocalizations rapidly after perturbation onset. Our results indicate that marmosets are capable of interrupting vocalizations only at periodic time points throughout calls, further supported by the occurrence of periodically segmented phees. These ideas overturn decades-old concepts on primate vocal pattern generation, indicating that vocalizations do not consist of one discrete call pattern but are built of many sequentially uttered units, like human speech. Copyright © 2018 The Author(s). Published by Elsevier Ltd. All rights reserved.
Tissue Engineering-based Therapeutic Strategies for Vocal Fold Repair and Regeneration
Li, Linqing; Stiadle, Jeanna M.; Lau, Hang K.; Zerdoum, Aidan B.; Jia, Xinqiao; Thibeault, Susan L.; Kiick, Kristi L.
2016-01-01
Vocal folds are soft laryngeal connective tissues with distinct layered structures and complex multicomponent matrix compositions that endow phonatory and respiratory functions. This delicate tissue is easily damaged by various environmental factors and pathological conditions, altering vocal biomechanics and causing debilitating vocal disorders that detrimentally affect the daily lives of suffering individuals. Modern techniques and advanced knowledge of regenerative medicine have led to a deeper understanding of the microstructure, microphysiology, and micropathophysiology of vocal fold tissues. State-of-the-art materials ranging from extracellular-matrix (ECM)-derived biomaterials to synthetic polymer scaffolds have been proposed for the prevention and treatment of voice disorders including vocal fold scarring and fibrosis. This review intends to provide a thorough overview of current achievements in the field of vocal fold tissue engineering, including the fabrication of injectable biomaterials to mimic in vitro cell microenvironments, novel designs of bioreactors that capture in vivo tissue biomechanics, and establishment of various animal models to characterize the in vivo biocompatibility of these materials. The combination of polymeric scaffolds, cell transplantation, biomechanical stimulation, and delivery of antifibrotic growth factors will lead to successful restoration of functional vocal folds and improved vocal recovery in animal models, facilitating the application of these materials and related methodologies in clinical practice. PMID:27619243
Vocalizations produced by humpback whale (Megaptera novaeangliae) calves recorded in Hawaii.
Zoidis, Ann M; Smultea, Mari A; Frankel, Adam S; Hopkins, Julia L; Day, Andy; McFarland, A Sasha; Whitt, Amy D; Fertl, Dagmar
2008-03-01
Although humpback whale (Megaptera novaeangliae) calves are reported to vocalize, this has not been measurably verified. During March 2006, an underwater video camera and two-element hydrophone array were used to record nonsong vocalizations from a mother-calf escort off Hawaii. Acoustic data were analyzed; measured time delays between hydrophones provided bearings to 21 distinct vocalizations produced by the male calf. Signals were pulsed (71%), frequency modulated (19%), or amplitude modulated (10%). They were of simple structure, low frequency (mean=220 Hz), brief duration (mean=170 ms), and relatively narrow bandwidth (mean=2 kHz). The calf produced three series of "grunts" when approaching the diver. During winters of the years 2001-2005 in Hawaii, nonsong vocalizations were recorded in 109 (65%) of 169 groups with a calf using an underwater video and single (omnidirectional) hydrophone. Nonsong vocalizations were most common (34 of 39) in lone mother-calf pairs. A subsample from this dataset of 60 signals assessed to be vocalizations provided strong evidence that 10 male and 18 female calves vocalized based on statistical similarity to the 21 verified calf signals, proximity to an isolated calf (27 of 28 calves), strong signal-to-noise ratio, and/or bubble emissions coincident to sound.
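The bearing step mentioned in the abstract above (time delay between two hydrophones converted to a direction) can be illustrated with a minimal far-field sketch. The sound speed of 1500 m/s, the hydrophone spacing, and the function name are illustrative assumptions and are not taken from the study. A two-element array only yields the angle relative to the array axis, so the left/right ambiguity remains.

```python
import math

SOUND_SPEED_SEAWATER = 1500.0  # m/s, nominal value (assumption, not from the study)


def bearing_from_delay(delay_s, spacing_m, c=SOUND_SPEED_SEAWATER):
    """Far-field angle (degrees) between the array axis and the source.

    delay_s   -- arrival-time difference between the two hydrophones (s)
    spacing_m -- distance between the two hydrophones (m), hypothetical here
    """
    cos_theta = c * delay_s / spacing_m
    if abs(cos_theta) > 1.0:
        # A delay larger than spacing / c is physically impossible.
        raise ValueError("delay exceeds the maximum possible for this spacing")
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Zero delay means the source is broadside to the array.
    print(bearing_from_delay(0.0, 3.0))
    # A 1 ms delay over a 3 m baseline.
    print(bearing_from_delay(0.001, 3.0))
```

A delay of zero maps to a broadside source, and the largest admissible delay (spacing divided by sound speed) maps to a source on the array axis.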
Speech perception of sine-wave signals by children with cochlear implants
Nittrouer, Susan; Kuess, Jamie; Lowenstein, Joanna H.
2015-01-01
Children need to discover linguistically meaningful structures in the acoustic speech signal. Being attentive to recurring, time-varying formant patterns helps in that process. However, that kind of acoustic structure may not be available to children with cochlear implants (CIs), thus hindering development. The major goal of this study was to examine whether children with CIs are as sensitive to time-varying formant structure as children with normal hearing (NH) by asking them to recognize sine-wave speech. The same materials were presented as speech in noise, as well, to evaluate whether any group differences might simply reflect general perceptual deficits on the part of children with CIs. Vocabulary knowledge, phonemic awareness, and “top-down” language effects were all also assessed. Finally, treatment factors were examined as possible predictors of outcomes. Results showed that children with CIs were as accurate as children with NH at recognizing sine-wave speech, but poorer at recognizing speech in noise. Phonemic awareness was related to that recognition. Top-down effects were similar across groups. Having had a period of bimodal stimulation near the time of receiving a first CI facilitated these effects. Results suggest that children with CIs have access to the important time-varying structure of vocal-tract formants. PMID:25994709
Effects of human fatigue on speech signals
NASA Astrophysics Data System (ADS)
Stamoulis, Catherine
2004-05-01
Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.
Articulatory capacity of Neanderthals, a very recent and human-like fossil hominin
Barney, Anna; Martelli, Sandra; Serrurier, Antoine; Steele, James
2012-01-01
Scientists seek to use fossil and archaeological evidence to constrain models of the coevolution of human language and tool use. We focus on Neanderthals, for whom indirect evidence from tool use and ancient DNA appears consistent with an adaptation to complex vocal-auditory communication. We summarize existing arguments that the articulatory apparatus for speech had not yet come under intense positive selection pressure in Neanderthals, and we outline some recent evidence and analyses that challenge such arguments. We then provide new anatomical results from our own attempt to reconstruct vocal tract (VT) morphology in Neanderthals, and document our simulations of the acoustic and articulatory potential of this reconstructed Neanderthal VT. Our purpose in this paper is not to polarize debate about whether or not Neanderthals were human-like in all relevant respects, but to contribute to the development of methods that can be used to make further incremental advances in our understanding of the evolution of speech based on fossil and archaeological evidence. PMID:22106429
Modal response of a computational vocal fold model with a substrate layer of adipose tissue.
Jones, Cameron L; Achuthan, Ajit; Erath, Byron D
2015-02-01
This study demonstrates the effect of a substrate layer of adipose tissue on the modal response of the vocal folds, and hence, on the mechanics of voice production. Modal analysis is performed on the vocal fold structure with a lateral layer of adipose tissue. A finite element model is employed, and the first six mode shapes and modal frequencies are studied. The results show significant changes in modal frequencies and substantial variation in mode shapes depending on the strain rate of the adipose tissue. These findings highlight the importance of considering adipose tissue in computational vocal fold modeling.
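As a rough illustration of what a modal analysis computes, the sketch below solves the eigenvalue problem det(K - ω²M) = 0 for a two-mass spring chain in closed form. This is a toy lumped model, not the finite element model of the study: the masses, the stiffness values, and the idea of treating `k1` as a stand-in for a compliant substrate layer are all assumptions made for illustration.

```python
import math


def modal_frequencies_2dof(m1, m2, k1, k2):
    """Natural frequencies (Hz) of a wall--k1--m1--k2--m2 spring chain.

    With K = [[k1 + k2, -k2], [-k2, k2]] and M = diag(m1, m2), the
    characteristic equation det(K - lam * M) = 0 reduces to
        m1*m2*lam**2 - (m1*k2 + m2*(k1 + k2))*lam + k1*k2 = 0,
    where lam = omega**2.
    """
    a = m1 * m2
    b = -(m1 * k2 + m2 * (k1 + k2))
    c = k1 * k2
    disc = math.sqrt(b * b - 4.0 * a * c)
    lams = sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])
    return [math.sqrt(lam) / (2.0 * math.pi) for lam in lams]


if __name__ == "__main__":
    stiff = modal_frequencies_2dof(1.0, 1.0, 1.0, 1.0)
    soft = modal_frequencies_2dof(1.0, 1.0, 0.5, 1.0)  # softer support spring
    print(stiff, soft)
```

In this toy system, softening the support spring lowers the first modal frequency, which is the qualitative kind of shift a modal analysis is designed to expose.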
Fenzl, Thomas; Schuller, Gerd
2005-01-01
Background: Echolocating bats emit vocalizations that can be classified either as echolocation calls or communication calls. Neural control of both types of calls must govern the same pool of motoneurons responsible for vocalizations. Electrical microstimulation in the periaqueductal gray matter (PAG) elicits both communication and echolocation calls, whereas stimulation of the paralemniscal area (PLA) induces only echolocation calls. In both the PAG and the PLA, the current thresholds for triggering natural vocalizations do not habituate to stimuli and remain low even for long stimulation periods, indicating that these structures have relatively direct access to the final common pathway for vocalization. This study intended to clarify whether echolocation calls and communication calls are controlled differentially below the level of the PAG via separate vocal pathways before converging on the motoneurons used in vocalization. Results: Both structures were probed simultaneously in a single experimental approach. Two stimulation electrodes were chronically implanted within the PAG in order to elicit either echolocation or communication calls. Blockade of the ipsilateral PLA site by iontophoretic application of the glutamate antagonist kynurenic acid did not impede either echolocation or communication calls elicited from the PAG. However, blockade of the contralateral PLA suppressed PAG-elicited echolocation calls but not communication calls. In both cases the blockade was reversible. Conclusion: The neural control of echolocation and communication calls seems to be differentially organized below the level of the PAG. The PLA is an essential functional unit for echolocation call control before the descending pathways again share the final common pathway for vocalization. PMID:16053533
Determining the etiology of mild vocal fold hypomobility.
Heman-Ackah, Yolanda D; Batory, Mark
2003-12-01
The prevalence of mild vocal fold hypomobility is unknown. In a study by Heman-Ackah et al, vocal fold hypomobility in a population of singing teachers was found to be associated more frequently with vocal complaints than was the presence of vocal fold masses. The etiology of mild vocal fold hypomobility has not been previously explored. In the present study, a retrospective chart review was performed of 134 patients who presented to a tertiary laryngology referral center over a 6-month period for evaluation of vocal complaints. Of the 134 patients, 61 (46%) were found to have mild vocal referring otolaryngologist. Imaging studies and laboratory tests to evaluate for structural, metabolic, and infectious causes of the decreased mobility had been ordered. Forty-nine patients completed the work-up. Of these, 41 out of 49 (84%) were found to have imaging or laboratory findings that could explain the hypomobility. Thyroid abnormalities were found to be associated with vocal fold hypomobility in 21 out of 49 (43%) of those with a complete evaluation. Other causes of vocal fold hypomobility included idiopathic (8 of 49, 16%), viral neuritis (5 of 49, 10%), central nervous system abnormality (4 of 49, 8%), neural tumor (3 of 49, 6%), joint dysfunction (3 of 49, 6%), iatrogenic nerve injury (2 of 49, 4%), myopathy (2 of 49, 4%), and noniatrogenic traumatic nerve injury (1 of 49, 2%). This study shows that unilateral vocal fold hypomobility often is associated with a physiologic process, and a complete investigation to determine the etiology is warranted in all cases.
Vocal fold contact patterns based on normal modes of vibration.
Smith, Simeon L; Titze, Ingo R
2018-05-17
The fluid-structure interaction and energy transfer from respiratory airflow to self-sustained vocal fold oscillation continue to be topics of interest in vocal fold research. Vocal fold vibration is driven by pressures on the vocal fold surface, which are determined by the shape of the glottis and the contact between vocal folds. Characterization of three-dimensional glottal shapes and contact patterns can lead to increased understanding of normal and abnormal physiology of the voice, as well as to development of improved vocal fold models, but a large inventory of shapes has not been directly studied previously. This study aimed to take an initial step toward characterizing vocal fold contact patterns systematically. Vocal fold motion and contact were modeled based on normal mode vibration, as it has been shown that vocal fold vibration can be almost entirely described by only the few lowest order vibrational modes. Symmetric and asymmetric combinations of the four lowest normal modes of vibration were superimposed on left and right vocal fold medial surfaces, for each of three prephonatory glottal configurations, according to a surface wave approach. Contact patterns were generated from the interaction of modal shapes at 16 normalized phases during the vibratory cycle. Eight major contact patterns were identified and characterized by the shape of the flow channel, with the following descriptors assigned: convergent, divergent, convergent-divergent, uniform, split, merged, island, and multichannel. Each of the contact patterns and its variation are described, and future work and applications are discussed. Copyright © 2018 Elsevier Ltd. All rights reserved.
Patient-Specific Computational Modeling of Human Phonation
NASA Astrophysics Data System (ADS)
Xue, Qian; Zheng, Xudong; University of Maine Team
2013-11-01
Phonation is a common biological process resulting from the complex nonlinear coupling between glottal aerodynamics and vocal fold vibrations. In the past, simplified symmetric straight geometric models were commonly employed for experimental and computational studies. The shapes of the larynx lumen and vocal folds are, however, highly three-dimensional, and the complex realistic geometry has a profound impact on both glottal flow and vocal fold vibrations. To elucidate the effect of geometric complexity on voice production and improve the fundamental understanding of human phonation, a full flow-structure interaction simulation is carried out on a patient-specific larynx model. To the best of our knowledge, this is the first patient-specific flow-structure interaction study of human phonation. The simulation results compare well with established human data. The effects of realistic geometry on glottal flow and vocal fold dynamics are investigated. It is found that both glottal flow and vocal fold dynamics differ substantially from those of the previous simplified models. This study also paves an important step toward the development of computer models for voice disease diagnosis and surgical planning. The project described was supported by Grant Number R01DC007125 from the National Institute on Deafness and Other Communication Disorders (NIDCD).
Current treatment of vocal fold scarring.
Hirano, Shigeru
2005-06-01
Vocal fold scarring still remains a therapeutic challenge, with the most problematic issue being the histologic changes that are primarily responsible for altering the viscoelasticity of the vocal fold mucosa. Optimal treatment for vocal fold scarring has not yet been established. To restore or regenerate damaged vocal folds, it is important to investigate the changes to the layer structure of the lamina propria. Tissue engineering and regenerative medicine may provide new strategies for the prevention and treatment of vocal fold scarring. Recent developments in this field are reviewed in the present article. Histologic studies have revealed that hyaluronic acid, fibronectin, decorin, and various other extracellular matrix components, as well as collagen, may contribute to determining the vibratory properties of the vocal fold mucosa. Changes of these molecules are thought to affect the viscoelasticity of the scarred vocal folds. Based on such histologic findings, innovative approaches have been developed, including administration of hyaluronic acid into injured or scarred vocal folds. Other strategies that have recently shown advances include growth factor therapy and cell therapy using stem cells or mature fibroblasts. The effects of these new treatments have not fully been confirmed clinically, but there seems to be great therapeutic potential in such regenerative medical strategies. Recent research has revealed the detailed histologic and rheologic changes related to vocal fold scarring. Based on these findings, various new therapeutic strategies have been developed in animal models using tissue engineering and regenerative medicine. However, no clinical trials have been performed, and more studies are necessary to establish the optimum modality.
Kozakiewicz, Jacek; Gierlotka, Agata; Dec, Maciej; Stockfish, Jerzy
2010-01-01
A rare case of a 75-year-old female patient is presented in this paper. She reported hoarseness in addition to pharyngeal pain, dysphagia, and moderate dyspnea. Examination revealed a wide hematoma of the left lateral wall of the orohypopharynx spreading to the left aryepiglottic fold, left aryepiglottic cartilage, false and true vocal folds, and later to the left lateral and posterior tracheal wall. The patient did not require airway control by intubation or tracheotomy because of quick relief after pharmacological treatment.
A biorobotic model of the human larynx.
Manti, M; Cianchetti, M; Nacci, A; Ursino, F; Laschi, C
2015-08-01
This work focuses on a physical model of the human larynx that replicates its main components and functions. The prototype reproduces the multilayer vocal folds and the ab/adduction movements. In particular, the vocal fold prototype is made with soft materials whose mechanical properties are similar to those of the natural tissue in terms of viscoelasticity. A computational model was used to study fluid-structure interaction between the vocal folds and the airflow. This tool allowed us to compare theoretical and experimental results. Measurements were performed with this prototype in an experimental platform comprising a controlled air flow, pressure sensors, and a high-speed camera for measuring vocal fold vibrations. Data included oscillation frequency at the onset pressure and glottal width. Results show that the combination of vocal fold geometry, mechanical properties, and dimensions yields an oscillation frequency close to that of the human vocal fold. Moreover, computational results show a high correlation with the experimental ones.
Measurement of flow separation in a human vocal folds model
NASA Astrophysics Data System (ADS)
Šidlof, Petr; Doaré, Olivier; Cadot, Olivier; Chaigne, Antoine
2011-07-01
The paper provides experimental data on flow separation from a model of the human vocal folds. Data were measured on a four times scaled physical model, where one vocal fold was fixed and the other oscillated due to fluid-structure interaction. The vocal folds were fabricated from silicone rubber and placed on elastic support in the wall of a transparent wind tunnel. A PIV system was used to visualize the flow fields immediately downstream of the glottis and to measure the velocity fields. From the visualizations, the position of the flow separation point was evaluated using a semiautomatic procedure and plotted for different airflow velocities. The separation point position was quantified relative to the orifice width separately for the left and right vocal folds to account for flow asymmetry. The results indicate that the flow separation point remains close to the narrowest cross-section during most of the vocal fold vibration cycle, but moves significantly further downstream shortly prior to and after glottal closure.
A versatile pitch tracking algorithm: from human speech to killer whale vocalizations.
Shapiro, Ari Daniel; Wang, Chao
2009-07-01
In this article, a pitch tracking algorithm [named discrete logarithmic Fourier transformation-pitch detection algorithm (DLFT-PDA)], originally designed for human telephone speech, was modified for killer whale vocalizations. The multiple frequency components of some of these vocalizations demand a spectral (rather than temporal) approach to pitch tracking. The DLFT-PDA algorithm derives reliable estimations of pitch and the temporal change of pitch from the harmonic structure of the vocal signal. Scores from both estimations are combined in a dynamic programming search to find a smooth pitch track. The algorithm is capable of tracking killer whale calls that contain simultaneous low and high frequency components and compares favorably across most signal to noise ratio ranges to the peak-picking and sidewinder algorithms that have been used for tracking killer whale vocalizations previously.
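The dynamic programming search mentioned in the abstract above can be sketched generically as a Viterbi-style pass over per-frame pitch candidates. This is not the DLFT-PDA itself: the candidate lists, the scores, and the `jump_penalty` weighting on log-frequency jumps are hypothetical choices made only to show how combining local scores with a continuity cost yields a smooth track.

```python
import math


def smooth_pitch_track(candidates, jump_penalty=1.0):
    """Pick one pitch per frame, maximising score minus jump cost.

    candidates   -- list of frames; each frame is a non-empty list of
                    (pitch_hz, score) tuples (hypothetical input format)
    jump_penalty -- cost per octave of pitch change between frames
    """
    if not candidates or any(not frame for frame in candidates):
        raise ValueError("every frame needs at least one candidate")

    best = [score for _, score in candidates[0]]  # best total ending at each candidate
    back = []                                     # back-pointers, one list per later frame
    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        cur_best, cur_back = [], []
        for pitch, score in candidates[t]:
            options = [
                best[j] - jump_penalty * abs(math.log2(pitch / prev[j][0]))
                for j in range(len(prev))
            ]
            j_star = max(range(len(prev)), key=options.__getitem__)
            cur_best.append(options[j_star] + score)
            cur_back.append(j_star)
        best = cur_best
        back.append(cur_back)

    # Trace the best path backwards from the final frame.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for cur_back in reversed(back):
        idx = cur_back[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i][0] for t, i in enumerate(path)]


if __name__ == "__main__":
    frames = [
        [(200.0, 1.0)],
        [(205.0, 0.8), (410.0, 0.9)],  # the octave-up candidate scores slightly higher
        [(210.0, 1.0)],
    ]
    print(smooth_pitch_track(frames, jump_penalty=1.0))
    print(smooth_pitch_track(frames, jump_penalty=0.0))
```

With the continuity cost switched on, the search rejects the slightly higher-scoring octave-up candidate in the middle frame; with the cost set to zero it simply follows the best local score.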
The audiovisual structure of onomatopoeias: An intrusion of real-world physics in lexical creation.
Taitz, Alan; Assaneo, M Florencia; Elisei, Natalia; Trípodi, Mónica; Cohen, Laurent; Sitt, Jacobo D; Trevisan, Marcos A
2018-01-01
Sound-symbolic word classes are found in different cultures and languages worldwide. These words are continuously produced to code complex information about events. Here we explore the capacity of creative language to transport complex multisensory information in a controlled experiment, where our participants improvised onomatopoeias from noisy moving objects in audio, visual and audiovisual formats. We found that consonants communicate movement types (slide, hit or ring) mainly through the manner of articulation in the vocal tract. Vowels communicate shapes in visual stimuli (spiky or rounded) and sound frequencies in auditory stimuli through the configuration of the lips and tongue. A machine learning model was trained to classify movement types and used to validate generalizations of our results across formats. We implemented the classifier with a list of cross-linguistic onomatopoeias: simple actions were correctly classified, while different aspects were selected to build onomatopoeias of complex actions. These results show how the different aspects of complex sensory information are coded and how they interact in the creation of novel onomatopoeias.
Avian vocal mimicry: a unified conceptual framework.
Dalziell, Anastasia H; Welbergen, Justin A; Igic, Branislav; Magrath, Robert D
2015-05-01
Mimicry is a classical example of adaptive signal design. Here, we review the current state of research into vocal mimicry in birds. Avian vocal mimicry is a conspicuous and often spectacular form of animal communication, occurring in many distantly related species. However, the proximate and ultimate causes of vocal mimicry are poorly understood. In the first part of this review, we argue that progress has been impeded by conceptual confusion over what constitutes vocal mimicry. We propose a modified version of Vane-Wright's (1980) widely used definition of mimicry. According to our definition, a vocalisation is mimetic if the behaviour of the receiver changes after perceiving the acoustic resemblance between the mimic and the model, and the behavioural change confers a selective advantage on the mimic. Mimicry is therefore specifically a functional concept where the resemblance between heterospecific sounds is a target of selection. It is distinct from other forms of vocal resemblance including those that are the result of chance or common ancestry, and those that have emerged as a by-product of other processes such as ecological convergence and selection for large song-type repertoires. Thus, our definition provides a general and functionally coherent framework for determining what constitutes vocal mimicry, and takes account of the diversity of vocalisations that incorporate heterospecific sounds. In the second part we assess and revise hypotheses for the evolution of avian vocal mimicry in the light of our new definition. Most of the current evidence is anecdotal, but the diverse contexts and acoustic structures of putative vocal mimicry suggest that mimicry has multiple functions across and within species. There is strong experimental evidence that vocal mimicry can be deceptive, and can facilitate parasitic interactions. There is also increasing support for the use of vocal mimicry in predator defence, although the mechanisms are unclear. 
Less progress has been made in explaining why many birds incorporate heterospecific sounds into their sexual displays, and in determining whether these vocalisations are functionally mimetic or by-products of sexual selection for other traits such as repertoire size. Overall, this discussion reveals a more central role for vocal mimicry in the behavioural ecology of birds than has previously been appreciated. The final part of this review identifies important areas for future research. Detailed empirical data are needed on individual species, including on the structure of mimetic signals, the contexts in which mimicry is produced, how mimicry is acquired, and the ecological relationships between mimic, model and receiver. At present, there is little information and no consensus about the various costs of vocal mimicry for the protagonists in the mimicry complex. The diversity and complexity of vocal mimicry in birds raises important questions for the study of animal communication and challenges our view of the nature of mimicry itself. Therefore, a better understanding of avian vocal mimicry is essential if we are to account fully for the diversity of animal signals. © 2014 The Authors. Biological Reviews © 2014 Cambridge Philosophical Society.
Manternach, Jeremy N; Clark, Chad; Daugherty, James F
2017-07-01
Researchers have found that semi-occluded vocal tract (SOVT) exercises may increase vocal economy by reducing phonation threshold pressure and effort while increasing or maintaining consistent acoustic output. This research has focused solely on individual singers. Much singing instruction, however, takes place in choral settings. Choral singers may use different resonance strategies or unconsciously adjust their singing based on the ability to hear their own sound in relation to others. Results of studies with individual singers, then, may not be directly applicable to choral settings. The purpose of this investigation was to measure the effect of an SOVT protocol (ie, straw phonation) on acoustic changes of conglomerate, choral sound. This is a quasi-experimental, one-group, pretest-posttest design. Participants in this study constituted an intact SATB choir (soprano, alto, tenor, and bass) (N = 15 singers) who performed two unaccompanied pieces of varied tempos from memory, participated in a 4-minute straw phonation protocol with a small stirring straw, and then sang each piece a second time. The long-term average spectrum results indicated small, statistically significant increases in spectral energy for both pieces in the 0-10 kHz (.32 and .20 dB Sound Pressure Level) and 2-4 kHz regions (.46 and .25 dB SPL). These results, although not likely audible to listeners with average hearing, seem consistent with the assertion that singers enjoy vocal benefits with consistent or increased vocal output. SOVT exercises, therefore, may be useful as a time-efficient way to evoke more efficient and economical singing during choral warm-up and voice building procedures. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The vocal repertoire of Tibetan macaques (Macaca thibetana): A quantitative classification.
Bernstein, Sofia K; Sheeran, Lori K; Wagner, R Steven; Li, Jin-Hua; Koda, Hiroki
2016-09-01
Vocal repertoires are basic and essential components for describing vocal communication in animals. Studying the entire suite of vocal signals aids investigations on the variation of acoustic structure across social contexts, comparisons on the complexity of communication systems across taxa, and in exploration of the evolutionary origins of species-specific vocalizations. Here, we describe the vocal repertoire of the largest species in the macaque genus, Macaca thibetana. We extracted thirty acoustic parameters from call recordings. Post hoc validation through quantitative analyses of the a priori repertoire classified eleven call types: coo, squawk, squeal, noisy scream, growl, bark, compound squeak, leap coo, weeping, modulated tonal scream, and pant. In comparison to the rest of the genus, Tibetan macaques uttered a wider array of vocalizations in the context of copulations. Previous reports did not include modulated tonal screams and pants during harassment of copulatory dyads. Furthermore, in comparison to the rest of the genus, Tibetan macaque females emit acoustically distinct copulation calls. The vocal repertoire of Tibetan macaques contributes to the literature on the emergence of species-specific calls in the genus Macaca with potential insights from social, reproductive, and ecological comparisons across species. Am. J. Primatol. 78:937-949, 2016. © 2016 Wiley Periodicals, Inc.
High-precision spatial localization of mouse vocalizations during social interaction.
Heckman, Jesse J; Proville, Rémi; Heckman, Gert J; Azarfar, Alireza; Celikel, Tansu; Englitz, Bernhard
2017-06-07
Mice display a wide repertoire of vocalizations that varies with age, sex, and context. Especially during courtship, mice emit ultrasonic vocalizations (USVs) of high complexity, whose detailed structure is poorly understood. As animals of both sexes vocalize, the study of social vocalizations requires attributing single USVs to individuals. The state-of-the-art in sound localization for USVs allows spatial localization at centimeter resolution; however, animals interact at closer ranges, involving tactile, snout-snout exploration. Hence, improved algorithms are required to reliably assign USVs. We develop multiple solutions to USV localization, and derive an analytical solution for arbitrary vertical microphone positions. The algorithms are compared on wideband acoustic noise and single mouse vocalizations, and applied to social interactions with optically tracked mouse positions. A novel, (frequency) envelope weighted generalised cross-correlation outperforms classical cross-correlation techniques. It achieves a median error of ~1.4 mm for noise and ~4-8.5 mm for vocalizations. Using this algorithm in combination with a level criterion, we can improve the assignment for interacting mice. We report significant differences in mean USV properties between CBA mice of different sexes during social interaction. Hence, the improved USV attribution to individuals lays the basis for a deeper understanding of social vocalizations, in particular sequences of USVs.
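For orientation, the classical cross-correlation baseline that the abstract above compares against can be sketched as follows. This is plain time-domain cross-correlation between two microphone signals, not the envelope-weighted generalised cross-correlation developed in the study; the 250 kHz sampling rate, the 343 m/s sound speed, and the function names are assumptions chosen for illustration.

```python
def estimate_delay_samples(x, y, max_lag):
    """Lag (in samples) by which y trails x, found by maximising
    the cross-correlation sum over lags in [-max_lag, max_lag]."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                val += x[n] * y[m]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def path_difference_m(lag_samples, sample_rate_hz, c=343.0):
    """Convert a sample lag into a source-to-microphone path difference (m)."""
    return lag_samples / sample_rate_hz * c


if __name__ == "__main__":
    # Synthetic two-sample pulse; microphone 2 receives it 3 samples later.
    x = [0.0] * 32
    y = [0.0] * 32
    x[10], x[11] = 1.0, -0.5
    y[13], y[14] = 1.0, -0.5
    lag = estimate_delay_samples(x, y, max_lag=8)
    print(lag, path_difference_m(lag, 250_000.0))
```

Each microphone pair constrains the source to a hyperbola of constant path difference, so several pairs are needed before a position can be assigned to one of the interacting animals.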
Numerical solution of fluid-structure interaction represented by human vocal folds in airflow
NASA Astrophysics Data System (ADS)
Valášek, J.; Sváček, P.; Horáček, J.
2016-03-01
The paper deals with vibration of the human vocal folds excited by fluid flow. The vocal fold is modelled as an elastic body assuming small displacements, so linear elasticity theory is used. The fluid flow is considered viscous and incompressible. For the numerical solution, the arbitrary Lagrangian-Eulerian (ALE) method is used. The whole problem is solved by a finite element method (FEM) based solver. Results of numerical experiments with different boundary conditions are presented.
Analysis of human scream and its impact on text-independent speaker verification.
Hansen, John H L; Nandwana, Mahesh Kumar; Shokouhi, Navid
2017-04-01
Scream is defined as a sustained, high-energy vocalization that lacks phonological structure. Lack of phonological structure is how scream is distinguished from other forms of loud vocalization, such as "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and the Lombard effect contributes to degraded performance in automatic speech systems (e.g., speech recognition, speaker identification, diarization). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study considers a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture models-universal background model framework is unreliable when evaluated with screams.
Frey, Roland; Volodin, Ilya; Volodina, Elena; Carranza, Juan; Torres-Porras, Jerónimo
2012-03-01
Roaring in rutting Iberian red deer stags Cervus elaphus hispanicus is unusual compared to other subspecies of red deer, which radiated from the Iberian refugium after the last glacial maximum. In all red deer stags, the larynx occupies a permanent low mid-neck resting position and is momentarily retracted almost down to the rostral end of the sternum during the production of rutting calls. Simultaneous with the retraction of the larynx, male Iberian red deer pronouncedly protrude the tongue during most of their rutting roars. This poses a mechanical challenge for the vocal tract (vt) and for the hyoid apparatus, as tongue and larynx are strongly pulled in opposite directions. This study (i) examines the vocal anatomy and the acoustics of the rutting roars in free-ranging male C. e. hispanicus; (ii) establishes a potential mechanism of simultaneous tongue protrusion and larynx retraction by applying a two-dimensional model based on graphic reconstructions in single video frames of unrestrained animals; and (iii) advances a hypothesis of evaporative cooling by tongue protrusion in the males of a subspecies of red deer constrained to perform all of the exhausting rutting activities, including acoustic display, in a hot and arid season. © 2012 The Authors. Journal of Anatomy © 2012 Anatomical Society.
Effect of Performance Time of the Semi-Occluded Vocal Tract Exercises in Dysphonic Children.
Ramos, Lorena de Almeida; Gama, Ana Cristina Côrtes
2017-05-01
This study aimed to verify the effects of execution time on auditory-perceptual and acoustic responses in children with dysphonia completing straw phonation exercises. A randomized, prospective, comparative intra-subject study design was used. Twenty-seven children, ranging from 5 to 10 years of age, diagnosed with vocal cord nodules or cysts, were enrolled in the study. All subjects included in the Experimental Group were also included in the Control Group which involved complete voice rest. Sustained vowels (/a/e/ε/e/) counting from 1 to 10 were recorded before the exercises (m0) and then again after the first (m1), third (m3), fifth (m5), and seventh (m7) minutes of straw phonation exercises. The recordings were randomized and presented to five speech therapists, who evaluated vocal quality based on the Grade Roughness Breathiness Asthenia/Strain Instability scale. For acoustic analysis, fundamental frequency, jitter, shimmer, glottal to noise excitation ratio, and noise parameters were analyzed. Reduced roughness, breathiness, and noise measurements as well as increased glottal to noise excitation ratio were observed in the Experimental Group after 3 minutes of exercise. Reduced grade of dysphonia and breathiness were noted after 5 minutes. The ideal duration of straw phonation in children with dysphonia is from 3 to 5 minutes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Killer whales are capable of vocal learning
Foote, Andrew D; Griffin, Rachael M; Howitt, David; Larsson, Lisa; Miller, Patrick J.O; Rus Hoelzel, A
2006-01-01
The production learning of vocalizations by manipulation of the sound production organs to alter the physical structure of sound has been demonstrated in only a few mammals. In this natural experiment, we document the vocal behaviour of two juvenile killer whales, Orcinus orca, separated from their natal pods, which are the only cases of dispersal seen during the three decades of observation of their populations. We find mimicry of California sea lion (Zalophus californianus) barks, demonstrating the vocal production learning ability for one of the calves. We also find differences in call usage (compared to the natal pod) that may reflect the absence of a repertoire model from tutors or some unknown effect related to isolation or context. PMID:17148275
Rehearsal Effects in Adult Word Learning
ERIC Educational Resources Information Center
Kaushanskaya, Margarita; Yoo, Jeewon
2011-01-01
The goal of this research was to examine the effects of phonological familiarity and rehearsal method (vocal vs. subvocal) on novel word learning. In Experiment 1, English-speaking adults learned phonologically familiar novel words that followed English phonological structure. Participants learned half the words via vocal rehearsal (saying the…
Nonlinear laser scanning microscopy of human vocal folds.
Miri, Amir K; Tripathy, Umakanta; Mongeau, Luc; Wiseman, Paul W
2012-02-01
The purpose of this work was to apply nonlinear laser scanning microscopy (NLSM) for visualizing the morphology of extracellular matrix proteins within human vocal folds. This technique may potentially assist clinicians in making rapid diagnoses of vocal fold tissue disease or damage. Microstructural characterization based on NLSM provides valuable information for better understanding molecular mechanisms and tissue structure. Experimental, ex vivo human vocal fold. A custom-built multimodal nonlinear laser scanning microscope was used to scan fibrillar proteins in three 4% formaldehyde-fixed cadaveric samples. Collagen and elastin, key extracellular matrix proteins in the vocal fold lamina propria, were imaged by two nonlinear microscopy modalities: second harmonic generation (SHG) and two-photon fluorescence (TPF), respectively. An experimental protocol was introduced to characterize the geometrical properties of the imaged fibrous proteins. NLSM revealed the biomorphology of the human vocal fold fibrous proteins. No photobleaching was observed for the incident laser power of ∼60 mW before the excitation objective. Types I and III fibrillar collagen were imaged without label in the tissue by intrinsic SHG. Imaging while rotating the incident laser light-polarization direction confirmed a helical shape for the collagen fibers. The amplitude, periodicity, and overall orientation were then computed for the helically distributed collagen network. The elastin network was simultaneously imaged via TPF and found to have a basket-like structure. In some regions, particularly close to the epithelium, colocalization of both extracellular matrix components was observed. A benchmark study is presented for quantitative real-time, ex vivo, NLSM imaging of the extracellular macromolecules in human vocal fold lamina propria. The results are promising for clinical applications. Copyright © 2011 The American Laryngological, Rhinological, and Otological Society, Inc.
Social learning of vocal structure in a nonhuman primate?
2011-01-01
Background: Non-human primate communication is thought to be fundamentally different from human speech, mainly due to vast differences in vocal control. The lack of these abilities in non-human primates is especially striking if compared to some marine mammals and bird species, which has generated somewhat of an evolutionary conundrum. What are the biological roots and underlying evolutionary pressures of the human ability to voluntarily control sound production and learn the vocal utterances of others? One hypothesis is that this capacity has evolved gradually in humans from an ancestral stage that resembled the vocal behavior of modern primates. Support for this has come from studies that have documented limited vocal flexibility and convergence in different primate species, typically in calls used during social interactions. The mechanisms underlying these patterns, however, are currently unknown. Specifically, it has been difficult to rule out explanations based on genetic relatedness, suggesting that such vocal flexibility may not be the result of social learning. Results: To address this point, we compared the degree of acoustic similarity of contact calls in free-ranging Campbell's monkeys as a function of their social bonds and genetic relatedness. We calculated three different indices to compare the similarities between the calls' frequency contours, the duration of grooming interactions and the microsatellite-based genetic relatedness between partners. We found a significantly positive relation between bond strength and acoustic similarity that was independent of genetic relatedness. Conclusion: Genetic factors determine the general species-specific call repertoire of a primate species, while social factors can influence the fine structure of some of the call types. The finding is in line with the more general hypothesis that human speech has evolved gradually from earlier primate-like vocal communication. PMID:22177339
Vocalization frequency and duration are coded in separate hindbrain nuclei.
Chagnaud, Boris P; Baker, Robert; Bass, Andrew H
2011-06-14
Temporal patterning is an essential feature of neural networks producing precisely timed behaviours such as vocalizations that are widely used in vertebrate social communication. Here we show that intrinsic and network properties of separate hindbrain neuronal populations encode the natural call attributes of frequency and duration in vocal fish. Intracellular structure/function analyses indicate that call duration is encoded by a sustained membrane depolarization in vocal prepacemaker neurons that innervate downstream pacemaker neurons. Pacemaker neurons, in turn, encode call frequency by rhythmic, ultrafast oscillations in their membrane potential. Pharmacological manipulations show prepacemaker activity to be independent of pacemaker function, thus accounting for natural variation in duration which is the predominant feature distinguishing call types. Prepacemaker neurons also innervate key hindbrain auditory nuclei thereby effectively serving as a call-duration corollary discharge. We propose that premotor compartmentalization of neurons coding distinct acoustic attributes is a fundamental trait of hindbrain vocal pattern generators among vertebrates.
Vocalization frequency and duration are coded in separate hindbrain nuclei
Chagnaud, Boris P.; Baker, Robert; Bass, Andrew H.
2011-01-01
Temporal patterning is an essential feature of neural networks producing precisely timed behaviours such as vocalizations that are widely used in vertebrate social communication. Here we show that intrinsic and network properties of separate hindbrain neuronal populations encode the natural call attributes of frequency and duration in vocal fish. Intracellular structure/function analyses indicate that call duration is encoded by a sustained membrane depolarization in vocal prepacemaker neurons that innervate downstream pacemaker neurons. Pacemaker neurons, in turn, encode call frequency by rhythmic, ultrafast oscillations in their membrane potential. Pharmacological manipulations show prepacemaker activity to be independent of pacemaker function, thus accounting for natural variation in duration which is the predominant feature distinguishing call types. Prepacemaker neurons also innervate key hindbrain auditory nuclei thereby effectively serving as a call-duration corollary discharge. We propose that premotor compartmentalization of neurons coding distinct acoustic attributes is a fundamental trait of hindbrain vocal pattern generators among vertebrates. PMID:21673667
A budget of energy transfer in a sustained vocal folds vibration in glottis
NASA Astrophysics Data System (ADS)
Zhang, Lucy; Yang, Jubiao; Krane, Michael
2016-11-01
A set of force and energy balance equations using the control volume approach is derived from first principles for sustained vocal fold vibration in the glottis. The control volume analysis is done for compressible airflow in a moving and deforming control volume in the vicinity of the vocal folds. The interaction between laryngeal airflow and the vocal folds is simulated using the modified Immersed Finite Element Method (mIFEM), a fully coupled approach to simulating fluid-structure interactions. Detailed mathematical terms are separated out for deeper physical understanding, and the utilization of mechanical energy is quantified with the derived equation. The results show that the majority of the energy input is consumed in driving laryngeal airflow, while a smaller portion compensates for viscous losses in, and sustains the vibration of, the vocal folds. We acknowledge the funding support of NIH 2R01DC005642-10A1.
NASA Astrophysics Data System (ADS)
Sommer, David; Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.
2011-11-01
Voiced speech is produced by dynamic fluid-structure interactions in the larynx. Traditionally, reduced order models of speech have relied upon simplified inviscid flow solvers to prescribe the fluid loadings that drive vocal fold motion, neglecting viscous flow effects that occur naturally in voiced speech. Viscous phenomena, such as skewing of the intraglottal jet, have the most pronounced effect on voiced speech in cases of vocal fold paralysis where one vocal fold loses some, or all, muscular control. The impact of asymmetric intraglottal flow in pathological speech is captured in a reduced order two-mass model of speech by coupling a boundary-layer estimation of the asymmetric pressures with asymmetric tissue parameters that are representative of recurrent laryngeal nerve paralysis. Nonlinear analysis identifies the emergence of irregular and chaotic vocal fold dynamics at values representative of pathological speech conditions.
Properties of vocalization- and gesture-combinations in the transition to first words.
Murillo, Eva; Capilla, Almudena
2016-07-01
Gestures and vocal elements interact from the early stages of language development, but the role of this interaction in the language learning process is not yet completely understood. The aim of this study is to explore the influence of gestural accompaniment on the acoustic properties of vocalizations in the transition to first words. Eleven Spanish children aged 0;9 to 1;3 were observed longitudinally in a semi-structured play situation with an adult. Vocalizations were analyzed using several acoustic parameters based on those described by Oller et al. (2010). Results indicate that declarative vocalizations have fewer protosyllables than imperative ones, but only when they are produced with a gesture. Protosyllable duration and f(0) are more similar to those of mature speech when produced with pointing and declarative function than when produced with reaching gestures and imperative purposes. The proportion of canonical syllables produced increases with age, but only when combined with a gesture.
Vocal communication in African elephants (Loxodonta africana).
Soltis, Joseph
2010-01-01
Research on vocal communication in African elephants has increased in recent years, both in the wild and in captivity, providing an opportunity to present a comprehensive review of research related to their vocal behavior. Current data indicate that the vocal repertoire consists of perhaps nine acoustically distinct call types, "rumbles" being the most common and acoustically variable. Large vocal production anatomy is responsible for the low-frequency nature of rumbles, with fundamental frequencies in the infrasonic range. Additionally, resonant frequencies of rumbles implicate the trunk in addition to the oral cavity in shaping the acoustic structure of rumbles. Long-distance communication is thought possible because low-frequency sounds propagate more faithfully than high-frequency sounds, and elephants respond to rumbles at distances of up to 2.5 km. Elephant ear anatomy appears designed for detecting low frequencies, and experiments demonstrate that elephants can detect infrasonic tones and discriminate small frequency differences. Two vocal communication functions in the African elephant now have reasonable empirical support. First, closely bonded but spatially separated females engage in rumble exchanges, or "contact calls," that function to coordinate movement or reunite animals. Second, both males and females produce "mate attraction" rumbles that may advertise reproductive states to the opposite sex. Additionally, there is evidence that the structural variation in rumbles reflects the individual identity, reproductive state, and emotional state of callers. Growth in knowledge about the communication system of the African elephant has occurred from a rich combination of research on wild elephants in national parks and captive elephants in zoological parks.
Ellis, Jesse M S; Riters, Lauren V
2012-01-01
Transmitting information via communicative signals is integral to interacting with conspecifics, and some species achieve this task by varying vocalizations to reflect context. Although signal variation is critical to social interactions, the underlying neural control has not been studied. In response to a predator, black-capped chickadees (Poecile atricapilla) produce mobbing calls (chick-a-dee calls) with various parameters, some of which convey information about the threat stimulus. We predicted that vocal parameters indicative of threat would be associated with distinct patterns of neuronal activity within brain areas involved in social behavior and those involved in the sensorimotor control of vocal production. To test this prediction, we measured the syntax and structural aspects of chick-a-dee call production in response to a hawk model and assessed the protein product of the immediate early gene FOS in brain regions implicated in context-specific vocal and social behavior. These regions include the medial preoptic area (POM) and lateral septum (LS), as well as regions involved in vocal motor control, including the dorsomedial nucleus of the intercollicular complex and the HVC. We found correlations linking call rate (previously demonstrated to reflect threat) to labeling in the POM and LS. Labeling in the HVC correlated with the number of D notes per call, which may also signal threat level. Labeling in the call control region dorsomedial nucleus was associated with the structure of D notes and the overall number of notes, but not call rate or type of notes produced. These results suggest that the POM and LS may influence attributes of vocalizations produced in response to predators and that the brain region implicated in song control, the HVC, also influences call production. 
Because variation in chick-a-dee call rate indicates predator threat, we speculate that these areas could integrate with motor control regions to imbue mobbing signals with additional information about threat level. Copyright © 2011 S. Karger AG, Basel.
Smith, Simeon L.; Titze, Ingo R.
2016-01-01
Objectives: To characterize the pressure-flow relationship of tubes used for semi-occluded vocal tract voice training/therapy, as well as to answer these major questions: (1) What is the relative importance of tube length to tube diameter? (2) What is the range of oral pressures achieved with tubes at phonation flow rates? (3) Does mouth configuration behind the tubes matter? Methods: Plastic tubes of various diameters and lengths were mounted in line with an upstream pipe, and the pressure drop across each tube was measured at stepwise increments in flow rate. Basic flow theory and modified flow theory equations were used to describe the pressure-flow relationship of the tubes based on diameter and length. Additionally, the upstream pipe diameter was varied to explore how mouth shape affects tube resistance. Results: The modified equation provided an excellent prediction of the pressure-flow relationship across all tube sizes (6% error compared to the experimental data). Variation in upstream pipe diameter yielded up to 10% deviation in pressure for tube sizes typically used in voice training/therapy. Conclusions: Using the presented equations, resistance can be characterized for any tube based on diameter, length, and flow rate. With regard to the original questions, we found that: (1) For commonly used tubes, diameter is the critical variable for governing flow resistance; (2) For phonation flow rates, a range of tube dimensions produced pressures between 0 and 7.0 kPa; (3) The mouth pressure behind the lips will vary slightly with different mouth shapes, but this effect can be considered relatively insignificant. PMID:27133001
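The abstract does not reproduce its equations, so the sketch below uses the textbook Hagen-Poiseuille law for laminar flow in a straight tube as a stand-in, purely to show why diameter can outweigh length. It is not the authors' modified equation. The viscosity value, flow rate, and tube dimensions are assumed for illustration, and flow through narrow tubes at phonation flow rates may well not be laminar, so only the scaling should be read from it, not the absolute pressures.

```python
import math

AIR_VISCOSITY_PA_S = 1.81e-5  # approximate dynamic viscosity of air near 20 °C


def poiseuille_pressure_drop_pa(flow_m3_s, length_m, diameter_m,
                                viscosity_pa_s=AIR_VISCOSITY_PA_S):
    """Laminar pressure drop across a tube: dP = 128 * mu * L * Q / (pi * d**4)."""
    return 128.0 * viscosity_pa_s * length_m * flow_m3_s / (math.pi * diameter_m ** 4)


# Hypothetical tube and flow: 0.2 L/s through a 20 cm tube of 6 mm diameter.
FLOW_M3_S = 2e-4
base = poiseuille_pressure_drop_pa(FLOW_M3_S, 0.20, 0.006)
double_length = poiseuille_pressure_drop_pa(FLOW_M3_S, 0.40, 0.006)
half_diameter = poiseuille_pressure_drop_pa(FLOW_M3_S, 0.20, 0.003)

print(double_length / base)   # length enters linearly
print(half_diameter / base)   # diameter enters as the inverse fourth power
```

Under this idealized law, doubling the length doubles the pressure drop, whereas halving the diameter multiplies it by sixteen, which is at least consistent with the reported finding that diameter is the critical variable.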
Cotter, Meghan M.; Whyms, Brian J.; Kelly, Michael P.; Doherty, Benjamin M.; Gentry, Lindell R.; Bersu, Edward T.; Vorperian, Houri K.
2015-01-01
The hyoid bone anchors and supports the vocal tract. Its complex shape is best studied in three dimensions, but it is difficult to capture on computed tomography (CT) images and three-dimensional volume renderings. The goal of this study was to determine the optimal CT scanning and rendering parameters to accurately measure the growth and developmental anatomy of the hyoid and to determine whether it is feasible and necessary to use these parameters in the measurement of hyoids from in vivo CT scans. Direct linear and volumetric measurements of skeletonized hyoid bone specimens were compared to corresponding CT images to determine the most accurate scanning parameters and three-dimensional rendering techniques. A pilot study was undertaken using in vivo scans from a retrospective CT database to determine feasibility of quantifying hyoid growth. Scanning parameters and rendering technique affected accuracy of measurements. Most linear CT measurements were within 10% of direct measurements; however, volume was overestimated when CT scans were acquired with a slice thickness greater than 1.25 mm. Slice-by-slice thresholding of hyoid images decreased volume overestimation. The pilot study revealed that the linear measurements tested correlate with age. A fine-tuned rendering approach applied to small slice thickness CT scans produces the most accurate measurements of hyoid bones. However, linear measurements can be accurately assessed from in vivo CT scans at a larger slice thickness. Such findings imply that investigation into the growth and development of the hyoid bone, and the vocal tract as a whole, can now be performed using these techniques. PMID:25810349
Cotter, Meghan M; Whyms, Brian J; Kelly, Michael P; Doherty, Benjamin M; Gentry, Lindell R; Bersu, Edward T; Vorperian, Houri K
2015-08-01
The hyoid bone anchors and supports the vocal tract. Its complex shape is best studied in three dimensions, but it is difficult to capture on computed tomography (CT) images and three-dimensional volume renderings. The goal of this study was to determine the optimal CT scanning and rendering parameters to accurately measure the growth and developmental anatomy of the hyoid and to determine whether it is feasible and necessary to use these parameters in the measurement of hyoids from in vivo CT scans. Direct linear and volumetric measurements of skeletonized hyoid bone specimens were compared with corresponding CT images to determine the most accurate scanning parameters and three-dimensional rendering techniques. A pilot study was undertaken using in vivo scans from a retrospective CT database to determine feasibility of quantifying hyoid growth. Scanning parameters and rendering technique affected accuracy of measurements. Most linear CT measurements were within 10% of direct measurements; however, volume was overestimated when CT scans were acquired with a slice thickness greater than 1.25 mm. Slice-by-slice thresholding of hyoid images decreased volume overestimation. The pilot study revealed that the linear measurements tested correlate with age. A fine-tuned rendering approach applied to small slice thickness CT scans produces the most accurate measurements of hyoid bones. However, linear measurements can be accurately assessed from in vivo CT scans at a larger slice thickness. Such findings imply that investigation into the growth and development of the hyoid bone, and the vocal tract as a whole, can now be performed using these techniques. © 2015 Wiley Periodicals, Inc.
Adapted cuing technique: facilitating sequential phoneme production.
Klick, S L
1994-09-01
ACT is a visual cuing technique designed to facilitate dyspraxic speech by highlighting the sequential production of phonemes. In using ACT, cues are presented in such a way as to suggest sequential, coarticulatory movement in an overall pattern of motion. While using ACT, the facilitator's hand moves forward and back along the side of her (or his) own face. Finger movements signal specific speech sounds in formations loosely based on the manual alphabet for the hearing impaired. The best movements suggest the flowing, interactive nature of coarticulated phonemes. The synergistic nature of speech is suggested by coordinated hand motions which tighten and relax, move quickly or slowly, reflecting the motions of the vocal tract at various points during production of phonemic sequences. General principles involved in using ACT include a primary focus on speech-in-motion, the monitoring and fading of cues, and the presentation of stimuli based on motor-task analysis of phonemic sequences. Phonemic sequences are cued along three dimensions: place, manner, and vowel-related mandibular motion. Cuing vowels is a central feature of ACT. Two parameters of vowel production, focal point of resonance and mandibular closure, are cued. The facilitator's hand motions reflect the changing shape of the vocal tract and the trajectory of the tongue that result from the coarticulation of vowels and consonants. Rigid presentation of the phonemes is secondary to the facilitator's primary focus on presenting the overall sequential movement. The facilitator's goal is to self-tailor ACT in response to the changing needs and abilities of the client. (ABSTRACT TRUNCATED AT 250 WORDS)
Mechanisms underlying the social enhancement of vocal learning in songbirds.
Chen, Yining; Matheson, Laura E; Sakata, Jon T
2016-06-14
Social processes profoundly influence speech and language acquisition. Despite the importance of social influences, little is known about how social interactions modulate vocal learning. Like humans, songbirds learn their vocalizations during development, and they provide an excellent opportunity to reveal mechanisms of social influences on vocal learning. Using yoked experimental designs, we demonstrate that social interactions with adult tutors for as little as 1 d significantly enhanced vocal learning. Social influences on attention to song seemed central to the social enhancement of learning because socially tutored birds were more attentive to the tutor's songs than passively tutored birds, and because variation in attentiveness and in the social modulation of attention significantly predicted variation in vocal learning. Attention to song was influenced by both the nature and amount of tutor song: Pupils paid more attention to songs that tutors directed at them and to tutors that produced fewer songs. Tutors altered their song structure when directing songs at pupils in a manner that resembled how humans alter their vocalizations when speaking to infants, that was distinct from how tutors changed their songs when singing to females, and that could influence attention and learning. Furthermore, social interactions that rapidly enhanced learning increased the activity of noradrenergic and dopaminergic midbrain neurons. These data highlight striking parallels between humans and songbirds in the social modulation of vocal learning and suggest that social influences on attention and midbrain circuitry could represent shared mechanisms underlying the social modulation of vocal learning.
Mechanisms underlying the social enhancement of vocal learning in songbirds
Chen, Yining; Matheson, Laura E.; Sakata, Jon T.
2016-01-01
Social processes profoundly influence speech and language acquisition. Despite the importance of social influences, little is known about how social interactions modulate vocal learning. Like humans, songbirds learn their vocalizations during development, and they provide an excellent opportunity to reveal mechanisms of social influences on vocal learning. Using yoked experimental designs, we demonstrate that social interactions with adult tutors for as little as 1 d significantly enhanced vocal learning. Social influences on attention to song seemed central to the social enhancement of learning because socially tutored birds were more attentive to the tutor’s songs than passively tutored birds, and because variation in attentiveness and in the social modulation of attention significantly predicted variation in vocal learning. Attention to song was influenced by both the nature and amount of tutor song: Pupils paid more attention to songs that tutors directed at them and to tutors that produced fewer songs. Tutors altered their song structure when directing songs at pupils in a manner that resembled how humans alter their vocalizations when speaking to infants, that was distinct from how tutors changed their songs when singing to females, and that could influence attention and learning. Furthermore, social interactions that rapidly enhanced learning increased the activity of noradrenergic and dopaminergic midbrain neurons. These data highlight striking parallels between humans and songbirds in the social modulation of vocal learning and suggest that social influences on attention and midbrain circuitry could represent shared mechanisms underlying the social modulation of vocal learning. PMID:27247385
Context-dependent effects of noise on echolocation pulse characteristics in free-tailed bats
Smotherman, Michael S.
2010-01-01
Background noise evokes a similar suite of adaptations in the acoustic structure of communication calls across a diverse range of vertebrates. Echolocating bats may have evolved specialized vocal strategies for echolocating in noise, but also seem to exhibit generic vertebrate responses such as the ubiquitous Lombard response. We wondered how bats balance generic and echolocation-specific vocal responses to noise. To address this question, we first characterized the vocal responses of flying free-tailed bats (Tadarida brasiliensis) to broadband noises varying in amplitude. Secondly, we measured the bats’ responses to band-limited noises that varied in the extent of overlap with their echolocation pulse bandwidth. We hypothesized that the bats’ generic responses to noise would be graded proportionally with noise amplitude, total bandwidth and frequency content, and consequently that more selective responses to band-limited noise such as the jamming avoidance response could be explained by a linear decomposition of the response to broadband noise. Instead, the results showed that both the nature and the magnitude of the vocal responses varied with the acoustic structure of the outgoing pulse as well as non-linearly with noise parameters. We conclude that free-tailed bats utilize separate generic and specialized vocal responses to noise in a context-dependent fashion. PMID:19672604
Vasconcelos, Raquel O.; Fonseca, Paulo J.; Amorim, M. Clara P.; Ladich, Friedrich
2011-01-01
Many fishes rely on their auditory skills to interpret crucial information about predators and prey, and to communicate intraspecifically. Few studies, however, have examined how complex natural sounds are perceived in fishes. We investigated the representation of conspecific mating and agonistic calls in the auditory system of the Lusitanian toadfish Halobatrachus didactylus, and analysed auditory responses to heterospecific signals from ecologically relevant species: a sympatric vocal fish (meagre Argyrosomus regius) and a potential predator (dolphin Tursiops truncatus). Using auditory evoked potential (AEP) recordings, we showed that both sexes can resolve fine features of conspecific calls. The toadfish auditory system was most sensitive to frequencies well represented in the conspecific vocalizations (namely the mating boatwhistle), and revealed a fine representation of duration and pulsed structure of agonistic and mating calls. Stimuli and corresponding AEP amplitudes were highly correlated, indicating an accurate encoding of amplitude modulation. Moreover, Lusitanian toadfish were able to detect T. truncatus foraging sounds and A. regius calls, although at higher amplitudes. We provide strong evidence that the auditory system of a vocal fish, lacking accessory hearing structures, is capable of resolving fine features of complex vocalizations that are probably important for intraspecific communication and other relevant stimuli from the auditory scene. PMID:20861044
Trial-Based Functional Analysis Informs Treatment for Vocal Scripting.
Rispoli, Mandy; Brodhead, Matthew; Wolfe, Katie; Gregori, Emily
2018-05-01
Research on trial-based functional analysis has primarily focused on socially maintained challenging behaviors. However, procedural modifications may be necessary to clarify ambiguous assessment results. The purposes of this study were to evaluate the utility of iterative modifications to trial-based functional analysis on the identification of putative reinforcement and subsequent treatment for vocal scripting. For all participants, modifications to the trial-based functional analysis identified a primary function of automatic reinforcement. The structure of the trial-based format led to identification of social attention as an abolishing operation for vocal scripting. A noncontingent attention treatment was evaluated using withdrawal designs for each participant. This noncontingent attention treatment resulted in near zero levels of vocal scripting for all participants. Implications for research and practice are presented.
Precise auditory-vocal mirroring in neurons for learned vocal communication.
Prather, J F; Peters, S; Nowicki, S; Mooney, R
2008-01-17
Brain mechanisms for communication must establish a correspondence between sensory and motor codes used to represent the signal. One idea is that this correspondence is established at the level of single neurons that are active when the individual performs a particular gesture or observes a similar gesture performed by another individual. Although neurons that display a precise auditory-vocal correspondence could facilitate vocal communication, they have yet to be identified. Here we report that a certain class of neurons in the swamp sparrow forebrain displays a precise auditory-vocal correspondence. We show that these neurons respond in a temporally precise fashion to auditory presentation of certain note sequences in this songbird's repertoire and to similar note sequences in other birds' songs. These neurons display nearly identical patterns of activity when the bird sings the same sequence, and disrupting auditory feedback does not alter this singing-related activity, indicating it is motor in nature. Furthermore, these neurons innervate striatal structures important for song learning, raising the possibility that singing-related activity in these cells is compared to auditory feedback to guide vocal learning.
Modeling coupled aerodynamics and vocal fold dynamics using immersed boundary methods.
Duncan, Comer; Zhai, Guangnian; Scherer, Ronald
2006-11-01
The penalty immersed boundary (PIB) method, originally introduced by Peskin (1972) to model the function of the mammalian heart, is tested as a fluid-structure interaction model of the closely coupled dynamics of the vocal folds and aerodynamics in phonation. Two-dimensional vocal folds are simulated with material properties chosen to result in self-oscillation and volume flows in physiological frequency ranges. Properties of the glottal flow field, including vorticity, are studied in conjunction with the dynamic vocal fold motion. The results of using the PIB method to model self-oscillating vocal folds for the case of 8 cm H2O as the transglottal pressure gradient are described. The volume flow at 8 cm H2O, the transglottal pressure, and vortex dynamics associated with the self-oscillating model are shown. Volume flow is also given for 2, 4, and 12 cm H2O, illustrating the robustness of the model to a range of transglottal pressures. The results indicate that the PIB method applied to modeling phonation has good potential for the study of the interdependence of aerodynamics and vocal fold motion.
[3D visualization and analysis of vocal fold dynamics].
Bohr, C; Döllinger, M; Kniesburges, S; Traxdorf, M
2016-04-01
Visual investigation methods of the larynx mainly allow for the two-dimensional presentation of the three-dimensional structures of the vocal fold dynamics. The vertical component of the vocal fold dynamics is often neglected, yielding a loss of information. The latest studies show that the vertical dynamic components are in the range of the medio-lateral dynamics and play a significant role within the phonation process. This work presents a method for future 3D reconstruction and visualization of endoscopically recorded vocal fold dynamics. The setup contains a high-speed camera (HSC) and a laser projection system (LPS). The LPS projects a regular grid on the vocal fold surfaces and in combination with the HSC allows a three-dimensional reconstruction of the vocal fold surface. Hence, quantitative information on displacements and velocities can be provided. The applicability of the method is presented for one ex-vivo human larynx, one ex-vivo porcine larynx and one synthetic silicone larynx. The setup introduced allows the reconstruction of the entire visible vocal fold surfaces for each oscillation status. This enables a detailed analysis of the three-dimensional dynamics (i.e., displacements, velocities, accelerations) of the vocal folds. The next goal is the miniaturization of the LPS to allow clinical in-vivo analysis in humans. We anticipate new insight on dependencies between 3D dynamic behavior and the quality of the acoustic outcome for healthy and disordered phonation.
The vocal repertoire of the African Penguin (Spheniscus demersus): structure and function of calls.
Favaro, Livio; Ozella, Laura; Pessani, Daniela
2014-01-01
The African Penguin (Spheniscus demersus) is a highly social and vocal seabird. However, currently available descriptions of the vocal repertoire of African Penguin are mostly limited to basic descriptions of calls. Here we provide, for the first time, a detailed description of the vocal behaviour of this species by collecting audio and video recordings from a large captive colony. We combine visual examinations of spectrograms with spectral and temporal acoustic analyses to determine vocal categories. Moreover, we used a principal component analysis, followed by signal classification with a discriminant function analysis, for statistical validation of the vocalisation types. In addition, we identified the behavioural contexts in which calls were uttered. The results show that four basic vocalisations can be found in the vocal repertoire of adult African Penguin, namely a contact call emitted by isolated birds, an agonistic call used in aggressive interactions, an ecstatic display song uttered by single birds, and a mutual display song vocalised by pairs, at their nests. Moreover, we identified two distinct vocalisations interpreted as begging calls by nesting chicks (begging peep) and unweaned juveniles (begging moan). Finally, we discussed the importance of specific acoustic parameters in classifying calls and the possible use of the source-filter theory of vocal production to study penguin vocalisations.
Short bouts of vocalization induce long lasting fast gamma oscillations in a sensorimotor nucleus
Lewandowski, Brian; Schmidt, Marc
2011-01-01
Performance evaluation is a critical feature of motor learning. In the vocal system, it requires the integration of auditory feedback signals with vocal motor commands. The network activity that supports such integration is unknown, but it has been proposed that vocal performance evaluation occurs offline. Recording from NIf, a sensorimotor structure in the avian song system, we show that short bouts of singing in adult male zebra finches (Taeniopygia guttata) induce persistent increases in firing activity and coherent oscillations in the fast gamma range (90–150 Hz). Single units are strongly phase-locked to these oscillations, which can last up to 30 s, often outlasting vocal activity by an order of magnitude. In other systems, oscillations often are triggered by events or behavioral tasks but rarely outlast the event that triggered them by more than 1 second. The present observations are the longest reported gamma oscillations triggered by an isolated behavioral event. In mammals, gamma oscillations have been associated with memory consolidation and are hypothesized to facilitate communication between brain regions. We suggest that the timing and persistent nature of NIf’s fast gamma oscillations make them well suited to facilitate the integration of auditory and vocal motor traces associated with vocal performance evaluation. PMID:21957255
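As a rough illustration of what "phase-locked" means in the abstract above, the following sketch computes vector strength, one common measure of how tightly spike times cluster at a particular phase of an oscillation. The abstract does not state which measure the authors used, so the choice of vector strength, the function name, and the spike times are all assumptions made for illustration only.

```python
import math


def vector_strength(spike_times_s, freq_hz):
    """Return the vector strength (0..1) of spike times relative to an
    oscillation at freq_hz. 1 means every spike falls at the same phase;
    values near 0 mean spikes are spread evenly over the cycle.
    (Illustrative helper, not taken from the study.)"""
    if not spike_times_s:
        raise ValueError("need at least one spike time")
    # Map each spike time to a phase angle on the oscillation cycle.
    phases = [2.0 * math.pi * freq_hz * t for t in spike_times_s]
    # Average the unit vectors pointing at each phase.
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    return math.hypot(c, s) / len(phases)


# Hypothetical spike trains against a 100 Hz oscillation (10 ms period).
locked = [k * 0.010 for k in range(50)]    # one spike per cycle, same phase
spread = [k * 0.0025 for k in range(200)]  # four evenly spaced phases per cycle

print(vector_strength(locked, 100.0))
print(vector_strength(spread, 100.0))
```

The first train gives a value close to 1 and the second a value close to 0, which is the contrast the term "strongly phase-locked" refers to.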
Arens, Christoph; Piazza, Cesare; Andrea, Mario; Dikkers, Frederik G; Tjon Pian Gi, Robin E A; Voigt-Zimmermann, Susanne; Peretti, Giorgio
2016-05-01
In the last decades new endoscopic tools have been developed to improve the diagnostic work-up of vocal fold lesions in addition to normal laryngoscopy, i.e., contact endoscopy, autofluorescence, narrow band imaging and others. Better contrasted and high definition images offer more details of the epithelial and superficial vascular structure of the vocal folds. Following these developments, particular vascular patterns come into focus during laryngoscopy. The present work aims at a systematic pathogenic description of superficial vascular changes of the vocal folds. Additionally, new nomenclature on vascular lesions of the vocal folds will be presented to harmonize the different terms in the literature. Superficial vascular changes can be divided into longitudinal and perpendicular. Unlike longitudinal vascular lesions, e.g., ectasia, meander and change of direction, perpendicular vascular lesions are characterized by different types of vascular loops. They are primarily observed in recurrent respiratory papillomatosis, and in pre-cancerous and cancerous lesions of the vocal folds. These vascular characteristics play a significant role in the differential diagnosis. Among different parameters, e.g., epithelial changes, increase of volume, stiffness of the vocal fold, vascular lesions play an increasing role in the diagnosis of pre- and cancerous lesions.
Manatee (Trichechus manatus) vocalization usage in relation to environmental noise levels.
Miksis-Olds, Jennifer L; Tyack, Peter L
2009-03-01
Noise can interfere with acoustic communication by masking signals that contain biologically important information. Communication theory recognizes several ways a sender can modify its acoustic signal to compensate for noise, including increasing the source level of a signal, its repetition, its duration, shifting frequency outside that of the noise band, or shifting the timing of signal emission outside of noise periods. The extent to which animals would be expected to use these compensation mechanisms depends on the benefit of successful communication, risk of failure, and the cost of compensation. Here we study whether a coastal marine mammal, the manatee, can modify vocalizations as a function of behavioral context and ambient noise level. To investigate whether and how manatees modify their vocalizations, natural vocalization usage and structure were examined in terms of vocalization rate, duration, frequency, and source level. Vocalizations were classified into two call types, chirps and squeaks, which were analyzed independently. In conditions of elevated noise levels, call rates decreased during feeding and social behaviors, and the duration of each call type was differently influenced by the presence of calves. These results suggest that ambient noise levels do have a detectable effect on manatee communication and that manatees modify their vocalizations as a function of noise in specific behavioral contexts.
Nanoscale Viscoelasticity of Extracellular Matrix Proteins in Soft Tissues: a Multiscale Approach
Miri, Amir K.; Heris, Hossein K.; Mongeau, Luc; Javid, Farhad
2013-01-01
We propose that the bulk viscoelasticity of soft tissues results from two length-scale-dependent mechanisms: the time-dependent response of extracellular matrix proteins (ECM) at the nanometer scale and the biophysical interactions between the ECM solid structure and interstitial fluid at the micrometer scale. The latter was modeled using the poroelasticity theory with an assumption of free motion of the interstitial fluid within the porous ECM structure. Following a recent study (Heris, H.K., Miri, A.K., Tripathy, U., Barthelat, F., Mongeau, L., 2013. Journal of the Mechanical Behavior of Biomedical Materials), atomic force microscopy was used to perform creep loading and 50-nm sinusoidal oscillations on porcine vocal folds. The proposed model was calibrated by a finite element model to accurately predict the nanoscale viscoelastic moduli of ECM. A linear correlation was observed between the in-depth distribution of the viscoelastic moduli and that of hyaluronic acids in the vocal fold tissue. We conclude that hyaluronic acids may regulate the vocal fold viscoelasticity at nanoscale. The proposed methodology offers a characterization tool for biomaterials used in vocal fold augmentations. PMID:24317493
[4D-MRI using the synchronized sampling method (SSM)].
Shimada, Yasuhiro; Fujimoto, Ichirou; Takemoto, Hironori; Takano, Sayoko; Masaki, Shinobu; Honda, Kiyoshi; Takeo, Kazuhiro
2002-12-01
A synchronized sampling method (SSM) was developed for the study of voluntary movements by combining the electrocardiographic (ECG) gating method with an external triggering device, and four-dimensional magnetic resonance imaging (4D-MRI) at a rate of 30 frames per second was accomplished by volumetric imaging with the SSM. This method was first applied to the motion imaging of articulatory organs during repetitions of a Japanese five-vowel sequence, and the dynamic change in vocal tract area function was demonstrated with sufficient temporal resolution. This paper describes the methodology, applicability, and limitations of 4D-MRI with the SSM.
Unusual case of choking due to assassin bug (Cydnocoris gilvus).
Sonar, Vaibhav; Patil, Sachin
2018-01-01
Choking is a form of asphyxia which is caused by an obstruction within the air passages. Here, we report a case of obstruction of the upper respiratory tract due to assassin bug (Cydnocoris gilvus) where allegations of medical negligence were made by relatives of the deceased. Autopsy findings demonstrated that an insect was present inside the larynx, lodged at the epiglottis. Multiple haemorrhagic patches were present at the base of the tongue, larynx, epiglottis, vocal cords and tracheal bifurcation. As Reduviidae can be successfully used as biological pest-control agents, they should be used with due precaution.
Surveys of Puerto Rican screech-owl populations in large-tract and fragmented forest habitats
Pardieck, K.L.; Meyers, J.M.; Pagan, M.
1996-01-01
We conducted road surveys of Puerto Rican Screech-Owls (Otus nudipes) by playing conspecific vocalizations in secondary wet forest and fragmented secondary moist forest in rural areas of eastern Puerto Rico. Six paired surveys were conducted bi-weekly beginning in April. We recorded number of owl responses, cloud cover, wind speed, moon phase, and number of passing cars during 5-min stops at 60 locations. Owls responded in similar numbers (P > 0.05) in both habitat types. Also, we detected no association with cloud cover, wind speed, moon phase, or passing cars.
Catecholaminergic contributions to vocal communication signals.
Matheson, Laura E; Sakata, Jon T
2015-05-01
Social context affects behavioral displays across a variety of species. For example, social context acutely influences the acoustic and temporal structure of vocal communication signals such as speech and birdsong. Despite the prevalence and importance of such social influences, little is known about the neural mechanisms underlying the social modulation of communication. Catecholamines are implicated in the regulation of social behavior and motor control, but the degree to which catecholamines influence vocal communication signals remains largely unknown. Using a songbird, the Bengalese finch, we examined the extent to which the social context in which song is produced affected immediate early gene expression (EGR-1) in catecholamine-synthesising neurons in the midbrain. Further, we assessed the degree to which administration of amphetamine, which increases catecholamine concentrations in the brain, mimicked the effect of social context on vocal signals. We found that significantly more catecholaminergic neurons in the ventral tegmental area and substantia nigra (but not the central grey, locus coeruleus or subcoeruleus) expressed EGR-1 in birds that were exposed to females and produced courtship song than in birds that produced non-courtship song in isolation. Furthermore, we found that amphetamine administration mimicked the effects of social context and caused many aspects of non-courtship song to resemble courtship song. Specifically, amphetamine increased the stereotypy of syllable structure and sequencing, the repetition of vocal elements and the degree of sequence completions. Taken together, these data highlight the conserved role of catecholamines in vocal communication across species, including songbirds and humans. © 2015 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
A Formant Range Profile for Singers.
Titze, Ingo R; Maxfield, Lynn M; Walker, Megan C
2017-05-01
Vowel selection is important in differentiating between singing styles. The timbre of the vocal instrument, which is related to its frequency spectrum, is governed by both the glottal sound source and the vowel choices made by singers. Consequently, the ability to modify the vowel space is a measure of how successfully a singer can maintain a desired timbre across a range of pitches. Formant range profiles were produced as a means of quantifying this ability. Seventy-seven subjects (including trained and untrained vocalists) participated, producing vowels with three intended mouth shapes: (1) neutral or speech-like, (2) megaphone-shaped (wide open mouth), and (3) inverted-megaphone-shaped (widened oropharynx with moderate mouth opening). The first and second formant frequencies (F1 and F2) were estimated with fry phonation for each shape and values were plotted in F1-F2 space. By taking four vowels of a quadrangle /i, æ, a, u/, the resulting area was quantified in kHz² (kHz squared) as a measure of the subject's ability to modify their vocal tract for spectral differences. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
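The area measure described in the abstract above can be sketched with the shoelace formula for a polygon whose corners are the four vowels in F1-F2 space. This is a minimal sketch only: the abstract does not say how the area was computed, and the formant values, vowel ordering, and function name below are hypothetical placeholders, not data from the study.

```python
def vowel_area_khz2(corners_hz):
    """Shoelace area of a polygon whose vertices are (F1, F2) pairs in Hz,
    listed in order around the polygon. Returns the area in kHz^2.
    (Illustrative helper, not the authors' procedure.)"""
    # Convert Hz to kHz so the result comes out in kHz squared.
    pts = [(f1 / 1000.0, f2 / 1000.0) for f1, f2 in corners_hz]
    twice_area = 0.0
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical (F1, F2) values in Hz for the quadrangle /i, ae, a, u/.
formants_hz = {
    "i": (300.0, 2300.0),
    "ae": (700.0, 1800.0),
    "a": (750.0, 1100.0),
    "u": (350.0, 800.0),
}
quadrangle = [formants_hz[v] for v in ("i", "ae", "a", "u")]
area = vowel_area_khz2(quadrangle)
print(f"vowel quadrangle area: {area:.3f} kHz^2")
```

Under this reading, a larger area corresponds to a wider range of F1-F2 combinations the subject can reach. The vertices must be listed in order around the quadrangle, otherwise the formula measures a self-intersecting shape.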
Proctor, Michael; Bresch, Erik; Byrd, Dani; Nayak, Krishna; Narayanan, Shrikanth
2013-02-01
Real-time magnetic resonance imaging (rtMRI) was used to examine mechanisms of sound production by an American male beatbox artist. rtMRI was found to be a useful modality with which to study this form of sound production, providing a global dynamic view of the midsagittal vocal tract at frame rates sufficient to observe the movement and coordination of critical articulators. The subject's repertoire included percussion elements generated using a wide range of articulatory and airstream mechanisms. Many of the same mechanisms observed in human speech production were exploited for musical effect, including patterns of articulation that do not occur in the phonologies of the artist's native languages: ejectives and clicks. The data offer insights into the paralinguistic use of phonetic primitives and the ways in which they are coordinated in this style of musical performance. A unified formalism for describing both musical and phonetic dimensions of human vocal percussion performance is proposed. Audio and video data illustrating production and orchestration of beatboxing sound effects are provided in a companion annotated corpus.
A FORMANT RANGE PROFILE FOR SINGERS
Titze, Ingo R.; Maxfield, Lynn; Walker, Megan
2016-01-01
Vowel selection is important in differentiating between singing styles. The timbre of the vocal instrument, which is related to its frequency spectrum, is governed by both the glottal sound source and the vowel choices made by singers. Consequently, the ability to modify the vowel space is a measure of how successfully a singer can maintain a desired timbre across a range of pitches. Formant range profiles (FRPs) were produced as a means of quantifying this ability. 77 subjects (including trained and untrained vocalists) participated, producing vowels with three intended mouth shapes, (1) neutral or speech-like, (2) megaphone-shaped (wide open mouth), and (3) inverted-megaphone-shaped (widened oropharynx with moderate mouth opening). The first and second formant frequencies (F1 and F2) were estimated with fry phonation for each shape and values were plotted in F1–F2 space. By taking four vowels of a quadrangle /i, æ, a, u/, the resulting area was quantified in kHz² (kHz squared) as a measure of the subject’s ability to modify their vocal tract for spectral differences. PMID:28029556
Shared processing of planning articulatory gestures and grasping.
Vainio, L; Tiainen, M; Tiippana, K; Vainio, M
2014-07-01
It has been proposed that articulatory gestures are shaped by tight integration in planning mouth and hand acts. This hypothesis is supported by recent behavioral evidence showing that response selection between the precision and power grip is systematically influenced by simultaneous articulation of a syllable. For example, precision grip responses are performed relatively fast when the syllable articulation employs the tongue tip (e.g., [te]), whereas power grip responses are performed relatively fast when the syllable articulation employs the tongue body (e.g., [ke]). However, this correspondence effect, and other similar effects that demonstrate the interplay between grasping and articulatory gestures, has been found when the grasping is performed during overt articulation. The present study demonstrates that merely reading the syllables silently (Experiment 1) or hearing them (Experiment 2) results in a similar correspondence effect. The results suggest that the correspondence effect is based on integration in planning articulatory gestures and grasping rather than requiring an overt articulation of the syllables. We propose that this effect reflects partially overlapped planning of goal shapes of the two distal effectors: a vocal tract shape for articulation and a hand shape for grasping. In addition, the paper shows a pitch-grip correspondence effect in which the precision grip is associated with a high-pitched vocalization of the auditory stimuli and the power grip is associated with a low-pitched vocalization. The underlying mechanisms of this phenomenon are discussed in relation to the articulation-grip correspondence.
Developmental weighting shifts for noise components of fricative-vowel syllables.
Nittrouer, S; Miller, M E
1997-07-01
Previous studies have convincingly shown that the weight assigned to vocalic formant transitions in decisions of fricative identity for fricative-vowel syllables decreases with development. Although these same studies suggested a developmental increase in the weight assigned to the noise spectrum, the role of the aperiodic-noise portions of the signals in these fricative decisions has not been as well studied. The purpose of these experiments was to examine more closely developmental shifts in the weight assigned to the aperiodic-noise components of the signals in decisions of syllable-initial fricative identity. Two experiments used noises varying along continua from a clear /s/ percept to a clear /ʃ/ percept. In experiment 1, these noises were created by combining /s/ and /ʃ/ noises produced by a human vocal tract at different amplitude ratios, a process that resulted in stimuli differing primarily in the amplitude of a relatively low-frequency (roughly 2.2-kHz) peak. In experiment 2, noises that varied only in the amplitude of a similar low-frequency peak were created with a software synthesizer. Both experiments used synthetic /a/ and /u/ portions, and efforts were made to minimize possible contributions of vocalic formant transitions to fricative labeling. Children and adults labeled the resulting stimuli as /s/ vowel or /ʃ/ vowel. Combined results of the two experiments showed that children's responses were less influenced than those of adults by the amplitude of the low-frequency peak of fricative noises.
Le Roux, Aliza; Cherry, Michael I; Manser, Marta B
2009-05-01
We describe the vocal repertoire of a facultatively social carnivore, the yellow mongoose, Cynictis penicillata. Using a combination of close-range observations, recordings and experiments with simulated predators, we were able to obtain clear descriptions of call structure and function for a wide range of calls used by this herpestid. The vocal repertoire of the yellow mongooses comprised ten call types, half of which were used in appeasing or fearful contexts and half in aggressive interactions. Data from this study suggest that the yellow mongoose uses an urgency-based alarm calling system, indicating high and low urgency through two distinct call types. Compared to solitary mongooses, the yellow mongoose has a large proportion of 'friendly' vocalisations that enhance group cohesion, but its vocal repertoire is smaller and less context-specific than those of obligate social species. This study of the vocal repertoire of the yellow mongoose is, to our knowledge, the most complete to have been conducted on a facultatively social species in its natural habitat.
NASA Astrophysics Data System (ADS)
Le Roux, Aliza; Cherry, Michael I.; Manser, Marta B.
2009-05-01
We describe the vocal repertoire of a facultatively social carnivore, the yellow mongoose, Cynictis penicillata. Using a combination of close-range observations, recordings and experiments with simulated predators, we were able to obtain clear descriptions of call structure and function for a wide range of calls used by this herpestid. The vocal repertoire of the yellow mongooses comprised ten call types, half of which were used in appeasing or fearful contexts and half in aggressive interactions. Data from this study suggest that the yellow mongoose uses an urgency-based alarm calling system, indicating high and low urgency through two distinct call types. Compared to solitary mongooses, the yellow mongoose has a large proportion of ‘friendly’ vocalisations that enhance group cohesion, but its vocal repertoire is smaller and less context-specific than those of obligate social species. This study of the vocal repertoire of the yellow mongoose is, to our knowledge, the most complete to have been conducted on a facultatively social species in its natural habitat.
How to bootstrap a human communication system.
Fay, Nicolas; Arbib, Michael; Garrod, Simon
2013-01-01
How might a human communication system be bootstrapped in the absence of conventional language? We argue that motivated signs play an important role (i.e., signs that are linked to meaning by structural resemblance or by natural association). An experimental study is then reported in which participants try to communicate a range of pre-specified items to a partner using repeated non-linguistic vocalization, repeated gesture, or repeated non-linguistic vocalization plus gesture (but without using their existing language system). Gesture proved more effective (measured by communication success) and more efficient (measured by the time taken to communicate) than non-linguistic vocalization across a range of item categories (emotion, object, and action). Combining gesture and vocalization did not improve performance beyond gesture alone. We experimentally demonstrate that gesture is a more effective means of bootstrapping a human communication system. We argue that gesture outperforms non-linguistic vocalization because it lends itself more naturally to the production of motivated signs. © 2013 Cognitive Science Society, Inc.
Tsukahara, Naoki; Aoyama, Masato; Sugita, Shoei
2007-12-01
The vocal characteristics and the morphology of the syrinx in Carrion crows (Corvus corone) and Jungle crows (C. macrorhynchos) were compared. The vocalizations of both species were recorded and analyzed as sonograms. The appearance and inner configuration of the syrinx were observed using a stereoscopic microscope. In addition, the inside diameter of the syrinx, the sizes of the labia and the attachment positions of the syringeal muscles were measured. The attachment patterns of the syringeal muscles differed between the two species. The vocalizations of Carrion crows were noisier than those of Jungle crows, possibly because their labia were noticeably smaller. The attachment patterns of the syringeal muscles in Jungle crows suggested that they allow for more flexibility in the inner structure of the syrinx. The inner space of the syrinx in Jungle crows was also wider than that of Carrion crows. These results suggest that Jungle crows may be able to make various vocalizations because of these morphological characteristics.
Morphological properties of collagen fibers in porcine lamina propria
Johanes, Iecun; Mihelc, Elaine; Sivasankar, Mahalakshmi; Ivanisevic, Albena
2009-01-01
Objectives: Collagen influences the biomechanical properties of vocal folds. Altered collagen morphology has been implicated in dysphonia associated with aging and scarring. Documenting the morphological properties of native collagen in healthy vocal folds is essential to understand the structural and functional alterations to collagen with aging and disease. Our primary objective was to quantify the morphological properties of collagen in the vocal fold lamina propria. Our secondary exploratory objective was to investigate the effects of pepsin exposure on the morphological properties of collagen in the lamina propria. Design: Experimental, in vitro study with porcine model. Methods: Lamina propria was dissected from 26 vocal folds and imaged with Atomic Force Microscopy (AFM). Morphological data on d-periodicity, diameter, and roughness of collagen fibers were obtained. To investigate the effects of pepsin exposure on collagen morphology, vocal fold surface was exposed to pepsin or sham challenge prior to lamina propria dissection and AFM imaging. Results: The d-periodicity, diameter, and roughness values for native vocal fold collagen are consistent with literature reports for collagen fibers in other body tissue. Pepsin exposure on vocal fold surface did not appear to change the morphological properties of collagen fibers in the lamina propria. Conclusions: Quantitative data on collagen morphology were obtained at nanoscale resolution. Documenting collagen morphology in healthy vocal folds is critical for understanding the physiological changes to collagen with aging and scarring, and for designing biomaterials that match the native topography of lamina propria. PMID:20171830
Echolocating bats rely on audiovocal feedback to adapt sonar signal design.
Luo, Jinhong; Moss, Cynthia F
2017-10-10
Many species of bat emit acoustic signals and use information carried by echoes reflecting from nearby objects to navigate and forage. It is widely documented that echolocating bats adjust the features of sonar calls in response to echo feedback; however, it remains unknown whether audiovocal feedback contributes to sonar call design. Audiovocal feedback refers to the monitoring of one's own vocalizations during call production and has been intensively studied in nonecholocating animals. Audiovocal feedback not only is a necessary component of vocal learning but also guides the control of the spectro-temporal structure of vocalizations. Here, we show that audiovocal feedback is directly involved in the echolocating bat's control of sonar call features. As big brown bats tracked targets from a stationary position, we played acoustic jamming signals, simulating calls of another bat, timed to selectively perturb audiovocal feedback or echo feedback. We found that the bats exhibited the largest call-frequency adjustments when the jamming signals occurred during vocal production. By contrast, bats did not show sonar call-frequency adjustments when the jamming signals coincided with the arrival of target echoes. Furthermore, bats rapidly adapted sonar call design in the first vocalization following the jamming signal, revealing a response latency in the range of 66 to 94 ms. Thus, bats, like songbirds and humans, rely on audiovocal feedback to structure sonar signal design.
Evaluation of a Multicomponent Intervention for Diurnal Bruxism in a Young Child with Autism
ERIC Educational Resources Information Center
Barnoy, Emily L.; Najdowski, Adel C.; Tarbox, Jonathan; Wilke, Arthur E.; Nollet, Megan D.
2009-01-01
Bruxism, forceful grinding of one's teeth together, can produce destructive outcomes such as wear on the teeth and damaged gums and bone structures. The current study implemented a multicomponent intervention that consisted of vocal and physical cues to decrease rates of bruxism. A partial component analysis suggested that the vocal cue was only…
Kello, Christopher T; Bella, Simone Dalla; Médé, Butovens; Balasubramaniam, Ramesh
2017-10-01
Humans talk, sing and play music. Some species of birds and whales sing long and complex songs. All these behaviours and sounds exhibit hierarchical structure: syllables and notes are positioned within words and musical phrases, words and motives in sentences and musical phrases, and so on. We developed a new method to measure and compare hierarchical temporal structures in speech, song and music. The method identifies temporal events as peaks in the sound amplitude envelope, and quantifies event clustering across a range of timescales using Allan factor (AF) variance. AF variances were analysed and compared for over 200 different recordings from more than 16 different categories of signals, including recordings of speech in different contexts and languages, musical compositions and performances from different genres. Non-human vocalizations from two bird species and two types of marine mammals were also analysed for comparison. The resulting patterns of AF variance across timescales were distinct to each of four natural categories of complex sound: speech, popular music, classical music and complex animal vocalizations. Comparisons within and across categories indicated that nested clustering in longer timescales was more prominent when prosodic variation was greater, and when sounds came from interactions among individuals, including interactions between speakers, musicians, and even killer whales. Nested clustering also was more prominent for music compared with speech, and reflected beat structure for popular music and self-similarity across timescales for classical music. In summary, hierarchical temporal structures reflect the behavioural and social processes underlying complex vocalizations and musical performances. © 2017 The Author(s).
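The Allan factor step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the textbook definition AF(T) = E[(N_{i+1} - N_i)^2] / (2 E[N_i]), where N_i is the event count in the i-th window of length T, and it takes event times as given (peak picking on the amplitude envelope is omitted). The function name `allan_factor` and the synthetic event trains are hypothetical.

```python
import numpy as np


def allan_factor(event_times, window, duration):
    """Allan factor of a point process for a single window length.

    event_times: event times in seconds (assumed to lie in [0, duration]).
    window: counting-window length T in seconds.
    duration: total length of the recording in seconds.
    """
    n_windows = int(duration // window)
    if n_windows < 2:
        raise ValueError("need at least two full windows")
    # Count events in adjacent, non-overlapping windows of length T.
    edges = np.arange(n_windows + 1) * window
    counts, _ = np.histogram(event_times, bins=edges)
    mean_count = counts.mean()
    if mean_count == 0:
        raise ValueError("no events fall inside the counting windows")
    # Mean squared difference of neighbouring counts, normalised by 2 * mean.
    diffs = np.diff(counts)
    return float(np.mean(diffs ** 2) / (2.0 * mean_count))


if __name__ == "__main__":
    # Perfectly regular events: every 1 s window holds the same count, so AF is 0.
    periodic = np.arange(1000) * 0.1 + 0.05
    print(allan_factor(periodic, window=1.0, duration=100.0))  # -> 0.0

    # Uniformly random events approximate a Poisson process, for which AF is close to 1.
    rng = np.random.default_rng(0)
    poisson_like = rng.uniform(0.0, 1000.0, size=10000)
    for T in (0.5, 1.0, 2.0, 4.0):
        print(T, allan_factor(poisson_like, window=T, duration=1000.0))
```

Evaluating the function over a range of window lengths gives an AF-versus-timescale curve; values that grow with the window length indicate clustering of events at those timescales.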
Performance of a reduced-order FSI model for flow-induced vocal fold vibration
NASA Astrophysics Data System (ADS)
Chang, Siyuan; Luo, Haoxiang; Luo's lab Team
2016-11-01
Vocal fold vibration during speech production involves a three-dimensional unsteady glottal jet flow and three-dimensional nonlinear tissue mechanics. A full 3D fluid-structure interaction (FSI) model is computationally expensive even though it provides the most accurate information about the system. On the other hand, an efficient reduced-order FSI model is useful for fast simulation and analysis of the vocal fold dynamics, which is often needed in procedures such as optimization and parameter estimation. In this work, we study the performance of a reduced-order model as compared with the corresponding full 3D model in terms of its accuracy in predicting the vibration frequency and deformation mode. In the reduced-order model, we use a 1D flow model coupled with a 3D tissue model. Two different hyperelastic tissue behaviors are assumed. In addition, the vocal fold thickness and subglottal pressure are varied for systematic comparison. The results show that the reduced-order model gives predictions consistent with those of the full 3D model across different tissue material assumptions and subglottal pressures. However, the vocal fold thickness has the greatest effect on the model accuracy, especially when the vocal fold is thin. Supported by the NSF.
Characteristics of phonation onset in a two-layer vocal fold model.
Zhang, Zhaoyan
2009-02-01
Characteristics of phonation onset were investigated in a two-layer body-cover continuum model of the vocal folds as a function of the biomechanical and geometric properties of the vocal folds. The analysis showed that an increase in either the body or cover stiffness generally increased the phonation threshold pressure and phonation onset frequency, although the effectiveness of varying body or cover stiffness as a pitch control mechanism varied depending on the body-cover stiffness ratio. Increasing body-cover stiffness ratio reduced the vibration amplitude of the body layer, and the vocal fold motion was gradually restricted to the medial surface, resulting in more effective flow modulation and higher sound production efficiency. The fluid-structure interaction induced synchronization of more than one group of eigenmodes so that two or more eigenmodes may be simultaneously destabilized toward phonation onset. At certain conditions, a slight change in vocal fold stiffness or geometry may cause phonation onset to occur as eigenmode synchronization due to a different pair of eigenmodes, leading to sudden changes in phonation onset frequency, vocal fold vibration pattern, and sound production efficiency. Although observed in a linear stability analysis, a similar mechanism may also play a role in register changes at finite-amplitude oscillations.
Factors associated with vocal fold pathologies in teachers.
Souza, Carla Lima de; Carvalho, Fernando Martins; Araújo, Tânia Maria de; Reis, Eduardo José Farias Borges Dos; Lima, Verônica Maria Cadena; Porto, Lauro Antonio
2011-10-01
To analyze factors associated with the prevalence of the medical diagnosis of vocal fold pathologies in teachers. A census-based epidemiological, cross-sectional study was conducted with 4,495 public primary and secondary school teachers in the city of Salvador, Northeastern Brazil, between March and April 2006. The dependent variable was the self-reported medical diagnosis of vocal fold pathologies and the independent variables were sociodemographic characteristics; professional activity; work organization/interpersonal relationships; physical work environment characteristics; frequency of common mental disorders, measured by the Self-Reporting Questionnaire-20 (SRQ-20 >7); and general health conditions. Descriptive statistical, bivariate and multiple logistic regression analysis techniques were used. The prevalence of self-reported medical diagnosis of vocal fold pathologies was 18.9%. In the logistic regression analysis, the variables that remained associated with this medical diagnosis were as follows: being female, having worked as a teacher for more than seven years, excessive voice use, reporting more than five unfavorable physical work environment characteristics and presence of common mental disorders. The presence of self-reported vocal fold pathologies was associated with factors that point out the need of actions that promote teachers' vocal health and changes in their work structure and organization.
The role of finite displacements in vocal fold modeling.
Chang, Siyuan; Tian, Fang-Bao; Luo, Haoxiang; Doyle, James F; Rousseau, Bernard
2013-11-01
Human vocal folds experience flow-induced vibrations during phonation. In previous computational models, the vocal fold dynamics has been treated with linear elasticity theory in which both the strain and the displacement of the tissue are assumed to be infinitesimal (referred to as model I). The effect of the nonlinear strain, or geometric nonlinearity, caused by finite displacements is not yet clear. In this work, a two-dimensional model is used to study the effect of geometric nonlinearity (referred to as model II) on the vocal fold and the airflow. The results show that even though the deformation is under 1 mm, i.e., less than 10% of the size of the vocal fold, the geometric nonlinear effect is still significant. Specifically, model I underpredicts the gap width, the flow rate, and the impact stress on the medial surfaces as compared to model II. The study further shows that the differences are caused by the contact mechanics and, more importantly, the fluid-structure interaction that magnifies the error from the small-displacement assumption. The results suggest that using the large-displacement formulation in a computational model would be more appropriate for accurate simulations of the vocal fold dynamics.
Elie, Julie Estelle; Soula, Hédi Antoine; Trouvé, Colette; Mathevon, Nicolas; Vignal, Clémentine
2015-12-01
Individual cages represent a widely used housing condition in laboratories. This isolation represents an impoverished physical and social environment in gregarious animals. It prevents animals from socializing, even when auditory and visual contact is maintained. Zebra finches are colonial songbirds that are widely used as laboratory animals for the study of vocal communication from brain to behavior. In this study, we investigated the effect of single housing on the vocal behavior and the brain activity of male zebra finches (Taeniopygia guttata): male birds housed in individual cages were compared to freely interacting male birds housed as a social group in a communal cage. We focused on the activity of septo-hypothalamic regions of the "social behavior network" (SBN), a set of limbic regions involved in several social behaviors in vertebrates. The activity of four structures of the SBN (BSTm, medial bed nucleus of the stria terminalis; POM, medial preoptic area; lateral septum; ventromedial hypothalamus) and one associated region (paraventricular nucleus of the hypothalamus) was assessed using immunoreactive nuclei density of the immediate early gene Zenk (egr-1). We further assessed the identity of active cell populations by labeling vasotocin (VT). Brain activity was related to behavioral activities of birds such as physical and vocal interactions. We showed that individual housing modifies vocal exchanges between birds compared to communal housing. This is of particular importance in the zebra finch, a model species for the study of vocal communication. In addition, a protocol that removes one or two birds from the group daily affects male zebra finches differently depending on their housing conditions: while communally housed males changed their vocal output, the brains of individually housed males showed increased Zenk labeling in non-VT cells of the BSTm and enhanced correlation of Zenk-revealed activity between the studied structures.
These results show that housing conditions must gain some attention in behavioral neuroscience protocols. Copyright © 2015. Published by Elsevier SAS.
Proton density-weighted laryngeal magnetic resonance imaging in systemically dehydrated rats.
Oleson, Steven; Lu, Kun-Han; Liu, Zhongming; Durkes, Abigail C; Sivasankar, M Preeti
2018-06-01
Dehydrated vocal folds are inefficient sound generators. Although systemic dehydration of the body is believed to induce vocal fold dehydration, this causative relationship has not been demonstrated in vivo. Here we investigate the feasibility of using in vivo proton density (PD)-weighted magnetic resonance imaging (MRI) to demonstrate hydration changes in vocal fold tissue following systemic dehydration in rats. Animal study. Sprague-Dawley rats (n = 10) were imaged at baseline and following a 10% reduction in body weight secondary to withholding water. In vivo, high-field (7 T), PD-weighted MRI was used to successfully resolve vocal fold and salivary gland tissue structures. Normalized signal intensities within the vocal fold decreased postdehydration by an average of 11.38% ± 3.95% (mean ± standard error of the mean [SEM], P = .0098) as compared to predehydration levels. The salivary glands experienced a similar decrease in normalized signal intensity by an average of 10.74% ± 4.14% (mean ± SEM, P = .0195) following dehydration. The correlation coefficient (percent change from dehydration) between vocal folds and salivary glands was 0.7145 (P = .0202). Ten percent systemic dehydration induced vocal fold dehydration as assessed by PD-weighted MRI. Changes in the hydration state of vocal fold tissue were highly correlated with that of the salivary glands in dehydrated rats in vivo. These preliminary findings demonstrate the feasibility of using PD-weighted MRI to quantify hydration states of the vocal folds and lay the foundation for further studies that explore more routine and realistic magnitudes of systemic dehydration and rehydration. NA. Laryngoscope, 128:E222-E227, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
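The summary statistics reported in the abstract above (percent change in normalized signal intensity, mean ± SEM, and a correlation coefficient between two tissues) follow standard formulas. The sketch below shows those formulas only; the helper names and the input numbers are hypothetical and are not the study's data.

```python
import numpy as np


def percent_change(baseline, followup):
    """Per-subject percent change from baseline (negative means a decrease)."""
    baseline = np.asarray(baseline, dtype=float)
    followup = np.asarray(followup, dtype=float)
    return 100.0 * (followup - baseline) / baseline


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1) / np.sqrt(values.size))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    # Hypothetical normalized signal intensities for four animals.
    fold_pre, fold_post = [1.00, 0.95, 1.10, 1.05], [0.90, 0.88, 0.93, 0.97]
    gland_pre, gland_post = [1.20, 1.15, 1.30, 1.25], [1.10, 1.08, 1.12, 1.18]

    fold_change = percent_change(fold_pre, fold_post)
    gland_change = percent_change(gland_pre, gland_post)
    print("vocal fold change (mean, SEM):", mean_and_sem(fold_change))
    print("salivary gland change (mean, SEM):", mean_and_sem(gland_change))
    print("correlation of changes:", pearson_r(fold_change, gland_change))
```

The P values quoted in the abstract would come from a significance test on these quantities; which test the authors used is not stated there, so none is shown here.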
Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates
Petkov, Christopher I.; Jarvis, Erich D.
2012-01-01
Vocal learners such as humans and songbirds can learn to produce elaborate patterns of structurally organized vocalizations, whereas many other vertebrates such as non-human primates and most other bird groups either cannot or do so to a very limited degree. To explain the similarities among humans and vocal-learning birds and the differences with other species, various theories have been proposed. One set of theories are motor theories, which underscore the role of the motor system as an evolutionary substrate for vocal production learning. For instance, the motor theory of speech and song perception proposes enhanced auditory perceptual learning of speech in humans and song in birds, which suggests a considerable level of neurobiological specialization. Another, a motor theory of vocal learning origin, proposes that the brain pathways that control the learning and production of song and speech were derived from adjacent motor brain pathways. Another set of theories are cognitive theories, which address the interface between cognition and the auditory-vocal domains to support language learning in humans. Here we critically review the behavioral and neurobiological evidence for parallels and differences between the so-called vocal learners and vocal non-learners in the context of motor and cognitive theories. In doing so, we note that behaviorally vocal-production learning abilities are more distributed than categorical, as are the auditory-learning abilities of animals. We propose testable hypotheses on the extent of the specializations and cross-species correspondences suggested by motor and cognitive theories. We believe that determining how spoken language evolved is likely to become clearer with concerted efforts in testing comparative data from many non-human animal species. PMID:22912615
Marków, Magdalena; Janecki, Daniel; Orecka, Bogusława; Misiołek, Maciej; Warmuziński, Krzysztof
2017-09-01
Computational fluid dynamics (CFD), a rapidly developing instrument with a number of practical applications, allows calculation and visualization of the changing parameters of airflow in the upper respiratory tract. The aim of this study was to demonstrate the advantages of CFD as an instrument for noninvasive tests of the larynx in patients who had undergone surgical treatment due to bilateral vocal fold paralysis. Surface measurements of the glottic space were made during maximum adduction of the vocal folds. Additionally, the following spirometric parameters were determined: forced vital capacity (FVC), forced expiratory volume in the first second (FEV1), and peak expiratory flow (PEF) rate. Based on the measurements, commercial mesh generation software was used to develop a geometrical model of the glottic space. The computations were carried out using a general purpose CFD code. The analysis included patients who were surgically treated for BVFP in the authors' department between 1999 and 2012. The study group consisted of 22 women (91.67%) and 2 men (8.33%). It was observed that the pressure drop calculated for free breathing depends on the area of the glottis and is independent of its shape. Importantly, for areas below approx. 40 mm2, a sudden rise occurred in the resistance to flow; for the smallest glottic areas studied, the pressure drop was almost 6 times higher than for an area of 40 mm2. Consequently, in cases of areas below 40 mm2 even minor enlargement of the glottic opening can lead to a marked improvement in breathing comfort. Computational fluid dynamics is a useful method for calculating and visualizing the changing parameters of airflow in the upper respiratory tract.
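The reported dependence of pressure drop on glottic area, and not on shape, is qualitatively what an idealized orifice would give. The sketch below is not the CFD model used in the study: it assumes the textbook orifice relation Δp = ρQ² / (2·Cd²·A²), with air density, discharge coefficient, and flow rate chosen purely for illustration, and the function name `orifice_pressure_drop` is hypothetical.

```python
def orifice_pressure_drop(flow_l_per_s, area_mm2, air_density=1.2, discharge_coeff=1.0):
    """Pressure drop in Pa across an idealized orifice (assumed model, not CFD).

    flow_l_per_s: volumetric flow rate in litres per second.
    area_mm2: orifice (glottic) area in square millimetres.
    air_density: kg/m^3 (assumed value for air).
    discharge_coeff: dimensionless loss coefficient (assumed 1.0).
    """
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    flow_m3_per_s = flow_l_per_s * 1e-3
    area_m2 = area_mm2 * 1e-6
    # Mean velocity through the effective opening, then dynamic pressure.
    velocity = flow_m3_per_s / (discharge_coeff * area_m2)
    return 0.5 * air_density * velocity ** 2


if __name__ == "__main__":
    flow = 0.5  # L/s, an assumed quiet-breathing flow rate
    for area in (10, 20, 40, 80):
        print(area, "mm2:", orifice_pressure_drop(flow, area), "Pa")
```

Because the pressure drop scales with 1/A² in this idealization, halving the area quadruples the pressure drop, which is consistent in direction with the sharp rise in resistance reported for small glottic areas.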
Zajac, David J.; Weissler, Mark C.
2011-01-01
Two studies were conducted to evaluate short-latency vocal tract air pressure responses to sudden pressure bleeds during production of voiceless bilabial stop consonants. It was hypothesized that the occurrence of respiratory reflexes would be indicated by distinct patterns of responses as a function of bleed magnitude. In Study 1, 19 adults produced syllable trains of /pʌ/ using a mouthpiece coupled to a computer-controlled perturbator. The device randomly created bleed apertures that ranged from 0 to 40 mm2 during production of the 2nd or 4th syllable of an utterance. Although peak oral air pressure dropped in a linear manner across bleed apertures, it averaged 2 to 3 cm H2O at the largest bleed. While slope of oral pressure also decreased in a linear trend, duration of the oral pressure pulse remained relatively constant. The patterns suggest that respiratory reflexes, if present, have little effect on oral air pressure levels. In Study 2, both oral and subglottal air pressure responses were monitored in 2 adults while bleed apertures of 20 and 40 mm2 were randomly created. For 1 participant, peak oral air pressure dropped across bleed apertures, as in Study 1. Subglottal air pressure and slope, however, remained relatively stable. These patterns provide some support for the occurrence of respiratory reflexes to regulate subglottal air pressure. Overall, the studies indicate that the inherent physiologic processes of the respiratory system, which may involve reflexes, and passive aeromechanical resistance of the upper airway are capable of developing oral air pressure in the face of substantial pressure bleeds. Implications for understanding speech production and the characteristics of individuals with velopharyngeal dysfunction are discussed. PMID:15324286
Investigation of the impact of thyroid surgery on vocal tract steadiness.
Timon, Conrad I; Hirani, Shashi P; Epstein, Ruth; Rafferty, Mark A
2010-09-01
Subjective nonspecific upper aerodigestive symptoms are not uncommon after thyroid surgery. These are postulated to be related to injury of an extrinsic perithyroid nerve plexus that innervates the muscles of the supraglottic and glottic larynx. This plexus is thought to receive contributing branches from both the recurrent and superior laryngeal nerves. The technique of linear predictive coding was used to estimate the F(2) values from a sustained vowel /a/ in patients before and 48 hours after thyroid or parathyroid surgery. These patients were controlled against a matched pair undergoing surgery without any theoretical effect on the supraglottic musculature. In total, 12 patients were recruited into each group. Each patient had the formant frequency fluctuation (FFF) and the formant frequency fluctuation ratio (FFFR) calculated for F(1) and F(2). Mixed analysis of variance (ANOVA) for all acoustic parameters revealed that the chiF(2)FF showed a significant "time" main effect (F(1,22)=7.196, P=0.014, partial η²=0.246) and a significant "time by group interaction" effect (F(1,22)=8.036, P=0.010, partial η²=0.268), with changes over time for the thyroid group but not for the controls. Similarly, mean chiF(2)FFR showed a similar significant "time" main effect (F(1,22)=6.488, P=0.018, partial η²=0.228) and a "time by group interaction" effect (F(1,22)=7.134, P=0.014, partial η²=0.245). This work suggests that thyroid surgery produces a significant reduction in vocal tract stability in contrast to the controls. This noninvasive measurement offers a potential instrument to investigate the functional implications of any disturbance that thyroid surgery may have on pharyngeal innervations. © 2010 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
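The partial eta squared values quoted in the abstract above can be checked directly from the reported F statistics and degrees of freedom, using the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The short sketch below applies that identity; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F statistics with (1, 22) degrees of freedom, as reported in the abstract.
    for f_value in (7.196, 8.036, 6.488, 7.134):
        print(f_value, round(partial_eta_squared(f_value, 1, 22), 3))
```

Rounded to three decimals, the four F values give 0.246, 0.268, 0.228, and 0.245, matching the effect sizes stated in the abstract.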
Benichov, Jonathan I; Globerson, Eitan; Tchernichovski, Ofer
2016-01-01
Humans and oscine songbirds share the rare capacity for vocal learning. Songbirds have the ability to acquire songs and calls of various rhythms through imitation. In several species, birds can even coordinate the timing of their vocalizations with other individuals in duets that are synchronized with millisecond-accuracy. It is not known, however, if songbirds can perceive rhythms holistically nor if they are capable of spontaneous entrainment to complex rhythms, in a manner similar to humans. Here we review emerging evidence from studies of rhythm generation and vocal coordination across songbirds and humans. In particular, recently developed experimental methods have revealed neural mechanisms underlying the temporal structure of song and have allowed us to test birds' abilities to predict the timing of rhythmic social signals. Surprisingly, zebra finches can readily learn to anticipate the calls of a "vocal robot" partner and alter the timing of their answers to avoid jamming, even in reference to complex rhythmic patterns. This capacity resembles, to some extent, human predictive motor response to an external beat. In songbirds, this is driven, at least in part, by the forebrain song system, which controls song timing and is essential for vocal learning. Building upon previous evidence for spontaneous entrainment in human and non-human vocal learners, we propose a comparative framework for future studies aimed at identifying shared mechanism of rhythm production and perception across songbirds and humans.
Psychogenic dysphonia: diversity of clinical and vocal manifestations in a case series.
Martins, Regina Helena Garcia; Tavares, Elaine Lara Mendes; Ranalli, Paula Ferreira; Branco, Anete; Pessin, Adriana Bueno Benito
2014-01-01
Psychogenic dysphonia is a functional disorder with variable clinical manifestations. To assess the clinical and vocal characteristics of patients with psychogenic dysphonia in a case series. The study included 28 adult patients with psychogenic dysphonia, evaluated at a University hospital in the last ten years. Assessed variables included gender, age, occupation, vocal symptoms, vocal characteristics, and videolaryngostroboscopic findings. 28 patients (26 women and 2 men) were assessed. Their occupations included: housekeeper (n=17), teacher (n=4), salesclerk (n=4), nurse (n=1), retired (n=1), and psychologist (n=1). Sudden symptom onset was reported by 16 patients and progressive symptom onset was reported by 12; intermittent evolution was reported by 15; symptom duration longer than three months was reported by 21 patients. Videolaryngostroboscopy showed only functional disorders; no patient had structural lesions or changes in vocal fold mobility. Conversion aphonia, skeletal muscle tension, and intermittent voicing were the most frequent vocal emission manifestation forms. In this case series of patients with psychogenic dysphonia, the most frequent form of clinical presentation was conversion aphonia, followed by musculoskeletal tension and intermittent voicing. The clinical and vocal aspects of 28 patients with psychogenic dysphonia, as well as the particularities of each case, are discussed. Copyright © 2014 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Yen, Shih-Ching; Shieh, Bao-Sen; Wang, Yi-Ting; Wang, Ying
2013-12-01
In sika deer Cervus nippon, rutting vocalizations play an important role in breeding behavior. This study investigated two types of rutting vocalizations, the moan and the howl, of the Formosan sika deer C. n. taiouanus, including the acoustic characteristics of the vocalizations, the diurnal and seasonal variations of vocal activity, and individual acoustic variation and identification. The results showed that the sound levels were approximately 81-88 dB(A) for the moan and 92-96 dB(A) for the howl, at a distance of 7 m from the sources. From October 2006 to January 2007, eight days of continuous observations were conducted to record the type and amount of vocalizations. Both moan and howl began to occur in the middle of October and reached peaks in the middle and end of November. Thereafter, few vocalizations were recorded until mid-January 2007. Moreover, we found that 74.5% of the first portion of moan, 65.3% of the second portion of moan, and 64.2% of howl could be identified on an individual basis by using discriminant analysis with 200 iterations of cross-validation test. These results revealed that the sounds differed among individuals, and also that they could be correctly identified. Our findings add to the scientific knowledge of sika deer behavior and provide the basis for a novel method of monitoring sika deer populations.
Vibrational dynamics of vocal folds using nonlinear normal modes.
Pinheiro, Alan P; Kerschen, Gaëtan
2013-08-01
Many previous works involving physical models, excised and in vivo larynges have pointed out nonlinear vibration in vocal folds during voice production. Moreover, theoretical studies involving mechanical modeling of these folds have tried to gain a profound understanding of the observed nonlinear phenomena. In this context, the present work uses the nonlinear normal mode theory to investigate the nonlinear modal behavior of 16 subjects using a two-mass mechanical modeling of the vocal folds. The free response of the conservative system at different energy levels is considered to assess the impact of the structural nonlinearity of the vocal fold tissues. The results show very interesting and complex nonlinear phenomena including frequency-energy dependence, subharmonic regimes and, in some cases, modal interactions, entrainment and bifurcations. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
Fee, Michale S.
2011-01-01
Learned motor behaviors require descending forebrain control to be coordinated with midbrain and brainstem motor systems. In songbirds, such as the zebra finch, regular breathing is controlled by brainstem centers, but when the adult songbird begins to sing, its breathing becomes tightly coordinated with forebrain-controlled vocalizations. The periods of silence (gaps) between song syllables are typically filled with brief breaths, allowing the bird to sing uninterrupted for many seconds. While substantial progress has been made in identifying the brain areas and pathways involved in vocal and respiratory control, it is not understood how respiratory and vocal control is coordinated by forebrain motor circuits. Here we combine a recently developed technique for localized brain cooling, together with recordings of thoracic air sac pressure, to examine the role of cortical premotor nucleus HVC (proper name) in respiratory-vocal coordination. We found that HVC cooling, in addition to slowing all song timescales as previously reported, also increased the duration of expiratory pulses (EPs) and inspiratory pulses (IPs). Expiratory pulses, like song syllables, were stretched uniformly by HVC cooling, but most inspiratory pulses exhibited non-uniform stretch of pressure waveform such that the majority of stretch occurred late in the IP. Indeed, some IPs appeared to change duration by the earlier or later truncation of an underlying inspiratory event. These findings are consistent with the idea that during singing the temporal structure of EPs is under the direct control of forebrain circuits, whereas that of IPs can be strongly influenced by circuits downstream of HVC, likely in the brainstem. An analysis of the temporal jitter of respiratory and vocal structure suggests that IPs may be initiated by HVC at the end of each syllable and terminated by HVC immediately before the onset of the next syllable. PMID:21980466
Does hyaluronic acid distribution in the larynx relate to the newborn's capacity for crying?
Schweinfurth, John M; Thibeault, Susan L
2008-09-01
The newborn is heavily dependent on voice communication and therefore has relatively higher vocal demands and expenditures than the adult, the loudness output per mass performance exceeds that of the adult, and the newborn larynx exhibits significant histological and biochemical differences. The neonatal larynx is capable of sustaining relatively greater pitch and loudness than the adult over longer periods of time. This ability may be related to a more compact arrangement of collagen within the lamina propria, less interstitial space, and a uniform distribution of hyaluronic acid (HA). As HA is the primary determinant of vocal fold viscosity and water content, the distribution of HA in the superficial portion of the neonatal vocal fold is hypothesized to be related to newborn crying endurance. Our objective was to examine the histological structure and the quantity and arrangement of HA within the lamina propria of the pediatric larynx and review the relevant physiology of hyaluronic acid and its impact on voice production. Histological and digital subtraction analysis. Intact, neonatal larynges were sourced from fresh cadaveric specimens. Trichrome stain was used to assess the collagen content and location in the tissues. HA was stained using a colloidal iron staining technique with and without incubation with bovine testicular hyaluronidase. Average optical density was calculated in tissue before and after treatment with hyaluronidase, and the stain intensity ratio was calculated. A total of 14 larynges were suitable for examination, eight males and six females. Histological examination revealed a uniform appearance of the vocal fold without evidence of a distinct vocal ligament or layered structure. Colloidal iron staining revealed an even distribution of HA throughout the vocal fold with no significant difference between quadrants. Images of the colloidal iron-stained tissue had a mean pixel intensity of 82 of 255. 
Slides of vocal fold tissue treated with hyaluronidase revealed a pixel intensity of 106 of 255 for a 22% mean difference in stain intensity (P < .01). The identification of the layered structure of the adult lamina propria has raised a number of questions as to the development and purpose of the human larynx. Based on histological observations from the current study, possible explanations for the physiological differences include differences in the distribution and tissue concentration of HA and consequently dynamic viscosity, oncotic affinity for water, and less intercellular space in the superficial lamina propria.
ERIC Educational Resources Information Center
Lanovaz, Marc J.; Fletcher, Sarah E.; Rapp, John T.
2009-01-01
We used a three-component multiple-schedule with a brief reversal design to evaluate the effects of structurally unmatched and matched stimuli on immediate and subsequent vocal stereotypy that was displayed by three children with autism spectrum disorders. For 2 of the 3 participants, access to matched stimuli, unmatched stimuli, and music…
Nanoscale viscoelasticity of extracellular matrix proteins in soft tissues: A multiscale approach.
Miri, Amir K; Heris, Hossein K; Mongeau, Luc; Javid, Farhad
2014-02-01
It is hypothesized that the bulk viscoelasticity of soft tissues is determined by two length-scale-dependent mechanisms: the time-dependent response of the extracellular matrix (ECM) proteins at the nanometer scale and the biophysical interactions between the ECM solid structure and interstitial fluid at the micrometer scale. The latter is governed by poroelasticity theory assuming free motion of the interstitial fluid within the porous ECM structure. In a recent study (Heris, H.K., Miri, A.K., Tripathy, U., Barthelat, F., Mongeau, L., 2013. J. Mech. Behav. Biomed. Mater.), atomic force microscopy was used to measure the response of porcine vocal folds to a creep loading and a 50-nm sinusoidal oscillation. A constitutive model was calibrated and verified using a finite element model to accurately predict the nanoscale viscoelastic moduli of ECM. A generally good correlation was obtained between the predicted variation of the viscoelastic moduli with depth and that of hyaluronic acids in vocal fold tissue. We conclude that hyaluronic acids may regulate vocal fold viscoelasticity. The proposed methodology offers a characterization tool for biomaterials used in vocal fold augmentations. © 2013 Elsevier Ltd. All rights reserved.
Information content and acoustic structure of male African elephant social rumbles
Stoeger, Angela S.; Baotic, Anton
2016-01-01
Until recently, the prevailing theory about male African elephants (Loxodonta africana) was that, once adult and sexually mature, males are solitary and targeted only at finding estrous females. While this is true during the state of ‘musth’ (a condition characterized by aggressive behavior and elevated androgen levels), ‘non-musth’ males exhibit a social system seemingly based on companionship, dominance and established hierarchies. Research on elephant vocal communication has so far focused on females, and very little is known about the acoustic structure and the information content of male vocalizations. Using the source and filter theory approach, we analyzed social rumbles of 10 male African elephants. Our results reveal that male rumbles encode information about individuality and maturity (age and size), with formant frequencies and absolute fundamental frequency values having the most informative power. This first comprehensive study on male elephant vocalizations gives important indications on their potential functional relevance for male-male and male-female communication. Our results suggest that, similar to the highly social females, future research on male elephant vocal behavior will reveal a complex communication system in which social knowledge, companionship, hierarchy, reproductive competition and the need to communicate over long distances play key roles. PMID:27273586
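The link between formant frequencies and body size that source-filter analyses rely on can be illustrated with the textbook uniform-tube approximation. This is a minimal sketch, not the authors' analysis: the tube closed at one end, the speed of sound, the example length, and the function names are all assumptions made for illustration.

```python
SPEED_OF_SOUND = 340.0  # m/s, assumed value for air


def tube_formants(length_m: float, count: int = 3, c: float = SPEED_OF_SOUND) -> list[float]:
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


def length_from_spacing(spacing_hz: float, c: float = SPEED_OF_SOUND) -> float:
    """Invert the model: neighbouring formants are c / (2 * L) apart, so L = c / (2 * spacing)."""
    return c / (2.0 * spacing_hz)


# Hypothetical 0.17 m tube, chosen only to give round numbers.
print(tube_formants(0.17))
print(length_from_spacing(1000.0))
```

In this approximation a longer tract gives lower and more closely spaced formants, which is why formant measurements can carry information about size and maturity.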
Repairing the vibratory vocal fold.
Long, Jennifer L
2018-01-01
A vibratory vocal fold replacement would introduce a new treatment paradigm for structural vocal fold diseases such as scarring and lamina propria loss. This work implants a tissue-engineered replacement for vocal fold lamina propria and epithelium in rabbits and compares histology and function to injured controls and orthotopic transplants. Hypotheses were that the cell-based implant would engraft and control the wound response, reducing fibrosis and restoring vibration. Translational research. Rabbit adipose-derived mesenchymal stem cells (ASC) were embedded within a three-dimensional fibrin gel, forming the cell-based outer vocal fold replacement (COVR). Sixteen rabbits underwent unilateral resection of vocal fold epithelium and lamina propria, as well as reconstruction with one of three treatments: fibrin glue alone with healing by secondary intention, replantation of autologous resected vocal fold cover, or COVR implantation. After 4 weeks, larynges were examined histologically and with phonation. Fifteen rabbits survived. All tissues incorporated well after implantation. After 1 month, both graft types improved histology and vibration relative to injured controls. Extracellular matrix (ECM) of the replanted mucosa was disrupted, and ECM of the COVR implants remained immature. Immune reaction was evident when male cells were implanted into female rabbits. Best histologic and short-term vibratory outcomes were achieved with COVR implants containing male cells implanted into male rabbits. Vocal fold cover replacement with a stem cell-based tissue-engineered construct is feasible and beneficial in acute rabbit implantation. Wound-modifying behavior of the COVR implant is judged to be an important factor in preventing fibrosis. Laryngoscope, 128:153-159, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Cook-Cunningham, Sheri L; Grady, Melissa L
2018-03-01
The purpose of this investigation was to assess the effects of three warm-up procedures (vocal-only, physical-only, physical/vocal combination) on acoustic and perceptual measures of choir sound. The researchers tested three videotaped, 5-minute, choral warm-up procedures on three university choirs. After participating in a warm-up procedure, each choir was recorded singing a folk song for long-term average spectra and pitch analysis. Singer participants responded to a questionnaire about preferences after each warm-up procedure. Warm-up procedures and recording sessions occurred during each choir's regular rehearsal time and in each choir's regular rehearsal space during three consecutive rehearsals. Long-term average spectra results demonstrated more resonant singing after the physical/vocal warm-up for two of the three choirs. Pitch analysis results indicated that all three choirs sang "in-tune" or with the least pitch deviation after participating in the physical/vocal warm-up. Singer questionnaire responses showed general preference for the physical/vocal combination warm-up, and singer ranking of the three procedures indicated the physical/vocal warm-up as the most favored for readiness to sing. In the context of this study with these three university choir participants, it seems that a combination choral warm-up that includes physical and vocal aspects is preferred by singers and enables more resonant and more in-tune singing. Findings from this study could provide teachers and choral directors with important information as they structure and experiment with their choral warm-up procedures. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Musical melody and speech intonation: singing a different tune.
Zatorre, Robert J; Baum, Shari R
2012-01-01
Music and speech are often cited as characteristically human forms of communication. Both share the features of hierarchical structure, complex sound systems, and sensorimotor sequencing demands, and both are used to convey and influence emotions, among other functions [1]. Both music and speech also prominently use acoustical frequency modulations, perceived as variations in pitch, as part of their communicative repertoire. Given these similarities, and the fact that pitch perception and production involve the same peripheral transduction system (cochlea) and the same production mechanism (vocal tract), it might be natural to assume that pitch processing in speech and music would also depend on the same underlying cognitive and neural mechanisms. In this essay we argue that the processing of pitch information differs significantly for speech and music; specifically, we suggest that there are two pitch-related processing systems, one for more coarse-grained, approximate analysis and one for more fine-grained accurate representation, and that the latter is unique to music. More broadly, this dissociation offers clues about the interface between sensory and motor systems, and highlights the idea that multiple processing streams are a ubiquitous feature of neuro-cognitive architectures.
The role of the medial temporal limbic system in processing emotions in voice and music.
Frühholz, Sascha; Trost, Wiebke; Grandjean, Didier
2014-12-01
Subcortical brain structures of the limbic system, such as the amygdala, are thought to decode the emotional value of sensory information. Recent neuroimaging studies, as well as lesion studies in patients, have shown that the amygdala is sensitive to emotions in voice and music. Similarly, the hippocampus, another part of the temporal limbic system (TLS), is responsive to vocal and musical emotions, but its specific roles in emotional processing from music and especially from voices have been largely neglected. Here we review recent research on vocal and musical emotions, and outline commonalities and differences in the neural processing of emotions in the TLS in terms of emotional valence, emotional intensity and arousal, as well as in terms of acoustic and structural features of voices and music. We summarize the findings in a neural framework including several subcortical and cortical functional pathways between the auditory system and the TLS. This framework proposes that some vocal expressions might already receive a fast emotional evaluation via a subcortical pathway to the amygdala, whereas cortical pathways to the TLS are thought to be equally used for vocal and musical emotions. While the amygdala might be specifically involved in a coarse decoding of the emotional value of voices and music, the hippocampus might process more complex vocal and musical emotions, and might have an important role especially for the decoding of musical emotions by providing memory-based and contextual associations. Copyright © 2014 Elsevier Ltd. All rights reserved.
Kleber, Boris; Veit, Ralf; Moll, Christina Valérie; Gaser, Christian; Birbaumer, Niels; Lotze, Martin
2016-06-01
In contrast to instrumental musicians, professional singers do not train on a specific instrument but perfect a motor system that has already been extensively trained during speech motor development. Previous functional imaging studies suggest that experience with singing is associated with enhanced somatosensory-based vocal motor control. However, experience-dependent structural plasticity in vocal musicians has rarely been studied. We investigated voxel-based morphometry (VBM) in 27 professional classical singers and compared gray matter volume in regions of the "singing-network" to an age-matched group of 28 healthy volunteers with no special singing experience. We found right hemispheric volume increases in professional singers in ventral primary somatosensory cortex (larynx S1) and adjacent rostral supramarginal gyrus (BA40), as well as in secondary somatosensory (S2) and primary auditory cortices (A1). Moreover, we found that earlier commencement with vocal training correlated with increased gray-matter volume in S1. However, in contrast to studies with instrumental musicians, this correlation only emerged in singers who began their formal training after the age of 14 years, when speech motor development has reached its first plateau. Structural data thus confirm and extend previous functional reports suggesting a pivotal role of somatosensation in vocal motor control with increased experience in singing. Results furthermore indicate a sensitive period for developing additional vocal skills after speech motor coordination has matured. Copyright © 2016 Elsevier Inc. All rights reserved.
Auditory responses in the amygdala to social vocalizations
NASA Astrophysics Data System (ADS)
Gadziola, Marie A.
The underlying goal of this dissertation is to understand how the amygdala, a brain region involved in establishing the emotional significance of sensory input, contributes to the processing of complex sounds. The general hypothesis is that communication calls of big brown bats (Eptesicus fuscus) transmit relevant information about social context that is reflected in the activity of amygdalar neurons. The first specific aim analyzed social vocalizations emitted under a variety of behavioral contexts, and related vocalizations to an objective measure of internal physiological state by monitoring the heart rate of vocalizing bats. These experiments revealed a complex acoustic communication system among big brown bats in which acoustic cues and call structure signal the emotional state of a sender. The second specific aim characterized the responsiveness of single neurons in the basolateral amygdala to a range of social syllables. Neurons typically respond to the majority of tested syllables, but effectively discriminate among vocalizations by varying the response duration. This novel coding strategy underscores the importance of persistent firing in the general functioning of the amygdala. The third specific aim examined the influence of acoustic context by characterizing both the behavioral and neurophysiological responses to natural vocal sequences. Vocal sequences differentially modify the internal affective state of a listening bat, with lower aggression vocalizations evoking the greatest change in heart rate. Amygdalar neurons employ two different coding strategies: low background neurons respond selectively to very few stimuli, whereas high background neurons respond broadly to stimuli but demonstrate variation in response magnitude and timing. Neurons appear to discriminate the valence of stimuli, with aggression sequences evoking robust population-level responses across all sound levels. 
Further, vocal sequences show improved discrimination among stimuli compared to isolated syllables, and this improved discrimination is expressed in part by the timing of action potentials. Taken together, these data support the hypothesis that big brown bat social vocalizations transmit relevant information about the social context that is encoded within the discharge pattern of amygdalar neurons ultimately responsible for coordinating appropriate social behaviors. I further propose that vocalization-evoked amygdalar activity will have significant impact on subsequent sensory processing and plasticity.
Glottal aerodynamics in compliant, life-sized vocal fold models
NASA Astrophysics Data System (ADS)
McPhail, Michael; Dowell, Grant; Krane, Michael
2013-11-01
This talk presents high-speed PIV measurements in compliant, life-sized models of the vocal folds. A clearer understanding of the fluid-structure interaction of voiced speech, how it produces sound, and how it varies with pathology is required to improve clinical diagnosis and treatment of vocal disorders. Physical models of the vocal folds can answer questions regarding the fundamental physics of speech, as well as the ability of clinical measures to detect the presence and extent of disorder. Flow fields were recorded in the supraglottal region of the models to estimate terms in the equations of fluid motion, and their relative importance. Experiments were conducted over a range of driving pressures with flow rates, given by a ball flowmeter, and subglottal pressures, given by a micro-manometer, reported for each case. Imaging of vocal fold motion, vector fields showing glottal jet behavior, and terms estimated by control volume analysis will be presented. The use of these results for a comparison with clinical measures, and for the estimation of aeroacoustic source strengths will be discussed. Acknowledge support from NIH R01 DC005642.
Oscillatory flow in the cochlea visualized by a magnetic resonance imaging technique.
Denk, W; Keolian, R M; Ogawa, S; Jelinski, L W
1993-02-15
We report a magnetic resonance imaging technique that directly measures motion of cochlear fluids. It uses oscillating magnetic field gradients phase-locked to an external stimulus to selectively visualize and quantify oscillatory fluid motion. It is not invasive, and it does not require optical line-of-sight access to the inner ear. It permits the detection of displacements far smaller than the spatial resolution. The method is demonstrated on a phantom and on living rats. It is projected to have applications for auditory research, for the visualization of vocal tract dynamics during speech and singing, and for determination of the spatial distribution of mechanical relaxations in materials.
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Frollo, Ivan
2017-12-01
The paper focuses on two methods for evaluating the success of enhancement of speech signals recorded in an open-air magnetic resonance imager during phonation for 3D human vocal tract modeling. The first approach provides a comparison based on statistical analysis by ANOVA and hypothesis tests. The second method is based on classification by Gaussian mixture models (GMM). The experiments confirmed that the proposed ANOVA and GMM classifiers for automatic evaluation of speech quality are functional and produce results fully comparable with the standard evaluation based on the listening test method.
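The ANOVA-based comparison mentioned above rests on the one-way F statistic, which contrasts variation between groups with variation within them. The sketch below is a minimal illustration with made-up numbers, not the authors' procedure: the grouping, the feature values, and the function name are all hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups: list[list[float]]) -> float:
    """One-way ANOVA F statistic: between-group mean square over within-group mean square."""
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values of one spectral feature for three processing conditions.
noisy = [1.0, 2.0, 3.0]
enhanced_a = [2.0, 3.0, 4.0]
enhanced_b = [5.0, 6.0, 7.0]
print(one_way_anova_f([noisy, enhanced_a, enhanced_b]))
```

A large F value indicates that the group means differ by more than the within-group scatter would explain; turning it into a p-value additionally requires the F distribution with (k - 1, n - k) degrees of freedom, which is omitted here.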
Phrase-level speech simulation with an airway modulation model of speech production
Story, Brad H.
2012-01-01
Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulation of words and phrases are demonstrated. PMID:23503742
Mother goats do not forget their kids’ calls
Briefer, Elodie F.; Padilla de la Torre, Monica; McElligott, Alan G.
2012-01-01
Parent–offspring recognition is crucial for offspring survival. At long distances, this recognition is mainly based on vocalizations. Because of maturation-related changes to the structure of vocalizations, parents have to learn successive call versions produced by their offspring throughout ontogeny in order to maintain recognition. However, because of the difficulties involved in following the same individuals over years, it is not clear how long this vocal memory persists. Here, we investigated long-term vocal recognition in goats. We tested responses of mothers to their kids’ calls 7–13 months after weaning. We then compared mothers’ responses to calls of their previous kids with their responses to the same calls at five weeks postpartum. Subjects tended to respond more to their own kids at five weeks postpartum than 11–17 months later, but displayed stronger responses to their previous kids than to familiar kids from other females. Acoustic analyses showed that it is unlikely that mothers were responding to their previous kids simply because they confounded them with the new kids they were currently nursing. Therefore, our results provide evidence for strong, long-term vocal memory capacity in goats. The persistence of offspring vocal recognition beyond weaning could have important roles in kin social relationships and inbreeding avoidance. PMID:22719031
Smirnova, Darya S.; Demina, Tatyana S.; Volodina, Elena V.
2016-01-01
The vocal repertoire of captive cheetahs (Acinonyx jubatus) and the specific role of meow vocalizations in communication of this species have attracted research interest for about two dozen years. Here, we expand this research focus to the contextual use of call types, sex differences and individual differences over the short and long term. During 457 trials of acoustic recordings, we collected calls (n = 8120) and data on their contextual use for 13 adult cheetahs (6 males and 7 females) in four Russian zoos. The cheetah vocal repertoire comprised 7 call types produced in 8 behavioural contexts. Context-specific call types (chirr, growl, howl and hiss) were related to courting behaviour (chirr) or to aggressive behaviour (growl, howl and hiss). Other call types (chirp, purr and meow) were not context-specific. The values of acoustic variables differed between call types. The meow was the most frequent call type. Discriminant function analysis revealed a high potential of meows to encode individual identity and sex in the short term; however, vocal individuality was unstable over years. We discuss the contextual use and acoustic variables of call types, the ratios of individual and sex differences in calls and the pathways of vocal ontogeny in the cheetah with relevant data on vocalization of other animals. PMID:27362643
Smirnova, Darya S; Volodin, Ilya A; Demina, Tatyana S; Volodina, Elena V
2016-01-01
The vocal repertoire of captive cheetahs (Acinonyx jubatus) and the specific role of meow vocalizations in communication of this species have attracted research interest for about two dozen years. Here, we expand this research focus to the contextual use of call types, sex differences and individual differences over the short and long term. During 457 trials of acoustic recordings, we collected calls (n = 8120) and data on their contextual use for 13 adult cheetahs (6 males and 7 females) in four Russian zoos. The cheetah vocal repertoire comprised 7 call types produced in 8 behavioural contexts. Context-specific call types (chirr, growl, howl and hiss) were related to courting behaviour (chirr) or to aggressive behaviour (growl, howl and hiss). Other call types (chirp, purr and meow) were not context-specific. The values of acoustic variables differed between call types. The meow was the most frequent call type. Discriminant function analysis revealed a high potential of meows to encode individual identity and sex in the short term; however, vocal individuality was unstable over years. We discuss the contextual use and acoustic variables of call types, the ratios of individual and sex differences in calls and the pathways of vocal ontogeny in the cheetah with relevant data on vocalization of other animals.
Performance of a reduced-order FSI model for flow-induced vocal fold vibration
NASA Astrophysics Data System (ADS)
Luo, Haoxiang; Chang, Siyuan; Chen, Ye; Rousseau, Bernard; PhonoSim Team
2017-11-01
Vocal fold vibration during speech production involves a three-dimensional unsteady glottal jet flow and three-dimensional nonlinear tissue mechanics. A full 3D fluid-structure interaction (FSI) model is computationally expensive even though it provides the most accurate information about the system. On the other hand, an efficient reduced-order FSI model is useful for fast simulation and analysis of the vocal fold dynamics, and can be applied in procedures such as optimization and parameter estimation. In this work, we study the performance of a reduced-order model compared with the corresponding full 3D model in terms of its accuracy in predicting the vibration frequency and deformation mode. In the reduced-order model, we use a 1D flow model coupled with a 3D tissue model that is the same as in the full 3D model. Two different hyperelastic tissue behaviors are assumed. In addition, the vocal fold thickness and subglottal pressure are varied for systematic comparison. The results show that the reduced-order model provides predictions consistent with the full 3D model across different tissue material assumptions and subglottal pressures. However, the vocal fold thickness has the greatest effect on the model accuracy, especially when the vocal fold is thin.
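To give a feel for what a 1D flow model of the glottis can look like, the sketch below uses the classic steady Bernoulli approximation. This is an assumption made for illustration, not the authors' formulation, which the abstract does not specify: it treats the subglottal pressure as the total pressure, lets the flow separate at the minimum area with zero pressure downstream, and uses hypothetical names, areas, and an assumed air density.

```python
import math

AIR_DENSITY = 1.2  # kg/m^3, assumed


def bernoulli_glottal_flow(areas_m2: list[float], p_sub_pa: float,
                           rho: float = AIR_DENSITY) -> tuple[float, list[float]]:
    """Return (volume flow, pressure per station) for a quasi-steady 1D Bernoulli model.

    Assumptions: negligible upstream velocity, no viscous loss, and flow
    separation at the minimum-area station with zero pressure downstream.
    """
    i_min = min(range(len(areas_m2)), key=areas_m2.__getitem__)
    a_min = areas_m2[i_min]
    flow = a_min * math.sqrt(2.0 * p_sub_pa / rho)  # m^3/s
    pressures = []
    for i, area in enumerate(areas_m2):
        if i <= i_min:
            # Bernoulli upstream of separation: p = p_sub - 0.5 * rho * (Q / A)^2
            pressures.append(p_sub_pa - 0.5 * rho * (flow / area) ** 2)
        else:
            # Separated jet: pressure taken as ambient (zero gauge pressure).
            pressures.append(0.0)
    return flow, pressures


# Hypothetical converging-diverging glottal areas (m^2) and a 1 kPa driving pressure.
flow, pressures = bernoulli_glottal_flow([2e-4, 1e-4, 5e-5, 8e-5], 1000.0)
print(flow, pressures)
```

In a coupled simulation, pressures of this kind would be applied as loads on the tissue model at every time step, and the areas would be updated from the deformed geometry. A model this simple ignores the three-dimensional jet structure that the full model resolves.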
Characterization of vocal fold scar formation, prophylaxis, and treatment using animal models.
Bless, Diane M; Welham, Nathan V
2010-12-01
To review recent literature on animal models used to study the pathogenesis, detection, prevention, and treatment of vocal fold scarring. Animal work is critical to studying vocal fold scarring because it is the only way to conduct systematic research on the biomechanical properties of the layered structure of the vocal fold lamina propria, and therefore develop reliable prevention and treatment strategies for this complex clinical problem. During the period of review, critical anatomic, physiologic, and wound healing characteristics, which may serve as the bases for selection of a certain species to help answer a specific question, have been described in mouse, rat, rabbit, ferret, and canine models. A number of different strategies for prophylaxis and chronic scar treatment in animals show promise for clinical application. The pathways of scar formation and methods for quantifying treatment-induced change have become better defined. Recent animal vocal fold scarring studies have enriched and confirmed earlier work indicating that restoring pliability to the scarred vocal fold mucosa is challenging but achievable. Differences between animal models and differences in outcome measurements across studies necessitate considering each study individually to obtain guidance for future research. With increased standardization of measurement techniques it may be possible to make more inter-study comparisons.
NASA Astrophysics Data System (ADS)
Panova, E. M.; Belikov, R. A.; Agafonov, A. V.; Bel'Kovich, V. M.
2012-02-01
The underwater vocalizations of beluga whales summering in Onega Bay (64°24'N, 35°49'E) were recorded in June-July of 2008. The vocalizations were classified into five major whistle types, four types of pulsed tones, click series, and noise vocalizations. To determine the relationship between the behavioral activity and the underwater vocalizations, a total of fifty-one 2-minute-long samples of the audio records were analyzed in the following six behavioral contexts: directional movements, quiet swimming, resting, social interactions, individual hunting behavior, and the exploration of hydrophones by beluga whales. The overall vocalization rate and the percentage of the main types of signals depend on the behavior of the belugas. We suggest that one of the whistle types (the "stereotype whistle") is used by belugas for long-distance communication, while other whistle types (with the exception of "squeaks") and three types of pulsed tones (with the exception of "vowels") are used for short-distance communication. The percentage of "squeaks" and "vowels" was equally high in all the behavioral situations. Thus, we assume that "squeaks" are contact signals. "Vowels" have a specific physical structure and probably play a role as identification signals. A high rate of click series was observed during social interactions.
Mother goats do not forget their kids' calls.
Briefer, Elodie F; Padilla de la Torre, Monica; McElligott, Alan G
2012-09-22
Parent-offspring recognition is crucial for offspring survival. At long distances, this recognition is mainly based on vocalizations. Because of maturation-related changes to the structure of vocalizations, parents have to learn successive call versions produced by their offspring throughout ontogeny in order to maintain recognition. However, because of the difficulties involved in following the same individuals over years, it is not clear how long this vocal memory persists. Here, we investigated long-term vocal recognition in goats. We tested responses of mothers to their kids' calls 7-13 months after weaning. We then compared mothers' responses to calls of their previous kids with their responses to the same calls at five weeks postpartum. Subjects tended to respond more to their own kids at five weeks postpartum than 11-17 months later, but displayed stronger responses to their previous kids than to familiar kids from other females. Acoustic analyses showed that it is unlikely that mothers were responding to their previous kids simply because they confounded them with the new kids they were currently nursing. Therefore, our results provide evidence for strong, long-term vocal memory capacity in goats. The persistence of offspring vocal recognition beyond weaning could have important roles in kin social relationships and inbreeding avoidance.
Schlaug, Gottfried; Marchina, Sarah; Norton, Andrea
2009-01-01
Recovery from aphasia can be achieved through recruitment of either peri-lesional brain regions in the affected hemisphere or homologous language regions in the non-lesional hemisphere. For patients with large left-hemisphere lesions, recovery through the right hemisphere may be the only possible path. The right hemisphere regions most likely to play a role in this recovery process are the superior temporal lobe (important for auditory feedback control), premotor regions/posterior inferior frontal gyrus (important for planning and sequencing of motor actions and for auditory-motor mapping) and the primary motor cortex (important for execution of vocal motor actions). These regions are connected reciprocally via a major fiber tract called the arcuate fasciculus (AF), but this tract is usually not as well developed in the non-dominant right hemisphere. We tested whether an intonation-based speech therapy (i.e., Melodic Intonation Therapy) which is typically administered in an intense fashion with 75–80 daily therapy sessions, would lead to changes in white matter tracts, particularly the AF. Using diffusion tensor imaging (DTI), we found a significant increase in the number of AF fibers and AF volume comparing post with pre-treatment assessments in 6 patients that could not be attributed to scan-to-scan variability. This suggests that intense, long-term Melodic Intonation Therapy leads to remodeling of the right AF and may provide an explanation for the sustained therapy effects that were seen in these 6 patients. PMID:19673813
Affective responses in tamarins elicited by species-specific music
Snowdon, Charles T.; Teie, David
2010-01-01
Theories of music evolution agree that human music has an affective influence on listeners. Tests of non-humans provided little evidence of preferences for human music. However, prosodic features of speech (‘motherese’) influence affective behaviour of non-verbal infants as well as domestic animals, suggesting that features of music can influence the behaviour of non-human species. We incorporated acoustical characteristics of tamarin affiliation vocalizations and tamarin threat vocalizations into corresponding pieces of music. We compared music composed for tamarins with that composed for humans. Tamarins were generally indifferent to playbacks of human music, but responded with increased arousal to tamarin threat vocalization based music, and with decreased activity and increased calm behaviour to tamarin affective vocalization based music. Affective components in human music may have evolutionary origins in the structure of calls of non-human animals. In addition, animal signals may have evolved to manage the behaviour of listeners by influencing their affective state. PMID:19726444
Fillis, Michelle Moreira Abujamra; Andrade, Selma Maffei de; González, Alberto Durán; Melanda, Francine Nesello; Mesas, Arthur Eumann
2016-01-01
This study aimed to estimate the prevalence of self-reported vocal problems among primary schoolteachers and to identify associated occupational factors, using a cross-sectional design and face-to-face interviews with 967 teachers in 20 public schools in Londrina, Paraná State, Brazil. Prevalence of self-reported vocal problems was 25.7%. Adjusted analyses showed associations with characteristics of the employment relationship (workweek ≥ 40 hours and poor perception of salaries and health benefits), characteristics of the work environment (number of students per class and exposure to chalk dust and microorganisms), psychological factors (low job satisfaction, limited opportunities to express opinions, worse relationship with superiors, and poor balance between professional and personal life), and violence (insults and bullying). Vocal disorders affected one in four primary schoolteachers and were associated with various characteristics of the teaching profession (both structural and work-related).
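The reported prevalence can be put in context with a confidence interval. The sketch below is illustrative only: the abstract gives no interval and no case count, so the count is reconstructed from the percentage and sample size as an assumption, and the Wilson score interval is one common choice rather than the method used in the study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


n_teachers = 967
# Assumed case count, reconstructed from the reported 25.7% prevalence.
cases = round(0.257 * n_teachers)
low, high = wilson_interval(cases, n_teachers)
print(cases, f"{low:.3f}", f"{high:.3f}")
```

With a sample of this size the interval spans only a few percentage points on either side of the estimate, which is consistent with summarising the result as about one teacher in four.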
The Sound Broadcasting System of the Bullfrog
NASA Astrophysics Data System (ADS)
Purgue, Alejandro P.
1995-01-01
This work presents a comparison across selected species of several aspects of the mechanism of sound broadcasting in anuran amphibians. These studies indicate that all anuran species studied to date broadcast their calls through structures that resonate at the dominant frequency in their calls. Measurements of the magnitude of the transfer function of the radiating structures show that the structures responsible for radiating the bulk of the energy present in the call vary depending on the species considered. Bullfrogs (Rana catesbeiana) radiate most of the energy (89% sound level) present in their calls through their eardrums. In this species the transfer function of the eardrum displays several peaks coincident in frequency and amplitude with the energy distribution observed in the mating and release call of the species. The vocal sac and gular area contribute energy only in the lower band (150 to 400 Hz) of the call. The ears are responsible for radiating additional frequency bands to the ones being radiated through the gular area and vocal sacs. This condition appears to be derived. In Rana pipiens the ears also broadcast a significant portion of the energy present in the call (63% sound level) but the frequencies of the aural emissions are a subset of those frequencies radiated through the vocal sac and gular area. Character optimization suggests that this is the primitive condition for ranid frogs. Finally, the barking treefrog (Hyla gratiosa) appears to use two different structures to radiate different portions of the call. The low frequency band appears to be preferentially radiated through the lungs while the high frequency components of the call are radiated through the vocal sac.
Azul, David
2016-11-01
Transmasculine people, who were assigned female gender at birth but do not identify with this classification, have traditionally received little consideration in the voice literature. Existing analyses tend to be focused on evaluating speaker voice characteristics, whereas other factors that contribute to the production of vocal gender have remained underexplored. Most studies rely on researcher-centred perspectives, whereas very little is known about how transmasculine people themselves experience and make sense of their vocal situations. This study aimed to explore how participants described their subjective gender positionings; which gender attributions they wished to receive from others; which gender they self-attributed to their voices; which gender attributions they had received from others; and how far participants were satisfied with the gender-related aspects of their vocal situations. Transcripts of semi-structured interviews with 14 German-speaking transmasculine people served as the original data corpus. Sections in which participants described the gender-related aspects of their vocal situations and that were relevant to the current research objectives were selected and explored using qualitative content analysis. The analysis revealed diverse accounts pertaining to the factors that contribute to the production of vocal gender for individual participants and variable levels of satisfaction with vocal gender presentation and attribution. Transmasculine people need to be regarded as a heterogeneous population and clinical practice needs to follow a client-centred, individualized approach. © 2016 Royal College of Speech and Language Therapists.
Core and Shell Song Systems Unique to the Parrot Brain
Chakraborty, Mukta; Walløe, Solveig; Nedergaard, Signe; Fridel, Emma E.; Dabelsteen, Torben; Pakkenberg, Bente; Bertelsen, Mads F.; Dorrestein, Gerry M.; Brauth, Steven E.; Durand, Sarah E.; Jarvis, Erich D.
2015-01-01
The ability to imitate complex sounds is rare, and among birds has been found only in parrots, songbirds, and hummingbirds. Parrots exhibit the most advanced vocal mimicry among non-human animals. A few studies have noted differences in connectivity, brain position and shape in the vocal learning systems of parrots relative to songbirds and hummingbirds. However, only one parrot species, the budgerigar, has been examined and no differences in the presence of song system structures were found with other avian vocal learners. Motivated by questions of whether there are important differences in the vocal systems of parrots relative to other vocal learners, we used specialized constitutive gene expression, singing-driven gene expression, and neural connectivity tracing experiments to further characterize the song system of budgerigars and/or other parrots. We found that the parrot brain uniquely contains a song system within a song system. The parrot “core” song system is similar to the song systems of songbirds and hummingbirds, whereas the “shell” song system is unique to parrots. The core with only rudimentary shell regions was found in the New Zealand kea, representing one of the only living species at a basal divergence with all other parrots, implying that parrots evolved vocal learning systems at least 29 million years ago. Relative size differences in the core and shell regions occur among species, which we suggest could be related to species differences in vocal and cognitive abilities. PMID:26107173
Microscopie non-lineaire pour l'imagerie des cordes vocales
NASA Astrophysics Data System (ADS)
Deterre, Romain
The vocal cords are two folds of epithelial tissues located in the larynx and are involved in production of the human voice. Despite their apparent simplicity, their internal structure is complex. Each fold can be divided into several layers with different mechanical properties. The gold standard for studying their structure, histology, has the drawback of being very invasive. Non-linear microscopy is an optical imaging technique which allows images to be taken in depth within samples in a non-invasive manner. It also offers intrinsic contrasts, allowing the identification of certain fibrous proteins, elastin and collagen, which are responsible for the mechanical properties of epithelial tissues. The main goal of this research project was to assess the performance of nonlinear microscopy for vocal fold imaging. The study was broken down into two separate tasks. The first was to evaluate the contrast of the nonlinear modalities against histology. For that purpose, we chose to first take images of thin samples and compare them to the corresponding histological slides. The second task was to run tests on translating the results obtained to in vivo imaging. A custom-built nonlinear imaging system was used for these experiments. It was developed to allow acquisition of wide-field images. C++-based software was developed to control the microscope and allow processing and visualization of the images. After being built, the system was further tested to check its performance against the theoretical limit described in the literature. Thin slices of vocal folds were obtained from the team of Pr Christopher J. Hartnick from Massachusetts Eye and Ear Infirmary, Harvard Medical School. Specialists from his team analysed the histological samples to extract structural data from the vocal folds. A good correlation was measured between histological and nonlinear data.
A first step in evaluating the possibility for translating these results towards in vivo imaging was performed during this project. A swine's larynx was obtained, and vocal folds were extracted for imaging purposes. This experiment showed that it is indeed possible to localize various macrostructures of the tissues with nonlinear microscopy.
Structural and functional connectivity of the subthalamic nucleus during vocal emotion decoding
Frühholz, Sascha; Ceravolo, Leonardo; Grandjean, Didier
2016-01-01
Our understanding of the role played by the subthalamic nucleus (STN) in human emotion has recently advanced with STN deep brain stimulation, a neurosurgical treatment for Parkinson’s disease and obsessive-compulsive disorder. However, the potential presence of several confounds related to pathological models raises the question of how much they affect the relevance of observations regarding the physiological function of the STN itself. This underscores the crucial importance of obtaining evidence from healthy participants. In this study, we tested the structural and functional connectivity between the STN and other brain regions related to vocal emotion in a healthy population by combining diffusion tensor imaging and psychophysiological interaction analysis from a high-resolution functional magnetic resonance imaging study. As expected, we showed that the STN is functionally connected to the structures involved in emotional prosody decoding, notably the orbitofrontal cortex, inferior frontal gyrus, auditory cortex, pallidum and amygdala. These functional results were corroborated by probabilistic fiber tracking, which revealed that the left STN is structurally connected to the amygdala and the orbitofrontal cortex. These results confirm, in healthy participants, the role played by the STN in human emotion and its structural and functional connectivity with the brain network involved in vocal emotions. PMID:26400857
Human listeners attend to size information in domestic dog growls.
Taylor, Anna M; Reby, David; McComb, Karen
2008-05-01
The acoustic features of vocalizations have the potential to transmit information about the size of callers. Most acoustic studies have focused on intraspecific perceptual abilities, but here, the ability of humans to use growls to assess the size of adult domestic dogs was tested. In a first experiment, the formants of growls were shifted to create playback stimuli with different formant dispersions (Δf), simulating different vocal tract lengths within the natural range of variation. Mean fundamental frequency (F0) was left unchanged and treated as a covariate. In a second experiment, F0 was resynthesized and Δf was left unchanged. In both experiments Δf and F0 influenced how participants rated the size of stimuli. Lower formant and fundamental frequencies were rated as belonging to larger dogs. Crucially, when F0 was manipulated and Δf was natural, ratings were strongly correlated with the actual weight of the dogs, while when Δf was varied and F0 was natural, ratings were not related to the actual weight. Taken together, this suggests that participants relied more heavily on Δf, in accordance with the fact that formants are better predictors of body size than F0.
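The link between formant dispersion and simulated vocal tract length in the abstract above can be illustrated with a minimal sketch. It assumes the standard approximation of the vocal tract as a uniform tube closed at one end, for which formants are spaced by c / (2L); the abstract does not state which formula or speed of sound the authors used, so the constant, the function names, and the example formant values below are all illustrative assumptions.

```python
# Assumed speed of sound in warm, humid air (cm/s); not a value from the study.
SPEED_OF_SOUND_CM_S = 35000.0


def formant_dispersion(formants_hz):
    """Mean spacing (Hz) between successive formants, given in ascending order."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    return (formants_hz[-1] - formants_hz[0]) / (len(formants_hz) - 1)


def apparent_vtl_cm(delta_f_hz, c=SPEED_OF_SOUND_CM_S):
    """Apparent vocal tract length for a uniform tube closed at one end: L = c / (2 * delta_f)."""
    return c / (2.0 * delta_f_hz)


def shift_formants(formants_hz, factor):
    """Scale every formant by the same factor (factor < 1 simulates a longer tract)."""
    return [f * factor for f in formants_hz]


if __name__ == "__main__":
    # Hypothetical formants of an ideal 17.5 cm tube, not measured dog growls.
    original = [500.0, 1500.0, 2500.0, 3500.0]
    lowered = shift_formants(original, 0.8)
    for label, formants in (("original", original), ("shifted", lowered)):
        df = formant_dispersion(formants)
        print(label, round(df, 1), "Hz ->", round(apparent_vtl_cm(df), 2), "cm")
```

Under these assumptions, lowering all formants by a common factor reduces Δf and increases the apparent tract length by the inverse of that factor, which is the sense in which shifted formants simulate a larger caller.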
Sigurdsson, Hilmar P; Pépés, Sophia E; Jackson, Georgina M; Draper, Amelia; Morgan, Paul S; Jackson, Stephen R
2018-04-12
Tourette syndrome (TS) is a neurodevelopmental disorder characterised by repetitive and intermittent motor and vocal tics. TS is thought to reflect fronto-striatal dysfunction and the aetiology of the disorder has been linked to widespread alterations in the functional and structural integrity of the brain. The aim of this study was to assess white matter (WM) abnormalities in a large sample of young patients with TS in comparison to a sample of matched typically developing control individuals (CS) using diffusion MRI. The study included 35 patients with TS (3 females; mean age: 14.0 ± 3.3) and 35 CS (3 females; mean age: 13.9 ± 3.3). Diffusion MRI data was analysed using tract-based spatial statistics (TBSS) and probabilistic tractography. Patients with TS demonstrated both marked and widespread decreases in axial diffusivity (AD) together with altered WM connectivity. Moreover, we showed that tic severity and the frequency of premonitory urges (PU) were associated with increased connectivity between primary motor cortex (M1) and the caudate nuclei, and increased information transfer between M1 and the insula, respectively. This is to our knowledge the first study to employ both TBSS and probabilistic tractography in a sample of young patients with TS. Our results contribute to the limited existing literature demonstrating altered connectivity in TS and confirm previous results suggesting in particular, that altered insular function contributes to increased frequency of PU. Copyright © 2018. Published by Elsevier Ltd.
Ackermann, Hermann; Riecker, Axel
2010-06-01
Skilled spoken language production requires fast and accurate coordination of up to 100 muscles. A long-standing concept--tracing ultimately back to Paul Broca--assumes posterior parts of the inferior frontal gyrus to support the orchestration of the respective movement sequences prior to innervation of the vocal tract. At variance with this tradition, the insula has more recently been declared the relevant "region for coordinating speech articulation", based upon clinico-neuroradiological correlation studies. However, these findings have been criticized on methodological grounds. A survey of the clinical literature (cerebrovascular disorders, brain tumours, stimulation mapping) yields a still inconclusive picture. By contrast, functional imaging studies report more consistently hemodynamic insular responses in association with motor aspects of spoken language. Most noteworthy, a relatively small area at the junction of insular and opercular cortex was found sensitive to the phonetic-linguistic structure of verbal utterances, a strong argument for its engagement in articulatory control processes. Nevertheless, intrasylvian hemodynamic activation does not appear restricted to articulatory processes and might also be engaged in the adjustment of the autonomic system to ventilatory needs during speech production: Whereas the posterior insula could be involved in the cortical representation of respiration-related metabolic (interoceptive) states, the more rostral components, acting upon autonomic functions, might serve as a corollary pathway to "voluntary control of breathing" bound to corticospinal and -bulbar fiber tracts. For example, the insula could participate in the implementation of task-specific autonomic settings such as the maintenance of a state of relative hyperventilation during speech production.
Nelson, Danielle V; Klinck, Holger; Carbaugh-Rutland, Alexander; Mathis, Codey L; Morzillo, Anita T; Garcia, Tiffany S
2017-01-01
Loss of acoustic habitat due to anthropogenic noise is a key environmental stressor for vocal amphibian species, a taxonomic group that is experiencing global population declines. The Pacific chorus frog (Pseudacris regilla) is the most common vocal species of the Pacific Northwest and can occupy human-dominated habitat types, including agricultural and urban wetlands. This species is exposed to anthropogenic noise, which can interfere with vocalizations during the breeding season. We hypothesized that Pacific chorus frogs would alter the spatial and temporal structure of their breeding vocalizations in response to road noise, a widespread anthropogenic stressor. We compared Pacific chorus frog call structure and ambient road noise levels along a gradient of road noise exposures in the Willamette Valley, Oregon, USA. We used both passive acoustic monitoring and directional recordings to determine source level (i.e., amplitude or volume), dominant frequency (i.e., pitch), call duration, and call rate of individual frogs and to quantify ambient road noise levels. Pacific chorus frogs were unable to change their vocalizations to compensate for road noise. A model of the active space and time ("spatiotemporal communication") over which a Pacific chorus frog vocalization could be heard revealed that in high-noise habitats, spatiotemporal communication was drastically reduced for an individual. This may have implications for the reproductive success of this species, which relies on specific call repertoires to portray relative fitness and attract mates. Using the acoustic call parameters defined by this study (frequency, source level, call rate, and call duration), we developed a simplified model of acoustic communication space-time for this species. This model can be used in combination with models that determine the insertion loss for various acoustic barriers to define the impact of anthropogenic noise on the radius of communication in threatened species.
Additionally, this model can be applied to other vocal taxonomic groups provided the necessary acoustic parameters are determined, including the frequency parameters and perception thresholds. Reduction in acoustic habitat by anthropogenic noise may emerge as a compounding environmental stressor for an already sensitive taxonomic group.
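The idea of a communication radius that shrinks as ambient noise rises can be sketched as follows. This is not the authors' model, which the abstract does not specify; it is a minimal illustration that assumes spherical spreading loss only (20 log10 of distance), a source level referenced to 1 m, and detection whenever the received level exceeds the noise level by a fixed threshold. All function names and numeric values are hypothetical.

```python
import math


def communication_radius_m(source_level_db, noise_level_db,
                           detection_threshold_db=0.0, ref_distance_m=1.0):
    """Distance at which the received level falls to noise + threshold.

    Assumes spherical spreading only: RL(r) = SL - 20 * log10(r / ref_distance).
    Excess attenuation, barriers, and frequency dependence are ignored.
    """
    excess_db = source_level_db - noise_level_db - detection_threshold_db
    return ref_distance_m * 10.0 ** (excess_db / 20.0)


def active_space_time(radius_m, call_rate_per_min, call_duration_s):
    """Audible area (m^2) weighted by the fraction of time spent calling."""
    duty_cycle = min(1.0, call_rate_per_min * call_duration_s / 60.0)
    return math.pi * radius_m ** 2 * duty_cycle


if __name__ == "__main__":
    # Hypothetical levels in dB, chosen for illustration rather than taken from the study.
    for noise_db in (50.0, 60.0, 70.0):
        r = communication_radius_m(90.0, noise_db)
        print(noise_db, "dB noise ->", round(r, 1), "m,",
              round(active_space_time(r, 30.0, 0.5), 1), "m^2 (time-weighted)")
```

Under these assumptions every 20 dB of added noise cuts the radius by a factor of ten and the audible area by a factor of one hundred, which shows why a caller that cannot raise its source level loses communication space quickly.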
Can Birds Perceive Rhythmic Patterns? A Review and Experiments on a Songbird and a Parrot Species
ten Cate, Carel; Spierings, Michelle; Hubert, Jeroen; Honing, Henkjan
2016-01-01
While humans can easily entrain their behavior with the beat in music, this ability is rare among animals. Yet, comparative studies in non-human species are needed if we want to understand how and why this ability evolved. Entrainment requires two abilities: (1) recognizing the regularity in the auditory stimulus and (2) the ability to adjust the own motor output to the perceived pattern. It has been suggested that beat perception and entrainment are linked to the ability for vocal learning. The presence of some bird species showing beat induction, and also the existence of vocal learning as well as vocal non-learning bird taxa, make them relevant models for comparative research on rhythm perception and its link to vocal learning. Also, some bird vocalizations show strong regularity in rhythmic structure, suggesting that birds might perceive rhythmic structures. In this paper we review the available experimental evidence for the perception of regularity and rhythms by birds, like the ability to distinguish regular from irregular stimuli over tempo transformations and report data from new experiments. While some species show a limited ability to detect regularity, most evidence suggests that birds attend primarily to absolute and not relative timing of patterns and to local features of stimuli. We conclude that, apart from some large parrot species, there is limited evidence for beat and regularity perception among birds and that the link to vocal learning is unclear. We next report the new experiments in which zebra finches and budgerigars (both vocal learners) were first trained to distinguish a regular from an irregular pattern of beats and then tested on various tempo transformations of these stimuli. The results showed that both species reduced the discrimination after tempo transformations. This suggests that, as was found in earlier studies, they attended mainly to local temporal features of the stimuli, and not to their overall regularity. 
However, some individuals of both species showed an additional sensitivity to the more global pattern if some local features were left unchanged. Altogether our study indicates both between and within species variation, in which birds attend to a mixture of local and to global rhythmic features. PMID:27242635
Kucinschi, Bogdan R; Scherer, Ronald C; DeWitt, Kenneth J; Ng, Terry T M
2006-06-01
Flow visualization with smoke particles illuminated by a laser sheet was used to obtain a qualitative description of the air flow structures through a dynamically similar 7.5x symmetric static scale model of the human larynx (divergence angle of 10 deg, minimal diameter of 0.04 cm real life). The acoustic level downstream of the vocal folds was measured by using a condenser microphone. False vocal folds (FVFs) were included. In general, the glottal flow was laminar and bistable. The glottal jet curvature increased with flow rate and decreased with the presence of the FVFs. The glottal exit flow for the lowest flow rate showed a curved jet which remained laminar for all geometries. For the higher flow rates, the jet flow patterns exiting the glottis showed a laminar jet core, transitioning to vortical structures, and leading spatially to turbulent dissipation. This structure was shortened and tightened with an increase in flow rate. The narrow FVF gap lengthened the flow structure and reduced jet curvature via acceleration of the flow. These results suggest that laryngeal flow resistance and the complex jet flow structure exiting the glottis are highly affected by flow rate and the presence of the false vocal folds. Acoustic consequences are discussed in terms of the quadrupole- and dipole-type sound sources due to ordered flow structures.
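What "dynamically similar" implies for a 7.5x model can be sketched with a short calculation. The abstract does not say which dimensionless numbers were matched, so the following assumes the common choice of matching the Reynolds number with air in both the model and real life, and scaling pressures with the square of velocity. The kinematic viscosity, the real-life jet velocity, and the function names are illustrative assumptions, not values from the study.

```python
def reynolds_number(velocity_cm_s, length_cm, kinematic_viscosity_cm2_s=0.15):
    """Re = U * d / nu, with nu defaulting to an approximate value for air."""
    return velocity_cm_s * length_cm / kinematic_viscosity_cm2_s


def scale_to_model(real_velocity_cm_s, real_pressure, real_flow_rate, scale=7.5):
    """Model-side quantities when Re is matched and the same fluid is used.

    Lengths grow by `scale`, so velocity must drop by `scale`; pressure follows
    velocity squared, and volume flow (velocity * area) grows by `scale`.
    """
    return {
        "velocity": real_velocity_cm_s / scale,
        "pressure": real_pressure / scale ** 2,
        "flow_rate": real_flow_rate * scale,
    }


if __name__ == "__main__":
    real_diameter_cm = 0.04          # minimal glottal diameter quoted in the abstract
    real_velocity_cm_s = 3000.0      # hypothetical jet velocity
    model = scale_to_model(real_velocity_cm_s, 562.5, 100.0)
    print("real Re :", reynolds_number(real_velocity_cm_s, real_diameter_cm))
    print("model Re:", reynolds_number(model["velocity"], real_diameter_cm * 7.5))
    print(model)
```

The practical point of such scaling is that the enlarged model runs at much lower velocities and pressures than the real larynx, which makes flow visualization easier while preserving the flow regime.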
Sato, Kiminori; Umeno, Hirohito; Nakashima, Tadashi
2010-01-01
This study aims to clarify the role of the maculae flavae (MFe) during growth and development of the human vocal fold mucosa (VFM). Our current results concerning the MFe in the human newborn, infant, and child VFM are summarized. Newborns already had immature MFe at the same sites as adults. They were composed of dense masses of vocal fold stellate cells (VFSCs), whereas extracellular matrix components were sparse. VFSCs in the newborn MFe had already started synthesizing extracellular matrices (EM). During infancy, the EM synthesized in the MFe appeared in the VFM to initiate the formation of the three-dimensional extracellular matrix structure of the human VFM. During childhood, MFe including VFSCs continued to synthesize EM such as collagenous, reticular, and elastic fibers, and hyaluronic acid (glycosaminoglycan), which are essential for the human VFM as a vibrating tissue. The MFe in newborns, infants and children were related to the growth and development of the human VFM. Human MFe including VFSCs were inferred to be involved in the metabolism of EM, essential for the viscoelasticity of the human VFM, and are considered to be an important structure in the growth and development of the human VFM. Copyright © 2010 S. Karger AG, Basel.
Jansen, David A W A M; Cant, Michael A; Manser, Marta B
2012-12-03
All animals are anatomically constrained in the number of discrete call types they can produce. Recent studies suggest that by combining existing calls into meaningful sequences, animals can increase the information content of their vocal repertoire despite these constraints. Additionally, signalers can use vocal signatures or cues correlated to other individual traits or contexts to increase the information encoded in their vocalizations. However, encoding multiple vocal signatures or cues using the same components of vocalizations usually reduces the signals' reliability. Segregation of information could effectively circumvent this trade-off. In this study we investigate how banded mongooses (Mungos mungo) encode multiple vocal signatures or cues in their frequently emitted graded single syllable close calls. The data for this study were collected on a wild, but habituated, population of banded mongooses. Using behavioral observations and acoustical analysis we found that close calls contain two acoustically different segments. The first being stable and individually distinct, and the second being graded and correlating with the current behavior of the individual, whether it is digging, searching or moving. This provides evidence of Marler's hypothesis on temporal segregation of information within a single syllable call type. Additionally, our work represents an example of an identity cue integrated as a discrete segment within a single call that is independent from context. This likely functions to avoid ambiguity between individuals or receivers having to keep track of several context-specific identity cues. Our study provides the first evidence of segmental concatenation of information within a single syllable in non-human vocalizations. By reviewing descriptions of call structures in the literature, we suggest a general application of this mechanism. 
Our study indicates that temporal segregation and segmental concatenation of vocal signatures or cues is likely a common, but so far neglected, dimension of information coding in animal vocal communication. We argue that temporal segregation of vocal signatures and cues evolves in species where communication of multiple unambiguous signals is crucial, but is limited by the number of call types produced.
Sidtis, Diana; Kreiman, Jody
2011-01-01
The human voice is described in dialogic linguistics as an embodiment of self in a social context, contributing to expression, perception and mutual exchange of self, consciousness, inner life, and personhood. While these approaches are subjective and arise from phenomenological perspectives, scientific facts about personal vocal identity, and its role in biological development, support these views. It is our purpose to review studies of the biology of personal vocal identity (the familiar voice pattern) as providing an empirical foundation for the view that the human voice is an embodiment of self in the social context. Recent developments in the biology and evolution of communication are concordant with these notions, revealing that familiar voice recognition (also known as vocal identity recognition or individual vocal recognition) contributed to survival in the earliest vocalizing species. Contemporary ethology documents the crucial role of familiar voices across animal species in signaling and perceiving internal states and personal identities. Neuropsychological studies of voice reveal multimodal cerebral associations arising across brain structures involved in memory, emotion, attention, and arousal in vocal perception and production, such that the voice represents the whole person. Although its roots are in evolutionary biology, human competence for processing layered social and personal meanings in the voice, as well as personal identity in a large repertory of familiar voice patterns, has achieved an immense sophistication. PMID:21710374
Karajanagi, Sandeep S; Lopez-Guerra, Gerardo; Park, Hyoungshin; Kobler, James B; Galindo, Marilyn; Aanestad, Jon; Mehta, Daryush D; Kumai, Yoshihiko; Giordano, Nicholas; d'Almeida, Anthony; Heaton, James T; Langer, Robert; Herrera, Victoria L M; Faquin, William; Hillman, Robert E; Zeitels, Steven M
2011-03-01
Most cases of irresolvable hoarseness are due to deficiencies in the pliability and volume of the superficial lamina propria of the phonatory mucosa. By using a US Food and Drug Administration-approved polymer, polyethylene glycol (PEG), we created a novel hydrogel (PEG30) and investigated its effects on multiple vocal fold structural and functional parameters. We injected PEG30 unilaterally into 16 normal canine vocal folds with survival times of 1 to 4 months. High-speed videos of vocal fold vibration, induced by intratracheal airflow, and phonation threshold pressures were recorded at 4 time points per subject. Three-dimensional reconstruction analysis of 11.7 T magnetic resonance images and histologic analysis identified 3 cases wherein PEG30 injections were the most superficial, so as to maximally impact vibratory function. These cases were subjected to in-depth analyses. High-speed video analysis of the 3 selected cases showed minimal to no reduction in the maximum vibratory amplitudes of vocal folds injected with PEG30 compared to the non-injected, contralateral vocal fold. All PEG30-injected vocal folds displayed mucosal wave activity with low average phonation threshold pressures. No significant inflammation was observed on microlaryngoscopic examination. Magnetic resonance imaging and histologic analyses revealed time-dependent resorption of the PEG30 hydrogel by phagocytosis with minimal tissue reaction or fibrosis. The PEG30 hydrogel is a promising biocompatible candidate biomaterial to restore form and function to deficient phonatory mucosa, while not mechanically impeding residual endogenous superficial lamina propria.
Teachers' voice use in teaching environments: a field study using ambulatory phonation monitor.
Lyberg Åhlander, Viveka; Pelegrín García, David; Whitling, Susanna; Rydell, Roland; Löfqvist, Anders
2014-11-01
This case-control designed field study examines the vocal behavior in teachers with self-estimated voice problems (VP) and their age- and school-matched voice healthy (VH) colleagues. It was hypothesized that teachers with and teachers without VP use their voices differently regarding fundamental frequency, sound pressure level (SPL), and in relation to the background noise. Teachers with self-estimated VP (n = 14; two males and 12 females) were age and gender matched to VH school colleagues (n = 14; two males and 12 females). The subjects, recruited from an earlier study, had been examined in laryngeal, vocal, hearing, and psychosocial aspects. The fundamental frequency, SPL, and phonation time were recorded with an Ambulatory Phonation Monitor during one representative workday. The teachers reported their activities in a structured diary. The SPL (including teachers' and students' activity and ambient noise) was recorded with a sound level meter; the room temperature and air quality were measured simultaneously. The acoustic properties of the empty classrooms were measured. Teachers with VP behaved vocally different from their VH peers, in particular during teaching sessions. The phonation time was significantly higher in the group with VP, and the number of vibratory cycles differed between the female teachers. The F0 pattern, related to the vocal SPL and room acoustics, differed between the groups. The results suggest a different vocal behavior in subjects with subjective VP and a higher vocal load with fewer possibilities for vocal recovery. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
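The relation between the quantities named in the abstract above (fundamental frequency, phonation time, number of vibratory cycles) can be illustrated with a short sketch. It assumes the monitor yields one F0 estimate per fixed-length frame, with unvoiced frames marked as None or 0; the 50 ms frame length, the function name, and the sample values are assumptions and do not describe the Ambulatory Phonation Monitor's actual processing.

```python
def phonation_summary(f0_frames_hz, frame_s=0.05):
    """Summarize vocal load from per-frame F0 estimates.

    Unvoiced frames are given as None or 0. The number of vibratory cycles is
    accumulated as F0 * frame duration over the voiced frames.
    """
    voiced = [f for f in f0_frames_hz if f is not None and f > 0]
    total_time_s = len(f0_frames_hz) * frame_s
    phonation_time_s = len(voiced) * frame_s
    cycles = sum(f * frame_s for f in voiced)
    return {
        "phonation_time_s": phonation_time_s,
        "phonation_percent": 100.0 * phonation_time_s / total_time_s if total_time_s else 0.0,
        "vibratory_cycles": cycles,
        "mean_f0_hz": sum(voiced) / len(voiced) if voiced else None,
    }


if __name__ == "__main__":
    # Hypothetical frames, not data from the study.
    frames = [200.0, 200.0, None, 0.0, 250.0]
    print(phonation_summary(frames))
```

Because cycles accumulate as F0 multiplied by voiced time, two speakers with the same phonation time can still differ in vibratory cycles if their F0 differs, which is consistent with the abstract reporting both measures separately.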
An Immersed-Boundary Method for Fluid-Structure Interaction in the Human Larynx
NASA Astrophysics Data System (ADS)
Luo, Haoxiang; Zheng, Xudong; Mittal, Rajat; Bielamowicz, Steven
2006-11-01
We describe a novel and accurate computational methodology for modeling the airflow and vocal fold dynamics in the human larynx. The model is useful in helping us gain deeper insight into the complicated biophysics of phonation, and may have potential clinical application in the design and placement of synthetic implants in vocal fold surgery. The numerical solution of the airflow employs a previously developed immersed-boundary solver. However, in order to incorporate the vocal fold into the model, we have developed a new immersed-boundary method that can simulate the dynamics of multi-layered, viscoelastic solids. In this method, a finite-difference scheme is used to approximate the derivatives and ghost cells are defined near the boundary. To impose the traction boundary condition, a third-order polynomial is obtained using weighted least-squares fitting to approximate the function locally. Like its analogue for the flow solver, this immersed-boundary method for the solids has the advantage of simple grid generation, and may be easily implemented on parallel computers. In the talk, we will present the simulation results on both the specified vocal fold motion and the flow-induced vocal fold vibration. Supported by NIDCD Grant R01 DC007125-01A1.
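The local polynomial step mentioned in the abstract above can be illustrated in one dimension. The sketch below fits a cubic to nearby samples by weighted least squares and reads off the value and first derivative at a boundary point. It is not the authors' scheme, which operates on multi-dimensional solid grids with ghost cells; the Gaussian weight function, the length scale, and all names are assumptions introduced for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def weighted_cubic_fit(xs, fs, x0, length_scale):
    """Coefficients a_k of p(x) = sum_k a_k (x - x0)^k, k = 0..3.

    Minimizes sum_i w_i (p(x_i) - f_i)^2 with assumed Gaussian weights
    w_i = exp(-((x_i - x0) / length_scale)^2), via the normal equations.
    With this centered basis, a_0 = p(x0) and a_1 = p'(x0).
    """
    order = 4
    normal = [[0.0] * order for _ in range(order)]
    rhs = [0.0] * order
    for x, f in zip(xs, fs):
        d = x - x0
        w = math.exp(-(d / length_scale) ** 2)
        basis = [d ** k for k in range(order)]
        for i in range(order):
            rhs[i] += w * basis[i] * f
            for j in range(order):
                normal[i][j] += w * basis[i] * basis[j]
    return solve_linear(normal, rhs)


if __name__ == "__main__":
    # Hypothetical grid samples of a smooth field near a boundary point x0 = 0.05.
    xs = [-0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4]
    fs = [math.sin(x) for x in xs]
    a = weighted_cubic_fit(xs, fs, x0=0.05, length_scale=0.3)
    print("value at boundary:", a[0], "derivative at boundary:", a[1])
```

Centering the basis on the boundary point means the fitted value and gradient needed for a boundary condition are simply the first two coefficients, and weighting nearby samples more heavily keeps the approximation local.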
Insights into the role of elastin in vocal fold health and disease
Moore, Jaime
2011-01-01
Elastic fibers are large, complex and surprisingly poorly understood extracellular matrix (ECM) macromolecules. The elastin fiber, generated from a single human gene, elastin (ELN), is a self-assembling integral protein that endows critical mechanical properties to elastic tissues and organs such as the skin, lungs, and arteries. The biology of elastic fibers is complex because they have multiple components, a tightly regulated developmental deposition, a multi-step hierarchical assembly and unique biomechanical functions. Elastin is present in vocal folds, where it plays a pivotal role in the quality of phonation. This review article provides an overview of the genesis of elastin and its wide-ranging structure and function. Specific distribution within the vocal fold lamina propria across the lifespan in normal and pathological states and its contribution to vocal fold biomechanics will be examined. Elastin and elastin-derived molecules are increasingly investigated for their application in tissue engineering. The properties of various elastin-based materials will be discussed and their current and future applications evaluated. A new level of understanding of the biomechanical properties of vocal fold elastin composites and their molecular basis should lead to new strategies for elastic fiber repair and regeneration in aging and disease. PMID:21708449
Laughter as an approach to vocal evolution: The bipedal theory.
Provine, Robert R
2017-02-01
Laughter is a simple, stereotyped, innate, human play vocalization that is ideal for the study of vocal evolution. The basic approach of describing the act of laughter and when we do it has revealed a variety of phenomena of social, linguistic, and neurological significance. Findings include the acoustic structure of laughter, the minimal voluntary control of laughter, the punctuation effect (which describes the placement of laughter in conversation and indicates the dominance of speech over laughter), and the role of laughter in human matching and mating. Especially notable is the use of laughter to discover why humans can speak and other apes cannot. Quadrupeds, including our primate ancestors, have a 1:1 relation between breathing and stride because their thorax must absorb forelimb impacts during running. The direct link between breathing and locomotion limits vocalizations to short, simple utterances, such as the characteristic panting chimpanzee laugh (one sound per inward or outward breath). The evolution of bipedal locomotion freed the respiration system of its support function during running, permitting greater breath control and the selection for human-type laughter (a parsed exhalation), and subsequently the virtuosic, sustained, expiratory vocalization of speech. This is the basis of the bipedal theory of speech evolution.
Zhou, Xin; Fu, Xin; Lin, Chun; Zhou, Xiaojuan; Liu, Jin; Wang, Li; Zhang, Xinwen; Zuo, Mingxue; Fan, Xiaolong; Li, Dapeng; Sun, Yingyu
2017-05-01
Deafening elicits a deterioration of learned vocalization, in both humans and songbirds. In songbirds, learned vocal plasticity has been shown to depend on the basal ganglia-cortical circuit, but the underlying cellular basis remains to be clarified. Using confocal imaging and electron microscopy, we examined the effect of deafening on dendritic spines in avian vocal motor cortex, the robust nucleus of the arcopallium (RA), and investigated the role of the basal ganglia circuit in motor cortex plasticity. We found rapid structural changes to RA dendritic spines in response to hearing loss, accompanied by learned song degradation. In particular, the morphological characters of RA spine synaptic contacts between 2 major pathways were altered differently. However, experimental disruption of the basal ganglia circuit, through lesions in song-specialized basal ganglia nucleus Area X, largely prevented both the observed changes to RA dendritic spines and the song deterioration after hearing loss. Our results provide cellular evidence to highlight a key role of the basal ganglia circuit in the motor cortical plasticity that underlies learned vocal plasticity. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.