acoustic voice analysis: Topics by Science.gov

Sample records for acoustic voice analysis

[Applicability of voice acoustic analysis with vocal loading test to diagnostics of occupational voice diseases].

PubMed

Niebudek-Bogusz, Ewa; Sliwińska-Kowalska, Mariola

2006-01-01

An assessment of the vocal system, as a part of the medical certification of occupational diseases, should be objective and reliable. Therefore, interest in the method of acoustic voice analysis enabling objective assessment of voice parameters is still growing. The aim of the present study was to evaluate the applicability of acoustic analysis with vocal loading test to the diagnostics of occupational voice disorders. The results of acoustic voice analysis were compared using IRIS software for phoniatrics, before and after a 30-min vocal loading test in 35 female teachers with diagnosed occupational voice disorders (group I) and in 31 female teachers with functional dysphonia (group II). In group I, vocal effort produced significant abnormalities in voice acoustic parameters, compared to group II. These included significantly increased mean fundamental frequency (Fo) value (by 11 Hz) and worsened jitter, shimmer and NHR parameters. Also, the percentage of subjects showing abnormalities in voice acoustic analysis was higher in this group. Conducting voice acoustic analysis before and after the vocal loading test makes it possible to objectively confirm irreversible voice impairments in persons with work-related pathologies of the larynx, which is essential for medical certification of occupational voice diseases.
Changes After Voice Therapy in Acoustic Voice Analysis of Chinese Patients With Voice Disorders.

PubMed

Lu, Dan; Chen, Fei; Yang, Hui; Yu, Rong; Zhou, Qi; Zhang, Xinyuan; Ren, Jia; Zheng, Yitao; Zhang, Xiaoyan; Zou, Jian; Wang, Haiyang; Liu, Jun

2018-05-01

This study aimed to evaluate the effects of voice therapy on patients with voice disorders by comparing the acoustic parameter changes before and after treatment. This is a retrospective study. Forty-five female patients with early-stage vocal nodules or polyps, postoperative patients, and patients with chronic laryngitis were divided into three subgroups. Videostroboscopy, acoustic analysis (fundamental frequency, jitter, shimmer, mean harmonics-to-noise ratio), and maximum phonation time (MPT) were assessed before and after treatment. Fifty healthy female volunteers were the control group. After treatment, 24.4% of nodules or polyps had decreased in size, 11.1% of patients with chronic laryngitis and postoperative patients had reduced edema, and the mucosal wave of vocal folds had different degrees of recovery in postoperative patients. All acoustic analysis values and MPT in the patient group were statistically worse than in the control group, except for fundamental frequency before treatment (P > 0.05). After treatment, the acoustic analysis and MPT values were improved. However, the jitter, mean harmonics-to-noise ratio, and MPT values in the patient group were still worse after voice therapy than in the control group (P < 0.05). Most acoustic analysis values can be useful as a complementary tool in diagnosis and assessment of voice disorders; however, it is not recommended to use a single parameter to assess voice quality. Voice therapy can improve voice quality in patients with voice disorders, but a period longer than 8 weeks is recommended for these patients. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic Analysis of Voice in Singers: A Systematic Review

ERIC Educational Resources Information Center

Gunjawate, Dhanshree R.; Ravi, Rohit; Bellur, Rajashekhar

2018-01-01

Purpose: Singers are vocal athletes having specific demands from their voice and require special consideration during voice evaluation. Presently, there is a lack of standards for acoustic evaluation in them. The aim of the present study was to systematically review the available literature on the acoustic analysis of voice in singers. Method: A…
Acoustic Analysis of Voice and Electroglottography in Patients With Laryngopharyngeal Reflux.

PubMed

Ramírez, Daphne Anahit Morales; Jiménez, Víctor Manuel Valadez; López, Xochiquetzal Hernández; Ysunza, Pablo Antonio

2018-05-01

Laryngopharyngeal reflux (LPR) refers to the flow of gastric acid content into the laryngopharynx. It has been reported that 10% of the patients consulting an otolaryngologist present with this condition. Signs of LPR can be identified during flexible or rigid laryngoscopy. The Voice Handicap Index (VHI) is a reliable tool for detecting the impact of voice disorders, and acoustic assessment of voice including acoustic analysis of voice (AAV) and electroglottography (EGG) provide objective data of voice production and voice disorders. This study aimed to describe changes in AAV, EGG, and VHI in patients who present with LPR compared with a matched control group of healthy subjects. Seventeen patients with LPR were studied. A group of healthy subjects matched by age and gender without any history of voice disorder, LPR, or gastroesophageal reflux disease was assembled. Both groups of patients were studied by VHI, flexible laryngoscopy, AAV, and EGG. All patients with LPR demonstrated abnormal VHI values. Shimmer, jitter, open quotient, and irregularity were significantly increased in the patients with LPR. Nonsignificant correlations were found between VHI scores and abnormal acoustic parameters in patients with LPR. Although abnormal acoustic parameters of patients with LPR were not predictive of the overall VHI score, the abnormal acoustic parameters of patients with LPR suggest a decrease in adequate laryngeal control during phonation. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Voice Tremor in Parkinson's Disease: An Acoustic Study.

PubMed

Gillivan-Murphy, Patricia; Miller, Nick; Carding, Paul

2018-01-30

Voice tremor associated with Parkinson disease (PD) has not been characterized. Its relationship with voice disability and disease variables is unknown. This study aimed to evaluate voice tremor in people with PD (pwPD) and a matched control group using acoustic analysis, and to examine correlations with voice disability and disease variables. Acoustic voice tremor analysis was completed on 30 pwPD and 28 age-gender matched controls. Voice disability (Voice Handicap Index), and disease variables of disease duration, Activities of Daily Living (Unified Parkinson's Disease Rating Scale [UPDRS II]), and motor symptoms related to PD (UPDRS III) were examined for relationship with voice tremor measures. Voice tremor was detected acoustically in pwPD and controls with similar frequency. PwPD had a statistically significantly higher rate of amplitude tremor (Hz) than controls (P = 0.001). Rate of amplitude tremor was negatively and significantly correlated with UPDRS III total score (rho -0.509). For pwPD, the magnitude and periodicity of acoustic tremor was higher than for controls without statistical significance. The magnitude of frequency tremor (Mftr%) was positively and significantly correlated with disease duration (rho 0.463). PwPD had higher Voice Handicap Index total, functional, emotional, and physical subscale scores than matched controls (P < 0.001). Voice disability did not correlate significantly with acoustic voice tremor measures. Acoustic analysis enhances understanding of PD voice tremor characteristics, its pathophysiology, and its relationship with voice disability and disease symptomatology. Copyright © 2018 The Voice Foundation. All rights reserved.
Acoustic analysis of normal Saudi adult voices.

PubMed

Malki, Khalid H; Al-Habib, Salman F; Hagr, Abulrahman A; Farahat, Mohamed M

2009-08-01

To determine the acoustic differences between Saudi adult male and female voices, and to compare the acoustic variables of the Multidimensional Voice Program (MDVP) obtained from North American adults to a group of Saudi males and females. A cross-sectional survey of normal adult male and female voices was conducted at King Abdulaziz University Hospital, Riyadh, Kingdom of Saudi Arabia between March 2007 and December 2008. Ninety-five Saudi subjects sustained the vowel /a/ 6 times, and the steady state portion of 3 samples was analyzed and compared with the samples of the KayPentax normative voice database. Significant differences were found between Saudi and North American KayPentax database groups. In the male subjects, 15 of 33 MDVP variables, and 10 of 33 variables in the female subjects were found to be significantly different from the KayPentax database. We conclude that the acoustical differences may reflect laryngeal anatomical or tissue differences between the Saudi and the KayPentax database.
[Acoustic voice analysis using the Praat program: comparative study with the Dr. Speech program].

PubMed

Núñez Batalla, Faustino; González Márquez, Rocío; Peláez González, M Belén; González Laborda, Irene; Fernández Fernández, María; Morato Galán, Marta

2014-01-01

The European Laryngological Society (ELS) basic protocol for functional assessment of voice pathology includes 5 different approaches: perception, videostroboscopy, acoustics, aerodynamics and subjective rating by the patient. In this study we focused on acoustic voice analysis. The purpose of the present study was to correlate the results obtained by the commercial software Dr. Speech and the free software Praat in 2 fields: 1. Narrow-band spectrogram (the presence of noise according to Yanagihara, and the presence of subharmonics) (semi-quantitative). 2. Voice acoustic parameters (jitter, shimmer, harmonics-to-noise ratio, fundamental frequency) (quantitative). We studied a total of 99 voice samples from individuals with Reinke's oedema diagnosed using videostroboscopy. One independent observer used Dr. Speech 3.0 and a second one used the Praat program (Phonetic Sciences, University of Amsterdam). The spectrographic analysis consisted of obtaining a narrow-band spectrogram from the previous digitalised voice samples by the 2 independent observers. They then determined the presence of noise in the spectrogram, using the Yanagihara grades, as well as the presence of subharmonics. As a final result, the acoustic parameters of jitter, shimmer, harmonics-to-noise ratio and fundamental frequency were obtained from the 2 acoustic analysis programs. The results indicated that the sound spectrogram and the numerical values obtained for shimmer and jitter were similar for both computer programs, even though types 1, 2 and 3 voice samples were analysed. The Praat and Dr. Speech programs provide similar results in the acoustic analysis of pathological voices. Copyright © 2013 Elsevier España, S.L. All rights reserved.
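As a rough illustration of the perturbation parameters compared in the record above, the following sketch computes local jitter and local shimmer from cycle-by-cycle measurements. It assumes the glottal periods and peak amplitudes have already been extracted from the recording. The function names and sample values are hypothetical, and the formulas follow the common "local" definitions (mean absolute difference between consecutive cycles divided by the mean), which may differ in detail from the values reported by Praat or Dr. Speech.

```python
from statistics import mean


def relative_perturbation_percent(values):
    """Mean absolute difference between consecutive values, as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def local_jitter_percent(periods_s):
    """Cycle-to-cycle variation of the glottal period (seconds in, percent out)."""
    return relative_perturbation_percent(periods_s)


def local_shimmer_percent(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (any linear unit in, percent out)."""
    return relative_perturbation_percent(amplitudes)


# Hypothetical cycle measurements, for illustration only.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [1.00, 0.90, 1.10, 1.00]
print(f"jitter  = {local_jitter_percent(periods):.2f} %")
print(f"shimmer = {local_shimmer_percent(amplitudes):.2f} %")
```

Because both measures depend on how cycles are detected, two programs can agree on these formulas and still report slightly different numbers for the same recording.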
Standardization of pitch-range settings in voice acoustic analysis.

PubMed

Vogel, Adam P; Maruff, Paul; Snyder, Peter J; Mundt, James C

2009-05-01

Voice acoustic analysis is typically a labor-intensive, time-consuming process that requires the application of idiosyncratic parameters tailored to individual aspects of the speech signal. Such processes limit the efficiency and utility of voice analysis in clinical practice as well as in applied research and development. In the present study, we analyzed 1,120 voice files, using standard techniques (case-by-case hand analysis), taking roughly 10 work weeks of personnel time to complete. The results were compared with the analytic output of several automated analysis scripts that made use of preset pitch-range parameters. After pitch windows were selected to appropriately account for sex differences, the automated analysis scripts reduced processing time of the 1,120 speech samples to less than 2.5 h and produced results comparable to those obtained with hand analysis. However, caution should be exercised when applying the suggested preset values to pathological voice populations.
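The role of preset pitch windows described above can be sketched with a very simple autocorrelation pitch estimator. This is only an illustration under stated assumptions: the window values in `PITCH_WINDOWS` are hypothetical and are not the presets from the study, the estimator is far cruder than the ones used in analysis software, and the test signals are synthetic sine tones.

```python
import math

# Hypothetical pitch windows in Hz; illustrative only, not the study's presets.
PITCH_WINDOWS = {"male": (70.0, 250.0), "female": (100.0, 300.0)}


def estimate_f0(samples, sample_rate, floor_hz, ceiling_hz):
    """Return the F0 (Hz) whose lag maximizes the autocorrelation inside the pitch window."""
    min_lag = math.ceil(sample_rate / ceiling_hz)
    max_lag = min(math.floor(sample_rate / floor_hz), len(samples) - 1)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        raise ValueError("signal too short for the requested pitch window")
    return sample_rate / best_lag


def sine(freq_hz, sample_rate=16000, duration_s=0.1):
    """Synthetic test tone standing in for a sustained vowel."""
    count = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def analyze(samples, sample_rate, sex):
    """Batch-style entry point: look up the preset window and estimate F0."""
    floor_hz, ceiling_hz = PITCH_WINDOWS[sex]
    return estimate_f0(samples, sample_rate, floor_hz, ceiling_hz)


print(round(analyze(sine(200.0), 16000, "female"), 1))
print(round(analyze(sine(120.0), 16000, "male"), 1))
```

When the window excludes the true fundamental, this kind of estimator can lock onto a subharmonic: analysing the 200 Hz tone with a 70 to 150 Hz window returns roughly 100 Hz. That is one reason window selection matters in automated scripts.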
Acoustic analysis of voice in children with cleft palate and velopharyngeal insufficiency.

PubMed

Villafuerte-Gonzalez, Rocio; Valadez-Jimenez, Victor M; Hernandez-Lopez, Xochiquetzal; Ysunza, Pablo Antonio

2015-07-01

Acoustic analysis of voice can provide instrumental data concerning vocal abnormalities. These findings can be used for monitoring clinical course in cases of voice disorders. Cleft palate severely affects the structure of the vocal tract. Hence, voice quality can also be affected. The objective was to study whether the main acoustic parameters of voice, including fundamental frequency, shimmer and jitter, are significantly different in patients with a repaired cleft palate, as compared with normal children without speech, language and voice disorders. Fourteen patients with repaired unilateral cleft lip and palate and persistent or residual velopharyngeal insufficiency (VPI) were studied. A control group was assembled with healthy volunteer subjects matched by age and gender. Hypernasality and nasal emission were perceptually assessed in patients with VPI. Size of the gap as assessed by videonasopharyngoscopy was classified in patients with VPI. Acoustic voice parameters, including fundamental frequency (F0), shimmer and jitter, were compared between patients with VPI and control subjects. F0 was significantly higher in male patients as compared with male controls. Shimmer was significantly higher in patients with VPI regardless of gender. Moreover, patients with moderate VPI showed a significantly higher shimmer perturbation, regardless of gender. Although future research regarding voice disorders in patients with VPI is needed, at the present time it seems reasonable to include strategies for voice therapy in the speech and language pathology intervention plan for patients with VPI. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Feigned Depression and Feigned Sleepiness: A Voice Acoustical Analysis

ERIC Educational Resources Information Center

Reilly, Nicole; Cannizzaro, Michael S.; Harel, Brian T.; Snyder, Peter J.

2004-01-01

We sought to profile the voice acoustical correlates of simulated, or feigned depression by neurologically and psychiatrically healthy control subjects. We also sought to identify the voice acoustical correlates of feigned sleepiness for these same subjects. Twenty-two participants were asked to speak freely about a cartoon, to count from 1 to 10,…
Evaluating iPhone recordings for acoustic voice assessment.

PubMed

Lin, Emily; Hornibrook, Jeremy; Ormond, Tika

2012-01-01

This study examined the viability of using iPhone recordings for acoustic measurements of voice quality. Acoustic measures were compared between voice signals simultaneously recorded from 11 normal speakers (6 females and 5 males) through an iPhone (model A1303, Apple, USA) and a comparison recording system. Comparisons were also conducted between the pre- and post-operative voices recorded from 10 voice patients (4 females and 6 males) through the iPhone. Participants were aged between 27 and 79 years. Measures from iPhone and comparison signals were found to be highly correlated. Findings of the effects of vowel type on the selected measures were consistent between the two recording systems and congruent with previous findings. Analysis of the patient data revealed that a selection of acoustic measures, such as vowel space area and voice perturbation measures, consistently demonstrated a positive change following phonosurgery. The present findings indicated that the iPhone device tested was useful for tracking voice changes for clinical management. Preliminary findings regarding factors such as gender and type of pathology suggest that intra-subject, instead of norm-referenced, comparisons of acoustic measures would be more useful in monitoring the progression of a voice disorder or tracking the treatment effect. Copyright © 2012 S. Karger AG, Basel.
Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening.

PubMed

Uloza, Virgilijus; Padervinskis, Evaldas; Vegiene, Aurelija; Pribuisiene, Ruta; Saferis, Viktoras; Vaiciukynas, Evaldas; Gelzinis, Adas; Verikas, Antanas

2015-11-01

The objective of this study is to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and to investigate the utility of SP voice recordings for voice screening. Voice samples of sustained vowel /a/ obtained from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: oral AKG Perception 220 microphone and SP Samsung Galaxy Note3 microphone. Acoustic voice signal data were measured for fundamental frequency, jitter and shimmer, normalized noise energy (NNE), signal to noise ratio and harmonic to noise ratio using Dr. Speech software. Discriminant analysis-based Correct Classification Rate (CCR) and Random Forest Classifier (RFC) based Equal Error Rate (EER) were used to evaluate the feasibility of acoustic voice parameters classifying normal and pathological voice classes. The Lithuanian version of the Glottal Function Index (LT_GFI) questionnaire was utilized for self-assessment of the severity of voice disorder. The correlations of acoustic voice parameters obtained with two types of microphones were statistically significant and strong (r = 0.73-1.0) for all measurements. When classifying into normal/pathological voice classes, the Oral-NNE revealed the CCR of 73.7% and the pair of SP-NNE and SP-shimmer parameters revealed CCR of 79.5%. However, fusion of the results obtained from SP voice recordings and GFI data provided the CCR of 84.60%, and RFC revealed the EER of 7.9%. In conclusion, measurements of acoustic voice parameters using SP microphone were shown to be reliable in clinical settings, demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, and validated the suitability of the SP microphone signal for the task of automatic voice analysis and screening.
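The two evaluation metrics named in the record above can be sketched in a few lines. This is a minimal illustration, not the study's discriminant analysis or random forest pipeline: the function names, the label coding (1 = pathological, 0 = normal), the scores, and the 0.5 decision threshold are all hypothetical, and the equal error rate is approximated by sweeping thresholds over the observed scores.

```python
def correct_classification_rate(true_labels, predicted_labels):
    """Fraction of voices assigned to the correct class."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def equal_error_rate(scores, labels):
    """Approximate EER: error rate at the threshold where false alarms and misses are closest.

    Assumes labels are 1 (pathological) or 0 (normal) and that a higher score
    means "more likely pathological".
    """
    normal = [s for s, y in zip(scores, labels) if y == 0]
    pathological = [s for s, y in zip(scores, labels) if y == 1]
    if not normal or not pathological:
        raise ValueError("both classes must be present")
    best_gap, best_eer = None, None
    for threshold in sorted(set(scores)):
        false_alarm = sum(s >= threshold for s in normal) / len(normal)
        miss = sum(s < threshold for s in pathological) / len(pathological)
        gap = abs(false_alarm - miss)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (false_alarm + miss) / 2
    return best_eer


# Hypothetical classifier scores, for illustration only.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
predictions = [1 if s >= 0.5 else 0 for s in scores]
print("CCR:", correct_classification_rate(labels, predictions))
print("EER:", equal_error_rate(scores, labels))
```

CCR depends on one chosen decision threshold, whereas EER summarizes the trade-off between the two error types across thresholds, which is why the two figures are reported side by side.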
Acoustic and perceptual characteristics of the voice in patients with vocal polyps after surgery and voice therapy.

PubMed

Petrovic-Lazic, Mirjana; Jovanovic, Nadica; Kulic, Milan; Babac, Snezana; Jurisic, Vladimir

2015-03-01

The aim of the study was to assess the effect of endolaryngeal phonomicrosurgery (EPM) and voice therapy in patients with vocal fold polyps using perceptual and acoustic analysis before and after both therapies. The acoustic tests and perceptual evaluation of voice were carried out on 41 female patients with vocal fold polyp before and after EPM and voice therapy. Both therapy strategies were performed. The acoustic parameters used were jitter percent (Jitt), pitch perturbation quotient (PPQ), shimmer percent (Shim), amplitude perturbation quotient (APQ), fundamental frequency variation (vF0), noise-to-harmonic ratio (NHR), and Voice Turbulence Index (VTI). For perceptual evaluation, the GRB scale was used. Results indicated higher values of investigated parameters in the patient group than in the control group (P < 0.01). Good correlation between the perceptual hoarseness factors of the GRB scale and objective acoustic voice parameters was observed. All analyzed acoustic parameters improved after the phonomicrosurgery and voice therapy and tended to approach the values of the control group. For Jitt percent, Shim percent, vF0, VTI, and NHR, there were statistically significant differences. Perceptual voice evaluation revealed statistically significantly (P < 0.01) decreased ratings of G (grade), R (rough) and B (breathy) after surgery and voice therapy. Our data indicated that both acoustic and perceptual characteristics of voice in patients with vocal polyps significantly improved after phonomicrosurgical and voice treatment. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustical analysis of trained and untrained singers onsite before and after prolonged voice use

NASA Astrophysics Data System (ADS)

Jackson, Christophe E.

Controlled acoustic environments are important in voice research. Recording environment affects the quality of voice recordings. While sound booths and anechoic chambers are examples of controlled acoustic environments widely used in research, they are both costly and not portable. The long-term goal of this project is to compare the voice usage and efficiency of trained and untrained singers onsite immediately before and after vocal performance. The specific goal of this project is the further development of a Portable Sound Booth (PSB) and standardization of onsite voice recording procedures under controlled conditions. We hypothesized that the simple and controlled acoustic environment provided by the PSB would enable consistent, reliable onsite voice recordings and that the immediate differences resulting from voice usage would be measurable. Research has suggested that it would be possible to conduct onsite voice recordings. Proof of concept research titled "Construction and Characterization of a Portable Sound Booth for Onsite Measurement" was conducted before initiating the full research effort. Preliminary findings revealed that: (1) it was possible to make high-quality voice recordings onsite, (2) the use of a Portable Sound Booth (PSB) required further acoustic characterization of its inherent acoustic properties, and (3) testable differences before and after performance were evident. The specific aims were to (1) develop and refine onsite objective voice measurements in the PSB and (2) evaluate use of the PSB to measure voice quality changes before and after voice usage.
Combined Use of Standard and Throat Microphones for Measurement of Acoustic Voice Parameters and Voice Categorization.

PubMed

Uloza, Virgilijus; Padervinskis, Evaldas; Uloziene, Ingrida; Saferis, Viktoras; Verikas, Antanas

2015-09-01

The aim of the present study was to evaluate the reliability of the measurements of acoustic voice parameters obtained simultaneously using oral and contact (throat) microphones and to investigate the utility of the combined use of these microphones for voice categorization. Voice samples of sustained vowel /a/ obtained from 157 subjects (105 healthy and 52 pathological voices) were recorded in a soundproof booth simultaneously through two microphones: oral AKG Perception 220 microphone (AKG Acoustics, Vienna, Austria) and contact (throat) Triumph PC microphone (Clearer Communications, Inc, Burnaby, Canada) placed on the lamina of the thyroid cartilage. Acoustic voice signal data were measured for fundamental frequency, percent of jitter and shimmer, normalized noise energy, signal-to-noise ratio, and harmonic-to-noise ratio using Dr. Speech software (Tiger Electronics, Seattle, WA). The correlations of acoustic voice parameters in vocal performance were statistically significant and strong (r = 0.71-1.0) for all functional measurements obtained with the two microphones. When classifying into healthy-pathological voice classes, the oral-shimmer revealed the correct classification rate (CCR) of 75.2% and the throat-jitter revealed CCR of 70.7%. However, the combination of both throat and oral microphones allowed identifying a set of three voice parameters: throat-signal-to-noise ratio, oral-shimmer, and oral-normalized noise energy, which provided the CCR of 80.3%. The measurements of acoustic voice parameters using a combination of oral and throat microphones were shown to be reliable in clinical settings and demonstrated high CCRs when distinguishing the healthy and pathological voice patient groups. Our study validates the suitability of the throat microphone signal for the task of automatic voice analysis for the purpose of voice screening. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic Analysis of the Voiced-Voiceless Distinction in Dutch Tracheoesophageal Speech

ERIC Educational Resources Information Center

Jongmans, Petra; Wempe, Ton G.; van Tinteren, Harm; Hilgers, Frans J. M.; Pols, Louis C. W.; van As-Brooks, Corina J.

2010-01-01

Purpose: Confusions between voiced and voiceless plosives and voiced and voiceless fricatives are common in Dutch tracheoesophageal (TE) speech. This study investigates (a) which acoustic measures are found to convey a correct voicing contrast in TE speech and (b) whether different measures are found in TE speech than in normal laryngeal (NL)…
Integrating voice evaluation: correlation between acoustic and audio-perceptual measures.

PubMed

Vaz Freitas, Susana; Melo Pestana, Pedro; Almeida, Vítor; Ferreira, Aníbal

2015-05-01

This article aims to establish correlations between acoustic and audio-perceptual measures using the GRBAS scale with respect to four different voice analysis software programs. The study design was exploratory and transversal. A total of 90 voice records were collected and analyzed with the Dr. Speech (Tiger Electronics, Seattle, WA), Multidimensional Voice Program (Kay Elemetrics, NJ, USA), PRAAT (University of Amsterdam, The Netherlands), and Voice Studio (Seegnal, Oporto, Portugal) software programs. The acoustic measures were correlated to the audio-perceptual parameters of the GRBAS, as rated by 10 experts. The predictive value of the acoustic measurements related to the audio-perceptual parameters exhibited magnitudes ranging from weak (adjusted R² = 0.17) to moderate (adjusted R² = 0.71). The parameter exhibiting the highest correlation magnitude was B (Breathiness), whereas the weakest correlation magnitudes were found for A (Asthenia) and S (Strain). The acoustic measures with stronger predictive values were local shimmer, harmonics-to-noise ratio, APQ5 shimmer, and PPQ5 jitter, with different magnitudes for each one of the studied software programs. Some acoustic measures were identified as significant predictors of GRBAS parameters, but they differ among software programs. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustics of the trained versus untrained singing voice.

PubMed

Howard, David M

2009-06-01

Acoustic voice analysis is now widely available on today's multimedia computers and knowledge of the acoustics of the trained and untrained singing voice has advanced dramatically in recent years. New techniques have emerged that are providing clearer representations of aspects of the physiology of voice function and a greater understanding of the differences between the voices of untrained and trained singers. Improvements in endoscope technology are changing understanding of vocal fold function and videokymography provides a new way of interpreting the output; some new and interesting possibilities are emerging. Larynx height variation is a feature of untrained singing and singing in different styles and its measurement has been inaccurate hitherto; perhaps the laryngoaltimeter will provide a solution. Magnetic resonance imaging is now a vital tool for vocal tract shape measurement but a new bio-inspired computing is offering a possible alternative. Differences between an untrained and trained singing voice lie in one or more of breathing technique, larynx settings or vocal tract settings. Measurement techniques in each of these areas are important to provide data on the singing voice, and accurate data are essential for natural personalized electronic voice synthesis in the future.
Acoustic analysis of the singing and speaking voice in singing students.

PubMed

Lundy, D S; Roy, S; Casiano, R R; Xue, J W; Evans, J

2000-12-01

The singing power ratio (SPR) is an objective means of quantifying the singer's formant. SPR has been shown to differentiate trained singers from nonsingers and sung from spoken tones. This study was designed to evaluate SPR and acoustic parameters in singing students to determine if the singer-in-training has an identifiable difference between sung and spoken voices. Digital audio recordings were made of both sung and spoken vowel sounds in 55 singing students for acoustic analysis. SPR values were not significantly different between the sung and spoken samples. Shimmer and noise-to-harmonic ratio were significantly higher in spoken samples. SPR analysis may provide an objective tool for monitoring the student's progress.
Acoustic Analysis of the Tremulous Voice: Assessing the Utility of the Correlation Dimension and Perturbation Parameters

ERIC Educational Resources Information Center

Shao, Jun; MacCallum, Julia K.; Zhang, Yu; Sprecher, Alicia; Jiang, Jack J.

2010-01-01

Acoustic analysis may provide a useful means to quantitatively characterize the tremulous voice. Signals were obtained from 25 subjects with diagnoses of either Parkinson's disease or vocal polyps exhibiting vocal tremor. These were compared to signals from 24 subjects with normal voices. Signals were analyzed via correlation dimension and several…

[Acoustic and aerodynamic characteristics of the oesophageal voice].

PubMed

Vázquez de la Iglesia, F; Fernández González, S

2005-12-01

The aim of the study is to determine the physiology and pathophysiology of esophageal voice according to objective aerodynamic and acoustic parameters (quantitative and qualitative parameters). The subjects were 33 laryngectomized patients (all male) who underwent an aerodynamic, acoustic and perceptual protocol. There is a statistical association between acoustic and aerodynamic qualitative parameters (phonation flow chart type, sound spectrum, perceptual analysis) and quantitative parameters (neoglottic pressure, phonation flow, phonation time, fundamental frequency, maximum intensity sound level, speech rate). Nevertheless, such observations do not always translate into practical resources for clinical practice. We consider that the findings may, pragmatically, add new resources for more effective vocal rehabilitation of these patients. The method we applied gives a good understanding of the physiology of esophageal voice and also supports rehabilitation aimed at improving oral communication skills in the laryngectomee population.
Application of the acoustic voice quality index for objective measurement of dysphonia severity.

PubMed

Núñez-Batalla, Faustino; Díaz-Fresno, Estefanía; Álvarez-Fernández, Andrea; Muñoz Cordero, Gabriela; Llorente Pendás, José Luis

Over the past several decades, many acoustic parameters have been studied for their sensitivity to dysphonia and their ability to measure it. However, current acoustic measures might not be sensitive measures of perceived voice quality. A meta-analysis that evaluated the relationship between perceived overall voice quality and several acoustic-phonetic correlates identified measures that do not rely on the extraction of the fundamental period, such as the measures derived from the cepstrum, and that can be used in sustained vowel as well as continuous speech samples. A specific and recently developed method to quantify the severity of overall dysphonia is the acoustic voice quality index (AVQI), a multivariate construct that combines multiple acoustic markers to yield a single number that correlates reasonably with overall vocal quality. This research is based on one pool of voice recordings collected in two sets of subjects: 60 vocally normal and 58 voice disordered participants. A sustained vowel and a sample of connected speech were recorded and analyzed to obtain the six parameters included in the AVQI using the program Praat. Statistical analysis was completed using SPSS for Windows, version 12.0. Correlation between perception of overall voice quality and AVQI: A significant difference exists (t(95) = 9.5; p<.000) between normal and dysphonic voices. The findings of this study demonstrate the clinical feasibility of the AVQI as a measure of dysphonia severity. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Otorrinolaringología y Cirugía de Cabeza y Cuello. All rights reserved.
Laryngoscopic, acoustic, perceptual, and functional assessment of voice in rock singers.

PubMed

Guzman, Marco; Barros, Macarena; Espinoza, Fernanda; Herrera, Alejandro; Parra, Daniela; Muñoz, Daniel; Lloyd, Adam

2013-01-01

The present study aimed to vocally assess a group of rock singers who use growl voice and reinforced falsetto. A group of 21 rock singers and a control group of 18 pop singers were included. Singing and speaking voice was assessed through acoustic, perceptual, functional and laryngoscopic analysis. No significant differences were observed between groups in most of the analyses. Acoustic and perceptual analysis of the experimental group demonstrated normality of speaking voice. Endoscopic evaluation showed that most rock singers presented during singing voice a high vertical laryngeal position, pharyngeal compression and laryngeal supraglottic compression. Supraglottic activity during speaking voice tasks was also observed. However, overall vocal fold integrity was demonstrated in most of the participants. Slightly abnormal observations were demonstrated in few of them. Singing voice handicap index revealed that the most affected variable was the physical sphere, followed by the social and emotional spheres. Although growl voice and reinforced falsetto represent laryngeal and pharyngeal hyperfunctional activity, they did not seem to contribute to the presence of any major vocal fold disorder in our subjects. Nevertheless, we cannot rule out the possibility that more evident vocal fold disorders could be found in singers who use these techniques more often and during a longer period of time.
Fluid-acoustic interactions and their impact on pathological voiced speech

NASA Astrophysics Data System (ADS)

Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.

2011-11-01

Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically-realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented into a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
Acoustic markers to differentiate gender in prepubescent children's speaking and singing voice.

PubMed

Guzman, Marco; Muñoz, Daniel; Vivero, Martin; Marín, Natalia; Ramírez, Mirta; Rivera, María Trinidad; Vidal, Carla; Gerhard, Julia; González, Catalina

2014-10-01

Investigation sought to determine whether there is any acoustic variable to objectively differentiate gender in children with normal voices. A total of 30 children, 15 boys and 15 girls, with perceptually normal voices were examined. They were between 7 and 10 years old (mean: 8.1, SD: 0.7 years). Subjects were required to perform the following phonatory tasks: (1) to phonate sustained vowels [a:], [i:], [u:], (2) to read a phonetically balanced text, and (3) to sing a song. Acoustic analysis included long-term average spectrum (LTAS), fundamental frequency (F0), speaking fundamental frequency (SFF), equivalent continuous sound level (Leq), linear predictive code (LPC) to obtain formant frequencies, perturbation measures, harmonic to noise ratio (HNR), and Cepstral peak prominence (CPP). Auditory perceptual analysis was performed by four blinded judges to determine gender. No significant gender-related differences were found for most acoustic variables. Perceptual assessment showed good intra and inter rater reliability for gender. Cepstrum for [a:], alpha ratio in text, shimmer for [i:], F3 in [a:], and F3 in [i:], were the parameters that composed the multivariate logistic regression model to best differentiate male and female children's voices. Since perceptual assessment reliably detected gender, it is likely that other acoustic markers (not evaluated in the present study) are able to make clearer gender differences. For example, gender-specific patterns of intonation may be a more accurate feature for differentiating gender in children's voices. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Acoustic-Perceptual Correlates of Voice in Indian Hindu Purohits.

PubMed

Balasubramanium, Radish Kumar; Karuppali, Sudhin; Bajaj, Gagan; Shastry, Anuradha; Bhat, Jayashree

2018-05-16

Purohit, in the Indian religious context (Hindu), means priest. Purohits are professional voice users who use their voice while performing regular worships and rituals in temples and homes. Any deviations in their voice can have an impact on their profession. Hence, there is a need to investigate the voice characteristics of purohits using perceptual and acoustic analyses. A total of 44 men in the age range of 18-30 years were divided into two groups. Group 1 consisted of purohits who were trained since childhood (n = 22) in the traditional gurukul system. Group 2 (n = 22) consisted of normal controls. Phonation and spontaneous speech samples were obtained from all the participants at a comfortable pitch and loudness. The Praat software (Version 5.3.31) and the Speech tool were used to analyze the traditional acoustic and cepstral parameters, respectively, whereas GRBAS was used to perceptually evaluate the voice. Results of the independent t test revealed no significant differences across the groups for perceptual and traditional acoustic measures except for intensity, which was significantly higher in purohits' voices at P < 0.05. However, the cepstral values (cepstral peak prominence and smoothened cepstral peak prominence) were much higher in purohits than in controls at P < 0.05. Results revealed that purohits did not exhibit vocal deviations as analyzed through perceptual and acoustic parameters. In contrast, cepstral measures were higher in Indian Hindu purohits in comparison with normal controls, suggestive of a higher degree of harmonic organization in purohits. Further studies are required to analyze the physiological correlates of increased cepstral measures in purohits' voices. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Correlation Between Acoustic Measurements and Self-Reported Voice Disorders Among Female Teachers.

PubMed

Lin, Feng-Chuan; Chen, Sheng Hwa; Chen, Su-Chiu; Wang, Chi-Te; Kuo, Yu-Ching

2016-07-01

Many studies focused on teachers' voice problems and most of them were conducted using questionnaires, whereas little research has investigated the relationship between self-reported voice disorders and objective quantification of voice. This study intends to explore the relationship of acoustic measurements according to self-reported symptoms and its predictive value of future dysphonia. This is a case-control study. Voice samples of 80 female teachers were analyzed, including 40 self-reported voice disorders (VD) and 40 self-reported normal voice (NVD) subjects. The acoustic measurements included jitter, shimmer, and noise-to-harmonics ratio (NHR). Levene's t test and logistic regression were used to analyze the differences between VD and NVD and the relationship between self-reported voice conditions and the acoustic measurements. To examine whether acoustic measurements can be used to predict further voice disorders, we applied a receiver operating characteristic (ROC) curve to determine the cutoff values and the associated sensitivity and specificity. The results showed that jitter, shimmer, and the NHR of VD were significantly higher than those of NVD. Among the parameters, the NHR and shimmer demonstrated the highest correlation with self-reported voice disorders. By using the NHR ≥0.138 and shimmer ≥0.470 dB as the cutoff values, the ROC curve displayed 72.5% of sensitivity and 75% of specificity, and the overall positive predictive value for subsequent dysphonia achieved 60%. This study demonstrated a significant correlation between acoustic measurements and self-reported dysphonic symptoms. NHR and ShdB are two acoustic parameters that are more able to reflect vocal abnormalities and, probably, to predict subsequent subjective voice disorder. Future research recruiting more subjects in other occupations and genders shall validate the preliminary results revealed in this study. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All
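As a rough illustration of how cutoff values such as those in the preceding abstract translate into sensitivity and specificity, the sketch below applies the two reported thresholds (NHR ≥ 0.138, shimmer ≥ 0.470 dB) to a handful of made-up measurements. The abstract does not say how the two cutoffs were combined, so the either-threshold rule, the function names, and every data value here are assumptions for illustration only, not the authors' procedure or data.

```python
# Cutoffs quoted in the abstract above.
NHR_CUTOFF = 0.138
SHIMMER_DB_CUTOFF = 0.470


def flag_voice(nhr, shimmer_db):
    """Assumed rule: flag a voice if either measure reaches its cutoff."""
    return nhr >= NHR_CUTOFF or shimmer_db >= SHIMMER_DB_CUTOFF


def sensitivity_specificity(samples):
    """samples: iterable of (nhr, shimmer_db, has_disorder) tuples.

    Needs at least one disordered and one non-disordered sample.
    """
    tp = fn = tn = fp = 0
    for nhr, shimmer_db, has_disorder in samples:
        flagged = flag_voice(nhr, shimmer_db)
        if has_disorder and flagged:
            tp += 1
        elif has_disorder:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical measurements, invented for this sketch.
HYPOTHETICAL_SAMPLES = [
    (0.150, 0.300, True),   # flagged by NHR: true positive
    (0.120, 0.500, True),   # flagged by shimmer: true positive
    (0.160, 0.520, True),   # flagged by both: true positive
    (0.110, 0.400, True),   # not flagged: false negative
    (0.100, 0.350, False),  # not flagged: true negative
    (0.125, 0.410, False),  # not flagged: true negative
    (0.090, 0.300, False),  # not flagged: true negative
    (0.140, 0.380, False),  # flagged by NHR: false positive
]

sens, spec = sensitivity_specificity(HYPOTHETICAL_SAMPLES)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Moving either cutoff trades sensitivity against specificity, which is the trade-off a ROC curve traces out across all candidate thresholds.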
Mobile Communication Devices, Ambient Noise, and Acoustic Voice Measures.

PubMed

Maryn, Youri; Ysenbaert, Femke; Zarowski, Andrzej; Vanspauwen, Robby

2017-03-01

The ability to move with mobile communication devices (MCDs; i.e., smartphones and tablet computers) may induce differences in microphone-to-mouth positioning and use in noise-packed environments, and thus influence reliability of acoustic voice measurements. This study investigated differences in various acoustic voice measures between six recording systems in backgrounds with low and increasing noise levels. One chain of continuous speech and sustained vowel from 50 subjects with voice disorders (all separated by silence intervals) was radiated and re-recorded in an anechoic chamber with five MCDs and one high-quality recording system. These recordings were acquired in one condition without ambient noise and in four conditions with increased ambient noise. A total of 10 acoustic voice markers were obtained in the program Praat. Differences between MCDs and noise conditions were assessed with Friedman repeated-measures test and posthoc Wilcoxon signed-rank tests, both for related samples, after Bonferroni correction. (1) Except for median fundamental frequency and seven nonsignificant differences, MCD samples have significantly higher acoustic markers than clinical reference samples in minimal environmental noise. (2) Except for median fundamental frequency, jitter local, and jitter rap, all acoustic measures on samples recorded with the reference system experienced significant influence from room noise levels. Fundamental frequency is resistant to recording system, environmental noise, and their combination. All other measures, however, were impacted by both recording system and noise condition, and especially by their combination, often already in the reference/baseline condition without added ambient noise. Caution is therefore warranted regarding implementation of MCDs as clinical recording tools, particularly when applied for treatment outcomes assessments. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
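The Bonferroni correction mentioned in the preceding abstract divides the family-wise significance level by the number of comparisons. The abstract states neither the significance level nor how many post hoc tests were run, so the sketch below assumes a family-wise level of 0.05 and that every pair of conditions was compared; both are assumptions, and the function name is hypothetical.

```python
from math import comb


def bonferroni_alpha(n_conditions, family_alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha).

    Assumes all pairs of conditions are compared; family_alpha=0.05 is
    an assumed default, not a value taken from the abstract.
    """
    n_comparisons = comb(n_conditions, 2)
    return n_comparisons, family_alpha / n_comparisons


# Six recording systems (five MCDs plus one reference system).
n_tests, alpha = bonferroni_alpha(6)
print(f"{n_tests} comparisons, per-test alpha = {alpha:.5f}")
```

If only the five MCD-versus-reference comparisons were tested instead, the divisor would be 5 rather than 15, giving a less strict per-test threshold.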
Voice Register in Mon: Acoustics and Electroglottography

PubMed Central

Abramson, Arthur S.; Tiede, Mark K.; Luangthongkum, Theraphan

2016-01-01

Mon is spoken in villages in Thailand and Myanmar. The dialect of Ban Nakhonchum, Thailand has two voice registers, modal and breathy; these phonation types, along with other phonetic properties, distinguish minimal pairs. Four native speakers of this dialect recorded repetitions of 14 randomized words (seven minimal pairs) for acoustic analysis. We used a subset of these pairs in a listening test to verify the perceptual robustness of the register distinction. Acoustic analysis found significant differences in noise component, spectral slope, and fundamental frequency. In a subsequent session four speakers were also recorded using electroglottography (EGG), which showed systematic differences in the contact quotient (CQ). The salience of these properties in maintaining the register distinction is discussed in the context of possible tonogenesis for this language. PMID:26636544
Measurements of the Acoustic Speaking Voice After Vocal Warm-up and Cooldown in Choir Singers.

PubMed

Onofre, Fernanda; Prado, Yuka de Almeida; Rojas, Gleidy Vannesa E; Garcia, Denny Marco; Aguiar-Ricz, Lílian

2017-01-01

The aim of this study was to evaluate the acoustic measurements of the vowel /a/ in modal recording before and after a singing voice resistance test and after 30 minutes of absolute rest in female choir singers. This is a prospective cohort study. A total of 13 soprano choir singers with experience in choir singing were evaluated through analysis of acoustic voice parameters at three points in time: before continuous use of the voice, after vocal warm-up and a singing test 60 minutes in duration respecting the pauses for breathing, and after vocal cooldown and an absolute voice rest for 30 minutes. The fundamental frequency increased after the voice resistance test (P = 0.012) and remained elevated after the 30 minutes of voice rest (P = 0.01). The jitter decreased after the voice resistance test (P = 0.02) and after the 30 minutes of voice rest. A significant difference was detected for the acoustic voice parameters relative average perturbation (RAP), (P = 0.05), and pitch perturbation quotient (PPQ), (P = 0.04), compared with the initial time point. The fundamental frequency increased after 60 minutes of singing and remained elevated after vocal cooldown and absolute rest for 30 minutes, proving an efficient parameter for identifying the changes inherent to voice demand during singing. Copyright © 2017. Published by Elsevier Inc.
Acoustic changes in voice after tonsillectomy.

PubMed

Saida, H; Hirose, H

1996-01-01

The vocal tract from the glottis to the lips is considered to be a resonator and the voice is changeable depending upon the shape of the vocal tract. In this report, we examined the change in pharyngeal size and acoustic feature of voice after tonsillectomy. Subjects were 20 patients. The distance between both anterior pillars (glossopalatine arches), and between both posterior pillars (pharyngopalatine arches) was measured weekly. For acoustic measurements, the five Japanese vowels and Japanese conversational sentences were recorded and analyzed. The distance between both anterior pillars became wider 2 weeks postoperatively, and tended to become narrower thereafter. The distance between both posterior pillars became wider even after 4 weeks postoperatively. No consistent changes in F0, F1 and F2 were found after surgery. Although there was a tendency for a decrease in F3, tonsillectomy did not appear to change the acoustical features of the Japanese vowels remarkably. It was assumed that the subject may adjust the shape of the vocal tract to produce consistent speech sounds after the surgery using auditory feedback.
Period for Normalization of Voice Acoustic Parameters in Indian Pediatric Cochlear Implantees.

PubMed

Joy, Jeena V; Deshpande, Shweta; Vaid, Dr Neelam

2017-05-01

The purpose of this study was to investigate the duration required by children with cochlear implants to approximate the norms of voice acoustic parameters. The study design is retrospective. Thirty children with cochlear implants (chronological ages ranging between 4.1 and 6.7 years) were divided into three groups, based on the postimplantation duration. Ten normal-hearing children (chronological ages ranging between 4 and 7 years) were selected as the control group. All implanted children underwent an objective voice analysis using Dr. Speech software (Tiger DRS, Inc., Seattle, WA, USA) at 6 months and at 1 and 2 years of implant use. Voice analysis was done for the children in the control group and means were derived for all the parameters analyzed to obtain the normal values. Habitual fundamental frequency (HFF), jitter (frequency variation), and shimmer (amplitude variation) were the voice acoustic parameters analyzed for the vowels |a|, |i|, and |u|. The obtained values of these parameters were then compared with the norms. HFF for the children with implant use for 6 months and 1 year differed significantly from the control group. However, there was no significant difference (P > 0.5) observed in the children with implant use for 2 years, thus matching the norms. Jitter and shimmer showed a significant difference (P < 0.5) even at 2 years of implant use when compared with the control group. The findings of the study indicate that children with cochlear implants approximate age-matched normal-hearing children with respect to the voice acoustic parameter of HFF by 2 years of implant use. However, jitter and shimmer were not found to stabilize for the duration studied. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
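Jitter (frequency variation) and shimmer (amplitude variation), as named in the preceding abstract, are commonly computed as cycle-to-cycle perturbation measures. The sketch below uses one widely used "local" definition: the mean absolute difference between consecutive cycles, divided by the mean value, expressed as a percentage. The formulas used by the Dr. Speech software may differ, and the function name and sample values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in %.

    Pass glottal cycle periods to get local jitter, or per-cycle peak
    amplitudes to get local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical cycle data, invented for this sketch.
periods_ms = [4.0, 4.1, 3.9, 4.0]   # cycle durations in milliseconds
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitudes, arbitrary units

jitter_pct = local_perturbation_percent(periods_ms)
shimmer_pct = local_perturbation_percent(amplitudes)
print(f"jitter = {jitter_pct:.2f} %, shimmer = {shimmer_pct:.2f} %")
```

Because both measures depend on locating individual cycles, they become unreliable when the signal is too irregular for the fundamental period to be extracted.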
Perceptual and Acoustic Analyses of Good Voice Quality in Male Radio Performers.

PubMed

Warhurst, Samantha; Madill, Catherine; McCabe, Patricia; Ternström, Sten; Yiu, Edwin; Heard, Robert

2017-03-01

Good voice quality is an asset to professional voice users, including radio performers. We examined whether (1) voices could be reliably categorized as good for the radio and (2) these categories could be predicted using acoustic measures. Male radio performers (n = 24) and age-matched male controls performed "The Rainbow Passage" as if presenting on the radio. Voice samples were rated using a three-stage paired-comparison paradigm by 51 naive listeners and perceptual categories were identified (Study 1), and then analyzed for fundamental frequency, long-term average spectrum, cepstral peak prominence, and pause or spoken-phrase duration (Study 2). Study 1: Good inter-judge reliability was found for perceptual judgments of the best 15 voices (good for radio category, 14/15 = radio performers), but agreement on the remaining 33 voices (unranked category) was poor. Study 2: Discriminant function analyses showed that the standard deviation of sounded portion duration, equivalent sound level, and smoothed cepstral peak prominence predicted membership of categories with moderate accuracy (R² = 0.328). Radio performers are heterogeneous for voice quality; good voice quality was judged reliably in only 14 out of 24 radio performers. Current acoustic analyses detected some of the relevant signal properties that were salient in these judgments. More refined perceptual analysis and the use of other perceptual methods might provide more information on the complex nature of judging good voices. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic and phonatory characterization of the Fado voice.

PubMed

Mendes, Ana P; Rodrigues, Aira F; Guerreiro, David Michael

2013-09-01

Fado is a Portuguese musical genre, instrumentally accompanied by a Portuguese and an acoustic guitar. Fado singers' voice is perceptually characterized as low-pitched, hoarse, and strained. The present research study sketches the acoustic and phonatory profile of the Fado singers' voice. Fifteen Fado singers produced spoken and sung phonatory tasks. For the spoken voice measures, the maximum phonation time and s/z ratio of Fado singers were near the inefficient physiological threshold. Fundamental frequency was higher than that found in nonsingers and lower than that found in Western Classical singers. Jitter and shimmer mean values were higher compared with nonsingers. Harmonic-to-noise ratio (HNR) was similar to the mean values for nonsingers. For the sung voice, jitter was higher compared with Country, Musical Theater, Soul, Jazz, and Western Classical singers and lower than Pop singers. Shimmer mean values were lower than Country, Musical Theater, Pop, Soul, and Jazz singers and higher than Western Classical singers. HNR was similar to that of Western Classical singers. Maximum phonational frequency range of Fado singers indicated that male and female subjects had a lower range compared with Western Classical singers. Additionally, Fado singers produced vibrato, but singer's formant was rarely produced. These sung voice characteristics could be related to life habits or to limited or absent singing training, or could be just a Fado voice characteristic. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Voice Quality After a Semi-Occluded Vocal Tract Exercise With a Ventilation Mask in Contemporary Commercial Singers: Acoustic Analysis and Self-Assessments.

PubMed

Fantini, Marco; Succo, Giovanni; Crosetti, Erika; Borragán Torre, Alfonso; Demo, Roberto; Fussi, Franco

2017-05-01

The current study aimed at investigating the immediate effects of a semi-occluded vocal tract exercise with a ventilation mask in a group of contemporary commercial singers. A randomized controlled study was carried out. Thirty professional or semi-professional singers with no voice complaints were randomly divided into two groups on recruitment: an experimental group and a control group. The same warm-up exercise was performed by the experimental group with an occluded ventilation mask placed over the nose and the mouth and by the control group without the ventilation mask. Voice was recorded before and after the exercise. Acoustic and self-assessment analysis were accomplished. The acoustic parameters of the voice samples recorded before and after training were compared, as well as the parameters' variations between the experimental and the control group. Self-assessment results of the experimental and the control group were compared too. Significant changes after the warm-up exercise included jitter, shimmer, and singing power ratio (SPR) in the experimental group. No significant changes were recorded in the control group. Significant differences between the experimental and the control group were found for ΔShimmer and ΔSPR. Self-assessment analysis confirmed a significantly higher phonatory comfort and voice quality perception for the experimental group. The results of the present study support the immediate advantageous effects on singing voice of a semi-occluded vocal tract exercise with a ventilation mask in terms of acoustic quality, phonatory comfort, and voice quality perception in contemporary commercial singers. Long-term effects still remain to be studied. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Finding intonational boundaries using acoustic cues related to the voice source

NASA Astrophysics Data System (ADS)

Choi, Jeung-Yoon; Hasegawa-Johnson, Mark; Cole, Jennifer

2005-10-01

Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for relevance to prosodic boundary detection. The measurements considered here comprise five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show equal error detection rates around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements, and, to a lesser degree, pitch measurements, are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.
Acoustic and Perceived Measurements Certifying Tango as Voice Treatment Method.

PubMed

Tafiadis, Dionysios; Kosma, Evangelia I; Chronopoulos, Spyridon K; Papadopoulos, Aggelos; Toki, Eugenia I; Vassiliki, Siafaka; Ziavra, Nausica

2018-03-01

Voice disorders affect everyday life on many levels, and their prevalence has been studied extensively in specific and general populations. Notably, several factors have a cohesive influence on voice disorders and voice characteristics. Several studies report that health, environmental, and psychological etiologies can serve as risk factors for voice disorders. Many diagnostic protocols in the literature evaluate voice and its parameters, leading to direct or indirect treatment intervention. This study was designed to examine the effect of tango on adult acoustic voice parameters. Fifty-two adults (26 male and 26 female) were recruited and divided into four subgroups (male dancers, female dancers, male nondancers, and female nondancers). The participants were asked to answer two questionnaires (Voice Handicap Index and Voice Evaluation Form), and their voices were recorded before and after the tango dance session. Moreover, water consumption was investigated. The study's results indicated that the voices' acoustic characteristics were different between tango dancers and the control group. The beneficial results are far from prominent as they prove that tango dance can serve stand-alone as voice therapy without the need for hydration. Also, more longitudinal research is needed to obtain a more accurate result on the required time for the proposed therapy. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Inhaled beclomethasone dipropionate improves acoustic measures of voice in patients with asthma.

PubMed

Balter, M S; Adams, S G; Chapman, K R

2001-12-01

Inhaled corticosteroids have the potential to produce upper-airway side effects such as hoarseness. As new compounds and delivery devices are developed and compared, it is difficult to quantify their adverse upper-airway effects. We undertook the following study to test the ability of an acoustic analysis technique to quantify changes in vocal function in steroid-naive patients with asthma who receive inhaled beclomethasone dipropionate (BDP), 1,000 µg/d for 4 months. Patients self-administered one of four regimens of inhaled BDP. Group 1 patients received one 250-µg puff qid via metered-dose inhaler (MDI); group 2 patients received one 250-µg puff qid via MDI with a holding chamber; group 3 patients received two 250-µg puffs bid via MDI; and group 4 patients received two 250-µg puffs bid via MDI with a holding chamber. A smaller cohort of nonsmoking asthmatic patients was managed without steroid intervention for 4 months. At baseline and again at 8 weeks and 16 weeks after the initiation of BDP treatment, patients underwent spirometry and methacholine challenge. At baseline and again at 2, 4, 8, 12, and 16 weeks, patients underwent voice recording for analysis of voice parameters. The recorded vowels were low-pass filtered (10 kHz), digitized (22 kHz), and analyzed by software to obtain two acoustic measures: (1) jitter, the cycle-to-cycle variation in the time period of the voice signal; and (2) shimmer, the cycle-to-cycle variation in voice signal amplitude. We recruited 77 patients for randomization to inhaled steroid therapy and 10 patients who continued to receive only occasional inhaled bronchodilator therapy. In all active treatment groups, FEV(1), FVC, and provocative concentration of methacholine causing a 20% fall in FEV(1) improved significantly after BDP treatment. Mean jitter scores, a measurement of variation in voice pitch, were not significantly influenced by BDP treatment. However, mean shimmer scores, a reflection of
[The association between health-related quality of life and voice as evaluated by an acoustic analysis in elderly Japanese nursing home residents].

PubMed

Hara, Shuichi; Miura, Hiroko; Yamasaki, Kiyoko; Morisaki, Naoko; Sumi, Yasunori

2015-01-01

We carried out a cross-sectional study investigating the association between health-related quality of life (HRQOL) and voice, as evaluated by an acoustic analysis, in elderly residents of a nursing home. The HRQOL of 61 elderly nursing home residents (mean age: 82.1±8.3 years) was assessed via the SF-8 Health Survey questionnaire, Japanese version (SF-8). The subjects' voices were recorded and analyzed by a voice assessment software program, which calculated the pitch period perturbation quotient (PPQ), amplitude perturbation quotient (APQ), and noise-to-harmonic ratio (NHR). Subjects who scored under the 25th percentile on general health (GH), vitality (VT), or physical summary (PCS) in the SF-8 showed significantly higher PPQ, APQ, and NHR scores in comparison to their counterparts (p<0.05). After adjustment for age, lower GH scores were found to be associated with higher PPQ, APQ, and NHR scores; lower VT scores were associated with higher APQ and NHR scores; and lower PCS scores were associated with higher APQ and NHR scores (p<0.05). The results of the acoustic analysis indicated that voice was associated with HRQOL in the elderly nursing home residents of the present study. Among the acoustic parameters that were analyzed, PPQ, APQ, and NHR may be influential factors that can be used to assess HRQOL, independently of the effects of age, in elderly individuals.
Acoustic changes of the voice as signs of vocal fatigue in radio broadcasters: preliminary findings.

PubMed

Guzmán, Marco; Malebrán, María Celina; Zavala, Paulina; Saldívar, Patricio; Muñoz, Daniel

2013-01-01

Vocal fatigue is one of the most common voice symptoms. It usually refers to the sensation of vocal tiredness after a long period of speaking or singing. The purpose of this study was to compare the acoustic characteristics of the voice before and after a long period of voice use in a group of radio broadcasters. Eight radio broadcasters with normal voices were assessed. We used cepstrum, energy ratio, noise to harmonic ratio and soft phonation index as acoustic variables to assess the possible pre-post vocal loading changes objectively. There were no statistically significant pre-post differences in any of the acoustic parameters. Although cepstrum at high pitch did not show a significant difference, it obtained the greatest difference among the acoustic variables. The acoustic measurements used in the present study might not be sensitive enough or appropriate for detecting vocal changes after a long period of voice use, whether in reading (as reported in previous research) or speaking tasks. Moreover, a longer period of vocal loading would eventually reveal more evident and consistent acoustic voice changes. Copyright © 2012 Elsevier España, S.L. All rights reserved.

[Voice acoustic study of plasma radiofrequency ablation for the treatment of laryngeal premalignant lesions].

PubMed

Zang, Y Z; Wan, B L; Jia, X D; Wang, G K

2016-11-01

Objective: To study the effect on voice function of low temperature plasma radiofrequency ablation in the treatment of patients with laryngeal premalignant lesions. Method: Fifty cases of laryngeal premalignant lesions were treated with low temperature plasma radiofrequency ablation. All of the patients were examined by electronic laryngoscopy and acoustic analysis (F0, Jitter, Shimmer, NNE, HNR) at 2 weeks, 1 month, and 3 months after surgery. Voice acoustic results were compared with a control group of 50 normal adults for further analysis. Result: Fifty patients with laryngeal premalignant lesions were treated by low temperature plasma radiofrequency ablation. The result showed that 47 patients (94%) were successfully decannulated without serious complications, such as dyspnea, aphonia and anterior glottic stenosis. Acoustic analysis showed that F0, Jitter, Shimmer and NNE were significantly different from normal 2 weeks after surgery (P < 0.01). Voice function recovered weakly 1 month after operation (P < 0.05). There were no significant differences in the vocal parameters between plasma radiofrequency ablation group and control group 3 months after surgery (P > 0.05). Conclusion: Radiofrequency coblation was a safe, minimally invasive and effective surgical method and can be widely used to treat laryngeal premalignant lesions. Copyright © by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
Acoustic voice analysis of prelingually deaf adults before and after cochlear implantation.

PubMed

Evans, Maegan K; Deliyski, Dimitar D

2007-11-01

It is widely accepted that many severe to profoundly deaf adults have benefited from cochlear implants (CIs). However, limited research has been conducted to investigate changes in voice and speech of prelingually deaf adults who receive CIs, a population well known for presenting with a variety of voice and speech abnormalities. The purpose of this study was to use acoustic analysis to explore changes in voice and speech for three prelingually deaf males pre- and postimplantation over 6 months. The following measurements, some measured in varying contexts, were obtained: fundamental frequency (F0), jitter, shimmer, noise-to-harmonic ratio, voice turbulence index, soft phonation index, amplitude- and F0-variation, F0-range, speech rate, nasalance, and vowel production. Characteristics of vowel production were measured by determining the first formant (F1) and second formant (F2) of vowels in various contexts, magnitude of F2-variation, and rate of F2-variation. Perceptual measurements of pitch, pitch variability, loudness variability, speech rate, and intonation were obtained for comparison. Results are reported using descriptive statistics. The results showed patterns of change for some of the parameters while there was considerable variation across the subjects. All participants demonstrated a decrease in F0 in at least one context and demonstrated a change in nasalance toward the norm as compared to their normal hearing control. The two participants who were oral-language communicators were judged to produce vowels with an average of 97.2% accuracy and the sign-language user demonstrated low percent accuracy for vowel production.
Acoustic Analysis of Voice in Dysarthria following Stroke

ERIC Educational Resources Information Center

Wang, Yu-Tsai; Kent, Ray D.; Kent, Jane Finley; Duffy, Joseph R.; Thomas, Jack E.

2009-01-01

Although perceptual studies indicate the likelihood of voice disorders in persons with stroke, there have been few objective instrumental studies of voice dysfunction in dysarthria following stroke. This study reports automatic analysis of sustained vowel phonation for 61 speakers with stroke. The results show: (1) men with stroke and healthy…
Voice change in end-stage renal disease patients after hemodialysis: correlation of subjective hoarseness and objective acoustic parameters.

PubMed

Jung, Soo Yeon; Ryu, Jung-Hwa; Park, Hae Sang; Chung, Sung Min; Ryu, Dong-Ryeol; Kim, Han Su

2014-03-01

Patients with end-stage renal disease (ESRD) who are treated with hemodialysis (HD) frequently complain about hoarseness after completion of each HD session. The HD treatment affects laryngeal volume and muscle function. This study attempted to evaluate the vocal effect of HD by acoustic and aerodynamic analysis and to determine the difference between the voice change group (VCG) and the nonvoice change group (NVCG). A total of 55 patients (34 females and 21 males) diagnosed with ESRD and undergoing outpatient HD were enrolled. The subjects were divided into the VCG (n=13) and NVCG (n=42) by the change in the Korean Voice Handicap Index score. Patients underwent weighing and acoustic and aerodynamic analysis before and after HD. Fundamental frequency (F0), jitter, shimmer, noise-to-harmonics ratio (NHR), pitch range, habitual pitch, voice energy, and maximal phonation time (MPT) were obtained. The pre- and post-HD data were compared using the paired t test. The results were compared after dividing the total group into the VCG and NVCG categories. Correlation between the change in weight and the change in the voice analysis results was assessed using the Pearson correlation coefficient. The F0 and habitual pitch increased in all subjects. The NHR and MPT parameters significantly decreased (P<0.05). In the NVCG, all the results were the same as in the total group. In the VCG, the NHR result differed from that of the total group. All acoustic parameters showed no statistically significant differences between the two groups. There was no correlation between the weight change (%) and the change in acoustic parameter results. The NVCG patients displayed improvement in NHR, whereas the VCG showed no change. Weight change did not significantly correlate with the voice analysis results. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
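The two statistics named in this abstract, the paired t test for pre/post comparisons and the Pearson correlation coefficient, can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, the sample numbers are made up and are not study data, and the authors' actual software is not stated in the abstract.

```python
import math
from typing import Sequence


def paired_t_statistic(pre: Sequence[float], post: Sequence[float]) -> float:
    """t statistic of a paired t test: mean difference over its standard error."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d / math.sqrt(var_d / n)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration values (NOT data from the study).
pre_f0 = [190.0, 205.0, 180.0, 210.0]
post_f0 = [196.0, 207.0, 189.0, 211.0]
print(paired_t_statistic(pre_f0, post_f0))
```

The t statistic still has to be compared against a t distribution with n - 1 degrees of freedom to obtain a P value; that step is left out here.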
Acoustic sensors in the helmet detect voice and physiology

NASA Astrophysics Data System (ADS)

Scanlon, Michael V.

2003-09-01

The Army Research Laboratory has developed body-contacting acoustic sensors that detect diverse physiological sounds such as heartbeats and breaths, high quality speech, and activity. These sensors use an acoustic impedance-matching gel contained in a soft, compliant pad to enhance the body borne sounds, yet significantly repel airborne noises due to an acoustic impedance mismatch. The signals from such a sensor can be used as a microphone with embedded physiology, or a dedicated digital signal processor can process packetized data to separate physiological parameters from voice, and log parameter trends for performance surveillance. Acoustic sensors were placed inside soldier helmets to monitor voice, physiology, activity, and situational awareness clues such as bullet shockwaves from sniper activity and explosions. The sensors were also incorporated into firefighter breathing masks, neck and wrist straps, and other protective equipment. Heart rate, breath rate, blood pressure, voice and activity can be derived from these sensors (reports at www.arl.army.mil/acoustics). Having numerous sensors at various locations provides a means for array processing to reduce motion artifacts, calculate pulse transit time for passive blood pressure measurement, and the origin of blunt/penetrating traumas such as ballistic wounding. These types of sensors give us the ability to monitor soldiers and civilian emergency first-responders in demanding environments, and provide vital signs information to assess their health status and how that person is interacting with the environment and mission at hand. The Objective Force Warrior, Scorpion, Land Warrior, Warrior Medic, and other military and civilian programs can potentially benefit from these sensors.
Perceptual, auditory and acoustic vocal analysis of speech and singing in choir conductors.

PubMed

Rehder, Maria Inês Beltrati Cornacchioni; Behlau, Mara

2008-01-01

The voice of choir conductors. To evaluate the vocal quality of choir conductors based on the production of a sustained vowel during singing and when speaking in order to observe auditory and acoustic differences. Participants of this study were 100 choir conductors, with an equal distribution between genders. Participants were asked to produce the sustained vowel "é" using a singing and speaking voice. Speech samples were analyzed based on auditory-perceptive and acoustic parameters. The auditory-perceptive analysis was carried out by two speech-language pathologists, specialists in this field of knowledge. The acoustic analysis was carried out with the support of the computer software Doctor Speech (Tiger Electronics, SRD, USA, version 4.0), using the Real Analysis module. The auditory-perceptive analysis of the vocal quality indicated that most conductors have adapted voices, presenting more alterations in their speaking voice. The acoustic analysis indicated different values between genders and between the different production modalities. The fundamental frequency was higher in the singing voice, as well as the values for the first formant; the second formant presented lower values in the singing voice, with statistically significant results only for women. The voice of choir conductors is adapted, presenting fewer deviations in the singing voice when compared to the speaking voice. Productions differ based on the voice modality, singing or speaking.
Voice disorders in children and its relationship with auditory, acoustic and vocal behavior parameters.

PubMed

Simões-Zenari, Marcia; Nemr, Katia; Behlau, Mara

2012-06-01

Parameters to distinguish normal from deviant voices in early childhood have not been established. The current study sought to auditorily and acoustically characterize voices of children, and to study the relationship between vocal behavior reported by teachers and the presence of vocal aberrations. One hundred children aged between 4 years and 6 years 11 months, who attended early childhood educational institutions, were included. The sample comprised 50 children with normal voices (NVG) and 50 with deviant voices (DVG) matched by gender and age. All participants underwent auditory and acoustic analysis of vocal quality and had their vocal behaviors assessed by teachers through a specific protocol. DVG had a higher incidence of breathiness (p<0.001) and roughness (p<0.001), but not vocal strain (p=0.546), which was similar in both groups. The average F(0) was lower in the DVG and a higher noise component was observed in this group as well. Regarding the protocol used, "Aspects Related to Phonotrauma - Children's Protocol", higher means were observed for children from DVG in all analyzed aspects and also on the overall means (DVG=2.15; NVG=1.12, p<0.001). In NVG, a higher incidence of vocal behavior without alterations or with discrete alterations was observed, whereas a higher incidence of moderate, severe or extreme alterations of vocal behavior was observed in DVG. Perceptual assessment of voice, vocal acoustic parameters (F(0), noise and GNE), and aspects related to vocal trauma and vocal behavior differentiated the groups of children with normal voice and deviant voice. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Comparison of Acoustic and Stroboscopic Findings and Voice Handicap Index between Allergic Rhinitis Patients and Controls.

PubMed

Koç, Eltaf Ayça Özbal; Koç, Bülent; Erbek, Selim

2014-12-01

In our experience, Allergic Rhinitis (AR) patients suffer from voice problems more than healthy subjects. To investigate the acoustic analysis of voice, stroboscopic findings of the larynx, and Voice Handicap Index scores in allergic rhinitis patients compared with healthy controls. Case-control study. Thirty adult patients diagnosed with perennial allergic rhinitis were compared with 30 age- and sex-matched healthy controls without allergy. All assessments were performed in the speech physiology laboratory and the testing sequence was as follows: 1. Voice Handicap Index (VHI) questionnaire, 2. Laryngovideostroboscopy, 3. Acoustic analyses. No difference was observed between the allergic rhinitis and control groups regarding mean Maximum Phonation Time (MPT) values, Fo values, and stroboscopic assessment (p>0.05). On the other hand, mean VHI score (p=0.001) and s/z ratio (p=0.011) were significantly higher in the allergic rhinitis group than in controls. Our findings suggest that the presence of allergies could have effects on laryngeal dysfunction and voice-related quality of life.
The Belt voice: Acoustical measurements and esthetic correlates

NASA Astrophysics Data System (ADS)

Bounous, Barry Urban

This dissertation explores the esthetic attributes of the Belt voice through spectral acoustical analysis. The process of understanding the nature and safe practice of Belt is just beginning, whereas the understanding of classical singing is well established. The unique nature of the Belt sound provides difficulties for voice teachers attempting to evaluate the quality and appropriateness of a particular sound or performance. This study attempts to provide answers to the question "does Belt conform to a set of measurable esthetic standards?" In answering this question, this paper expands on a previous study of the esthetic attributes of the classical baritone voice (see "Vocal Beauty", NATS Journal 51,1) which also drew some tentative conclusions about the Belt voice but which had an inadequate sample pool of subjects from which to draw. Further, this study demonstrates that it is possible to scientifically investigate the realm of musical esthetics in the singing voice. It is possible to go beyond the "a trained voice compared to an untrained voice" paradigm when evaluating quantitative vocal parameters and actually investigate what truly beautiful voices do. There are functions of sound energy (measured in dB) transference which may affect the nervous system in predictable ways and which can be measured and associated with esthetics. This study does not show consistency in measurements for absolute beauty (taste) even among belt teachers and researchers but does show some markers with varying degrees of importance which may point to a difference between our cognitive learned response to singing and our emotional, more visceral response to sounds. The markers which are significant in determining vocal beauty are: (1) Vibrancy-Characteristics of vibrato including speed, width, and consistency (low variability). (2) Spectral makeup-Ratio of partial strength above the fundamental to the fundamental. (3) Activity of the voice-The quantity of energy being produced. (4
Validation of the Acoustic Voice Quality Index in the Japanese Language.

PubMed

Hosokawa, Kiyohito; Barsties, Ben; Iwahashi, Toshihiko; Iwahashi, Mio; Kato, Chieri; Iwaki, Shinobu; Sasai, Hisanori; Miyauchi, Akira; Matsushiro, Naoki; Inohara, Hidenori; Ogawa, Makoto; Maryn, Youri

2017-03-01

The Acoustic Voice Quality Index (AVQI) is a multivariate construct for quantification of overall voice quality based on the analysis of continuous speech and sustained vowel. The stability and validity of the AVQI is well established in several language families. However, the Japanese language has distinct characteristics with respect to several parameters of articulatory and phonatory physiology. The aim of the study was to confirm the criterion-related concurrent validity of AVQI, as well as its responsiveness to change and diagnostic accuracy for voice assessment in the Japanese-speaking population. This is a retrospective study. A total of 336 voice recordings, which included 69 pairs of voice recordings (before and after therapeutic interventions), were eligible for the study. The auditory-perceptual judgment of overall voice quality was evaluated by five experienced raters. The concurrent validity, responsiveness to change, and diagnostic accuracy of the AVQI were estimated. The concurrent validity and responsiveness to change based on the overall voice quality was indicated by high correlation coefficients 0.828 and 0.767, respectively. Receiver operating characteristic analysis revealed an excellent diagnostic accuracy for discrimination between dysphonic and normophonic voices (area under the curve: 0.905). The best threshold level for the AVQI of 3.15 corresponded with a sensitivity of 72.5% and specificity of 95.2%, with the positive and negative likelihood ratios of 15.1 and 0.29, respectively. We demonstrated the validity of the AVQI as a tool for assessment of overall voice quality and that of voice therapy outcomes in the Japanese-speaking population. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
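The likelihood ratios quoted in this abstract follow directly from the reported sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below reproduces that arithmetic; the function name is hypothetical and chosen only for illustration.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    if not (0.0 <= sensitivity <= 1.0) or not (0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Values reported in the abstract: sensitivity 72.5%, specificity 95.2%.
lr_pos, lr_neg = likelihood_ratios(0.725, 0.952)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")  # LR+ = 15.1, LR- = 0.29
```

The computed values agree with the 15.1 and 0.29 reported for the AVQI threshold of 3.15.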
Changes after voice therapy in objective and subjective voice measurements of pediatric patients with vocal nodules.

PubMed

Tezcaner, Ciler Zahide; Karatayli Ozgursoy, Selmin; Ozgursoy, Selmin Karatayli; Sati, Isil; Dursun, Gursel

2009-12-01

The aim of this study was to analyze the efficiency of voice therapy in children with vocal nodules by using acoustic analysis and subjective assessment. Thirty-nine patients with vocal fold nodules, aged between 7 and 14, were included in the study. Each subject had voice therapy led by an experienced voice therapist once a week. All diagnostic and follow-up workups were performed before the voice therapy and after the third or the sixth month. Transoral and/or transnasal videostroboscopic examination was performed, along with acoustic analysis using the Multi-Dimensional Voice Program (MDVP) and subjective analysis with the GRBAS scale. As for the perceptual assessment, the difference was significant for four parameters out of five. A significant improvement was found in the acoustic analysis parameters of jitter, shimmer, and noise-to-harmonic ratio. The voice therapy, which was planned according to patients' needs, age, compliance, and response to therapy, had positive effects on pediatric patients with vocal nodules. Acoustic analysis and GRBAS may be used successfully in the follow-up of pediatric vocal nodule treatment.
A study of VHI scores and acoustic features in street vendors as occupational voice users.

PubMed

Natour, Yaser S; Darawsheh, Wesam B; Bashiti, Sara; Wari, Majd; Taha, Juhayna; Odeh, Thair

To investigate acoustic features of phonation and perception of voice handicap in street vendors. Eighty-eight participants (44 street vendors, 44 controls) were recruited. The mean age of the group was 38.9±16.0 years (range: 20-78 years). Scores of the Arabic version of the Voice Handicap Index (VHI-Arab) were used for analysis. Acoustic measures of fundamental frequency (F0), jitter, shimmer, and signal-to-noise ratio (SNR) were also analyzed. Analysis showed a significant difference between street vendors and controls in the total score of the VHI-Arab (p<0.001) as well as scores of all three VHI-Arab subsections: functional (p<0.001), physical (p<0.001), and emotional (p=0.025). Weak correlations were found among all of the VHI scores and acoustic measures (-0.219≤r≤0.355), except for SNR, where moderate negative correlations were found (r=-0.555; -0.4) between the VHI (physical and total) scores and SNR values. Significant differences were also found in F0, jitter, and SNR among specific subgroups of street vendors when stratified by weekly hours worked (p<0.05), and in jitter (p=0.39) when stratified by educational level. Perception of voice handicap and a possible effect on vocal quality in street vendors were noted. The effect of factors, namely work hours and educational level, on voice quality should be further studied. Copyright © 2017. Published by Elsevier Inc.
Quantitative Analysis of Voice in Parkinson Disease Compared to Motor Performance: A Pilot Study.

PubMed

Silbergleit, Alice K; LeWitt, Peter A; Peterson, Edward L; Gardner, Glendon M

2015-01-01

Characteristic features of hypokinetic dysarthria develop in Parkinson disease (PD). We hypothesized that quantified acoustic changes of voice might provide a correlate of disease severity. 1) To determine if there are significant differences in acoustic measures of voice between mild and moderate PD; 2) To evaluate correlations between acoustic parameters of voice and subtests of the UPDRS in mild and moderate PD. Twenty-six participants with PD underwent vocal acoustic testing while off PD medication, for comparison to 22 healthy controls. Participants with PD were divided into two groups based upon UPDRS activities of daily living (ADL) ratings: summed scores were used to define mild and moderate PD. Participants voiced /i/ ("ee") at comfort, high, and low pitch (3 trials/pitch). The CSpeech Waveform Analysis Program was used to analyze cycle-to-cycle frequency ("jitter") and amplitude ("shimmer") irregularities of the vocal signal, signal-to-noise ratio, and maximum phonation frequency range converted to semitones. Sections of UPDRS scores were correlated to acoustic variables of voice. Key findings included a significant difference between the semitone range of the control subjects and the moderate PD group (p = 0.036). Further analyses revealed significant differences in semitone range for males between the controls vs. mild PD (p = 0.014), and controls vs. moderate PD (p = 0.005). Significant correlations were also found between acoustic findings and both the ADL and motor portions of the UPDRS. Acoustic analysis of voice, particularly frequency range, may provide a quantifiable correlate of disease progression in PD.
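Two of the measures mentioned in this abstract have simple closed forms. A frequency range in semitones is 12 · log2(f_high / f_low), and a common "local" definition of jitter (or shimmer) is the mean absolute difference between consecutive cycle periods (or peak amplitudes) divided by their mean. The sketch below assumes these textbook definitions; the exact formulas used by the CSpeech program are not given in the abstract, and the function names are hypothetical.

```python
import math
from typing import Sequence


def semitone_range(f_low_hz: float, f_high_hz: float) -> float:
    """Distance between two frequencies in semitones (12 semitones per octave)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def local_perturbation_percent(values: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Pass cycle periods to get a local jitter estimate, or cycle peak
    amplitudes to get a local shimmer estimate (assumed definition).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean of the values is zero")
    return 100.0 * mean_diff / mean_value


# One octave (a doubling of frequency) spans exactly 12 semitones.
print(semitone_range(110.0, 220.0))  # 12.0

# Hypothetical cycle periods in seconds, for illustration only.
print(local_perturbation_percent([0.0100, 0.0102, 0.0099, 0.0101]))
```

Expressing the phonation range in semitones rather than hertz makes ranges comparable across speakers with different habitual pitch, which is presumably why the conversion was applied here.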
Validation of the Acoustic Voice Quality Index Version 03.01 and the Acoustic Breathiness Index in the Spanish language.

PubMed

Delgado Hernández, Jonathan; León Gómez, Nieves M; Jiménez, Alejandra; Izquierdo, Laura M; Barsties V Latoszek, Ben

2018-05-01

The aim of this study was to validate the Acoustic Voice Quality Index 03.01 (AVQIv3) and the Acoustic Breathiness Index (ABI) in the Spanish language. Concatenated voice samples of continuous speech (cs) and sustained vowel (sv) from 136 subjects with dysphonia and 47 vocally healthy subjects were perceptually judged for overall voice quality and breathiness severity. First, to reach a higher level of ecological validity, the proportions of cs and sv were equalized regarding the time length of 3 seconds sv part and voiced cs part, respectively. Second, concurrent validity and diagnostic accuracy were verified. A moderate reliability of overall voice quality and breathiness severity from 5 experts was used. It was found that 33 syllables as standardization of the cs part, which represents 3 seconds of voiced cs, allows the equalization of both speech tasks. A strong correlation was revealed between AVQIv3 and overall voice quality and ABI and perceived breathiness severity. Additionally, the best diagnostic outcome was identified at a threshold of 2.28 and 3.40 for AVQIv3 and ABI, respectively. The AVQIv3 and ABI showed in the Spanish language valid and robust results to quantify abnormal voice qualities regarding overall voice quality and breathiness severity.
Acoustical analysis of the underlying voice differences between two groups of professional singers: opera and country and western.

PubMed

Burns, P

1986-05-01

An acoustical analysis of the speaking and singing voices of two types of professional singers was conducted. The vowels /i/, /a/, and /o/ were spoken and sung ten times each by seven opera and seven country and western singers. Vowel spectra were derived by computer software techniques allowing quantitative assessment of formant structure (F1-F4), relative amplitude of resonance peaks (F1-F4), fundamental frequency, and harmonic high frequency energy. Formant analysis was the most effective parameter differentiating the two groups. Only opera singers lowered their fourth formant creating a wide-band resonance area (approximately 2,800 Hz) corresponding to the well-known "singing formant." Country and western singers revealed similar resonatory voice characteristics for both spoken and sung output. These results implicate faulty vocal technique in country and western singers as a contributory reason for vocal abuse/fatigue.
Acoustic and Auditory Perception Effects of the Voice Therapy Technique Finger Kazoo in Adult Women.

PubMed

Christmann, Mara Keli; Cielo, Carla Aparecida

2017-05-01

This study aimed to verify and to correlate acoustic and auditory-perceptual measures of glottic source after the performance of finger kazoo (FK) technique. This is an experimental, cross-sectional, and qualitative study. We made an analysis of the vowel [a:] in 46 adult women with neither vocal complaints nor laryngeal alterations, through the Multi-Dimensional Voice Program Advanced and RASATI scale, before and immediately after performing three series of FK and 5 minutes after a period of silence. Kappa, Friedman, Wilcoxon, and Spearman tests were used. We found significant increase in fundamental frequency, reduction of amplitude variation, and degree of sub-harmonics immediately after performing FK. Positive correlations were measures of frequency and its perturbation, measures of amplitude, of soft phonation index, of degree and number of unvoiced segments with aspects of RASATI. Negative correlations were voice turbulence index, measures of frequency and its perturbation, and measures of soft phonation index with aspects of RASATI. There was fundamental frequency increase, within normal limits, and reduction of acoustic measures related to presence of noise and instability. In general, acoustic measures, suggestive of noise and instability, were reduced according to the decrease of perceptive-auditory aspects of vocal alteration. It shows that both instruments are complementary and that the acoustic vocal effect was positive. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
System And Method For Characterizing Voiced Excitations Of Speech And Acoustic Signals, Removing Acoustic Noise From Speech, And Synthesizi

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2006-04-25

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
What makes a voice masculine: physiological and acoustical correlates of women's ratings of men's vocal masculinity.

PubMed

Cartei, Valentina; Bond, Rod; Reby, David

2014-09-01

Men's voices contain acoustic cues to body size and hormonal status, which have been found to affect women's ratings of speaker size, masculinity and attractiveness. However, the extent to which these voice parameters mediate the relationship between speakers' fitness-related features and listeners' judgments of their masculinity has not yet been investigated. We audio-recorded 37 adult heterosexual males performing a range of speech tasks and asked 20 adult heterosexual female listeners to rate speakers' masculinity on the basis of their voices only. We then used a two-level (speaker within listener) path analysis to examine the relationships between the physiological (testosterone, height), acoustic (fundamental frequency or F0, and resonances or ΔF) and perceptual dimensions (listeners' ratings) of speakers' masculinity. Overall, results revealed that male speakers who were taller and had higher salivary testosterone levels also had lower F0 and ΔF, and were in turn rated as more masculine. The relationship between testosterone and perceived masculinity was essentially mediated by F0, while that of height and perceived masculinity was partially mediated by both F0 and ΔF. These observations confirm that women listeners attend to sexually dimorphic voice cues to assess the masculinity of unseen male speakers. In turn, variation in these voice features correlates with speakers' variation in stature and hormonal status, highlighting the interdependence of these physiological, acoustic and perceptual dimensions. Copyright © 2014. Published by Elsevier Inc.
The Relationship Between Acoustic Signal Typing and Perceptual Evaluation of Tracheoesophageal Voice Quality for Sustained Vowels.

PubMed

Clapham, Renee P; van As-Brooks, Corina J; van Son, Rob J J H; Hilgers, Frans J M; van den Brekel, Michiel W M

2015-07-01

To investigate the relationship between acoustic signal typing and perceptual evaluation of sustained vowels produced by tracheoesophageal (TE) speakers and the use of signal typing in the clinical setting. Two evaluators independently categorized 1.75-second segments of narrow-band spectrograms according to acoustic signal typing and independently evaluated the recording of the same segments on a visual analog scale according to overall perceptual acoustic voice quality. The relationship between acoustic signal typing and overall voice quality (as a continuous scale and as a four-point ordinal scale) was investigated and the proportion of inter-rater agreement as well as the reliability between the two measures is reported. The agreement between signal type (I-IV) and ordinal voice quality (four-point scale) was low but significant, and there was a significant linear relationship between the variables. Signal type correctly predicted less than half of the voice quality data. There was a significant main effect of signal type on continuous voice quality scores with significant differences in median quality scores between signal types I-IV, I-III, and I-II. Signal typing can be used as an adjunct to perceptual and acoustic evaluation of the same stimuli for TE speech as part of a multidimensional evaluation protocol. Signal typing in its current form provides limited predictive information on voice quality, and there is significant overlap between signal types II and III and perceptual categories. Future work should consider whether the current four signal types could be refined. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Effects of voice style, noise level, and acoustic feedback on objective and subjective voice evaluations

PubMed Central

Bottalico, Pasquale; Graetzer, Simone; Hunter, Eric J.

2015-01-01

Speakers adjust their vocal effort when communicating in different room acoustic and noise conditions and when instructed to speak at different volumes. The present paper reports on the effects of voice style, noise level, and acoustic feedback on vocal effort, evaluated as sound pressure level, and self-reported vocal fatigue, comfort, and control. Speakers increased their level in the presence of babble and when instructed to talk in a loud style, and lowered it when acoustic feedback was increased and when talking in a soft style. Self-reported responses indicated a preference for the normal style without babble noise. PMID:26723357

Robotic vehicle uses acoustic sensors for voice detection and diagnostics

NASA Astrophysics Data System (ADS)

Young, Stuart H.; Scanlon, Michael V.

2000-07-01

An acoustic sensor array that cues an imaging system on a small tele-operated robotic vehicle was used to detect human voice and activity inside a building. The advantage of acoustic sensing is that it is a non-line-of-sight (NLOS) technology that can augment traditional LOS sensors such as visible and IR cameras. Acoustic energy emitted from a target, such as from a person, weapon, or radio, will travel through walls and smoke, around corners, and down corridors, whereas these obstructions would cripple an imaging detection system. The hardware developed and tested used an array of eight microphones to detect the loudest direction and automatically steer a camera's pan/tilt toward the noise centroid. This type of system has applicability for counter-sniper applications, building clearing, and search/rescue. Data presented will be time-frequency representations showing voice detected within rooms and down hallways at various ranges. Another benefit of acoustics is that it provides the tele-operator some situational awareness clues via low-bandwidth transmission of raw audio data for the operator to interpret with either headphones or through time-frequency analysis. This data can be useful to recognize familiar sounds that might indicate the presence of personnel, such as talking, equipment, movement noise, etc. The same array also detects the sounds of the robot it is mounted on, and can be useful for engine diagnostics and troubleshooting, or for self-noise emanations for stealthy travel. Data presented will characterize vehicle self-noise over various surfaces such as tiles, carpets, pavement, sidewalk, and grass. Vehicle diagnostic sounds will indicate a slipping clutch and repeated unexpected application of the emergency braking mechanism.
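One simple way to turn the levels of a ring of microphones into a single pan bearing is an energy-weighted circular mean of the microphones' pointing directions. The sketch below illustrates that idea only; the abstract does not describe the actual array geometry or algorithm, so the evenly spaced eight-microphone ring, the directional-microphone assumption, and all names here are hypothetical.

```python
import math
from typing import Sequence


def mean_square(frame: Sequence[float]) -> float:
    """Average signal power of one audio frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(sample * sample for sample in frame) / len(frame)


def noise_centroid_bearing(
    mic_bearings_deg: Sequence[float],
    mic_frames: Sequence[Sequence[float]],
) -> float:
    """Energy-weighted circular mean of microphone bearings, in degrees [0, 360).

    Assumes each microphone mainly picks up sound from the direction it
    faces (a hypothetical simplification, not the published method).
    """
    if len(mic_bearings_deg) != len(mic_frames):
        raise ValueError("one frame is required per microphone")
    x = y = 0.0
    for bearing, frame in zip(mic_bearings_deg, mic_frames):
        energy = mean_square(frame)
        x += energy * math.cos(math.radians(bearing))
        y += energy * math.sin(math.radians(bearing))
    if math.hypot(x, y) < 1e-12:
        raise ValueError("no dominant direction (silence or balanced energy)")
    return math.degrees(math.atan2(y, x)) % 360.0


# Hypothetical ring of eight microphones spaced 45 degrees apart.
bearings = [i * 45.0 for i in range(8)]
frames = [[0.0, 0.0, 0.0, 0.0] for _ in bearings]
frames[2] = [1.0, -1.0, 1.0, -1.0]  # only the microphone facing 90 degrees hears sound
pan_target = noise_centroid_bearing(bearings, frames)
print(pan_target)
```

Averaging on the unit circle rather than averaging raw angles avoids the wrap-around problem at 0/360 degrees. A real system would more likely use time-delay or beamforming estimates, which this sketch does not attempt.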
The Effects of Size and Type of Vocal Fold Polyp on Some Acoustic Voice Parameters.

PubMed

Akbari, Elaheh; Seifpanahi, Sadegh; Ghorbani, Ali; Izadi, Farzad; Torabinezhad, Farhad

2018-03-01

Vocal abuse and misuse can result in vocal fold polyps. Certain features define the extent of the effects of vocal fold polyps on voice acoustic parameters. The present study aimed to define the effects of polyp size on acoustic voice parameters, and compare these parameters in hemorrhagic and non-hemorrhagic polyps. In the present retrospective study, 28 individuals with hemorrhagic or non-hemorrhagic polyps of the true vocal folds were recruited to investigate acoustic voice parameters of the vowel /æ/ computed by the Praat software. The data were analyzed using the SPSS software, version 17.0. According to the type and size of polyps, mean acoustic differences and correlations were analyzed by the statistical t test and Pearson correlation test, respectively, with a significance level below 0.05. The results indicated that jitter and the harmonics-to-noise ratio had a significant positive and negative correlation with the polyp size (P=0.01), respectively. In addition, both mentioned parameters were significantly different between the two types of the investigated polyps. Both the type and size of polyps have effects on acoustic voice characteristics. In the present study, a novel method to measure polyp size was introduced. Further confirmation of this method as a tool to compare polyp sizes requires additional investigations.
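For orientation, local jitter is commonly defined as the mean absolute difference between consecutive glottal periods divided by the mean period. The Python sketch below implements that textbook definition only; the function name and the toy period values are illustrative assumptions, and it is not a reproduction of the Praat procedure used in the study.

```python
def local_jitter(periods):
    """Return local jitter as a fraction (multiply by 100 for percent).

    `periods` is a sequence of consecutive glottal period durations in seconds.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    # Absolute differences between each period and the next one.
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_abs_diff / mean_period


# Hypothetical periods alternating between 10 ms and 11 ms.
toy_periods = [0.010, 0.011, 0.010, 0.011]
print(f"local jitter: {100 * local_jitter(toy_periods):.2f} %")
```

A perfectly periodic signal gives a jitter of zero, which is why higher values are read as a sign of irregular vocal fold vibration.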
The Effects of Size and Type of Vocal Fold Polyp on Some Acoustic Voice Parameters

PubMed Central

Akbari, Elaheh; Seifpanahi, Sadegh; Ghorbani, Ali; Izadi, Farzad; Torabinezhad, Farhad

2018-01-01

Background: Vocal abuse and misuse can result in vocal fold polyps. Certain features define the extent of the effects of vocal fold polyps on voice acoustic parameters. The present study aimed to define the effects of polyp size on acoustic voice parameters, and compare these parameters in hemorrhagic and non-hemorrhagic polyps. Methods: In the present retrospective study, 28 individuals with hemorrhagic or non-hemorrhagic polyps of the true vocal folds were recruited to investigate acoustic voice parameters of the vowel /æ/ computed by the Praat software. The data were analyzed using the SPSS software, version 17.0. According to the type and size of polyps, mean acoustic differences and correlations were analyzed by the statistical t test and Pearson correlation test, respectively, with a significance level below 0.05. Results: The results indicated that jitter and the harmonics-to-noise ratio had a significant positive and negative correlation with the polyp size (P=0.01), respectively. In addition, both mentioned parameters were significantly different between the two types of the investigated polyps. Conclusion: Both the type and size of polyps have effects on acoustic voice characteristics. In the present study, a novel method to measure polyp size was introduced. Further confirmation of this method as a tool to compare polyp sizes requires additional investigations. PMID:29749984
Evaluating voice characteristics of first-year acting students in Israel: factor analysis.

PubMed

Amir, Ofer; Primov-Fever, Adi; Kushnir, Tami; Kandelshine-Waldman, Osnat; Wolf, Michael

2013-01-01

Acting students require diverse, high-quality, and high-intensity vocal performance from early stages of their training. Demanding vocal activities, before developing the appropriate vocal skills, put them at high risk for developing vocal problems. A retrospective analysis of voice characteristics of first-year acting students using several voice evaluation tools. A total of 79 first-year acting students (55 women and 24 men) were assigned into two study groups: laryngeal findings (LFs) and no laryngeal findings, based on stroboscopic findings. Their voice characteristics were evaluated using acoustic analysis, aerodynamic examination, perceptual scales, and self-report questionnaires. Results obtained from each set of measures were examined using a factor analysis approach. Significant differences between the two groups were found for a single fundamental frequency (F(0))-Regularity factor; a single Grade, Roughness, Breathiness, Asthenia, Strain perceptual factor; and the three self-evaluation factors. Gender differences were found for two acoustic analysis factors, which were based on F(0) and its derivatives, as well as for an aerodynamic factor that represents expiratory volume measurements and a single self-evaluation factor that represents the tendency to seek therapy. Approximately 50% of the first-year acting students had LFs. These students differed from their peers in the control group in a single acoustic analysis factor, as well as perceptual and self-report factors. No group differences, however, were found for the aerodynamic factors. Early laryngeal examination and voice evaluation of future professional voice users could provide a valuable individual baseline, to which later examinations could be compared, and assist in providing personally tailored treatment. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Validation of the Acoustic Voice Quality Index in the Lithuanian Language.

PubMed

Uloza, Virgilijus; Petrauskas, Tadas; Padervinskis, Evaldas; Ulozaitė, Nora; Barsties, Ben; Maryn, Youri

2017-03-01

The aim of the present study was to validate the Acoustic Voice Quality Index in the Lithuanian language (AVQI-LT) and investigate the feasibility and robustness of its diagnostic accuracy, differentiating normal and dysphonic voice. A total of 184 native Lithuanian subjects with normal voices (n = 46) and with various voice disorders (n = 138) were asked to read aloud the Lithuanian text and to sustain the vowel /a/. A sentence with 13 syllables and a 3-second midvowel portion of the sustained vowel were edited. Both speech tasks were concatenated, and perceptually rated for dysphonia severity by five voice clinicians. They rated the Grade (G) from the Grade Roughness Breathiness Asthenia Strain (GRBAS) protocol and the overall severity from the Consensus Auditory-perceptual Evaluation of Voice protocol with a visual analog scale (VAS). The average scores (G_mean and VAS_mean) were taken as the perceptual dysphonia severity level for every voice sample. All concatenated voice samples were acoustically analyzed to receive an AVQI-LT score. Both auditory-perceptual judgment procedures showed sufficient strength of agreement between five raters. The results achieved significant and marked concurrent validity between both auditory-perceptual judgment procedures and AVQI-LT. The diagnostic accuracy of AVQI-LT showed for both auditory-perceptual judgment procedures comparable results with two different AVQI-LT thresholds. The AVQI-LT threshold of 2.97 for the G_mean rating obtained reasonable sensitivity = 0.838 and excellent specificity = 0.937. For the VAS rating, an AVQI-LT threshold of 3.48 was determined with sensitivity = 0.840 and specificity = 0.922. The AVQI-LT is considered a valid and reliable tool for assessing the dysphonia severity level in the Lithuanian-speaking population. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
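To make the reported threshold figures concrete, the sketch below shows how sensitivity and specificity follow from a single cutoff on an index score. It assumes that scores above the threshold are classified as dysphonic (higher index meaning worse voice); the function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, is_dysphonic, threshold):
    """Classify score > threshold as dysphonic and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, is_dysphonic):
        predicted_positive = score > threshold
        if positive and predicted_positive:
            tp += 1  # dysphonic voice correctly flagged
        elif positive:
            fn += 1  # dysphonic voice missed
        elif predicted_positive:
            fp += 1  # normal voice wrongly flagged
        else:
            tn += 1  # normal voice correctly passed
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical index scores with perceptual labels (True = dysphonic).
toy_scores = [1.5, 2.0, 3.2, 4.1, 2.5, 5.0]
toy_labels = [False, False, False, True, True, True]
sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=2.97)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Raising the threshold generally trades sensitivity for specificity, which is one reason a study can arrive at different cutoffs for different perceptual reference ratings.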
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C [Livermore, CA]; Holzrichter, John F [Berkeley, CA]; Ng, Lawrence C [Danville, CA]

2006-08-08

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2004-03-23

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2006-02-14

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.
Acoustic passaggio pedagogy for the male voice.

PubMed

Bozeman, Kenneth Wood

2013-07-01

Awareness of interactions between the lower harmonics of the voice source and the first formant of the vocal tract, and of the passive vowel modifications that accompany them, can assist in working out a smooth transition through the passaggio of the male voice. A stable vocal tract length establishes the general location of all formants, including the higher formants that form the singer's formant cluster. Untrained males instinctively shorten the tube to preserve the strong F1/H2 acoustic coupling of voce aperta, resulting in 'yell' timbre. If tube length and shape are kept stable during pitch ascent, the yell can be avoided by allowing the second harmonic to rise above the first formant, creating the balanced timbre of voce chiusa.
Remote Capture of Human Voice Acoustical Data by Telephone: A Methods Study

ERIC Educational Resources Information Center

Cannizzaro, Michael S.; Reilly, Nicole; Mundt, James C.; Snyder, Peter J.

2005-01-01

In this pilot study we sought to determine the reliability and validity of collecting speech and voice acoustical data via telephone transmission for possible future use in large clinical trials. Simultaneous recordings of each participant's speech and voice were made at the point of participation, the local recording (LR), and over a telephone…
Correlation of VHI-30 to Acoustic Measurements Across Three Common Voice Disorders.

PubMed

Dehqan, Ali; Yadegari, Fariba; Scherer, Ronald C; Dabirmoghadam, Peyman

2017-01-01

Voice disorders that affect the quality of voice also result in varying degrees of psychological and social problems. The research question here is whether the correlations between Voice Handicap Index (VHI)-30 scores and objective acoustic measures differ in patients with different types of voice disorders. The subjects were divided into three groups: muscle tension dysphonia (MTD), benign mid-membranous vocal fold lesions, and unilateral vocal fold paralysis (UVFP). All participants were male. The mean ages for the groups were 32.85 ± 8.6 years in the MTD group, 33.24 ± 7.32 years in the benign lesions group, and 34.24 ± 7.51 years in the UVFP group. The participants completed the Persian VHI-30 questionnaire. PRAAT software was used to obtain acoustic analyses. There was a significant correlation between the physical subscale of the VHI-30 and the total score of the VHI-30 and maximum phonation time (MPT) in the MTD group. Also, there was a significant correlation between the total VHI-30 score and the MPT value. There were relatively strong and significant correlations of the physical subscale of the VHI-30 with jitter, shimmer, and harmonics-to-noise ratio (HNR) in the group with benign lesions such as nodules and polyps. Also, in this group, there was a significant correlation between the total VHI-30 score and the jitter value. The physical scale had strong and significant correlations with jitter, shimmer, and HNR in the unilateral paralysis group. Findings suggest that although the VHI-30 and the acoustic measurements of voice provide independent information, they are associated to some extent. Copyright © 2017 The Voice Foundation. All rights reserved.
Voice Acoustical Measurement of the Severity of Major Depression

ERIC Educational Resources Information Center

Cannizzaro, Michael; Harel, Brian; Reilly, Nicole; Chappell, Phillip; Snyder, Peter J.

2004-01-01

A number of empirical studies have documented the relationship between quantifiable and objective acoustical measures of voice and speech, and clinical subjective ratings of severity of Major Depression. To further explore this relationship, speech samples were extracted from videotape recordings of structured interviews made during the…
Acoustics Characteristics of Voice and Vocal Care in Acting and Other Students

ERIC Educational Resources Information Center

Varosanec-Skaric, Gordana

2008-01-01

Based on voice-history data, a chi-square (χ²) test was used to investigate the difference between students of acting (n = 45) and other students (n = 45). A t-test was used to calculate the differences in acoustic parameters between the two groups. It was expected that students of acting spent significantly more time practicing voice exercises,…
Computational Modeling of Fluid–Structure–Acoustics Interaction during Voice Production

PubMed Central

Jiang, Weili; Zheng, Xudong; Xue, Qian

2017-01-01

The paper presented a three-dimensional, first-principle based fluid–structure–acoustics interaction computer model of voice production, which employed more realistic human laryngeal and vocal tract geometries. Self-sustained vibrations, the important convergent–divergent vibration pattern of the vocal folds, and entrainment of the two dominant vibratory modes were captured. Voice quality-associated parameters including the frequency, open quotient, skewness quotient, and flow rate of the glottal flow waveform were found to be well within the normal physiological ranges. The analogy between the vocal tract and a quarter-wave resonator was demonstrated. The acoustic perturbed flux and pressure inside the glottis were found to be of the same order as their incompressible counterparts, suggesting strong source–filter interactions during voice production. Such a high-fidelity computational model will be useful for investigating a variety of pathological conditions that involve complex vibrations, such as vocal fold paralysis, vocal nodules, and vocal polyps. The model is also an important step toward a patient-specific surgical planning tool that can serve as a no-risk trial-and-error platform for different procedures, such as injection of biomaterials and thyroplastic medialization. PMID:28243588
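The quarter-wave analogy mentioned in this abstract has a simple closed form: a uniform tube closed at one end and open at the other resonates at f_n = (2n - 1) c / (4 L). The sketch below evaluates that formula; the tube length of 17.5 cm and the sound speed of 350 m/s are textbook assumptions chosen for illustration, not values reported in the paper, and the function name is hypothetical.

```python
def quarter_wave_resonances(length_m, sound_speed_m_s=350.0, count=3):
    """Resonance frequencies (Hz) of an ideal uniform tube closed at one end.

    Uses f_n = (2n - 1) * c / (4 * L) for n = 1..count.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * sound_speed_m_s / (4.0 * length_m)
            for n in range(1, count + 1)]


# Assumed idealized vocal tract: 17.5 cm long, uniform cross-section.
for n, freq in enumerate(quarter_wave_resonances(0.175), start=1):
    print(f"resonance {n}: {freq:.0f} Hz")
```

Under these idealized assumptions the resonances fall near 500, 1500, and 2500 Hz, the odd multiples of the lowest resonance that characterize a quarter-wave resonator.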
Acoustic characteristics of voice after severe traumatic brain injury.

PubMed

McHenry, M

2000-07-01

To describe the acoustic characteristics of voice in individuals with motor speech disorders after traumatic brain injury (TBI). Prospective study of 100 individuals with TBI based on consecutive referrals for motor speech evaluations. Subjects were audio tape-recorded while producing sustained vowels and single word and sentence intelligibility tests. Laryngeal airway resistance was estimated, and voice quality was rated perceptually. None of the subjects evidenced vocal parameters within normal limits. The most frequently occurring abnormal parameter across subjects was amplitude perturbation, followed by voice turbulence index. Twenty-three percent of subjects evidenced deviation in all five parameters measured. The perceptual ratings of breathiness were significantly correlated with both the amplitude perturbation quotient and the noise-to-harmonics ratio. Vocal quality deviation is common in motor speech disorders after TBI and may impact intelligibility.
Predicting mutational change in the speaking voice of boys.

PubMed

Fuchs, Michael; Fröehlich, Matthias; Hentschel, Bettina; Stuermer, Ingo W; Kruse, Eberhard; Knauft, Daniel

2007-03-01

The authors investigated whether acoustic speaking voice analyses can be used to predict the beginning of mutation in 21 male members of a professional boys' choir. Over a period of 3 years before mutation, children were examined every 3 months by ear, nose, and throat (ENT) and phoniatric specialists. At the same time, the voice was evaluated acoustically using analysis features of the Goettingen Hoarseness Diagram (GHD). Irregularity component and noise component, jitter, shimmer, mean waveform correlation coefficient, and fundamental frequency were determined from recordings of the speaking voice. Significant changes of acoustic features appeared 7 and 5 months before mutation onset, which indicates that vocal function is already restricted 6 months before mutation onset. This acoustic voice analysis is therefore suitable to support the care of the professional singing voice.
Analysis of the Auditory Feedback and Phonation in Normal Voices.

PubMed

Arbeiter, Mareike; Petermann, Simon; Hoppe, Ulrich; Bohr, Christopher; Doellinger, Michael; Ziethe, Anke

2018-02-01

The aim of this study was to investigate the auditory feedback mechanisms and voice quality during phonation in response to a spontaneous pitch change in the auditory feedback. Does the pitch shift reflex (PSR) change voice pitch and voice quality? Quantitative and qualitative voice characteristics were analyzed during the PSR. Twenty-eight healthy subjects underwent transnasal high-speed videoendoscopy (HSV) at 8000 fps during sustained phonation of [a]. While phonating, the subjects heard their voice shifted up in pitch by 700 cents (an interval of a fifth) for 300 milliseconds in their auditory feedback. The electroencephalography (EEG), acoustic voice signal, electroglottography (EGG), and HSV data were analyzed to statistically compare feedback mechanisms for the pitch-shifted and unshifted conditions of the phonation paradigm. Furthermore, quantitative and qualitative voice characteristics were analyzed. The PSR was successfully detected within all signals of the experimental tools (EEG, EGG, acoustic voice signal, HSV). A significant increase of the perturbation measures and an increase of the values of the acoustic parameters during the PSR were observed, especially for the audio signal. The auditory feedback mechanism seems not only to control for voice pitch but also for voice quality aspects.
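As a worked illustration of the 700-cent shift, a pitch interval in cents corresponds to the frequency ratio 2^(cents/1200), so 700 cents is an equal-tempered fifth with a ratio of about 1.498. The helper names and the 220 Hz example voice below are assumptions made for illustration only.

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch interval in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def shift_frequency(f0_hz, cents):
    """Frequency heard after shifting f0_hz by the given number of cents."""
    return f0_hz * cents_to_ratio(cents)


# Hypothetical speaker phonating at 220 Hz with feedback shifted up by 700 cents.
print(f"ratio: {cents_to_ratio(700):.4f}")
print(f"shifted feedback: {shift_frequency(220.0, 700):.1f} Hz")
```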
Evaluation of Voice Acoustics as Predictors of Clinical Depression Scores.

PubMed

Hashim, Nik Wahidah; Wilkes, Mitch; Salomon, Ronald; Meggs, Jared; France, Daniel J

2017-03-01

The aim of the present study was to determine if acoustic measures of voice, characterizing specific spectral and timing properties, predict clinical ratings of depression severity measured in a sample of patients using the Hamilton Depression Rating Scale (HAMD) and Beck Depression Inventory (BDI-II). This is a prospective study. Voice samples and clinical depression scores were collected prospectively from consenting adult patients who were referred to psychiatry from the adult emergency department or primary care clinics. The patients were audio-recorded as they read a standardized passage in a nearly closed-room environment. Mean Absolute Error (MAE) between actual and predicted depression scores was used as the primary outcome measure. The average MAE between predicted and actual HAMD scores was approximately two scores for both men and women, and the MAE for the BDI-II scores was approximately one score for men and eight scores for women. Timing features were predictive of HAMD scores in female patients while a combination of timing features and spectral features was predictive of scores in male patients. Timing features were predictive of BDI-II scores in male patients. Voice acoustic features extracted from read speech demonstrated variable effectiveness in predicting clinical depression scores in men and women. Voice features were highly predictive of HAMD scores in men and women, and BDI-II scores in men, respectively. The methodology is feasible for diagnostic applications in diverse clinical settings as it can be implemented during a standard clinical interview in a normal closed room and without strict control on the recording environment. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
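The primary outcome measure named above, mean absolute error, is the average of the absolute differences between predicted and actual scores. A minimal Python sketch follows; the function name and the score values are invented for illustration and are not data from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted scores."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical clinician-rated scores and model predictions for three patients.
actual_scores = [10, 20, 30]
predicted_scores = [12, 18, 33]
print(f"MAE: {mean_absolute_error(actual_scores, predicted_scores):.2f} points")
```

Because the error is expressed in the units of the rating scale, an MAE is only meaningful relative to that scale's range, so values for different scales should not be compared directly.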
Voice parameters and videonasolaryngoscopy in children with vocal nodules: a longitudinal study, before and after voice therapy.

PubMed

Valadez, Victor; Ysunza, Antonio; Ocharan-Hernandez, Esther; Garrido-Bustamante, Norma; Sanchez-Valerio, Araceli; Pamplona, Ma C

2012-09-01

Vocal Nodules (VN) are a functional voice disorder associated with voice misuse and abuse in children. There are few reports addressing vocal parameters in children with VN, especially after a period of vocal rehabilitation. The purpose of this study is to describe measurements of vocal parameters including Fundamental Frequency (FF), Shimmer (S), and Jitter (J), videonasolaryngoscopy examination and clinical perceptual assessment, before and after voice therapy in children with VN. Voice therapy was provided using visual support through Speech-Viewer software. Twenty patients with VN were studied. An acoustical analysis of voice was performed and compared with data from subjects from a control group matched by age and gender. Also, clinical perceptual assessment of voice and videonasolaryngoscopy were performed on all patients with VN. After a period of voice therapy, provided with visual support using Speech Viewer-III (SV-III-IBM) software, new acoustical analyses, perceptual assessments and videonasolaryngoscopies were performed. Before the onset of voice therapy, there was a significant difference (p<0.05) in mean FF, S and J, between the patients with VN and subjects from the control group. After the voice therapy period, a significant improvement (p<0.05) was found in all acoustic voice parameters. Moreover, perceptual voice analysis demonstrated improvement in all cases. Finally, videonasolaryngoscopy demonstrated that vocal nodules were no longer discernible on the vocal folds in any of the cases. SV-III software seems to be a safe and reliable method for providing voice therapy in children with VN. Acoustic voice parameters, perceptual data and videonasolaryngoscopy were significantly improved after the speech therapy period was completed. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Multidimensional assessment of strongly irregular voices such as in substitution voicing and spasmodic dysphonia: a compilation of own research.

PubMed

Moerman, Mieke; Martens, Jean-Pierre; Dejonckere, Philippe

2015-04-01

This article is a compilation of our own research performed during the European COoperation in Science and Technology (COST) action 2103: 'Advance Voice Function Assessment', an initiative of voice and speech processing teams consisting of physicists, engineers, and clinicians. This manuscript concerns the analysis of largely irregular voicing types, namely substitution voicing (SV) and adductor spasmodic dysphonia (AdSD). A specific perceptual rating scale (IINFVo) was developed, and the Auditory Model Based Pitch Extractor (AMPEX), a piece of software that automatically analyses running speech and generates pitch values in background noise, was applied. The IINFVo perceptual rating scale has been shown to be useful in evaluating SV. The analysis of strongly irregular voices stimulated a modification of the European Laryngological Society's assessment protocol, which was originally designed for the common types of (less severe) dysphonia. Acoustic analysis with AMPEX demonstrates that the most informative features are, for SV, the voicing-related acoustic features and, for AdSD, the perturbation measures. Poor correlations between self-assessment and acoustic and perceptual dimensions in the assessment of highly irregular voices argue for a multidimensional approach.

Do Standard Instrumental Acoustic, Perceptual, and Subjective Voice Outcomes Indicate Therapy Success in Patients With Functional Dysphonia?

PubMed

Reetz, Stephanie; Bohlender, Joerg E; Brockmann-Bauser, Meike

2018-01-29

The validity and sensitivity to change of instrumental acoustic measurements in patients with functional dysphonia have been controversially discussed. This work examines combined voice therapy effects on standard acoustic measurements, and whether these agree with perceptual and subjective voice outcomes. Retrospective study. Thirty-nine patients (26 women, 13 men) aged 20-70 years (mean: 46.3, standard deviation 12.8) with functional dysphonia were investigated before and after combined voice therapy. Instrumental parameters included mean and range of speaking fundamental frequency (fo) and intensity (SPL (dBA)); maximum SPL and mean fo of calling voice; minimum, maximum, range of singing voice fo and SPL, jitter (%), and the Dysphonia Severity Index. Voice Handicap Index-9 international was used for subjective and Grading-Roughness-Breathiness-Asthenia-Strain scale for perceptual assessment. Differences were investigated by Wilcoxon signed ranks test and coherences by Spearman rank correlation coefficient. After treatment, the speaking voice fo range (from 7 to 8.13 semitones) and SPL range (from 12.9 to 14.85 dB(A)) were significantly larger (P < 0.05). Both parameters were highly correlated (P < 0.001). Subjective symptoms were significantly reduced from a mean Voice Handicap Index-9 international of 15.6 to 8.6, and all perceptual Grading-Roughness-Breathiness-Asthenia-Strain scale parameters were significantly improved (G: from 1.05 to 0.51) after therapy (P < 0.05). These findings were not associated with any acoustic parameter (P > 0.05). Significantly improved subjective and perceptual findings verify positive combined voice therapy effects in patients with functional dysphonia. The larger fo and SPL speaking voice range after treatment indicate an altered voice technique. These instrumental measures may be clinical indicators of therapy success and transfer effects. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
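Fundamental frequency ranges such as those reported above are expressed in semitones, which are obtained from the lowest and highest measured frequencies as 12 × log2(f_max / f_min). The sketch below shows that conversion; the function name and the example frequencies are hypothetical and are not measurements from the study.

```python
import math


def semitone_range(f_min_hz, f_max_hz):
    """Width of a frequency range in semitones: 12 * log2(f_max / f_min)."""
    if f_min_hz <= 0 or f_max_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_max_hz / f_min_hz)


# Hypothetical speaking range from 180 Hz to 270 Hz (a frequency ratio of 3:2).
print(f"range: {semitone_range(180.0, 270.0):.2f} semitones")
```

Working in semitones rather than hertz makes ranges comparable across speakers with different habitual pitch, since equal musical intervals correspond to equal ratios and not to equal differences in hertz.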
Outcomes Measurement in Voice Disorders: Application of an Acoustic Index of Dysphonia Severity

ERIC Educational Resources Information Center

Awan, Shaheen N.; Roy, Nelson

2009-01-01

Purpose: The purpose of this experiment was to assess the ability of an acoustic model composed of both time-based and spectral-based measures to track change following voice disorder treatment and to serve as a possible treatment outcomes measure. Method: A weighted, four-factor acoustic algorithm consisting of shimmer, pitch sigma, the ratio of…
The acoustic and perceptual differences to the non-singer's singing voice before and after a singing vocal warm-up

NASA Astrophysics Data System (ADS)

DeRosa, Angela

The present study analyzed the acoustic and perceptual differences in the non-singer's singing voice before and after a vocal warm-up. Experiments were conducted with 12 females who had no singing experience and considered themselves to be non-singers. Participants were recorded performing 3 tasks: a musical scale stretching to their most comfortable high and low pitches, sustained productions of the vowels /a/ and /i/, and a singing performance of the "Star Spangled Banner." Participants were recorded performing these three tasks before a vocal warm-up, after a vocal warm-up, and then again after 2-3 weeks of practice. Acoustical analysis consisted of formant frequency analysis, singer's formant/singing power ratio analysis, maximum phonation frequency range analysis, and an analysis of jitter, noise to harmonic ratio (NHR), relative average perturbation (RAP), and voice turbulence index (VTI). A perceptual analysis was also conducted with 12 listeners rating comparison performances of before vs. after the vocal warm-up, before vs. after the second vocal warm-up, and after both vocal warm-ups. There were no significant findings for the formant frequency analysis of the vowel /a/, but there was significance for the 1st formant frequency analysis of the vowel /i/. Singer's formant analyzed via Singing Power Ratio analysis showed significance only for the vowel /i/. Maximum phonation frequency range analysis showed a significant increase after the vocal warm-ups. There were no significant findings for the acoustic measures of jitter, NHR, RAP, and VTI. Perceptual analysis showed a significant difference after a vocal warm-up. The results indicate that a singing vocal warm-up can have a significant positive influence on the singing voice of non-singers.
Vocal fold vibrations: high-speed imaging, kymography, and acoustic analysis: a preliminary report.

PubMed

Larsson, H; Hertegård, S; Lindestad, P A; Hammarberg, B

2000-12-01

To evaluate a new analysis system, High-Speed Tool Box (H. Larsson, custom-made program for image analysis, version 1.1, Department of Logopedics and Phoniatrics, Huddinge University Hospital, Huddinge, Sweden, 1998) for studying vocal fold vibrations using a high-speed camera and to relate findings from these analyses to sound characteristics. A Weinberger Speedcam + 500 system (Weinberger AG, Dietikon, Switzerland) was used with a frame rate of 1,904 frames per second. Images were stored and analyzed digitally. Analysis included automatic glottal edge detection and calculation of glottal area variations, as well as kymography. These signals were compared with acoustic waveforms using the Soundswell program (Hitech Development AB, Stockholm, Sweden). The High-Speed Tool Box was applied on two types of high-speed recordings: a diplophonic phonation and a tremor voice. Relations between glottal vibratory patterns and the sound waveform were analyzed. In the diplophonic phonation, the glottal area waveform, as well as the kymogram, showed a specific pattern of repetitive glottal closures, which was also seen in the acoustic waveform. In the tremor voice, fundamental frequency (F0) fluctuations in the acoustic waveform were reflected in slow variations in amplitude in the glottal area waveform. For studying details of mucosal movements during these kinds of abnormal vibrations, the glottal area waveform was particularly useful. Our results suggest that this combined high-speed acoustic-kymographic analysis package is a promising aid for separating and specifying different voice qualities such as diplophonia and voice tremor. Apart from clinical use, this finding should be of help for specification of the terminology of different voice qualities.
Effects of singing training on the speaking voice of voice majors.

PubMed

Mendes, Ana P; Brown, W S; Rothman, Howard B; Sapienza, Christine

2004-03-01

This longitudinal study gathered data with regard to the question: Does singing training have an effect on the speaking voice? Fourteen voice majors (12 females and two males; age range 17 to 20 years) were recorded once a semester for four consecutive semesters, while sustaining vowels and reading the "Rainbow Passage." Acoustic measures included speaking fundamental frequency (SFF) and sound pressure level (SPL). Perturbation measures included jitter, shimmer, and harmonic-to-noise ratio. Temporal measures included sentence, consonant, and diphthong durations. Results revealed that, as the number of semesters increased, the SFF increased while jitter and shimmer slightly decreased. Repeated measure analysis, however, indicated that none of the acoustic, temporal, or perturbation differences were statistically significant. These results confirm earlier cross-sectional studies that compared singers with nonsingers, in that singing training mostly affects the singing voice and rarely the speaking voice.
Voice Dysfunction in Dysarthria: Application of the Multi-Dimensional Voice Program.

ERIC Educational Resources Information Center

Kent, R. D.; Vorperian, H. K.; Kent, J. F.; Duffy, J. R.

2003-01-01

Part 1 of this paper recommends procedures and standards for the acoustic analysis of voice in individuals with dysarthria. In Part 2, acoustic data are reviewed for dysarthria associated with Parkinson disease (PD), cerebellar disease, amyotrophic lateral sclerosis, traumatic brain injury, unilateral hemispheric stroke, and essential tremor.…
Perception of recorded singing voice quality and expertise: cognitive linguistics and acoustic approaches.

PubMed

Morange, Séverine; Dubois, Danièle; Fontaine, Jean-Marc

2010-07-01

The objective of the present pluridisciplinary study was to help determine how a diverse audience appreciates several versions of one original lyrical recording that result from different "restoration" treatments. We present here a joint analysis coupling psychological and linguistic analyses with acoustic descriptions of a unique research object: a song by Caruso, diversely remastered on commercial CDs. Thirty-two subjects were selected to contrast on age ("younger than 30 years" and "older than 60 years"), which relates to their different experience of earlier recording technologies (rendering through devices such as radio, 78rpm records, CD...), and on expertise concerning musical acoustics (acousticians and/or musicians vs ordinary music lovers). Eleven excerpts of reissues of an opera record interpreted by Caruso were selected from what could be found on the market. The listening protocol involved a free categorization task and the selection of excerpts based on preference judgments. Each task involved subjects' free commentaries about their choices as a joint output from psychological processing. A cluster analysis, scaffolded by psycholinguistic processing of the verbal comments on the categories, made it possible to identify both commonalities and differences in how the different groups of subjects grouped the excerpts, along a diversity of criteria varying according to age and expertise. Each excerpt can therefore be characterized according to both psychological and acoustic criteria. This study has enabled us to develop the idea that a lyric voice is a multifaceted object (cultural, esthetic, technical, physical), acoustic parameters being linked to the various sensory experiences and expertise of appraisers. Such pluridisciplinary research and the coupling of the correlated multiplicity of methodologies we developed contribute to a better understanding of listening practices and music-lover assessments here concerned with a
Differences in acoustic and perceptual parameters of the voice between elderly and young women at habitual and high intensity.

PubMed

Mazzetto de Menezes, Keyla S; Master, Suely; Guzman, Marco; Bortnem, Cori; Ramos, Luiz Roberto

2014-01-01

The present study aimed to compare elderly and young female voices at habitual and high intensity. The effect of increased intensity on the acoustic and perceptual parameters was assessed. Sound pressure level, fundamental frequency, jitter, shimmer, and harmonic to noise ratio were obtained at habitual and high intensity voice in a group of 30 elderly women and 30 young women. Perceptual assessment was also performed. Both groups demonstrated an increase in sound pressure level and fundamental frequency from habitual voice to high intensity voice. No differences were found between groups in any acoustic variables on samples recorded with habitual intensity level. No significant differences between groups were found in habitual intensity level for pitch, hoarseness, roughness, and breathiness. Asthenia and instability obtained significantly higher values in elderly than in young participants, whereas the elderly demonstrated lower values for perceived tension and loudness than young subjects. Acoustic and perceptual measures do not demonstrate evident differences between elderly and young speakers in habitual intensity level. The parameters analyzed may lack the sensitivity necessary to detect differences in subjects with normal voices. Phonation with high intensity highlights differences between groups, especially in perceptual parameters. Therefore, high intensity should be included to compare elderly and young voice. Copyright © 2013 Elsevier España, S.L. All rights reserved.
A Flexible Analysis Tool for the Quantitative Acoustic Assessment of Infant Cry

PubMed Central

Reggiannini, Brian; Sheinkopf, Stephen J.; Silverman, Harvey F.; Li, Xiaoxue; Lester, Barry M.

2015-01-01

Purpose: In this article, the authors describe and validate the performance of a modern acoustic analyzer specifically designed for infant cry analysis. Method: Utilizing known algorithms, the authors developed a method to extract acoustic parameters describing infant cries from standard digital audio files. They used a frame length of 25 ms with a frame advance of 12.5 ms. Cepstral-based acoustic analysis proceeded in 2 phases, computing frame-level data and then organizing and summarizing this information within cry utterances. Using signal detection methods, the authors evaluated the accuracy of the automated system to determine voicing and to detect fundamental frequency (F0) as compared to voiced segments and pitch periods manually coded from spectrogram displays. Results: The system detected F0 with 88% to 95% accuracy, depending on tolerances set at 10 to 20 Hz. Receiver operating characteristic analyses demonstrated very high accuracy at detecting voicing characteristics in the cry samples. Conclusions: This article describes an automated infant cry analyzer with high accuracy to detect important acoustic features of cry. A unique and important aspect of this work is the rigorous testing of the system’s accuracy as compared to ground-truth manual coding. The resulting system has implications for basic and applied research on infant cry development. PMID:23785178
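The framing and tolerance-based scoring described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' analyzer: the function names `frame_signal` and `f0_accuracy`, the 16 kHz sampling rate, and the example F0 values are assumptions made for illustration. It only shows how 25 ms analysis frames advanced by 12.5 ms are cut from a signal, and how the share of F0 estimates falling within a tolerance of a manual reference could be counted.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=25.0, hop_ms=12.5):
    """Cut a 1-D signal into overlapping frames (rows of the result)."""
    x = np.asarray(x)
    frame_len = int(round(sr * frame_ms / 1000.0))  # samples per frame
    hop = int(round(sr * hop_ms / 1000.0))          # samples per frame advance
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row k holds samples k*hop .. k*hop + frame_len - 1.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def f0_accuracy(estimated_hz, reference_hz, tol_hz):
    """Fraction of frames whose F0 estimate is within tol_hz of the reference."""
    est = np.asarray(estimated_hz, dtype=float)
    ref = np.asarray(reference_hz, dtype=float)
    return float(np.mean(np.abs(est - ref) <= tol_hz))

# Hypothetical usage: one second of audio at an assumed 16 kHz sampling rate.
frames = frame_signal(np.arange(16000), sr=16000)
print(frames.shape)
# Made-up F0 values, scored at the two tolerances mentioned in the abstract.
print(f0_accuracy([400, 415, 450], [400, 400, 400], tol_hz=10))
print(f0_accuracy([400, 415, 450], [400, 400, 400], tol_hz=20))
```

A wider tolerance can only raise the score, which is consistent with the abstract reporting a range of accuracies across the 10 to 20 Hz tolerances.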
Changes in Acoustic Characteristics of the Voice across the Life Span: Measures from Individuals 4-93 Years of Age

ERIC Educational Resources Information Center

Stathopoulos, Elaine T.; Huber, Jessica E.; Sussman, Joan E.

2011-01-01

Purpose: The purpose of the present investigation was to examine acoustic voice changes across the life span. Previous voice production investigations used small numbers of participants, had limited age ranges, and produced contradictory results. Method: Voice recordings were made from 192 male and female participants 4-93 years of age. Acoustic…
Effects of Bel Canto Training on Acoustic and Aerodynamic Characteristics of the Singing Voice.

PubMed

McHenry, Monica A; Evans, Joseph; Powitzky, Eric

2016-03-01

This study was designed to assess the impact of 2 years of operatic training on acoustic and aerodynamic characteristics of the singing voice. This is a longitudinal study. Participants were 21 graduate students and 16 undergraduate students. They completed a variety of tasks, including laryngeal videostroboscopy, audio recording of pitch range, and singing of syllable trains at full voice in chest, passaggio, and head registers. Inspiration, intraoral pressure, airflow, and sound pressure level (SPL) were captured during the syllable productions. Both graduate and undergraduate students significantly increased semitone range and SPL. The contributions to increased SPL were typically increased inspiration, increased airflow, and reduced laryngeal resistance, although there were individual differences. Two graduate students increased SPL without increased airflow and likely used supraglottal strategies to do so. Students demonstrated improvements in both acoustic and aerodynamic components of singing. Increasing SPL primarily through respiratory drive is a healthy strategy and results from intensive training. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
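For readers unfamiliar with how a pitch range is expressed in semitones, the conversion from two frequencies is a short formula: 12 times the base-2 logarithm of the frequency ratio. The sketch below is an illustration only; the function name `semitone_range` and the example frequencies are assumptions, not values from the study.

```python
import math

def semitone_range(f_low_hz, f_high_hz):
    """Distance between two frequencies in equal-tempered semitones."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)

# Hypothetical example: a range from 220 Hz to 880 Hz spans two octaves.
print(semitone_range(220.0, 880.0))
```

Because the measure depends only on the ratio of the two frequencies, it lets ranges of low and high voices be compared on the same scale.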
A comparison of recordings of sentences and spontaneous speech: perceptual and acoustic measures in preschool children's voices.

PubMed

McAllister, Anita; Brandt, Signe Kofoed

2012-09-01

A well-controlled recording in a studio is fundamental in most voice rehabilitation. However, this laboratory-like recording method has been questioned because voice use in a natural environment may be quite different. In children's natural environment, high background noise levels are common and are an important factor contributing to voice problems. The primary noise source in day-care centers is the children themselves. The aim of the present study was to compare perceptual evaluations of voice quality and acoustic measures from a controlled recording with recordings of spontaneous speech in children's natural environment in a day-care setting. Eleven 5-year-old children were recorded three times during a day at the day care. The controlled speech material consisted of repeated sentences. Matching sentences were selected from the spontaneous speech. All sentences were repeated three times. Recordings were randomized and analyzed acoustically and perceptually. Statistical analyses showed that fundamental frequency was significantly higher in spontaneous speech (P<0.01) as was hyperfunction (P<0.001). The only characteristic the controlled sentences shared with spontaneous speech was degree of hoarseness (Spearman's rho=0.564). When data for boys and girls were analyzed separately, a correlation was found for the parameter breathiness (rho=0.551) for boys, and for girls the correlation for hoarseness remained (rho=0.752). Regarding acoustic data, none of the measures correlated across recording conditions for the whole group. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Effects of the Voice over Internet Protocol on Perturbation Analysis of Normal and Pathological Phonation

PubMed Central

Zhu, Yanmei; Witt, Rachel E.; MacCallum, Julia K.; Jiang, Jack J.

2010-01-01

Objective: In this study, a Voice over Internet Protocol (VoIP) communication based on G.729 protocol was simulated to determine the effects of this system on acoustic perturbation parameters of normal and pathological voice signals. Patients and Methods: Fifty recordings of normal voice and 48 recordings of pathological voice affected by laryngeal paralysis were transmitted through a VoIP communication system. The acoustic analysis programs of CSpeech and MDVP were used to determine the percent jitter and percent shimmer from the voice samples before and after VoIP transmission. The effects of three frequently used audio compression protocols (MP3, WMA, and FLAC) on the perturbation measures were also studied. Results: It was found that VoIP transmission disrupts the waveform and increases the percent jitter and percent shimmer of voice samples. However, after VoIP transmission, significant discrimination between normal and pathological voices affected by laryngeal paralysis was still possible. It was found that the lossless compression method FLAC does not exert any influence on the perturbation measures. The lossy compression methods MP3 and WMA increase percent jitter and percent shimmer values. Conclusion: This study validates the feasibility of these transmission and compression protocols in developing remote voice signal data collection and assessment systems. PMID:20588051
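Percent jitter and percent shimmer are commonly defined as the mean absolute difference between consecutive cycles (period lengths for jitter, peak amplitudes for shimmer) divided by the mean value, times 100. The Python sketch below follows that common "local" definition as an assumption; CSpeech and MDVP may differ in detail, and the function name and sample values are hypothetical.

```python
def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Pass glottal period lengths to get percent jitter, or per-cycle peak
    amplitudes to get percent shimmer (common 'local' definition, assumed here).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

# Made-up cycle data for illustration only.
periods_ms = [5.00, 5.02, 4.99, 5.01, 5.00]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
print(percent_perturbation(periods_ms))   # percent jitter
print(percent_perturbation(amplitudes))   # percent shimmer
```

Since both measures depend on exact cycle boundaries and peaks, any processing that distorts the waveform, such as lossy coding, would be expected to inflate them, which matches the direction of the effect reported above.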
Acoustic Correlates of Fatigue in Laryngeal Muscles: Findings for a Criterion-Based Prevention of Acquired Voice Pathologies

ERIC Educational Resources Information Center

Boucher, Victor J.

2008-01-01

Purpose: The objective was to identify acoustic correlates of laryngeal muscle fatigue in conditions of vocal effort. Method: In a previous study, a technique of electromyography (EMG) served to define physiological signs of "voice fatigue" in laryngeal muscles involved in voicing. These signs correspond to spectral changes in contraction…
Empirical test of the performance of an acoustic-phonetic approach to forensic voice comparison under conditions similar to those of a real case.

PubMed

Enzinger, Ewald; Morrison, Geoffrey Stewart

2017-08-01

In a 2012 case in New South Wales, Australia, the identity of a speaker on several audio recordings was in question. Forensic voice comparison testimony was presented based on an auditory-acoustic-phonetic-spectrographic analysis. No empirical demonstration of the validity and reliability of the analytical methodology was presented. Unlike the admissibility standards in some other jurisdictions (e.g., US Federal Rule of Evidence 702 and the Daubert criteria, or England & Wales Criminal Practice Directions 19A), Australia's Unified Evidence Acts do not require demonstration of the validity and reliability of analytical methods and their implementation before testimony based upon them is presented in court. The present paper reports on empirical tests of the performance of an acoustic-phonetic-statistical forensic voice comparison system which exploited the same features as were the focus of the auditory-acoustic-phonetic-spectrographic analysis in the case, i.e., second-formant (F2) trajectories in /o/ tokens and mean fundamental frequency (f0). The tests were conducted under conditions similar to those in the case. The performance of the acoustic-phonetic-statistical system was very poor compared to that of an automatic system. Copyright © 2017 Elsevier B.V. All rights reserved.
Measures of voiced frication for automatic classification

NASA Astrophysics Data System (ADS)

Jackson, Philip J. B.; Jesus, Luis M. T.; Shadle, Christine H.; Pincas, Jonathan

2004-05-01

As an approach to understanding the characteristics of the acoustic sources in voiced fricatives, it seems apt to draw on knowledge of vowels and voiceless fricatives, which have been relatively well studied. However, the presence of both phonation and frication in these mixed-source sounds offers the possibility of mutual interaction effects, with variations across place of articulation. This paper examines the acoustic and articulatory consequences of these interactions and explores automatic techniques for finding parametric and statistical descriptions of these phenomena. A reliable and consistent set of such acoustic cues could be used for phonetic classification or speech recognition. Following work on devoicing of European Portuguese voiced fricatives [Jesus and Shadle, in Mamede et al. (eds.) (Springer-Verlag, Berlin, 2003), pp. 1-8] and the modulating effect of voicing on frication [Jackson and Shadle, J. Acoust. Soc. Am. 108, 1421-1434 (2000)], the present study focuses on three types of information: (i) sequences and durations of acoustic events in VC transitions, (ii) temporal, spectral and modulation measures from the periodic and aperiodic components of the acoustic signal, and (iii) voicing activity derived from simultaneous EGG data. Analyses of interactions observed in British/American English and European Portuguese speech corpora will be compared, and the principal findings discussed.
Voice amplification versus vocal hygiene instruction for teachers with voice disorders: a treatment outcomes study.

PubMed

Roy, Nelson; Weinrich, Barbara; Gray, Steven D; Tanner, Kristine; Toledo, Sue Walker; Dove, Heather; Corbin-Lewis, Kim; Stemple, Joseph C

2002-08-01

Voice problems are common among schoolteachers. This prospective, randomized clinical trial used patient-based treatment outcomes measures combined with acoustic analysis to evaluate the effectiveness of two treatment programs. Forty-four voice-disordered teachers were randomly assigned to one of three groups: voice amplification using the ChatterVox portable amplifier (VA, n = 15), vocal hygiene (VH, n = 15), and a nontreatment control group (n = 14). Before and after a 6-week treatment phase, all teachers completed: (a) the Voice Handicap Index (VHI), an instrument designed to appraise the self-perceived psychosocial consequences of voice disorders; (b) a voice severity self-rating scale; and (c) an audiorecording for later acoustic analysis. Based on pre- and posttreatment comparisons, only the amplification group experienced significant reductions on mean VHI scores (p = .045), voice severity self-ratings (p = .012), and the acoustic measures of percent jitter (p = .031) and shimmer (p = .008). The nontreatment control group reported a significant increase in level of vocal handicap as assessed by the VHI (p = .012). Although most pre- to posttreatment changes were in the desired direction, no significant improvements were observed within the VH group on any of the dependent measures. Between-group comparisons involving the three possible pairings of the groups revealed a pattern of results to suggest that: (a) compared to the control group, both treatment groups (i.e., VA and VH) experienced significantly more improvement on specific outcomes measures and (b) there were no significant differences between the VA and VH groups to indicate superiority of one treatment over another. Results, however, from a posttreatment questionnaire regarding the perceived benefits of treatment revealed that, compared to the VH group, the VA group reported more clarity of their speaking and singing voice (p = .061), greater ease of voice production (p = .001), and greater
Voice measures of workload in the advanced flight deck

NASA Technical Reports Server (NTRS)

Schneider, Sid J.; Alpert, Murray; Odonnell, Richard

1989-01-01

Voice samples were obtained from 14 male subjects under high and low workload conditions. Acoustical analysis of the voice suggested that high workload conditions can be revealed by their effects on the voice over time. Aircrews in the advanced flight deck will be voicing short, imperative sentences repeatedly. A drop in the energy of the voice, as reflected by reductions in amplitude and frequency over time, and the failure to achieve old amplitude and frequency levels after rest periods, can signal that the workload demands of the situation are straining the speaker. This kind of measurement would be relatively unaffected by individual differences in acoustical measures.
Perceptual and acoustic outcomes of voice therapy for male-to-female transgender individuals immediately after therapy and 15 months later.

PubMed

Gelfer, Marylou Pausewang; Tice, Ruthanne M

2013-05-01

The present study examined how effectively listeners' perceptions of gender could be changed from male to female for male-to-female (MTF) transgender (TG) clients based on the voice signal alone, immediately after voice therapy and at long-term follow-up. Short- and long-term changes in masculinity and femininity ratings and acoustic measures of speaking fundamental frequency (SFF) and vowel formant frequencies were also investigated. Prospective treatment study. Five MTF TG clients, five control female speakers, and five control male speakers provided a variety of speech samples for later analysis. The TG clients then underwent 8 weeks of voice therapy. Voice samples were collected immediately at the termination of therapy and again 15 months later. Two groups of listeners were recruited to evaluate gender and provide masculinity and femininity ratings. Perceptual results revealed that TG subjects were perceived as female 1.9% of the time in the pretest, 50.8% of the time in the immediate posttest, and 33.1% of the time in the long-term posttest. The TG speakers were also perceived as significantly less masculine and more feminine in the immediate posttest and the long-term posttest compared with the pre-test. Some acoustic measures showed significant differences between the pretest and the immediate posttest and long-term posttest. It appeared that 8 weeks of voice therapy could result in vocal changes in MTF TG individuals that persist at least partially for up to 15 months. However, some TG subjects were more successful with voice feminization than others. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Predicting Voice Disorder Status From Smoothed Measures of Cepstral Peak Prominence Using Praat and Analysis of Dysphonia in Speech and Voice (ADSV).

PubMed

Sauder, Cara; Bretl, Michelle; Eadie, Tanya

2017-09-01

The purposes of this study were to (1) determine and compare the diagnostic accuracy of a single acoustic measure, smoothed cepstral peak prominence (CPPS), to predict voice disorder status from connected speech samples using two software systems: Analysis of Dysphonia in Speech and Voice (ADSV) and Praat; and (2) to determine the relationship between measures of CPPS generated from these programs. This is a retrospective cross-sectional study. Measures of CPPS were obtained from connected speech recordings of 100 subjects with voice disorders and 70 nondysphonic subjects without vocal complaints using commercially available ADSV and freely downloadable Praat software programs. Logistic regression and receiver operating characteristic (ROC) analyses were used to evaluate and compare the diagnostic accuracy of CPPS measures. Relationships between CPPS measures from the programs were determined. Results showed acceptable overall accuracy rates (75% accuracy, ADSV; 82% accuracy, Praat) and area under the ROC curves (area under the curve [AUC] = 0.81, ADSV; AUC = 0.91, Praat) for predicting voice disorder status, with slight differences in sensitivity and specificity. CPPS measures derived from Praat were uniquely predictive of disorder status above and beyond CPPS measures from ADSV (χ²(1) = 40.71, P < 0.001). CPPS measures from both programs were significantly and highly correlated (r = 0.88, P < 0.001). A single acoustic measure of CPPS was highly predictive of voice disorder status using either program. Clinicians may consider using CPPS to complement clinical voice evaluation and screening protocols. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

Learning [Voice]

ERIC Educational Resources Information Center

Tauberer, Joshua Ian

2010-01-01

The [voice] distinction between homorganic stops and fricatives is made by a number of acoustic correlates including voicing, segment duration, and preceding vowel duration. The present work looks at [voice] from a number of multidimensional perspectives. This dissertation's focus is a corpus study of the phonetic realization of [voice] in two…
Acoustic Analysis and Electroglottography in Elite Vocal Performers.

PubMed

Villafuerte-Gonzalez, Rocio; Valadez-Jimenez, Victor M; Sierra-Ramirez, Jose A; Ysunza, Pablo Antonio; Chavarria-Villafuerte, Karen; Hernandez-Lopez, Xochiquetzal

2017-05-01

Acoustic analysis of voice (AAV) and electroglottography (EGG) have been used for assessing vocal quality in patients with voice disorders. The effectiveness of these procedures for detecting mild disturbances in vocal quality in elite vocal performers has been controversial. To compare acoustic parameters obtained by AAV and EGG before and after vocal training to determine the effectiveness of these procedures for detecting vocal improvements in elite vocal performers. Thirty-three elite vocal performers were studied. The study group included 14 males and 19 females, ages 18-40 years, without a history of voice disorders. Acoustic parameters were obtained through AAV and EGG before and after vocal training using the Linklater method. Nonsignificant differences (P > 0.05) were found between values of fundamental frequency (F0), shimmer, and jitter obtained by both procedures before vocal training. Mean F0 was similar after vocal training. Jitter percentage as measured by AAV showed nonsignificant differences (P > 0.05) before and after vocal training. Shimmer percentage as measured by AAV demonstrated a significant reduction (P < 0.05) after vocal training. As measured by EGG after vocal training, shimmer and jitter were significantly reduced (P < 0.05); open quotient was significantly increased (P < 0.05); and irregularity was significantly reduced (P < 0.05). AAV and EGG were effective for detecting improvements in vocal function after vocal training in male and female elite vocal performers undergoing vocal training. EGG demonstrated better efficacy for detecting improvements and provided additional parameters as compared to AAV. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The impact of extended voice use on the acoustic characteristics of phonation after training and performance of actors from the La MaMa Experimental Theater club.

PubMed

Ferrone, Carol; Galgano, Jessica; Ramig, Lorraine Olson

2011-05-01

To test the hypothesis that extensive use of La MaMa vocal technique may result in symptoms of vocal abuse, an evaluation of the acoustic and perceptual characteristics of voice for eight performers from the Great Jones Repertory Company of the La MaMa Experimental Theater was conducted. This vocal technique includes wide ranges of frequency from 46 to 2003 Hz and vocal intensity that is sustained at 90-108 dB sound pressure level with a mouth-to-microphone distance of 30 cm for 3-4 hours per performance. The actors rehearsed for 4 hours per day, 5 days per week for 14 weeks before the series of performances. Thirty-nine performances were presented in 6 weeks. Three pretraining, three posttraining, and two postperformance series data collection sessions were carried out for each performer. Speech samples were gathered using the CSL 4500 and analyzed using Real-Time Pitch program and Multidimensional Voice Program. Acoustic analysis was performed on 48 tokens of sustained vowel phonation for each subject. Statistical analysis was performed using the Friedman test of related samples. Perceptual analysis included professional listeners rating voice quality in pretraining, posttraining, and postperformance samples of the Rainbow Passage and sample lines from the plays. The majority of professional listeners (11/12) judged that this technique would result in symptoms of vocal abuse; however, acoustic data revealed statistically stable or improved measurements for all subjects in most dependent acoustic variables when compared with both posttraining and postperformance trials. These findings add support to the notion that a technique that may be perceived as vocally abusive, generating 90-100 dB sound pressure level and sustained over 6 weeks of performances, actually resulted in improved vocal strength and flexibility. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Changes in objective acoustic measurements and subjective voice complaints in call center customer-service advisors during one working day.

PubMed

Lehto, Laura; Laaksonen, Laura; Vilkman, Erkki; Alku, Paavo

2008-03-01

The aim of this study was to investigate how different acoustic parameters, extracted both from speech pressure waveforms and glottal flows, can be used in measuring vocal loading in modern working environments and how these parameters reflect the possible changes in the vocal function during a working day. In addition, correlations between objective acoustic parameters and subjective voice symptoms were addressed. The subjects were 24 female and 8 male customer-service advisors, who mainly use the telephone during their working hours. Speech samples were recorded from continuous speech four times during a working day and voice symptom questionnaires were completed simultaneously. Among the various objective parameters, only F0 resulted in a statistically significant increase for both genders. No correlations between the changes in objective and subjective parameters appeared. However, the results encourage researchers within the field of occupational voice use to apply versatile measurement techniques in studying occupational voice loading.
Acoustic correlates of Japanese expressions associated with voice quality of male adults

NASA Astrophysics Data System (ADS)

Kido, Hiroshi; Kasuya, Hideki

2004-05-01

Japanese expressions associated with the voice quality of male adults were extracted by a series of questionnaire surveys and statistical multivariate analysis. One hundred and thirty-seven Japanese expressions were collected through the first questionnaire and careful investigations of well-established Japanese dictionaries and articles. From the second questionnaire about familiarity with each of the expressions and synonymity that were addressed to 249 subjects, 25 expressions were extracted. The third questionnaire was about an evaluation of their own voice quality. By applying a statistical clustering method and a correlation analysis to the results of the questionnaires, eight bipolar expressions and one unipolar expression were obtained. They constituted high-pitched/low-pitched, masculine/feminine, hoarse/clear, calm/excited, powerful/weak, youthful/elderly, thick/thin, tense/lax, and nasal, respectively. Acoustic correlates of each of the eight bipolar expressions were extracted by means of perceptual evaluation experiments that were made with sentence utterances of 36 males and by a statistical decision tree method. They included an average of the fundamental frequency (F0) of the utterance, speaking rate, spectral tilt, formant frequency parameter, standard deviation of F0 values, and glottal noise, when SPL of each of the stimuli was maintained identical in the perceptual experiments.
Speech waveform perturbation analysis: a perceptual-acoustical comparison of seven measures.

PubMed

Askenfelt, A G; Hammarberg, B

1986-03-01

The performance of seven acoustic measures of cycle-to-cycle variations (perturbations) in the speech waveform was compared. All measures were calculated automatically and applied on running speech. Three of the measures refer to the frequency of occurrence and severity of waveform perturbations in special selected parts of the speech, identified by means of the rate of change in the fundamental frequency. Three other measures refer to statistical properties of the distribution of the relative frequency differences between adjacent pitch periods. One perturbation measure refers to the percentage of consecutive pitch period differences with alternating signs. The acoustic measures were tested on tape recorded speech samples from 41 voice patients, before and after successful therapy. Scattergrams of acoustic waveform perturbation data versus an average of perceived deviant voice qualities, as rated by voice clinicians, are presented. The perturbation measures were compared with regard to the acoustic-perceptual correlation and their ability to discriminate between normal and pathological voice status. The standard deviation of the distribution of the relative frequency differences was suggested as the most useful acoustic measure of waveform perturbations for clinical applications.
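Two of the measure families named in this abstract can be sketched from a list of pitch period lengths. The Python below is an illustration under stated assumptions, not the authors' implementation: the relative frequency difference between adjacent periods is normalized here by the mean of the two frequencies, the standard deviation is the population form, and the function names and period values are hypothetical.

```python
import statistics

def relative_f0_differences(periods_s):
    """Relative F0 change between adjacent pitch periods.

    Assumption: each difference is normalized by the mean of the two
    adjacent frequencies; the paper may normalize differently.
    """
    f0 = [1.0 / t for t in periods_s]
    return [(b - a) / ((a + b) / 2.0) for a, b in zip(f0, f0[1:])]

def sd_relative_f0_differences(periods_s):
    """Standard deviation of the distribution of relative F0 differences."""
    return statistics.pstdev(relative_f0_differences(periods_s))

def percent_alternating_signs(periods_s):
    """Percentage of consecutive period differences whose signs alternate."""
    diffs = [b - a for a, b in zip(periods_s, periods_s[1:])]
    pairs = list(zip(diffs, diffs[1:]))
    if not pairs:
        raise ValueError("need at least three pitch periods")
    alternating = sum(1 for a, b in pairs if a * b < 0)
    return 100.0 * alternating / len(pairs)

# Made-up pitch periods in seconds (roughly 100 Hz), for illustration only.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0103, 0.0100]
print(sd_relative_f0_differences(periods))
print(percent_alternating_signs(periods))
```

The alternating-sign percentage responds to period-doubling patterns (long, short, long, short), while the standard deviation responds to the overall spread of cycle-to-cycle changes, so the two capture different aspects of perturbation.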
Chaos tool implementation for non-singer and singer voice comparison (preliminary study)

NASA Astrophysics Data System (ADS)

Dajer, Me; Pereira, Jc; Maciel, Cd

2007-11-01

The voice waveform is linked to the stretching, shortening, widening, or constricting of the vocal tract. The articulation effects of the singer's vocal tract modify the acoustical characteristics of the voice, which therefore differ from those of non-singer voices. In recent decades, Chaos Theory has made it possible to explore the dynamic nature of voice signals from a different point of view. The purpose of this paper is to apply the chaos technique of phase space reconstruction to analyze non-singer and singer voices in order to explore the nonlinear dynamics of the signals and to correlate them with traditional acoustic parameters. Eight voice samples of sustained vowel /i/ from non-singers and eight from singers were analyzed with "ANL" software. The samples were also acoustically analyzed with "Analise de Voz 5.0" in order to extract the acoustic perturbation measures jitter and shimmer, and the coefficient of excess (EX). The results showed different visual patterns for the two groups, correlated with different jitter, shimmer, and coefficient of excess values. We conclude that these results clearly indicate the potential of the phase space reconstruction technique for the analysis and comparison of non-singer and singer voices. They also point to a promising tool for voice training applications.
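Phase space reconstruction is commonly done by time-delay embedding, in which each point of the reconstructed trajectory is built from samples of the signal taken a fixed delay apart. The Python sketch below shows that general idea only; it is not the "ANL" software used in the study, and the function name, embedding dimension, and delay are hypothetical values chosen for illustration.

```python
import math


def delay_embed(signal, dim=2, delay=5):
    """Time-delay embedding of a 1-D signal.

    Returns a list of `dim`-tuples
    (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    `dim` and `delay` are illustrative defaults, not values from the study.
    """
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this dim and delay")
    return [
        tuple(signal[i + k * delay] for k in range(dim))
        for i in range(n_points)
    ]


# Usage: embed a synthetic 200 Hz tone sampled at 8 kHz (a stand-in for a vowel).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(400)]
trajectory = delay_embed(tone, dim=2, delay=10)
```

Plotting the first coordinate of each tuple against the second gives the kind of visual pattern the abstract refers to: a clean closed loop for a periodic signal, and a more scattered figure as perturbation increases.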
Acoustic Measures of Voice and Physiologic Measures of Autonomic Arousal during Speech as a Function of Cognitive Load.

PubMed

MacPherson, Megan K; Abur, Defne; Stepp, Cara E

2017-07-01

This study aimed to determine the relationship among cognitive load condition and measures of autonomic arousal and voice production in healthy adults. A prospective study design was used. Sixteen healthy young adults (eight men, eight women) produced a sentence containing an embedded Stroop task in each of two cognitive load conditions: congruent and incongruent. In both conditions, participants said the font color of the color words instead of the word text. In the incongruent condition, font color differed from the word text, creating an increase in cognitive load relative to the congruent condition in which font color and word text matched. Three physiologic measures of autonomic arousal (pulse volume amplitude, pulse period, and skin conductance response amplitude) and four acoustic measures of voice (sound pressure level, fundamental frequency, cepstral peak prominence, and low-to-high spectral energy ratio) were analyzed for eight sentence productions in each cognitive load condition per participant. A logistic regression model was constructed to predict the cognitive load condition (congruent or incongruent) using subject as a categorical predictor and the three autonomic measures and four acoustic measures as continuous predictors. It revealed that skin conductance response amplitude, cepstral peak prominence, and low-to-high spectral energy ratio were significantly associated with cognitive load condition. During speech produced under increased cognitive load, healthy young adults show changes in physiologic markers of heightened autonomic arousal and acoustic measures of voice quality. Future work is necessary to examine these measures in older adults and individuals with voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Fundamental frequency and voice perturbation measures in smokers and non-smokers: An acoustic and perceptual study

NASA Astrophysics Data System (ADS)

Freeman, Allison

This research examined the fundamental frequency and perturbation (jitter % and shimmer %) measures in young adult (20-30 year-old) and middle-aged adult (40-55 year-old) smokers and non-smokers; there were 36 smokers and 36 non-smokers. Acoustic analysis was carried out utilizing one task: production of sustained /a/. These voice samples were analyzed utilizing Multi-Dimensional Voice Program (MDVP) software, which provided values for fundamental frequency, jitter %, and shimmer %. These values were analyzed for trends regarding smoking status, age, and gender. Statistical significance was found regarding the fundamental frequency, jitter %, and shimmer % for smokers as compared to non-smokers; smokers were found to have significantly lower fundamental frequency values, and significantly higher jitter % and shimmer % values. Statistical significance was not found regarding fundamental frequency, jitter %, and shimmer % for age group comparisons. With regard to gender, statistical significance was found regarding fundamental frequency; females were found to have statistically higher fundamental frequencies as compared to males. However, the relationships between gender and jitter % and shimmer % lacked statistical significance. These results indicate that smoking negatively affects voice quality. This study also examined the ability of untrained listeners to identify smokers and non-smokers based on their voices. Results of this voice perception task suggest that listeners are not able to accurately identify smokers and non-smokers, as statistical significance was not reached. However, despite a lack of significance, trends in the data suggest that listeners are able to utilize voice quality to identify smokers and non-smokers.
Singer's preferred acoustic condition in performance in an opera house and self-perception of the singer's voice

NASA Astrophysics Data System (ADS)

Noson, Dennis; Kato, Kosuke; Ando, Yoichi

2004-05-01

Solo singers have been shown to overestimate the relative sound pressure level of a delayed, external reproduction of their own voice, singing single syllables, which, in turn, appears to influence the preferred delay of simulated stage reflections [Noson, Ph.D. thesis, Kobe University, 2003]. Bone conduction is thought to be one factor separating singer versus instrumental performer judgments of stage acoustics. The changes in singer preference for delayed reflections are primarily explained by a parameter derived from the vocal signal autocorrelation function (ACF envelope), rather than by internal bone conduction. An auditory model of a singer's preferred reflection delay is proposed, combining the effects of acoustical environment (reflection amplitude), bone conduction, and performer vocal overestimate, which may be applied to the acoustic design of reflecting elements in both upstage and forestage environments of opera stages. For example, soloists who characteristically underestimate external voice levels (or overestimate their own voice) should be provided shorter distances to reflective panels, irrespective of their singing style. Adjustable elements can be deployed to adapt opera houses intended for bel canto style performances to other styles. Additional examples will also be discussed. a) Now at Kumamoto Univ., Kumamoto, Japan. b) Now at 1-10-27 Yamano Kami, Kumamoto, Japan.
Acoustic analysis of speech variables during depression and after improvement.

PubMed

Nilsonne, A

1987-09-01

Speech recordings were made of 16 depressed patients during depression and after clinical improvement. The recordings were analyzed using a computer program which extracts acoustic parameters from the fundamental frequency contour of the voice. The percent pause time, the standard deviation of the voice fundamental frequency distribution, the standard deviation of the rate of change of the voice fundamental frequency and the average speed of voice change were found to correlate to the clinical state of the patient. The mean fundamental frequency, the total reading time and the average rate of change of the voice fundamental frequency did not differ between the depressed and the improved group. The acoustic measures were more strongly correlated to the clinical state of the patient as measured by global depression scores than to single depressive symptoms such as retardation or agitation.
Influence of Smartphones and Software on Acoustic Voice Measures

PubMed Central

GRILLO, ELIZABETH U.; BROSIOUS, JENNA N.; SORRELL, STACI L.; ANAND, SUPRAJA

2016-01-01

This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat). Correlations between the software programs that calculated the voice measures were also analyzed. Results demonstrated no significant within-subject variability across devices and software and that some of the measures were highly correlated across software programs. The study suggests that certain smartphones may be appropriate to record daily voice measures representing the effects of vocal loading within individuals. In addition, even though different algorithms are used to compute voice measures across software programs, some of the programs and measures share a similar relationship. PMID:28775797
Automatic Assessment of Acoustic Parameters of the Singing Voice: Application to Professional Western Operatic and Jazz Singers.

PubMed

Manfredi, Claudia; Barbagallo, Davide; Baracca, Giovanna; Orlandi, Silvia; Bandini, Andrea; Dejonckere, Philippe H

2015-07-01

The obvious perceptual differences between various singing styles like Western operatic and jazz rely on specific dissimilarities in vocal technique. The present study focuses on differences in vibrato acoustics and in singer's formant as analyzed by a novel software tool, named BioVoice, based on robust high-resolution and adaptive techniques whose validity has been demonstrated on synthetic voice signals. A total of 48 professional singers were investigated (29 females; 19 males; 29 Western operatic; and 19 jazz). They were asked to sing "a cappella," but with artistic expression, a well-known musical phrase from Gershwin's Porgy and Bess, in their own style: either operatic or jazz. A specific sustained note was extracted for detailed vibrato analysis. Besides rate (s⁻¹) and extent (cents), duration (seconds) and regularity were computed. Two new concepts are introduced: vibrato jitter and vibrato shimmer, by analogy with the traditional jitter and shimmer of voice signals. For the singer's formant, on the same sustained tone, the ratio of the acoustic energy in formants 1-2 to the energy in formants 3, 4, and 5 was automatically computed, providing a quality ratio (QR). Vibrato rates did not differ among groups. Extent was significantly larger in operatic singers, particularly females. Vibrato jitter and vibrato shimmer were significantly smaller in operatic singers. Duration of vibrato was also significantly longer in operatic singers. QR was significantly lower in male operatic singers. Some vibrato characteristics (extent, regularity, and duration) very clearly differentiate the Western operatic singing style from the jazz singing style. The singer's formant is typical of male operatic singers. The new software tool is well suited to provide useful feedback in a pedagogical context. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Materials of acoustic analysis: sustained vowel versus sentence.

PubMed

Moon, Kyung Ray; Chung, Sung Min; Park, Hae Sang; Kim, Han Su

2012-09-01

Sustained vowel is a widely used material of acoustic analysis. However, vowel phonation does not sufficiently demonstrate sentence-based real-life phonation, and biases may occur depending on the test subject's intent during pronunciation. The purpose of this study was to investigate the differences between the results of acoustic analysis using each material. An individual prospective study. Two hundred two individuals (87 men and 115 women) with normal findings in videostroboscopy were enrolled. Acoustic analysis was done using the speech pattern element acquisition and display program. Fundamental frequency (Fx), amplitude (Ax), contact quotient (Qx), jitter, and shimmer were measured with sustained vowel-based acoustic analysis. Average fundamental frequency (FxM), average amplitude (AxM), average contact quotient (QxM), Fx perturbation (CFx), and amplitude perturbation (CAx) were measured with sentence-based acoustic analysis. Corresponding data of the two methods were compared with each other. SPSS (Statistical Package for the Social Sciences, Version 12.0; SPSS, Inc., Chicago, IL) software was used for statistical analysis. FxM was higher than Fx in men (Fx, 124.45 Hz; FxM, 133.09 Hz; P=0.000). In women, FxM seemed to be lower than Fx, but the results were not statistically significant (Fx, 210.58 Hz; FxM, 208.34 Hz; P=0.065). There was no statistically significant difference between Ax and AxM in either group. QxM was higher than Qx in men and women. Jitter was lower in men, but CFx was lower in women. Both shimmer and CAx were higher in men. Sustained vowel phonation could not be a complete substitute for real-time phonation in acoustic analysis. Characteristics of acoustic materials should be considered when choosing the material for acoustic analysis and interpreting the results. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Effects on vocal range and voice quality of singing voice training: the classically trained female voice.

PubMed

Pabon, Peter; Stallinga, Rob; Södersten, Maria; Ternström, Sten

2014-01-01

A longitudinal study was performed on the acoustical effects of singing voice training under a given study program, using the voice range profile (VRP). Pretraining and posttraining recordings were made of students who participated in a 3-year bachelor singing study program. A questionnaire that included questions on optimal range, register use, classification, vocal health and hygiene, mixing technique, and training goals was used to rate and categorize self-assessed voice changes. Based on the responses, a subgroup of 10 classically trained female voices was selected, which was homogeneous enough for effects of training to be identified. The VRP perimeter contour was analyzed for effects of voice training. Also, a mapping within the VRP of voice quality, as expressed by the crest factor, was used to indicate the register boundaries and to monitor the acoustical consequences of the newly learned vocal technique of "mixed voice." VRPs were averaged across subjects. Findings were compared with the self-assessed vocal changes. Pre/post comparison of the average VRPs showed, in the midrange, (1) a decrease in the VRP area that was associated with the loud chest voice, (2) a reduction of the crest factor values, and (3) a reduction of maximum sound pressure level values. The students' self-evaluations of the voice changes appeared in some cases to contradict the VRP findings. VRPs of individual voices were seen to change over the course of a singing education. These changes were manifest also in the average group. High-resolution computerized recording, complemented with an acoustic register marker, allows a meaningful assessment of some effects of training, on an individual basis and for groups that comprise singers of a specific genre. It is argued that this kind of investigation is possible only within a focused training program, given by a faculty who has agreed on the goals. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Is There an Ironic Tone of Voice?

ERIC Educational Resources Information Center

Bryant, Gregory A.; Fox Tree, Jean E.

2005-01-01

Research on nonverbal vocal cues and verbal irony has often relied on the concept of an "ironic tone of voice". Here we provide acoustic analysis and experimental evidence that this notion is oversimplified and misguided. Acoustic analyses of spontaneous ironic speech extracted from talk radio shows, both ambiguous and unambiguous in…
Birth Control Pills and Nonprofessional Voice: Acoustic Analyses

ERIC Educational Resources Information Center

Amir, Ofer; Biron-Shental, Tal; Shabtai, Esther

2006-01-01

Purpose: Two studies are presented here. Study 1 was aimed at evaluating whether the voice characteristics of women who use birth control pills that contain different progestins differ from the voice characteristics of a control group. Study 2 presents a meta-analysis that combined the results of Study 1 with those from 3 recent studies that…
Correlation of VHI-10 to voice laboratory measurements across five common voice disorders.

PubMed

Gillespie, Amanda I; Gooding, William; Rosen, Clark; Gartner-Schmidt, Jackie

2014-07-01

To correlate change in Voice Handicap Index (VHI)-10 scores with corresponding voice laboratory measures across five voice disorders. Retrospective study. One hundred fifty patients aged >18 years with primary diagnosis of vocal fold lesions, primary muscle tension dysphonia-1, atrophy, unilateral vocal fold paralysis (UVFP), and scar. For each group, participants with the largest change in VHI-10 between two periods (TA and TB) were selected. The dates of the VHI-10 values were linked to corresponding acoustic/aerodynamic and audio-perceptual measures. Changes in voice laboratory values were analyzed for correlation with each other and with VHI-10. VHI-10 scores were greater for patients with UVFP than other disorders. The only disorder-specific correlation between voice laboratory measure and VHI-10 was average phonatory airflow in speech for patients with UVFP. Average airflow in repeated phonemes was strongly correlated with average airflow in speech (r=0.75). Acoustic measures did not significantly change between time points. The lack of correlations between the VHI-10 change scores and voice laboratory measures may be due to differing constructs of each measure; namely, handicap versus physiological function. Presuming corroboration between these measures may be faulty. Average airflow in speech may be the most ecologically valid measure for patients with UVFP. Although aerodynamic measures changed between the time points, acoustic measures did not. Correlations to VHI-10 and change between time points may be found with other acoustic measures. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Measuring voice outcomes: state of the science review.

PubMed

Carding, Paul N; Wilson, J A; MacKenzie, K; Deary, I J

2009-08-01

Researchers evaluating voice disorder interventions currently have a plethora of voice outcome measurement tools from which to choose. Faced with such a wide choice, it would be beneficial to establish a clear rationale to guide selection. This article reviews the published literature on the three main areas of voice outcome assessment: (1) perceptual rating of voice quality, (2) acoustic measurement of the speech signal and (3) patient self-reporting of voice problems. We analysed the published reliability, validity, sensitivity to change and utility of the common outcome measurement tools in each area. From the data, we suggest that routine voice outcome measurement should include (1) an expert rating of voice quality (using the Grade-Roughness-Breathiness-Asthenia-Strain rating scale) and (2) a short self-reporting tool (either the Vocal Performance Questionnaire or the Vocal Handicap Index 10). These measures have high validity, the best reported reliability to date, good sensitivity to change data and excellent utility ratings. However, their application and administration require attention to detail. Acoustic measurement has arguable validity and poor reliability data at the present time. Other areas of voice outcome measurement (e.g. stroboscopy and aerodynamic phonatory measurements) require similarly detailed research and analysis.
Comparison of Pitch Strength With Perceptual and Other Acoustic Metric Outcome Measures Following Medialization Laryngoplasty.

PubMed

Rubin, Adam D; Jackson-Menaldi, Cristina; Kopf, Lisa M; Marks, Katherine; Skeffington, Jean; Skowronski, Mark D; Shrivastav, Rahul; Hunter, Eric J

2018-05-14

The diagnoses of voice disorders, as well as treatment outcomes, are often tracked using visual (eg, stroboscopic images), auditory (eg, perceptual ratings), objective (eg, from acoustic or aerodynamic signals), and patient report (eg, Voice Handicap Index and Voice-Related Quality of Life) measures. However, many of these measures are known to have low to moderate sensitivity and specificity for detecting changes in vocal characteristics, including vocal quality. The objective of this study was to compare changes in estimated pitch strength (PS) with other conventionally used acoustic measures based on the cepstral peak prominence (smoothed cepstral peak prominence, cepstral spectral index of dysphonia, and acoustic voice quality index), and clinical judgments of voice quality (GRBAS [grade, roughness, breathiness, asthenia, strain] scale) following laryngeal framework surgery. This study involved post hoc analysis of recordings from 22 patients pretreatment and post treatment (thyroplasty and behavioral therapy). Sustained vowels and connected speech were analyzed using objective measures (PS, smoothed cepstral peak prominence, cepstral spectral index of dysphonia, and acoustic voice quality index), and these results were compared with mean auditory-perceptual ratings by expert clinicians using the GRBAS scale. All four acoustic measures changed significantly in the direction that usually indicates improved voice quality following treatment (P < 0.005). Grade and breathiness correlated the strongest with the acoustic measures (|r| ~0.7) with strain being the least correlated. Acoustic analysis on running speech highly correlates with judged ratings. PS is a robust, easily obtained acoustic measure of voice quality that could be useful in the clinical environment to follow treatment of voice disorders. Copyright © 2018. Published by Elsevier Inc.

Updating signal typing in voice: addition of type 4 signals.

PubMed

Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu

2010-06-01

The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
Effects of melody and technique on acoustical and musical features of western operatic singing voices.

PubMed

Larrouy-Maestri, Pauline; Magis, David; Morsomme, Dominique

2014-05-01

The operatic singing technique is frequently used in classical music. Several acoustical parameters of this specific technique have been studied but how these parameters combine remains unclear. This study aims to further characterize the Western operatic singing technique by observing the effects of melody and technique on acoustical and musical parameters of the singing voice. Fifty professional singers performed two contrasting melodies (popular song and romantic melody) with two vocal techniques (with and without operatic singing technique). The common quality parameters (energy distribution, vibrato rate, and extent), perturbation parameters (standard deviation of the fundamental frequency, signal-to-noise ratio, jitter, and shimmer), and musical features (fundamental frequency of the starting note, average tempo, and sound pressure level) of the 200 sung performances were analyzed. The results regarding the effect of melody and technique on the acoustical and musical parameters show that the choice of melody had a limited impact on the parameters observed, whereas a particular vocal profile appeared depending on the vocal technique used. This study confirms that vocal technique affects most of the parameters examined. In addition, the observation of quality, perturbation, and musical parameters contributes to a better understanding of the Western operatic singing technique. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Freddie Mercury-acoustic analysis of speaking fundamental frequency, vibrato, and subharmonics.

PubMed

Herbst, Christian T; Hertegard, Stellan; Zangger-Borch, Daniel; Lindestad, Per-Åke

2017-04-01

Freddie Mercury was one of the twentieth century's best-known singers of commercial contemporary music. This study presents an acoustical analysis of his voice production and singing style, based on perceptual and quantitative analysis of publicly available sound recordings. Analysis of six interviews revealed a median speaking fundamental frequency of 117.3 Hz, which is typically found for a baritone voice. Analysis of voice tracks isolated from full band recordings suggested that the singing voice range was 37 semitones within the pitch range of F#2 (about 92.2 Hz) to G5 (about 784 Hz). Evidence for higher phonations up to a fundamental frequency of 1,347 Hz was not deemed reliable. Analysis of 240 sustained notes from 21 a-cappella recordings revealed a surprisingly high mean fundamental frequency modulation rate (vibrato) of 7.0 Hz, reaching the range of vocal tremor. Quantitative analysis utilizing a newly introduced parameter to assess the regularity of vocal vibrato corroborated its perceptually irregular nature, suggesting that vibrato (ir)regularity is a distinctive feature of the singing voice. Imitation of subharmonic phonation samples by a professional rock singer, documented by endoscopic high-speed video at 4,132 frames per second, revealed a 3:1 frequency locked vibratory pattern of vocal folds and ventricular folds.
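The 37-semitone range quoted above can be checked from the two boundary frequencies, since in equal temperament each semitone corresponds to a frequency ratio of 2^(1/12). The short Python sketch below does this; the function name is hypothetical, and the frequencies are the approximate values given in the abstract.

```python
import math


def semitones_between(f_low, f_high):
    """Interval between two frequencies in equal-tempered semitones."""
    return 12.0 * math.log2(f_high / f_low)


# Boundary frequencies as quoted in the abstract (approximate values).
span = semitones_between(92.2, 784.0)
print(round(span))  # rounds to the 37 semitones reported
```

The same function gives 12 semitones for any octave (for example 220 Hz to 440 Hz), which is a quick sanity check on the formula.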
Voice analysis before and after vocal rehabilitation in patients following open surgery on vocal cords.

PubMed

Bunijevac, Mila; Petrović-Lazić, Mirjana; Jovanović-Simić, Nadica; Vuković, Mile

2016-02-01

The major role of the larynx in speech, respiration and swallowing makes carcinomas of this region and their treatment very influential for patients' quality of life. The aim of this study was to assess the importance of voice therapy in patients after open surgery on vocal cords. This study included 21 male patients and a control group of 19 subjects. The vowel (A) was recorded and analyzed for each examinee. All the patients were recorded twice: first, when they contacted the clinic, and second, after a three-month vocal therapy, which was held twice per week on an outpatient basis. The voice analysis was carried out in the Ear, Nose and Throat (ENT) Clinic, Clinical Hospital Center "Zvezdara" in Belgrade. The values of the acoustic parameters in the patients submitted to open surgery on the vocal cords before vocal rehabilitation and the control group subjects were significantly different in all specified parameters. These results suggest that the voice of the patients was damaged before vocal rehabilitation. The results of the acoustic parameters of the vowel (A) before and after vocal rehabilitation of the patients with open surgery on vocal cords were statistically significantly different. For two of the parameters, Jitter (%) and Shimmer (%), the observed difference was highly statistically significant (p < 0.01). The voice turbulence index and the noise/harmonic ratio were also notably improved, and the observed difference was statistically significant (p < 0.05). The analysis of the tremor intensity index showed no significant improvement and the observed difference was not statistically significant (p > 0.05). CONCLUSION. There was a significant improvement of the acoustic parameters of the vowel (A) in the study subjects three months following vocal therapy. Only one out of five representative parameters showed no significant improvement.
Acoustic changes in student actors' voices after 12 months of training.

PubMed

Walzak, Peta; McCabe, Patricia; Madill, Cate; Sheard, Christine

2008-05-01

This study aimed to evaluate acoustic changes in student actors' voices after 12 months of actor training. The design used was a longitudinal study. Eighteen students enrolled in an Australian tertiary 3-year acting program (nine male and nine female) were assessed at the beginning of their acting course and again 12 months later using a questionnaire, interview, maximum phonation time (MPT), reading, spontaneous speaking, sustained phonation tasks, and a pitch range task. Samples were analyzed for MPT, fundamental frequency across tasks, pitch range for speaking and reading, singing pitch range, noise-to-harmonic ratio, shimmer, and jitter. After training, measures of shimmer significantly increased for both male and female participants. Female participants' pitch range significantly increased after training, with a significantly lower mean frequency for their lowest pitch. The finding of limited or negative changes for some measures indicates that further investigation is required into the long-term effects of actor voice training and which parameters of voicing are most targeted and valued in training. Particular investigation into the relationship between training targets and outcomes could more reliably inform acting programs about changes in teaching methodologies. Further research into the relationship between specific training techniques, physiological changes, and vocal changes may also provide information on implementing more evidence-based training methods.
External Validation of the Acoustic Voice Quality Index Version 03.01 With Extended Representativity.

PubMed

Barsties, Ben; Maryn, Youri

2016-07-01

The Acoustic Voice Quality Index (AVQI) is an objective method to quantify the severity of overall voice quality in concatenated continuous speech and sustained phonation segments. Recently, AVQI was successfully modified to be more representative and ecologically valid because the internal consistency of AVQI was balanced out through equal proportion of the 2 speech types. The present investigation aims to explore its external validation in a large data set. An expert panel of 12 speech-language therapists rated the voice quality of 1058 concatenated voice samples varying from normophonia to severe dysphonia. The Spearman rank-order correlation coefficients (r) were used to measure concurrent validity. The AVQI's diagnostic accuracy was evaluated with several estimates of its receiver operating characteristics (ROC). Finally, 8 of the 12 experts were chosen on the basis of reliability criteria. A strong correlation was identified between AVQI and auditory-perceptual rating (r = 0.815, P = .000). It indicated that 66.4% of the auditory-perceptual rating's variation was explained by AVQI. Additionally, the ROC results showed again the best diagnostic outcome at a threshold of AVQI = 2.43. This study highlights external validation and diagnostic precision of the AVQI version 03.01 as a robust and ecologically valid measurement to objectify voice quality. © The Author(s) 2016.
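The 66.4% figure above is consistent with squaring the reported correlation coefficient, the usual way of expressing shared variance. The abstract does not spell out the calculation, so the following worked line is an inference from the two reported numbers:

```latex
r = 0.815 \quad\Longrightarrow\quad r^{2} = 0.815^{2} = 0.664225 \approx 66.4\%
```

Because r here is a Spearman rank-order coefficient, the squared value is most safely read as the proportion of variance shared between the ranks of the two measures.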
Relationship between Activity Noise, Voice Parameters, and Voice Symptoms among Female Teachers.

PubMed

Pirilä, Sirpa; Pirilä, Paula; Ansamaa, Terhi; Yliherva, Anneli; Sonning, Samuel; Rantala, Leena

2017-01-01

Our interest was in how teachers' voices behave during the delivery of lessons in core subjects (e.g., mathematics, science, etc.). We sought to evaluate the relationship between voice sound pressure level (SPL), vocal fundamental frequency (F0), voice symptoms, activity noise, and differences therein during the first and the last lessons in core subjects of the day. The participants were 24 female elementary school teachers. Voice symptoms were evaluated by questionnaire. The data were recorded on 2 portable voice accumulators (VoxLog) from the first and last lessons of the day. The versions of accumulators differed by frequency weighting; therefore, the analysis and the results of noise and voice SPL were treated separately: unweighted (group 1) and A-weighted (group 2). Difference in voice SPL followed difference in activity noise. F0 increased between the first and last lessons. Correlations were found between differences in the noise and the voice symptoms of tiredness and dryness. Irritating mucus was associated with high F0 during the first lesson. An apparent increase in voice loading due to the activity noise was observed during lessons in core subjects. Collaboration between specialists in voice and acoustics and teachers and pupils is needed to reduce this voice loading. © 2017 S. Karger AG, Basel.
Objective and Subjective Aspects of Voice in Pregnancy.

PubMed

Saltürk, Ziya; Kumral, Tolgar Lütfi; Bekiten, Güler; Atar, Yavuz; Ataç, Enes; Aydoğdu, İmran; Yıldırım, Güven; Kılıç, Aydın; Uyar, Yavuz

2016-01-01

This study aimed to evaluate vocal changes in pregnancy according to trimesters both objectively and subjectively. Fifty pregnant women and 15 nonpregnant women were included in the study. Eighteen of the 50 pregnant women were in the first trimester, 17 in the second trimester, and 15 in the third trimester of their pregnancies. The fundamental frequency (F0), jitter, shimmer, noise-to-harmonics ratio (NHR), and minimum and maximum pitch were determined during acoustic voice analysis. Laryngologic examination was evaluated via reflux finding score (RFS). Voice Handicap Index 10 (VHI-10) was used for subjective analysis. Maximum phonation time (MPT), VHI-10, and RFS were the parameters that differed significantly. MPT was significantly shorter in the third trimester. Acoustic analysis revealed that F0, jitter, shimmer, NHR, and minimum and maximum pitch values were not significantly different in any groups. RFS was higher in the first and third trimesters than the second trimester and control groups. VHI-10 scores were significantly higher in the third trimester. Our results showed that MPT is decreased during the third trimester, although acoustic parameters did not differ. VHI-10 results deteriorated in the third trimester significantly. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Body mass index and acoustic voice parameters: is there a relationship.

PubMed

Souza, Lourdes Bernadete Rocha de; Santos, Marquiony Marques Dos

2017-05-06

Specific elements such as weight and body volume can interfere in voice production and consequently in its acoustic parameters, which is why it is important for the clinician to be aware of these relationships. To investigate the relationship between body mass index and the average acoustic voice parameters. Observational, cross-sectional descriptive study. The sample consisted of 84 women, aged between 18 and 40 years, with a mean age of 26.83 (±6.88). The subjects were grouped according to body mass index (19 underweight, 23 in the normal range, 20 overweight, and 22 obese) and were evaluated for the fundamental frequency of the sustained vowel [a] and the maximum phonation time of the vowels [a], [i], [u], using PRAAT software. The data were submitted to the Kruskal-Wallis test to verify whether there were differences between the groups regarding the study variables. All variables showed statistically significant results and were then subjected to the non-parametric Mann-Whitney test. Regarding the average fundamental frequency, there was a statistically significant difference between the underweight group and the overweight and obese groups, and between the normal-range group and the overweight and obese groups. The average maximum phonation time revealed a statistically significant difference between underweight and obese individuals, normal-range and obese individuals, and overweight and obese individuals. Body mass index influenced the average fundamental frequency of overweight and obese individuals evaluated in this study. Obesity contributed to reducing the average maximum phonation time. Copyright © 2017 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Phonation Types in Marathi: An Acoustic Investigation

ERIC Educational Resources Information Center

Berkson, Kelly Harper

2013-01-01

This dissertation presents a comprehensive instrumental acoustic analysis of phonation type distinctions in Marathi, an Indic language with numerous breathy voiced sonorants and obstruents. Important new facts about breathy voiced sonorants, which are crosslinguistically rare, are established: male and female speakers cue breathy phonation in…
Lax Vox as a Voice Training Program for Teachers: A Pilot Study.

PubMed

Mailänder, Eva; Mühre, Lea; Barsties, Ben

2017-03-01

The objective of this study was to explore the effectiveness of a 3-week training program with the voice therapy "Lax Vox" for teachers. Four healthy female teachers participated as volunteers for the study. Several voice measurements of perception, acoustics, aerodynamics, and self-evaluation were investigated. Furthermore, a survey to rate the applicability of Lax Vox was also part of the study. To assess the treatment effects of the Lax Vox training, an effect size analysis (d unb ) was conducted. After 3 weeks of training, medium and large improvements were found in some parameters of perceptual and acoustic voice quality assessments (d unb >0.50 and d unb >0.80, respectively). Furthermore, medium improvements were revealed in some parameters of self-evaluation (ie, physical and total scale of the Voice Handicap Index) and aerodynamic (ie, maximum phonation time) assessments (all d unb >0.50). Additionally, acoustic measures of vocal function showed an expansion in the upper contour of voice range profiles after training. Particularly, the main improvements in the voice range profile was found in the modal and the beginning of the falsetto voice registers. There was an increase of the intensity levels of about 4.6 dB. No changes were revealed in some acoustic measures of the voice range profile, self-evaluation measurements, and the perception of breathy voice quality (all d unb <0.20). Finally, the applicability of Lax Vox perceptually showed clear support in training success, learning process, and transfer to the daily routine. Lax Vox training for teachers appears to improve select measures of voice quality, maximum phonation time, vocal function, self-evaluation, and perceived applicability. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic analysis of speech under stress.

PubMed

Sondhi, Savita; Khan, Munna; Vijay, Ritu; Salhan, Ashok K; Chouhan, Satish

2015-01-01

When a person is emotionally charged, stress could be discerned in his voice. This paper presents a simplified and a non-invasive approach to detect psycho-physiological stress by monitoring the acoustic modifications during a stressful conversation. Voice database consists of audio clips from eight different popular FM broadcasts wherein the host of the show vexes the subjects who are otherwise unaware of the charade. The audio clips are obtained from real-life stressful conversations (no simulated emotions). Analysis is done using PRAAT software to evaluate mean fundamental frequency (F0) and formant frequencies (F1, F2, F3, F4) both in neutral and stressed state. Results suggest that F0 increases with stress; however, formant frequency decreases with stress. Comparison of Fourier and chirp spectra of short vowel segment shows that for relaxed speech, the two spectra are similar; however, for stressed speech, they differ in the high frequency range due to increased pitch modulation.
Multi-Dimensional Voice Program (MDVP) vs Praat for Assessing Euphonic Subjects: A Preliminary Study on the Gender-discriminating Power of Acoustic Analysis Software.

PubMed

Lovato, Andrea; De Colle, Wladimiro; Giacomelli, Luciano; Piacente, Alessandro; Righetto, Lara; Marioni, Gino; de Filippis, Cosimo

2016-11-01

The aim of this study was to compare the discriminatory power of the Multi-Dimensional Voice Program (MDVP) and Praat in distinguishing the gender of euphonic adults. This is a cross-sectional study. The recordings of 100 euphonic volunteers (50 males and 50 females) producing a sustained vowel /a/ were analyzed with MDVP and Praat software. Both computer programs identified significant differences between male and female volunteers in absolute jitter (MDVP P < 0.00001 and Praat P < 0.00001) and in shimmer in decibel (dB) (MDVP P = 0.006 and Praat P = 0.001). Using the scale proposed by Hosmer and Lemeshow, we found no gender discrimination for shimmer in dB with either the MDVP (area under the receiver operating characteristics curve [AUC] = 0.658) or Praat (AUC = 0.682). In our series, on the other hand, MDVP absolute jitter achieved an acceptable discrimination between males and females (AUC = 0.752), and Praat absolute jitter achieved an outstanding discrimination (AUC = 0.901). The discriminatory power of Praat absolute jitter was significantly higher than that of the MDVP (P = 0.003). Absolute jitter sensitivity and specificity were also higher for Praat (83% and 80%) than for the MDVP (74% and 49%). Differences attributable to a subject's gender and to the software used to measure acoustic parameters should be carefully considered in both research and clinical settings. Further studies are needed to test the discriminatory power of different voice analysis programs when differentiating between normal and dysphonic voices. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
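The verbal labels attached to the AUC values above can be reproduced with a small lookup. This is a sketch, assuming the commonly cited Hosmer and Lemeshow cut-points of 0.7, 0.8 and 0.9; the abstract does not list the cut-points it used, and the function name and the "excellent" band are illustrative additions:

```python
def discrimination_label(auc: float) -> str:
    """Map an AUC to a verbal label using assumed cut-points (0.7, 0.8, 0.9)."""
    if auc >= 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "no useful discrimination"


# AUC values quoted in the abstract.
reported = {
    "MDVP shimmer (dB)": 0.658,
    "Praat shimmer (dB)": 0.682,
    "MDVP absolute jitter": 0.752,
    "Praat absolute jitter": 0.901,
}

for measure, auc in reported.items():
    print(f"{measure}: AUC = {auc:.3f} ({discrimination_label(auc)})")
```

With these assumed bands, the four labels agree with the wording of the abstract.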
Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.

PubMed

Rothenberg, Martin; Schutte, Harm K

2016-11-01

In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction mechanism that some sopranos, singing in their high range, can use to reduce the total airflow, to allow holding the note longer, and simultaneously enrich the quality of the voice, without straining the voice. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Computerized Analysis of Acoustic Characteristics of Patients with Internal Nasal Valve Collapse Before and After Functional Rhinoplasty

PubMed Central

Rezaei, Fariba; Omrani, Mohammad Reza; Abnavi, Fateme; Mojiri, Fariba; Golabbakhsh, Marzieh; Barati, Sohrab; Mahaki, Behzad

2015-01-01

Acoustic analysis of sounds produced during speech provides significant information about the physiology of the larynx and vocal tract. The analysis of the voice power spectrum is a fundamental, sensitive method of acoustic assessment that provides valuable information about the voice source and characteristics of vocal tract resonance cavities. The changes in long-term average spectrum (LTAS) spectral tilt and harmonics-to-noise ratio (HNR) were analyzed to assess the voice quality before and after functional rhinoplasty in patients with internal nasal valve collapse. Before and 3 months after functional rhinoplasty, 12 participants were evaluated and HNR and LTAS spectral tilt in /a/ and /i/ vowels were estimated. An increase in HNR and a decrease in LTAS spectral tilt were seen after surgery. Mean LTAS spectral tilt in vowel /a/ decreased from 2.37 ± 1.04 to 2.28 ± 1.17 (P = 0.388), and it decreased from 4.16 ± 1.65 to 2.73 ± 0.69 in vowel /i/ (P = 0.008). Mean HNR in the vowel /a/ increased from 20.71 ± 3.93 to 25.06 ± 2.67 (P = 0.002), and it increased from 21.28 ± 4.11 to 25.26 ± 3.94 in vowel /i/ (P = 0.002). Modification of the vocal tract caused the vocal cords to close sufficiently, and this showed that although rhinoplasty did not affect the larynx directly, it changes the structure of the vocal tract and consequently the resonance of voice production. The aim of this study was to investigate the changes in voice parameters after functional rhinoplasty in patients with internal nasal valve collapse by computerized analysis of acoustic characteristics. PMID:26955564
The professional voice.

PubMed

Benninger, M S

2011-02-01

The human voice is not only the key to human communication but also serves as the primary musical instrument. Many professions rely on the voice, but the most noticeable and visible are singers. Care of the performing voice requires a thorough understanding of the interaction between the anatomy and physiology of voice production, along with an awareness of the interrelationships between vocalisation, acoustic science and non-vocal components of performance. This review gives an overview of the care and prevention of professional voice disorders by describing the unique and integrated anatomy and physiology of singing, the roles of development and training, and the importance of the voice care team.
Subjective and objective voice evaluation in Sjögren's syndrome.

PubMed

Saltürk, Ziya; Özdemir, Erdi; Kumral, Tolgar Lütfi; Karabacakoğlu, Zeynep; Kumral, Esra; Yildiz, Hatice Elvin; Mersinlioğlu, Gökhan; Atar, Yavuz; Berkiten, Güler; Yildirim, Güven; Uyar, Yavuz

2017-04-01

Objective The aim of this study is to assess the subjective and objective aspects of voice in Sjögren's syndrome. Methods The study enrolled 10 women with Sjögren's syndrome and 12 healthy women. Maximum phonation time, fundamental frequency, jitter, shimmer, and noise-to-harmonics ratio were determined during acoustic voice analysis. The Stroboscopy Evaluation Rating Form was used for the laryngostroboscopic evaluation. A subjective evaluation was performed using the Turkish version of Voice Handicap Index-10. Results The mean age of the Sjögren's syndrome and control groups was 46 ± 13.89 and 41.27 ± 6.99 years, respectively, and did not differ (P = 0.131). In the laryngostroboscopic evaluation, the smoothness and straightness of vocal folds, regularity, and glottal closure differed significantly. In the acoustic and aerodynamic analyses, none of the parameters differed statistically, while the Sjögren's syndrome group had significantly higher Voice Handicap Index-10 scores than the controls. Conclusion Sjögren's syndrome affects the voice and voice quality.
Effects of steam inhalation on voice quality-related acoustic measures.

PubMed

Mahalingam, Shenbagavalli; Boominathan, Prakash

2016-10-01

To investigate the effects of steam inhalation using a facial steamer on voice quality-related acoustic measures. Prospective outcome research: single-blinded experimental study. Forty-five vocally healthy female subjects ranging in age from 18 to 30 years (Mean age: 22.41 years; standard deviation [SD]: 8.91) participated in the study. Phonation samples were recorded under three different conditions in triplicate: baseline recording, immediately after mouth breathing (dehydration), and immediately after 3 minutes of steam inhalation via the mouth (rehydration). In the initial voice recording (prior to dehydration), mean jitter (0.42 %; SD: 0.07), shimmer (2.20 dB; SD: 0.45), and harmonics-to-noise ratio (HNR) (21.60; SD: 2.41) values were within normal limits. After short-term mouth breathing (dehydration, approximately 10 minutes), the mean jitter (1.57 %; SD: 1.82) and shimmer (4.73 dB; SD: 1.83) were significantly increased (P < 0.05), and HNR (18.64; SD: 3.16) was reduced (P < 0.05). After steam inhalation (rehydration) for 3 minutes, mean jitter (0.48 %; SD: 0.12) and shimmer (2.70 dB; SD: 0.71) showed significant decrease (P < 0.05), and HNR (20.10; SD: 3.69) showed significant increase (P < 0.05). All parameters statistically significantly improved from post dehydration values. The simple procedure of steam inhalation using a facial steamer displayed positive effects on parameters proposed to reflect voice quality. 4. Laryngoscope, 126:2305-2309, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
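Jitter and shimmer, reported above in % and dB, are cycle-to-cycle perturbation measures. The sketch below uses the widely used "local" definitions (mean absolute difference between consecutive periods relative to the mean period, and mean absolute dB ratio of consecutive peak amplitudes). The abstract does not state which software or definitions were used, so the formulas, function names and sample values here are assumptions for illustration only:

```python
import math


def jitter_local_percent(periods):
    """Mean absolute difference of consecutive periods, as % of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


def shimmer_local_db(amplitudes):
    """Mean absolute dB ratio between peak amplitudes of consecutive cycles."""
    steps = [abs(20.0 * math.log10(b / a))
             for a, b in zip(amplitudes, amplitudes[1:])]
    return sum(steps) / len(steps)


# Hypothetical glottal cycle lengths (seconds) and peak amplitudes (arbitrary units).
periods = [0.00500, 0.00502, 0.00498, 0.00501]
amplitudes = [0.80, 0.82, 0.79, 0.81]

print(f"jitter  = {jitter_local_percent(periods):.2f} %")
print(f"shimmer = {shimmer_local_db(amplitudes):.2f} dB")
```

Both functions expect at least two cycles; a perfectly periodic signal gives zero for each measure.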
Analysis of Measured and Simulated Supraglottal Acoustic Waves.

PubMed

Fraile, Rubén; Evdokimova, Vera V; Evgrafova, Karina V; Godino-Llorente, Juan I; Skrelin, Pavel A

2016-09-01

To date, although much attention has been paid to the estimation and modeling of the voice source (ie, the glottal airflow volume velocity), the measurement and characterization of the supraglottal pressure wave have been much less studied. Some previous results have unveiled that the supraglottal pressure wave has some spectral resonances similar to those of the voice pressure wave. This makes the supraglottal wave partially intelligible. Although the explanation for such effect seems to be clearly related to the reflected pressure wave traveling upstream along the vocal tract, the influence that nonlinear source-filter interaction has on it is not as clear. This article provides an insight into this issue by comparing the acoustic analyses of measured and simulated supraglottal and voice waves. Simulations have been performed using a high-dimensional discrete vocal fold model. Results of such comparative analysis indicate that spectral resonances in the supraglottal wave are mainly caused by the regressive pressure wave that travels upstream along the vocal tract and not by source-tract interaction. On the contrary and according to simulation results, source-tract interaction has a role in the loss of intelligibility that happens in the supraglottal wave with respect to the voice wave. This loss of intelligibility mainly corresponds to spectral differences for frequencies above 1500 Hz. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range

NASA Astrophysics Data System (ADS)

Ternström, Sten; Johansson, Dennis; Selamtzis, Andreas

2018-01-01

From soft to loud and low to high, the mechanisms of human voice have many degrees of freedom, making it difficult to assess phonation from the acoustic signal alone. FonaDyn is a research tool that combines acoustics with electroglottography (EGG). It characterizes and visualizes in real time the dynamics of EGG waveforms, using statistical clustering of the cycle-synchronous EGG Fourier components, and their sample entropy. The prevalence and stability of different EGG waveshapes are mapped as colored regions into a so-called voice range profile, without needing pre-defined thresholds or categories. With appropriately 'trained' clusters, FonaDyn can classify and map voice regimes. This is of potential scientific, clinical and pedagogical interest.

Tracking voice change after thyroidectomy: application of spectral/cepstral analyses.

PubMed

Awan, Shaheen N; Helou, Leah B; Stojadinovic, Alexander; Solomon, Nancy Pearl

2011-04-01

This study evaluates the utility of perioperative spectral and cepstral acoustic analyses to monitor voice change after thyroidectomy. Perceptual and acoustic analyses were conducted on speech samples (sustained vowel /α/ and CAPE-V sentences) provided by 70 participants (36 women and 34 men) at four study time points: prior to thyroid surgery and 2 weeks, 3 months and 6 months after thyroidectomy. Repeated measures analyses of variance focused on the relative amplitude of the dominant harmonic in the voice signal (cepstral peak prominence, CPP), the ratio of low-to-high spectral energy, and their respective standard deviations (SD). Data were also examined for relationships between acoustic measures and perceptual ratings of overall severity of voice quality. Results showed that perceived overall severity and the acoustic measures of the CPP and its SD (CPPsd) computed from sentence productions were significantly reduced at 2-week post-thyroidectomy for 20 patients (29% of the sample) who had self-reported post-operative voice change. For this same group of patients, the CPP and CPPsd computed from sentence productions improved significantly from 2-weeks post-thyroidectomy to 6-months post-surgery. CPP and CPPsd also correlated well with perceived overall severity (r = -0.68 and -0.79, respectively). Measures of CPP from sustained vowel productions were not as effective as those from sentence productions in reflecting voice deterioration in the post-thyroidectomy patients at the 2-week post-surgery time period, were weaker correlates with perceived overall severity, and were not as effective in discriminating negative voice outcome (NegVO) from normal voice outcome (NormVO) patients as compared to the results from the sentence-level stimuli. Results indicate that spectral/cepstral analysis methods can be used with continuous speech samples to provide important objective data to document the effects of dysphonia in a post-thyroidectomy patient sample. When used in
Assessments of Voice Use and Voice Quality Among College/University Singing Students Ages 18-24 Through Ambulatory Monitoring With a Full Accelerometer Signal.

PubMed

Schloneger, Matthew J; Hunter, Eric J

2017-01-01

The multiple social and performance demands placed on college/university singers could put their still-developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hertz) of voice use among such students. Nevertheless, no studies to date have incorporated the simultaneous acoustic voice quality measures into the acquisition of these measures to allow for direct comparison during the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and the estimated voice quality of college/university singing students (18-24 years old, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays measuring voice from an unprocessed accelerometer signal measured at the neck. From this signal, traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose were analyzed. Additional acoustic measures included perceived pitch, pitch strength, long-term average spectrum slope, alpha ratio, dB sound pressure level 1-3 kHz, and harmonic-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between nonsinging, solo singing, and choral singing. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
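The vocal dose metrics named above can be illustrated from a frame-by-frame fundamental-frequency track. This is a minimal sketch, assuming the usual definitions (time dose as total voiced time, cycle dose as the number of vocal fold oscillations, phonation percentage as voiced time over total time). The function name, the frame representation and the sample values are hypothetical; distance dose is omitted because it also needs a vibration amplitude estimate:

```python
def vocal_doses(f0_frames, frame_s):
    """Return (time dose in s, cycle dose in cycles, phonation percentage).

    f0_frames: fundamental frequency per analysis frame in Hz, 0 for unvoiced.
    frame_s:   duration of one frame in seconds.
    """
    voiced = [f0 for f0 in f0_frames if f0 > 0]
    time_dose = len(voiced) * frame_s
    # Each voiced frame contributes f0 * frame duration oscillation cycles.
    cycle_dose = sum(f0 * frame_s for f0 in voiced)
    phonation_pct = 100.0 * len(voiced) / len(f0_frames)
    return time_dose, cycle_dose, phonation_pct


# Hypothetical 50 ms frames: silence, two voiced frames at 200 Hz, silence.
time_dose, cycle_dose, phonation_pct = vocal_doses([0, 200, 200, 0], 0.05)
print(time_dose, cycle_dose, phonation_pct)
```

In an ambulatory setting the frame track would come from the accelerometer signal; here it is typed in by hand purely to show the bookkeeping.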
Voice similarity in identical twins.

PubMed

Van Gysel, W D; Vercammen, J; Debruyne, F

2001-01-01

People asked to tell apart the two individuals of a monozygotic twin (MT) pair by sight usually have difficulty. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% and 63% for female voices and 74% and 52% for male voices, for the sentences and the sustained /a/ respectively, all being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification, however, is not perfect. The voice pitch possibly contributes to the correct twin identifications.
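The 33% chance level quoted above follows from the trio design: with three voices there are three possible pairs, and only one of them is the twin pair. A short check, with variable names chosen for illustration:

```python
from itertools import combinations

# The three voices in a trio, numbered as in the abstract.
voices = (1, 2, 3)

# Every pair a listener could label as "the twins".
candidate_pairs = list(combinations(voices, 2))

# Exactly one pair is correct, so random guessing succeeds 1 time in 3.
chance = 1 / len(candidate_pairs)

print(candidate_pairs, f"{chance:.0%}")
```

The enumerated pairs are the same 1-2, 1-3 and 2-3 options offered to the listeners.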
Electroglottographic analysis of actresses and nonactresses' voices in different levels of intensity.

PubMed

Master, Suely; Guzman, Marco; Carlos de Miranda, Helder; Lloyd, Adam

2013-03-01

Previous studies with long-term average spectrum (LTAS) showed the importance of the glottal source for understanding the projected voices of actresses. In this study, electroglottographic (EGG) analysis was used to investigate the contribution of the glottal source to the projected voice, comparing actresses and nonactresses' voices, in different levels of intensity. Thirty actresses and 30 nonactresses sustained vowels in habitual, moderate, and loud intensity levels. The EGG variables were contact quotient (CQ), closing quotient (QCQ), and opening quotient (QOQ). Other variables were sound pressure level (SPL) and fundamental frequency (F0). A KayPENTAX EGG was used. Variables were inputted in a general linear model. Actresses showed significantly higher values for SPL, in all levels, and both groups increased SPL significantly while changing from habitual to moderate and further to loud. There were no significant differences between groups for EGG quotients. There were significant differences between the levels only for F0 and CQ for both groups. SPL was significantly higher among actresses in all intensity levels, but in the EGG analysis, no differences were found. This apparently weak contribution of the glottal source in the supposedly projected voices of actresses, contrary to previous LTAS studies, might be because of a higher subglottal pressure or perhaps greater vocal tract contribution in SPL. Results from the present study suggest that trained subjects did not produce a significant higher SPL than untrained individuals by increasing the cost in terms of higher vocal fold collision and hence more impact stress. Future researches should explore the difference between trained and nontrained voices by aerodynamic measurements to evaluate the relationship between physiologic findings and the acoustic and EGG data. Moreover, further studies should consider both types of vocal tasks, sustained vowel and running speech, for both EGG and LTAS analysis
Assessments of Voice Use and Voice Quality among College/University Singing Students Ages 18–24 through Ambulatory Monitoring with a Full Accelerometer Signal

PubMed Central

Schloneger, Matthew; Hunter, Eric

2016-01-01

The multiple social and performance demands placed on college/university singers could put their still developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hz) of voice use among such students. Nevertheless, no studies to date have incorporated the simultaneous acoustic voice quality measures into the acquisition of these measures to allow for direct comparison during the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and estimated voice quality of college/university singing students (18–24 y/o, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays measuring voice from an unprocessed accelerometer signal measured at the neck. From this signal were analyzed traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose. Additional acoustic measures included perceived pitch, pitch strength, LTAS slope, alpha ratio, dB SPL 1–3 kHz, and harmonic-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between non-singing, solo singing and choral singing. PMID:26897545
Long-term voice handicap index after type II thyroplasty using titanium bridges for adductor spasmodic dysphonia.

PubMed

Sanuki, Tetsuji; Yumoto, Eiji; Kodama, Narihiro; Minoda, Ryosei; Kumai, Yoshihiko

2014-06-01

To determine the long-term functional outcomes of type II thyroplasty using titanium bridges for adductor spasmodic dysphonia (AdSD) by perceptual analysis using the Voice Handicap Index-10 (VHI-10) and by acoustic analysis. Fifteen patients with AdSD underwent type II thyroplasty using titanium bridges between August 2006 and February 2011. Scores on the VHI-10, a patient-based survey that quantifies a patient's perception of his or her vocal handicap, were determined before and at least 2 years after surgery. Concurrent with the VHI-10 evaluation, acoustic parameters were assessed, including jitter, shimmer, harmonic-to-noise ratio (HNR), standard deviation of F0 (SDF0), and degree of voice breaks (DVB). The average follow-up interval was 30.1 months. No patient had strangulation of the voice, and all were satisfied with the voice postoperatively. In the perceptual analysis, the mean VHI-10 score improved significantly, from 26.7 to 4.1 two years after surgery. Scores on all three aspects of the VHI-10 improved significantly in all patients, representing improved functional, physical, and emotional well-being. All acoustic parameters improved significantly 2 years after surgery. The treatment of AdSD with type II thyroplasty significantly improved the voice-related quality of life and acoustic parameters 2 years after surgery. The results of the study suggest that type II thyroplasty using titanium bridges provides long-term relief of vocal symptoms in patients with AdSD. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Vocal effectiveness of speech-language pathology students: Before and after voice use during service delivery.

PubMed

Couch, Stephanie; Zieba, Dominique; Van der Linde, Jeannie; Van der Merwe, Anita

2015-03-26

As a professional voice user, it is imperative that a speech-language pathologist's (SLP) vocal effectiveness remain consistent throughout the day. Many factors may contribute to reduced vocal effectiveness, including prolonged voice use, vocally abusive behaviours, poor vocal hygiene and environmental factors. To determine the effect of service delivery on the perceptual and acoustic features of voice. A quasi-experimental, pre-test-post-test research design was used. Participants included third- and final-year speech-language pathology students at the University of Pretoria (South Africa). Voice parameters were evaluated in a pre-test measurement, after which the participants provided two consecutive hours of therapy. A post-test measurement was then completed. Data analysis consisted of an instrumental analysis in which the multidimensional voice programme (MDVP) and the voice range profile (VRP) were used to measure vocal parameters and then calculate the dysphonia severity index (DSI). The GRBASI scale was used to conduct a perceptual analysis of voice quality. Data were processed using descriptive statistics to determine change in each measured parameter after service delivery. A change of clinical significance was observed in the acoustic and perceptual parameters of voice. Guidelines for SLPs in order to maintain optimal vocal effectiveness were suggested.
Vocal effectiveness of speech-language pathology students: Before and after voice use during service delivery

PubMed Central

Couch, Stephanie; Zieba, Dominique; van der Merwe, Anita

2015-01-01

Background As a professional voice user, it is imperative that a speech-language pathologist's (SLP) vocal effectiveness remain consistent throughout the day. Many factors may contribute to reduced vocal effectiveness, including prolonged voice use, vocally abusive behaviours, poor vocal hygiene and environmental factors. Objectives To determine the effect of service delivery on the perceptual and acoustic features of voice. Method A quasi-experimental, pre-test–post-test research design was used. Participants included third- and final-year speech-language pathology students at the University of Pretoria (South Africa). Voice parameters were evaluated in a pre-test measurement, after which the participants provided two consecutive hours of therapy. A post-test measurement was then completed. Data analysis consisted of an instrumental analysis in which the multidimensional voice programme (MDVP) and the voice range profile (VRP) were used to measure vocal parameters and then calculate the dysphonia severity index (DSI). The GRBASI scale was used to conduct a perceptual analysis of voice quality. Data were processed using descriptive statistics to determine change in each measured parameter after service delivery. Results A change of clinical significance was observed in the acoustic and perceptual parameters of voice. Conclusion Guidelines for SLPs in order to maintain optimal vocal effectiveness were suggested. PMID:26304213
Adductor spasmodic dysphonia: Relationships between acoustic indices and perceptual judgments

NASA Astrophysics Data System (ADS)

Cannito, Michael P.; Sapienza, Christine M.; Woodson, Gayle; Murry, Thomas

2003-04-01

This study investigated relationships between acoustical indices of spasmodic dysphonia and perceptual scaling judgments of voice attributes made by expert listeners. Audio-recordings of The Rainbow Passage were obtained from thirty-one speakers with spasmodic dysphonia before and after a BOTOX injection of the vocal folds. Six temporal acoustic measures were obtained across 15 words excerpted from each reading sample, including both frequency of occurrence and percent time for (1) aperiodic phonation, (2) phonation breaks, and (3) fundamental frequency shifts. Visual analog scaling judgments were also obtained from six voice experts using an interactive computer interface to quantify four voice attributes (i.e., overall quality, roughness, brokenness, breathiness) in a carefully psychoacoustically controlled environment, using the same reading passages as stimuli. Number and percent aperiodicity and phonation breaks correlated significantly with perceived overall voice quality, roughness, and brokenness before and after the BOTOX injection. Breathiness was correlated with aperiodicity only prior to injection, while roughness also correlated with frequency shifts following injection. Factor analysis reduced perceived attributes to two principal components: glottal squeezing and breathiness. The acoustic measures demonstrated a strong regression relationship with perceived glottal squeezing, but no regression relationship with breathiness was observed. Implications for an analysis of pathologic voices will be discussed.
Aeroacoustic analysis of the human phonation process based on a hybrid acoustic PIV approach

NASA Astrophysics Data System (ADS)

Lodermeyer, Alexander; Tautz, Matthias; Becker, Stefan; Döllinger, Michael; Birk, Veronika; Kniesburges, Stefan

2018-01-01

The detailed analysis of sound generation in human phonation is severely limited as the accessibility to the laryngeal flow region is highly restricted. Consequently, the physical basis of the underlying fluid-structure-acoustic interaction that describes the primary mechanism of sound production is not yet fully understood. Therefore, we propose the implementation of a hybrid acoustic PIV procedure to evaluate aeroacoustic sound generation during voice production within a synthetic larynx model. Focusing on the flow field downstream of synthetic, aerodynamically driven vocal folds, we calculated acoustic source terms based on the velocity fields obtained by time-resolved high-speed PIV applied to the mid-coronal plane. The radiation of these sources into the acoustic far field was numerically simulated and the resulting acoustic pressure was finally compared with experimental microphone measurements. We identified the tonal sound to be generated downstream in a small region close to the vocal folds. The simulation of the sound propagation underestimated the tonal components, whereas the broadband sound was well reproduced. Our results demonstrate the feasibility to locate aeroacoustic sound sources inside a synthetic larynx using a hybrid acoustic PIV approach. Although the technique employs a 2D-limited flow field, it accurately reproduces the basic characteristics of the aeroacoustic field in our larynx model. In future studies, not only the aeroacoustic mechanisms of normal phonation will be assessable, but also the sound generation of voice disorders can be investigated more profoundly.
Emirati Teachers' Perceptions of Voice Handicap.

PubMed

Natour, Yaser S; Sartawi, Abdealaziz M; Al Muhairy, Ousha; Efthymiou, Effie; Marie, Basem S

2016-05-01

The purpose of the study was to explore Emirati teachers' perceptions of voice handicap and to analyze their acoustic characteristics to determine whether acoustic measures of teachers' voice would verify their perceptions of voice handicap. Sixty-six Emirati school teachers (33 men and 33 women), with different years of teaching experience and age, and 100 control participants (50 men and 50 women) underwent vocal assessment that included the Voice Handicap Index (VHI-Arab) and acoustic measures (F0, jitter%, shimmer%, signal to noise ratio [SNR]). Significant differences were found between the teachers' group scores and the control group scores on the following subscales of VHI-Arab: physical (P = 0.006), emotional (P = 0.004), and total score of the test (P = 0.002). No significant differences were found among teachers in the three VHI subscales and the total score with regard to gender (functional P = 0.307; physical P = 0.341; emotional P = 0.126; and total P = 0.184), age (functional P = 0.972; physical P = 0.525; emotional P = 0.772; and total P = 0.848), and years of teaching experience (functional P = 0.319; physical P = 0.619; emotional P = 0.926; and total P = 0.638). The significant differences between the teachers' group and the control group in three acoustic measures, F0 (P = 0.000), shimmer% (P = 0.000), and SNR (P = 0.000), were further investigated. Significant differences were found between female and male teachers in F0 (P = 0.00) and SNR (P = 0.007). As for teachers' age, significant differences were found in SNR (P = 0.028). Teachers' years of experience did not show significant differences in any of the acoustic measures. Teachers have a higher perception of voice handicap. However, they were able to produce better voice quality than control participants were, as expressed in better SNRs. This might have been caused either by manipulation of vocal properties or abusive overloading of the vocal system to produce a
Experiences of hearing voices: analysis of a novel phenomenological survey

PubMed Central

Woods, Angela; Jones, Nev; Alderson-Day, Ben; Callard, Felicity; Fernyhough, Charles

2015-01-01

Summary Background Auditory hallucinations—or voices—are a common feature of many psychiatric disorders and are also experienced by individuals with no psychiatric history. Understanding of the variation in subjective experiences of hallucination is central to psychiatry, yet systematic empirical research on the phenomenology of auditory hallucinations remains scarce. We aimed to record a detailed and diverse collection of experiences, in the words of the people who hear voices themselves. Methods We made a 13 item questionnaire available online for 3 months. To elicit phenomenologically rich data, we designed a combination of open-ended and closed-ended questions, which drew on service-user perspectives and approaches from phenomenological psychiatry, psychology, and medical humanities. We invited people aged 16–84 years with experience of voice-hearing to take part via an advertisement circulated through clinical networks, hearing voices groups, and other mental health forums. We combined qualitative and quantitative methods, and used inductive thematic analysis to code the data and χ2 tests to test additional associations of selected codes. Findings Between Sept 9 and Nov 29, 2013, 153 participants completed the study. Most participants described hearing multiple voices (124 [81%] of 153 individuals) with characterful qualities (106 [69%] individuals). Less than half of the participants reported hearing literally auditory voices—70 (46%) individuals reported either thought-like or mixed experiences. 101 (66%) participants reported bodily sensations while they heard voices, and these sensations were significantly associated with experiences of abusive or violent voices (p=0·024). Although fear, anxiety, depression, and stress were often associated with voices, 48 (31%) participants reported positive emotions and 49 (32%) reported neutral emotions. Our statistical analysis showed that mixed voices were more likely to have changed over time (p=0·030), be
Combined Functional Voice Therapy in Singers With Muscle Tension Dysphonia in Singing.

PubMed

Sielska-Badurek, Ewelina; Osuch-Wójcikiewicz, Ewa; Sobol, Maria; Kazanecka, Ewa; Rzepakowska, Anna; Niemczyk, Kazimierz

2017-07-01

The purpose of this study was to evaluate vocal tract function and the voice quality in singers with muscle tension dysphonia (MTD) after undergoing combined functional voice therapy of the singing voice. This is a prospective, randomized study. Forty singers (29 females and 11 males, mean age: 24.6 ± 8.8 years) with MTD were enrolled in the study. The study group consisted of 20 singers who underwent combined functional voice therapy (10-15 individual sessions, 30-40 minutes each). Singers who did not opt for vocal rehabilitation constituted the control group. Effects of rehabilitation were assessed with videolaryngostroboscopy, palpation of the vocal tract structures, flexible fiberoptic evaluation of the pharynx and the larynx, perceptual speaking and singing voice assessment, acoustic analysis, maximal phonation time, and the Voice Handicap Index. After combined functional voice therapy in the study group, great improvement was noticed in palpation of the vocal tract structures (P < 0.001), perceptual voice assessment (P < 0.001), phonetograms (P = 0.002), and singing range obtained from acoustic analysis of glissando (P < 0.001). In the control group, no statistically significant differences were found between the first and the second assessments. Combined functional voice therapy proved to be an efficacious treatment method in singers with MTD in singing. Development of palpation and perceptual singing voice examination protocols enables one to compare results before and after rehabilitation in clinics. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Dependencies and Ill-designed Parameters Within High-speed Videoendoscopy and Acoustic Signal Analysis.

PubMed

Schlegel, Patrick; Stingl, Michael; Kunduk, Melda; Kniesburges, Stefan; Bohr, Christopher; Döllinger, Michael

2018-05-31

The phonatory process is often judged during sustained phonation by analyzing the acoustic voice signal and the vocal fold vibrations. Many formulas and parameters have been suggested for qualifying the characteristics of the acoustic signal and the vocal fold vibrations during sustained phonation. These parameters are directly computed from the acoustic signal and the endoscopic glottal area waveform (GAW). The GAW is calculated from laryngeal high-speed videoendoscopy (HSV) recordings and describes the increase and decrease of the glottal area during the phonation process, that is, the opening and closing of the two oscillating vocal folds over time. However, some of the parameters have strong mathematical dependencies with one another and some are ill-defined. The purpose of this study is to identify mathematical dependencies between parameters with the aim of reducing their numbers and suggesting which parameters may best describe the properties of the GAW and the acoustical signal. In this preliminary investigation, 20 frequently used parameters are examined: 10 GAW only and 10 both GAW and acoustic parameters. In total 13 parameters can be neglected because of mathematical dependencies. In addition, nine of these parameters show problematic features that range from unexpected behavior to ill definition. Reducing the number of parameters appears to be necessary to standardize vocal fold function analysis. This may lead to better comparability of research results from different studies. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
[Acoustic characteristics of adductor spasmodic dysphonia].

PubMed

Yang, Yang; Wang, Li-Ping

2008-06-01

To explore the acoustic characteristics of adductor spasmodic dysphonia. The acoustic characteristics, including the acoustic signal of the recorded voice, three-dimensional sonogram patterns and subjective assessment of voice, were compared between 10 patients (7 women, 3 men) with adductor spasmodic dysphonia and 10 healthy volunteers (5 women, 5 men). The main clinical manifestations of adductor spasmodic dysphonia included disorders of sound quality, rhyme and fluency. Patients demonstrated tension dysphonia when reading, acoustic jitter, momentary fluctuation of frequency and volume, voice squeezing, interruption, voice prolongation, and loss of normal chime. Among the 10 patients, 1 had mild dysphonia (abnormal syllable number < 25%), 6 had moderate dysphonia (abnormal syllable number 25%-49%), 1 had severe dysphonia (abnormal syllable number 50%-74%) and 2 had extremely severe dysphonia (abnormal syllable number ≥ 75%). The average reading time in the 10 patients was 49 s, with reading time extension and aphasia area interruption in acoustic signals, whereas the average reading time in the healthy control group was 30 s, without voice interruption. The aphasia ratio averaged 42%. The symptom syllables of the different patients were demonstrated in the three-dimensional sonogram. There were voice onset time prolongation and irregular, interrupted and even absent vowel formants. The consonants of symptom syllables occasionally displayed absence or prolongation of the friction murmur in the block-friction murmur. The acoustic characteristics of adductor spasmodic dysphonia are disorders of sound quality, rhyme and fluency. The three-dimensional sonograms of the symptom syllables show distinctive changes of proportional vowels or consonant phonemes.
Human voice quality measurement in noisy environments.

PubMed

Ueng, Shyh-Kuang; Luo, Cheng-Ming; Tsai, Tsung-Yu; Yeh, Hsuan-Chen

2015-01-01

Computerized acoustic voice measurement is essential for the diagnosis of vocal pathologies. Previous studies showed that ambient noises have significant influences on the accuracy of voice quality assessment. This paper presents a voice quality assessment system that can accurately measure qualities of voice signals, even though the input voice data are contaminated by low-frequency noises. The ambient noises in our living rooms and laboratories are collected and the frequencies of these noises are analyzed. Based on the analysis, a filter is designed to reduce noise level of the input voice signal. Then, improved numerical algorithms are employed to extract voice parameters from the voice signal to reveal the health of the voice signal. Compared with MDVP and Praat, the proposed method outperforms these two widely used programs in measuring fundamental frequency and harmonic-to-noise ratio, and its performance is comparable to these two famous programs in computing jitter and shimmer. The proposed voice quality assessment method is resistant to low-frequency noises and it can measure human voice quality in environments filled with noises from air-conditioners, ceiling fans and cooling fans of computers.
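The abstract above compares programs on jitter and shimmer without defining them. A minimal sketch of the usual "local" definitions follows: the mean absolute difference between consecutive cycles divided by the mean value, as a percentage, applied to cycle periods for jitter and to cycle peak amplitudes for shimmer. This is not the paper's improved algorithm; the function name and sample values are hypothetical, and cycle extraction from the waveform is assumed to have been done already.

```python
def local_perturbation_percent(cycle_values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent.

    Pass glottal cycle periods (seconds) to get local jitter, or per-cycle
    peak amplitudes to get local shimmer.
    """
    if len(cycle_values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return 100.0 * mean_diff / mean_value


# Hypothetical cycle data for a sustained vowel near 200 Hz (illustration only).
periods_s = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [1.00, 0.98, 1.02, 1.00]

jitter_pct = local_perturbation_percent(periods_s)
shimmer_pct = local_perturbation_percent(amplitudes)
print(f"jitter = {jitter_pct:.2f}%, shimmer = {shimmer_pct:.2f}%")
```

Because both measures depend on accurate cycle boundaries, low-frequency ambient noise that distorts those boundaries would inflate them, which is consistent with the abstract's motivation for filtering before parameter extraction.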
Effects of Radioactive Iodine Ablation Therapy on Voice Quality.

PubMed

Aydoğdu, İmran; Atar, Yavuz; Saltürk, Ziya; Sarı, Hüseyin; Ataç, Enes; Aydoğdu, Zeynep; İnan, Muzaffer; Mersinlioğlu, Gökhan; Uyar, Yavuz

2017-01-01

The goal of this study was to evaluate the effects of radioactive iodine ablation therapy on voice quality of patients diagnosed with well-differentiated thyroid carcinoma. We enrolled 36 patients who underwent total or subtotal thyroidectomy due to well-differentiated thyroid carcinoma. Voice recordings from patients were analyzed for acoustic and aerodynamic voice. The Voice Handicap Index-10 was used for subjective analysis. The control group consisted of 36 healthy participants. Results taken before and after therapy were compared statistically. There were no differences in the results taken before and after therapy for the radioactive iodine ablation group. The Voice Handicap Index-10 results did not differ between groups before and after therapy. Radioactive iodine ablation therapy has no effect on voice quality objectively or subjectively. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Voice stress analysis and evaluation

NASA Astrophysics Data System (ADS)

Haddad, Darren M.; Ratley, Roy J.

2001-02-01

Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement application is tremendous.
Voice characteristics in the progression of Parkinson's disease.

PubMed

Holmes, R J; Oates, J M; Phyland, D J; Hughes, A J

2000-01-01

This study examined the acoustic and perceptual voice characteristics of patients with Parkinson's disease according to disease severity. The perceptual and acoustic voice characteristics of 30 patients with early stage PD and 30 patients with later stage PD were compared with data from 30 normal control subjects. Voice recordings consisted of prolongation of the vowel /a/, scale singing, and a 1-min monologue. In comparison with controls and previously published normative data, both early and later stage PD patients' voices were characterized perceptually by limited pitch and loudness variability, breathiness, harshness and reduced loudness. High modal pitch levels also characterized the voices of males in both early and later stages of PD. Acoustically, the voices of both groups of PD patients demonstrated lower mean intensity levels and reduced maximum phonational frequency ranges in comparison with normative data. Although less clear, the present data also suggested that the PD patients' voices were characterized by excess jitter, a high-speaking fundamental frequency for males and a reduced fundamental frequency variability for females. While several of these voice features did not appear to deteriorate with disease progression (i.e. harshness, high modal pitch and speaking fundamental frequency in males, fundamental frequency variability in females, low intensity and jitter), breathiness, monopitch and monoloudness, low loudness and reduced maximum phonational frequency range were all worse in the later stages of PD. Tremor was the sole voice feature which was associated only with later stage PD.
Cepstral analysis of normal and pathological voice in Spanish adults. Smoothed cepstral peak prominence in sustained vowels versus connected speech.

PubMed

Delgado-Hernández, Jonathan; León-Gómez, Nieves M; Izquierdo-Arteaga, Laura M; Llanos-Fumero, Yanira

In recent years, the use of cepstral measures for acoustic evaluation of voice has increased. One of the most investigated parameters is smoothed cepstral peak prominence (CPPs). The objectives of this paper are to establish the usefulness of this acoustic measure in the objective evaluation of alterations of the voice in Spanish and to determine what type of voice sample (sustained vowels or connected speech) is the most sensitive in evaluating the severity of dysphonia. Forty subjects participated in this study, 20 controls and 20 with dysphonia. Two voice samples were recorded for each subject (one sustained vowel /a/ and four phonetically balanced sentences) and the CPPs was calculated using the Praat programme. Three raters perceptually evaluated the voice sample with the Grade parameter of the GRBAS scale. Significantly lower values were found in the dysphonic voices, both for /a/ (t[38] = 4.85, P < .000) and for phrases (t[38] = 5.75, P < .000). In relation to the type of voice sample most suitable for evaluating the severity of voice alterations, a strong correlation was found with the acoustic-perceptual scale of CPPs calculated from connected speech (rs = -0.73) and moderate correlation with that calculated from the sustained vowel (rs = -0.56). The results of this preliminary study suggest that CPPs is a good measure to detect dysphonia and to objectively assess the severity of alterations in the voice. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Otorrinolaringología y Cirugía de Cabeza y Cuello. All rights reserved.

Multidimensional assessment of vocal changes in benign vocal fold lesions after voice therapy.

PubMed

Schindler, Antonio; Mozzanica, Francesco; Maruzzi, Patrizia; Atac, Murat; De Cristofaro, Valeria; Ottaviani, Francesco

2013-06-01

To evaluate through a multidimensional protocol voice changes after voice therapy in patients with benign vocal fold lesions. 65 consecutive patients affected by benign vocal fold lesions were enrolled. Depending on videolaryngostroboscopy the patients were divided into 3 groups: 23 patients with Reinke's oedema, 22 patients with vocal fold cysts and 20 patients with gelatinous polyp. Each subject received 10 voice therapy sessions and was evaluated, before and after voice therapy, through a multidimensional protocol including videolaryngostroboscopy, perception, acoustics, aerodynamics and self-rating by the patient. Data were compared using Wilcoxon signed-rank test. Kruskal-Wallis test was used to analyse the mean variation difference between the three groups of patients. Mann-Whitney test was used for post hoc analysis. Videolaryngostroboscopy revealed an improvement of the initial pathology in only 11 cases. However, a significant improvement was observed in perceptual, acoustic and self-assessment ratings in the 3 groups of patients. In particular, the parameters G, R and A of the GIRBAS scale, and the noise to harmonic ratio, jitter and shimmer scores improved after rehabilitation. A significant improvement of all the parameters of Voice Handicap Index after rehabilitation treatment was found. No significant difference among the three groups of patients was visible, except for self-assessment ratings. Voice therapy may provide a significant improvement in perceptual, acoustic and self-assessed voice quality in patients with benign glottal lesions. Utilization of voice therapy may allow some patients to avoid surgical intervention. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Effects of noise and acoustics in schools on vocal health in teachers.

PubMed

Cutiva, Lady Catherine Cantor; Burdorf, Alex

2015-01-01

Previous studies on the influence of noise and acoustics in the classroom on voice symptoms among teachers have exclusively relied on self-reports. Since self-reported physical conditions may be biased, it is important to determine the role of objective measurements of noise and acoustics in the presence of voice symptoms. To assess the association between objectively measured and self-reported physical conditions at school with the presence of voice symptoms among teachers. In 12 public schools in Bogotá, we conducted a cross-sectional study among 682 Colombian school workers at 377 workplaces. After signing the informed consent, participants filled out a questionnaire on individual and work-related conditions and the nature and severity of voice symptoms in the past month. Short-term environmental measurements of sound levels, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Logistic regression analysis was used to determine associations between work-related factors and voice symptoms. High noise levels outside schools (odds ratio [OR] = 1.83; 95% confidence interval [CI]: 1.12-2.99) and self-reported poor acoustics at the workplace (OR = 2.44; 95% CI: 1.88-3.53) were associated with voice symptoms. We found poor agreement between the objective measurements and self-reports of physical conditions at the workplace. This study indicates that noise and acoustics may play a role in the occurrence of voice symptoms among teachers. The poor agreement between objective measurements and self-reports of physical conditions indicates that these are different entities, which argues for inclusion of physical measurements of the working environment in studies on the influence of noise and acoustics on vocal health.
Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson's disease.

PubMed

Rusz, J; Cmejla, R; Ruzickova, H; Ruzicka, E

2011-01-01

An assessment of vocal impairment is presented for separating healthy people from persons with early untreated Parkinson's disease (PD). This study's main purpose was to (a) determine whether voice and speech disorder are present from early stages of PD before starting dopaminergic pharmacotherapy, (b) ascertain the specific characteristics of the PD-related vocal impairment, (c) identify PD-related acoustic signatures for the major part of traditional clinically used measurement methods with respect to their automatic assessment, and (d) design new automatic measurement methods of articulation. The varied speech data were collected from 46 Czech native speakers, 23 with PD. Subsequently, 19 representative measurements were pre-selected, and Wald sequential analysis was then applied to assess the efficiency of each measure and the extent of vocal impairment of each subject. It was found that measurement of the fundamental frequency variations applied to two selected tasks was the best method for separating healthy from PD subjects. On the basis of objective acoustic measures, statistical decision-making theory, and validation from practicing speech therapists, it has been demonstrated that 78% of early untreated PD subjects indicate some form of vocal impairment. The speech defects thus uncovered differ individually in various characteristics including phonation, articulation, and prosody.
Cue-specific effects of categorization training on the relative weighting of acoustic cues to consonant voicing in English

PubMed Central

Francis, Alexander L.; Kaganovich, Natalya; Driscoll-Huber, Courtney

2008-01-01

In English, voiced and voiceless syllable-initial stop consonants differ in both fundamental frequency at the onset of voicing (onset F0) and voice onset time (VOT). Although both correlates, alone, can cue the voicing contrast, listeners weight VOT more heavily when both are available. Such differential weighting may arise from differences in the perceptual distance between voicing categories along the VOT versus onset F0 dimensions, or it may arise from a bias to pay more attention to VOT than to onset F0. The present experiment examines listeners’ use of these two cues when classifying stimuli in which perceptual distance was artificially equated along the two dimensions. Listeners were also trained to categorize stimuli based on one cue at the expense of another. Equating perceptual distance eliminated the expected bias toward VOT before training, but successfully learning to base decisions more on VOT and less on onset F0 was easier than vice versa. Perceptual distance along both dimensions increased for both groups after training, but only VOT-trained listeners showed a decrease in Garner interference. Results lend qualified support to an attentional model of phonetic learning in which learning involves strategic redeployment of selective attention across integral acoustic cues. PMID:18681610
Major depressive disorder discrimination using vocal acoustic features.

PubMed

Taguchi, Takaya; Tachikawa, Hirokazu; Nemoto, Kiyotaka; Suzuki, Masayuki; Nagano, Toru; Tachibana, Ryuki; Nishimura, Masafumi; Arai, Tetsuaki

2018-01-01

The voice carries various information produced by vibrations of the vocal cords and the vocal tract. Though many studies have reported a relationship between vocal acoustic features and depression, including mel-frequency cepstrum coefficients (MFCCs), which are applied in speech recognition, there have been few studies in which acoustic features allowed discrimination of patients with depressive disorder. Vocal acoustic features as a biomarker of depression could enable differential diagnosis of patients in a depressive state. In order to achieve differential diagnosis of depression, in this preliminary study, we examined whether vocal acoustic features could allow discrimination between depressive patients and healthy controls. Subjects were 36 patients who met the criteria for major depressive disorder and 36 healthy controls with no current or past psychiatric disorders. Voices were recorded while subjects read out digits before and after a verbal fluency task. Voices were analyzed using OpenSMILE. The extracted acoustic features, including MFCCs, were used for group comparison and discriminant analysis between patients and controls. The second dimension of MFCC (MFCC 2) was significantly different between groups and allowed the discrimination between patients and controls with a sensitivity of 77.8% and a specificity of 86.1%. The difference in MFCC 2 between the two groups reflected an energy difference of frequency around 2000-3000 Hz. The MFCC 2 was significantly different between depressive patients and controls. This feature could be a useful biomarker to detect major depressive disorder. Sample size was relatively small. Psychotropics could have a confounding effect on voice. Copyright © 2017 Elsevier B.V. All rights reserved.
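To make the reported sensitivity and specificity concrete, the sketch below computes both from a 2x2 confusion table. The abstract gives only the percentages and the group sizes (36 patients, 36 controls); the cell counts used here (28 patients and 31 controls correctly classified) are back-calculated assumptions that happen to reproduce 77.8% and 86.1%, not figures reported by the authors. The function name is hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions from confusion counts."""
    sensitivity = true_pos / (true_pos + false_neg)   # patients correctly flagged
    specificity = true_neg / (true_neg + false_pos)   # controls correctly cleared
    return sensitivity, specificity


# Assumed counts for 36 patients and 36 controls (not reported in the abstract).
sens, spec = sensitivity_specificity(true_pos=28, false_neg=8,
                                     true_neg=31, false_pos=5)
print(f"sensitivity = {100 * sens:.1f}%, specificity = {100 * spec:.1f}%")
```

With groups this small, a single reclassified subject shifts either figure by almost three percentage points, which fits the abstract's own caution about sample size.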
The relation of vocal fold lesions and voice quality to voice handicap and psychosomatic well-being.

PubMed

Smits, R; Marres, H; de Jong, Felix

2012-07-01

Voice disorders have a multifactorial genesis and may be present in various ways. They can cause a significant communication handicap and impaired quality of life. To assess the effect of vocal fold lesions and voice quality on voice handicap and psychosomatic well-being. Female patients, aged 18-65 years, who were referred to the outpatient clinic with voice problems were subsequently assessed. Laryngostroboscopic examination and acoustic voice analysis were carried out, and the patients were asked to fill in the Voice Handicap Index (VHI) and Symptom Check List-90 questionnaires. Eighty-two patients were included. In 43 patients (52.4%), a vocal fold lesion was observed. The VHI and psychosomatic well-being did not differ significantly between patients with and without a vocal fold lesion. The patients with a vocal fold lesion showed lower scores on the Dysphonia Severity Index (DSI) compared with those without a vocal fold lesion. However, the DSI was not correlated with voice handicap and psychosomatic well-being, except for the VHI physical subscale. Objective measurement does not necessarily correlate with the subjective appraisal of the patient's voice handicap and psychosomatic well-being. Furthermore, the criterion of the presence of a vocal fold lesion as the base of indemnity that is applied by health insurance institutions should be questioned. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Perceptual evaluation and acoustic analysis of pneumatic artificial larynx.

PubMed

Xu, Jie Jie; Chen, Xi; Lu, Mei Ping; Qiao, Ming Zhe

2009-12-01

To investigate the perceptual and acoustic characteristics of the pneumatic artificial larynx (PAL) and evaluate its speech ability and clinical value. Prospective study. The study was conducted in the Voice Lab, Department of Otorhinolaryngology, The First Affiliated Hospital of Nanjing Medical University. Forty-six laryngectomy patients using the PAL were rated for intelligibility and fluency of speech. The voice signals of sustained vowel /a/ for 40 healthy controls and 42 successful patients using the PAL were measured by a computer system. The acoustic parameters and sound spectrographs were analyzed and compared between the two groups. Forty-two of 46 patients using the PAL (91.3%) acquired successful speech capability. The intelligibility scores of 42 successful PAL speakers ranged from 71 to 95 percent, and the intelligibility range of four unsuccessful speakers was 30 to 50 percent. The fluency was judged as good or excellent in 42 successful patients, and poor or fair in four unsuccessful patients. There was no significant difference in average fundamental frequency, maximum intensity, jitter, shimmer, and normalized noise energy (NNE) between 42 successful PAL speakers and 40 healthy controls, while the maximum phonation time (MPT) of PAL speakers was slightly lower than that of the controls. The sound spectrographs of the patients using the PAL approximated those of the healthy controls. The PAL has the advantage of a high percentage of successful vocal rehabilitation. PAL speech is fluent and intelligible. The acoustic characteristics of the PAL are similar to those of a normal voice.
[Acoustic analysis and characteristics of vocal range in Beijing Opera actors].

PubMed

Qu, C; Liu, Y

2000-02-01

To obtain objective acoustic parameters of the voice of Beijing Opera actors and lay a foundation for the training and protection of this special professional voice. Seventy-three (age 16-57 years) professional actors and students were asked to produce sustained comfortable vowels /a/ and /i/, and to sing two songs, one in the Xipi category and one in the Erhuang category. Dr. Speech for Windows version 3.0 was used to get the acoustic parameters of the vowels and the songs. F0 of the vowels /a/ and /i/ of different Hangdangs were Chou (272.6 +/- 42.0) Hz (mean +/- s), (304.2 +/- 22.1) Hz; Xiaosheng (499.3 +/- 34.0) Hz, (485.4 +/- 18.7) Hz; Laosheng (335.6 +/- 60.0) Hz, (317.9 +/- 45.1) Hz; Hualian (319.0 +/- 61.3) Hz, (340.1 +/- 68.8) Hz; Laodan (427.6 +/- 47.2) Hz, (437.7 +/- 45.8) Hz; Huadan (535.8 +/- 48.8) Hz, (561.6 +/- 29.2) Hz; Qingyi (548.0 +/- 69.5) Hz, (543.5 +/- 79.3) Hz; these and other acoustic parameters of vowels such as Jitter, Shimmer and NNE were all within the normal range given by the software. The vocal range of Beijing Opera actors was from 1.7 to 2.8 oct, and most of the highest and the lowest pitches were higher than those of a tenor or soprano. These findings may help to provide insight regarding the acoustic characteristics of the voice of Beijing Opera actors.
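For readers unfamiliar with vocal range expressed in octaves, the sketch below shows the usual conversion: the base-2 logarithm of the ratio between the highest and lowest fundamental frequencies. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
import math


def range_in_octaves(f_low_hz, f_high_hz):
    """Vocal range in octaves between the lowest and highest F0 (in Hz)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(f_high_hz / f_low_hz)


# Hypothetical singer whose lowest and highest pitches are 220 Hz and 880 Hz.
print(range_in_octaves(220.0, 880.0))
```

Each doubling of frequency adds one octave, so a range of 2.8 oct corresponds to a highest pitch roughly seven times the lowest.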
Real time analysis of voiced sounds

NASA Technical Reports Server (NTRS)

Hong, J. P. (Inventor)

1976-01-01

A power spectrum analysis of the harmonic content of a voiced sound signal is conducted in real time by phase-lock-loop tracking of the fundamental frequency (f0) of the signal and successive harmonics (h1 through hn) of the fundamental frequency. The analysis also includes measuring the quadrature power and phase of each frequency tracked, differentiating the power measurements of the harmonics in adjacent pairs, and analyzing successive differentials to determine peak power points in the power spectrum for display or use in analysis of voiced sound, such as for voice recognition.
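The idea of measuring power and phase at the fundamental and each harmonic can be sketched offline in Python. This is not the phase-lock-loop tracker described in the record: it assumes the fundamental frequency is already known and simply correlates the signal with in-phase (cosine) and quadrature (sine) references at each harmonic. All names and the synthetic test signal are illustrative assumptions.

```python
import math


def harmonic_power_and_phase(signal, sample_rate, f0, n_harmonics):
    """Return [(frequency, power, phase), ...] for f0 and its harmonics.

    Assumes f0 is known in advance; a real-time system would track it.
    """
    n = len(signal)
    results = []
    for k in range(1, n_harmonics + 1):
        freq = k * f0
        in_phase = 0.0
        quadrature = 0.0
        for i, x in enumerate(signal):
            angle = 2.0 * math.pi * freq * i / sample_rate
            in_phase += x * math.cos(angle)
            quadrature += x * math.sin(angle)
        in_phase *= 2.0 / n
        quadrature *= 2.0 / n
        amplitude = math.hypot(in_phase, quadrature)
        power = amplitude ** 2 / 2.0  # mean power of a sinusoid
        phase = math.atan2(quadrature, in_phase)
        results.append((freq, power, phase))
    return results


# Synthetic "voiced" signal: 100 Hz fundamental plus a weaker 200 Hz harmonic.
fs = 8000
sig = [math.sin(2 * math.pi * 100 * i / fs)
       + 0.5 * math.sin(2 * math.pi * 200 * i / fs) for i in range(800)]
for freq, power, phase in harmonic_power_and_phase(sig, fs, 100.0, 3):
    print(f"{freq:6.1f} Hz  power={power:.4f}")
```

Differences between the power values of adjacent harmonics could then be inspected to locate spectral peaks, in the spirit of the differentiation step the record describes.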
Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors.

PubMed

Godino-Llorente, J I; Gómez-Vilda, P

2004-02-01

It is well known that vocal and voice diseases do not necessarily cause perceptible changes in the acoustic voice signal. Acoustic analysis is a useful tool for diagnosing voice diseases, complementing other methods based on direct observation of the vocal folds by laryngoscopy. In the present paper, two neural-network-based classification approaches applied to the automatic detection of voice disorders are studied. The structures studied are the multilayer perceptron and learning vector quantization, fed with short-term vectors calculated according to the well-known mel-frequency cepstral coefficient (MFCC) parameterization. The paper shows that these architectures allow the detection of voice disorders, including glottic cancer, under highly reliable conditions. Within this context, the learning vector quantization methodology proved more reliable than the multilayer perceptron architecture, yielding 96% frame accuracy under similar working conditions.
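To make the learning vector quantization idea concrete, here is a minimal sketch of the basic LVQ1 update rule in plain Python. It is an illustration under stated assumptions, not the authors' detector: the two-dimensional toy vectors stand in for short-term MFCC frames, and the labels, learning rate, and initial prototypes are all hypothetical.

```python
import math


def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda j: math.dist(x, prototypes[j][0]))


def train_lvq1(samples, labels, prototypes, learning_rate=0.1, epochs=20):
    """Train labelled prototypes [(vector, label), ...] with the LVQ1 rule."""
    protos = [(list(vec), lab) for vec, lab in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            vec, lab = protos[nearest_prototype(x, protos)]
            # Attract the winner if its label matches, repel it otherwise.
            sign = 1.0 if lab == y else -1.0
            for d in range(len(vec)):
                vec[d] += sign * learning_rate * (x[d] - vec[d])
    return protos


def classify(x, prototypes):
    """Label of the nearest prototype."""
    return prototypes[nearest_prototype(x, prototypes)][1]


# Toy 2-D "frames" (hypothetical stand-ins for MFCC vectors).
samples = [(0.0, 0.2), (0.3, -0.1), (-0.2, 0.1),
           (3.0, 2.8), (2.7, 3.1), (3.2, 3.0)]
labels = ["normal"] * 3 + ["disordered"] * 3
initial = [((1.0, 1.0), "normal"), ((2.0, 2.0), "disordered")]
model = train_lvq1(samples, labels, initial)
print(classify((0.1, 0.0), model), classify((2.9, 3.0), model))
```

Frame-level accuracy, as quoted in the abstract, would be the fraction of individual frames classified correctly, as opposed to the fraction of whole recordings.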
Effects of Voice Therapy on Laryngeal Motor Units During Phonation in Chronic Superior Laryngeal Nerve Paresis Dysphonia.

PubMed

Kaneko, Mami; Hitomi, Takefumi; Takekawa, Takashi; Tsuji, Takuya; Kishimoto, Yo; Hirano, Shigeru

2017-09-26

Injury to the superior laryngeal nerve can result in dysphonia, and in particular, loss of vocal range. It can be an especially difficult problem to address with either voice therapy or surgical intervention. Some clinicians and scientists suggest that combining vocal exercises with adjunctive neuromuscular electrical stimulation may enhance the positive effects of voice therapy for superior laryngeal nerve paresis (SLNP). However, the effects of voice therapy without neuromuscular electrical stimulation are unknown. The purpose of this retrospective study was to demonstrate the clinical effectiveness of voice therapy for rehabilitating chronic SLNP dysphonia in two subjects, using interspike interval (ISI) variability of laryngeal motor units by laryngeal electromyography (LEMG). Both patients underwent LEMG and were diagnosed with having 70% recruitment of the cricothyroid muscle, and 70% recruitment of the cricothyroid and thyroarytenoid muscles, respectively. Both patients received voice therapy for 3 months. Grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, stroboscopic examination, aerodynamic assessment, acoustic analysis, and Voice Handicap Index-10 were performed before and after voice therapy. Mean ISI variability during steady phonation was also assessed. After voice therapy, both patients showed improvement in vocal assessments by acoustic, aerodynamic, GRBAS, and Voice Handicap Index-10 analysis. LEMG indicated shortened ISIs in both cases. This study suggests that voice therapy for chronic SLNP dysphonia can be useful for improving SLNP and voice quality. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Perceptual structure of adductor spasmodic dysphonia and its acoustic correlates.

PubMed

Cannito, Michael P; Doiuchi, Maki; Murry, Thomas; Woodson, Gayle E

2012-11-01

To examine the perceptual structure of voice attributes in adductor spasmodic dysphonia (ADSD) before and after botulinum toxin treatment and identify acoustic correlates of underlying perceptual factors. Reliability of perceptual judgments is considered in detail. Pre- and posttreatment trial with comparison to healthy controls, using single-blind randomized listener judgments of voice qualities, as well as retrospective comparison with acoustic measurements. Oral readings were recorded from 42 ADSD speakers before and after treatment as well as from their age- and sex-matched controls. Experienced judges listened to speech samples and rated attributes of overall voice quality, breathiness, roughness, and brokenness, using computer-implemented visual analog scaling. Data were adjusted for regression to the mean and submitted to principal components factor analysis. Acoustic waveforms, extracted from the reading samples, were analyzed and measurements correlated with perceptual factor scores. Four reliable perceptual variables of ADSD voice were effectively reduced to two underlying factors that corresponded to hyperadduction, most strongly associated with roughness, and hypoadduction, most strongly associated with breathiness. After treatment, the hyperadduction factor improved, whereas the hypoadduction factor worsened. Statistically significant (P<0.01) correlations were observed between perceived roughness and four acoustic measures, whereas breathiness correlated with aperiodicity and cepstral peak prominence (CPPs). This study supported a two-factor model of ADSD, suggesting perceptual characterization by both hyperadduction and hypoadduction before and after treatment. Responses of the factors to treatment were consistent with previous research. Correlations among perceptual and acoustic variables suggested that multiple acoustic features contributed to the overall impression of roughness. Although CPPs appears to be a partial correlate of perceived
Speech masking and cancelling and voice obscuration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzrichter, John F.

A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.
Effects of noise and acoustics in schools on vocal health in teachers

PubMed Central

Cutiva, Lady Catherine Cantor; Burdorf, Alex

2015-01-01

Previous studies on the influence of noise and acoustics in the classroom on voice symptoms among teachers have exclusively relied on self-reports. Since self-reported physical conditions may be biased, it is important to determine the role of objective measurements of noise and acoustics in the presence of voice symptoms. To assess the association between objectively measured and self-reported physical conditions at school with the presence of voice symptoms among teachers. In 12 public schools in Bogotá, we conducted a cross-sectional study among 682 Colombian school workers at 377 workplaces. After signing the informed consent, participants filled out a questionnaire on individual and work-related conditions and the nature and severity of voice symptoms in the past month. Short-term environmental measurements of sound levels, temperature, humidity, and reverberation time were conducted during visits at the workplaces, such as classrooms and offices. Logistic regression analysis was used to determine associations between work-related factors and voice symptoms. High noise levels outside schools (odds ratio [OR] = 1.83; 95% confidence interval [CI]: 1.12–2.99) and self-reported poor acoustics at the workplace (OR = 2.44; 95% CI: 1.88–3.53) were associated with voice symptoms. We found poor agreement between the objective measurements and self-reports of physical conditions at the workplace. This study indicates that noise and acoustics may play a role in the occurrence of voice symptoms among teachers. The poor agreement between objective measurements and self-reports of physical conditions indicates that these are different entities, which argues for inclusion of physical measurements of the working environment in studies on the influence of noise and acoustics on vocal health. PMID:25599754
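The odds ratios and confidence intervals above come from logistic regression. As a simpler illustration of what such a figure expresses, the sketch below computes an unadjusted odds ratio with a 95% Wald (Woolf) interval from a 2x2 table. The function name and the counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (Woolf method)."""
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: symptoms vs. no symptoms, noisy vs. quiet workplaces.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 indicates an association at the 5% level; an adjusted analysis such as the study's logistic regression would additionally control for other factors.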
Connections between voice ergonomic risk factors in classrooms and teachers' voice production.

PubMed

Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva

2012-01-01

The aim of the study was to investigate if voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL) and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of the risk factors in the classrooms, the higher SPL the teachers used and the more strained the males' voices (increased alpha ratio) were. The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In the working environment with many voice ergonomic risk factors, speakers increase voice loudness and use more strained voice quality (males). A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.
Comparisons of voice onset time for trained male singers and male nonsingers during speaking and singing.

PubMed

McCrea, Christopher R; Morris, Richard J

2005-09-01

This study was designed to examine the temporal acoustic differences between male trained singers and nonsingers during speaking and singing across voiced and voiceless English stop consonants. Recordings were made of 5 trained singers and 5 nonsingers, and acoustically analyzed for voice onset time (VOT). A mixed analysis of variance showed that the male trained singers had significantly longer mean VOT than did the nonsingers during voiceless stop production. Sung productions of voiceless stops had significantly longer mean VOTs than did the spoken productions. No significant differences were observed for the voiced stops, nor were any interactions observed. These results indicated that vocal training and phonatory task have a significant influence on VOT.
Short term effect of hubble-bubble smoking on voice.

PubMed

Hamdan, A-L; Sibai, A; Mahfoud, L; Oubari, D; Ashkar, J; Fuleihan, N

2011-05-01

To investigate the short term effect of hubble-bubble smoking on voice. Prospective study. Eighteen non-dysphonic subjects (seven men and 11 women) with a history of hubble-bubble smoking and no history of cigarette smoking underwent acoustic analysis and laryngeal video-stroboscopic examination before and 30 minutes after hubble-bubble smoking. On laryngeal video-stroboscopy, none of the subjects had vocal fold erythema either before or after smoking. Five patients had mild vocal fold oedema both before and after smoking. After smoking, there was a slight increase in the number of subjects with thick mucus between the vocal folds (six, vs four before smoking) and with vocal fold vessel dilation (two, vs one before smoking). Acoustic analysis indicated a drop in habitual pitch, fundamental frequency and voice turbulence index after smoking, and an increase in noise-to-harmonics ratio. Even 30 minutes of hubble-bubble smoking can cause a drop in vocal pitch and an increase in laryngeal secretions and vocal fold vasodilation.
Voice hearing: a secondary analysis of talk by people who hear voices.

PubMed

Jones, Malcolm; Coffey, Michael

2012-02-01

Unitary explanations of mental illness symptoms appear to be inadequate when faced with everyday experiences of living with these conditions. In particular, the experience of voice hearing is not sufficiently accounted for by biomedical explanations. This paper revisits data collected from a sample of people who hear voices to perform a secondary analysis with the aim of examining the explanatory devices deployed by individuals in their accounts of voice hearing. Secondary analysis is the use of existing data, collected for a previous study, in order to explore a research question distinct from the original inquiry. In this study, we subjected these data to a thematic analysis. People who hear voices make use of standard psychiatric explanations about the experience in their accounts. However, the accounts paint a more complex picture and show that people also impute personal meaning to the experience. This in turn implicates both personal and social identity; that is, how the person is known to themselves and to others. We suggest that this knowledge can inform a more thoughtful engagement with the experiences of voice hearing by mental health nurses. © 2011 The Authors; International Journal of Mental Health Nursing © 2011 Australian College of Mental Health Nurses Inc.
Sex and the singer: Gender categorization aspects of singing voice

NASA Astrophysics Data System (ADS)

Ternström, Sten

2003-04-01

The singing voice exhibits many systematic differences by gender and age. The physiological differences between the voice organs of males, females, and children are well known and give rise to several acoustic differences, including acoustic power, pitch range, and spectral distribution. Vocal artists often strive to widen their range of expression, and it is not uncommon for males to sing in a femalelike register, as in counter tenors and in some pop/rock genres. The opposite, however, is quite rare. While ambiguous or contradictory gender in speech is usually a social disadvantage, in singing it can be a desired effect. The physical differences in singing voice production between males and females are reviewed in detail. Some interesting borderline cases are examined from an acoustic standpoint.
Elephants can determine ethnicity, gender, and age from acoustic cues in human voices

PubMed Central

McComb, Karen; Shannon, Graeme; Sayialel, Katito N.; Moss, Cynthia

2014-01-01

Animals can accrue direct fitness benefits by accurately classifying predatory threat according to the species of predator and the magnitude of risk associated with an encounter. Human predators present a particularly interesting cognitive challenge, as it is typically the case that different human subgroups pose radically different levels of danger to animals living around them. Although a number of prey species have proved able to discriminate between certain human categories on the basis of visual and olfactory cues, vocalizations potentially provide a much richer source of information. We now use controlled playback experiments to investigate whether family groups of free-ranging African elephants (Loxodonta africana) in Amboseli National Park, Kenya can use acoustic characteristics of speech to make functionally relevant distinctions between human subcategories differing not only in ethnicity but also in sex and age. Our results demonstrate that elephants can reliably discriminate between two different ethnic groups that differ in the level of threat they represent, significantly increasing their probability of defensive bunching and investigative smelling following playbacks of Maasai voices. Moreover, these responses were specific to the sex and age of Maasai presented, with the voices of Maasai women and boys, subcategories that would generally pose little threat, significantly less likely to produce these behavioral responses. Considering the long history and often pervasive predatory threat associated with humans across the globe, it is likely that abilities to precisely identify dangerous subcategories of humans on the basis of subtle voice characteristics could have been selected for in other cognitively advanced animal species. PMID:24616492

Vocal parameters and voice-related quality of life in adult women with and without ovarian function.

PubMed

Ferraz, Pablo Rodrigo Rocha; Bertoldo, Simão Veras; Costa, Luanne Gabrielle Morais; Serra, Emmeliny Cristini Nogueira; Silva, Eduardo Magalhães; Brito, Luciane Maria Oliveira; Chein, Maria Bethânia da Costa

2013-05-01

To identify the perceptual and acoustic parameters of voice in adult women with and without ovarian function and its impact on quality of life related to voice. Cross-sectional and analytical study with 106 women divided into two groups: G1, with ovarian function (n=43) and G2, without physiological ovarian function (n=63). The women were instructed to sustain the vowel "a" and the sounds of /s/ and /z/ in habitual pitch and loudness. They were also asked to classify their voices and answer the voice-related quality of life (V-RQOL) questionnaire. The perceptual analysis of the vocal samples was performed by three speech-language pathologists using the GRBASI (G: grade; R: roughness; B: breathiness; A: asthenia; S: strain; I: instability) scale. The acoustic analysis was carried out with the software VoxMetria 2.7h (CTS Informatica). The data were analyzed using descriptive statistics. In the perceptual analysis, both groups showed a mild deviation for the parameters roughness, strain, and instability, but only G2 showed a mild impact for the overall degree of dysphonia. The mean fundamental frequency was significantly lower for G2, with a difference of 17.41 Hz between the two groups. There was no impact on V-RQOL in any of the V-RQOL domains for this group. With the menopause, there is a change in women's voices, impacting on some voice parameters. However, there is no direct impact on their quality of life related to voice. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
The interaction of tone with voicing and foot structure: evidence from Kera phonetics and phonology

NASA Astrophysics Data System (ADS)

Pearce, Mary Dorothy

This thesis uses acoustic measurements as a basis for the phonological analysis of the interaction of tone with voicing and foot structure in Kera (a Chadic language). In both tone spreading and vowel harmony, the iambic foot acts as a domain for spreading. Further evidence for the foot comes from measurements of duration, intensity and vowel quality. Kera is unusual in combining a tone system with a partially independent metrical system based on iambs. In words containing more than one foot, the foot is the tone bearing unit (TBU), but in shorter words, the TBU is the syllable. In perception and production experiments, results show that Kera speakers, unlike English and French speakers, use the fundamental frequency as the principal cue to the 'voicing' contrast. Voice onset time (VOT) has only a minor role. Historically, tones probably developed from voicing through a process of tonogenesis, but synchronically, the feature voice is no longer contrastive and VOT is used in an enhancing role. Some linguists have claimed that Kera is a key example for their controversial theory of long-distance voicing spread. But as voice is not part of Kera phonology, this thesis gives counter-evidence to the voice spreading claim. An important finding from the experiments is that the phonological grammars are different between village women, men moving to town and town men. These differences are attributed to French contact. The interaction between Kera tone and voicing and contact with French have produced changes from a 2-way voicing contrast, through a 3-way tonal contrast, to a 2-way voicing contrast plus another contrast with short VOT. These diachronic and synchronic tone/voicing facts are analysed using laryngeal features and Optimality Theory. This thesis provides a body of new data, detailed acoustic measurements, and an analysis incorporating current theoretical issues in phonology, which make it of interest to Africanists and theoreticians alike.
Event identification by acoustic signature recognition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dress, W.B.; Kercel, S.W.

1995-07-01

Many events of interest to the security community produce acoustic emissions that are, in principle, identifiable as to cause. Some obvious examples are gunshots, breaking glass, takeoffs and landings of small aircraft, vehicular engine noises, footsteps (high frequencies when on gravel, very low frequencies when on soil), and voices (whispers to shouts). We are investigating wavelet-based methods to extract unique features of such events for classification and identification. We also discuss methods of classification and pattern recognition specifically tailored for acoustic signatures obtained by wavelet analysis. The paper is divided into three parts: completed work, work in progress, and future applications. The completed phase has led to the successful recognition of aircraft types on landing and takeoff. Both small aircraft (twin-engine turboprop) and large (commercial airliners) were included in the study. The project considered the design of a small, field-deployable, inexpensive device. The techniques developed during the aircraft identification phase were then adapted to a multispectral electromagnetic interference monitoring device now deployed in a nuclear power plant. This is a general-purpose wavelet analysis engine, spanning 14 octaves, and can be adapted for other specific tasks. Work in progress is focused on applying the methods previously developed to speaker identification. Some of the problems to be overcome include recognition of sounds as voice patterns and as distinct from possible background noises (e.g., music), as well as identification of the speaker from a short-duration voice sample. A generalization of the completed work and the work in progress is a device capable of classifying any number of acoustic events, particularly quasi-stationary events such as engine noises and voices, and singular events such as gunshots and breaking glass. We will show examples of both kinds of events and discuss their recognition likelihood.
Voice measures of workload in the advanced flight deck: Additional studies

NASA Technical Reports Server (NTRS)

Schneider, Sid J.; Alpert, Murray

1989-01-01

These studies investigated acoustical analysis of the voice as a measure of workload in individual operators. In the first study, voice samples were recorded from a single operator during high, medium, and low workload conditions. Mean amplitude, frequency, syllable duration, and emphasis all tended to increase as workload increased. In the second study, NASA test pilots performed a laboratory task, and used a flight simulator under differing work conditions. For two of the pilots, high workload in the simulator brought about greater amplitude, peak duration, and stress. In both the laboratory and simulator tasks, high workload tended to be associated with more statistically significant drop-offs in the acoustical measures than were lower workload levels. There was a great deal of intra-subject variability in the acoustical measures. The results suggested that in individual operators, increased workload might be revealed by high initial amplitude and frequency, followed by rapid drop-offs over time.
Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech.

PubMed

Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc

2016-10-01

The question of what type of utterance (a sustained vowel or continuous speech) is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a 3rd set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.
Identification and human condition analysis based on the human voice analysis

NASA Astrophysics Data System (ADS)

Mieshkov, Oleksandr Yu.; Novikov, Oleksandr O.; Novikov, Vsevolod O.; Fainzilberg, Leonid S.; Kotyra, Andrzej; Smailova, Saule; Kozbekova, Ainur; Imanbek, Baglan

2017-08-01

The paper presents a two-stage biotechnical system for human condition analysis that is based on analysis of human voice signal. At the initial stage, the voice signal is pre-processed and its characteristics in time domain are determined. At the first stage, the developed system is capable of identifying the person in the database on the basis of the extracted characteristics. At the second stage, the model of a human voice is built on the basis of the real voice signals after clustering the whole database.
Voice Quality and Gender Stereotypes: A Study of Lebanese Women With Reinke's Edema.

PubMed

Matar, Nayla; Portes, Cristel; Lancia, Leonardo; Legou, Thierry; Baider, Fabienne

2016-12-01

Women with Reinke's edema (RW) report being mistaken for men during telephone conversations. For this reason, their masculine-sounding voices are interesting for the study of gender stereotypes. The study's objective is to verify their complaint and to understand the cues used in gender identification. Using a self-evaluation study, we verified RW's perception of their own voices. We compared the acoustic parameters of vowels produced by 10 RW to those produced by 10 men and 10 women with healthy voices (hereafter referred to as NW) in Lebanese Arabic. We conducted a perception study for the evaluation of RW, healthy men's, and NW voices by naïve listeners. RW self-evaluated their voices as masculine and their gender identities as feminine. The acoustic parameters that distinguish RW from NW voices concern fundamental frequency, spectral slope, harmonicity of the voicing signal, and complexity of the spectral envelope. Naïve listeners very often rate RW as surely masculine. Listeners may rate RW's gender incorrectly. These incorrect gender ratings are correlated with acoustic measures of fundamental frequency and voice quality. Further investigations will reveal the contribution of each of these parameters to gender perception and guide the treatment plan of patients complaining of a gender ambiguous voice.
Mobile Digital Recording: Adequacy of the iRig and iOS Device for Acoustic and Perceptual Analysis of Normal Voice.

PubMed

Oliveira, Gisele; Fava, Gaetano; Baglione, Melody; Pimpinella, Michael

2017-03-01

To determine whether the iRig and iOS device recording system is comparable with a standard computer recording system for digital voice recording. Thirty-seven vocally healthy adults, between ages 20 and 62, with a mean age of 33.9 years, 13 males and 24 females, were recruited. Recordings were simultaneously digitalized in an iPad and iPhone using a unidirectional condenser microphone for smartphones/tablets (iRig Mic, IK Multimedia) and in a computer laptop (Dell-Inspiron) using a unidirectional condenser microphone (Samson-CL5) connected to a preamplifier with phantom power. Both microphones were lined up at an equal fixed distance from the subject's mouth. Speech tasks consisted of a sustained vowel "ah" at comfortable pitch/loudness, counting from 1 to 10, and a glissando "ah" from a low to a high note. The samples captured on the iOS devices were transferred via SoundCloud in WAV format, and analyzed using the Praat software. The acoustic parameters measured were mean, min, and max F0, SD F0, jitter local, jitter rap, jitter ppq5, jitter ddp, shimmer local, shimmer local-dB, shimmer apq3, shimmer apq5, shimmer apq11, shimmer dda, NHR, and HNR. There were no statistically significant differences for any parameter and speech task analyzed for both iOS devices as compared with the gold standard computer/preamp system (all P values > 0.050). In addition, there were no statistical differences in the perceptual identification of the recordings among devices (P < 0.001). In the present study, the iRig and iOS device may provide reliable digital recording of normal voices. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
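Two of the perturbation measures listed above can be illustrated in a few lines. In the Praat manual, jitter (local) is the average absolute difference between consecutive glottal periods divided by the average period, and shimmer (local) is the same ratio computed on the peak amplitudes of consecutive periods. The sketch below assumes the periods and amplitudes have already been extracted, and it omits the period-range and outlier checks that Praat applies; the sample values are hypothetical.

```python
def _mean_abs_consecutive_diff(values):
    """Mean absolute difference between neighbouring values."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def jitter_local(periods):
    """Jitter (local): cycle-to-cycle period variation relative to mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    return _mean_abs_consecutive_diff(periods) / (sum(periods) / len(periods))


def shimmer_local(amplitudes):
    """Shimmer (local): cycle-to-cycle amplitude variation relative to mean."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two amplitudes")
    return _mean_abs_consecutive_diff(amplitudes) / (
        sum(amplitudes) / len(amplitudes))


# Hypothetical glottal periods (seconds) and peak amplitudes of five cycles.
periods = [0.00500, 0.00504, 0.00498, 0.00502, 0.00499]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
print(f"jitter (local)  = {jitter_local(periods):.2%}")
print(f"shimmer (local) = {shimmer_local(amplitudes):.2%}")
```

Because both measures depend on accurate detection of individual cycles, differences in microphone or recording chain can in principle affect them, which is what a device comparison like the one above is designed to test.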
The Effect of Hydration on the Voice Quality of Future Professional Vocal Performers.

PubMed

van Wyk, Liezl; Cloete, Mariaan; Hattingh, Danel; van der Linde, Jeannie; Geertsema, Salome

2017-01-01

The application of systemic hydration as an instrument for optimal voice quality has been a common practice by several professional voice users over the years. Although the physiological action has been determined, the benefits on acoustic and perceptual characteristics are relatively unknown. The present study aimed to determine whether systemic hydration has beneficial outcomes on the voice quality of future professional voice users. A within-subject, pretest-posttest design was applied to obtain quantitative results from female singing students between 18 and 32 years of age without a history of voice pathology. Acoustic and perceptual data were collected before and after a 2-hour singing rehearsal. The difference between the hypohydrated condition (controlled) and the hydrated condition (experimental) and the relationship between adequate hydration and acoustic and perceptual parameters of voice were then investigated. A statistically significant (P = 0.041) increase in jitter values was obtained for the hypohydrated condition. Increased maximum phonation time (MPT/z/) and higher maximum frequency for hydration indicated further statistically significant changes in voice quality (P = 0.028 and P = 0.015, respectively). Systemic hydration has positive outcomes on perceptual and acoustic parameters of voice quality for future professional singers. The singer's ability to sustain notes for longer and reach higher frequencies may reflect well in performances. Any positive change in voice quality may benefit the singer's occupational success and subsequently their social, emotional, and vocational well-being. More research evidence is needed to determine the parameters for implementing adequate hydration in vocal hygiene programs. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Short-Term Effect of Two Semi-Occluded Vocal Tract Training Programs on the Vocal Quality of Future Occupational Voice Users: "Resonant Voice Training Using Nasal Consonants" Versus "Straw Phonation".

PubMed

Meerschman, Iris; Van Lierde, Kristiane; Peeters, Karen; Meersman, Eline; Claeys, Sofie; D'haeseleer, Evelien

2017-09-18

The purpose of this study was to determine the short-term effect of 2 semi-occluded vocal tract training programs, "resonant voice training using nasal consonants" versus "straw phonation," on the vocal quality of vocally healthy future occupational voice users. A multigroup pretest-posttest randomized control group design was used. Thirty healthy speech-language pathology students with a mean age of 19 years (range: 17-22 years) were randomly assigned into a resonant voice training group (practicing resonant exercises across 6 weeks, n = 10), a straw phonation group (practicing straw phonation across 6 weeks, n = 10), or a control group (receiving no voice training, n = 10). A voice assessment protocol consisting of both subjective (questionnaire, participant's self-report, auditory-perceptual evaluation) and objective (maximum performance task, aerodynamic assessment, voice range profile, acoustic analysis, acoustic voice quality index, dysphonia severity index) measurements and determinations was used to evaluate the participants' voice pre- and posttraining. Groups were compared over time using linear mixed models and generalized linear mixed models. Within-group effects of time were determined using post hoc pairwise comparisons. No significant time × group interactions were found for any of the outcome measures, indicating no differences in evolution over time among the 3 groups. Within-group effects of time showed a significant improvement in dysphonia severity index in the resonant voice training group, and a significant improvement in the intensity range in the straw phonation group. Results suggest that the semi-occluded vocal tract training programs using resonant voice training and straw phonation may have a positive impact on the vocal quality and vocal capacities of future occupational voice users. The resonant voice training caused an improved dysphonia severity index, and the straw phonation training caused an expansion of the intensity range in
Acoustic Analysis of Speech of Cochlear Implantees and Its Implications

PubMed Central

Patadia, Rajesh; Govale, Prajakta; Rangasayee, R.; Kirtane, Milind

2012-01-01

Objectives Cochlear implantees have improved speech production skills compared with those using hearing aids, as reflected in their acoustic measures. When compared to normal hearing controls, implanted children had fronted vowel space and their /s/ and /ʃ/ noise frequencies overlapped. Acoustic analysis of speech provides an objective index of perceived differences in speech production which can be precursory in planning therapy. The objective of this study was to compare acoustic characteristics of speech in cochlear implantees with those of normal hearing age matched peers to understand implications. Methods Group 1 consisted of 15 children with prelingual bilateral severe-profound hearing loss (age, 5-11 years; implanted between 4-10 years). Prior to implantation, behind-the-ear hearing aids were used; before and after implantation, subjects received at least 1 year of aural intervention. Group 2 consisted of 15 normal hearing age matched peers. Sustained productions of vowels and words with selected consonants were recorded. Using Praat software for acoustic analysis, digitized speech tokens were measured for F1, F2, and F3 of vowels; centre frequency (Hz) and energy concentration (dB) in burst; voice onset time (VOT in ms) for stops; centre frequency (Hz) of noise in /s/; rise time (ms) for affricates. A t-test was used to find significant differences between groups. Results Significant differences were found in VOT for /b/, F1 and F2 of /e/, and F3 of /u/. No significant differences were found for centre frequency of burst, energy concentration for stops, centre frequency of noise in /s/, or rise time for affricates. These findings suggest that auditory feedback provided by cochlear implants enables subjects to monitor production of speech sounds. Conclusion Acoustic analysis of speech is an essential method for discerning characteristics which have or have not been improved by cochlear implantation and thus for planning intervention. PMID:22701768
Relation of structural and vibratory kinematics of the vocal folds to two acoustic measures of breathy voice based on computational modeling.

PubMed

Samlan, Robin A; Story, Brad H

2011-10-01

To relate vocal fold structure and kinematics to 2 acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). The authors used a computational, kinematic model of the medial surfaces of the vocal folds to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow, and acoustic waveforms and were related to the underlying vocal fold kinematics. CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1.0 mm to 1.5 mm and decreased with separation > 1.5 mm. CPP is generally a function of vocal process separation. H1*-H2* (see paragraph 6 of article text for an explanation of the asterisks) will increase or decrease with vocal process separation on the basis of vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures.
Relation of structural and vibratory kinematics of the vocal folds to two acoustic measures of breathy voice based on computational modeling

PubMed Central

Samlan, Robin A.; Story, Brad H.

2011-01-01

Purpose To relate vocal fold structure and kinematics to two acoustic measures: cepstral peak prominence (CPP) and the amplitude of the first harmonic relative to the second (H1-H2). Method A computational, kinematic model of the medial surfaces of the vocal folds was used to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: degree of vocal fold adduction, surface bulging, vibratory nodal point, and supraglottal constriction. CPP and H1-H2 were measured from simulated glottal area, glottal flow and acoustic waveforms and related to the underlying vocal fold kinematics. Results CPP decreased with increased separation of the vocal processes, whereas the nodal point location had little effect. H1-H2 increased as a function of separation of the vocal processes in the range of 1–1.5 mm and decreased with separation > 1.5 mm. Conclusions CPP is generally a function of vocal process separation. H1*-H2* will increase or decrease with vocal process separation based on vocal fold shape, pivot point for the rotational mode, and supraglottal vocal tract shape, limiting its utility as an indicator of breathy voice. Future work will relate the perception of breathiness to vocal fold kinematics and acoustic measures. PMID:21498582
Relationship Between Acoustic Voice Onset and Offset and Selected Instances of Oscillatory Onset and Offset in Young Healthy Men and Women.

PubMed

Patel, Rita R; Forrest, Karen; Hedges, Drew

2017-05-01

This study aimed to investigate the relationship between (1) onset of the acoustic signal (X1a) and prephonatory phases associated with oscillatory onset and (2) offset of the acoustic signal (X2a) with the postphonatory events associated with oscillatory offset across vocally healthy adults. High-speed videoendoscopy was captured simultaneously with the acoustic signal during repeated production of /hi.hi.hi/ at typical pitch and loudness from 56 vocally healthy adults (aged 20-42 years; 21 men, 35 women). The relationships between the acoustic sound pressure signal and oscillatory onset and offset events from the glottal area waveforms (GAWs) were statistically investigated using a multivariate linear regression analysis. The X1a is a significant predictor of the onset of first oscillatory motion (X1g) and onset of sustained oscillations (X2g). X1a as well as gender are significant predictors of the first medial contact of the vocal folds (X1.5g). The X2a is a significant predictor of the first instance of oscillatory offset (X3g), first instance of incomplete glottal closure (X3.5g), and complete cessation of (vocal fold) oscillatory motion (X4g). The acoustic signal onset is closely related to the X1.5g, but the latency between these events is longer for women compared to men. The X2a occurs immediately after incomplete glottal adduction. The emerging normative group latencies between the onset and offset of the acoustic and the GAW from this study appear promising for future investigations. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Applied Chaos Level Test for Validation of Signal Conditions Underlying Optimal Performance of Voice Classification Methods.

PubMed

Liu, Boquan; Polce, Evan; Sprott, Julien C; Jiang, Jack J

2018-05-17

The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. A diffusive behavior detection-based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.
An Investigation of Multidimensional Voice Program Parameters in Three Different Databases for Voice Pathology Detection and Classification.

PubMed

Al-Nasheri, Ahmed; Muhammad, Ghulam; Alsulaiman, Mansour; Ali, Zulfiqar; Mesallam, Tamer A; Farahat, Mohamed; Malki, Khalid H; Bencherif, Mohamed A

2017-01-01

Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes. Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples. The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
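The abstract above ranks MDVP parameters by the Fisher discrimination ratio. For two classes, a common form of this ratio is the squared difference of the class means divided by the sum of the class variances. The sketch below assumes that form and uses population variances; the feature names and values are invented for illustration and are not taken from any of the three databases.

```python
from statistics import mean, pvariance


def fisher_ratio(normal, pathological):
    """Two-class Fisher discrimination ratio for a single feature.

    Assumed form: (mean_1 - mean_2)^2 / (var_1 + var_2), with population variances.
    """
    numerator = (mean(normal) - mean(pathological)) ** 2
    denominator = pvariance(normal) + pvariance(pathological)
    return numerator / denominator


# Hypothetical feature values: (normal samples, pathological samples).
features = {
    "jitter_percent": ([0.4, 0.5, 0.6], [1.0, 1.4, 1.8]),
    "f0_hz": ([190.0, 210.0, 230.0], [185.0, 205.0, 240.0]),
}

# Rank features from most to least discriminative.
ranking = sorted(features, key=lambda name: fisher_ratio(*features[name]), reverse=True)
for name in ranking:
    print(name, round(fisher_ratio(*features[name]), 3))
```

A feature whose class means coincide scores zero regardless of its spread, which is why the ranking can change from one database to another when the class distributions differ.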
Investigating the Effects of Glottal Stop Productions on Voice in Children With Cleft Palate Using Multidimensional Voice Assessment Methods.

PubMed

Aydınlı, Fatma Esen; Özcebe, Esra; Kulak Kayıkçı, Maviş E; Yılmaz, Taner; Özgür, Fatma F

2016-11-01

The aim was to investigate the effects of glottal stop productions (GS) on voice in children with cleft palate using multidimensional voice assessment methods. This is a prospective case-control study. Children with repaired cleft palate (n = 34) who did not have any vocal fold lesions were separated into two groups based on the results of the articulation test. The glottal stop group (GSG) consisted of 17 children who had GS. The control group (CG) consisted of an equal number of age- and gender-matched children who did not have GS. The voice evaluation protocol included acoustic analysis, Pediatric Voice Handicap Index (pVHI), and perceptual analysis (Grade, Roughness, Breathiness, Asthenia, Strain method). The velopharyngeal statuses of the groups were compared using the nasopharyngoscopy and the nasometer. The total pVHI score and the subscales of the pVHI were found to be significantly higher in the GSG. The F0, jitter, and shimmer were found to be numerically higher in the GSG with the difference being statistically significant in jitter (P < 0.05). Audioperceptual analysis revealed a difference in overall voice quality and roughness between the groups. Greater incidence of significant velopharyngeal insufficiency and higher nasalance scores were found in the GSG (P < 0.05). These results may indicate that the vocal quality characteristics of children with GS differ from children who do not have this type of production. It is suggested that children with cleft palate who have GS should receive a comprehensive speech and language pathology intervention including voice therapy techniques. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Role of the Internal Superior Laryngeal Nerve in the Motor Responses of Vocal Cords and the Related Voice Acoustic Changes

PubMed Central

Seifpanahi, Sadegh; Izadi, Farzad; Jamshidi, Ali-Ashraf; Torabinezhad, Farhad; Sarrafzadeh, Javad; Mohammadi, Siavash

2016-01-01

Background: Repeated efforts by researchers to impose voice changes by laryngeal surface electrical stimulation (SES) have come to no avail. This present pre-experimental study employed a novel method for SES application so as to evoke the motor potential of the internal superior laryngeal nerve (ISLN) and create voice changes. Methods: Thirty-two normal individuals (22 females and 10 males) participated in this study. The subjects were selected from the students of Iran University of Medical Sciences in 2014. Two monopolar active electrodes were placed on the thyrohyoid space at the location of the ISLN entrance to the larynx and 1 dispersive electrode was positioned on the back of the neck. A current with special programmed parameters was applied to stimulate the ISLN via the active electrodes and simultaneously the resultant acoustic changes were evaluated. All the means of the acoustic parameters during SES and rest periods were compared using the paired t-test. Results: The findings indicated significant changes (P=0.00) in most of the acoustic parameters during SES presentation compared to them at rest. The mean of fundamental frequency standard deviation (SD F0) at rest was 1.54 (SD=0.55) versus 4.15 (SD=3.00) for the SES period. The other investigated parameters comprised fundamental frequency (F0), minimum F0, jitter, shimmer, harmonic-to-noise ratio (HNR), mean intensity, and minimum intensity. Conclusion: These findings demonstrated significant changes in most of the important acoustic features, suggesting that the stimulation of the ISLN via SES could induce motor changes in the vocal folds. The clinical applicability of the method utilized in the current study in patients with vocal fold paralysis requires further research. PMID:27582586
Swinging at a cocktail party: voice familiarity aids speech perception in the presence of a competing voice.

PubMed

Johnsrude, Ingrid S; Mackey, Allison; Hakyemez, Hélène; Alexander, Elizabeth; Trang, Heather P; Carlyon, Robert P

2013-10-01

People often have to listen to someone speak in the presence of competing voices. Much is known about the acoustic cues used to overcome this challenge, but almost nothing is known about the utility of cues derived from experience with particular voices--cues that may be particularly important for older people and others with impaired hearing. Here, we use a version of the coordinate-response-measure procedure to show that people can exploit knowledge of a highly familiar voice (their spouse's) not only to track it better in the presence of an interfering stranger's voice, but also, crucially, to ignore it so as to comprehend a stranger's voice more effectively. Although performance declines with increasing age when the target voice is novel, there is no decline when the target voice belongs to the listener's spouse. This finding indicates that older listeners can exploit their familiarity with a speaker's voice to mitigate the effects of sensory and cognitive decline.
Voice characteristics of children aged between 6 and 13 years: impact of age, gender, and vocal training.

PubMed

Pribuisiene, Ruta; Uloza, Virgilijus; Kardisiene, Vilija

2011-12-01

To determine impact of age, gender, and vocal training on voice characteristics of children aged 6-13 years. Voice acoustic and phonetogram parameters were determined for the group of 44 singing and 31 non-singing children. No impact of gender and/or age on phonetogram, acoustic voice parameters, and maximum phonation time was detected. Voice ranges of all children represented a pre-pubertal soprano type with a voice range of 22 semitones for non-singing and of 26 semitones for singing individuals. The mean maximum voice intensity was 81 dB. Vocal training had a positive impact on voice intensity parameters in girls. The presented data on average voice characteristics may be applicable in the clinical practice and provide relevant support for voice assessment.
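Voice ranges such as the 22 and 26 semitones reported above are a logarithmic measure of the ratio between the highest and lowest fundamental frequencies, with 12 semitones per octave. A minimal sketch of the conversion follows; the 220 Hz lower limit is an assumed value for illustration and does not come from the study.

```python
import math


def range_in_semitones(f_low_hz, f_high_hz):
    """Voice range in semitones between two fundamental frequencies."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def upper_frequency(f_low_hz, semitones):
    """Frequency lying the given number of semitones above f_low_hz."""
    return f_low_hz * 2.0 ** (semitones / 12.0)


# Hypothetical lower limit of 220 Hz (assumption, not a value from the abstract).
low_hz = 220.0
for span in (22, 26):
    print(f"{span} semitones above {low_hz:.0f} Hz: {upper_frequency(low_hz, span):.1f} Hz")
```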

Voice activity and participation profile: assessing the impact of voice disorders on daily activities.

PubMed

Ma, E P; Yiu, E M

2001-06-01

Traditional clinical voice evaluation focuses primarily on the severity of voice impairment, with little emphasis on the impact of voice disorders on the individual's quality of life. This study reports the development of a 28-item assessment tool that evaluates the perception of voice problem, activity limitation, and participation restriction using the International Classification of Impairments, Disabilities and Handicaps-2 Beta-1 concept (World Health Organization, 1997). The questionnaire was administered to 40 subjects with dysphonia and 40 control subjects with normal voices. Results showed that the dysphonic group reported significantly more severe voice problems, limitation in daily voice activities, and restricted participation in these activities than the control group. The study also showed that the perception of a voice problem by the dysphonic subjects correlated positively with the perception of limitation in voice activities and restricted participation. However, the self-perceived voice problem had little correlation with the degree of voice-quality impairment measured acoustically and perceptually by speech pathologists. The data also showed that the aggregate scores of activity limitation and participation restriction were positively correlated, and the extent of activity limitation and participation restriction was similar in all except the job area. These findings highlight the importance of identifying and quantifying the impact of dysphonia on the individual's quality of life in the clinical management of voice disorders.
Nonlinear dynamic analysis of voices before and after surgical excision of vocal polyps

NASA Astrophysics Data System (ADS)

Zhang, Yu; McGilligan, Clancy; Zhou, Liang; Vig, Mark; Jiang, Jack J.

2004-05-01

Phase space reconstruction, correlation dimension, and second-order entropy, methods from nonlinear dynamics, are used to analyze sustained vowels generated by patients before and after surgical excision of vocal polyps. Two conventional acoustic perturbation parameters, jitter and shimmer, are also employed to analyze voices before and after surgery. Presurgical and postsurgical analyses of jitter, shimmer, correlation dimension, and second-order entropy are statistically compared. Correlation dimension and second-order entropy show a statistically significant decrease after surgery, indicating reduced complexity and higher predictability of postsurgical voice dynamics. There is not a significant postsurgical difference in shimmer, although jitter shows a significant postsurgical decrease. The results suggest that jitter and shimmer should be applied to analyze disordered voices with caution; however, nonlinear dynamic methods may be useful for analyzing abnormal vocal function and quantitatively evaluating the effects of surgical excision of vocal polyps.
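Phase space reconstruction is usually done by time-delay embedding, and correlation dimension is often estimated from the correlation sum in the style of Grassberger and Procaccia. The abstract does not state which algorithm or parameters were used, so the sketch below is an assumption-laden illustration: the embedding dimension, delay, radii, and the synthetic sine signal are all hypothetical.

```python
import math


def delay_embed(signal, dim, delay):
    """Time-delay embedding: point i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay])."""
    n = len(signal) - (dim - 1) * delay
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n)]


def correlation_sum(points, radius):
    """Fraction of distinct point pairs that lie closer together than radius."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < radius
    )
    return 2.0 * close / (n * (n - 1))


# Synthetic stand-in for a sustained vowel (assumption: a plain sine wave).
signal = [math.sin(0.2 * t) for t in range(300)]
points = delay_embed(signal, dim=3, delay=8)

# The correlation dimension is read from the slope of log C(r) against log r
# over the range of radii where that relation is approximately linear.
curve = [(r, correlation_sum(points, r)) for r in (0.05, 0.1, 0.2, 0.4)]
for r, c in curve:
    print(f"r = {r:<5} C(r) = {c:.4f}")
```

The pairwise loop is quadratic in the number of points, which is acceptable for a short illustration but would need a neighbour-search structure for full-length recordings.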
Effects of Early Smoking Habits on Young Adult Female Voices in Greece.

PubMed

Tafiadis, Dionysios; Toki, Eugenia I; Miller, Kevin J; Ziavra, Nausica

2017-11-01

Cigarette use is a preventable cause of mortality and diseases. The World Health Organization states that Europe and especially Greece has the highest occurrence of smoking among adults. The prevalence of smoking among women in Greece was estimated to be over 30% in 2012. Smoking is a risk factor for many diseases. Studies have demonstrated the association between smoking and laryngeal pathologies as well as changes in voice characteristics. The purpose of this study was to estimate the effect of early smoking habit on young adult female voices and if they perceive any vocal changes using two assessment methods. The Voice Handicap Index and the acoustic analyses of voice measurements were used, with both serving as mini-assessment protocols. Two hundred and ten young females (110 smokers and 100 nonsmokers) attending the Technological Educational Institute of Epirus in the School of Health and Welfare were included. Statistically significant increases for physical and total scores of the Voice Handicap Index were found in the smokers group (P < 0.05). Significant changes were observed for the acoustic parameters between smoker and nonsmoker groups. The results of this study indicated observable signs of change in the voice acoustic characteristics of young adults with early smoking habits. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Comparison of hearing and voicing ranges in singing

NASA Astrophysics Data System (ADS)

Hunter, Eric J.; Titze, Ingo R.

2003-04-01

The spectral and dynamic ranges of the human voice of professional and nonprofessional vocalists were compared to the auditory hearing and feeling thresholds at a distance of one meter. In order to compare these, an analysis was done in true dB SPL, not just relative dB as is usually done in speech analysis. The methodology of converting the recorded acoustic signal to absolute pressure units was described. The human voice range of a professional vocalist appeared to match the dynamic range of the auditory system at some frequencies. In particular, it was demonstrated that professional vocalists were able to make use of the most sensitive part of the hearing thresholds (around 4 kHz) through the use of a learned vocal ring or singer's formant. [Work sponsored by NIDCD.]
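The abstract does not detail how the recordings were converted to absolute pressure. One generic approach, sketched here as an assumption rather than as the authors' method, is to record a calibration tone of known level through the same chain, derive a pascals-per-unit factor, and then express any recording in dB SPL relative to 20 µPa. The function names and sample values are hypothetical.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure for dB SPL in air


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def calibration_factor(cal_samples, cal_level_db_spl):
    """Pascals per digital unit, from a recorded tone of known SPL."""
    p_rms_pa = P_REF_PA * 10.0 ** (cal_level_db_spl / 20.0)
    return p_rms_pa / rms(cal_samples)


def db_spl(samples, pascals_per_unit):
    """Absolute level in dB SPL for samples recorded through the calibrated chain."""
    return 20.0 * math.log10(rms(samples) * pascals_per_unit / P_REF_PA)


# Hypothetical calibration tone assumed to have been recorded at 94 dB SPL.
cal_tone = [0.5 * math.sin(2 * math.pi * k / 48) for k in range(4800)]
factor = calibration_factor(cal_tone, 94.0)

voice = [0.25 * math.sin(2 * math.pi * k / 48) for k in range(4800)]
print(f"voice level: {db_spl(voice, factor):.1f} dB SPL")
```

Without the calibration step, only level differences between recordings are meaningful, which is the "relative dB" limitation the abstract refers to.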
Acoustic characteristics of different target vowels during the laryngeal telescopy.

PubMed

Shu, Min-Tsan; Lee, Kuo-Shen; Chang, Chin-Wen; Hsieh, Li-Chun; Yang, Cheng-Chien

2014-10-01

The aim of this study was to investigate the acoustic characteristics of target vowels phonated in normal voice persons while performing laryngeal telescopy. The acoustic characteristics are compared to show the extent of possible difference to speculate their impact on phonation function. Thirty-four male subjects aged 20-39 years with normal voice were included in this study. The target vowels were /i/ and /ɛ/. Recording of voice samples was done under natural phonation and during laryngeal telescopy. The acoustic analysis included the parameters of fundamental frequency, jitter, shimmer and noise-to-harmonic ratio. The sound of a target vowel /ɛ/ was perceived identical in more than 90% of the subjects by the examiner and speech language pathologist during the telescopy. Both /i/ and /ɛ/ sounds showed significant difference when compared with the results under natural phonation. There was no significant difference between /i/ and /ɛ/ during the telescopy. The present study showed that change in target vowels during laryngeal telescopy makes no significant difference in the acoustic characteristics. The results may lead to the speculation that the phonation mechanism was not affected significantly by different vowels during the telescopy. This study may suggest that in the principle of comfortable phonation, introduction of the target vowels /i/ and /ɛ/ is practical. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Voice-onset time and buzz-onset time identification: A ROC analysis

NASA Astrophysics Data System (ADS)

Lopez-Bascuas, Luis E.; Rosner, Burton S.; Garcia-Albea, Jose E.

2004-05-01

Previous studies have employed signal detection theory to analyze data from speech and nonspeech experiments. Typically, signal distributions were assumed to be Gaussian. Schouten and van Hessen [J. Acoust. Soc. Am. 104, 2980-2990 (1998)] explicitly tested this assumption for an intensity continuum and a speech continuum. They measured response distributions directly and, assuming an interval scale, concluded that the Gaussian assumption held for both continua. However, Pastore and Macmillan [J. Acoust. Soc. Am. 111, 2432 (2002)] applied ROC analysis to Schouten and van Hessen's data, assuming only an ordinal scale. Their ROC curves supported the Gaussian assumption for the nonspeech signals only. Previously, Lopez-Bascuas [Proc. Audit. Bas. Speech Percept., 158-161 (1997)] found evidence with a rating scale procedure that the Gaussian model was inadequate for a voice-onset time continuum but not for a noise-buzz continuum. Both continua contained ten stimuli with asynchronies ranging from -35 ms to +55 ms. ROC curves (double-probability plots) are now reported for each pair of adjacent stimuli on the two continua. Both speech and nonspeech ROCs often appeared nonlinear, indicating non-Gaussian signal distributions under the usual zero-variance assumption for response criteria.
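A double-probability ROC plot places the z-transformed cumulative false-alarm rate against the z-transformed cumulative hit rate at each rating criterion. If both underlying distributions are Gaussian, the points fall on a straight line, so curvature is evidence against the Gaussian model. The sketch below illustrates the transformation only; the rating counts are invented and the function name is hypothetical.

```python
from itertools import accumulate
from statistics import NormalDist


def zroc_points(noise_counts, signal_counts):
    """Return (z_false_alarm, z_hit) pairs, one per rating criterion.

    Counts are ordered from the most confident "signal" rating to the most
    confident "noise" rating. The final cumulative point is always 1.0 and is
    dropped, because the inverse normal CDF is undefined at 0 and 1.
    """
    z = NormalDist().inv_cdf
    n_noise, n_signal = sum(noise_counts), sum(signal_counts)
    false_alarms = [c / n_noise for c in accumulate(noise_counts)][:-1]
    hits = [c / n_signal for c in accumulate(signal_counts)][:-1]
    return [(z(f), z(h)) for f, h in zip(false_alarms, hits)]


# Hypothetical rating counts for two adjacent stimuli (assumed, not study data).
noise_counts = [10, 40, 30, 20]
signal_counts = [30, 40, 20, 10]

points = zroc_points(noise_counts, signal_counts)
for z_fa, z_hit in points:
    print(f"z(FA) = {z_fa:+.3f}   z(hit) = {z_hit:+.3f}")
```

Empty rating categories at either extreme would produce cumulative rates of 0 or 1 and raise an error here; real analyses typically pool or correct such cells first.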
[The comparative assessment of the vocal function in the professional voice users and non-occupational voice users in the late adulthood].

PubMed

Pavlikhin, O G; Romanenko, S G; Krasnikova, D I; Lesogorova, E V; Yakovlev, V S

The objective of the present study was to evaluate the clinical and functional condition of the voice apparatus in the elderly patients and to elaborate recommendations for the prevention of disturbances of the vocal function in the professional voice users. This comprehensive study involved 95 patients including the active professional voice users (n=48) and 45 non-occupational voice users at the age from 61 to 82 years with the employment history varying from 32 to 51 years. The study was designed to obtain the voice characteristics by means of the subjective auditory assessment, microlaryngoscopy, video laryngostroboscopy, determination of maximum phonation time (MPT), and computer-assisted acoustic analysis of the voice with the use of the MDVP Kay Pentaxy system. The level of anxiety of the patients was estimated based on the results of the HADS questionnaire study. It is concluded that the majority of the disturbances of the vocal function in the professional voice users have the functional nature. It is concluded that the method of neuro-muscular electrophonopedic stimulation (NMEPS) of laryngeal muscles is the method of choice for the diagnostics of the vocal function of the voice users in the late adulthood. It is recommended that the professional vocal load for such subjects should not exceed 12-14 hours per week. Rational psychotherapy must constitute an important component of the system of measures intended to support the working capacity of the voice users belonging to this age group.
Scientific bases of human-machine communication by voice.

PubMed Central

Schafer, R W

1995-01-01

The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines. PMID:7479802
Modulation of voice related to tremor and vibrato

NASA Astrophysics Data System (ADS)

Lester, Rosemary Anne

Modulation of voice is a result of physiologic oscillation within one or more components of the vocal system including the breathing apparatus (i.e., pressure supply), the larynx (i.e. sound source), and the vocal tract (i.e., sound filter). These oscillations may be caused by pathological tremor associated with neurological disorders like essential tremor or by volitional production of vibrato in singers. Because the acoustical characteristics of voice modulation specific to each component of the vocal system and the effect of these characteristics on perception are not well-understood, it is difficult to assess individuals with vocal tremor and to determine the most effective interventions for reducing the perceptual severity of the disorder. The purpose of the present studies was to determine how the acoustical characteristics associated with laryngeal-based vocal tremor affect the perception of the magnitude of voice modulation, and to determine if adjustments could be made to the voice source and vocal tract filter to alter the acoustic output and reduce the perception of modulation. This research was carried out using both a computational model of speech production and trained singers producing vibrato to simulate laryngeal-based vocal tremor with different voice source characteristics (i.e., vocal fold length and degree of vocal fold adduction) and different vocal tract filter characteristics (i.e., vowel shapes). It was expected that, by making adjustments to the voice source and vocal tract filter that reduce the amplitude of the higher harmonics, the perception of magnitude of voice modulation would be reduced. The results of this study revealed that listeners' perception of the magnitude of modulation of voice was affected by the degree of vocal fold adduction and the vocal tract shape with the computational model, but only by the vocal quality (corresponding to the degree of vocal fold adduction) with the female singer. Based on regression analyses
Relationship Between Voice and Motor Disabilities of Parkinson's Disease.

PubMed

Majdinasab, Fatemeh; Karkheiran, Siamak; Soltani, Majid; Moradi, Negin; Shahidi, Gholamali

2016-11-01

To evaluate voice of Iranian patients with Parkinson's disease (PD) and find any relationship between motor disabilities and acoustic voice parameters as speech motor components. We evaluated 27 Farsi-speaking PD patients and 21 age- and sex-matched healthy persons as control. Motor performance was assessed by the Unified Parkinson's Disease Rating Scale part III and Hoehn and Yahr rating scale in the "on" state. Acoustic voice evaluation, including fundamental frequency (f0), standard deviation of f0, minimum of f0, maximum of f0, shimmer, jitter, and harmonic to noise ratio, was done using the Praat software via /a/ prolongation. No difference was seen between the voice of the patients and the voice of the controls. f0 and its variation had a significant correlation with the duration of the disease, but did not have any relationships with the Unified Parkinson's Disease Rating Scale part III. Only limited relationship was observed between voice and motor disabilities. Tremor is an important main feature of PD that affects motor and phonation systems. Females had an older age at onset, more prolonged disease, and more severe motor disabilities (not statistically significant), but phonation disorders were more frequent in males and showed more relationship with severity of motor disabilities. Voice is affected by PD earlier than many other motor components and is more sensitive to disease progression. Tremor is the most effective part of PD that impacts voice. PD has more effect on voice of male versus female patients. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
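For readers unfamiliar with the perturbation measures named in this record, the sketch below shows one common definition of local jitter and local shimmer: the mean absolute difference between consecutive glottal periods (or cycle peak amplitudes), divided by the mean period (or amplitude). The function names and sample values are hypothetical, and the sketch assumes the per-cycle periods and amplitudes have already been extracted from a sustained vowel; it is not the study's analysis pipeline.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (as a fraction)."""
    return local_perturbation(periods_s)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (as a fraction)."""
    return local_perturbation(amplitudes)


if __name__ == "__main__":
    # Hypothetical cycle measurements from a sustained /a/.
    periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
    amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
    print(f"jitter  = {100 * local_jitter(periods):.2f} %")
    print(f"shimmer = {100 * local_shimmer(amplitudes):.2f} %")
```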
The maximum intelligible range of the human voice

NASA Astrophysics Data System (ADS)

Boren, Braxton

This dissertation examines the acoustics of the spoken voice at high levels and the maximum number of people that could hear such a voice unamplified in the open air. In particular, it examines an early auditory experiment by Benjamin Franklin which sought to determine the maximum intelligible crowd for the Anglican preacher George Whitefield in the eighteenth century. Using Franklin's description of the experiment and a noise source on Front Street, the geometry and diffraction effects of such a noise source are examined to more precisely pinpoint Franklin's position when Whitefield's voice ceased to be intelligible. Based on historical maps, drawings, and prints, the geometry and material of Market Street is constructed as a computer model which is then used to construct an acoustic cone tracing model. Based on minimal values of the Speech Transmission Index (STI) at Franklin's position, Whitefield's on-axis Sound Pressure Level (SPL) at 1 m is determined, leading to estimates centering around 90 dBA. Recordings are carried out on trained actors and singers to determine their maximum time-averaged SPL at 1 m. This suggests that the greatest average SPL achievable by the human voice is 90-91 dBA, similar to the median estimates for Whitefield's voice. The sites of Whitefield's largest crowds are acoustically modeled based on historical evidence and maps. Based on Whitefield's SPL, the minimal STI value, and the crowd's background noise, this allows a prediction of the minimally intelligible area for each site. These yield maximum crowd estimates of 50,000 under ideal conditions, while crowds of 20,000 to 30,000 seem more reasonable when the crowd was reasonably quiet and Whitefield's voice was near 90 dBA.
Quantitative analysis of professionally trained versus untrained voices.

PubMed

Siupsinskiene, Nora

2003-01-01

The aim of this study was to compare healthy trained and untrained voices as well as healthy and dysphonic trained voices in adults using combined voice range profile and aerodynamic tests, to define the normal range limiting values of quantitative voice parameters and to select the most informative quantitative voice parameters for separation between healthy and dysphonic trained voices. Three groups of persons were evaluated. One hundred eighty-six healthy volunteers were divided into two groups according to voice training: the non-professional speakers group consisted of 106 persons with untrained voices (36 males and 70 females) and the professional speakers group of 80 persons with trained voices (21 males and 59 females). The clinical group consisted of 103 dysphonic professional speakers (23 males and 80 females) with various voice disorders. Eighteen quantitative voice parameters from the combined voice range profile (VRP) test were analyzed: 8 of voice range profile, 8 of speaking voice, overall vocal dysfunction degree and coefficient of sound, and aerodynamic maximum phonation time. Analysis showed that healthy professional speakers demonstrated expanded vocal abilities in comparison to healthy non-professional speakers. Quantitative voice range profile parameters (pitch range, high frequency limit, area of high frequencies and coefficient of sound) differed significantly between healthy professional and non-professional voices, and were more informative than speaking voice or aerodynamic parameters in showing the voice training. Logistic stepwise regression revealed that VRP area in high frequencies was sufficient to discriminate between healthy and dysphonic professional speakers for male subjects (overall discrimination accuracy 81.8%), and a combination of three quantitative parameters (VRP high frequency limit, maximum voice intensity and slope of speaking curve) for female subjects (overall model discrimination accuracy 75.4%). We concluded that quantitative voice assessment
Reproducibility of Automated Voice Range Profiles, a Systematic Literature Review.

PubMed

Printz, Trine; Rosenberg, Tine; Godballe, Christian; Dyrvig, Anne-Kirstine; Grøntved, Ågot Møller

2018-05-01

Reliable voice range profiles are of great importance when measuring effects and side effects from surgery affecting voice capacity. Automated recording systems are increasingly used, but the reproducibility of results is uncertain. Our objective was to identify and review the existing literature on test-retest accuracy of the automated voice range profile assessment. Systematic review. PubMed, Scopus, Cochrane Library, ComDisDome, Embase, and CINAHL (EBSCO). We conducted a systematic literature search of six databases from 1983 to 2016. The following keywords were used: phonetogram, voice range profile, and acoustic voice analysis. Inclusion criteria were automated recording procedure, healthy voices, and no intervention between test and retest. Test-retest values concerning fundamental frequency and voice intensity were reviewed. Of 483 abstracts, 231 full-text articles were read, resulting in six articles included in the final results. The studies found high reliability, but data are few and heterogeneous. The reviewed articles generally reported high reliability of the voice range profile, and thus clinical usefulness, but uncertainty remains because of low sample sizes and different procedures for selecting, collecting, and analyzing data. More data are needed, and clinical conclusions must be drawn with caution. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Voice changes after thyroidectomy without recurrent laryngeal nerve injury.

PubMed

Sinagra, Diego L; Montesinos, Manuel R; Tacchi, Verónica A; Moreno, Julio C; Falco, Jorge E; Mezzadri, Norberto A; Debonis, Daniel L; Curutchet, H Pablo

2004-10-01

Injury of the inferior laryngeal nerve is not the only cause of voice alteration after thyroidectomy; many patients notice minimal changes immediately after operation, without evidence of inferior laryngeal nerve damage. We hypothesized that there may be other causes for voice modification, such as injuries of the superior laryngeal nerve, prethyroid strap muscles, and cricothyroid muscles. We describe voice changes after total thyroidectomy, without inferior laryngeal nerve injury, using a computer program to objectively compare different patterns of voice. Forty-six consecutive patients who underwent total thyroidectomy were studied between March 1997 and December 1999. Acoustic voice analysis was performed preoperatively and at the second, fourth, and sixth postoperative months using a microphone adapted to a personal computer. Parameters measured were intensity of the voice (Shimmer) and fundamental frequency (Fo). No complications occurred during operation or in the postoperative period. Voice fatigue during phonation was the most common symptom after thyroidectomy. Forty patients (87%) stated that their voices had changed since the operation, and common complaints were voice alteration while speaking loudly, changes in voice pitch, and voice disorder while singing. Changes in the Fo and Shimmer values in smokers versus nonsmokers were similar (Fo overall, p = 0.56; Shimmer overall, p = 0.66), as were the same parameters in benign and malignant pathologies (Fo overall, p = 0.66; Shimmer overall, p = 0.67). Voice changes after uncomplicated thyroidectomy occur and can be objectively measured. This is important in the preoperative counseling of patients before thyroidectomy, for ethical and legal purposes.
Preliminary study of acoustic analysis for evaluating speech-aid oral prostheses: Characteristic dips in octave spectrum for comparison of nasality.

PubMed

Chang, Yen-Liang; Hung, Chao-Ho; Chen, Po-Yueh; Chen, Wei-Chang; Hung, Shih-Han

2015-10-01

Acoustic analysis is often used in speech evaluation but seldom for the evaluation of oral prostheses designed for reconstruction of surgical defect. This study aimed to introduce the application of acoustic analysis for patients with velopharyngeal insufficiency (VPI) due to oral surgery and rehabilitated with oral speech-aid prostheses. The pre- and postprosthetic rehabilitation acoustic features of sustained vowel sounds from two patients with VPI were analyzed and compared with the acoustic analysis software Praat. There were significant differences in the octave spectrum of sustained vowel speech sound between the pre- and postprosthetic rehabilitation. Acoustic measurements of sustained vowels for patients before and after prosthetic treatment showed no significant differences for all parameters of fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, formant frequency, F1 bandwidth, and band energy difference. The decrease in objective nasality perceptions correlated very well with the decrease in dips of the spectra for the male patient with a higher speech bulb height. Acoustic analysis may be a potential technique for evaluating the functions of oral speech-aid prostheses, which eliminates dysfunctions due to the surgical defect and contributes to a high percentage of intelligible speech. Octave spectrum analysis may also be a valuable tool for detecting changes in nasality characteristics of the voice during prosthetic treatment of VPI. Copyright © 2014. Published by Elsevier B.V.
Acoustic and perceptual aspects of vocal function in children with adenotonsillar hypertrophy--effects of surgery.

PubMed

Lundeborg, Inger; Hultcrantz, Elisabeth; Ericsson, Elisabeth; McAllister, Anita

2012-07-01

To evaluate outcome of two types of tonsil surgery (tonsillectomy [TE]+adenoidectomy or tonsillotomy [TT]+adenoidectomy) on vocal function perceptually and acoustically. Sixty-seven children, aged 50-65 months, on waiting list for tonsil surgery were randomized to TE (n=33) or TT (n=34). Fifty-seven age- and gender-matched healthy preschool children were controls. Twenty-eight of them, aged 48-59 months, served as control group before surgery, and 29, aged 60-71 months, served as control group after surgery. Before surgery and 6 months postoperatively, the children were recorded producing three sustained vowels (/ɑ/, /u/, and /i/) and 14 words. The control groups were recorded only once. Three trained speech and language pathologists performed the perceptual analysis using visual analog scale for eight voice quality parameters. Acoustic analysis from sustained vowels included average fundamental frequency, jitter percent, shimmer percent, noise-to-harmonic ratio, and the center frequencies of formants 1-3. Before surgery, the children were rated to have more hyponasality and compressed/throaty voice (P<0.05) and lower mean pitch (P<0.01) in comparison to the control group. They also had higher perturbation measures and lower frequencies of the second and third formants. After surgery, there were no differences perceptually. Perturbation measures decreased but were still higher compared with those of control group (P<0.05). Differences in formant frequencies for /i/ and /u/ remained. No differences were found between the two surgical methods. Voice quality is affected perceptually and acoustically by adenotonsillar hypertrophy. After surgery, the voice is perceptually normalized but acoustic differences remain. Outcome was equal for both surgical methods. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
[A comparative acoustic study of the speaking and singing voice during the adolescent's break of the voice].

PubMed

Amy de la Bretèque, B; Sanchez, S

2000-01-01

The observation of the vocal evolution of adolescent singers has shown it takes place in two stages, the singing voice changing after the speaking voice. The same pattern has been encountered and made more explicit with a study of 50 non-singer adolescents. It thus appears that the average pitch of the speaking voice deepening by one octave is not by itself the sign that the break of the voice has ended. This study also shows the individual nature of adolescent vocal evolution and its length (up to two years in one out of four cases).
Comparative analysis of perceptual evaluation, acoustic analysis and indirect laryngoscopy for vocal assessment of a population with vocal complaint.

PubMed

Nemr, Kátia; Amar, Ali; Abrahão, Marcio; Leite, Grazielle Capatto de Almeida; Köhle, Juliana; Santos, Alexandra de O; Correa, Luiz Artur Costa

2005-01-01

As a result of technology evolution and development, methods of voice evaluation have changed both in medical and speech and language pathology practice. To relate the results of perceptual evaluation, acoustic analysis and medical evaluation in the diagnosis of vocal and/or laryngeal affections of the population with vocal complaint. Clinical prospective. 29 people who attended a vocal health protection campaign were evaluated. They underwent perceptual evaluation (AFPA), acoustic analysis (AA), indirect laryngoscopy (LI) and telelaryngoscopy (TL). Correlations between medical and speech language pathology evaluation methods were established, and their statistical significance was tested with Fisher's exact test. There were statistically significant results in the correlation between AFPA and LI, AFPA and TL, LI and TL. This research study conducted in a vocal health protection campaign presented correlations between speech language pathology evaluation and perceptual evaluation and clinical evaluation, as well as between vocal affection and/or laryngeal medical exams.
International Space Station Acoustics - A Status Report

NASA Technical Reports Server (NTRS)

Allen, Christopher S.; Denham, Samuel A.

2011-01-01

It is important to control acoustic noise aboard the International Space Station (ISS) to provide a satisfactory environment for voice communications, crew productivity, and restful sleep, and to minimize the risk for temporary and permanent hearing loss. Acoustic monitoring is an important part of the noise control process on ISS, providing critical data for trend analysis, noise exposure analysis, validation of acoustic analysis and predictions, and to provide strong evidence for ensuring crew health and safety, thus allowing Flight Certification. To this purpose, sound level meter (SLM) measurements and acoustic noise dosimetry are routinely performed. And since the primary noise sources on ISS include the environmental control and life support system (fans and airflow) and active thermal control system (pumps and water flow), acoustic monitoring will indicate changes in hardware noise emissions that may indicate system degradation or performance issues. This paper provides the current acoustic levels in the ISS modules and sleep stations, and is an update to the status presented in 2003. Many new modules and sleep stations have been added to the ISS since that time. In addition, noise mitigation efforts have reduced noise levels in some areas. As a result, the acoustic levels on the ISS have improved.
Physiological characteristics of the supported singing voice. A preliminary study.

PubMed

Griffin, B; Woo, P; Colton, R; Casper, J; Brewer, D

1995-03-01

The purpose of this study was to develop a definition of the supported singing voice based on physiological characteristics by comparing the subjects' concepts of a supported voice with objective measurements of their supported and unsupported voice. This preliminary report presents findings based on data from eight classically trained singers. Subjects answered questions about their concepts of the characteristics of the supported singing voice and how it is produced. Samples of the supported and unsupported singing voice produced at low, medium, and high pitches at a comfortable loudness level were collected for acoustic, spectral, airflow, electroglottographic, air volume, and stroboscopic analyses. Significant differences between the supported and unsupported voice were found for sound pressure level (SPL), peak airflow, subglottal pressure (Ps), glottal open time, and frequency of the fourth formant (F4). Mean flow and F2 frequency differences were sex and pitch related. Males adjusted laryngeal configuration to produce supported voice, whereas glottal configuration differences were greater in females. Breathing patterns were variable and not significantly different between supported and unsupported voice. Subjects in this study believe that the supported singing voice is resonant, clear, and easy to manage and is produced by correct breath management. Results of data analysis show that the supported singing voice has different spectral characteristics from and higher SPL, peak airflow, and Ps than the unsupported voice. Singers adjust laryngeal and/or glottal configuration to account for these changes, but no significant differences in breathing activity were found.

The Moderating Effect of Frequent Singing on Voice Aging.

PubMed

Lortie, Catherine L; Rivard, Julie; Thibeault, Mélanie; Tremblay, Pascale

2017-01-01

The effects of aging on voice production are well documented, including changes in loudness, pitch, and voice quality. However, one important and clinically relevant question that remains concerns the possibility that the aging of voice can be prevented or at least delayed through noninvasive methods. Indeed, discovering natural means to preserve the integrity of the human voice throughout aging could have a major impact on the quality of life of elderly adults. The objective of this study was therefore to examine the potentially positive effect of singing on voice production. To this aim, a group of 72 healthy nonsmoking adults (20-93 years old) was recruited and separated into three groups based on their singing habits. Several voice parameters were assessed (fundamental frequency [f0] mean, f0 standard deviation [SD], f0 minimum and f0 maximum, mean amplitude and amplitude SD, jitter, shimmer, and harmonic-to-noise ratio) during the sustained production of vowel /a/. Other parameters were assessed during standardized reading passage (speaking f0, speaking f0 SD). As was expected, age effects were found on most acoustic parameters with significant sex differences. Importantly, moderation analyses revealed that frequent singing moderates the effect of aging on most acoustic parameters. Specifically, in frequent singers, there was no decrease in the stability of pitch and amplitude with age, suggesting that the voice of frequent singers remains more stable in aging than the voice of non-singers, and more generally, providing empirical evidence for a positive effect of singing on voice in aging. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Speech Motor Development during Acquisition of the Voicing Contrast

ERIC Educational Resources Information Center

Grigos, Maria I.; Saxman, John H.; Gordon, Andrew M.

2005-01-01

Lip and jaw movements were studied longitudinally in 19-month-old children as they acquired the voicing contrast for /p/ and /b/. A movement tracking system obtained lip and jaw kinematics as participants produced the target utterances /papa/ and /baba/. Laryngeal adjustments were also tracked through acoustically recorded voice onset time (VOT)…
VOT and the perception of voicing

NASA Astrophysics Data System (ADS)

Remez, Robert E.

2004-05-01

In explaining the ability to distinguish phonemes, linguists have described the dimension of voicing. Acoustic analyses have identified many correlates of the voicing contrast in initial, medial, and final consonants within syllables, and these in turn have motivated studies of the perceptual resolution of voicing. The framing conceptualization articulated by Lisker and Abramson 40 years ago in physiological, phonetic, and perceptual studies has been widely influential, and research on voicing now adopts their perspective without reservation. Their original survey included languages with two voicing categories (Dutch, Puerto Rican Spanish, Hungarian, Tamil, Cantonese, English), three voicing categories (Eastern Armenian, Thai, Korean), and four voicing categories (Hindi, Marathi). Perceptual studies inspired by this work have also ranged widely, including tests with different languages and with listeners of several species. The profound value of the analyses of Lisker and Abramson is evident in the empirical traction provided by the concept of VOT in research on every important perceptual question about speech and language in our era. Some of these classic perceptual investigations will be reviewed. [Research supported by NIH (DC00308).]
Associations between the Transsexual Voice Questionnaire (TVQ[superscript MtF]) and Self-Report of Voice Femininity and Acoustic Voice Measures

ERIC Educational Resources Information Center

Dacakis, Georgia; Oates, Jennifer; Douglas, Jacinta

2017-01-01

Background: The Transsexual Voice Questionnaire (TVQ[superscript MtF]) was designed to capture the voice-related perceptions of individuals whose gender identity as female is the opposite of their birth-assigned gender (MtF women). Evaluation of the psychometric properties of the TVQ[superscript MtF] is ongoing. Aims: To investigate associations…
Objective and subjective assessment of tracheoesophageal prosthesis voice outcome.

PubMed

D'Alatri, Lucia; Bussu, Francesco; Scarano, Emanuele; Paludetti, Gaetano; Marchese, Maria Raffaella

2012-09-01

To investigate the relationships between objective measures and the results of subjective assessment of voice quality and speech intelligibility in patients submitted to total laryngectomy and tracheoesophageal (TE) puncture. Retrospective. Twenty patients implanted with voice prosthesis were studied. After surgery, the entire sample performed speech rehabilitation. The assessment protocol included maximum phonation time (MPT), number of syllables per deep breath, acoustic analysis of the sustained vowel /a/ and of a bisyllabic word, perceptual evaluation (pleasantness and intelligibility%), and self-assessment. The correlation between pleasantness and intelligibility% was statistically significant. Both the latter were significantly correlated with the acoustic signal type, the number of formant peaks, and the F2-F1 difference. The intelligibility% and number of formant peaks were significantly correlated with the MPT and number of syllables per deep breath. Moreover, significant correlations were found between the number of formant peaks and both intelligibility% and pleasantness. The higher the number of syllables per deep breath and the longer the MPT, significantly higher was the number of formant peaks and the intelligibility%. The study failed to show significant correlation between patient's self-assessment of voice quality and both pleasantness and communication effectiveness. The multidimensional assessment seems to be a reliable tool to evaluate the TE functional outcome. Particularly, the results showed that both pleasantness and intelligibility of TE speech are correlated to the availability of expired air and the function of the vocal tract. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Perceptual Adaptation of Voice Gender Discrimination with Spectrally Shifted Vowels

ERIC Educational Resources Information Center

Li, Tianhao; Fu, Qian-Jie

2011-01-01

Purpose: To determine whether perceptual adaptation improves voice gender discrimination of spectrally shifted vowels and, if so, which acoustic cues contribute to the improvement. Method: Voice gender discrimination was measured for 10 normal-hearing subjects, during 5 days of adaptation to spectrally shifted vowels, produced by processing the…
Panel acoustic contribution analysis.

PubMed

Wu, Sean F; Natarajan, Logesh Kumar

2013-02-01

Formulations are derived to analyze the relative panel acoustic contributions of a vibrating structure. The essence of this analysis is to correlate the acoustic power flow from each panel to the radiated acoustic pressure at any field point. The acoustic power is obtained by integrating the normal component of the surface acoustic intensity, which is the product of the surface acoustic pressure and normal surface velocity reconstructed by using the Helmholtz equation least squares based nearfield acoustical holography, over each panel. The significance of this methodology is that it enables one to analyze and rank relative acoustic contributions of individual panels of a complex vibrating structure to acoustic radiation anywhere in the field based on a single set of the acoustic pressures measured in the near field. Moreover, this approach is valid for both interior and exterior regions. Examples of using this method to analyze and rank the relative acoustic contributions of a scaled vehicle cabin are demonstrated.
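The ranking step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' formulation: it assumes the surface pressure and normal velocity have already been reconstructed as complex peak amplitudes at sample points (the holography step is not shown), uses the standard time-averaged normal intensity 0.5·Re(p·conj(v)), and approximates the surface integral by a sum over patch areas. The names `panel_power`, `rank_panels`, and the sample data are hypothetical.

```python
def panel_power(samples):
    """Approximate acoustic power of one panel.

    samples: iterable of (pressure, normal_velocity, patch_area) tuples, with
    pressure and velocity given as complex peak amplitudes (assumption).
    """
    power = 0.0
    for pressure, velocity, area in samples:
        # Time-averaged normal intensity for harmonic fields.
        intensity = 0.5 * (pressure * velocity.conjugate()).real
        power += intensity * area
    return power


def rank_panels(panels):
    """Return (panel name, power) pairs sorted from largest to smallest power."""
    powers = {name: panel_power(samples) for name, samples in panels.items()}
    return sorted(powers.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical reconstructed surface data for two panels.
    panels = {
        "roof": [(2 + 0j, 1 + 0j, 0.5), (1 + 1j, 1 + 0j, 0.5)],
        "door": [(0 + 1j, 1 + 0j, 1.0)],  # pressure and velocity in quadrature
    }
    for name, power in rank_panels(panels):
        print(f"{name}: {power:.3f} W")
```

A panel whose pressure and velocity are in quadrature radiates no net power in this approximation, which is why it ranks last in the example.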
Start/End Delays of Voiced and Unvoiced Speech Signals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herrnstein, A

Recent experiments using low power EM-radar like sensors (e.g., GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place. Similarly the end time of a voiced speech segment can be measured. Secondly it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications, it is useful to know typical duration times of these unvoiced speech segments. A corpus, assembled earlier of spoken ''Timit'' words, phrases, and sentences and recorded using simultaneously measured acoustic and EM-sensor glottal signals, from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech, using the acoustic signal, and the onset (or end) of voiced speech using the EM sensor signal, the average duration times for unvoiced segments preceding onset of vocalization were found to be 300ms, and for following segments, 500ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal, as the onset-time marker for the voiced speech segment and end marker for the unvoiced segment. Then, by subtracting 300ms from the onset time mark of voicing, the unvoiced speech segment start time is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven to be useful for work in our laboratory, a great deal of additional work remains to validate such data for use with general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
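The segment bookkeeping described in this abstract amounts to simple interval arithmetic, sketched below. The 300 ms and 500 ms values are the average durations reported above; the `Segment` type, the function name, the clamping of the start time at zero, and the reading that a following unvoiced segment spans 500 ms after the voiced end are assumptions made for illustration.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start_ms: float
    end_ms: float


PRE_UNVOICED_MS = 300.0   # average unvoiced duration before voicing onset (from the abstract)
POST_UNVOICED_MS = 500.0  # average unvoiced duration after voicing ends (from the abstract)


def unvoiced_segments(voiced, pre_ms=PRE_UNVOICED_MS, post_ms=POST_UNVOICED_MS):
    """Estimate the unvoiced segments around one voiced segment.

    The voiced onset marks the end of the preceding unvoiced segment, and the
    voiced end marks the start of the following one.
    """
    # Clamping at 0 ms is an assumption so that times stay inside the recording.
    before = Segment(max(0.0, voiced.start_ms - pre_ms), voiced.start_ms)
    after = Segment(voiced.end_ms, voiced.end_ms + post_ms)
    return before, after


if __name__ == "__main__":
    # Hypothetical voiced segment detected by the EM sensor, in milliseconds.
    before, after = unvoiced_segments(Segment(1000.0, 1800.0))
    print("preceding unvoiced:", before)
    print("following unvoiced:", after)
```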
Bioengineered vocal fold mucosa for voice restoration

PubMed Central

Ling, Changying; Li, Qiyao; Brown, Matthew E.; Kishimoto, Yo; Toya, Yutaka; Devine, Erin E.; Choi, Kyeong-Ok; Nishimoto, Kohei; Norman, Ian G.; Tsegyal, Tenzin; Jiang, Jack J.; Burlingham, William J.; Gunasekaran, Sundaram; Smith, Lloyd M.; Frey, Brian L.; Welham, Nathan V.

2015-01-01

Patients with voice impairment caused by advanced vocal fold (VF) fibrosis or tissue loss have few treatment options. A transplantable, bioengineered VF mucosa would address the individual and societal costs of voice-related communication loss. Such a tissue must be biomechanically capable of aerodynamic-to-acoustic energy transfer and high-frequency vibration, and physiologically capable of maintaining a barrier against the airway lumen. Here, we isolated primary human VF fibroblasts and epithelial cells and cocultured them under organotypic conditions. The resulting engineered mucosae showed morphologic features of native tissue, proteome-level evidence of mucosal morphogenesis and emerging extracellular matrix complexity, and rudimentary barrier function in vitro. When grafted into canine larynges ex vivo, the mucosae generated vibratory behavior and acoustic output that were indistinguishable from those of native VF tissue. When grafted into humanized mice in vivo, the mucosae survived and were well tolerated by the human adaptive immune system. This tissue engineering approach has the potential to restore voice function in patients with otherwise untreatable VF mucosal disease. PMID:26582902
Perceptual and acoustic study of professionally trained versus untrained voices.

PubMed

Brown, W S; Rothman, H B; Sapienza, C M

2000-09-01

Acoustic and perceptual analyses were completed to determine the effect of vocal training on professional singers when speaking and singing. Twenty professional singers and 20 nonsingers, acting as the control, were recorded while sustaining a vowel, reading a modified Rainbow Passage, and singing "America the Beautiful." Acoustic measures included fundamental frequency, duration, percent jitter, percent shimmer, noise-to-harmonic ratio, and determination of the presence or absence of both vibrato and the singer's formant. Results indicated that, whereas certain acoustic parameters differentiated singers from nonsingers within sex, no consistently significant trends were found across males and females for either speaking or singing. The most consistent differences were the presence or absence of the singer's vibrato and formant in the singers versus the nonsingers, respectively. Perceptual analysis indicated that singers could be correctly identified with greater frequency than by chance alone from their singing, but not their speaking utterances.
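Percent jitter and percent shimmer recur throughout these records. The abstracts do not say which formulas the analysis software used, so the sketch below assumes one common "local" definition: the mean absolute difference between consecutive cycles, divided by the mean over all cycles, expressed as a percentage. The function names are hypothetical.

```python
def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


def percent_jitter(periods_s):
    """Cycle-to-cycle perturbation of glottal period lengths (seconds)."""
    return percent_perturbation(periods_s)


def percent_shimmer(peak_amplitudes):
    """Cycle-to-cycle perturbation of per-cycle peak amplitudes."""
    return percent_perturbation(peak_amplitudes)
```

A perfectly periodic signal gives 0% on both measures. The hard part in practice is extracting reliable cycle boundaries from the recording, which this sketch takes as given.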
Accuracy of Acoustic Analysis Measurements in the Evaluation of Patients With Different Laryngeal Diagnoses.

PubMed

Lopes, Leonardo Wanderley; Batista Simões, Layssa; Delfino da Silva, Jocélio; da Silva Evangelista, Deyverson; da Nóbrega E Ugulino, Ana Celiane; Oliveira Costa Silva, Priscila; Jefferson Dias Vieira, Vinícius

2017-05-01

This study aims to investigate the accuracy of acoustic measures in discriminating between patients with different laryngeal diagnoses. The study design is descriptive, cross-sectional, and retrospective. A total of 279 female patients participated in the research. Acoustic measures of the mean and standard deviation (SD) values of the fundamental frequency (F0), jitter, shimmer, and glottal to noise excitation (GNE) were extracted from the emission of the vowel /ε/. Isolated acoustic measures do not demonstrate adequate performance in discriminating patients with and without laryngeal alteration. The combination of GNE, SD of the F0, jitter, and shimmer improved the ability to classify patients with and without laryngeal alteration. In isolation, the SD of the F0, shimmer, and GNE presented acceptable performance in discriminating individuals with different laryngeal diagnoses. The combination of acoustic measurements caused discrete improvement in performance of the classifier to discriminate healthy larynx vs vocal polyp (SD of the F0, shimmer, and GNE), healthy larynx vs unilateral vocal fold paralysis (SD of the F0 and jitter), healthy larynx vs vocal nodules (SD of the F0 and jitter), healthy larynx vs sulcus vocalis (SD of the F0 and shimmer), and healthy larynx vs voice disorder due to gastroesophageal reflux (F0 mean, jitter, and shimmer). Isolated acoustic measures do not demonstrate adequate performance in discriminating patients with and without laryngeal alteration, although they present acceptable performance in classifying different laryngeal diagnoses. Combined acoustic measures present an acceptable capacity to discriminate between the presence and the absence of laryngeal alteration and to differentiate several laryngeal diagnoses. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
World Voice Day in news: analysis of reports on the Voice Campaign in Brazil.

PubMed

Dornelas, Rodrigo; Giannini, Susana Pimentel Pinto; Ferreira, Léslie Piccolotto

2015-01-01

To analyze the television reports on World Voice Day transmitted by Globo® TV. We researched television reports broadcast by the Globo® Network in regional television news programs from March 15 to April 20, 2013. For the data analysis, the Document Analysis technique was used. The analyzed variables were the following: location, broadcasting period, duration, interviewed professional, mention of multiprofessional work, orientation to the population, and the interview approach (health promotion or disease prevention). Through statistical analysis, the interview approach was considered the outcome and associated with the other variables. In the regions where the researched TV station has news programs, the majority made reports about the Voice Campaign. Among these, we found that all five regions of Brazil were covered, in the morning/afternoon periods, with a mean duration of 5.3 minutes. The speech-language pathologist was present in the greatest number of interviews, as was the emphasis on the importance of multiprofessional work. Regarding the content presented, the interviewees focused on diseases caused by habits that impair the voice, with orientation to the public about what negatively interferes with vocal well-being. In most cases, the interviews did not take a single approach (promoting vocal well-being or preventing voice disorders), and interprofessional practice is still seen less frequently as a possible work strategy.
How do you say 'hello'? Personality impressions from brief novel voices.

PubMed

McAleer, Phil; Todorov, Alexander; Belin, Pascal

2014-01-01

On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, hitherto studies have focussed on extended speech as opposed to analysing the instantaneous impressions we obtain from first experience. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word 'hello' on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional 'social voice space' with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
Voice disorders in actors.

PubMed

Lerner, Michael Z; Paskhover, Boris; Acton, Lynn; Young, Nwanmegha

2013-11-01

The purpose of this study was to investigate the prevalence of vocal pathology among first-year acting students. A retrospective review of 30 first-year graduate-level drama students between 2009 and 2011 was performed. Stroboscopy, Voice Handicap Index-10 questionnaires, and acoustic measures were analyzed. The prevalence of incomplete glottal closure, laryngeal hyperfunction, and decreased mucosal wave was 62%, 59%, and 55%, respectively. Laryngoscopic findings consistent with laryngopharyngeal reflux (LPR) were demonstrated in 48% of subjects. Subgroup analysis of laryngeal hyperfunctioning (HF) and nonhyperfunctioning drama students revealed an increased prevalence of all videostroboscopic abnormalities in the HF group. The increased prevalence of LPR stigmata in HF actors reached statistical significance (P = 0.04). The vocal demands of actors are unique, requiring the effective use of volume, pitch control, and endurance. This is the first study that systematically analyzes the prevalence of vocal pathology in actors. This study will continue throughout their education, anticipating that our feedback along with their vocal training will improve outcomes. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Analysis of laryngeal amyloidosis using high speed digital phonoscopy and acoustics (Conference Presentation)

NASA Astrophysics Data System (ADS)

Blanco, Matthew; Cruz, Raul M.; Izdebski, Krzysztof; Yan, Yuling

2017-02-01

Amyloidosis is an unknown pathogenic process in which abnormally folded proteins are deposited in the extracellular space as macroscopic aggregates. Laryngeal deposits of these proteins are extremely rare, but primarily cause dysphonia in patients. High Speed Digital Phonoscopy (HSDP) was used to capture the kinematics of vocal folds in a patient with laryngeal amyloidosis. Acoustic data was also recorded and both HSDP and acoustics were processed using custom Vocalizer® software to help elucidate the physiological impact of amyloids in the larynx, especially in regards to effects on the voice.
Effect of Spinal Manipulative Therapy on the Singing Voice.

PubMed

Fachinatto, Ana Paula A; Duprat, André de Campos; Silva, Marta Andrada E; Bracher, Eduardo Sawaya Botelho; Benedicto, Camila de Carvalho; Luz, Victor Botta Colangelo; Nogueira, Maruan Nogueira; Fonseca, Beatriz Suster Gomes

2015-09-01

This study investigated the effect of spinal manipulative therapy (SMT) on the singing voice of male individuals. Randomized, controlled, case-crossover trial. Twenty-nine subjects were selected among male members of the Heralds of the Gospel. This association was chosen because it is a group of persons with similar singing activities. Participants were randomly assigned to two groups: (A) chiropractic SMT procedure and (B) nontherapeutic transcutaneous electrical nerve stimulation (TENS) procedure. Recordings of the singing voice of each participant were taken immediately before and after the procedures. After a 14-day period, procedures were switched between groups: participants who underwent SMT on the first day were subjected to TENS and vice versa. Recordings were subjected to perceptual audio and acoustic evaluations. The same recording segment of each participant was selected. Perceptual audio evaluation was performed by a specialist panel (SP). Recordings of each participant were randomly presented thus making the SP blind to intervention type and recording session (before/after intervention). Recordings compiled in a randomized order were also subjected to acoustic evaluation. No differences in the quality of the singing on perceptual audio evaluation were observed between TENS and SMT. No differences in the quality of the singing voice of asymptomatic male singers were observed on perceptual audio evaluation or acoustic evaluation after a single spinal manipulative intervention of the thoracic and cervical spine. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
[An across-scales analysis of the voice self-concept questionnaire (FESS)].

PubMed

Nusseck, Manfred; Richter, Bernhard; Echternach, Matthias; Spahn, Claudia

2018-04-01

The questionnaire for the assessment of the voice self-concept (FESS) contains three sub-scales indicating a person's relationship with their own voice. The scales address the relationship with one's own voice, the awareness of the use of one's own voice, and the perception of the connection between voice and emotional changes. A comprehensive approach across the three scales supporting a simplified interpretation of the results was still missing. The FESS questionnaire was used in a sample of 536 German teachers. With a discrimination analysis, commonalities in the scale characteristics were investigated. For a comparative validation with voice health and psychological and physiological wellbeing, the Voice Handicap Index (VHI), the questionnaire for Work-related Behavior and Experience Patterns (AVEM), and the questionnaire for Health-related Quality of Life (SF-12) were additionally collected. The analysis provided four different groups of voice self-concept: group 1 with healthy values in the voice self-concept and wellbeing scales, group 2 with a low voice self-concept and mean wellbeing values, group 3 with a high awareness of the voice use and mean wellbeing values, and group 4 with low values in all scales. The results show that a combined approach across all scales of the questionnaire for the assessment of the voice self-concept enables a more detailed interpretation of the characteristics in the voice self-concept. The presented groups provide an applicable use supporting medical diagnoses. © Georg Thieme Verlag KG Stuttgart · New York.
‘Inner voices’: the cerebral representation of emotional voice cues described in literary texts

PubMed Central

Kreifelts, Benjamin; Gößling-Arnold, Christina; Wertheimer, Jürgen; Wildgruber, Dirk

2014-01-01

While non-verbal affective voice cues are generally recognized as a crucial behavioral guide in any day-to-day conversation their role as a powerful source of information may extend well beyond close-up personal interactions and include other modes of communication such as written discourse or literature as well. Building on the assumption that similarities between the different ‘modes’ of voice cues may not only be limited to their functional role but may also include cerebral mechanisms engaged in the decoding process, the present functional magnetic resonance imaging study aimed at exploring brain responses associated with processing emotional voice signals described in literary texts. Emphasis was placed on evaluating ‘voice’ sensitive as well as task- and emotion-related modulations of brain activation frequently associated with the decoding of acoustic vocal cues. Obtained findings suggest that several similarities emerge with respect to the perception of acoustic voice signals: results identify the superior temporal, lateral and medial frontal cortex as well as the posterior cingulate cortex and cerebellum to contribute to the decoding process, with similarities to acoustic voice perception reflected in a ‘voice’-cue preference of temporal voice areas as well as an emotion-related modulation of the medial frontal cortex and a task-modulated response of the lateral frontal cortex. PMID:24396008
Mares Prefer the Voices of Highly Fertile Stallions

PubMed Central

Lemasson, Alban; Remeuf, Kévin; Trabalon, Marie; Cuir, Frédérique; Hausberger, Martine

2015-01-01

We investigated the possibility that stallion whinnies, known to encode caller size, also encoded information about caller arousal and fertility, and the reactions of mares in relation to type of voice. Voice acoustic features are correlated with arousal and reproduction success: the lower-pitched the stallion’s voice, the slower his heart beat and the higher his fertility. Females from three study groups preferred playbacks of low-pitched voices. Hence, females are attracted by frequencies encoding for large male size, calmness and high fertility. More work is needed to explore the relative importance of morpho-physiological features. Assortative mating may be involved, as large females preferred voices of larger stallions. Our study contributes to basic and applied ongoing research on mammal reproduction, and questions the mechanisms used by females to detect males’ fertility. PMID:25714814
Effect of the menstrual cycle on voice quality.

PubMed

Silverman, E M; Zimmer, C H

1978-01-01

The question addressed was whether most young women with no vocal training exhibit premenstrual hoarseness. Spectral (acoustical) analyses of the sustained productions of three vowels produced by 20 undergraduates at ovulation and at premenstruation were rated for degree of hoarseness. Statistical analysis of the data indicated that the typical subject was no more hoarse at premenstruation than at ovulation. To determine whether this finding represented a genuine characteristic of women's voices or a type II statistical error, a systematic replication was undertaken with another sample of 27 undergraduates. The finding replicated that of the original investigation, suggesting that premenstrual hoarseness is a rarely occurring condition among young women with no vocal training. The apparent differential effect of the menstrual cycle on trained as opposed to untrained voices deserves systematic investigation.

The source-filter theory of whistle-like calls in marmosets: Acoustic analysis and simulation of helium-modulated voices.

PubMed

Koda, Hiroki; Tokuda, Isao T; Wakita, Masumi; Ito, Tsuyoshi; Nishimura, Takeshi

2015-06-01

Whistle-like high-pitched "phee" calls are often used as long-distance vocal advertisements by small-bodied marmosets and tamarins in the dense forests of South America. While the source-filter theory proposes that vibration of the vocal fold is modified independently from the resonance of the supralaryngeal vocal tract (SVT) in human speech, a source-filter coupling that constrains the vibration frequency to SVT resonance effectively produces loud tonal sounds in some musical instruments. Here, a combined approach of acoustic analyses and simulation with helium-modulated voices was used to show that phee calls are produced principally with the same mechanism as in human speech. The animal keeps the fundamental frequency (f0) close to the first formant (F1) of the SVT, to amplify f0. Although f0 and F1 are primarily independent, the degree of their tuning can be strengthened further by a flexible source-filter interaction, the variable strength of which depends upon the cross-sectional area of the laryngeal cavity. The results highlight the evolutionary antiquity and universality of the source-filter model in primates, but the study can also explore the diversification of vocal physiology, including source-filter interaction and its anatomical basis in non-human primates.
Does CPAP treatment affect the voice?

PubMed

Saylam, Güleser; Şahin, Mustafa; Demiral, Dilek; Bayır, Ömer; Yüceege, Melike Bağnu; Çadallı Tatar, Emel; Korkmaz, Mehmet Hakan

2016-12-20

The aim of this study was to investigate alterations in voice parameters among patients using continuous positive airway pressure (CPAP) for the treatment of obstructive sleep apnea syndrome. Patients with an indication for CPAP treatment without any voice problems and with normal laryngeal findings were included and voice parameters were evaluated before and 1 and 6 months after CPAP. Videolaryngostroboscopic findings, a self-rated scale (Voice Handicap Index-10, VHI-10), perceptual voice quality assessment (GRBAS: grade, roughness, breathiness, asthenia, strain), and acoustic parameters were compared. Data from 70 subjects (48 men and 22 women) with a mean age of 44.2 ± 6.0 years were evaluated. When compared with the pre-CPAP treatment period, there was a significant increase in the VHI-10 score after 1 month of treatment and in VHI-10 and total GRBAS scores, jitter percent (P = 0.01), shimmer percent, noise-to-harmonic ratio, and voice turbulence index after 6 months of treatment. Vague negative effects on voice parameters after the first month of CPAP treatment became more evident after 6 months. We demonstrated nonsevere alterations in the voice quality of patients under CPAP treatment. Given that CPAP is a long-term treatment it is important to keep these alterations in mind.
Acoustic analyses of thyroidectomy-related changes in vowel phonation.

PubMed

Solomon, Nancy Pearl; Awan, Shaheen N; Helou, Leah B; Stojadinovic, Alexander

2012-11-01

Changes in vocal function that can occur after thyroidectomy were tracked with acoustic analyses of sustained vowel productions. The purpose was to determine which time-based or spectral/cepstral-based measures of two vowels were able to detect voice changes over time in patients undergoing thyroidectomy. Prospective, longitudinal, and observational clinical trial. Voice samples of sustained /ɑ/ and /i/ recorded from 70 adults before and approximately 2 weeks, 3 months, and 6 months after thyroid surgery were analyzed for jitter, shimmer, harmonic-to-noise ratio (HNR), cepstral peak prominence (CPP), low-to-high ratio of spectral energy (L/H ratio), and the standard deviations of CPP and L/H ratio. Three trained listeners rated vowel and sentence productions for the four data collection sessions for each participant. For analysis purposes, participants were categorized post hoc according to voice outcome (VO) at their first postthyroidectomy assessment session. Shimmer, HNR, and CPP differed significantly across sessions; follow-up analyses revealed the strongest effect for CPP. CPP for /ɑ/ and /i/ differed significantly between groups of participants with normal versus negative (adverse) VO and between the pre- and 2-week postthyroidectomy sessions for the negative VO group. HNR, CPP, and L/H ratio differed across vowels, but both /ɑ/ and /i/ were similarly effective in tracking voice changes over time and differentiating VO groups. This study indicated that shimmer, HNR, and CPP determined from vowel productions can be used to track changes in voice over time as patients undergo and subsequently recover from thyroid surgery, with CPP being the strongest variable for this purpose. Evidence did not clearly reveal whether acoustic voice evaluations should include both /ɑ/ and /i/ vowels, but they should specify which vowel is used to allow for comparisons across studies and multiple clinical assessments. Copyright © 2012 The Voice Foundation. All rights
Change of signs, symptoms and voice quality evaluations throughout a 3- to 6-month empirical treatment for laryngopharyngeal reflux disease.

PubMed

Lechien, J R; Finck, C; Khalife, M; Huet, K; Delvaux, V; Picalugga, M; Harmegnies, B; Saussez, S

2018-05-16

To assess the usefulness of voice quality measurements as a treatment outcome in patients with laryngopharyngeal reflux (LPR)-related symptoms. Prospective uncontrolled multi-centre study. A total of 80 clinically diagnosed LPR patients with a reflux finding score (RFS) > 7 and a reflux symptom index (RSI) > 13 were treated with pantoprazole and diet recommendations for 3 or 6 months, according to their evolution. RSI; RFS; blinded Grade, Roughness, Breathiness, Asthenia, Strain and Instability (GRBASI); and aerodynamic and acoustic measurements were evaluated at baseline, 3 months (n = 80), and 6 months (n = 41) post-treatment. We conducted a correlation analysis between the adherence to the diet and the evolution of both signs and symptoms, and between videolaryngostroboscopic signs and acoustic measurements. Reflux symptom index, RFS, perceptual voice quality evaluations (dysphonia, roughness, strain and instability), and aerodynamic and acoustic measurements (i.e., percent jitter and percent shimmer) were significantly improved at 3 months post-treatment but not at 6 months. Percent jitter was the most useful outcome for evaluating the clinical evolution of patients throughout the treatment course. A significant relationship between globus sensation and posterior commissure hypertrophy was documented; both seemed to significantly improve from 3 to 6 months. The correlation analysis revealed correlations between adherence to diet recommendations and the improvement of symptoms and between posterior commissure granulation severity and acoustic measurement impairments. Voice quality improved in a manner similar to both signs and symptoms throughout a 6-month empirical treatment, with better improvement in the first 3 months. Voice quality assessments can be used as indicators of treatment effectiveness in patients with LPR-related symptoms. © 2018 John Wiley & Sons Ltd.
Tracking Voice Change after Thyroidectomy: Application of Spectral/Cepstral Analyses

ERIC Educational Resources Information Center

Awan, Shaheen N.; Helou, Leah B.; Stojadinovic, Alexander; Solomon, Nancy Pearl

2011-01-01

This study evaluates the utility of perioperative spectral and cepstral acoustic analyses to monitor voice change after thyroidectomy. Perceptual and acoustic analyses were conducted on speech samples (sustained vowel /ɑ/ and CAPE-V sentences) provided by 70 participants (36 women and 34 men) at four study time points: prior to thyroid…
Double Fourier analysis for Emotion Identification in Voiced Speech

NASA Astrophysics Data System (ADS)

Sierra-Sosa, D.; Bastidas, M.; Ortiz P., D.; Quintero, O. L.

2016-04-01

We propose a novel analysis alternative, based on two Fourier transforms, for emotion recognition from speech. Fourier analysis allows different signals to be displayed and synthesized in terms of power spectral density distributions. A spectrogram of the voice signal is obtained by performing a short-time Fourier transform with Gaussian windows; this spectrogram portrays frequency-related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in spectrogram time-frequency distributions. Next, the time-frequency representation of the signal from the spectrogram is treated as an image and processed through a 2-dimensional Fourier transform in order to perform a spatial Fourier analysis of it. Finally, features related to emotions in voiced speech are extracted and presented.
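The two-stage transform described in this abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the frame length, hop size, Gaussian width, and all function names are assumptions chosen for readability, and a naive O(n²) DFT stands in for the FFT a real system would use.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (illustration only, O(n^2))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def gaussian_window(n, sigma_ratio=0.15):
    """Gaussian window; sigma is an assumed fraction of the window length."""
    centre = (n - 1) / 2
    sigma = sigma_ratio * n
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n)]


def spectrogram(signal, frame_len=32, hop=16):
    """Stage 1: short-time Fourier transform with Gaussian windows.

    Returns power values; rows are time frames, columns frequency bins.
    """
    win = gaussian_window(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * win[i] for i in range(frame_len)]
        half = dft(frame)[: frame_len // 2 + 1]  # non-negative frequencies
        rows.append([abs(c) ** 2 for c in half])
    return rows


def dft2(image):
    """2-D DFT: transform every row, then every column."""
    row_pass = [dft(row) for row in image]
    col_pass = [dft(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]


def double_fourier(signal, frame_len=32, hop=16):
    """Stage 2: treat the spectrogram as an image and take its 2-D DFT."""
    spec = spectrogram(signal, frame_len, hop)
    return [[abs(c) for c in row] for row in dft2(spec)]
```

Applied to a steady tone, the spectrogram barely changes from frame to frame, so the second transform concentrates its energy at zero modulation frequency along the time axis. Time-varying voice features would spread energy away from that row, which is presumably the kind of structure the extracted emotion features build on.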
On the role of glottis-interior sources in the production of voiced sound.

PubMed

Howe, M S; McGowan, R S

2012-02-01

The voice source is dominated by aeroacoustic sources downstream of the glottis. In this paper an investigation is made of the contribution to voiced speech of secondary sources within the glottis. The acoustic waveform is ultimately determined by the volume velocity of air at the glottis, which is controlled by vocal fold vibration, pressure forcing from the lungs, and unsteady backreactions from the sound and from the supraglottal air jet. The theory of aerodynamic sound is applied to study the influence on the fine details of the acoustic waveform of "potential flow" added-mass-type glottal sources, glottis friction, and vorticity either in the glottis-wall boundary layer or in the portion of the free jet shear layer within the glottis. These sources govern predominantly the high frequency content of the sound when the glottis is near closure. A detailed analysis performed for a canonical, cylindrical glottis of rectangular cross section indicates that glottis-interior boundary/shear layer vortex sources and the surface frictional source are of comparable importance; the influence of the potential flow source is about an order of magnitude smaller. © 2012 Acoustical Society of America
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOEpatents

Holzrichter, John F.; Ng, Lawrence C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOEpatents

Holzrichter, J.F.; Ng, L.C.

1998-03-17

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
Factors associated with voice disorders among teachers: a case-control study.

PubMed

Giannini, Susana Pimentel Pinto; Latorre, Maria do Rosário Dias de Oliveira; Ferreira, Léslie Piccolotto

2013-01-01

We aimed at verifying an association between voice disorders/stress and loss of work ability among female teachers who work in São Paulo's public school system. This is a paired case-control study. The case group was composed of teachers with alterations in speech and larynx assessments, and the control group was formed by teachers without alterations in these evaluations who work in the same schools. Both groups answered the following questionnaires: Conditions of Vocal Production-Teachers, Job Stress Scale, and Work Ability Index. The analysis was performed using the chi-square association test and logistic regression models with the purpose of estimating the association between independent variables and voice disorders. We found differences between the groups in relation to stress in the workplace under high demand, a situation that poses greater risks of adverse reactions to the workers' physical and mental health. Regarding the ability to work, the categories poor and moderate ability for work are associated with voice disorders, regardless of job stress factors, age, and the unsatisfactory acoustic properties of the classrooms. This study confirmed the association between voice disorders and job stress, as well as between voice disorders and loss of work ability.
Improving Speaker Recognition by Biometric Voice Deconstruction

PubMed Central

Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

2015-01-01

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions. PMID:26442245
Improving Speaker Recognition by Biometric Voice Deconstruction.

PubMed

Mazaira-Fernandez, Luis Miguel; Álvarez-Marquina, Agustín; Gómez-Vilda, Pedro

2015-01-01

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have been forcedly replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved during last years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers combined with the use of a set of features derived from the components, resulting from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description about the main hypothesis and the methodology followed to extract the gender-dependent extended biometric parameters is given. Experimental validation is carried out both on a highly controlled acoustic condition database, and on a mobile phone network recorded under non-controlled acoustic conditions.
The Sound of Voice: Voice-Based Categorization of Speakers’ Sexual Orientation within and across Languages

PubMed Central

Maass, Anne; Paladino, Maria Paola; Vespignani, Francesco; Eyssel, Friederike; Bentler, Dominik

2015-01-01

Empirical research had initially shown that English listeners are able to identify the speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language-dependency) and to non-native speakers (language-specificity), has been questioned recently. Consequently, we address these open issues in 5 experiments: First, we tested whether Italian and German listeners are able to correctly identify sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly. However, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects the listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity. PMID:26132820
The Sound of Voice: Voice-Based Categorization of Speakers' Sexual Orientation within and across Languages.

PubMed

Sulpizio, Simone; Fasoli, Fabio; Maass, Anne; Paladino, Maria Paola; Vespignani, Francesco; Eyssel, Friederike; Bentler, Dominik

2015-01-01

Empirical research had initially shown that English listeners are able to identify the speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language-dependency) and to non-native speakers (language-specificity), has been questioned recently. Consequently, we address these open issues in 5 experiments: First, we tested whether Italian and German listeners are able to correctly identify sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly. However, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects the listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity.
Measurement of voice onset time in maxillectomy patients.

PubMed

Hattori, Mariko; Sumita, Yuka I; Taniguchi, Hisashi

2014-01-01

Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients for objective speech evaluation. We examined voice onset time for /ka/ and /ta/ in 13 maxillectomy patients by calculating the number of valid measurements of voice onset time out of three trials for each syllable. Wilcoxon's signed rank test showed that voice onset time measurements were more successful for /ka/ and /ta/ when a prosthesis was used (Z = -2.232, P = 0.026 and Z = -2.401, P = 0.016, resp.) than when a prosthesis was not used. These results indicate a prosthesis affected voice onset measurement in these patients. Although more research in this area is needed, measurement of voice onset time has the potential to be used to evaluate consonant production in maxillectomy patients wearing a prosthesis.
Measurement of Voice Onset Time in Maxillectomy Patients

PubMed Central

Hattori, Mariko; Sumita, Yuka I.; Taniguchi, Hisashi

2014-01-01

Objective speech evaluation using acoustic measurement is needed for the proper rehabilitation of maxillectomy patients. For digital evaluation of consonants, measurement of voice onset time is one option. However, voice onset time has not been measured in maxillectomy patients as their consonant sound spectra exhibit unique characteristics that make the measurement of voice onset time challenging. In this study, we established criteria for measuring voice onset time in maxillectomy patients for objective speech evaluation. We examined voice onset time for /ka/ and /ta/ in 13 maxillectomy patients by calculating the number of valid measurements of voice onset time out of three trials for each syllable. Wilcoxon's signed rank test showed that voice onset time measurements were more successful for /ka/ and /ta/ when a prosthesis was used (Z = −2.232, P = 0.026 and Z = −2.401, P = 0.016, resp.) than when a prosthesis was not used. These results indicate a prosthesis affected voice onset measurement in these patients. Although more research in this area is needed, measurement of voice onset time has the potential to be used to evaluate consonant production in maxillectomy patients wearing a prosthesis. PMID:24574934
Amygdala and auditory cortex exhibit distinct sensitivity to relevant acoustic features of auditory emotions.

PubMed

Pannese, Alessia; Grandjean, Didier; Frühholz, Sascha

2016-12-01

Discriminating between auditory signals of different affective value is critical to successful social interaction. It is commonly held that acoustic decoding of such signals occurs in the auditory system, whereas affective decoding occurs in the amygdala. However, given that the amygdala receives direct subcortical projections that bypass the auditory cortex, it is possible that some acoustic decoding occurs in the amygdala as well, when the acoustic features are relevant for affective discrimination. We tested this hypothesis by combining functional neuroimaging with the neurophysiological phenomena of repetition suppression (RS) and repetition enhancement (RE) in human listeners. Our results show that both amygdala and auditory cortex responded differentially to physical voice features, suggesting that the amygdala and auditory cortex decode the affective quality of the voice not only by processing the emotional content from previously processed acoustic features, but also by processing the acoustic features themselves, when these are relevant to the identification of the voice's affective value. Specifically, we found that the auditory cortex is sensitive to spectral high-frequency voice cues when discriminating vocal anger from vocal fear and joy, whereas the amygdala is sensitive to vocal pitch when discriminating between negative vocal emotions (i.e., anger and fear). Vocal pitch is an instantaneously recognized voice feature, which is potentially transferred to the amygdala by direct subcortical projections. These results together provide evidence that, besides the auditory cortex, the amygdala too processes acoustic information, when this is relevant to the discrimination of auditory emotions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Autophonic Loudness of Singers in Simulated Room Acoustic Environments.

PubMed

Yadav, Manuj; Cabrera, Densil

2017-05-01

This paper aims to study the effect of room acoustics and phonemes on the perception of loudness of one's own voice (autophonic loudness) for a group of trained singers. For a set of five phonemes, 20 singers vocalized over several autophonic loudness ratios, while maintaining pitch constancy over extreme voice levels, within five simulated rooms. There were statistically significant differences in the slope of the autophonic loudness function (logarithm of autophonic loudness as a function of voice sound pressure level) for the five phonemes, with slopes ranging from 1.3 (/a:/) to 2.0 (/z/). There was no significant variation in the autophonic loudness function slopes with variations in room acoustics. The autophonic room response, which represents a systematic decrease in voice levels with increasing levels of room reflections, was also studied, with some evidence found in support. Overall, the average slope of the autophonic room response for the three corner vowels (/a:/, /i:/, and /u:/) was -1.4 for medium autophonic loudness. The findings relating to the slope of the autophonic loudness function are in agreement with the findings of previous studies where the sensorimotor mechanisms in regulating voice were shown to be more important in the perception of autophonic loudness than hearing of room acoustics. However, the role of room acoustics, in terms of the autophonic room response, is shown to be more complicated, requiring further inquiry. Overall, it is shown that autophonic loudness grows at more than twice the rate of loudness growth for sounds created outside the human body. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
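As a rough illustration of the autophonic loudness function described above (logarithm of autophonic loudness as a function of voice sound pressure level), the slope of such a function can be estimated with an ordinary least-squares line. This is only a sketch and not the authors' procedure: the function name, the sample values, and the choice to express level in units of 10 dB are assumptions made for illustration.

```python
import math

def loudness_function_slope(levels_db, loudness_ratios):
    """Least-squares slope of log10(loudness ratio) against level in units of 10 dB."""
    if len(levels_db) != len(loudness_ratios) or len(levels_db) < 2:
        raise ValueError("need two or more paired observations")
    x = [level / 10.0 for level in levels_db]        # assumed scaling of the level axis
    y = [math.log10(r) for r in loudness_ratios]     # log of the loudness ratio
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical data, not taken from the study: loudness ratio doubles every 5 dB.
levels = [60.0, 65.0, 70.0, 75.0]
ratios = [1.0, 2.0, 4.0, 8.0]
slope = loudness_function_slope(levels, ratios)
```

A steeper slope means that the rated loudness of one's own voice grows faster for the same increase in voice level; comparing slopes across phonemes or rooms is the kind of analysis the abstract reports.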
International Space Station Acoustics - A Status Report

NASA Technical Reports Server (NTRS)

Allen, Christopher S.

2015-01-01

It is important to control acoustic noise aboard the International Space Station (ISS) to provide a satisfactory environment for voice communications, crew productivity, alarm audibility, and restful sleep, and to minimize the risk for temporary and permanent hearing loss. Acoustic monitoring is an important part of the noise control process on ISS, providing critical data for trend analysis, noise exposure analysis, validation of acoustic analyses and predictions, and to provide strong evidence for ensuring crew health and safety, thus allowing Flight Certification. To this purpose, sound level meter (SLM) measurements and acoustic noise dosimetry are routinely performed. And since the primary noise sources on ISS include the environmental control and life support system (fans and airflow) and active thermal control system (pumps and water flow), acoustic monitoring will reveal changes in hardware noise emissions that may indicate system degradation or performance issues. This paper provides the current acoustic levels in the ISS modules and sleep stations and is an update to the status presented in 2011. Since this last status report, many payloads (science experiment hardware) have been added and a significant number of quiet ventilation fans have replaced noisier fans in the Russian Segment. Also, noise mitigation efforts are planned to reduce the noise levels of the T2 treadmill and levels in Node 3, in general. As a result, the acoustic levels on the ISS continue to improve.
Acoustic, respiratory kinematic and electromyographic effects of vocal training

NASA Astrophysics Data System (ADS)

Mendes, Ana Paula De Brito Garcia

The longitudinal effects of vocal training on the respiratory, phonatory and articulatory systems were investigated in this study. During four semesters, fourteen voice major students were recorded while speaking and singing. Acoustic, temporal, respiratory kinematic and electromyographic parameters were measured to determine changes in the three systems as a function of vocal training. Acoustic measures of the speaking voice included fundamental frequency, sound pressure level (SPL), percent jitter and shimmer, and harmonic-to-noise ratio. Temporal measures included duration of sentences, diphthongs and the closure durations of stop consonants. Acoustic measures of the singing voice included fundamental frequency and sound pressure level of the phonational range, vibrato pulses per second, vibrato amplitude variation and the presence of the singer's formant. Analysis of the data revealed that vocal training had a significant effect on the singing voice. Fundamental frequency and SPL of the 90% level and 90-10% of the phonational range increased significantly during four semesters of vocal training. Physiological data were collected from four subjects during three semesters of vocal training. Respiratory kinematic measures included lung volume, rib cage and abdominal excursions extracted from spoken and sung samples. Descriptive statistics revealed that rib cage and abdominal excursions increased from the 1st to the 2nd semester and decreased from the 2nd to the 3rd semester of vocal training. Electromyographic measures of the pectoralis major, rectus abdominis and external obliques muscles revealed that burst duration means decreased from the 1st to the 2nd semester and increased from the 2nd to the 3rd semester. Peak amplitude means increased from the 1st to the 2nd and decreased from the 2nd to the 3rd semester of vocal training. Chest wall excursions and muscle force generation of the three muscles increased as the demanding level and the length of the phonatory

Speech Adjustments for Room Acoustics and Their Effects on Vocal Effort.

PubMed

Bottalico, Pasquale

2017-05-01

The aims of the present study are (1) to analyze the effects of the acoustical environment and the voice style on time dose (Dt_p) and fundamental frequency (mean f0 and standard deviation std_f0) while taking into account the effect of short-term vocal fatigue and (2) to predict the self-reported vocal effort from the voice acoustical parameters. Ten male and ten female subjects were recorded while reading a text in normal and loud styles, in three rooms (anechoic, semi-reverberant, and reverberant), with and without acrylic glass panels 0.5 m from the mouth, which increased external auditory feedback. Subjects quantified how much effort was required to speak in each condition on a visual analogue scale after each task. (Aim 1) In the loud style, Dt_p, f0, and std_f0 increased. The Dt_p was higher in the reverberant room compared to the other two rooms. Both genders tended to increase f0 in less reverberant environments, whereas a more monotonous speech was produced in rooms with greater reverberation. All three voice parameters increased with short-term vocal fatigue. (Aim 2) A model of the vocal effort to acoustic vocal parameters is proposed. The sound pressure level contributed to 66% of the variance explained by the model, followed by the f0 (30%) and the modulation in amplitude (4%). The results provide insight into how voice acoustical parameters can predict vocal effort. In particular, it increased when SPL and f0 increased and when the amplitude voice modulation decreased. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Voice Quality in Native and Foreign Languages Investigated by Inverse Filtering and Perceptual Analyses.

PubMed

Järvinen, Kati; Laukkanen, Anne-Maria; Geneid, Ahmed

2017-03-01

Language shift from native (L1) to foreign language (L2) may affect a speaker's voice production and induce vocal fatigue. This study investigates the effects of language shift on voice source and perceptual voice quality. This is a comparative experimental study. Twenty-four subjects were recorded in L1 and L2. Twelve of the subjects were native Finnish speakers and 12 were native English speakers, and the foreign languages were English and Finnish. Two groups were created based on reports of fatigability. Group 1 comprised the subjects who did not report more vocal fatigue in L2 than in L1, and group 2 comprised those who reported more vocal fatigue in L2 than in L1. Acoustic analyses by inverse filtering were conducted in L1 and L2. Also, the subjects' voices were perceptually evaluated in both languages. Results show that language shift from L1 to L2 increased perceived pressedness of voice. Acoustic analyses correlated with the perceptual evaluations. Also, the subjects who reported more vocal loading had poorer voice quality, more strenuous voice production, more pressed phonation, and a higher pitch. Voice production was less optimal in L2 than in L1. Speech training given in L2 could be beneficial for people who need to use L2 extensively. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The influence of vocal training and acting experience on measures of voice quality and emotional genuineness

PubMed Central

Livingstone, Steven R.; Choi, Deanna H.; Russo, Frank A.

2014-01-01

Vocal training through singing and acting lessons is known to modify acoustic parameters of the voice. While the effects of singing training have been well documented, the role of acting experience on the singing voice remains unclear. In two experiments, we used linear mixed models to examine the relationships between the relative amounts of acting and singing experience on the acoustics and perception of the male singing voice. In Experiment 1, 12 male vocalists were recorded while singing with five different emotions, each with two intensities. Acoustic measures of pitch accuracy, jitter, and harmonics-to-noise ratio (HNR) were examined. Decreased pitch accuracy and increased jitter, indicative of a lower “voice quality,” were associated with more years of acting experience, while increased pitch accuracy was associated with more years of singing lessons. We hypothesized that the acoustic deviations exhibited by more experienced actors was an intentional technique to increase the genuineness or truthfulness of their emotional expressions. In Experiment 2, listeners rated vocalists’ emotional genuineness. Vocalists with more years of acting experience were rated as more genuine than vocalists with less acting experience. No relationship was reported for singing training. Increased genuineness was associated with decreased pitch accuracy, increased jitter, and a higher HNR. These effects may represent a shifting of priorities by male vocalists with acting experience to emphasize emotional genuineness over pitch accuracy or voice quality in their singing performances. PMID:24639659
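For readers unfamiliar with the jitter measure named in this record, one common definition ("local" jitter) is the mean absolute difference between consecutive glottal periods, divided by the mean period. The sketch below assumes that definition and assumes the period durations have already been extracted; it is not the analysis pipeline used in the study, and the function name and sample values are hypothetical.

```python
def local_jitter_percent(periods):
    """Mean absolute difference between consecutive periods, relative to the mean period, in percent."""
    if len(periods) < 2:
        raise ValueError("need at least two glottal periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period

# Hypothetical period durations in seconds (roughly a 100 Hz voice).
periods_s = [0.0100, 0.0102, 0.0099, 0.0101]
jitter = local_jitter_percent(periods_s)
```

A perfectly periodic voice gives 0%; larger values indicate more cycle-to-cycle variation in pitch period.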
Pitch (F0) and formant profiles of human vowels and vowel-like baboon grunts: The role of vocalizer body size and voice-acoustic allometry

NASA Astrophysics Data System (ADS)

Rendall, Drew; Kollias, Sophie; Ney, Christina; Lloyd, Peter

2005-02-01

Key voice features, namely fundamental frequency (F0) and formant frequencies, can vary extensively between individuals. Much of the variation can be traced to differences in the size of the larynx and vocal-tract cavities, but whether these differences in turn simply reflect differences in speaker body size (i.e., neutral vocal allometry) remains unclear. Quantitative analyses were therefore undertaken to test the relationship between speaker body size and voice F0 and formant frequencies for human vowels. To test the taxonomic generality of the relationships, the same analyses were conducted on the vowel-like grunts of baboons, whose phylogenetic proximity to humans and similar vocal production biology and voice acoustic patterns recommend them for such comparative research. For adults of both species, males were larger than females and had lower mean voice F0 and formant frequencies. However, beyond this, F0 variation did not track body-size variation between the sexes in either species, nor within sexes in humans. In humans, formant variation correlated significantly with speaker height but only in males and not in females. Implications for general vocal allometry are discussed as are implications for speech origins theories, and challenges to them, related to laryngeal position and vocal tract length.
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holzrichter, J.F.; Ng, L.C.

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.
An initial study of voice characteristics of children using two different sound coding strategies in comparison to normal hearing children.

PubMed

Coelho, Ana Cristina; Brasolotto, Alcione Ghedini; Bevilacqua, Maria Cecília

2015-06-01

To compare some perceptual and acoustic characteristics of the voices of children who use the advanced combination encoder (ACE) or fine structure processing (FSP) speech coding strategies, and to investigate whether these characteristics differ from children with normal hearing. Acoustic analysis of the sustained vowel /a/ was performed using the multi-dimensional voice program (MDVP). Analyses of sequential and spontaneous speech were performed using the real time pitch. Perceptual analyses of these samples were performed using visual-analogic scales of pre-selected parameters. Seventy-six children from three years to five years and 11 months of age participated. Twenty-eight were users of ACE, 23 were users of FSP, and 25 were children with normal hearing. Although both groups with CI presented with some deviated vocal features, the users of ACE presented with voice quality more like children with normal hearing than the users of FSP. Sound processing of ACE appeared to provide better conditions for auditory monitoring of the voice, and consequently, for better control of the voice production. However, these findings need to be further investigated due to the lack of comparative studies published to understand exactly which attributes of sound processing are responsible for differences in performance.
Surgery or Rehabilitation: A Randomized Clinical Trial Comparing the Treatment of Vocal Fold Polyps via Phonosurgery and Traditional Voice Therapy with "Voice Therapy Expulsion" Training.

PubMed

Barillari, Maria Rosaria; Volpe, Umberto; Mirra, Giuseppina; Giugliano, Francesco; Barillari, Umberto

2017-05-01

Phonomicrosurgery is generally considered to be the treatment of choice for removing vocal fold polyps. However, specific techniques of voice therapy may represent, in selected cases and under certain conditions, a noninvasive therapeutic option for the treatment of such laryngeal lesions. The aim of the present study is to longitudinally assess, in terms of clinical outcomes and quality of life, two groups of patients with cordal polyps, treated either with standard surgery plus standard voice therapy or with a specific training of voice therapy alone, which we have called "Voice Therapy Expulsion." This study is a randomized controlled trial. A total of 150 patients with vocal fold polyps were randomly assigned to either standard surgery or "voice therapy expulsion" protocol. The trial was carried out at the Division of Phoniatrics and Audiology of the Second University of Naples and at the Division of Communication Disorders of Local Health Unit (3 Naples South) from January 2010 to December 2013. A thorough phoniatric evaluation, including laryngostroboscopy, acoustic voice analysis, global grade of dysphonia, instability, roughness, breathiness, asthenia, and strain scale, Voice Handicap Index, and Voice-Related Quality of Life, was performed by using standardized tools, at baseline, at the end of the treatment, and up to 1 year after treatment. We found no significant differences between the two experimental groups in terms of clinical outcomes and personal satisfaction. However, "Voice Therapy Expulsion" was associated with higher scores for quality of life at endpoint evaluation. Besides phonosurgery, this specific "Voice Therapy Expulsion" technique should be considered as a valid, noninvasive, and well-tolerated therapeutic option for the treatment of selected patients with vocal fold polyps. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The stop voicing contrast in French: From citation speech to sentencial speech

NASA Astrophysics Data System (ADS)

Abdelli-Beruh, Nassima; Demaio, Eileen; Hisagi, Miwako

2004-05-01

This study explores the influence of speaking style on the salience of the acoustic correlates to the stop voicing distinction in French. Monolingual French speakers produced twenty-one C_vC_ syllables in citation speech, in minimal pairs and in sentence-length utterances (/pa/-/a/ context: /il a di pa C_vC_ a lui/; /pas/-/s/ context: /il a di pas C_vC_ sa~ lui/). Prominent stress was on the C_vC_. Voicing-related differences in percentages of closure voicing, durations of aspiration, closure, and vowel were analyzed as a function of these three speaking styles. Results show that the salience of the acoustic-phonetic segments present when the syllables are uttered in isolation or in minimal pairs is different than when the syllables are spoken in a sentence. These results are in agreement with findings in English.
Relationship between perceived politeness and spectral characteristics of voice

NASA Astrophysics Data System (ADS)

Ito, Mika

2005-04-01

This study investigates the role of voice quality in perceiving politeness under conditions of varying relative social status among Japanese male speakers. The work focuses on four important methodological issues: experimental control of sociolinguistic aspects, eliciting natural spontaneous speech, obtaining recording quality suitable for voice quality analysis, and assessment of glottal characteristics through the use of non-invasive direct measurements of the speech spectrum. To obtain natural, unscripted utterances, the speech data were collected with a Map Task. This methodology allowed us to study the effect of manipulating relative social status among participants in the same community. We then computed the relative amplitudes of harmonics and formant peaks in spectra obtained from the Map Task recordings. Finally, an experiment was conducted to observe the alignment between acoustic measures and the perceived politeness of the voice samples. The results suggest that listeners' perceptions of politeness are determined by spectral characteristics of speakers, in particular, spectral tilts obtained by computing the difference in amplitude between the first harmonic and the third formant.
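The spectral tilt measure mentioned above (the difference in amplitude between the first harmonic and the third formant, often written H1-A3) can be sketched as follows. The sketch assumes a magnitude spectrum is already available as (frequency in Hz, level in dB) pairs and that estimates of F0 and F3 are supplied; the function names, search tolerances, and toy spectrum are illustrative assumptions and not the procedure used in the study.

```python
def nearest_peak_db(spectrum, target_hz, tolerance_hz):
    """Return the highest level (dB) among bins within tolerance_hz of target_hz."""
    candidates = [db for freq, db in spectrum if abs(freq - target_hz) <= tolerance_hz]
    if not candidates:
        raise ValueError("no spectral bins near the target frequency")
    return max(candidates)

def h1_minus_a3(spectrum, f0_hz, f3_hz):
    """Spectral tilt: level of the first harmonic minus level of the strongest harmonic near F3."""
    h1 = nearest_peak_db(spectrum, f0_hz, tolerance_hz=0.2 * f0_hz)  # assumed search window
    a3 = nearest_peak_db(spectrum, f3_hz, tolerance_hz=f0_hz)        # assumed search window
    return h1 - a3

# Toy spectrum (hypothetical values): harmonics of a 120 Hz voice, F3 near 2500 Hz.
toy_spectrum = [(120.0, 40.0), (240.0, 36.0), (2400.0, 18.0), (2520.0, 22.0), (2640.0, 17.0)]
tilt_db = h1_minus_a3(toy_spectrum, f0_hz=120.0, f3_hz=2500.0)
```

A larger positive value means energy falls off more steeply toward high frequencies, which is typically associated with a softer or breathier voice quality.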
Vocal Acoustic and Auditory-Perceptual Characteristics During Fluctuations in Estradiol Levels During the Menstrual Cycle: A Longitudinal Study.

PubMed

Arruda, Polyanna; Diniz da Rosa, Marine Raquel; Almeida, Larissa Nadjara Alves; de Araujo Pernambuco, Leandro; Almeida, Anna Alice

2018-03-07

Estradiol production varies cyclically, and changes in levels are hypothesized to affect the voice. The main objective of this study was to investigate vocal acoustic and auditory-perceptual characteristics during fluctuations in the levels of the hormone estradiol during the menstrual cycle. A total of 44 volunteers aged between 18 and 45 were selected. Of these, 27 women with regular menstrual cycles comprised the test group (TG) and 17 combined oral contraceptive users comprised the control group (CG). The study was performed in two phases. In phase 1, anamnesis was performed. Subsequently, the TG underwent blood sample collection for measurement of estradiol levels and voice recording for later acoustic and auditory-perceptual analysis. The CG underwent only voice recording. Phase 2 involved the same measurements as phase 1 for each group. Variables were evaluated using descriptive and inferential analysis to compare groups and phases and to determine relationships between variables. Voice changes were found during the menstrual cycle, and such changes were determined to be related to variations in estradiol levels. Impaired voice quality was observed to be associated with decreased levels of estradiol. The CG did not demonstrate significant vocal changes during phases 1 and 2. The TG showed significant increases in vocal parameters of roughness, tension, and instability during phase 2 (the period of low estradiol levels) when compared with the CG. Low estradiol levels were also found to be negatively correlated with the parameters of tension, instability, and jitter and positively correlated with fundamental voice frequency. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Rating, ranking, and understanding acoustical quality in university classrooms

NASA Astrophysics Data System (ADS)

Hodgson, Murray

2002-08-01

Nonoptimal classroom acoustical conditions directly affect speech perception and, thus, learning by students. Moreover, they may lead to voice problems for the instructor, who is forced to raise his/her voice when lecturing to compensate for poor acoustical conditions. The project applied previously developed simplified methods to predict speech intelligibility in occupied classrooms from measurements in unoccupied and occupied university classrooms. The methods were used to predict the speech intelligibility at various positions in 279 University of British Columbia (UBC) classrooms, when 70% occupied, and for four instructor voice levels. Classrooms were classified and rank ordered by acoustical quality, as determined by the room-average speech intelligibility. This information was used by UBC to prioritize classrooms for renovation. Here, the statistical results are reported to illustrate the range of acoustical qualities found at a typical university. Moreover, the variations of quality with relevant classroom acoustical parameters were studied to better understand the results. In particular, the factors leading to the best and worst conditions were studied. It was found that 81% of the 279 classrooms have "good," "very good," or "excellent" acoustical quality with a "typical" (average-male) instructor. However, 50 (18%) of the classrooms had "fair" or "poor" quality, and two had "bad" quality, due to high ventilation-noise levels. Most rooms were "very good" or "excellent" at the front, and "good" or "very good" at the back. Speech quality varied strongly with the instructor voice level. In the worst case considered, with a quiet female instructor, most of the classrooms were "bad" or "poor." Quality also varies with occupancy, with decreased occupancy resulting in decreased quality. The research showed that a new classroom acoustical design and renovation should focus on limiting background noise. They should promote high instructor speech levels at the back
Speech adjustments for room acoustics and their effects on vocal effort

PubMed Central

Bottalico, Pasquale

2016-01-01

Objectives: The aims of the present study are: (1) to analyze the effects of the acoustical environment and the voice style on time dose (Dt_p) and fundamental frequency (mean fo and standard deviation std_fo), while taking into account the effect of short-term vocal fatigue; (2) to predict the self-reported vocal effort from the voice acoustical parameters. Methods: Ten male and ten female subjects were recorded while reading a text in normal and loud styles, in three rooms (anechoic, semi-reverberant and reverberant), with and without acrylic glass panels 0.5 m from the mouth, which increased external auditory feedback. Subjects quantified how much effort was required to speak in each condition on a visual analogue scale after each task. Results: (Aim 1) In the loud style, Dt_p, fo and std_fo increased. The Dt_p was higher in the reverberant room compared to the other two rooms. Both genders tended to increase fo in less reverberant environments, while a more monotonous speech was produced in rooms with greater reverberation. All three voice parameters increased with short-term vocal fatigue. (Aim 2) A model of the vocal effort to acoustic vocal parameters is proposed. The SPL (Sound Pressure Level) contributed to 66% of the variance explained by the model, followed by the fundamental frequency (30%) and the modulation in amplitude (4%). Conclusions: The results provide insight into how voice acoustical parameters can predict vocal effort. In particular, it increased when SPL and fo increased and when the amplitude voice modulation (std_ΔSPL) decreased. PMID:28029555
Treatment outcomes for professional voice users.

PubMed

Wingate, Judith M; Brown, William S; Shrivastav, Rahul; Davenport, Paul; Sapienza, Christine M

2007-07-01

Professional voice users comprise 25% to 35% of the U.S. working population. Their voice problems may interfere with job performance and impact costs for both employers and employees. The purpose of this study was to examine treatment outcomes of two specific rehabilitation programs for a group of professional voice users. Eighteen professional voice users participated in this study; half had complaints of throat pain or vocal fatigue (Dysphonia Group), and half were found to have benign vocal fold lesions (Lesion Group). One group received 5 weeks of expiratory muscle strength training followed by six sessions of traditional voice therapy. Treatment order was reversed for the second group. The study was designed as a repeated measures study with independent variables of treatment order, laryngeal diagnosis (lesion vs non-lesion), gender, and time. Dependent variables included maximum expiratory pressure (MEP), Voice Handicap Index (VHI) score, Vocal Rating Scale (VRS) score, Voice Effort Scale score, phonetogram measures, subglottal pressures, and acoustic and perceptual measures. Results showed significant improvements in MEP, VHI scores, and VRS scores, subglottal pressure for loud intensity, phonetogram area, and dynamic range. No significant difference was found between laryngeal diagnosis groups. A significant difference was not observed for treatment order. It was concluded that the combined treatment was responsible for the improvements observed. The results indicate that a combined modality treatment may be successful in the remediation of vocal problems for professional voice users.
Voice hearing within the context of hearers' social worlds: an interpretative phenomenological analysis.

PubMed

Mawson, Amy; Berry, Katherine; Murray, Craig; Hayward, Mark

2011-09-01

Research has found relational qualities of power and intimacy to exist within hearer-voice interactions. The present study aimed to provide a deeper understanding of the interpersonal context of voice hearing by exploring participants' relationships with their voices and other people in their lives. This research was designed in consultation with service users and employed a qualitative, phenomenological, and idiographic design using semi-structured interviews. Ten participants, recruited via mental health services, and who reported hearing voices in the previous week, completed the interviews. These were transcribed verbatim and analysed using interpretative phenomenological analysis. Five themes resulted from the analysis. Theme 1: 'person and voice' demonstrated that participants' voices often reflected the identity, but not always the quality of social acquaintances. Theme 2: 'voices changing and confirming relationship with the self' explored the impact of voice hearing in producing an inferior sense-of-self in comparison to others. Theme 3: 'a battle for control' centred on issues of control and a dilemma of independence within voice relationships. Theme 4: 'friendships facilitating the ability to cope' and theme 5: 'voices creating distance in social relationships' explored experiences of social relationships within the context of voice hearing, and highlighted the impact of social isolation for voice hearers. The study demonstrated the potential role of qualitative research in developing theories of voice hearing. It extended previous research by highlighting the interface between voices and the social world of the hearer, including reciprocal influences of social relationships on voices and coping. Improving voice hearers' sense-of-self may be a key factor in reducing the distress caused by voices. ©2010 The British Psychological Society.
Effects of Voice Rehabilitation After Radiation Therapy for Laryngeal Cancer: A Randomized Controlled Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tuomi, Lisa, E-mail: lisa.tuomi@vgregion.se; Andréll, Paulin; Finizia, Caterina

Background: Patients treated with radiation therapy for laryngeal cancer often experience voice problems. The aim of this randomized controlled trial was to assess the efficacy of voice rehabilitation for laryngeal cancer patients after having undergone radiation therapy and to investigate whether differences between different tumor localizations with regard to rehabilitation outcomes exist. Methods and Materials: Sixty-nine male patients irradiated for laryngeal cancer participated. Voice recordings and self-assessments of communicative dysfunction were performed 1 and 6 months after radiation therapy. Thirty-three patients were randomized to structured voice rehabilitation with a speech-language pathologist and 36 to a control group. Furthermore, comparisons with 23 healthy control individuals were made. Acoustic analyses were performed for all patients, including the healthy control individuals. The Swedish version of the Self Evaluation of Communication Experiences after Laryngeal Cancer and self-ratings of voice function were used to assess vocal and communicative function. Results: The patients who received vocal rehabilitation experienced improved self-rated vocal function after rehabilitation. Patients with supraglottic tumors who received voice rehabilitation had statistically significant improvements in voice quality and self-rated vocal function, whereas the control group did not. Conclusion: Voice rehabilitation for male patients with laryngeal cancer is efficacious regarding patient-reported outcome measurements. The patients experienced better voice function after rehabilitation. Patients with supraglottic tumors also showed an improvement in terms of acoustic voice outcomes. Rehabilitation with a speech-language pathologist is recommended for laryngeal cancer patients after radiation therapy, particularly for patients with supraglottic tumors.
Tipping point analysis of ocean acoustic noise

NASA Astrophysics Data System (ADS)

Livina, Valerie N.; Brouwer, Albert; Harris, Peter; Wang, Lian; Sotirakopoulos, Kostas; Robinson, Stephen

2018-02-01

We apply tipping point analysis to a large record of ocean acoustic data to identify the main components of the acoustic dynamical system and study possible bifurcations and transitions of the system. The analysis is based on a statistical physics framework with stochastic modelling, where we represent the observed data as a composition of deterministic and stochastic components estimated from the data using time-series techniques. We analyse long-term and seasonal trends, system states and acoustic fluctuations to reconstruct a one-dimensional stochastic equation to approximate the acoustic dynamical system. We apply potential analysis to acoustic fluctuations and detect several changes in the system states in the past 14 years. These are most likely caused by climatic phenomena. We analyse trends in sound pressure level within different frequency bands and hypothesize a possible anthropogenic impact on the acoustic environment. The tipping point analysis framework provides insight into the structure of the acoustic data and helps identify its dynamic phenomena, correctly reproducing the probability distribution and scaling properties (power-law correlations) of the time series.
Clinical voice analysis of Carnatic singers.

PubMed

Arunachalam, Ravikumar; Boominathan, Prakash; Mahalingam, Shenbagavalli

2014-01-01

Carnatic singing is a classical South Indian style of music that involves rigorous training to produce an "open throated" loud, predominantly low-pitched singing, embedded with vocal nuances in higher pitches. Voice problems in singers are not uncommon. The objective was to report the nature of voice problems and apply a routine protocol to assess the voice. Forty-five trained performing singers (females: 36 and males: 9) who reported to a tertiary care hospital with voice problems underwent voice assessment. The study analyzed their problems and the clinical findings. Voice change, difficulty in singing higher pitches, and voice fatigue were major complaints. Most of the singers suffered laryngopharyngeal reflux that coexisted with muscle tension dysphonia and chronic laryngitis. Speaking voices were rated predominantly as "moderate deviation" on GRBAS (Grade, Rough, Breathy, Asthenia, and Strain). Maximum phonation time ranged from 4 to 29 seconds (females: 10.2, standard deviation [SD]: 5.28 and males: 15.7, SD: 5.79). Singing frequency range was reduced (females: 21.3 semitones and males: 23.99 semitones). Dysphonia severity index (DSI) scores ranged from -3.5 to 4.91 (females: 0.075 and males: 0.64). Singing frequency range and DSI did not differ significantly between sexes or across clinical diagnoses. Self-perception using the voice disorder outcome profile revealed an overall severity score of 5.1 (SD: 2.7). Findings are discussed from a clinical intervention perspective. The study highlighted the nature of voice problems (hyperfunctional) and the modifications required in the assessment protocol for Carnatic singers. The need for regular assessments and vocal hygiene education to maintain good vocal health is emphasized. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
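The singing frequency ranges in this record are reported in semitones. As a rough illustration of the unit (not the authors' measurement procedure), the interval between two frequencies is conventionally converted as 12·log2(f_high/f_low). The function name and the example frequencies below are hypothetical and are not taken from the study.

```python
import math


def semitone_range(f_low_hz, f_high_hz):
    """Return the interval between two frequencies in equal-tempered semitones."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Hypothetical example: a singer whose range runs from 220 Hz to 440 Hz
# covers exactly one octave, which is 12 semitones.
print(semitone_range(220.0, 440.0))
```

On this scale, a range of about 21 to 24 semitones, as reported above, corresponds to a little under two octaves.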
Atmospheric effects on voice command intelligibility from acoustic hail and warning devices.

PubMed

Bostron, Jason H; Brungart, Timothy A; Barnard, Andrew R; McDevitt, Timothy E

2011-04-01

Voice command sound pressure levels (SPLs) were recorded at distances up to 1500 m. Received SPLs were related to the meteorological condition during sound propagation and compared with the outdoor sound propagation standard ISO 9613-2. Intelligibility of received signals was calculated using ANSI S3.5. Intelligibility results for the present voice command indicate that meteorological condition imposes little to no effect on intelligibility when the signal-to-noise ratio (SNR) is low (<-9 dB) or high (>0 dB). In these two cases the signal is firmly unintelligible or intelligible, respectively. However, at moderate SNRs, variations in received SPL can cause a fully intelligible voice command to become unintelligible, depending on the meteorological condition along the sound propagation path. These changes in voice command intelligibility often occur on time scales as short as minutes during upward refracting conditions, typically found above ground during the day or upwind of a sound source. Reliably predicting the intelligibility of a voice command in a moderate SNR environment can be challenging due to the inherent variability imposed by sound propagation through the atmosphere.
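The three SNR regimes described in this abstract can be sketched as a simple lookup. This is only an illustration of the reported thresholds (below -9 dB, above 0 dB, and the range in between); the function name and the regime labels are assumptions, and the sketch does not implement the ANSI S3.5 intelligibility calculation itself.

```python
def snr_regime(snr_db):
    """Classify an SNR value into the three regimes reported in the abstract."""
    if snr_db < -9.0:
        # Low SNR: the command is unintelligible regardless of weather.
        return "unintelligible"
    if snr_db > 0.0:
        # High SNR: the command is intelligible regardless of weather.
        return "intelligible"
    # Moderate SNR: intelligibility depends on the meteorological condition.
    return "weather-dependent"


for snr in (-12.0, -4.0, 3.0):
    print(snr, snr_regime(snr))
```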
The acoustic correlates of valence depend on emotion family.

PubMed

Belyk, Michel; Brown, Steven

2014-07-01

The voice expresses a wide range of emotions through modulations of acoustic parameters such as frequency and amplitude. Although the acoustics of individual emotions are well understood, attempts to describe the acoustic correlates of broad emotional categories such as valence have yielded mixed results. In the present study, we analyzed the acoustics of emotional valence for different families of emotion. We divided emotional vocalizations into "motivational," "moral," and "aesthetic" families as defined by the OCC (Ortony, Clore, and Collins) model of emotion. Subjects viewed emotional scenarios and were cued to vocalize congruent exclamations in response to them, for example, "Yay!" and "Damn!". Positive valence was weakly associated with high-pitched and loud vocalizations. However, valence interacted with emotion family for both pitch and amplitude. A general acoustic code for valence does not hold across families of emotion, whereas family-specific codes provide a more accurate description of vocal emotions. These findings are consolidated into a set of "rules of expression" relating vocal dimensions to emotion dimensions. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Medications and Adverse Voice Effects.

PubMed

Nemr, Kátia; Di Carlos Silva, Ariana; Rodrigues, Danilo de Albuquerque; Zenari, Marcia Simões

2017-08-16

To identify the medications used by patients with dysphonia, describe the voice symptoms reported on initial speech-language pathology (SLP) examination, evaluate the possible direct and indirect effects of medications on voice production, and determine the association between direct and indirect adverse voice effects and self-reported voice symptoms, hydration and smoking habits, comorbidities, vocal assessment, and type and degree of dysphonia. This is a retrospective cross-sectional study. Fifty-five patients were evaluated and the vocal signs and symptoms indicated in the Dysphonia Risk Protocol were considered, as well as data on hydration, smoking and medication use. We analyzed the associations between type of side effect and self-reported vocal signs/symptoms, hydration, smoking, comorbidities, type of dysphonia, and auditory-perceptual and acoustic parameters. Sixty percent were women, the mean age was 51.8 years, 29 symptoms were reported on the screening, and 73 active ingredients were identified with 8.2% directly and 91.8% indirectly affecting vocal function. There were associations between the use of drugs with direct adverse voice effects, self-reported symptoms, general degree of vocal deviation, and pitch deviation. The symptoms of dry throat and shortness of breath were associated with the direct vocal side effect of the medicine, as well as the general degree of vocal deviation and the greater pitch deviation. Shortness of breath when speaking was also associated with the greatest degree of vocal deviation. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

The Effect of Traditional Singing Warm-Up Versus Semioccluded Vocal Tract Exercises on the Acoustic Parameters of Singing Voice.

PubMed

Duke, Emily; Plexico, Laura W; Sandage, Mary J; Hoch, Matthew

2015-11-01

This study investigated the effect of traditional vocal warm-up versus semioccluded vocal tract exercises on the acoustic parameters of voice through three questions: Does vocal warm-up condition significantly alter the singing power ratio of the singing voice? Is singing power ratio dependent upon vowel? Is perceived phonatory effort affected by warm-up condition? Hypotheses were that vocal warm-up would alter the singing power ratio, that semioccluded vocal tract warm-up would affect the singing power ratio more than no warm-up or traditional warm-up, that singing power ratio would vary across vowels, and that perceived phonatory effort would vary with warm-up condition. This study was a within-participant repeated measures design with counterbalanced conditions. Thirteen male singers were recorded under three different conditions: no warm-up, traditional warm-up, and semioccluded vocal tract exercise warm-up. Recordings were made of these singers performing the Star Spangled Banner, and singing power ratio (SPR) was calculated from four vowels. Singers rated their perceived phonatory effort (PPE) singing the Star Spangled Banner after each warm-up condition. Warm-up condition did not significantly affect SPR. SPR was significantly different for /i/ and /e/. PPE was not significantly different between warm-up conditions. The present study did not find significant differences in SPR between warm-up conditions. The SPR differences for /i/ support previous findings. PPE did not differ significantly across warm-up conditions despite the expectation that traditional or semioccluded warm-up would cause a decrease. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
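The abstract does not say how SPR was computed. A commonly used definition, assumed here, is the level difference in dB between the strongest spectral peak in the 2-4 kHz band and the strongest peak in the 0-2 kHz band. The sketch below applies that assumed definition to a synthetic two-tone signal using a plain discrete Fourier transform; the function names, band edges, and test signal are illustrative and are not the study's method.

```python
import cmath
import math


def dft_magnitudes(samples, max_bin):
    """Magnitudes of DFT bins 0..max_bin, computed directly (no FFT library)."""
    n = len(samples)
    mags = []
    for k in range(max_bin + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def singing_power_ratio(samples, sample_rate):
    """Assumed SPR: peak level in 2-4 kHz minus peak level in 0-2 kHz, in dB."""
    bin_hz = sample_rate / len(samples)
    split = int(2000 / bin_hz)    # first bin of the 2-4 kHz band
    max_bin = int(4000 / bin_hz)  # last bin of the 2-4 kHz band
    mags = dft_magnitudes(samples, max_bin)
    low_peak = max(mags[1:split])             # skip the DC bin
    high_peak = max(mags[split:max_bin + 1])
    return 20.0 * math.log10(high_peak / low_peak)


# Synthetic signal: a 500 Hz tone plus a 3000 Hz tone at one tenth the amplitude.
fs = 16000
signal = [
    math.sin(2 * math.pi * 500 * t / fs) + 0.1 * math.sin(2 * math.pi * 3000 * t / fs)
    for t in range(800)
]
print(round(singing_power_ratio(signal, fs), 2))
```

Because the upper tone has one tenth the amplitude of the lower one, the ratio for this signal is close to -20 dB by construction. Real recordings would normally need windowing and an FFT routine, which are omitted here for brevity.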
Relationship between Voice Complaints and Subjective and Objective Measures of Vocal Function in Iranian Female Teachers.

PubMed

Faham, Maryam; Jalilevand, Nahid; Torabinezhad, Farhad; Silverman, Erin Pearson; Ahmadi, Akram; Anaraki, Zahra Ghayoumi; Jafari, Narges

2017-07-01

Teachers are at high risk of developing voice problems because of the excessive vocal demands necessitated by their profession. Teachers' self-assessment of vocal complaints, combined with subjective and objective measures of voice, may enable better therapeutic decision-making. This investigation compared audio-perceptual assessment and acoustic variables in teachers with and without voice complaints. Ninety-nine teachers completed this cross-sectional study and were assigned to one of two groups: those "with voice complaint (VC)" and those "without voice complaint (W-VC)." Voice samples were collected during reading, counting, and vowel prolongation tasks. Teachers were also asked to document any voice symptoms they experienced. Voice samples were analyzed using Dr. Speech program (4th version; Tiger Ltd., USA), and labeled "normal" or "abnormal" according to the "grade" dimension "G" from GRBAS scale. Twenty-one teachers were assigned to the VC group based on self-assessment data. There were statistically significant differences between the two groups with regard to self-reported voice symptoms of hoarseness, breathiness, pitch breaks, and vocal fatigue (P < 0.05). Fourteen participants in the VC group and 40 from the W-VC group were determined to demonstrate "abnormal" vocal quality on perceptual assessment. Only harmonic-to-noise ratio was significantly higher for the W-VC group (ES = 0.55). Teachers with and without voice complaints differed in the incidence, but not type of voice symptoms. Teachers' voice complaints did not correspond to perceptual and acoustic measures. This suggests a potential unmet need for teachers to receive further education on voice disorders. Copyright © 2017 The Voice Foundation. All rights reserved.
Cerebral Processing of Voice Gender Studied Using a Continuous Carryover fMRI Design

PubMed Central

Pernet, Cyril; Latinus, Marianne; Crabbe, Frances; Belin, Pascal

2013-01-01

Normal listeners effortlessly determine a person's gender by voice, but the cerebral mechanisms underlying this ability remain unclear. Here, we demonstrate 2 stages of cerebral processing during voice gender categorization. Using voice morphing along with an adaptation-optimized functional magnetic resonance imaging design, we found that secondary auditory cortex including the anterior part of the temporal voice areas in the right hemisphere responded primarily to acoustical distance with the previously heard stimulus. In contrast, a network of bilateral regions involving inferior prefrontal and anterior and posterior cingulate cortex reflected perceived stimulus ambiguity. These findings suggest that voice gender recognition involves neuronal populations along the auditory ventral stream responsible for auditory feature extraction, functioning in pair with the prefrontal cortex in voice gender perception. PMID:22490550
Emergence of linguistic laws in human voice

PubMed Central

Torre, Iván González; Luque, Bartolo; Lacasa, Lucas; Luque, Jordi; Hernández-Fernández, Antoni

2017-01-01

Linguistic laws constitute one of the quantitative cornerstones of modern cognitive sciences and have been routinely investigated in written corpora, or in the equivalent transcription of oral corpora. This means that inferences of statistical patterns of language in acoustics are biased by the arbitrary, language-dependent segmentation of the signal, which virtually precludes the possibility of making comparative studies between human voice and other animal communication systems. Here we bridge this gap by proposing a method that allows such patterns to be measured in acoustic signals of arbitrary origin, without needing access to the underlying language corpus. The method has been applied to sixteen different human languages, successfully recovering some well-known laws of human communication at timescales even below the phoneme and finding yet another link between complexity and criticality in a biological system. These methods further pave the way for new comparative studies in animal communication or the analysis of signals of unknown code. PMID:28272418
Emergence of linguistic laws in human voice

NASA Astrophysics Data System (ADS)

Torre, Iván González; Luque, Bartolo; Lacasa, Lucas; Luque, Jordi; Hernández-Fernández, Antoni

2017-03-01

Linguistic laws constitute one of the quantitative cornerstones of modern cognitive sciences and have been routinely investigated in written corpora, or in the equivalent transcription of oral corpora. This means that inferences of statistical patterns of language in acoustics are biased by the arbitrary, language-dependent segmentation of the signal, which virtually precludes the possibility of making comparative studies between human voice and other animal communication systems. Here we bridge this gap by proposing a method that allows such patterns to be measured in acoustic signals of arbitrary origin, without needing access to the underlying language corpus. The method has been applied to sixteen different human languages, successfully recovering some well-known laws of human communication at timescales even below the phoneme and finding yet another link between complexity and criticality in a biological system. These methods further pave the way for new comparative studies in animal communication or the analysis of signals of unknown code.
The shouted voice: A pilot study of laryngeal physiology under extreme aerodynamic pressure.

PubMed

Lagier, Aude; Legou, Thierry; Galant, Camille; Amy de La Bretèque, Benoit; Meynadier, Yohann; Giovanni, Antoine

2017-12-01

The objective was to study the behavior of the larynx during shouted voice production, when the larynx is exposed to extremely high subglottic pressure. The study involved electroglottographic, acoustic, and aerodynamic analyses of shouts produced at maximum effort by three male participants. Under a normal speaking voice, the voice sound pressure level (SPL) is proportional to the subglottic pressure. However, when the subglottic pressure reached high levels, the voice SPL reached a maximum value and then decreased as subglottic pressure increased further. Furthermore, the electroglottographic signal sometimes lost its periodicity during the shout, suggesting irregular vocal fold vibration.
Nonlinear dynamic mechanism of vocal tremor from voice analysis and model simulations

NASA Astrophysics Data System (ADS)

Zhang, Yu; Jiang, Jack J.

2008-09-01

Nonlinear dynamic analysis and model simulations are used to study the nonlinear dynamic characteristics of vocal folds with vocal tremor, which can typically be characterized by low-frequency modulation and aperiodicity. Tremor voices from patients with disorders such as paresis, Parkinson's disease, hyperfunction, and adductor spasmodic dysphonia show low-dimensional characteristics, differing from random noise. Correlation dimension analysis statistically distinguishes tremor voices from normal voices. Furthermore, a nonlinear tremor model is proposed to study the vibrations of the vocal folds with vocal tremor. Fractal dimensions and positive Lyapunov exponents demonstrate the evidence of chaos in the tremor model, where amplitude and frequency play important roles in governing vocal fold dynamics. Nonlinear dynamic voice analysis and vocal fold modeling may provide a useful set of tools for understanding the dynamic mechanism of vocal tremor in patients with laryngeal diseases.
Human voice perception.

PubMed

Latinus, Marianne; Belin, Pascal

2011-02-22

We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially-relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity with the different voices. You can form a good idea of the different speakers' mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine, sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.
Optimal Duration for Voice Rest After Vocal Fold Surgery: Randomized Controlled Clinical Study.

PubMed

Kaneko, Mami; Shiromoto, Osamu; Fujiu-Kurachi, Masako; Kishimoto, Yo; Tateya, Ichiro; Hirano, Shigeru

2017-01-01

Voice rest is commonly recommended after phonomicrosurgery to prevent worsening of vocal fold injuries. However, the most effective duration of voice rest is unknown. Recently, early vocal stimulation was recommended as a means to improve wound healing. The purpose of this study is to examine the optimal duration of voice rest after phonomicrosurgery. Randomized controlled clinical study. Patients undergoing phonomicrosurgery for leukoplakia, carcinoma in situ, vocal fold polyp, Reinke's edema, and cyst were chosen. Participants were randomly assigned to voice rest for 3 or 7 postoperative days. Voice therapy was administered to both groups after voice rest. Grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, stroboscopic examination, aerodynamic assessment, acoustic analysis, and Voice Handicap Index-10 (VHI-10) were performed pre- and postoperatively at 1, 3, and 6 months. Stroboscopic examination evaluated normalized mucosal wave amplitude (NMWA). Parameters were compared between both groups. Thirty-one patients were analyzed (3-day group, n = 16; 7-day group, n = 15). Jitter, shimmer, and VHI-10 were significantly better in the 3-day group at 1 month post operation. GRBAS was significantly better in the 3-day group at 1 and 3 months post operation, and NMWA was significantly better in the 3-day group at 1, 3, and 6 months post operation compared to the 7-day group. The data suggest that 3 days of voice rest followed by voice therapy may lead to better wound healing of the vocal fold compared to 7 days of voice rest. Appropriate mechanical stimulation during early stages of vocal fold wound healing may lead to favorable functional recovery. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
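Jitter and shimmer, two of the acoustic outcomes compared in this study, are cycle-to-cycle perturbation measures. The abstract does not state which variants were used. The sketch below assumes the common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. It is applied to glottal periods for jitter and to peak amplitudes for shimmer. The function name and the sample values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference of consecutive values, relative to their mean (%)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical cycle-by-cycle measurements from a sustained vowel.
periods_ms = [5.00, 5.02, 4.99, 5.01, 5.00]   # glottal cycle durations
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]   # peak amplitude per cycle

jitter = local_perturbation_percent(periods_ms)   # local jitter, in percent
shimmer = local_perturbation_percent(amplitudes)  # local shimmer, in percent
print(f"jitter {jitter:.2f}%  shimmer {shimmer:.2f}%")
```

Lower values indicate more regular vocal fold vibration, which is the direction in which the 3-day group improved at 1 month.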
Performance of wavelet analysis and neural networks for pathological voices identification

NASA Astrophysics Data System (ADS)

Salhi, Lotfi; Talbi, Mourad; Abid, Sabeur; Cherif, Adnane

2011-09-01

Within the medical environment, diverse techniques exist to assess the state of the voice of the patient. The inspection technique is inconvenient for a number of reasons, such as its high cost, the duration of the inspection, and above all, the fact that it is an invasive technique. This study focuses on a robust, rapid and accurate system for automatic identification of pathological voices. This system employs a non-invasive, inexpensive and fully automated method based on a hybrid approach: wavelet transform analysis and a neural network classifier. First, we present the results obtained in our previous study using classic feature parameters. These results allow visual identification of pathological voices. Second, quantified parameters derived from the wavelet analysis are proposed to characterise the speech sample. In addition, a system of multilayer neural networks (MNNs) has been developed which carries out the automatic detection of pathological voices. The developed method was evaluated using a voice database composed of recorded voice samples (continuous speech) from normophonic or dysphonic speakers. The dysphonic speakers were patients of the National Hospital 'RABTA' of Tunis, Tunisia and a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for discrimination of normal and pathological voices using the proposed parameters and neural network classifier. We also compared the average classification rate based on the MNN, Gaussian mixture model and support vector machines.
Effects of voice training and voice hygiene education on acoustic and perceptual speech parameters and self-reported vocal well-being in female teachers.

PubMed

Ilomaki, Irma; Laukkanen, Anne-Maria; Leppanen, Kirsti; Vilkman, Erkki

2008-01-01

Voice education programs may help in optimizing teachers' voice use. This study compared effects of voice training (VT) and voice hygiene lecture (VHL) in 60 randomly assigned female teachers. All 60 attended the lecture, and 30 completed a short training course in addition. Text reading was recorded in working environments and analyzed for fundamental frequency (F0), equivalent sound level (Leq), alpha ratio, jitter, shimmer, and perceptual quality. Self-reports of vocal well-being were registered. In the VHL group, increased F0 and difficulty of phonation and in the VT group decreased perturbation, increased alpha ratio, easier phonation, and improved perceptual and self-reported voice quality were found. Both groups equally self-reported increase of voice care knowledge. Results seem to indicate improved vocal well-being after training.
Effects of chemoradiotherapy on voice and swallowing

PubMed Central

Lazarus, Cathy L.

2009-01-01

Purpose of review: Chemotherapy has been found to result in comparable survival rates to surgery for head and neck cancer. However, toxicity can often be worse after chemoradiotherapy, with impairment in voice, swallowing, nutrition, and quality of life. Investigators are attempting to modify radiotherapy treatment regimens to spare organs that have an impact on swallowing. This review will highlight voice and swallowing impairment seen after chemoradiotherapy, as well as treatment for voice and swallowing disorders in this population. Results of newer radiotherapy regimens will also be highlighted. Recent findings: Specific oropharyngeal swallowing motility disorders after chemoradiotherapy have been identified. Damage to specific structures has been correlated with specific pharyngeal phase swallow impairment. Swallowing function and quality of life have been examined over time, with improvement seen in both. Preventive/prophylactic swallow exercise programs have been encouraging. Chemoradiotherapy effects on voice have been identified in terms of acoustic, aerodynamic, and patient and clinician-rated perception of function. Improvement in voice has also been observed over time after chemoradiotherapy. Voice therapy has been found to have a positive impact on voice and perceptual measures in this population. Summary: Current studies show some improvement in swallow function after swallow and voice therapy in patients treated with chemoradiotherapy. Further, there is a suggestion of improved swallow function with sparing of organs with specific radiotherapy protocols. Future research needs to focus on specific voice and swallow treatment regimens in the head and neck cancer patient treated with chemoradiotherapy, specifically, timing, frequency, duration, and specific treatment types. PMID:19337126
Vibro-acoustic analysis of composite plates

NASA Astrophysics Data System (ADS)

Sarigül, A. S.; Karagözlü, E.

2014-03-01

Vibro-acoustic analysis plays a vital role in the design of aircraft, spacecraft, land vehicles and ships produced from thin plates backed by closed cavities, with regard to human health and living comfort. Structures of this type require a coupled solution that takes structural-acoustic interaction into account, which is crucial for sensitive solutions. In this study, coupled vibro-acoustic analyses of plates produced from composite materials have been performed by using finite element analysis software. The study has been carried out for E-glass/Epoxy, Kevlar/Epoxy and Carbon/Epoxy plates with different ply angles and numbers of plies. The effects of composite material, ply orientation and number of layers on the coupled vibro-acoustic characteristics of plates have been analysed for various combinations. The analysis results have been statistically examined and assessed.
ATC/pilot voice communications: A survey of the literature

NASA Astrophysics Data System (ADS)

Prinzo, O. Veronika; Britton, Thomas W.

1993-11-01

The first radio-equipped control tower in the United States opened at the Cleveland Municipal Airport in 1930. From that time to the present, voice radio communications have played a primary role in air safety. Verbal communications in air traffic control (ATC) operations have been frequently cited as causal factors in operational errors and pilot deviations in the FAA Operational Error and Deviation System, the NASA Aviation Safety Reporting System (ASRS), and reports derived from government sponsored research projects. Collectively, the data provided by these programs indicate that communications constitute a significant problem for pilots and controllers. Although the communications problem was well known the research literature was fragmented, making it difficult to appreciate the various types of verbal communications problems that existed and their unique influence on the quality of ATC/pilot communications. This is a survey of the voice radio communications literature. The 43 reports in the review represent survey data, field studies, laboratory studies, narrative reports, and reviews. The survey topics pertain to communications taxonomies, acoustical correlates and cognitive/psycholinguistic perspectives. Communications taxonomies were used to identify the frequency and types of information that constitute routine communications, as well as those communications involved in operational errors, pilot deviations, and other safety-related events. Acoustical correlate methodologies identified some qualities of a speaker's voice, such as loudness, pitch, and speech rate, which might be used potentially to monitor stress, mental workload, and other forms of psychological or physiological factors that affect performance. Cognitive/psycho-linguistic research offered an information processing perspective for understanding how pilots' and controllers' memory and language comprehension processes affect their ability to communicate effectively with one another. This
Acoustic and Perceptual Effects of Left–Right Laryngeal Asymmetries Based on Computational Modeling

PubMed Central

Samlan, Robin A.; Story, Brad H.; Lotto, Andrew J.; Bunton, Kate

2015-01-01

Purpose Computational modeling was used to examine the consequences of 5 different laryngeal asymmetries on acoustic and perceptual measures of vocal function. Method A kinematic vocal fold model was used to impose 5 laryngeal asymmetries: adduction, edge bulging, nodal point ratio, amplitude of vibration, and starting phase. Thirty /a/ and /I/ vowels were generated for each asymmetry and analyzed acoustically using cepstral peak prominence (CPP), harmonics-to-noise ratio (HNR), and 3 measures of spectral slope (H1*-H2*, B0-B1, and B0-B2). Twenty listeners rated voice quality for a subset of the productions. Results Increasingly asymmetric adduction, bulging, and nodal point ratio explained significant variance in perceptual rating (R2 = .05, p < .001). The same factors resulted in generally decreasing CPP, HNR, and B0-B2 and in increasing B0-B1. Of the acoustic measures, only CPP explained significant variance in perceived quality (R2 = .14, p < .001). Increasingly asymmetric amplitude of vibration or starting phase minimally altered vocal function or voice quality. Conclusion Asymmetries of adduction, bulging, and nodal point ratio drove acoustic measures and perception in the current study, whereas asymmetric amplitude of vibration and starting phase demonstrated minimal influence on the acoustic signal or voice quality. PMID:24845730
Acoustic correlate of vocal effort in spasmodic dysphonia.

PubMed

Eadie, Tanya L; Stepp, Cara E

2013-03-01

This study characterized the relationship between relative fundamental frequency (RFF) and listeners' perceptions of vocal effort and overall spasmodic dysphonia severity in the voices of 19 individuals with adductor spasmodic dysphonia. Twenty inexperienced listeners evaluated the vocal effort and overall severity of voices using visual analog scales. The squared correlation coefficients (R2) between average vocal effort and overall severity and RFF measures were calculated as a function of the number of acoustic instances used for the RFF estimate (from 1 to 9, of a total of 9 voiced-voiceless-voiced instances). Increases in the number of acoustic instances used for the RFF average led to increases in the variance predicted by the RFF at the first cycle of voicing onset (onset RFF) in the perceptual measures; the use of 6 or more instances resulted in a stable estimate. The variance predicted by the onset RFF for vocal effort (R2 range, 0.06 to 0.43) was higher than that for overall severity (R2 range, 0.06 to 0.35). The offset RFF was not related to the perceptual measures, irrespective of the sample size. This study indicates that onset RFF measures are related to perceived vocal effort in patients with adductor spasmodic dysphonia. These results have implications for measuring outcomes in this population.
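The dependence of R2 on the number of acoustic instances averaged can be illustrated with a minimal sketch. The function names and all numbers below are made up for illustration and are not the study's data or code; the sketch only assumes that each speaker's RFF estimate is the mean of the first k instances and that R2 is the squared Pearson correlation with the mean listener rating.

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def r2_by_instance_count(rff_instances, ratings):
    """For k = 1..K, average the first k RFF instances per speaker and
    return the R^2 between those averages and the listener ratings."""
    k_max = len(rff_instances[0])
    return [
        r_squared([mean(row[:k]) for row in rff_instances], ratings)
        for k in range(1, k_max + 1)
    ]

# Hypothetical onset RFF values (semitones): 4 speakers x 3 instances each.
rff = [
    [-1.0, -0.5, -0.9],
    [-2.0, -2.6, -2.2],
    [-0.2, -0.8, -0.1],
    [-3.1, -2.5, -3.0],
]
effort = [40.0, 62.0, 25.0, 80.0]  # hypothetical mean visual-analog ratings

curve = r2_by_instance_count(rff, effort)  # one R^2 value per k
```

Plotting `curve` against k would show where the estimate levels off, which is the kind of stability check the abstract describes for 6 or more instances.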
The effect of choir formation on the acoustical attributes of the singing voice

NASA Astrophysics Data System (ADS)

Atkinson, Debra Sue

Research shows that many things can influence choral tone and choral blend. Some of these are vowel uniformity, vibrato, choral formation, strategic placement of singers, and spacing between singers. This study sought to determine the effect that changes in choral formation and spacing between singers would have on four randomly selected voices of an ensemble as revealed through long-term average spectra (LTAS) of the individual singers. All members of the ensemble were given the opportunity to express their preferences for each of the choral formations and the four randomly selected choristers were asked specific questions regarding the differences between choral singing and solo singing. The results indicated that experienced singers preferred singing in a mixed-spread choral formation. However, the graphs of the choral excerpts as compared to the solo recordings revealed that the choral graphs for the soprano and bass were very similar to the graphs of their solos, but the graphs of the tenor and the alto were different from their solo graphs. It is obvious from the results of this study that the four selected singers did sing with slightly different techniques in the choral formations than they did while singing their solos. The members of this ensemble were accustomed to singing in many different formations. Therefore, it was easy for them to consciously think about how they sang in each of the four formations (mixed-close, mixed-spread, sectional-close, and sectional-spread) and answer the questionnaire accordingly. This would not be as easy for a group that never changed choral formations. Therefore, the results of this study cannot be generalized to choirs who only sing in sectional formation. As researchers learn more about choral acoustics and the effects of choral singing on the voice, choral conductors will be able to make better decisions about the methods used to achieve their desired choral blend. It is up to the choral conductors to glean the
Effects of the Interaction of Caffeine and Water on Voice Performance: A Pilot Study

ERIC Educational Resources Information Center

Franca, Maria Claudia; Simpson, Kenneth O.

2013-01-01

The objective of this "pilot" investigation was to study the effects of the interaction of caffeine and water intake on voice as evidenced by acoustic and aerodynamic measures, to determine whether ingestion of 200 mg of caffeine and various levels of water intake have an impact on voice. The participants were 48 females ranging in age…
Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

PubMed

Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

2018-05-01

Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
Vocal Performance of Group Fitness Instructors Before and After Instruction: Changes in Acoustic Measures and Self-Ratings.

PubMed

Dallaston, Katherine; Rumbach, Anna F

2016-01-01

(1) To quantify acute changes in acoustic parameters of the voices of group fitness instructors (GFIs) before and after exercise instruction. (2) To determine whether these changes are discernible perceptually by the instructor. This is a pilot prospective cohort study. Participants were six female GFIs, based in Brisbane, Australia. Participants performed a series of vocal tasks before and after instruction of a 60-minute exercise class. Data were obtained pertaining to fundamental frequency (pitch), intensity (volume), jitter, shimmer, harmonic-to-noise ratio (HNR), maximum duration of sustained phonation (MDSP), and pitch range. Additionally, self-ratings of voice quality were obtained before and after instruction. Data were analyzed using the Wilcoxon signed rank test. Significant increases (P ≤ 0.05) were found in fundamental frequency and intensity after instruction. No significant changes in jitter, shimmer, HNR, or MDSP were found before and after instruction. For the group, no significant change in self-ratings of voice quality occurred before and after instruction. Statistically significant changes in pitch and volume were found on acoustic analysis. However, these subtle changes remained within the limits of what is considered normal and representative of the participant's age and gender. Further research into the effects of exercise instruction on the voice is needed. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
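For a sample as small as six paired observations, the Wilcoxon signed rank test can be computed exactly by enumerating every sign pattern. The sketch below assumes a two-sided exact test; the fundamental frequency values are invented for illustration and are not the study's data, and the function names are hypothetical.

```python
from itertools import product

def signed_ranks(before, after):
    """Rank the non-zero paired differences by absolute size (ties share
    the average rank) and return (rank, sign) pairs."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        j = i
        while (j + 1 < len(ordered)
               and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]])):
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg
        i = j + 1
    return [(r, 1 if d > 0 else -1) for r, d in zip(ranks, diffs)]

def wilcoxon_exact(before, after):
    """Two-sided exact p-value obtained by enumerating all 2^n sign patterns."""
    pairs = signed_ranks(before, after)
    ranks = [r for r, _ in pairs]
    total = sum(ranks)
    w_plus = sum(r for r, s in pairs if s > 0)
    w_obs = min(w_plus, total - w_plus)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= w_obs:
            hits += 1
    return w_obs, hits / 2 ** len(ranks)

# Hypothetical mean F0 (Hz) for six instructors before and after a class.
f0_before = [200, 210, 195, 220, 205, 215]
f0_after = [208, 214, 201, 231, 207, 225]

w, p = wilcoxon_exact(f0_before, f0_after)
```

With six untied, non-zero differences, the smallest attainable two-sided exact p-value is 2/64 (about 0.031), so a result at P ≤ 0.05 requires all six participants to shift in the same direction.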

Nebulized isotonic saline improves voice production in Sjögren's syndrome.

PubMed

Tanner, Kristine; Nissen, Shawn L; Merrill, Ray M; Miner, Alison; Channell, Ron W; Miller, Karla L; Elstad, Mark; Kendall, Katherine A; Roy, Nelson

2015-10-01

This study examined the effects of a topical vocal fold hydration treatment on voice production over time. Prospective, longitudinal, within-subjects A (baseline), B (treatment), A (withdrawal/reversal), B (treatment) experimental design. Eight individuals with primary Sjögren's syndrome (SS), an autoimmune disease causing laryngeal dryness, completed an 8-week A-B-A-B experiment. Participants performed twice-daily audio recordings of connected speech and sustained vowels and then rated vocal effort, mouth dryness, and throat dryness. Two-week treatment phases introduced twice-daily 9-mL doses of nebulized isotonic saline (0.9% Na(+)Cl(-)). Voice handicap and patient-based measures of SS disease severity were collected before and after each 2-week phase. Connected speech and sustained vowels were analyzed using the Cepstral Spectral Index of Dysphonia (CSID). Acoustic and patient-based ratings during each baseline and treatment phase were analyzed and compared. Baseline CSID and patient-based ratings were in the mild-to-moderate range. CSID measures of voice severity improved by approximately 20% with nebulized saline treatment and worsened during treatment withdrawal. Posttreatment CSID values fell within the normal-to-mild range. Similar patterns were observed in patient-based ratings of vocal effort and dryness. CSID values and patient-based ratings correlated significantly (P < .05). Nebulized isotonic saline improves voice production based on acoustic and patient-based ratings of voice severity. Future work should optimize topical vocal fold hydration treatment formulations, dose, and delivery methodologies for various patient populations. This study lays the groundwork for future topical vocal fold hydration treatment development to manage and possibly prevent dehydration-related voice disorders. 2b. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
Pulse analysis of acoustic emission signals

NASA Technical Reports Server (NTRS)

Houghton, J. R.; Packman, P. F.

1977-01-01

A method for the signature analysis of pulses in the frequency domain and the time domain is presented. Fourier spectrum, Fourier transfer function, shock spectrum and shock spectrum ratio were examined in the frequency domain analysis and pulse shape deconvolution was developed for use in the time domain analysis. Comparisons of the relative performance of each analysis technique are made for the characterization of acoustic emission pulses recorded by a measuring system. To demonstrate the relative sensitivity of each of the methods to small changes in the pulse shape, signatures of computer modeled systems with analytical pulses are presented. Optimization techniques are developed and used to indicate the best design parameter values for deconvolution of the pulse shape. Several experiments are presented that test the pulse signature analysis methods on different acoustic emission sources. These include acoustic emission associated with (a) crack propagation, (b) ball dropping on a plate, (c) spark discharge, and (d) defective and good ball bearings. Deconvolution of the first few micro-seconds of the pulse train is shown to be the region in which the significant signatures of the acoustic emission event are to be found.
Speaker's comfort in teaching environments: voice problems in Swedish teaching staff.

PubMed

Åhlander, Viveka Lyberg; Rydell, Roland; Löfqvist, Anders

2011-07-01

The primary objective of this study was to examine how a group of Swedish teachers rate aspects of their working environment that can be presumed to have an impact on vocal behavior and voice problems. The secondary objective was to explore the prevalence of voice problems in Swedish teachers. Questionnaires were distributed to the teachers of 23 randomized schools. Teaching staff at all levels were included, except preschool teachers and teachers at specialized, vocational high schools. The response rate was 73%. The results showed that 13% of the whole group reported voice problems occurring sometimes, often, or always. The teachers reporting voice problems were compared with those without problems. There were significant differences between the groups for several items. The teachers with voice problems rated items on room acoustics and work environment as more noticeable. This group also reported voice symptoms, such as hoarseness, throat clearing, and voice change, to a significantly higher degree, even though teachers in both groups reported some voice symptoms. Absence from work because of voice problems was also significantly more common in the group with voice problems: 35% versus 9% in the group without problems. We may conclude that teachers suffering from voice problems react more strongly to loading factors in the teaching environment, report more frequent symptoms of voice discomfort, and are more often absent from work because of voice problems than their voice-healthy colleagues. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Do Women's Voices Provide Cues of the Likelihood of Ovulation? The Importance of Sampling Regime

PubMed Central

Fischer, Julia; Semple, Stuart; Fickenscher, Gisela; Jürgens, Rebecca; Kruse, Eberhard; Heistermann, Michael; Amir, Ofer

2011-01-01

The human voice provides a rich source of information about individual attributes such as body size, developmental stability and emotional state. Moreover, there is evidence that female voice characteristics change across the menstrual cycle. A previous study reported that women speak with higher fundamental frequency (F0) in the high-fertility compared to the low-fertility phase. To gain further insights into the mechanisms underlying this variation in perceived attractiveness and the relationship between vocal quality and the timing of ovulation, we combined hormone measurements and acoustic analyses, to characterize voice changes on a day-to-day basis throughout the menstrual cycle. Voice characteristics were measured from free speech as well as sustained vowels. In addition, we asked men to rate vocal attractiveness from selected samples. The free speech samples revealed marginally significant variation in F0 with an increase prior to and a distinct drop during ovulation. Overall variation throughout the cycle, however, precluded unequivocal identification of the period with the highest conception risk. The analysis of vowel samples revealed a significant increase in degree of unvoiceness and noise-to-harmonic ratio during menstruation, possibly related to an increase in tissue water content. Neither estrogen nor progestogen levels predicted the observed changes in acoustic characteristics. The perceptual experiments revealed a preference by males for voice samples recorded during the pre-ovulatory period compared to other periods in the cycle. While overall we confirm earlier findings in that women speak with a higher and more variable fundamental frequency just prior to ovulation, the present study highlights the importance of taking the full range of variation into account before drawing conclusions about the value of these cues for the detection of ovulation. PMID:21957453
The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

PubMed

Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

2016-11-01

This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group, known as in-group advantage, results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Reliability in perceptual analysis of voice quality.

PubMed

Bele, Irene Velsvik

2005-12-01

This study focuses on speaking voice quality in male teachers (n = 35) and male actors (n = 36), who represent untrained and trained voice users, because we wanted to investigate normal and supranormal voices. In this study, both substantial and methodologic aspects were considered. It includes a method for perceptual voice evaluation, and a basic issue was rater reliability. A listening group of 10 listeners, 7 experienced speech-language therapists, and 3 speech-language therapist students evaluated the voices by 15 vocal characteristics using VA scales. Two sets of voice signals were investigated: text reading (2 loudness levels) and sustained vowel (3 levels). The results indicated a high interrater reliability for most perceptual characteristics. Connected speech was evaluated more reliably, especially at the normal level, but both types of voice signals were evaluated reliably, although the reliability for connected speech was somewhat higher than for vowels. Experienced listeners tended to be more consistent in their ratings than did the student raters. Some vocal characteristics achieved acceptable reliability even with a smaller panel of listeners. The perceptual characteristics grouped in 4 factors reflected perceptual dimensions.
Obligatory and facultative brain regions for voice-identity recognition

PubMed Central

Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

2018-01-01

Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal
High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Voice Call Analysis

DTIC Science & Technology

2015-09-01

Gateway; 4. Voice Packet Flow: SIP, Session Description Protocol (SDP), and RTP; 5. Voice Data Analysis; 6. Call Analysis; 7. Call Metrics ... analysis processing is designed for a general VoIP system architecture based on Session Initiation Protocol (SIP) for negotiating call sessions and ... employs Skinny Client Control Protocol for network communication between the phone and the local CallManager (e.g., for each dialed digit), SIP
Case-study magnetic resonance imaging and acoustic investigation of the effects of vocal warm-up on two voice professionals.

PubMed

Laukkanen, Anne-Maria; Horáček, Jaromir; Havlík, Radan

2012-07-01

Vocal warm-up (WU)-related changes were studied in one male musical singer and one female speech trainer. They sustained vowels before and after WU in a magnetic resonance imaging (MRI) device. Acoustic recordings were made in a studio. The vocal tract area increased after WU, a formant cluster appeared between 2 and 4.5 kHz, and SPL increased. Evidence of larynx lowering was only found for the male. The pharyngeal inlet over the epilaryngeal outlet ratio (A(ph)/A(e)) increased by 10%-28%, being 3-4 for the male and 5-7 for the female. The results seem to represent different voice training traditions. A singer's formant cluster may be achievable without a high A(ph)/A(e) (≥ 6), but limitations of the 2D method should be taken into account.
Objective and subjective evaluation of the acoustic comfort in classrooms.

PubMed

Zannin, Paulo Henrique Trombetta; Marcon, Carolina Reich

2007-09-01

The acoustic comfort of classrooms in a Brazilian public school has been evaluated through interviews with 62 teachers and 464 pupils, measurements of background noise, reverberation time, and sound insulation. Acoustic measurements have revealed the poor acoustic quality of the classrooms. Results have shown that teachers and pupils consider the noise generated and the voice of the teacher in neighboring classrooms as the main sources of annoyance inside the classroom. Acoustic simulations resulted in the suggestion of placement of perforated plywood on the ceiling, for reduction in reverberation time and increase in the acoustic comfort of the classrooms.
Overall voice and strain level analysis in rock singers.

PubMed

Gonsalves, Aline; Amin, Elisabeth; Behlau, Mara

2010-01-01

To analyze the voice of rock singers according to two specific parameters, overall level of vocal deviation (OLVD) and strain level (SL), and to compare these parameters in three different music samples. Participants were 26 male rock singers, ranging in age from 17 to 46 years (mean = 29.8 years). All of the participants answered a questionnaire for sample characterization and recorded three voice samples: the Brazilian National Anthem (BNA), Satisfaction, and a self-selected repertoire song (RS). Voice samples were analyzed by five speech-language pathologists according to OLVD and SL. Statistical analysis was done using the software SPSS, version 13.0. Statistically significant differences were observed for the mean values of OLVD and SL during the performance of Satisfaction (OLVD = 32.8 and SL = 0.024 / p=0.024) and during the RS performance (OLVD = 38.4 and SL = 55.8 / p=0.010). The values of OLVD and SL are directly related in the BNA* and RS** samples, i.e. the higher the strain, the higher the OLVD (p<0.001*; p=0.010**). When the three song samples are analyzed individually, the OLVD does not vary significantly among them. However, the mean values show a trend to increase from non-rock to rock performances (24.0 BNA / 32.8 Satisfaction / 38.4 RS). The level of strain found during the BNA performance differs significantly from that of the rock performances (Satisfaction and RS, p=0.008 and p=0.001). The obtained data suggest that rock style is related to greater use of vocal strain and that this strain does not necessarily give a negative impression of the voice, but corresponds to a common interpretative factor related to this style of music.
Robust analysis method for acoustic properties of biological specimens measured by acoustic microscopy

NASA Astrophysics Data System (ADS)

Arakawa, Mototaka; Mori, Shohei; Kanai, Hiroshi; Nagaoka, Ryo; Horie, Miki; Kobayashi, Kazuto; Saijo, Yoshifumi

2018-07-01

We proposed a robust analysis method for the acoustic properties of biological specimens measured by acoustic microscopy. Reflected pulse signals from the substrate and specimen were converted into the frequency domain to obtain sound speed and thickness. To obtain the average acoustic properties of the specimen, parabolic approximation was performed to determine the frequency at which the amplitude of the normalized spectrum became maximum or minimum, considering the sound speed and thickness of the specimens and the operating frequency of the ultrasonic device used. The proposed method was demonstrated for a specimen of malignant melanoma of the skin by using an acoustic microscope fitted with a concave transducer with a center frequency of 80 MHz. The variations in sound speed and thickness analyzed by the proposed method were markedly smaller than those analyzed by the method based on an autoregressive model. The proposed method is useful for the analysis of the acoustic properties of biological tissues or cells.
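The abstract does not spell out how the parabolic approximation is carried out. One common way to locate a spectral extremum between sample points is a three-point parabolic fit, sketched below as an assumption rather than as the authors' procedure. The function name and the test spectrum are hypothetical; the 60 to 100 MHz grid is chosen only because the abstract mentions an 80 MHz center frequency.

```python
def parabolic_extremum(freqs, amps, find_max=True):
    """Refine the extremum of a uniformly sampled spectrum by fitting a
    parabola through the extreme sample and its two neighbours."""
    pick = max if find_max else min
    # Search interior samples only, so that both neighbours exist.
    i = pick(range(1, len(amps) - 1), key=lambda k: amps[k])
    left, mid, right = amps[i - 1], amps[i], amps[i + 1]
    curvature = left - 2.0 * mid + right
    if curvature == 0.0:
        return freqs[i], mid  # locally flat: keep the sampled point
    offset = 0.5 * (left - right) / curvature  # vertex position in bins
    step = freqs[i + 1] - freqs[i]
    extremum_amp = mid - 0.25 * (left - right) * offset
    return freqs[i] + offset * step, extremum_amp

# Hypothetical normalized spectrum sampled every 5 MHz, peaking at 81.3 MHz.
freqs_mhz = [60.0 + 5.0 * n for n in range(9)]
amps = [1.0 - ((f - 81.3) / 40.0) ** 2 for f in freqs_mhz]

f_peak, a_peak = parabolic_extremum(freqs_mhz, amps)
```

Because the test spectrum is itself a parabola, the fit recovers the 81.3 MHz peak up to rounding; on measured spectra the result is only an approximation whose quality depends on how parabolic the spectrum is near the extremum.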
[A study of the phenomenon of voice intonation: analysis, usage and diagnosis].

PubMed

Kazanecka, E; Pawłowski, Z; Zółtowski, M

1997-01-01

The aim of this work was to study the average rise time (RT) and average flow rate (MRT) in utterance. Data were collected from 48 singers and 44 patients. The group of patients included cases of modulus vocale, polypus laryngis, paresis bilateralis, hemiparesis, and CA laryngis. Various characteristics of utterance were recorded synchronously: the frequency and intensity of the fundamental laryngeal tone were measured with a laryngophone, a microphone was used to monitor acoustic radiation from the mouth, and a pneumotrachometer was applied for the measurement of flow rate. The data were stored and analysed with the use of a computer. Results show that the analysis carried out in the study describes the distinctive characteristics of normal and pathologic utterance. The main findings are as follows: a) rise time (RT) decreases with increasing loudness and pitch of the sound and is also shorter in staccato than in legato sounds; b) during the initial transient of staccato sounds, the average flow rate in the glottis increases with intensity and pitch of the sound; c) pre-phonation time (TPP) and air volume do not differentiate normal and pathologic utterance; d) in cases of voice pathology, the analysis of utterance described in this study can be used for the evaluation of therapy and rehabilitation.
The stability of locus equation slopes across stop consonant voicing/aspiration

NASA Astrophysics Data System (ADS)

Sussman, Harvey M.; Modarresi, Golnaz

2004-05-01

The consistency of locus equation slopes as phonetic descriptors of stop place in CV sequences across voiced and voiceless aspirated stops was explored in the speech of five male speakers of American English and two male speakers of Persian. Using traditional locus equation measurement sites for F2 onsets, voiceless labial and coronal stops had significantly lower locus equation slopes relative to their voiced counterparts, whereas velars failed to show voicing differences. When locus equations were derived using F2 onsets for voiced stops that were measured closer to the stop release burst, comparable to the protocol for measuring voiceless aspirated stops, no significant effects of voicing/aspiration on locus equation slopes were observed. This methodological factor, rather than an underlying phonetic-based explanation, provides a reasonable account for the observed flatter locus equation slopes of voiceless labial and coronal stops relative to voiced cognates reported in previous studies [Molis et al., J. Acoust. Soc. Am. 95, 2925 (1994); O. Engstrand and B. Lindblom, PHONUM 4, 101-104]. [Work supported by NIH.]
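A locus equation is a linear regression of F2 at vowel onset on F2 at the vowel midpoint, and its slope is the quantity compared across stops in the abstract above. The following is a minimal sketch of that fit using ordinary least squares; the function name `fit_locus_equation` is hypothetical, and any formant values passed to it would have to come from the reader's own measurements.

```python
def fit_locus_equation(f2_vowel, f2_onset):
    """Least-squares fit of F2onset = slope * F2vowel + intercept.

    f2_vowel: F2 at the vowel midpoint for each CV token (Hz)
    f2_onset: F2 at vowel onset for the same tokens (Hz)
    Returns (slope, intercept).
    """
    n = len(f2_vowel)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 tokens")
    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    if sxx == 0.0:
        raise ValueError("F2 vowel values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Moving the F2 onset measurement point closer to or further from the release burst changes the `f2_onset` inputs, which is how a measurement protocol alone can flatten or steepen the fitted slope.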
Involvement of the left insula in the ecological validity of the human voice

PubMed Central

Tamura, Yuri; Kuriki, Shinji; Nakano, Tamami

2015-01-01

A subtle difference between a real human and an artificial object that resembles a human evokes an impression of a large qualitative difference between them. This suggests the existence of a neural mechanism that processes the sense of humanness. To examine the presence of such a mechanism, we compared the behavioral and brain responses of participants who listened to human and artificial singing voices created from vocal fragments of a real human voice. The behavioral experiment showed that the song sung by human voices more often elicited positive feelings and feelings of humanness than the same song sung by artificial voices, although the lyrics, melody, and rhythm were identical. Functional magnetic resonance imaging revealed significantly higher activation in the left posterior insula in response to human voices than in response to artificial voices. Insular activation was not merely evoked by differences in acoustic features between the voices. Therefore, these results suggest that the left insula participates in the neural processing of the ecological quality of the human voice. PMID:25739519
Self, Voices and Embodiment: A Phenomenological Analysis

PubMed Central

Rosen, C; Jones, N; Chase, KA; Grossman, LS; Gin, H; Sharma, RP

2016-01-01

Objective: The primary aim of this study was to examine first-person phenomenological descriptions of the relationship between the self and Auditory Verbal Hallucinations (AVHs). Complex AVHs are frequently described as entities with clear interpersonal characteristics. Strikingly, investigations of first-person (subjective) descriptions of the phenomenology of the relationship are virtually absent from the literature. Method: Twenty participants with psychosis and actively experiencing AVHs were recruited from the University of Illinois at Chicago. A mixed-methods design involving qualitative and quantitative components was utilized. Following a priority-sequence model of complementarity, quantitative analyses were used to test elements of emergent qualitative themes. Results: The qualitative analysis identified three foundational constructs in the relationship between self and voices: ‘understanding of origin,’ ‘distinct interpersonal identities,’ and ‘locus of control.’ Quantitative analyses further supported identified links of these constructs. Subjects who experienced their AVHs as having identities distinct from self and actively engaged with their AVHs experienced a greater sense of autonomy and control over AVHs. Discussion: Given the clinical importance of AVHs and emerging strategies targeting the relationship between the hearer and voices, our findings highlight the importance of these relational constructs in improvement and innovation of clinical interventions. Our analyses also underscore the value of detailed voice assessments, such as those provided by the Maastricht Interview, in the evaluation process. Subjects' narratives show that the relational phenomena between hearer and AVH(s) are dynamic, and can be influenced and changed through the hearers’ engagement, conversation, and negotiation with their voices. PMID:27099869
Telehealth: voice therapy using telecommunications technology.

PubMed

Mashima, Pauline A; Birkmire-Peters, Deborah P; Syms, Mark J; Holtel, Michael R; Burgess, Lawrence P A; Peters, Leslie J

2003-11-01

Telehealth offers the potential to meet the needs of underserved populations in remote regions. The purpose of this study was a proof-of-concept to determine whether voice therapy can be delivered effectively remotely. Treatment outcomes were evaluated for a vocal rehabilitation protocol delivered under 2 conditions: with the patient and clinician interacting within the same room (conventional group) and with the patient and clinician in separate rooms, interacting in real time via a hard-wired video camera and monitor (video teleconference group). Seventy-two patients with voice disorders served as participants. Based on evaluation by otolaryngologists, 31 participants were diagnosed with vocal nodules, 29 were diagnosed with edema, 9 were diagnosed with unilateral vocal fold paralysis, and 3 presented with vocal hyperfunction with no laryngeal pathology. Fifty-one participants (71%) completed the vocal rehabilitation protocol. Outcome measures included perceptual judgments of voice quality, acoustic analyses of voice, patient satisfaction ratings, and fiber-optic laryngoscopy. There were no differences in outcome measures between the conventional group and the remote video teleconference group. Participants in both groups showed positive changes on all outcome measures after completing the vocal rehabilitation protocol. Reasons for participants discontinuing therapy prematurely provided support for the telehealth model of service delivery.
Voice disorders in teachers: occupational risk factors and psycho-emotional factors.

PubMed

van Houtte, Evelyne; Claeys, Sofie; Wuyts, Floris; van Lierde, Kristiane

2012-10-01

Teaching is a high-risk occupation for developing voice disorders. The purpose of this study was to investigate previously described vocal risk factors as well as to identify new risk factors related to both the personal life of the teacher (fluid intake, voice-demanding activities, family history of voice disorders, and children at home) and to environmental factors (temperature changes, chalk use, presence of curtains, carpet, or air-conditioning, acoustics in the classroom, and noise in and outside the classroom). The study group comprised 994 teachers (response rate 46.6%). All participants completed a questionnaire. Chi-square tests and logistic regression analyses were performed. A total of 51.2% (509/994) of the teachers presented with voice disorders. Women reported more voice disorders compared to men (56.4% versus 40.4%, P < 0.001). Vocal risk factors were a family history of voice disorders (P = 0.005), temperature changes in the classroom (P = 0.017), the number of pupils per classroom (P = 0.001), and noise level inside the classroom (P = 0.001). Teachers with voice disorders presented a higher level of psychological distress (P < 0.001) compared to teachers without voice problems. Voice disorders are frequent among teachers, especially in female teachers. The results of this study emphasize that multiple factors are involved in the development of voice disorders.
System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

DOEpatents

Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

2002-01-01

Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

Occupational voice demands and their impact on the call-centre industry.

PubMed

Hazlett, D E; Duffy, O M; Moorhead, S A

2009-04-20

Within the last decade there has been a growth in the call-centre industry in the UK, with a growing awareness of the voice as an important tool for successful communication. Occupational voice problems such as occupational dysphonia, in a business which relies on healthy, effective voice as the primary professional communication tool, may threaten working ability and occupational health and safety of workers. While previous studies of telephone call-agents have reported a range of voice symptoms and functional vocal health problems, there have been no studies investigating the use and impact of vocal performance in the communication industry within the UK. This study aims to address a significant gap in the evidence-base of occupational health and safety research. The objectives of the study are: 1. to investigate the work context and vocal communication demands for call-agents; 2. to evaluate call-agents' vocal health, awareness and performance; and 3. to identify key risks and training needs for employees and employers within call-centres. This is an occupational epidemiological study, which plans to recruit call-centres throughout the UK and Ireland. Data collection will consist of three components: 1. interviews with managers from each participating call-centre to assess their communication and training needs; 2. an online biopsychosocial questionnaire will be administered to investigate the work environment and vocal demands of call-agents; and 3. voice acoustic measurements of a random sample of participants using the Multi-dimensional Voice Program (MDVP). Qualitative content analysis from the interviews will identify underlying themes and issues. A multivariate analysis approach will be adopted using Structural Equation Modelling (SEM), to develop voice measurement models in determining the construct validity of potential factors contributing to occupational dysphonia. Quantitative data will be analysed using SPSS version 15. Ethical approval is granted
Voice Habits and Behaviors: Voice Care Among Flamenco Singers.

PubMed

Garzón García, Marina; Muñoz López, Juana; Y Mendoza Lara, Elvira

2017-03-01

The purpose of this study is to analyze the vocal behavior of flamenco singers, as compared with classical music singers, to establish a differential vocal profile of voice habits and behaviors in flamenco music. Bibliographic review was conducted, and the Singer's Vocal Habits Questionnaire, an experimental tool designed by the authors to gather data regarding hygiene behavior, drinking and smoking habits, type of practice, voice care, and symptomatology perceived in both the singing and the speaking voice, was administered. We interviewed 94 singers, divided into two groups: the flamenco experimental group (FEG, n = 48) and the classical control group (CCG, n = 46). Frequency analysis, a Likert scale, and discriminant and exploratory factor analysis were used to obtain a differential profile for each group. The FEG scored higher than the CCG in speaking voice symptomatology. The FEG scored significantly higher than the CCG in use of "inadequate vocal technique" when singing. Regarding voice habits, the FEG scored higher in "lack of practice and warm-up" and "environmental habits." A total of 92.6% of the subjects classified themselves correctly in each group. The Singer's Vocal Habits Questionnaire has proven effective in differentiating flamenco and classical singers. Flamenco singers are exposed to numerous vocal risk factors that make them more prone to vocal fatigue, mucosa dehydration, phonotrauma, and muscle stiffness than classical singers. Further research is needed in voice training in flamenco music, as a means to strengthen the voice and enable it to meet the requirements of this musical genre. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Perceptual Adaptation of Voice Gender Discrimination with Spectrally Shifted Vowels

PubMed Central

Li, Tianhao; Fu, Qian-Jie

2013-01-01

Purpose: To determine whether perceptual adaptation improves voice gender discrimination of spectrally shifted vowels and, if so, which acoustic cues contribute to the improvement. Method: Voice gender discrimination was measured for 10 normal-hearing subjects, during 5 days of adaptation to spectrally shifted vowels, produced by processing the speech of 5 male and 5 female talkers with 16-channel sine-wave vocoders. The subjects were randomly divided into 2 groups; one subjected to 50-Hz, and the other to 200-Hz, temporal envelope cutoff frequencies. No preview or feedback was provided. Results: There was significant adaptation in voice gender discrimination with the 200-Hz cutoff frequency, but significant improvement was observed only for 3 female talkers with F0 > 180 Hz and 3 male talkers with F0 < 170 Hz. There was no significant adaptation with the 50-Hz cutoff frequency. Conclusions: Temporal envelope cues are important for voice gender discrimination under spectral shift conditions with perceptual adaptation, but spectral shift may limit the exclusive use of spectral information and/or the use of formant structure on voice gender discrimination. The results have implications for cochlear implant users and for understanding voice gender discrimination. PMID:21173392
Perceptual adaptation of voice gender discrimination with spectrally shifted vowels.

PubMed

Li, Tianhao; Fu, Qian-Jie

2011-08-01

To determine whether perceptual adaptation improves voice gender discrimination of spectrally shifted vowels and, if so, which acoustic cues contribute to the improvement. Voice gender discrimination was measured for 10 normal-hearing subjects, during 5 days of adaptation to spectrally shifted vowels, produced by processing the speech of 5 male and 5 female talkers with 16-channel sine-wave vocoders. The subjects were randomly divided into 2 groups; one subjected to 50-Hz, and the other to 200-Hz, temporal envelope cutoff frequencies. No preview or feedback was provided. There was significant adaptation in voice gender discrimination with the 200-Hz cutoff frequency, but significant improvement was observed only for 3 female talkers with F(0) > 180 Hz and 3 male talkers with F(0) < 170 Hz. There was no significant adaptation with the 50-Hz cutoff frequency. Temporal envelope cues are important for voice gender discrimination under spectral shift conditions with perceptual adaptation, but spectral shift may limit the exclusive use of spectral information and/or the use of formant structure on voice gender discrimination. The results have implications for cochlear implant users and for understanding voice gender discrimination.
Giving Voice to Emotion: Voice Analysis Technology Uncovering Mental States is Playing a Growing Role in Medicine, Business, and Law Enforcement.

PubMed

Allen, Summer

2016-01-01

It's tough to imagine anything more frustrating than interacting with a call center. Generally, people don't reach out to call centers when they're happy; they're usually trying to get help with a problem or gearing up to do battle over a billing error. Add in an automatic phone tree, and you have a recipe for annoyance. But what if that robotic voice offering you a smorgasbord of numbered choices could tell that you were frustrated and then funnel you to an actual human being? This type of voice analysis technology exists, and it's just one example of the many ways that computers can use your voice to extract information about your mental and emotional state, including information you may not think of as being accessible through your voice alone.
Lexical frequency and voice assimilation in complex words in Dutch

NASA Astrophysics Data System (ADS)

Ernestus, Mirjam; Lahey, Mybeth; Verhees, Femke; Baayen, Harald

2004-05-01

Words with higher token frequencies tend to have more reduced acoustic realizations than lower frequency words (e.g., Hay, 2000; Bybee, 2001; Jurafsky et al., 2001). This study documents frequency effects for regressive voice assimilation (obstruents are voiced before voiced plosives) in Dutch morphologically complex words in the subcorpus of read-aloud novels in the corpus of spoken Dutch (Oostdijk et al., 2002). As expected, the initial obstruent of the cluster tends to be absent more often as lexical frequency increases. More importantly, as frequency increases, the duration of vocal-fold vibration in the cluster decreases, and the duration of the bursts in the cluster increases, after partialing out cluster duration. This suggests that there is less voicing for higher-frequency words. In fact, phonetic transcriptions show regressive voice assimilation for only half of the words and progressive voice assimilation for one third. Interestingly, the progressive voice assimilation observed for higher-frequency complex words renders these complex words more similar to monomorphemic words: Dutch monomorphemic words typically contain voiceless obstruent clusters (Zonneveld, 1983). Such high-frequency complex words may therefore be less easily parsed into their constituent morphemes (cf. Hay, 2000), favoring whole word lexical access (Bertram et al., 2000).
Effects of subglottal and supraglottal acoustic loading on voice production

NASA Astrophysics Data System (ADS)

Zhang, Zhaoyan; Mongeau, Luc; Frankel, Steven

2002-05-01

Speech production involves sound generation by confined jets through an orifice (the glottis) with a time-varying area. Predictive models are usually based on the quasi-steady assumption. This assumption allows the complex unsteady flows to be treated as steady flows, which are more effectively modeled computationally. Because of the reflective properties of the human lungs, trachea and vocal tract, subglottal and supraglottal resonance and other acoustic effects occur in speech, which might affect glottal impedance, especially in the regime of unsteady flow separation. Changes in the flow structure, or flow regurgitation due to a transient negative transglottal pressure, could also occur. These phenomena may affect the quasi-steady behavior of speech production. To investigate the possible effects of the subglottal and supraglottal acoustic loadings, a dynamic mechanical model of the larynx was designed and built. The subglottal and supraglottal acoustic loadings are simulated using an expansion in the tube upstream of the glottis and a finite length tube downstream, respectively. The acoustic pressures of waves radiated upstream and downstream of the orifice were measured and compared to those predicted using a model based on the quasi-steady assumption. A good agreement between the experimental data and the predictions was obtained for different operating frequencies, flow rates, and orifice shapes. This supports the validity of the quasi-steady assumption for various subglottal and supraglottal acoustic loadings.
A study of classroom acoustics and school teachers' noise exposure, voice load and speaking time during teaching, and the effects on vocal and mental fatigue development.

PubMed

Kristiansen, Jesper; Lund, Søren Peter; Persson, Roger; Shibuya, Hitomi; Nielsen, Per Møberg; Scholz, Matthias

2014-11-01

The study investigated the noise exposure in a group of Danish school teachers. The aims were to investigate if noise posed a risk of impairment of hearing and to study the association between classroom acoustical conditions, noise exposure, vocal symptoms, and cognitive fatigue. Background noise levels, vocal load and speaking time were measured on 35 teachers during actual classroom teaching. The classrooms were characterized acoustically by measurements of reverberation time. Before and after the workday, the teachers answered a questionnaire on fatigue symptoms and carried out two cognitive test tasks sensitive to mental fatigue. The average noise level during the lessons was 72 dB(A), but during indoor sports activities the average noise level increased 6.6 dB(A). Room reverberation time (range 0.39-0.83 s) had no significant effect on the noise level. The teachers were talking with a raised voice in 61% of the time, and the vocal load increased 0.65 dB(A) per dB(A) increase in the average lesson noise level. An increase in voice symptoms during the workday correlated significantly with individual average noise exposure, and a decrease in performance in the two-back test correlated significantly with individual average vocal load. Noise exposure in general classrooms posed no risk of noise-induced hearing impairment in school teachers. However, the results provide evidence for an association between noise exposure and vocal load and development of vocal symptoms and cognitive fatigue after work.
[The acoustic changes of the voice in the singing boys during the pre-mutation period].

PubMed

Chernobel'sky, S I

2016-01-01

The present study was based on the assumption that the determination of the fundamental frequency (Fo) of the speech by means of computer-assisted acoustic analysis makes it possible to detect the onset of vocal mutation in the singing boys. A total of 30 singing boys were available for the examination. They were allocated to two groups. Group 1 was comprised of 15 boys at the age between 11 years 10 months and 12 years 4 months. Group 2 consisted of 15 boys aged between 12 years 10 months and 13 years 2 months. All the participants of the study underwent an acoustic test in combination with indirect laryngoscopy. It was shown that fundamental frequency of the speech in the boys of group 2 was significantly lower than in group 1. The difference amounted to two half-tones and could be regarded as the onset of vocal pre-mutation. It is concluded that the acoustic analysis of the speech should be employed to determine the time of vocal pre-mutation in the singing boys. The singing teachers can use this method all by themselves.
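The "two half-tones" difference reported above is an interval on the equal-tempered scale, where the number of semitones between two frequencies is 12 times the base-2 logarithm of their ratio. A minimal sketch of that conversion follows; the function name `semitone_difference` is hypothetical, and no frequency values from the study are assumed, since the abstract does not give them.

```python
import math


def semitone_difference(f_high, f_low):
    """Interval between two fundamental frequencies in semitones (half-tones).

    Both arguments are in Hz and must be positive. A positive result means
    f_high lies above f_low on the equal-tempered scale.
    """
    if f_high <= 0 or f_low <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high / f_low)
```

Because the scale is logarithmic, the same two-semitone drop corresponds to a frequency ratio of about 1.122 regardless of the starting pitch.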
Nonlinear dynamic-based analysis of severe dysphonia in patients with vocal fold scar and sulcus vocalis

PubMed Central

Choi, Seong Hee; Zhang, Yu; Jiang, Jack J.; Bless, Diane M.; Welham, Nathan V.

2011-01-01

Objective: The primary goal of this study was to evaluate a nonlinear dynamic approach to the acoustic analysis of dysphonia associated with vocal fold scar and sulcus vocalis. Study Design: Case-control study. Methods: Acoustic voice samples from scar/sulcus patients and age/sex-matched controls were analyzed using correlation dimension (D2) and phase plots, time-domain based perturbation indices (jitter, shimmer, signal-to-noise ratio [SNR]), and an auditory-perceptual rating scheme. Signal typing was performed to identify samples with bifurcations and aperiodicity. Results: Type 2 and 3 acoustic signals were highly represented in the scar/sulcus patient group. When data were analyzed irrespective of signal type, all perceptual and acoustic indices successfully distinguished scar/sulcus patients from controls. Removal of type 2 and 3 signals eliminated the previously identified differences between experimental groups for all acoustic indices except D2. The strongest perceptual-acoustic correlation in our dataset was observed for SNR; the weakest correlation was observed for D2. Conclusions: These findings suggest that D2 is inferior to time-domain based perturbation measures for the analysis of dysphonia associated with scar/sulcus; however, time-domain based algorithms are inherently susceptible to inflation under highly aperiodic (i.e., type 2 and 3) signal conditions. Auditory-perceptual analysis, unhindered by signal aperiodicity, is therefore a robust strategy for distinguishing scar/sulcus patient voices from normal voices. Future acoustic analysis research in this area should consider alternative (e.g., frequency- and quefrency-domain based) measures alongside additional nonlinear approaches. PMID:22516315
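The time-domain perturbation indices named above can be illustrated with the common "local" definitions of jitter and shimmer: the mean absolute difference between consecutive cycles, divided by the mean value, expressed as a percentage. This is a minimal sketch under that assumption; the abstract does not say which jitter or shimmer variant the authors used, and the function name `local_perturbation_percent` is hypothetical. The sketch also assumes that glottal cycles have already been extracted, which is exactly the step that becomes unreliable for aperiodic (type 2 and 3) signals.

```python
def local_perturbation_percent(values):
    """Cycle-to-cycle perturbation as a percentage of the mean.

    Pass cycle periods (e.g. in ms) to obtain local jitter, or cycle peak
    amplitudes to obtain local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0:
        raise ValueError("mean of the values must be non-zero")
    # Mean absolute difference between each cycle and the one before it.
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    return 100.0 * mean_diff / mean_value
```

A perfectly periodic signal yields 0%, and the value grows as consecutive cycles differ more, which is consistent with the inflation of these measures under aperiodic conditions noted in the conclusions.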
Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

PubMed

Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

2016-10-01

Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
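The inter-rater and human-machine agreement figures above are correlation coefficients (r). As a minimal sketch of how such a value is obtained from paired scores, the following computes Pearson's r in plain Python. It assumes Pearson correlation, which the abstract does not state explicitly, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score sequences,
    e.g. averaged perceptual ratings and automatic predictions."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```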
Examining the Impact of Video Modeling Techniques on the Efficacy of Clinical Voice Assessment.

PubMed

Werner, Cara; Bowyer, Samantha; Weinrich, Barbara; Gottliebson, Renee; Brehm, Susan Baker

2017-01-01

The purpose of the current study was to determine whether or not presenting patients with a video model improves efficacy of the assessment as defined by efficiency and decreased variability in trials during the acoustic component of voice evaluations. Twenty pediatric participants with a mean age of 7.6 years (SD = 1.50; range = 6-11 years), 32 college-age participants with a mean age of 21.32 years (SD = 1.61; range = 18-30 years), and 17 adult participants with a mean age of 54.29 years (SD = 2.78; range = 50-70 years) were included in the study and divided into experimental and control groups. The experimental group viewed a training video prior to receiving verbal instructions and performing acoustic assessment tasks, whereas the control group received verbal instruction only prior to completing the acoustic assessment. Primary measures included the number of clinician cues required and instructional time. Standard deviations of acoustic measurements (eg, minimum and maximum frequency) were also examined to determine effects on stability. Individuals in the experimental group required significantly fewer cues, P = 0.012, compared to the control group. Although some trends were observed in instructional time and stability of measurements, no significant differences were observed. The findings of this study may be useful for speech-language pathologists in regard to improving assessment of patients' voice disorders with the use of video modeling. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Obligatory and facultative brain regions for voice-identity recognition.

PubMed

Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

2018-01-01

Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is
Variation in stop consonant voicing in two regional varieties of American English

PubMed Central

Jacewicz, Ewa; Fox, Robert Allen; Lyle, Samantha

2010-01-01

This study is an acoustic investigation of the nature and extent of consonant voicing of the stop /b/ in two dialectal varieties of American English spoken in south-central Wisconsin and western North Carolina. The stop /b/ occurred at the juncture of two words such as small bids, in a position between two voiced sonorants, i.e. the liquid /l/ and a vowel. Twenty women participated, ten representing the Wisconsin and ten the North Carolina variety, respectively. Significant dialectal differences were found in the voicing patterns. The Wisconsin stop closures were usually not fully voiced and terminated in a complete silence followed by a closure release whereas North Carolina speakers produced mostly fully voiced closures. Further dialectal differences included the proportion of closure voicing as a function of word emphasis. For Wisconsin speakers, the proportion of closure voicing was smallest when the word was emphasized and it was greatest in non-emphatic positions. For North Carolina speakers, the degree of word emphasis did not have an effect on the proportion of closure voicing. The results suggest different mechanisms by which closure voicing is maintained in these two dialects, pointing to active articulatory maneuvers in North Carolina speakers and passive in Wisconsin speakers. PMID:20198112
Acoustics of contrastive prosody in children

NASA Astrophysics Data System (ADS)

Patel, Rupal; Piel, Jordan; Grigos, Maria

2005-04-01

Empirical data on the acoustics of prosodic control in children is limited, particularly for linguistically contrastive tasks. Twelve children aged 4, 7, and 11 years were asked to produce two utterances "Show Bob a bot" (voiced consonants) and "Show Pop a pot" (voiceless consonants) 10 times each with emphasis placed on the second word (Bob/Pop) and 10 times with emphasis placed on the last word (bot/pot). A total of 40 utterances were analyzed per child. The following acoustic measures were obtained for each word within each utterance: average fundamental frequency (f0), peak f0, average intensity, peak intensity, and duration. Preliminary results suggest that 4 year olds are unable to modulate prosodic cues to signal the linguistic contrast. The 7 year olds, however, not only signaled the appropriate stress location, but did so with the most contrastive differences in f0, intensity, and duration, of all age groups. Prosodic differences between stressed and unstressed words were more pronounced for the utterance with voiced consonants. These findings suggest that the acoustics of linguistic prosody begin to differentiate between age 4 and 7 and may be highly influenced by changes in physiological control and flexibility that may also affect segmental features.
Signal analysis of the female singing voice: Features for perceptual singer identity

NASA Astrophysics Data System (ADS)

Mellody, Maureen

2001-07-01

Individual singing voices tend to be easy for a listener to identify, particularly when compared to the difficulty of identifying the performer of any other musical instrument. What cues does a listener use to identify a particular singing voice? This work seeks to identify a set of features with which one can synthesize notes with the vocal quality of a particular singer. Such analysis and synthesis influences computer music (in the creation of synthetic sounds with different timbre), vocal pedagogy (as a training tool to help singers understand properties of their own voice as well as different professional-quality voices), and vocal health (to identify improper behavior in vocal production). The problem of singer identification is approached in three phases: signal analysis, the development of low-order representations, and perceptual evaluation. To perform the signal analysis, a high-resolution time-frequency distribution is applied to vowel tokens from sopranos and mezzo-sopranos. From these results, low-order representations are created for each singer's notes, which are used to synthesize sounds with the timbral quality of that singer. Finally, these synthesized sounds, along with original recordings, are evaluated by trained listeners in a variety of perceptual experiments to determine the extent to which the vocal quality of the desired singer is captured. Results from the signal analysis show that amplitude and frequency estimates extracted from the time-frequency signal analysis can be used to re-create each signal with little degradation in quality and no loss of perceptual identity. Low-order representations derived from the signal analysis are used in clustering and classification, which successfully clusters signals with corresponding singer identity. Finally, perceptual results indicate that trained listeners are, surprisingly, only modestly successful at correctly identifying the singer of a recording, and find the task to be particularly
Detecting Abnormal Word Utterances in Children With Autism Spectrum Disorders: Machine-Learning-Based Voice Analysis Versus Speech Therapists.

PubMed

Nakai, Yasushi; Takiguchi, Tetsuya; Matsui, Gakuyo; Yamaoka, Noriko; Takada, Satoshi

2017-10-01

Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality of clinician judgments based on single-word utterances, and the objectivity machine-learning-based voice analysis adds to judging abnormal prosody.
Voice characteristics before versus after mandibular setback surgery in patients with mandibular prognathism using nonlinear dynamics and conventional acoustic analyses.

PubMed

Mishima, Katsuaki; Moritani, Norifumi; Nakano, Hiroyuki; Matsushita, Asuka; Iida, Seiji; Ueyama, Yoshiya

2013-12-01

The purpose of this study was to explore the voice characteristics of patients with mandibular prognathism, and to investigate the effects of mandibular setback surgery on these characteristics using nonlinear dynamics and conventional acoustic analyses. Sixteen patients (8 males and 8 females) who had skeletal 3, class III malocclusion without cleft palate, and who underwent a bilateral sagittal split ramus osteotomy (BSSRO), were enrolled. As controls, 50 healthy adults (25 males and 25 females) were enrolled. The mean first Lyapunov exponents (mLE1) computed for each one-second interval, and the fundamental frequency (F0) and frequencies of the first and second formant (F1, F2) were calculated for each Japanese vowel. The mLE1s for /u/ in males, and /o/ in females and the F2s for /i/ and /u/ in males, changed significantly after BSSRO. Class III voice characteristics were observed in the mLE1s for /i/ in both males and females, in the F0 for /a/, /i/, /u/ and /o/ in females, and in the F1 and F2 for /a/ in males, and the F1 for /u/ and the F2 for /i/ in females. Most of these characteristics were preserved after BSSRO. Copyright © 2013 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.
Improving Accuracy in Detecting Acoustic Onsets

ERIC Educational Resources Information Center

Duyck, Wouter; Anseel, Frederik; Szmalec, Arnaud; Mestdagh, Pascal; Tavernier, Antoine; Hartsuiker, Robert J.

2008-01-01

In current cognitive psychology, naming latencies are commonly measured by electronic voice keys that detect when sound exceeds a certain amplitude threshold. However, recent research (e.g., K. Rastle & M. H. Davis, 2002) has shown that these devices are particularly inaccurate in precisely detecting acoustic onsets. In this article, the authors…
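The amplitude-threshold voice key described above can be illustrated with a minimal sketch. This is not the authors' improved method (the abstract is truncated before describing it); the function name `first_onset`, the synthetic signal, and the two threshold values are assumptions chosen only to show why a fixed threshold tends to report an onset later than the true one when a sound ramps up gradually.

```python
import math

def first_onset(samples, sample_rate, threshold):
    """Return the time (s) of the first sample whose absolute amplitude
    exceeds `threshold`, or None if no sample does."""
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            return index / sample_rate
    return None

# Hypothetical test signal: 100 ms of silence, then a 200 Hz tone whose
# amplitude ramps linearly from 0 to 1 over the first 50 ms.
SAMPLE_RATE = 8000
RAMP_SAMPLES = 400
silence = [0.0] * 800
tone = [
    min(1.0, n / RAMP_SAMPLES) * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
    for n in range(1600)
]
signal = silence + tone

true_onset = len(silence) / SAMPLE_RATE          # 0.1 s by construction
low = first_onset(signal, SAMPLE_RATE, 0.05)     # sensitive threshold
high = first_onset(signal, SAMPLE_RATE, 0.5)     # conservative threshold
print(true_onset, low, high)
```

In this sketch both detected onsets fall after the true onset, and the higher threshold reports a later time than the lower one, which is the kind of bias a threshold-based device can introduce into naming latencies.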
Matching Speaking to Singing Voices and the Influence of Content.

PubMed

Peynircioğlu, Zehra F; Rabinovitz, Brian E; Repice, Juliana

2017-03-01

We tested whether speaking voices of unfamiliar people could be matched to their singing voices, and, if so, whether the content of the utterances would influence this matching performance. Our hypothesis was that enough acoustic features would remain the same between speaking and singing voices such that their identification as belonging to the same or different individuals would be possible even upon a single hearing. We also hypothesized that the contents of the utterances would influence this identification process such that voices uttering words would be easier to match than those uttering vowels. We used a within-participant design with blocked stimuli that were counterbalanced using a Latin square design. In one block, mode (speaking vs singing) was manipulated while content was held constant; in another block, content (word vs syllable) was manipulated while mode was held constant, and in the control block, both mode and content were held constant. Participants indicated whether the voices in any given pair of utterances belonged to the same person or to different people. Cross-mode matching was above chance level, although mode-congruent performance was better. Further, only speaking voices were easier to match when uttering words. We can identify speaking and singing voices as the same or different even on just a single hearing. However, content interacts with mode such that words benefit matching of speaking voices but not of singing voices. Results are discussed within an attentional framework. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
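As an illustration of Latin square counterbalancing of block order, the sketch below builds a cyclic Latin square over three blocks. The abstract does not say which Latin square was used, so the cyclic construction and the block labels are assumptions made for the example.

```python
def cyclic_latin_square(conditions):
    """Build an n x n Latin square by cyclically shifting the condition list.

    Each row is one presentation order; every condition appears exactly once
    in each row and exactly once in each ordinal position (column).
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

# Hypothetical labels for the three blocks described in the abstract.
blocks = ["mode-varied", "content-varied", "control"]
orders = cyclic_latin_square(blocks)
for participant_group, order in enumerate(orders, start=1):
    print(participant_group, order)
```

Assigning participants to the rows in rotation means each block occupies each serial position equally often across the sample.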

Functional Voice Testing Detects Early Changes in Vocal Pitch in Women During Testosterone Administration

PubMed Central

Pencina, Karol M.; Coady, Jeffry A.; Beleva, Yusnie M.; Bhasin, Shalender; Basaria, Shehzad

2015-01-01

Objective: To determine dose-dependent effects of T administration on voice changes in women with low T levels. Methods: Seventy-one women who had undergone a hysterectomy with or without oophorectomy with total T < 31 ng/dL and/or free T < 3.5 pg/mL received a standardized transdermal estradiol regimen during the 12-week run-in period and were then randomized to receive weekly intramuscular injections of placebo or 3, 6.25, 12.5, or 25 mg T enanthate for 24 weeks. Total and free T levels were measured by liquid chromatography-tandem mass spectrometry and equilibrium dialysis, respectively. Voice handicap was measured by self-report using a validated voice handicap index questionnaire at baseline and 24 weeks after intervention. Functional voice testing was performed using the Kay Elemetrics-Computer Speech Lab to determine voice frequency, volume, and harmonics. Results: Forty-six women with evaluable voice data at baseline and after intervention were included in the analysis. The five groups were similar at baseline. Mean on-treatment nadir total T concentrations were 13, 83, 106, 122, and 250 ng/dL in the placebo, 3-, 6.25-, 12.5-, and 25-mg groups, respectively. Analyses of acoustic voice parameters revealed significant lowering of average pitch in the 12.5- and 25-mg dose groups compared to placebo (P < .05); these changes in pitch were significantly related to increases in T concentrations. No significant dose- or concentration-dependent changes in self-reported voice handicap index scores were observed. Conclusion: Testosterone administration in women with low T levels over 24 weeks was associated with dose- and concentration-dependent decreases in average pitch in the higher dose groups. These changes were seen despite the lack of self-reported changes in voice. PMID:25875779
Functional Voice Testing Detects Early Changes in Vocal Pitch in Women During Testosterone Administration.

PubMed

Huang, Grace; Pencina, Karol M; Coady, Jeffry A; Beleva, Yusnie M; Bhasin, Shalender; Basaria, Shehzad

2015-06-01

To determine dose-dependent effects of T administration on voice changes in women with low T levels. Seventy-one women who had undergone a hysterectomy with or without oophorectomy with total T < 31 ng/dL and/or free T < 3.5 pg/mL received a standardized transdermal estradiol regimen during the 12-week run-in period and were then randomized to receive weekly intramuscular injections of placebo or 3, 6.25, 12.5, or 25 mg T enanthate for 24 weeks. Total and free T levels were measured by liquid chromatography-tandem mass spectrometry and equilibrium dialysis, respectively. Voice handicap was measured by self-report using a validated voice handicap index questionnaire at baseline and 24 weeks after intervention. Functional voice testing was performed using the Kay Elemetrics-Computer Speech Lab to determine voice frequency, volume, and harmonics. Forty-six women with evaluable voice data at baseline and after intervention were included in the analysis. The five groups were similar at baseline. Mean on-treatment nadir total T concentrations were 13, 83, 106, 122, and 250 ng/dL in the placebo, 3-, 6.25-, 12.5-, and 25-mg groups, respectively. Analyses of acoustic voice parameters revealed significant lowering of average pitch in the 12.5- and 25-mg dose groups compared to placebo (P < .05); these changes in pitch were significantly related to increases in T concentrations. No significant dose- or concentration-dependent changes in self-reported voice handicap index scores were observed. Testosterone administration in women with low T levels over 24 weeks was associated with dose- and concentration-dependent decreases in average pitch in the higher dose groups. These changes were seen despite the lack of self-reported changes in voice.
On noninvasive assessment of acoustic fields acting on the fetus

NASA Astrophysics Data System (ADS)

Antonets, V. A.; Kazakov, V. V.

2014-05-01

The aim of this study is to verify a noninvasive technique for assessing the characteristics of acoustic fields in the audible range arising in the uterus under the action of maternal voice, external sounds, and vibrations. This problem is very important in view of actively developed methods for delivery of external sounds to the uterus: music, maternal voice recordings, sounds from outside the mother's body, etc., that supposedly support development of the fetus at the prenatal stage psychologically and cognitively. However, the parameters of acoustic signals have been neither measured nor normalized, which may be dangerous for the fetus and hinder actual assessment of their impact on fetal development. The authors show that at frequencies below 1 kHz, acoustic pressure in the uterus may be measured noninvasively using a hydrophone placed in a soft capsule filled with liquid. It was found that the acoustic field at frequencies up to 1 kHz arising in the uterus under the action of an external sound field has amplitude-frequency parameters close to those of the external field; i.e., the external field penetrates the uterus with hardly any difficulty.
Age-Related Changes to Spectral Voice Characteristics Affect Judgments of Prosodic, Segmental, and Talker Attributes for Child and Adult Speech

ERIC Educational Resources Information Center

Dilley, Laura C.; Wieland, Elizabeth A.; Gamache, Jessica L.; McAuley, J. Devin; Redford, Melissa A.

2013-01-01

Purpose: As children mature, changes in voice spectral characteristics co-vary with changes in speech, language, and behavior. In this study, spectral characteristics were manipulated to alter the perceived ages of talkers' voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were…
Temporal signatures of processing voiceness and emotion in sound

PubMed Central

Gunter, Thomas C.

2017-01-01

This study explored the temporal course of vocal and emotional sound processing. Participants detected rare repetitions in a stimulus stream comprising neutral and surprised non-verbal exclamations and spectrally rotated control sounds. Spectral rotation preserved some acoustic and emotional properties of the vocal originals. Event-related potentials elicited to unrepeated sounds revealed effects of voiceness and emotion. Relative to non-vocal sounds, vocal sounds elicited a larger centro-parietally distributed N1. This effect was followed by greater positivity to vocal relative to non-vocal sounds beginning with the P2 and extending throughout the recording epoch (N4, late positive potential) with larger amplitudes in female than in male listeners. Emotion effects overlapped with the voiceness effects but were smaller and differed topographically. Voiceness and emotion interacted only for the late positive potential, which was greater for vocal-emotional as compared with all other sounds. Taken together, these results point to a multi-stage process in which voiceness and emotionality are represented independently before being integrated in a manner that biases responses to stimuli with socio-emotional relevance. PMID:28338796
Voice deviation, dysphonia risk screening and quality of life in individuals with various laryngeal diagnoses

PubMed Central

Nemr, Katia; Cota, Ariane; Tsuji, Domingos; Simões-Zenari, Marcia

2018-01-01

OBJECTIVES: To characterize the voice quality of individuals with dysphonia and to investigate possible correlations between the degree of voice deviation (D) and scores on the Dysphonia Risk Screening Protocol-General (DRSP), the Voice-Related Quality of Life (V-RQOL) measure and the Voice Handicap Index, short version (VHI-10). METHODS: The sample included 200 individuals with dysphonia. Following laryngoscopy, the participants completed the DRSP, the V-RQOL measure, and the VHI-10; subsequently, voice samples were recorded for auditory-perceptual and acoustic analyses. The correlation between the score for each questionnaire and the overall degree of vocal deviation was analyzed, as was the correlation among the scores for the three questionnaires. RESULTS: Most of the participants (62%) were female, and the mean age of the sample was 49 years. The most common laryngeal diagnosis was organic dysphonia (79.5%). The mean D was 59.54, and the predominance of roughness had a mean of 54.74. All the participants exhibited at least one abnormal acoustic aspect. The mean questionnaire scores were DRSP, 44.7; V-RQOL, 57.1; and VHI-10, 16. An inverse correlation was found between the V-RQOL score and D; however, a positive correlation was found between both the VHI-10 and DRSP scores and D. CONCLUSION: A predominance of adult women, organic dysphonia, moderate voice deviation, high dysphonia risk, and low to moderate quality of life impact characterized our sample. There were correlations between the scores of each of the three questionnaires and the degree of voice deviation. It should be noted that the DRSP monitored the degree of dysphonia severity, which reinforces its applicability for patients with different laryngeal diagnoses. PMID:29538494
Voice deviation, dysphonia risk screening and quality of life in individuals with various laryngeal diagnoses.

PubMed

Nemr, Katia; Cota, Ariane; Tsuji, Domingos; Simões-Zenari, Marcia

2018-03-12

To characterize the voice quality of individuals with dysphonia and to investigate possible correlations between the degree of voice deviation (D) and scores on the Dysphonia Risk Screening Protocol-General (DRSP), the Voice-Related Quality of Life (V-RQOL) measure and the Voice Handicap Index, short version (VHI-10). The sample included 200 individuals with dysphonia. Following laryngoscopy, the participants completed the DRSP, the V-RQOL measure, and the VHI-10; subsequently, voice samples were recorded for auditory-perceptual and acoustic analyses. The correlation between the score for each questionnaire and the overall degree of vocal deviation was analyzed, as was the correlation among the scores for the three questionnaires. Most of the participants (62%) were female, and the mean age of the sample was 49 years. The most common laryngeal diagnosis was organic dysphonia (79.5%). The mean D was 59.54, and the predominance of roughness had a mean of 54.74. All the participants exhibited at least one abnormal acoustic aspect. The mean questionnaire scores were DRSP, 44.7; V-RQOL, 57.1; and VHI-10, 16. An inverse correlation was found between the V-RQOL score and D; however, a positive correlation was found between both the VHI-10 and DRSP scores and D. A predominance of adult women, organic dysphonia, moderate voice deviation, high dysphonia risk, and low to moderate quality of life impact characterized our sample. There were correlations between the scores of each of the three questionnaires and the degree of voice deviation. It should be noted that the DRSP monitored the degree of dysphonia severity, which reinforces its applicability for patients with different laryngeal diagnoses.
The Acoustic Correlates of Breathy Voice: a Study of Source-Vowel Interaction.

NASA Astrophysics Data System (ADS)

Lin, Yeong-Fen Emily

This thesis is the result of an investigation of the source-vowel interaction from the point of view of perception. Major objectives include the identification of the acoustic correlates of breathy voice and the disclosure of the interdependent relationship between the perception of vowel identity and breathiness. Two experiments were conducted to achieve these objectives. In the first experiment, voice samples from one control group and seven patient groups were compared. The control group consisted of five female and five male adults. The ten normals were recruited to perform a sustained vowel phonation task with constant pitch and loudness. The voice samples of seventy patients were retrieved from a hospital data base, with vowels extracted from sentences repeated by patients at their habitual pitch and loudness. The seven patient groups were divided, based on a unique combination of patients' measures on mean flow rate and glottal resistance. Eighteen acoustic variables were treated with a three-way (Gender x Group x Vowel) ANOVA. Parameters showing a significant female-male difference as well as group differences, especially those between the presumed breathy group and the other groups, were identified as relevant to the distinction of breathy voice. As a result, F1-F3 amplitude difference and slope were found to be most effective in distinguishing breathy voice. Other acoustic correlates of breathy voice included F1 bandwidth, RMS-H1 amplitude difference, and F1-F2 amplitude difference and slope. In the second experiment, a formant synthesizer was used to generate vowel stimuli with varying spectral tilt and F1 bandwidth. Thirteen native American English speakers made dissimilarity judgements on paired stimuli in terms of vowel identity and breathiness. Listeners' perceptual vowel spaces were found to be affected by changes in the acoustic correlates of breathy voice. The threshold of detecting a change of vocal quality in the breathiness domain was also
Contemporary Commercial Music Singing Students-Voice Quality and Vocal Function at the Beginning of Singing Training.

PubMed

Sielska-Badurek, Ewelina M; Sobol, Maria; Olszowska, Katarzyna; Niemczyk, Kazimierz

2017-10-03

The purpose of this study was to assess the voice quality and the vocal tract function in popular singing students at the beginning of their singing training at the High School of Music. This is a retrospective cross-sectional study. The study consisted of 45 popular singing students (35 females and 10 males, mean age: 19.9 ± 2.8 years). They were assessed in the first 2 months of their 4-year singing training at the High School of Music, between 2013 and 2016. Voice quality and vocal tract function were evaluated using videolaryngostroboscopy, palpation of the vocal tract structures, the perceptual speaking and singing voice assessment, acoustic analysis, maximal phonation time, the Voice Handicap Index, and the Singing Voice Handicap Index (SVHI). Twenty-two percent of Contemporary Commercial Music singing students began their education in the High School with vocal nodules. Palpation of the vocal tract structures showed correct motions and tension in 50% of students while speaking and in 39.3% while singing. Perceptual voice assessment showed proper speaking voice quality in 80% and proper singing voice quality in 82.4%. The mean vocal fundamental frequency while speaking in females was 214 Hz and in males was 116 Hz. Dysphonia Severity Index was at the level of 2, and maximum phonation time was 17.7 seconds. The Voice Handicap Index and the SVHI remained within the normal range: 7.5 and 19, respectively. Perceptual singing voice assessment correlated with the SVHI (P = 0.006). Twenty-two percent of the Contemporary Commercial Music singing students began their education in the High School with organic vocal fold lesions. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
An Analysis of En Route Controller-Pilot Voice Communications

DOT National Transportation Integrated Search

1993-03-01

The purpose of this analysis was to examine current pilot-controller communication practices in the en route environment. Forty-eight hours of voice tapes from eight different Air Route Traffic Control Centers (ARTCCs) were examined. There were...
Validity and reliability of acoustic analysis of respiratory sounds in infants

PubMed Central

Elphick, H; Lancaster, G; Solis, A; Majumdar, A; Gupta, R; Smyth, R

2004-01-01

Objective: To investigate the validity and reliability of computerised acoustic analysis in the detection of abnormal respiratory noises in infants. Methods: Blinded, prospective comparison of acoustic analysis with stethoscope examination. Validity and reliability of acoustic analysis were assessed by calculating the degree of observer agreement using the κ statistic with 95% confidence intervals (CI). Results: 102 infants under 18 months were recruited. Convergent validity for agreement between stethoscope examination and acoustic analysis was poor for wheeze (κ = 0.07 (95% CI, –0.13 to 0.26)) and rattles (κ = 0.11 (–0.05 to 0.27)) and fair for crackles (κ = 0.36 (0.18 to 0.54)). Both the stethoscope and acoustic analysis distinguished well between sounds (discriminant validity). Agreement between observers for the presence of wheeze was poor for both stethoscope examination and acoustic analysis. Agreement for rattles was moderate for the stethoscope but poor for acoustic analysis. Agreement for crackles was moderate using both techniques. Within-observer reliability for all sounds using acoustic analysis was moderate to good. Conclusions: The stethoscope is unreliable for assessing respiratory sounds in infants. This has important implications for its use as a diagnostic tool for lung disorders in infants, and confirms that it cannot be used as a gold standard. Because of the unreliability of the stethoscope, the validity of acoustic analysis could not be demonstrated, although it could discriminate between sounds well and showed good within-observer reliability. For acoustic analysis, targeted training and the development of computerised pattern recognition systems may improve reliability so that it can be used in clinical practice. PMID:15499065
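The κ statistic used above to quantify observer agreement can be sketched for the two-rater case as Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's marginal frequencies. The abstract does not report the raw ratings, so the example data below are hypothetical, and the confidence intervals reported in the study are not reproduced here.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Hypothetical data: 1 = sound judged present, 0 = absent, for ten infants.
stethoscope = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
acoustic = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
print(round(cohens_kappa(stethoscope, acoustic), 3))
```

Because κ subtracts chance agreement, two methods can agree on most items and still receive a low κ when one label dominates, which is worth keeping in mind when reading the values quoted in the abstract.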
A Conjoint Analysis of Voice Over IP Attributes.

ERIC Educational Resources Information Center

Zubey, Michael L.; Wagner, William; Otto, James R.

2002-01-01

Managers need to understand the tradeoffs associated with voice over Internet protocol (VoIP) networks as compared to the Public Switched Telephone Network (PSTN). This article measures the preference structures between IP telephony and PSTN services using conjoint analysis. The purpose is to suggest VoIP technology attributes that best meet…
An Analysis of Tower (Local) Controller - Pilot Voice Communications

DOT National Transportation Integrated Search

1994-06-01

The purpose of this analysis was to examine current pilot-controller communication practices in the terminal environment. Forty-nine hours of voice tapes from local positions in ten Air Traffic Control Towers (ATCTs) were examined. There were 8,444...
Characterizing noise in nonhuman vocalizations: Acoustic analysis and human perception of barks by coyotes and dogs

NASA Astrophysics Data System (ADS)

Riede, Tobias; Mitchell, Brian R.; Tokuda, Isao; Owren, Michael J.

2005-07-01

Measuring noise as a component of mammalian vocalizations is of interest because of its potential relevance to the communicative function. However, methods for characterizing and quantifying noise are less well established than methods applicable to harmonically structured aspects of signals. Using barks of coyotes and domestic dogs, we compared six acoustic measures and studied how they are related to human perception of noisiness. Measures of harmonic-to-noise-ratio (HNR), percent voicing, and shimmer were found to be the best predictors of perceptual rating by human listeners. Both acoustics and perception indicated that noisiness was similar across coyote and dog barks, but within each species there was significant variation among the individual vocalizers. The advantages and disadvantages of the various measures are discussed.
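Of the measures named above, shimmer is the simplest to illustrate. The sketch below uses one common definition, local shimmer: the mean absolute difference between the peak amplitudes of consecutive cycles, divided by the mean amplitude. The abstract does not state which shimmer variant was computed, so this definition, the function name, and the sample amplitudes are assumptions for illustration only.

```python
def local_shimmer(cycle_amplitudes):
    """Relative cycle-to-cycle amplitude perturbation (local shimmer).

    `cycle_amplitudes` holds one positive peak amplitude per glottal cycle.
    """
    if len(cycle_amplitudes) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(cycle_amplitudes, cycle_amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amplitude = sum(cycle_amplitudes) / len(cycle_amplitudes)
    return mean_diff / mean_amplitude

# Hypothetical per-cycle peak amplitudes for a steady and an unsteady call.
steady = [1.0, 1.0, 1.0, 1.0]
unsteady = [1.0, 1.2, 1.0, 1.2]
print(local_shimmer(steady), local_shimmer(unsteady))
```

A perfectly periodic signal gives a shimmer of zero, while irregular cycle amplitudes raise it, which is consistent with shimmer tracking perceived noisiness.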
Voice quality outcomes of idiopathic Parkinson's disease medical treatment: A systematic review.

PubMed

Lechien, J R; Blecic, S; Huet, K; Delvaux, V; Piccaluga, M; Roland, V; Harmegnies, B; Saussez, S

2018-06-01

To investigate voice quality (VQ) impairments in idiopathic Parkinson's disease (IPD) and to explore the impact of medical treatments and L-Dopa challenge testing on voice. Relevant studies published between January 1980 and June 2017 describing VQ evaluations in IPD were retrieved using PubMed, Scopus, Biological Abstracts, BioMed Central and Cochrane databases. Issues of clinical relevance, including IPD treatment efficiency and voice quality outcomes, were evaluated for each study. The grade of recommendation for each publication was determined according to the Oxford Centre for Evidence-Based Medicine evidence levels. The database research yielded 106 relevant publications, of which 33 studies met the inclusion criteria, for a total of 964 patients with IPD. Data were extracted by 3 independent physicians who identified 21, 11 and 1 trials with IIIb, IIb and IIa evidence levels, respectively. The main VQ assessment tools used were acoustic testing (N = 27), aerodynamic testing (N = 10), subjective measurements (N = 8) and videolaryngostroboscopy (N = 3). The majority of trials (N = 32/33) identified subjective or objective VQ improvements after medical treatment (N = 10) or better VQ evaluations in healthy subjects compared to patients with IPD (N = 22). In particular, our analysis indicates that VQ generally improves during L-Dopa challenge testing, making VQ evaluation an additional tool for IPD diagnosis. The methodology used to assess subjective and objective VQ varied substantially from one study to another. All of the included studies took into consideration the patient's clinical profile in the VQ analysis. Most studies indicated that VQ assessments remain useful as outcome measures of the effectiveness of medical treatment and could be helpful for IPD diagnosis based on L-Dopa challenge testing. Further controlled studies using standardised and transparent methodology for measuring acoustic parameters are necessary to
New standard measures for clinical voice analysis include high speed films

NASA Astrophysics Data System (ADS)

Pedersen, Mette; Munch, Kasper

2012-02-01

In clinical work with patients in a medical voice clinic, it is important to have an up-to-date normal reference for the data used. Several new parameters have to be correlated to older traditional measures. The older ones are stroboscopy, possibly coordinated with electroglottography (EGG), the Multi-Dimensional Voice Program and airflow rates. Long Time Averaged Spectrograms (LTAS) and phonetograms (voice profiles) calculate the range and dynamics of tones of the patients. High-speed films, updated airflow measures as well as area calculations of phonetograms add information to the understanding of the glottis closure in single movements of the vocal cords. A multivariate analysis was made to study the connection between the measures. This information can be used in many contexts, including the otolaryngological clinic.
Dimensionality in voice quality.

PubMed

Bele, Irene Velsvik

2007-05-01

This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36); its purpose was to investigate normal and supranormal voices. The goal was the development of a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) had been evaluated by 10 listeners, for 15 vocal characteristics using visual analog (VA) scales. In this investigation, the results of an exploratory factor analysis of the vocal characteristics used in this method are presented, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in the loudness variable, as two loudness levels are studied. Furthermore, special attention is paid to the vocal characteristics Sonority and Ringing voice quality, as the essence of the term "resonant voice" was a basic issue throughout the doctoral dissertation of which this study was a part.
Trends in musical theatre voice: an analysis of audition requirements for singers.

PubMed

Green, Kathryn; Freeman, Warren; Edwards, Matthew; Meyer, David

2014-05-01

The American musical theatre industry is a multibillion dollar business in which the requirements for singers are varied and complex. This study identifies the musical genres and voice requirements that are currently most requested at professional auditions to help voice teachers, pedagogues, and physicians who work with musical theatre singers understand the demands of their clients' business. Frequency count. One thousand two hundred thirty-eight professional musical theatre audition listings were gathered over a 6-month period, and information from each listing was categorized and entered into a spreadsheet for analysis. The results indicate that four main genres of music were requested over a wide variety of styles, with more than half of auditions requesting genre categories that may not be served by traditional or classical voice technique alone. To adequately prepare young musical theatre performers for the current job market and keep the performers healthily making the sounds required by the industry, new singing styles may need to be studied and integrated into voice training that only teaches classical styles. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Voice Formants in Individuals With Congenital, Isolated, Lifetime Growth Hormone Deficiency.

PubMed

Valença, Eugenia H O; Salvatori, Roberto; Souza, Anita H O; Oliveira-Neto, Luiz A; Oliveira, Alaíde H A; Gonçalves, Maria I R; Oliveira, Carla R P; D'Ávila, Jeferson S; Melo, Valdinaldo A; de Carvalho, Susana; de Andrade, Bruna M R; Nascimento, Larisse S; Rocha, Savinny B de V; Ribeiro, Thais R; Prado-Barreto, Valeria M; Melo, Enaldo V; Aguiar-Oliveira, Manuel H

2016-05-01

To analyze the voice formants (F1, F2, F3, and F4 in Hz) of seven oral vowels, in Brazilian Portuguese, [a, ε, e, i, ɔ, o, and u] in adult individuals with congenital lifetime untreated isolated growth hormone deficiency (IGHD). This is a cross-sectional study. Acoustic analysis of isolated vowels was performed in 33 individuals with IGHD, age 44.5 (17.6) years (16 women), and 29 controls, age 51.1 (17.6) years (15 women). Compared with controls, IGHD men showed higher values of F3 [i, e, and ε], P = 0.006, P = 0.022, and P = 0.006, respectively and F4 [i], P = 0.001 and lower values of F2 [u], P = 0.034; IGHD women presented higher values of F1 [i and e] P = 0.029 and P = 0.036; F2 [ɔ] P = 0.006; F4 [ɔ] P = 0.031 and lower values of F2 [i] P = 0.004. IGHD abolished most of the gender differences in formant frequencies present in controls. Congenital, severe IGHD results in higher values of most formant frequencies, suggesting smaller oral and pharyngeal cavities. In addition, it causes a reduction in the effect of gender on the structure of the formants, maintaining a prepubertal acoustic prediction. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.

PubMed

Fang, Shih-Hau; Tsao, Yu; Hsiao, Min-Jing; Chen, Ji-Ying; Lai, Ying-Hui; Lin, Feng-Chuan; Wang, Chi-Te

2018-03-19

Computerized detection of voice disorders has attracted considerable academic and clinical interest in the hope of providing an effective screening method for voice diseases before endoscopic confirmation. This study proposes a deep-learning-based approach to detect pathological voice and examines its performance and utility compared with other automatic classification algorithms. This study retrospectively collected 60 normal voice samples and 402 pathological voice samples of 8 common clinical voice disorders in a voice clinic of a tertiary teaching hospital. We extracted Mel frequency cepstral coefficients from 3-second samples of a sustained vowel. The performances of three machine learning algorithms, namely, deep neural network (DNN), support vector machine, and Gaussian mixture model, were evaluated based on a fivefold cross-validation. Collective cases from the voice disorder database of MEEI (Massachusetts Eye and Ear Infirmary) were used to verify the performance of the classification mechanisms. The experimental results demonstrated that DNN outperforms Gaussian mixture model and support vector machine. Its accuracy in detecting voice pathologies reached 94.26% and 90.52% in male and female subjects, based on three representative Mel frequency cepstral coefficient features. When applied to the MEEI database for validation, the DNN also achieved a higher accuracy (99.32%) than the other two classification algorithms. By stacking several layers of neurons with optimized weights, the proposed DNN algorithm can fully utilize the acoustic features and efficiently differentiate between normal and pathological voice samples. Based on this pilot study, future research may proceed to explore more application of DNN from laboratory and clinical perspectives. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
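
The fivefold cross-validation mentioned above can be illustrated with a minimal sketch. It assumes stratified folds (the abstract does not say whether the authors stratified), uses only the class counts reported in the abstract, and leaves the classifier out entirely. `stratified_kfold` is a hypothetical helper name, not something taken from the study.

```python
import random


def stratified_kfold(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping class proportions roughly equal.

    Stratification is an assumption made for this sketch, not a detail
    taken from the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class counts as reported in the abstract: 60 normal, 402 pathological.
labels = ["normal"] * 60 + ["pathological"] * 402
folds = stratified_kfold(labels, k=5)

for held_out, test_indices in enumerate(folds):
    test_set = set(test_indices)
    train_indices = [i for i in range(len(labels)) if i not in test_set]
    n_normal = sum(1 for i in test_indices if labels[i] == "normal")
    # A classifier would be trained on train_indices and scored on test_indices here.
    print(f"fold {held_out}: train={len(train_indices)} "
          f"test={len(test_indices)} normal_in_test={n_normal}")
```

With a class imbalance this strong, stratifying keeps the minority class represented in every test fold, which is why the sketch deals indices per class rather than shuffling the whole set at once.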

A study of voice production characteristics of astronaut speech during Apollo 11 for speaker modeling in space.

PubMed

Yu, Chengzhu; Hansen, John H L

2017-03-01

Human physiology has evolved to accommodate environmental conditions, including temperature, pressure, and air chemistry unique to Earth. However, the environment in space varies significantly compared to that on Earth and, therefore, variability is expected in astronauts' speech production mechanism. In this study, the variations of astronaut voice characteristics during the NASA Apollo 11 mission are analyzed. Specifically, acoustical features such as fundamental frequency and phoneme formant structure that are closely related to the speech production system are studied. For a further understanding of astronauts' vocal tract spectrum variation in space, a maximum likelihood frequency warping based analysis is proposed to detect the vocal tract spectrum displacement during space conditions. The results from fundamental frequency, formant structure, as well as vocal spectrum displacement indicate that astronauts change their speech production mechanism when in space. Moreover, the experimental results for astronaut voice identification tasks indicate that current speaker recognition solutions are highly vulnerable to astronaut voice production variations in space conditions. Future recommendations from this study suggest that successful applications of speaker recognition during extended space missions require robust speaker modeling techniques that could effectively adapt to voice production variation caused by diverse space conditions.
A system for analysis and classification of voice communications

NASA Technical Reports Server (NTRS)

Older, H. J.; Jenney, L. L.; Garland, L.

1973-01-01

A method for analysis and classification of verbal communications typically associated with manned space missions or simulations was developed. The study was carried out in two phases. Phase 1 was devoted to identification of crew tasks and activities which require voice communication for accomplishment or reporting. Phase 2 entailed development of a message classification system and a preliminary test of its feasibility. The classification system permits voice communications to be analyzed to three progressively more specific levels of detail and to be described in terms of message content, purpose, and the participants in the information exchange. A coding technique was devised to allow messages to be recorded by an eight-digit number.
Hearing Children's Voices through a Conversation Analysis Approach

ERIC Educational Resources Information Center

Bateman, Amanda

2017-01-01

This article introduces the methodological approach of conversation analysis (CA) and demonstrates its usefulness in presenting more authentic documentation and analysis of children's voices. Grounded in ethnomethodology, CA has recently gained interest in the area of early childhood studies due to the affordances it holds for gaining access to…
Brain 'talks over' boring quotes: top-down activation of voice-selective areas while listening to monotonous direct speech quotations.

PubMed

Yao, Bo; Belin, Pascal; Scheepers, Christoph

2012-04-15

In human communication, direct speech (e.g., Mary said, "I'm hungry") is perceived as more vivid than indirect speech (e.g., Mary said that she was hungry). This vividness distinction has previously been found to underlie silent reading of quotations: Using functional magnetic resonance imaging (fMRI), we found that direct speech elicited higher brain activity in the temporal voice areas (TVA) of the auditory cortex than indirect speech, consistent with an "inner voice" experience in reading direct speech. Here we show that listening to monotonously spoken direct versus indirect speech quotations also engenders differential TVA activity. This suggests that individuals engage in top-down simulations or imagery of enriched supra-segmental acoustic representations while listening to monotonous direct speech. The findings shed new light on the acoustic nature of the "inner voice" in understanding direct speech. Copyright © 2012 Elsevier Inc. All rights reserved.
Temporal signatures of processing voiceness and emotion in sound.

PubMed

Schirmer, Annett; Gunter, Thomas C

2017-06-01

This study explored the temporal course of vocal and emotional sound processing. Participants detected rare repetitions in a stimulus stream comprising neutral and surprised non-verbal exclamations and spectrally rotated control sounds. Spectral rotation preserved some acoustic and emotional properties of the vocal originals. Event-related potentials elicited to unrepeated sounds revealed effects of voiceness and emotion. Relative to non-vocal sounds, vocal sounds elicited a larger centro-parietally distributed N1. This effect was followed by greater positivity to vocal relative to non-vocal sounds beginning with the P2 and extending throughout the recording epoch (N4, late positive potential) with larger amplitudes in female than in male listeners. Emotion effects overlapped with the voiceness effects but were smaller and differed topographically. Voiceness and emotion interacted only for the late positive potential, which was greater for vocal-emotional as compared with all other sounds. Taken together, these results point to a multi-stage process in which voiceness and emotionality are represented independently before being integrated in a manner that biases responses to stimuli with socio-emotional relevance. © The Author (2017). Published by Oxford University Press.
Distributed acoustic cues for caller identity in macaque vocalization.

PubMed

Fukushima, Makoto; Doyle, Alex M; Mullarkey, Matthew P; Mishkin, Mortimer; Averbeck, Bruno B

2015-12-01

Individual primates can be identified by the sound of their voice. Macaques have demonstrated an ability to discern conspecific identity from a harmonically structured 'coo' call. Voice recognition presumably requires the integrated perception of multiple acoustic features. However, it is unclear how this is achieved, given considerable variability across utterances. Specifically, the extent to which information about caller identity is distributed across multiple features remains elusive. We examined these issues by recording and analysing a large sample of calls from eight macaques. Single acoustic features, including fundamental frequency, duration and Wiener entropy, were informative but unreliable for the statistical classification of caller identity. A combination of multiple features, however, allowed for highly accurate caller identification. A regularized classifier that learned to identify callers from the modulation power spectrum of calls found that specific regions of spectral-temporal modulation were informative for caller identification. These ranges are related to acoustic features such as the call's fundamental frequency and FM sweep direction. We further found that the low-frequency spectrotemporal modulation component contained an indexical cue of the caller body size. Thus, cues for caller identity are distributed across identifiable spectrotemporal components corresponding to laryngeal and supralaryngeal components of vocalizations, and the integration of those cues can enable highly reliable caller identification. Our results demonstrate a clear acoustic basis by which individual macaque vocalizations can be recognized.
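
One of the single features named above, Wiener entropy (also called spectral flatness), can be sketched in a few lines. The version below uses the common definition, the log ratio of the geometric mean to the arithmetic mean of the power spectrum; whether the study used exactly this form, and the frame length, test signals, and `floor` value, are assumptions made for illustration.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame, excluding the DC bin."""
    n = len(frame)
    powers = []
    for k in range(1, n // 2):
        coefficient = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        powers.append(abs(coefficient) ** 2)
    return powers


def wiener_entropy(powers, floor=1e-12):
    """Log ratio of geometric to arithmetic mean of the power spectrum.

    The floor (an arbitrary choice for this sketch) avoids log(0) on empty bins.
    """
    clipped = [max(p, floor) for p in powers]
    log_geometric_mean = sum(math.log(p) for p in clipped) / len(clipped)
    arithmetic_mean = sum(clipped) / len(clipped)
    return log_geometric_mean - math.log(arithmetic_mean)


# Synthetic test signals (not macaque calls): a pure tone and white noise.
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

tone_entropy = wiener_entropy(power_spectrum(tone))
noise_entropy = wiener_entropy(power_spectrum(noise))
print(tone_entropy, noise_entropy)
```

The value is 0 for a perfectly flat spectrum and becomes more negative as energy concentrates in a few bins, so a harmonically structured call would be expected to score lower than a noisy one.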
Distributed acoustic cues for caller identity in macaque vocalization

PubMed Central

Doyle, Alex M.; Mullarkey, Matthew P.; Mishkin, Mortimer; Averbeck, Bruno B.

2015-01-01

Individual primates can be identified by the sound of their voice. Macaques have demonstrated an ability to discern conspecific identity from a harmonically structured ‘coo’ call. Voice recognition presumably requires the integrated perception of multiple acoustic features. However, it is unclear how this is achieved, given considerable variability across utterances. Specifically, the extent to which information about caller identity is distributed across multiple features remains elusive. We examined these issues by recording and analysing a large sample of calls from eight macaques. Single acoustic features, including fundamental frequency, duration and Wiener entropy, were informative but unreliable for the statistical classification of caller identity. A combination of multiple features, however, allowed for highly accurate caller identification. A regularized classifier that learned to identify callers from the modulation power spectrum of calls found that specific regions of spectral–temporal modulation were informative for caller identification. These ranges are related to acoustic features such as the call’s fundamental frequency and FM sweep direction. We further found that the low-frequency spectrotemporal modulation component contained an indexical cue of the caller body size. Thus, cues for caller identity are distributed across identifiable spectrotemporal components corresponding to laryngeal and supralaryngeal components of vocalizations, and the integration of those cues can enable highly reliable caller identification. Our results demonstrate a clear acoustic basis by which individual macaque vocalizations can be recognized. PMID:27019727
The Role of Occupational Voice Demand and Patient-Rated Impairment in Predicting Voice Therapy Adherence.

PubMed

Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen

2018-05-01

Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center was done. The Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness, Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence (P < 0.0001, P = 0.018, and P = 0.008, respectively). CART analysis found that patients with low or no occupational voice demand are significantly more likely to be nonadherent with therapy than those with high occupational voice demand (P < 0.001). Furthermore, a VHI-10 score of ≤29 or a V-RQOL score of >40 is a significant cutoff point for predicting therapy nonadherence (P < 0.011 and P < 0.004, respectively). Occupational voice demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Scaling and dimensional analysis of acoustic streaming jets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moudjed, B.; Botton, V.; Henry, D.

2014-09-15

This paper focuses on acoustic streaming free jets. This is to say that progressive acoustic waves are used to generate a steady flow far from any wall. The derivation of the governing equations under the form of a nonlinear hydrodynamics problem coupled with an acoustic propagation problem is made on the basis of a time scale discrimination approach. This approach is preferred to the usually invoked amplitude perturbations expansion since it is consistent with experimental observations of acoustic streaming flows featuring hydrodynamic nonlinearities and turbulence. Experimental results obtained with a plane transducer in water are also presented together with a review of the former experimental investigations using similar configurations. A comparison of the shape of the acoustic field with the shape of the velocity field shows that diffraction is a key ingredient in the problem though it is rarely accounted for in the literature. A scaling analysis is made and leads to two scaling laws for the typical velocity level in acoustic streaming free jets; these are both observed in our setup and in former studies by other teams. We also perform a dimensional analysis of this problem: a set of seven dimensionless groups is required to describe a typical acoustic experiment. We find that a full similarity is usually not possible between two acoustic streaming experiments featuring different fluids. We then choose to relax the similarity with respect to sound attenuation and to focus on the case of a scaled water experiment representing an acoustic streaming application in liquid metals, in particular, in liquid silicon and in liquid sodium. We show that small acoustic powers can yield relatively high Reynolds numbers and velocity levels; this could be a virtue for heat and mass transfer applications, but a drawback for ultrasonic velocimetry.
The Traditional/Acoustic Music Project: a study of vocal demands and vocal health.

PubMed

Erickson, Molly L

2012-09-01

The Traditional/Acoustic Music Project seeks to identify the musical and performance characteristics of traditional/acoustic musicians and determine the vocal demands they face with the goals of (1) providing information and outreach to this important group of singers and (2) providing information to physicians, speech-language pathologists, and singing teachers who will enable them to provide appropriate services. Descriptive cross-sectional study. Data have been collected through administration of a 53-item questionnaire. The questionnaire was administered to artists performing at local venues in Knoxville, Tennessee and also to musicians attending the 2008 Folk Alliance Festival in Memphis, Tennessee. Approximately 41% of the respondents have had no vocal training, whereas approximately 34% of the respondents have had some form of formal vocal training (private lessons or group instruction). About 41% of the participants had experienced a tired voice, whereas about 30% of the participants had experienced either a loss of the top range of the voice or a total loss of voice at least once in their careers. Approximately 31% of the respondents had no health insurance. Approximately 69% of the respondents reported that they get their information about healthy singing practices solely from fellow musicians or that they do not get any information at all. Traditional/acoustic musicians are a poorly studied population at risk for the development of voice disorders. Continued research is necessary with the goal of a large sample that can be analyzed for associations, identification of subpopulations, and formulation of specific hypotheses that lend themselves to experimental research. Appropriate models of information and service delivery tailored for the singer-instrumentalist are needed. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Recurrent laryngeal nerve reinnervation in children: Acoustic and endoscopic characteristics pre-intervention and post-intervention. A comparison of treatment options.

PubMed

Zur, Karen B; Carroll, Linda M

2015-12-01

To establish the benefit of ansa cervicalis-recurrent laryngeal nerve reinnervation (ANSA-RLN) for the management of dysphonia secondary to unilateral vocal cord paralysis (UVCP) in children. Children treated with ANSA-RLN for the management of dysphonia secondary to unilateral vocal fold immobility will have superior acoustic, perceptual, and stroboscopic outcomes compared to injection laryngoplasty and observation. Retrospective case-series chart review. Laryngeal, perceptual, and acoustic analysis of dysphonia was performed in 33 children (age 2-16 years) diagnosed with UVCP. Comparison of pre-post function for treatment groups (no treatment, injection laryngoplasty, ANSA-RLN) with additional comparison between gestational ages, age at initial evaluation, and gender were examined. Perceptual measures included Pediatric Voice Handicap Index (pVHI) and Grade, Roughness, Breathiness, Asthenia, Strain (GRBAS) perceptual rating. Objective measures included semitone (ST) range, jitter%, shimmer%, noise-to-harmonic ratio, voicing, and maximum phonation time. Post-treatment, pVHI, jitter%, and ST were significantly improved for ANSA-RLN subjects compared to injection subjects. Improved function (laryngeal diadochokinesis, pVHI, GRBAS, and/or acoustic) was observed in all ANSA-RLN subjects who had vocal fold paralysis as the only laryngeal diagnosis. This study presents one of the largest studies of pediatric vocal fold paralysis diagnosis and treatment. The study looks at the spectrum of function in patients with UVCP and looks at the outcomes of options: no treatment, injection laryngoplasty, and ANSA-RLN. Although surgical outcomes vary, both injection laryngoplasty and ANSA-RLN show benefit in laryngeal function, voice stability, voice capacity, perceptual rating, and pVHI scores. Both injection laryngoplasty and ANSA-RLN showed improvements post-treatment, and should be considered for management of pediatric UVCP. However, the ANSA-RLN group showed better and longer
Acoustic cue weighting in the singleton vs geminate contrast in Lebanese Arabic: The case of fricative consonants.

PubMed

Al-Tamimi, Jalal; Khattab, Ghada

2015-07-01

This paper is the first reported investigation of the role of non-temporal acoustic cues in the singleton-geminate contrast in Lebanese Arabic, alongside the more frequently reported temporal cues. The aim is to explore the extent to which singleton and geminate consonants show qualitative differences in a language where phonological length is prominent and where moraic structure governs segment timing and syllable weight. Twenty speakers (ten male, ten female) were recorded producing trochaic disyllables with medial singleton and geminate fricatives preceded by phonologically short and long vowels. The following acoustic measures were applied on the medial fricative and surrounding vowels: absolute duration; intensity; fundamental frequency; spectral peak and shape, dynamic amplitude, and voicing patterns of medial fricatives; and vowel quality and voice quality correlates of surrounding vowels. Discriminant analysis and receiver operating characteristics (ROC) curves were used to assess each acoustic cue's contribution to the singleton-geminate contrast. Classification rates of 89% and ROC curves with an area under the curve rate of 96% confirmed the major role played by temporal cues, with non-temporal cues contributing to the contrast but to a much lesser extent. These results confirm that the underlying contrast for gemination in Arabic is temporal, but highlight [+tense] (fortis) as a secondary feature.
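
The area under the ROC curve used above to rank each cue can be computed directly as a pairwise comparison. The sketch below is a generic illustration, not the authors' procedure: it scores a single cue (duration) by counting how often a geminate token exceeds a singleton token. The duration values are invented for the example and do not come from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    wins = 0.0
    for positive in positive_scores:
        for negative in negative_scores:
            if positive > negative:
                wins += 1.0
            elif positive == negative:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical fricative durations in milliseconds (illustrative values only).
geminate_ms = [182.0, 170.0, 201.0, 168.0, 190.0]
singleton_ms = [95.0, 110.0, 172.0, 88.0, 102.0]

auc = roc_auc(geminate_ms, singleton_ms)
print(f"AUC for duration cue: {auc:.2f}")  # 23 of 25 pairs ordered correctly -> 0.92
```

An AUC of 0.5 means the cue separates the two categories no better than chance, while 1.0 means every geminate token exceeds every singleton token on that cue.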
Lower Vocal Tract Morphologic Adjustments Are Relevant for Voice Timbre in Singing.

PubMed

Mainka, Alexander; Poznyakovskiy, Anton; Platzek, Ivan; Fleischer, Mario; Sundberg, Johan; Mürbe, Dirk

2015-01-01

The vocal tract shape is crucial to voice production. Its lower part seems particularly relevant for voice timbre. This study analyzes the detailed morphology of parts of the epilaryngeal tube and the hypopharynx for the sustained German vowels /a/, /e/, /i/, /o/, and /u/ by thirteen male singer subjects who were at the beginning of their academic singing studies. Analysis was based on two different phonatory conditions: a natural, speech-like phonation and a singing phonation, like in classical singing. 3D models of the vocal tract were derived from magnetic resonance imaging and compared with long-term average spectrum analysis of audio recordings from the same subjects. Comparison of singing to the speech-like phonation, which served as reference, showed significant adjustments of the lower vocal tract: an average lowering of the larynx by 8 mm and an increase of the hypopharyngeal cross-sectional area (+ 21.9%) and volume (+ 16.8%). Changes in the analyzed epilaryngeal portion of the vocal tract were not significant. Consequently, lower larynx-to-hypopharynx area and volume ratios were found in singing compared to the speech-like phonation. All evaluated measures of the lower vocal tract varied significantly with vowel quality. Acoustically, an increase of high frequency energy in singing correlated with a wider hypopharyngeal area. The findings offer an explanation how classical male singers might succeed in producing a voice timbre with increased high frequency energy, creating a singer's formant cluster.
Lower Vocal Tract Morphologic Adjustments Are Relevant for Voice Timbre in Singing

PubMed Central

Mainka, Alexander; Poznyakovskiy, Anton; Platzek, Ivan; Fleischer, Mario; Sundberg, Johan; Mürbe, Dirk

2015-01-01

The vocal tract shape is crucial to voice production. Its lower part seems particularly relevant for voice timbre. This study analyzes the detailed morphology of parts of the epilaryngeal tube and the hypopharynx for the sustained German vowels /a/, /e/, /i/, /o/, and /u/ by thirteen male singer subjects who were at the beginning of their academic singing studies. Analysis was based on two different phonatory conditions: a natural, speech-like phonation and a singing phonation, like in classical singing. 3D models of the vocal tract were derived from magnetic resonance imaging and compared with long-term average spectrum analysis of audio recordings from the same subjects. Comparison of singing to the speech-like phonation, which served as reference, showed significant adjustments of the lower vocal tract: an average lowering of the larynx by 8 mm and an increase of the hypopharyngeal cross-sectional area (+ 21.9%) and volume (+ 16.8%). Changes in the analyzed epilaryngeal portion of the vocal tract were not significant. Consequently, lower larynx-to-hypopharynx area and volume ratios were found in singing compared to the speech-like phonation. All evaluated measures of the lower vocal tract varied significantly with vowel quality. Acoustically, an increase of high frequency energy in singing correlated with a wider hypopharyngeal area. The findings offer an explanation how classical male singers might succeed in producing a voice timbre with increased high frequency energy, creating a singer's formant cluster. PMID:26186691
Back-and-Forth Methodology for Objective Voice Quality Assessment: From/to Expert Knowledge to/from Automatic Classification of Dysphonia

NASA Astrophysics Data System (ADS)

Fredouille, Corinne; Pouchoulin, Gilles; Ghio, Alain; Revis, Joana; Bonastre, Jean-François; Giovanni, Antoine

2009-12-01

This paper addresses voice disorder assessment. It proposes an original back-and-forth methodology involving an automatic classification system as well as knowledge of the human experts (machine learning experts, phoneticians, and pathologists). The goal of this methodology is to bring a better understanding of acoustic phenomena related to dysphonia. The automatic system was validated on a dysphonic corpus (80 female voices), rated according to the GRBAS perceptual scale by an expert jury. Firstly, focused on the frequency domain, the classification system showed the interest of 0-3000 Hz frequency band for the classification task based on the GRBAS scale. Later, an automatic phonemic analysis underlined the significance of consonants and more surprisingly of unvoiced consonants for the same classification task. Submitted to the human experts, these observations led to a manual analysis of unvoiced plosives, which highlighted a lengthening of VOT according to the dysphonia severity validated by a preliminary statistical analysis.
Reliability of human-supervised formant-trajectory measurement for forensic voice comparison.

PubMed

Zhang, Cuiling; Morrison, Geoffrey Stewart; Ochoa, Felipe; Enzinger, Ewald

2013-01-01

Acoustic-phonetic approaches to forensic voice comparison often include human-supervised measurement of vowel formants, but the reliability of such measurements is a matter of concern. This study assesses the within- and between-supervisor variability of three sets of formant-trajectory measurements made by each of four human supervisors. It also assesses the validity and reliability of forensic-voice-comparison systems based on these measurements. Each supervisor's formant-trajectory system was fused with a baseline mel-frequency cepstral-coefficient system, and performance was assessed relative to the baseline system. Substantial improvements in validity were found for all supervisors' systems, but some supervisors' systems were more reliable than others.
Respiratory Muscle Strength, Sound Pressure Level, and Vocal Acoustic Parameters and Waist Circumference of Children With Different Nutritional Status.

PubMed

Pascotini, Fernanda dos Santos; Ribeiro, Vanessa Veis; Christmann, Mara Keli; Tomasi, Lidia Lis; Dellazzana, Amanda Alves; Haeffner, Leris Salete Bonfanti; Cielo, Carla Aparecida

2016-01-01

Relate respiratory muscle strength (RMS), sound pressure (SP) level, and vocal acoustic parameters to the abdominal circumference (AC) and nutritional status of children. This is a cross-sectional study. Eighty-two school children aged between 8 and 10 years, grouped by nutritional states (eutrophic, overweight, or obese) and AC percentile (≤25, 25-75, and ≥75), were included in the study. Evaluations of maximal inspiratory pressure (IPmax) and maximal expiratory pressure (EPmax) were conducted using the manometer and SP and acoustic parameters through the Multi-Dimensional Voice Program Advanced (KayPENTAX, Montvale, New Jersey). There were significant differences (P < 0.05) in the EPmax of children with AC between the 25th and 75th percentiles (72.4) and those less than or equal to the 25th percentile (61.9) and in the SP of those greater than or equal to the 75th percentile (73.4) and less than or equal to the 25th percentile (66.6). The IPmax, EPmax, SP levels, and acoustic variables were not different in relation to the nutritional states of the children. There was a strong and positive correlation between the coefficient of amplitude perturbations (shimmer), the harmonics-to-noise ratio and the variation of the fundamental frequency, respectively, 0.79 and 0.71. RMS and acoustic voice characteristics in children do not appear to be influenced by nutritional states, and respiratory pressure does not interfere with acoustic voice characteristics. However, localized fat, represented by the AC, alters the EPmax and the SP, each of which increases as the AC increases. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
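
Two of the acoustic parameters named above, shimmer and variation of the fundamental frequency, can be sketched with common textbook definitions. These are assumptions for illustration: the Multi-Dimensional Voice Program's exact algorithms may differ, the function names are hypothetical, and the sample values are invented rather than taken from the study.

```python
import statistics


def shimmer_percent(peak_amplitudes):
    """Mean absolute cycle-to-cycle amplitude difference, relative to mean amplitude."""
    differences = [
        abs(later - earlier)
        for earlier, later in zip(peak_amplitudes, peak_amplitudes[1:])
    ]
    return 100.0 * statistics.mean(differences) / statistics.mean(peak_amplitudes)


def f0_variation_percent(f0_values):
    """Standard deviation of F0 relative to mean F0 (coefficient of variation)."""
    return 100.0 * statistics.pstdev(f0_values) / statistics.mean(f0_values)


# Invented per-cycle measurements for one sustained vowel.
amplitudes = [1.0, 1.1, 0.9, 1.0]
f0_hz = [200.0, 202.0, 198.0, 200.0]

shimmer = shimmer_percent(amplitudes)
f0_variation = f0_variation_percent(f0_hz)
print(f"shimmer = {shimmer:.2f}%, F0 variation = {f0_variation:.2f}%")
```

Both measures are ratios to the mean, so they can be compared across children whose overall loudness or pitch differs.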
A report on alterations to the speaking and singing voices of four women following hormonal therapy with virilizing agents.

PubMed

Baker, J

1999-12-01

Four women aged between 27 and 58 years sought otolaryngological examination due to significant alterations to their voices, the primary concerns being hoarseness in vocal quality, lowering of habitual pitch, difficulty projecting their speaking voices, and loss of control over their singing voices. Otolaryngological examination with a mirror or flexible laryngoscope revealed no apparent abnormality of vocal fold structure or function, and the women were referred for speech pathology with diagnoses of functional dysphonia. Objective acoustic measures using the Kay Visipitch indicated significant lowering of the mean fundamental frequency for each woman, and perceptual analysis of the patients' voices during quiet speaking, projected voice use, and comprehensive singing activities revealed a constellation of features typically noted in the pubescent male. The original diagnoses of a functional dysphonia were queried, prompting further exploration of each woman's medical history, revealing in each case onset of vocal symptoms shortly after commencing treatment for conditions with medications containing virilizing agents (eg, Danocrine (danazol), Deca-Durabolin (nandrolone decanoate), and testosterone). Although some of the vocal symptoms decreased in severity with the influences from 6 months voice therapy and after withdrawal from the drugs, a number of symptoms remained permanent, suggesting each subject had suffered significant alterations in vocal physiology, including muscle tissue changes, muscle coordination dysfunction, and proprioceptive dysfunction. This retrospective study is presented in order to illustrate that it was both the projected speaking voice and the singing voice that proved so highly sensitive to the virilization effects. The implications for future prospective research studies and responsible clinical practice are discussed.
Influence of consonant voicing characteristics on sentence production in abductor versus adductor spasmodic dysphonia.

PubMed

Cannito, Michael P; Chorna, Lesya B; Kahane, Joel C; Dworkin, James P

2014-05-01

This study evaluated the hypotheses that sentence production by speakers with adductor (AD) and abductor (AB) spasmodic dysphonia (SD) may be differentially influenced by consonant voicing and manner features, in comparison with healthy, matched, nondysphonic controls. This was a prospective, single blind study, using a between-groups, repeated measures design for the independent variables of perceived voice quality and sentence duration. Sixteen subjects with ADSD and 10 subjects with ABSD, as well as 26 matched healthy controls produced four short, simple sentences that were systematically loaded with voiced or voiceless consonants of either obstruent or continuant manner categories. Experienced voice clinicians, who were "blind" as to speakers' group affiliations, used visual analog scaling to judge the overall voice quality of each sentence. Acoustic sentence durations were also measured. Speakers with ABSD or ADSD demonstrated significantly poorer than normal voice quality on all sentences. Speakers with ABSD exhibited longer than normal duration for voiceless consonant sentences. Speakers with ADSD had poorer voice quality for voiced than for voiceless consonant sentences. Speakers with ABSD had longer durations for voiceless than for voiced consonant sentences. The two subtypes of SD exhibit differential performance on the basis of consonant voicing in short, simple sentences; however, each subgroup manifested voicing-related differences on a different variable (voice quality vs sentence duration). Findings suggest different underlying pathophysiological mechanisms for ABSD and ADSD. Findings also support inclusion of short, simple sentences containing voiced or voiceless consonants as part of the diagnostic protocol for SD, with measurement of sentence duration in addition to judgments of voice quality severity. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
The Human Voice in Speech and Singing

NASA Astrophysics Data System (ADS)

Lindblom, Björn; Sundberg, Johan

This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this air stream to an intermittent or pulsating air stream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the air stream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter, how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account of how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing are described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7. The impressive control
Using Ambulatory Voice Monitoring to Investigate Common Voice Disorders: Research Update

PubMed Central

Mehta, Daryush D.; Van Stan, Jarrad H.; Zañartu, Matías; Ghassemi, Marzyeh; Guttag, John V.; Espinoza, Víctor M.; Cortés, Juan P.; Cheyne, Harold A.; Hillman, Robert E.

2015-01-01

Many common voice disorders are chronic or recurring conditions that are likely to result from inefficient and/or abusive patterns of vocal behavior, referred to as vocal hyperfunction. The clinical management of hyperfunctional voice disorders would be greatly enhanced by the ability to monitor and quantify detrimental vocal behaviors during an individual’s activities of daily life. This paper provides an update on ongoing work that uses a miniature accelerometer on the neck surface below the larynx to collect a large set of ambulatory data on patients with hyperfunctional voice disorders (before and after treatment) and matched-control subjects. Three types of analysis approaches are being employed in an effort to identify the best set of measures for differentiating among hyperfunctional and normal patterns of vocal behavior: (1) ambulatory measures of voice use that include vocal dose and voice quality correlates, (2) aerodynamic measures based on glottal airflow estimates extracted from the accelerometer signal using subject-specific vocal system models, and (3) classification based on machine learning and pattern recognition approaches that have been used successfully in analyzing long-term recordings of other physiological signals. Preliminary results demonstrate the potential for ambulatory voice monitoring to improve the diagnosis and treatment of common hyperfunctional voice disorders. PMID:26528472
Acoustic Properties of the Voice Source and the Vocal Tract: Are They Perceptually Independent?

PubMed

Erickson, Molly L

2016-11-01

This study sought to determine whether the properties of the voice source and vocal tract are perceptually independent. Within-subjects design. This study employed a paired-comparison paradigm where listeners heard synthetic voices and rated them as same or different using a visual analog scale. Stimuli were synthesized using three different source slopes and two different formant patterns (mezzo-soprano and soprano) on the vowel /a/ at four pitches: A3, C4, B4, and F5. Whereas formant pattern was the strongest effect, difference in source slope also affected perceived quality difference. Source slope and formant pattern were not independently perceived. These results suggest that when judging laryngeal adduction using perceptual information, judgments may not be accurate when the stimuli are of differing formant patterns. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic characteristics of phonation in "wet voice" conditions.

PubMed

Murugappan, Shanmugam; Boyce, Suzanne; Khosla, Sid; Kelchner, Lisa; Gutmark, Ephraim

2010-04-01

A perceptible change in phonation characteristics after a swallow has long been considered evidence that food and/or drink material has entered the laryngeal vestibule and is on the surface of the vocal folds as they vibrate. The current paper investigates the acoustic characteristics of phonation when liquid material is present on the vocal folds, using ex vivo porcine larynges as a model. Consistent with instrumental examinations of swallowing disorders or dysphagia in humans, three liquids of different Varibar viscosity ("thin liquid," "nectar," and "honey") were studied at constant volume. The presence of materials on the folds during phonation was generally found to suppress the higher frequency harmonics and generate intermittent additional frequencies in the low and high end of the acoustic spectrum. Perturbation measures showed a higher percentage of jitter and shimmer when liquid material was present on the folds during phonation, but they were unable to differentiate statistically between the three fluid conditions. The finite correlation dimension and positive Lyapunov exponent measures indicated that the presence of materials on the vocal folds excited a chaotic system. Further, these measures were able to reliably differentiate between the baseline and different types of liquid on the vocal folds.
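As a rough illustration of the perturbation measures mentioned above, the following sketch computes "local" jitter and shimmer as the mean absolute cycle-to-cycle difference relative to the mean, in percent. This is one common definition and is not necessarily the one used in the study; the function name and the sample period and amplitude values are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles,
    divided by the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical cycle-by-cycle measurements (not data from the study).
periods_s = [0.0100, 0.0102, 0.0100, 0.0102]   # glottal cycle durations
amplitudes = [1.00, 0.95, 1.00, 0.95]          # peak amplitude per cycle

jitter_pct = local_perturbation_percent(periods_s)    # period perturbation
shimmer_pct = local_perturbation_percent(amplitudes)  # amplitude perturbation
print(f"jitter = {jitter_pct:.2f} %, shimmer = {shimmer_pct:.2f} %")
```

Applying the same function to periods gives jitter and to amplitudes gives shimmer, since both are relative cycle-to-cycle variability measures.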
Acoustic analysis of warp potential of green ponderosa pine lumber

Treesearch

Xiping Wang; William T. Simpson

2005-01-01

This study evaluated the potential of acoustic analysis as presorting criteria to identify warp-prone boards before kiln drying. Dimension lumber, 38 by 89 mm (nominal 2 by 4 in.) and 2.44 m (8 ft) long, sawn from open-grown small-diameter ponderosa pine trees, was acoustically tested lengthwise at green condition. Three acoustic properties (acoustic speed, rate of...
Teachers' voice use in teaching environments: a field study using ambulatory phonation monitor.

PubMed

Lyberg Åhlander, Viveka; Pelegrín García, David; Whitling, Susanna; Rydell, Roland; Löfqvist, Anders

2014-11-01

This case-control designed field study examines the vocal behavior in teachers with self-estimated voice problems (VP) and their age- and school-matched voice healthy (VH) colleagues. It was hypothesized that teachers with and teachers without VP use their voices differently regarding fundamental frequency, sound pressure level (SPL), and in relation to the background noise. Teachers with self-estimated VP (n = 14; two males and 12 females) were age and gender matched to VH school colleagues (n = 14; two males and 12 females). The subjects, recruited from an earlier study, had been examined in laryngeal, vocal, hearing, and psychosocial aspects. The fundamental frequency, SPL, and phonation time were recorded with an Ambulatory Phonation Monitor during one representative workday. The teachers reported their activities in a structured diary. The SPL (including teachers' and students' activity and ambient noise) was recorded with a sound level meter; the room temperature and air quality were measured simultaneously. The acoustic properties of the empty classrooms were measured. Teachers with VP behaved vocally different from their VH peers, in particular during teaching sessions. The phonation time was significantly higher in the group with VP, and the number of vibratory cycles differed between the female teachers. The F0 pattern, related to the vocal SPL and room acoustics, differed between the groups. The results suggest a different vocal behavior in subjects with subjective VP and a higher vocal load with fewer possibilities for vocal recovery. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
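The phonation time and number of vibratory cycles reported above are often described as vocal doses. A minimal sketch, assuming the monitor delivers fixed-length frames each carrying an F0 estimate and a voiced/unvoiced flag (the frame format, frame length, and function name are assumptions, not the Ambulatory Phonation Monitor's actual output format):

```python
def vocal_doses(frames, frame_s):
    """Return (time dose in seconds, cycle dose in vibratory cycles).

    frames  -- iterable of (f0_hz, voiced) pairs, one per analysis frame
    frame_s -- frame duration in seconds
    """
    time_dose = 0.0
    cycle_dose = 0.0
    for f0_hz, voiced in frames:
        if voiced:
            time_dose += frame_s           # accumulated phonation time
            cycle_dose += f0_hz * frame_s  # cycles completed in this frame
    return time_dose, cycle_dose


# Hypothetical frames: two voiced, one unvoiced, 50 ms each.
frames = [(200.0, True), (0.0, False), (220.0, True)]
time_dose, cycle_dose = vocal_doses(frames, frame_s=0.05)
print(time_dose, cycle_dose)
```

Dividing the time dose by the total monitored time would give the phonation time as a percentage of the workday.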
Acoustic evolution of old Italian violins from Amati to Stradivari.

PubMed

Tai, Hwan-Ching; Shen, Yen-Ping; Lin, Jer-Horng; Chung, Dai-Ting

2018-06-05

The shape and design of the modern violin are largely influenced by two makers from Cremona, Italy: The instrument was invented by Andrea Amati and then improved by Antonio Stradivari. Although the construction methods of Amati and Stradivari have been carefully examined, the underlying acoustic qualities which contribute to their popularity are little understood. According to Geminiani, a Baroque violinist, the ideal violin tone should "rival the most perfect human voice." To investigate whether Amati and Stradivari violins produce voice-like features, we recorded the scales of 15 antique Italian violins as well as male and female singers. The frequency response curves are similar between the Andrea Amati violin and human singers, up to ∼4.2 kHz. By linear predictive coding analyses, the first two formants of the Amati exhibit vowel-like qualities (F1/F2 = 503/1,583 Hz), mapping to the central region on the vowel diagram. Its third and fourth formants (F3/F4 = 2,602/3,731 Hz) resemble those produced by male singers. Using F1 to F4 values to estimate the corresponding vocal tract length, we observed that antique Italian violins generally resemble basses/baritones, but Stradivari violins are closer to tenors/altos. Furthermore, the vowel qualities of Stradivari violins show reduced backness and height. The unique formant properties displayed by Stradivari violins may represent the acoustic correlate of their distinctive brilliance perceived by musicians. Our data demonstrate that the pioneering designs of Cremonese violins exhibit voice-like qualities in their acoustic output. Copyright © 2018 the Author(s). Published by PNAS.
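The abstract does not say how vocal tract length was estimated from F1 to F4. One simple textbook approach, sketched here as an assumption, treats the tract as a uniform tube closed at one end, whose n-th resonance is F_n = (2n - 1)c / (4L), and averages the length implied by each formant. The speed of sound (35,000 cm/s) and the function name are illustrative choices.

```python
def vocal_tract_length_cm(formants_hz, speed_of_sound_cm_s=35000.0):
    """Average tube length implied by each formant under the
    quarter-wavelength model F_n = (2n - 1) * c / (4 * L)."""
    estimates = [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


# Formant values reported in the abstract for the Andrea Amati violin.
amati_formants_hz = [503.0, 1583.0, 2602.0, 3731.0]
amati_length_cm = vocal_tract_length_cm(amati_formants_hz)
print(f"estimated length: {amati_length_cm:.1f} cm")
```

Under this model, lower formant frequencies imply a longer tube, which is the sense in which an instrument can be said to resemble a lower or higher voice type.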
Connections between voice ergonomic risk factors and voice symptoms, voice handicap, and respiratory tract diseases.

PubMed

Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva

2012-11-01

The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of a Voice Ergonomic Assessment in Work Environment--Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms, respiratory tract diseases, and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms. Poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied and voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and those responsible for school planning and taking care of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Benefits of the fiber optic versus the electret microphone in voice amplification.

PubMed

Kyriakou, Kyriaki; Fisher, Hélène R

2013-01-01

Voice disorders that result in reduced loudness may cause difficulty in communicating, socializing and participating in occupational activities. Amplification is often recommended in order to facilitate functional communication, reduce vocal load and avoid developing maladaptive compensatory behaviours. The most common microphone used with amplification systems is the electret microphone. One alternate form of microphone is the fiber optic microphone. To examine the benefits of the fiber optic (1190S) versus the electret (M04) microphone as measured by objective and subjective parameters in the amplification of a patient's voice with reduced loudness caused by neurological and/or respiratory-based problems. Eighteen patients with vocal fold paralysis, Parkinson's disease and/or chronic obstructive pulmonary disease (COPD) participated in the study. The study contained a measurement of intensity, amplitude perturbation and signal-to-noise ratio during a sustained vowel production and a measurement of intensity during conversation with the use of the two microphones simultaneously. It also included the completion of a questionnaire indicating the patient's satisfaction with each microphone. The fiber optic (1190S) microphone had better objective acoustic performance (i.e. lower amplitude perturbation, higher signal-to-noise ratio and higher intensity) than the electret (M04) microphone. It also had better patient subjective satisfaction (i.e. less conspicuousness, more voice clarity, less acoustic feedback, more loudness and more utilization) than the electret microphone. Patients with neurological and/or respiratory-based voice problems may more confidently and frequently use the fiber optic microphone to communicate, socialize and participate in occupational activities more easily. Speech-language pathologists may more confidently use or recommend the fiber optic microphone with amplification systems. © 2012 Royal College of Speech and Language Therapists.
The pattern of educator voice in clinical counseling in an educational hospital in Shiraz, Iran: a conversation analysis

PubMed Central

Kalateh Sadati, Ahmad; Bagheri Lankarani, Kamran

2017-01-01

Doctor-patient interaction (DPI) includes different voices, of which the educator voice is of considerable importance. Physicians employ this voice to educate patients and their caregivers by providing them with information in order to change the patients’ behavior and improve their health status. The subject has not yet been fully understood, and therefore the present study was conducted to explore the pattern of educator voice. For this purpose, conversation analysis (CA) of 33 recorded clinical consultations was performed in outpatient educational clinics in Shiraz, Iran between April 2014 and September 2014. In this qualitative study, all utterances, repetitions, lexical forms, chuckles and speech particles were considered and interpreted as social actions. Interpretations were based on inductive data-driven analysis with the aim to find recurring patterns of educator voice. The results showed educator voice to have two general features: descriptive and prescriptive. However, the pattern of educator voice comprised characteristics such as superficiality, marginalization of patients, one-dimensional approach, ignoring a healthy lifestyle, and robotic nature. The findings of this study clearly demonstrated a deficiency in the educator voice and inadequacy in patient-centered dialogue. In this setting, the educator voice was related to a distortion of DPI through the physicians’ dominance, leading them to ignore their professional obligation to educate patients. Therefore, policies in this regard should take more account of enriching the educator voice through training medical students and faculty members in communication skills. PMID:29296258
Neural effects of environmental advertising: An fMRI analysis of voice age and temporal framing.

PubMed

Casado-Aranda, Luis-Alberto; Martínez-Fiestas, Myriam; Sánchez-Fernández, Juan

2018-01-15

Ecological information offered to society through advertising enhances awareness of environmental issues, encourages development of sustainable attitudes and intentions, and can even alter behavior. This paper, by means of functional Magnetic Resonance Imaging (fMRI) and self-reports, explores the underlying mechanisms of processing ecological messages. The study specifically examines brain and behavioral responses to persuasive ecological messages that differ in temporal framing and in the age of the voice pronouncing them. The findings reveal that attitudes are more positive toward future-framed messages presented by young voices. The whole-brain analysis reveals that future-framed (FF) ecological messages trigger activation in brain areas related to imagery, prospective memories and episodic events, thus reflecting the involvement of past behaviors in future ecological actions. Past-framed messages (PF), in turn, elicit brain activations within the episodic system. Young voices (YV), in addition to triggering stronger activation in areas involved with the processing of high-timbre, high-pitched and high-intensity voices, are perceived as more emotional and motivational than old voices (OV) as activations in anterior cingulate cortex and amygdala. Messages expressed by older voices, in turn, exhibit stronger activation in areas formerly linked to low-pitched voices and voice gender perception. Interestingly, a link is identified between neural and self-report responses indicating that certain brain activations in response to future-framed messages and young voices predicted higher attitudes toward future-framed and young voice advertisements, respectively. The results of this study provide invaluable insight into the unconscious origin of attitudes toward environmental messages and indicate which voice and temporal frame of a message generate the greatest subconscious value. Copyright © 2017 Elsevier Ltd. All rights reserved.
Technical Aspects of Acoustical Engineering for the ISS [International Space Station]

NASA Technical Reports Server (NTRS)

Allen, Christopher S.

2009-01-01

It is important to control acoustic levels on manned space flight vehicles and habitats to protect crew hearing, allow for voice communications, and ensure a healthy and habitable environment in which to work and live. For the International Space Station (ISS) this is critical because of the long-duration crew stays of approximately 6 months. NASA and the JSC Acoustics Office set acoustic requirements that must be met for hardware to be certified for flight. Modules must meet the NC-50 requirement, and other component hardware is given smaller allocations to meet. In order to meet these requirements many aspects of noise generation and control must be considered. This presentation has been developed to give an insight into the various technical activities performed at JSC to ensure that a suitable acoustic environment is provided for the ISS crew. Examples discussed include fan noise, acoustic flight material development, on-orbit acoustic monitoring, and a specific hardware development and acoustical design case, the ISS Crew Quarters.
Evaluation of vocal acoustic and efficiency analysis parameters in medical students and academic teachers with use of IRIS and DiagnoScope Specialist software.

PubMed

Zielińska-Bliźniewska, Hanna; Sułkowski, Wiesław J; Pietkiewicz, Piotr; Miłoński, Jarosław; Mazurek, Agnieszka; Olszewski, Jurek

2012-06-01

The aim of this study was to compare the parameters of vocal acoustic and vocal efficiency analyses in medical students and academic teachers with use of the IRIS and DiagnoScope Specialist software and to evaluate their usefulness in prevention and certification of occupational disease. The study group comprised 40 women, including students and employees of the Military Medical Faculty, Medical University of Łódź. After informed consent had been obtained from the participating women, the primary medical history was taken, videolaryngoscopic and stroboscopic examinations were performed and diagnostic vocal acoustic analysis was carried out with the use of the IRIS and DiagnoScope Specialist software. Based on the results of the performed measurements, the statistical analysis evidenced the compatibility between two software programs, IRIS and DiagnoScope Specialist, with the only exception of the F4 formant. The mean values of vocal acoustic parameters in medical students and academic teachers, obtained by means of the IRIS software, can be used as standards for the female population not yet developed by the producer. When using the DiagnoScope Specialist software, some mean values were higher and some lower than the standards specified by the producer. The study evidenced the compatibility between two measurement software programs, IRIS and DiagnoScope Specialist, except for the F4 formant. It should be noted that the latter has an advantage over the former since the standard values of vocal acoustic parameters have been worked out by the producer. Moreover, they only slightly departed from the values obtained in our study and may be useful in diagnostics of occupational voice disorders.
Effects of Intensive Voice Treatment (the Lee Silverman Voice Treatment [LSVT]) on Vowel Articulation in Dysarthric Individuals with Idiopathic Parkinson Disease: Acoustic and Perceptual Findings

ERIC Educational Resources Information Center

Sapir, Shimon; Spielman, Jennifer L.; Ramig, Lorraine O.; Story, Brad H.; Fox, Cynthia

2007-01-01

Purpose: To evaluate the effects of intensive voice treatment targeting vocal loudness (the Lee Silverman Voice Treatment [LSVT]) on vowel articulation in dysarthric individuals with idiopathic Parkinson's disease (PD). Method: A group of individuals with PD receiving LSVT (n = 14) was compared to a group of individuals with PD not receiving LSVT…
Prospective clinical study on long-term swallowing function and voice quality in advanced head and neck cancer patients treated with concurrent chemoradiotherapy and preventive swallowing exercises.

PubMed

Kraaijenga, Sophie A C; van der Molen, Lisette; Jacobi, Irene; Hamming-Vrieze, Olga; Hilgers, Frans J M; van den Brekel, Michiel W M

2015-11-01

Concurrent chemoradiotherapy (CCRT) for advanced head and neck cancer (HNC) is associated with substantial early and late side effects, most notably regarding swallowing function, but also regarding voice quality and quality of life (QoL). Despite increased awareness/knowledge on acute dysphagia in HNC survivors, long-term (i.e., beyond 5 years) prospectively collected data on objective and subjective treatment-induced functional outcomes (and their impact on QoL) still are scarce. The objective of this study was the assessment of long-term CCRT-induced results on swallowing function and voice quality in advanced HNC patients. The study was conducted as a randomized controlled trial on preventive swallowing rehabilitation (2006-2008) in a tertiary comprehensive HNC center with twenty-two disease-free and evaluable HNC patients as participants. Multidimensional assessment of functional sequelae was performed with videofluoroscopy, mouth opening measurements, Functional Oral Intake Scale, acoustic voice parameters, and (study specific, SWAL-QoL, and VHI) questionnaires. Outcome measures at 6 years post-treatment were compared with results at baseline and at 2 years post-treatment. At a mean follow-up of 6.1 years most initial tumor- and treatment-related problems remained similarly low to those observed after 2 years follow-up, except increased xerostomia (68%) and increased (mild) pain (32%). Acoustic voice analysis showed less voicedness, increased fundamental frequency, and more vocal effort for the tumors located below the hyoid bone (n = 12), without recovery to baseline values. Patients' subjective vocal function (VHI score) was good. Functional swallowing and voice problems at 6 years post-treatment are minimal in this patient cohort, originating from preventive and continued post-treatment rehabilitation programs.
In vitro experimental investigation of voice production

PubMed Central

Horáček, Jaromír; Brücker, Christoph; Becker, Stefan

2012-01-01

The process of human phonation involves a complex interaction between the physical domains of structural dynamics, fluid flow, and acoustic sound production and radiation. Given the high degree of nonlinearity of these processes, even small anatomical or physiological disturbances can significantly affect the voice signal. In the worst cases, patients can lose their voice and hence the normal mode of speech communication. To improve medical therapies and surgical techniques it is very important to understand better the physics of the human phonation process. Due to the limited experimental access to the human larynx, alternative strategies, including artificial vocal folds, have been developed. The following review gives an overview of experimental investigations of artificial vocal folds within the last 30 years. The models are sorted into three groups: static models, externally driven models, and self-oscillating models. The focus is on the different models of the human vocal folds and on the ways in which they have been applied. PMID:23181007
Acoustic analysis in Mudejar-Gothic churches: Experimental results

NASA Astrophysics Data System (ADS)

Galindo, Miguel; Zamarreño, Teófilo; Girón, Sara

2005-05-01

This paper describes the preliminary results of research work in acoustics, conducted in a set of 12 Mudejar-Gothic churches in the city of Seville in the south of Spain. Despite common architectural style, the churches feature individual characteristics and have volumes ranging from 3947 to 10 708 m³. Acoustic parameters were measured in unoccupied churches according to the ISO-3382 standard. An extensive experimental study was carried out using impulse response analysis through a maximum length sequence measurement system in each church. It covered aspects such as reverberation (reverberation times, early decay times), distribution of sound levels (sound strength); early to late sound energy parameters derived from the impulse responses (center time, clarity for speech, clarity, definition, lateral energy fraction), and speech intelligibility (rapid speech transmission index), which all take both spectral and spatial distribution into account. Background noise was also measured to obtain the NR indices. The study describes the acoustic field inside each temple and establishes a discussion for each one of the acoustic descriptors mentioned by using the theoretical models available and the principles of architectural acoustics. Analysis of the quality of the spaces for music and speech is carried out according to the most widespread criteria for auditoria.
Acoustic analysis in Mudejar-Gothic churches: experimental results.

PubMed

Galindo, Miguel; Zamarreño, Teófilo; Girón, Sara

2005-05-01

This paper describes the preliminary results of research work in acoustics, conducted in a set of 12 Mudejar-Gothic churches in the city of Seville in the south of Spain. Despite common architectural style, the churches feature individual characteristics and have volumes ranging from 3947 to 10 708 m³. Acoustic parameters were measured in unoccupied churches according to the ISO-3382 standard. An extensive experimental study was carried out using impulse response analysis through a maximum length sequence measurement system in each church. It covered aspects such as reverberation (reverberation times, early decay times), distribution of sound levels (sound strength); early to late sound energy parameters derived from the impulse responses (center time, clarity for speech, clarity, definition, lateral energy fraction), and speech intelligibility (rapid speech transmission index), which all take both spectral and spatial distribution into account. Background noise was also measured to obtain the NR indices. The study describes the acoustic field inside each temple and establishes a discussion for each one of the acoustic descriptors mentioned by using the theoretical models available and the principles of architectural acoustics. Analysis of the quality of the spaces for music and speech is carried out according to the most widespread criteria for auditoria.
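The early-to-late energy parameters named in this abstract (clarity, definition, center time) have standard definitions based on the squared impulse response. The following is a minimal sketch of those definitions, assuming a single-channel impulse response `h` sampled at `fs` Hz that begins at the arrival of the direct sound. The function name `energy_parameters` and the synthetic exponential decay are illustrative assumptions, not the measurement chain used in the study.

```python
import math

def energy_parameters(h, fs):
    """Return C50, C80 (dB), D50 (ratio) and center time Ts (s) for impulse response h."""
    energy = [x * x for x in h]          # squared pressure, proportional to energy
    total = sum(energy)

    def early_late(limit_ms):
        k = int(round(limit_ms * fs / 1000.0))   # sample index of the time limit
        early = sum(energy[:k])
        return early, total - early

    e50, l50 = early_late(50)
    e80, l80 = early_late(80)
    return {
        "C50_dB": 10.0 * math.log10(e50 / l50),  # clarity for speech
        "C80_dB": 10.0 * math.log10(e80 / l80),  # clarity for music
        "D50": e50 / total,                      # definition
        "Ts_s": sum(i / fs * e for i, e in enumerate(energy)) / total,  # center time
    }

# Hypothetical test signal: energy decays as exp(-t / 0.1 s), 3 s long, fs = 1 kHz.
fs = 1000
h = [math.exp(-n / (2.0 * fs * 0.1)) for n in range(3 * fs)]
params = energy_parameters(h, fs)
```

For a purely exponential energy decay with time constant 0.1 s, D50 should be close to 1 - exp(-0.5) and Ts close to 0.1 s, which gives a quick sanity check before applying the function to measured responses.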
Characterizing, synthesizing, and/or canceling out acoustic signals from sound sources

DOEpatents

Holzrichter, John F [Berkeley, CA]; Ng, Lawrence C [Danville, CA]

2007-03-13

A system for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate and animate sound sources. Electromagnetic sensors monitor excitation sources in sound producing systems, such as animate sound sources such as the human voice, or from machines, musical instruments, and various other structures. Acoustical output from these sound producing systems is also monitored. From such information, a transfer function characterizing the sound producing system is generated. From the transfer function, acoustical output from the sound producing system may be synthesized or canceled. The systems disclosed enable accurate calculation of transfer functions relating specific excitations to specific acoustical outputs. Knowledge of such signals and functions can be used to effect various sound replication, sound source identification, and sound cancellation applications.

Injection laryngoplasty as miniinvasive office-based surgery in patients with unilateral vocal fold paralysis - voice quality outcomes.

PubMed

Sielska-Badurek, Ewelina M; Sobol, Maria; Jędra, Katarzyna; Rzepakowska, Anna; Osuch-Wójcikiewicz, Ewa; Niemczyk, Kazimierz

2017-09-01

Injection laryngoplasty (glottis augmentation) is the preferred method in surgical management of unilateral vocal fold paralysis (UVFP). Traditionally, these procedures are performed in the operating room. Nowadays, however, these procedures have moved into the office. The aim of this study was to evaluate the voice quality after transoral injection laryngoplasty under local anaesthesia in patients with unilateral vocal fold paralysis. Fourteen subjects (5 women and 9 men) with unilateral vocal fold paresis (9 with right vocal fold paresis and 5 with left vocal fold paresis) were included in the study. The mean age of the group was 57.8 ± 19.0 years (32-83 years). All of the injection laryngoplasties were performed transorally, under local anaesthesia. The injection material was calcium hydroxylapatite. Before and 1, 3 and 6 months after the procedure the following variables were evaluated: voice perception, videostroboscopy, acoustic analysis, aerodynamic evaluation, and the subjective rating of the voice quality by the patient. After injection laryngoplasty, complete glottal closure was achieved or there was a significant improvement in the glottal closure of each subject. We noted great improvement in the post-injection objective and subjective voice outcomes and patients reported improvement in the voice-related quality of life. The transoral approach for injection laryngoplasty under local anaesthesia is an effective and safe way to treat incomplete glottal closure in patients with UVFP. The transoral approach is an efficient alternative to other surgical techniques used for vocal fold injection.
Perturbation and Nonlinear Dynamic Analysis of Different Singing Styles

PubMed Central

Butte, Caitlin J.; Zhang, Yu; Song, Huangqiang; Jiang, Jack J.

2012-01-01

Summary Previous research has used perturbation analysis methods to study the singing voice. Using perturbation and nonlinear dynamic analysis (NDA) methods in conjunction may provide more accurate information on the singing voice and may distinguish vocal usage in different styles. Acoustic samples from different styles of singing were compared using nonlinear dynamic and perturbation measures. Twenty-six songs from different musical styles were obtained from an online music database (Rhapsody, RealNetworks, Inc., Seattle, WA). One-second samples were selected from each song for analysis. Perturbation analyses of jitter, shimmer, and signal-to-noise ratio and NDA of correlation dimension (D2) were performed on samples from each singing style. Percent jitter and shimmer median values were low normal for country (0.32% and 3.82%), musical theater (MT) (0.280% and 2.80%), jazz (0.440% and 2.34%), and soul (0.430% and 6.42%). The popular style had slightly higher median jitter and shimmer values (1.13% and 6.78%) than other singing styles, although this was not statistically significant. The opera singing style had median jitter of 0.520%, and yielded significantly high shimmer (P = 0.001) of 7.72%. All six singing styles were measured reliably using NDA, indicating that operatic singing is notably more chaotic than other singing styles. Median correlation dimension values were low to normal, compared to healthy voices, in country (median D2 = 2.14), jazz (median D2 = 2.24), pop (median D2 = 2.60), MT (median D2 = 2.73), and soul (mean D2 = 3.26). Correlation dimension was significantly higher in opera (P < 0.001) with median D2 = 6.19. In this study, acoustic analysis in opera singing gave significantly high values for shimmer and D2, suggesting that it is more irregular than other singing styles; a previously unknown quality of opera singing. Perturbation analysis also suggested significant differences in vocal output in different singing styles. This preliminary
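The percent jitter and percent shimmer values quoted in this abstract are cycle-to-cycle perturbation measures. A minimal sketch of the common "local" definition follows, assuming the glottal cycle periods and per-cycle peak amplitudes have already been extracted. The function name and the sample values are hypothetical, and the abstract does not state which variant of the measures the authors used.

```python
from statistics import mean

def percent_perturbation(values):
    """Mean absolute difference between consecutive cycles, as a percentage of the mean value.

    Applied to cycle periods this gives local jitter (%);
    applied to cycle peak amplitudes it gives local shimmer (%).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)

# Hypothetical cycle data (not taken from the study).
periods_ms = [5.00, 5.02, 4.98, 5.01, 4.99]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
jitter_percent = percent_perturbation(periods_ms)
shimmer_percent = percent_perturbation(amplitudes)
```

Using one helper for both measures reflects that jitter and shimmer share the same arithmetic and differ only in the quantity tracked per cycle.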
Acoustic Emission Analysis Applet (AEAA) Software

NASA Technical Reports Server (NTRS)

Nichols, Charles T.; Roth, Don J.

2013-01-01

NASA Glenn Research and NASA White Sands Test Facility have developed software supporting an automated pressure vessel structural health monitoring (SHM) system based on acoustic emissions (AE). The software, referred to as the Acoustic Emission Analysis Applet (AEAA), provides analysts with a tool that can interrogate data collected on Digital Wave Corp. and Physical Acoustics Corp. software using a wide spectrum of powerful filters and charts. This software can be made to work with any data once the data format is known. The applet will compute basic AE statistics, and statistics as a function of time and pressure (see figure). AEAA provides value added beyond the analysis provided by the respective vendors' analysis software. The software can handle data sets of unlimited size. A wide variety of government and commercial applications could benefit from this technology, notably requalification and usage tests for compressed gas and hydrogen-fueled vehicles. Future enhancements will add features similar to a "check engine" light on a vehicle. Once installed, the system will ultimately be used to alert International Space Station crewmembers to critical structural instabilities, but will have little impact to missions otherwise. Diagnostic information could then be transmitted to experienced technicians on the ground in a timely manner to determine whether pressure vessels have been impacted, are structurally unsound, or can be safely used to complete the mission.
Using Rate of Divergence as an Objective Measure to Differentiate between Voice Signal Types Based on the Amount of Disorder in the Signal.

PubMed

Calawerts, William M; Lin, Liyu; Sprott, J C; Jiang, Jack J

2017-01-01

The purpose of this paper is to introduce the rate of divergence as an objective measure to differentiate between the four voice types based on the amount of disorder present in a signal. We hypothesized that rate of divergence would provide an objective measure that can quantify all four voice types. A total of 150 acoustic voice recordings were randomly selected and analyzed using traditional perturbation, nonlinear, and rate of divergence analysis methods. We developed a new parameter, rate of divergence, which uses a modified version of Wolf's algorithm for calculating Lyapunov exponents of a system. The outcome of this calculation is not a Lyapunov exponent, but rather a description of the divergence of two nearby data points for the next three points in the time series, followed in three time-delayed embedding dimensions. This measure was compared to currently existing perturbation and nonlinear dynamic methods of distinguishing between voice signals. There was a direct relationship between voice type and rate of divergence. This calculation is especially effective at differentiating between type 3 and type 4 voices (P < 0.001) and is equally effective at differentiating type 1, type 2, and type 3 signals as currently existing methods. The rate of divergence calculation introduced is an objective measure that can be used to distinguish between all four voice types based on the amount of disorder present, leading to quicker and more accurate voice typing as well as an improved understanding of the nonlinear dynamics involved in phonation. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
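The abstract describes following two nearby points of a time-delay embedding for three steps in three dimensions. The sketch below illustrates that general idea only: it embeds a scalar series, pairs each point with its nearest neighbour, and averages the logarithmic growth of their separation over three steps. It is not the authors' modified Wolf algorithm. The function names, the `min_sep` exclusion window, and the two test signals are assumptions made for illustration.

```python
import math

def embed(series, dim=3, delay=1):
    """Time-delay embedding: point i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]

def divergence_rate(series, dim=3, delay=1, steps=3, min_sep=5):
    """Average log growth per step of the distance between nearest-neighbour pairs."""
    pts = embed(series, dim, delay)
    usable = len(pts) - steps            # both points must have `steps` successors
    rates = []
    for i in range(usable):
        best_j, best_d = None, math.inf
        for j in range(usable):
            if abs(i - j) < min_sep:     # skip temporally adjacent points
                continue
            d = math.dist(pts[i], pts[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            continue
        d_end = math.dist(pts[i + steps], pts[best_j + steps])
        if d_end > 0.0:
            rates.append(math.log(d_end / best_d) / steps)
    return sum(rates) / len(rates) if rates else float("nan")

# Illustrative signals: a chaotic logistic map and a regular sine wave.
x, chaotic = 0.3, []
for _ in range(400):
    chaotic.append(x)
    x = 4.0 * x * (1.0 - x)
periodic = [math.sin(0.3 * n) for n in range(400)]

rate_chaotic = divergence_rate(chaotic)
rate_periodic = divergence_rate(periodic)
```

Under these assumptions the chaotic series should yield a clearly larger rate than the regular one, which mirrors the abstract's claim that the measure increases with the amount of disorder in the signal.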
Modification of computational auditory scene analysis (CASA) for noise-robust acoustic feature

NASA Astrophysics Data System (ADS)

Kwon, Minseok

While there have been many attempts to mitigate interferences of background noise, the performance of automatic speech recognition (ASR) still can be deteriorated by various factors with ease. However, normal hearing listeners can accurately perceive sounds of their interests, which is believed to be a result of Auditory Scene Analysis (ASA). As a first attempt, the simulation of the human auditory processing, called computational auditory scene analysis (CASA), was fulfilled through physiological and psychological investigations of ASA. CASA comprised the Zilany-Bruce auditory model, followed by tracking fundamental frequency for voice segmentation and detecting pairs of onset/offset at each characteristic frequency (CF) for unvoiced segmentation. The resulting Time-Frequency (T-F) representation of acoustic stimulation was converted into an acoustic feature, gammachirp-tone frequency cepstral coefficients (GFCC). Eleven keywords with various environmental conditions were used and the robustness of GFCC was evaluated by spectral distance (SD) and dynamic time warping distance (DTW). In "clean" and "noisy" conditions, the application of CASA generally improved noise robustness of the acoustic feature compared to a conventional method with or without noise suppression using MMSE estimator. The initial study, however, not only showed the noise-type dependency at low SNR, but also called the evaluation methods into question. Some modifications were made to capture better spectral continuity from an acoustic feature matrix, to obtain faster processing speed, and to describe the human auditory system more precisely. The proposed framework includes: 1) multi-scale integration to capture more accurate continuity in feature extraction, 2) contrast enhancement (CE) of each CF by competition with neighboring frequency bands, and 3) auditory model modifications. The model modifications contain the introduction of higher Q factor, middle ear filter more analogous to human auditory system
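One of the evaluation measures named in this abstract is the dynamic time warping (DTW) distance between feature sequences. The following is a minimal sketch of the textbook DTW recurrence, assuming scalar frames and an absolute-difference local cost. The function name and the `dist` parameter are illustrative, and the abstract does not specify the local cost or path constraints actually used.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW: minimum cumulative cost of aligning sequence a with sequence b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# The same contour spoken at two rates aligns with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

For multi-dimensional feature frames such as cepstral vectors, a Euclidean distance between frames could be passed as `dist` without changing the recurrence.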
On shame and voice-hearing

PubMed Central

2017-01-01

Hearing voices in the absence of another speaker—what psychiatry terms an auditory verbal hallucination—is often associated with a wide range of negative emotions. Mainstream clinical research addressing the emotional dimensions of voice-hearing has tended to treat these as self-evident, undifferentiated and so effectively interchangeable. But what happens when a richer, more nuanced understanding of specific emotions is brought to bear on the analysis of distressing voices? This article draws findings from the ‘What is it like to hear voices’ study conducted as part of the interdisciplinary Hearing the Voice project into conversation with philosopher Dan Zahavi's Self and Other: Exploring Subjectivity, Empathy and Shame to consider how a focus on shame can open up new questions about the experience of hearing voices. A higher-order emotion of social cognition, shame directs our attention to aspects of voice-hearing which are understudied and elusive, particularly as they concern the status of voices as other and the constitution and conceptualisation of the self. PMID:28389551
The influence of pitch and loudness changes on the acoustics of vocal tremor.

PubMed

Dromey, Christopher; Warrick, Paul; Irish, Jonathan

2002-10-01

The effect of tremor on phonation is to modulate an otherwise steady sound source in its amplitude, fundamental frequency, or both. The severity of untreated vocal tremor has been reported to change under certain conditions that may be related to muscle tension. In order to better understand the phenomenon of vocal tremor, its acoustic properties were examined as individuals volitionally altered their pitch and loudness. These voice conditions were anticipated to alter the tension of the intrinsic laryngeal muscles. The voices of 10 individuals with a diagnosis of vocal tremor were recorded before participating in a longitudinal treatment study. They produced vowels at low and high pitch and loudness levels as well as in a comfortable voice condition. Acoustic analyses quantified the amplitude and frequency modulations of the speakers' voices across the various conditions. Individual speakers varied in the way the pitch and loudness changes affected their tremor, but the following statistically significant effects for the speakers as a group were observed: Higher pitch phonation was associated with a more rapid rate for both amplitude and frequency modulations. Amplitude modulation became faster for louder phonation. Low-pitched phonation led to decreases in the extent of amplitude tremor. Varying pitch led to dramatic changes in the phase relationship between amplitude and frequency modulation in some of the speakers, whereas this effect was not apparent in other speakers.
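The rate of an amplitude or frequency modulation, as discussed in this abstract, can be estimated from a slowly varying contour such as an amplitude envelope or a fundamental frequency track. The sketch below picks the strongest spectral component of the contour within an assumed tremor band using a direct discrete Fourier transform. The function name, the 1 to 15 Hz search band, and the synthetic contour are assumptions for illustration and do not reproduce the authors' analysis.

```python
import cmath
import math

def dominant_modulation_hz(contour, fs, f_min=1.0, f_max=15.0):
    """Frequency (Hz) of the strongest DFT bin of a contour within [f_min, f_max]."""
    n = len(contour)
    avg = sum(contour) / n
    centered = [v - avg for v in contour]       # remove the steady (DC) level
    best_f, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n                          # frequency of DFT bin k
        if not (f_min <= f <= f_max):
            continue
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_f, best_power = f, power
    return best_f

# Hypothetical envelope: 2 s sampled at 100 Hz with a 5 Hz, 20% amplitude modulation.
fs = 100
envelope = [1.0 + 0.2 * math.sin(2 * math.pi * 5.0 * t / fs) for t in range(200)]
tremor_rate = dominant_modulation_hz(envelope, fs)
```

The frequency resolution is `fs / n`, so longer sustained vowels give finer estimates of the modulation rate.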
Phonomicrosurgery in Vocal Fold Nodules: Quantification of Outcomes in Professional and Non-Professional Voice Users.

PubMed

Caffier, Philipp P; Salmen, Tatjana; Ermakova, Tatiana; Forbes, Eleanor; Ko, Seo-Rin; Song, Wen; Gross, Manfred; Nawka, Tadeus

2017-12-01

There are few data demonstrating the specific extent to which surgical intervention for vocal fold nodules (VFN) improves vocal function in professional (PVU) and non-professional voice users (NVU). The objective of this study was to compare and quantify results after phonomicrosurgery for VFN in these patient groups. In a prospective clinical study, surgery was performed via microlaryngoscopy in 37 female patients with chronic VFN manifestations (38±12 yrs, mean±SD). Pre- and postoperative evaluations of treatment efficacy comprised videolaryngostroboscopy, auditory-perceptual voice assessment, voice range profile (VRP), acoustic-aerodynamic analysis, and voice handicap index (VHI-9i). The dysphonia severity index (DSI) was compared with the vocal extent measure (VEM). PVU (n=24) and NVU (n=13) showed comparable laryngeal findings and levels of suffering (VHI-9i 16±7 vs 17±8), but PVU had a better pretherapeutic vocal range (26.8±7.4 vs 17.7±5.1 semitones, p<0.001) and vocal capacity (VEM 106±18 vs 74±29, p<0.01). Three months postoperatively, all patients had straight vocal fold edges, complete glottal closure, and recovered mucosal wave propagation. The mean VHI-9i score decreased by 8±6 points. DSI increased from 4.0±2.4 to 5.5±2.4, and VEM from 95±27 to 108±23 (p<0.001). Both parameters correlated significantly (rs=0.82). The average vocal range increased by 4.1±5.3 semitones, and the mean speaking pitch lowered by 0.5±1.4 semitones. These results confirm that phonomicrosurgery for VFN is a safe therapy for voice improvement in both PVU and NVU who do not respond to voice therapy alone. Top-level artistic capabilities in PVU were restored, but numeric changes of most vocal parameters were considerably larger in NVU.
Children's voices: can we hear them?

PubMed

McPherson, G; Thorne, S

2000-02-01

This article addresses an important but often neglected notion in the care of children--the notion of voice. Recognizing that a crucial role for pediatric nurses is that of advocate for the child, this article poses the questions of how children's voices can be heard and how nurses know whose voice they represent when they act in an advocacy capacity. Drawing on contributions from psychology, sociology, and feminist studies, the analysis narrows our focus to the special challenge created for pediatric nurses when they recognize the importance of voice in caring for children, and examines the complexities inherent in attending to voice in pediatric nursing practice.
Spectral Analysis of the Voice in Down Syndrome

ERIC Educational Resources Information Center

Albertini, G.; Bonassi, S.; Dall'Armi, V.; Giachetti, I.; Giaquinto, S.; Mignano, M.

2010-01-01

The voice quality of individuals with Down Syndrome (DS) is generally described as husky, monotonous and raucous. On the other hand, the voice of DS children is characterized by breathiness, roughness, and nasality and is typically low pitched. However, research on phonation and intonation in these participants is limited. The present study was…
Analysis and Classification of Voice Pathologies Using Glottal Signal Parameters.

PubMed

Forero M, Leonardo A; Kohler, Manoela; Vellasco, Marley M B R; Cataldo, Edson

2016-09-01

The classification of voice diseases has many applications in health, in disease treatment, and in the design of new medical equipment for helping doctors in diagnosing pathologies related to the voice. This work uses the parameters of the glottal signal to help the identification of two types of voice disorders related to the pathologies of the vocal folds: nodule and unilateral paralysis. The parameters of the glottal signal are obtained through a known inverse filtering method, and they are used as inputs to an Artificial Neural Network, a Support Vector Machine, and also to a Hidden Markov Model, to obtain the classification, and to compare the results, of the voice signals into three different groups: speakers with nodule in the vocal folds; speakers with unilateral paralysis of the vocal folds; and speakers with normal voices, that is, without nodule or unilateral paralysis present in the vocal folds. The database is composed of 248 voice recordings (signals of vowel production) containing samples corresponding to the three groups mentioned. In this study, a larger database was used for the classification when compared with similar studies, and its classification rate is superior to other studies, reaching 97.2%. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
A model for treating voice disorders in school-age children within a video gaming environment.

PubMed

King, Suzanne N; Davis, Larry; Lehman, Jeffrey J; Ruddy, Bari Hoffman

2012-09-01

Clinicians use a variety of approaches to motivate children with hyperfunctional voice disorders to comply with voice therapy in a therapeutic session and improve the motivation of children to practice home-based exercises. Utilization of current entertainment technology in such approaches may improve participation and motivation in voice therapy. The purpose of this study is to test the feasibility of using an entertainment video game as a therapy device. Prospective cohort and case-control study. Three levels of game testing were conducted on an existing entertainment video game for use as a voice therapy protocol. The game was tested by two computer programmers and five normal participants. The third level of testing was a case study with a child diagnosed with a hyperfunctional voice disorder. Modifications to the game were made after each feasibility test. Errors with the video game performance were modified, including the addition of a time stamp directory and game controller. Resonance voice exercises were modified to accommodate the gaming environment and unique competitive situation, including speech rate, acoustic parameters, game speed, and point allocations. The development of video games for voice therapeutic purposes attempts to replicate the high levels of engagement and motivation attained with entertainment video games, stimulating a more productive means of learning while doing. This case study found that a purely entertainment video game can be implemented as a voice therapeutic protocol based on information obtained from the case study. Copyright © 2012 The Voice Foundation. All rights reserved.
Injection laryngoplasty as miniinvasive office-based surgery in patients with unilateral vocal fold paralysis – voice quality outcomes

PubMed Central

Sielska-Badurek, Ewelina M.; Jędra, Katarzyna; Rzepakowska, Anna; Osuch-Wójcikiewicz, Ewa; Niemczyk, Kazimierz

2017-01-01

Introduction Injection laryngoplasty (glottis augmentation) is the preferred method in surgical management of unilateral vocal fold paralysis (UVFP). Traditionally, these procedures are performed in the operating room. Nowadays, however, these procedures have moved into the office. Aim To evaluate the voice quality after transoral injection laryngoplasty under local anaesthesia in patients with unilateral vocal fold paralysis. Material and methods Fourteen subjects (5 women and 9 men) with unilateral vocal fold paresis (9 with right vocal fold paresis and 5 with left vocal fold paresis) were included in the study. The mean age of the group was 57.8 ±19.0 years (32–83 years). All of the injection laryngoplasties were performed transorally, under local anaesthesia. The injection material was calcium hydroxylapatite. Before and 1, 3 and 6 months after the procedure the following variables were evaluated: voice perception, videostroboscopy, acoustic analysis, aerodynamic evaluation, and the subjective rating of the voice quality by the patient. Results After injection laryngoplasty, complete glottal closure was achieved or there was a significant improvement in the glottal closure of each subject. We noted great improvement in the post-injection objective and subjective voice outcomes and patients reported improvement in the voice-related quality of life. Conclusions The transoral approach for injection laryngoplasty under local anaesthesia is an effective and safe way to treat incomplete glottal closure in patients with UVFP. The transoral approach is an efficient alternative to other surgical techniques used for vocal fold injection. PMID:29062449
Voices to reckon with: perceptions of voice identity in clinical and non-clinical voice hearers

PubMed Central

Badcock, Johanna C.; Chhabra, Saruchi

2013-01-01

The current review focuses on the perception of voice identity in clinical and non-clinical voice hearers. Identity perception in auditory verbal hallucinations (AVH) is grounded in the mechanisms of human (i.e., real, external) voice perception, and shapes the emotional (distress) and behavioral (help-seeking) response to the experience. Yet, the phenomenological assessment of voice identity is often limited, for example to the gender of the voice, and has failed to take advantage of recent models and evidence on human voice perception. In this paper we aim to synthesize the literature on identity in real and hallucinated voices and begin by providing a comprehensive overview of the features used to judge voice identity in healthy individuals and in people with schizophrenia. The findings suggest some subtle, but possibly systematic biases across different levels of voice identity in clinical hallucinators that are associated with higher levels of distress. Next we provide a critical evaluation of voice processing abilities in clinical and non-clinical voice hearers, including recent data collected in our laboratory. Our studies used diverse methods, assessing recognition and binding of words and voices in memory as well as multidimensional scaling of voice dissimilarity judgments. The findings overall point to significant difficulties recognizing familiar speakers and discriminating between unfamiliar speakers in people with schizophrenia, both with and without AVH. In contrast, these voice processing abilities appear to be generally intact in non-clinical hallucinators. The review highlights some important avenues for future research and treatment of AVH associated with a need for care, and suggests some novel insights into other symptoms of psychosis. PMID:23565088
Assessment of Severe Apnoea through Voice Analysis, Automatic Speech, and Speaker Recognition Techniques

NASA Astrophysics Data System (ADS)

Fernández Pozo, Rubén; Blanco Murillo, Jose Luis; Hernández Gómez, Luis; López Gonzalo, Eduardo; Alcázar Ramírez, José; Toledano, Doroteo T.

2009-12-01

This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
Phonological experience modulates voice discrimination: Evidence from functional brain networks analysis.

PubMed

Hu, Xueping; Wang, Xiangpeng; Gu, Yan; Luo, Pei; Yin, Shouhang; Wang, Lijun; Fu, Chao; Qiao, Lei; Du, Yi; Chen, Antao

2017-10-01

Numerous behavioral studies have found a modulation effect of phonological experience on voice discrimination. However, the neural substrates underpinning this phenomenon are poorly understood. Here we manipulated language familiarity to test the hypothesis that phonological experience affects voice discrimination via mediating the engagement of multiple perceptual and cognitive resources. The results showed that during voice discrimination, the activation of several prefrontal regions was modulated by language familiarity. More importantly, the same effect was observed concerning the functional connectivity from the fronto-parietal network to the voice-identity network (VIN), and from the default mode network to the VIN. Our findings indicate that phonological experience could bias the recruitment of cognitive control and information retrieval/comparison processes during voice discrimination. Therefore, the study unravels the neural substrates subserving the modulation effect of phonological experience on voice discrimination, and provides new insights into studying voice discrimination from the perspective of network interactions. Copyright © 2017. Published by Elsevier Inc.
Experiments on Analysing Voice Production: Excised (Human, Animal) and In Vivo (Animal) Approaches

PubMed Central

Döllinger, Michael; Kobler, James; Berry, David A.; Mehta, Daryush D.; Luegmair, Georg; Bohr, Christopher

2015-01-01

Experiments on human and on animal excised specimens as well as in vivo animal preparations are so far the most realistic approaches to simulate the in vivo process of human phonation. These experiments do not have the disadvantage of limited space within the neck and enable studies of the actual organ necessary for phonation, i.e., the larynx. The studies additionally allow the analysis of flow, vocal fold dynamics, and resulting acoustics in relation to well-defined laryngeal alterations. Purpose of Review This paper provides an overview of the applications and usefulness of excised (human/animal) specimen and in vivo animal experiments in voice research. These experiments have enabled visualization and analysis of dehydration effects, vocal fold scarring, bifurcation and chaotic vibrations, three-dimensional vibrations, aerodynamic effects, and mucosal wave propagation along the medial surface. Quantitative data will be shown to give an overview of measured laryngeal parameter values. As yet, a full understanding of all existing interactions in voice production has not been achieved, and thus, where possible, we try to indicate areas needing further study. Recent Findings A further motivation behind this review is to highlight recent findings and technologies related to the study of vocal fold dynamics and its applications. For example, studies of interactions between vocal tract airflow and generation of acoustics have recently shown that airflow superior to the glottis is governed by not only vocal fold dynamics but also by subglottal and supraglottal structures. In addition, promising new methods to investigate kinematics and dynamics have been reported recently, including dynamic optical coherence tomography, X-ray stroboscopy and three-dimensional reconstruction with laser projection systems. Finally, we touch on the relevance of vocal fold dynamics to clinical laryngology and to clinically-oriented research. PMID:26581597
Color and texture associations in voice-induced synesthesia

PubMed Central

Moos, Anja; Simmons, David; Simner, Julia; Smith, Rachel

2013-01-01

Voice-induced synesthesia, a form of synesthesia in which synesthetic perceptions are induced by the sounds of people's voices, appears to be relatively rare and has not been systematically studied. In this study we investigated the synesthetic color and visual texture perceptions experienced in response to different types of “voice quality” (e.g., nasal, whisper, falsetto). Experiences of three different groups—self-reported voice synesthetes, phoneticians, and controls—were compared using both qualitative and quantitative analysis in a study conducted online. Whilst, in the qualitative analysis, synesthetes used more color and texture terms to describe voices than either phoneticians or controls, only weak differences, and many similarities, between groups were found in the quantitative analysis. Notable consistent results between groups were the matching of higher speech fundamental frequencies with lighter and redder colors, the matching of “whispery” voices with smoke-like textures, and the matching of “harsh” and “creaky” voices with textures resembling dry cracked soil. These data are discussed in the light of current thinking about definitions and categorizations of synesthesia, especially in cases where individuals apparently have a range of different synesthetic inducers. PMID:24032023
Voices on Voice: Perspectives, Definitions, Inquiry.

ERIC Educational Resources Information Center

Yancey, Kathleen Blake, Ed.

This collection of essays approaches "voice" as a means of expression that lives in the interactions of writers, readers, and language, and examines the conceptualizations of voice within the oral rhetorical and expressionist traditions, and the notion of voice as both a singular and plural phenomenon. An explanatory introduction by the…
Assessment of voice, speech and communication changes associated with cervical spinal cord injury.

PubMed

Johansson, Kerstin; Seiger, Åke; Forsén, Malin; Holmgren Nilsson, Jeanette; Hartelius, Lena; Schalling, Ellika

2018-02-24

Respiratory muscle impairment following cervical spinal cord injury (CSCI) may lead to reduced voice function, although the individual variation is large. Voice problems in this population may not always receive attention since individuals with CSCI face other, more acute and life-threatening issues that need/receive attention. Currently there is no consensus on the tasks suitable to identify the specific voice impairments and functional voice changes experienced by individuals with CSCI. To examine which voice/speech tasks identify the specific voice and communication changes associated with CSCI, habitual and maximum speech performance of a group with CSCI was compared with that of a healthy control group (CG), and the findings were related to respiratory function and to self-reported voice problems. Respiratory, aerodynamic, acoustic and self-reported voice data from 19 individuals (nine women and 10 men, aged 23-59 years, heights = 153-192 cm) with CSCI (levels C3-C7) were compared with data from a CG consisting of 19 carefully matched non-injured people (nine women and 10 men, aged 19-59 years, heights = 152-187 cm). Despite considerable variability of performance, highly significant differences between the group with CSCI and the CG were found in maximum phonation time, maximum duration of breath phrases, maximum sound pressure level and maximum voice area in voice-range profiles (all p = .000). Subglottal pressure was lower and phonatory stability was reduced in some of the individuals with CSCI, but differences between the groups were not statistically significant. Six of 19 had voice handicap index (VHI) scores above 20 (the cut-off for voice disorder). Individuals with a vital capacity below 50% of the expected for an equivalent reference individual performed significantly worse than participants with more normal vital capacity. Completeness and level of injury seemed to impact vocal function in some individuals. A combination of maximum performance

Influences of Fundamental Frequency, Formant Frequencies, Aperiodicity, and Spectrum Level on the Perception of Voice Gender

ERIC Educational Resources Information Center

Skuk, Verena G.; Schweinberger, Stefan R.

2014-01-01

Purpose: To determine the relative importance of acoustic parameters (fundamental frequency [F0], formant frequencies [FFs], aperiodicity, and spectrum level [SL]) on voice gender perception, the authors used a novel parameter-morphing approach that, unlike spectral envelope shifting, allows the application of nonuniform scale factors to transform…
Effects of a Straw Phonation Protocol on Acoustic and Perceptual Measures of an SATB Chorus.

PubMed

Manternach, Jeremy N; Daugherty, James F

2017-12-29

Recent scholarship has suggested that semi-occluded vocal tract (SOVT) exercises may increase vocal economy of individuals by reducing vocal effort while maintaining or increasing acoustic output. Choral singers, however, may use different resonance techniques or change voicing behaviors in an effort to hear their own sound in relation to others. One investigation revealed significant increases in a choir's mean spectral energy after participating in a straw phonation protocol. However, that study reported only acoustic measures and did not include choristers' perceptions of the choral sound and their own voicing efficiency. The purpose of this study was to measure the effect of a straw phonation protocol on acoustic (long-term average spectrum) and perceptual (self-report) measures of the choral sound of an intact soprano, alto, tenor, and bass (SATB) choir. This is a quasi-experimental, one-group, pretest-posttest design. An SATB choir (N = 48 singers) performed a Renaissance motet, participated in a 4-minute voicing protocol with a small straw, and then sang the motet a second time. They completed the same procedure later in the rehearsal. Long-term average spectrum results indicated no statistically significant mean changes in spectral energy after the SOVT protocols. Most participants, however, perceived that the choir sounded better (78.26%) and that their own vocal production was more efficient or comfortable (73.91%) following the protocol. Choristers perceived less vocal effort while maintaining vocal output after straw phonation, which may feasibly align with extant solo research. More research may determine whether this result is due specifically to SOVTs. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic and Perceptual Analyses of Adductor Spasmodic Dysphonia in Mandarin-speaking Chinese.

PubMed

Chen, Zhipeng; Li, Jingyuan; Ren, Qingyi; Ge, Pingjiang

2018-02-12

The objective of this study was to examine the perceptual structure and acoustic characteristics of speech of patients with adductor spasmodic dysphonia (ADSD) in Mandarin. This was a case-control study. To estimate dysphonia level, perceptual and acoustic analyses were performed for patients with ADSD (N = 20) and a control group (N = 20), all of whom were Mandarin-Chinese speakers. For both subgroups, a sustained vowel and connected speech samples were obtained. The differences in perceptual and acoustic parameters between the two subgroups were assessed and analyzed. For acoustic assessment, the percentage of phonatory breaks (PBs) in connected reading and the percentages of aperiodic segments and frequency shifts (FS) in vowel and reading were significantly worse in patients with ADSD than in controls, as were the mean harmonics-to-noise ratio and the fundamental frequency standard deviation of the vowel. For perceptual evaluation, the ratings of speech and vowel in patients with ADSD were significantly higher than in controls. The percentage of aberrant acoustic events (PB, frequency shift, and aperiodic segment), the fundamental frequency standard deviation, and the mean harmonics-to-noise ratio were significantly correlated with the perceptual rating in the vowel and reading productions. The perceptual and acoustic parameters of sustained vowel and connected reading in patients with ADSD are worse than those in normal controls, and could validly and reliably estimate dysphonia of ADSD in Mandarin-speaking Chinese. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Using rate of divergence as an objective measure to differentiate between voice signal types based on the amount of disorder in the signal

PubMed Central

Calawerts, William M; Lin, Liyu; Sprott, JC; Jiang, Jack J

2016-01-01

Objective/Hypothesis The purpose of this paper is to introduce rate of divergence as an objective measure to differentiate between the four voice types based on the amount of disorder present in a signal. We hypothesized that rate of divergence would provide an objective measure that can quantify all four voice types. Study Design 150 acoustic voice recordings were randomly selected and analyzed using traditional perturbation, nonlinear, and rate of divergence analysis methods. Methods We developed a new parameter, rate of divergence, which uses a modified version of Wolf’s algorithm for calculating Lyapunov exponents of a system. The outcome of this calculation is not a Lyapunov exponent, but rather a description of the divergence of two nearby data points for the next three points in the time series, followed in three time delayed embedding dimensions. This measure was compared to currently existing perturbation and nonlinear dynamic methods of distinguishing between voice signals. Results There was a direct relationship between voice type and rate of divergence. This calculation is especially effective at differentiating between type 3 and type 4 voices (p<0.001), and is equally effective at differentiating type 1, type 2, and type 3 signals as currently existing methods. Conclusion The rate of divergence calculation introduced is an objective measure that can be used to distinguish between all four voice types based on amount of disorder present, leading to quicker and more accurate voice typing as well as an improved understanding of the nonlinear dynamics involved in phonation. PMID:26920858
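The idea of following two nearby points in a three-dimensional time-delay embedding for three samples can be illustrated with a short sketch. This is not the authors' modified Wolf algorithm; the function names, the delay `tau`, the temporal exclusion window `min_sep`, and the use of a mean log distance ratio are all assumptions made for illustration.

```python
import math

def delay_embed(series, dim=3, tau=1):
    """Return delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]

def rate_of_divergence(series, dim=3, tau=1, steps=3, min_sep=10):
    """Mean log growth per step of the distance between each embedded
    point and its nearest neighbour, followed for `steps` samples.
    Illustrative only; not the published parameter."""
    vecs = delay_embed(series, dim, tau)
    usable = len(vecs) - steps
    growth = []
    for i in range(usable):
        # Nearest neighbour that is not simply adjacent in time.
        best_j, best_d = None, float("inf")
        for j in range(usable):
            if abs(i - j) < min_sep:
                continue
            d = math.dist(vecs[i], vecs[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            continue
        d_end = math.dist(vecs[i + steps], vecs[best_j + steps])
        if d_end > 0.0:
            growth.append(math.log(d_end / best_d) / steps)
    return sum(growth) / len(growth)

# Synthetic test signals (hypothetical, not voice data).
periodic = [math.sin(0.3 * n) for n in range(300)]
chaotic, x = [], 0.3
for _ in range(300):
    chaotic.append(x)
    x = 4.0 * x * (1.0 - x)  # logistic map in its chaotic regime
```

On these synthetic signals the periodic series should give a value near zero and the chaotic series a clearly positive one, which mirrors the reported direct relationship between disorder and rate of divergence.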
Relation of perceived breathiness to laryngeal kinematics and acoustic measures based on computational modeling

PubMed Central

Samlan, Robin A.; Story, Brad H.; Bunton, Kate

2014-01-01

Purpose To determine 1) how specific vocal fold structural and vibratory features relate to breathy voice quality and 2) the relation of perceived breathiness to four acoustic correlates of breathiness. Method A computational, kinematic model of the vocal fold medial surfaces was used to specify features of vocal fold structure and vibration in a manner consistent with breathy voice. Four model parameters were altered: vocal process separation, surface bulging, vibratory nodal point, and epilaryngeal constriction. Twelve naïve listeners rated breathiness of 364 samples relative to a reference. The degree of breathiness was then compared to 1) the underlying kinematic profile and 2) four acoustic measures: cepstral peak prominence (CPP), harmonics-to-noise ratio, and two measures of spectral slope. Results Vocal process separation alone accounted for 61.4% of the variance in perceptual rating. Adding nodal point ratio and bulging to the equation increased the explained variance to 88.7%. The acoustic measure CPP accounted for 86.7% of the variance in perceived breathiness, and explained variance increased to 92.6% with the addition of one spectral slope measure. Conclusions Breathiness ratings were best explained kinematically by the degree of vocal process separation and acoustically by CPP. PMID:23785184
Aerodynamic and acoustic effects of ventricular gap.

PubMed

Alipour, Fariborz; Karnell, Michael

2014-03-01

Supraglottic compression is frequently observed in individuals with dysphonia. It is commonly interpreted as an indication of excessive circumlaryngeal muscular tension and ventricular medialization. The purpose of this study was to describe the aerodynamic and acoustic impact of varying ventricular medialization in a canine model. Subglottal air pressure, glottal airflow, electroglottograph, acoustic signals, and high-speed video images were recorded in seven excised canine larynges mounted in vitro for laryngeal vibratory experimentation. The degree of gap between the ventricular folds was adjusted and measured using sutures and weights. Data were recorded during phonation when the ventricular gap was narrow, neutral, and large. Glottal resistance was estimated by measures of subglottal pressure and glottal flow. Glottal resistance increased systematically as ventricular gap became smaller. Wide ventricular gaps were associated with increases in fundamental frequency and decreases in glottal resistance. Sound pressure level did not appear to be impacted by the adjustments in ventricular gap used in this research. Increases in supraglottic compression and associated reduced ventricular width may be observed in a variety of disorders that affect voice quality. Ventricular compression may interact with true vocal fold posture and vibration resulting in predictable changes in aerodynamic, physiological, acoustic, and perceptual measures of phonation. The data from this report supports the theory that narrow ventricular gaps may be associated with disordered phonation. In vitro and in vivo human data are needed to further test this association. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
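Glottal resistance estimated from subglottal pressure and glottal flow is conventionally the ratio of the two. A minimal sketch follows, assuming that ratio definition (the abstract does not spell out the formula); the function name, units, and numeric values are hypothetical.

```python
def glottal_resistance(subglottal_pressure_kpa, glottal_flow_l_per_s):
    """Pressure-to-flow ratio in kPa per (L/s). Assumed definition."""
    if glottal_flow_l_per_s <= 0:
        raise ValueError("glottal flow must be positive")
    return subglottal_pressure_kpa / glottal_flow_l_per_s

# Hypothetical readings: the same pressure with less flow (as one might
# see with a narrower ventricular gap) yields a higher resistance.
wide_gap = glottal_resistance(1.0, 0.40)
narrow_gap = glottal_resistance(1.0, 0.25)
```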
An Analysis of TRACON (Terminal Radar Approach Control) Controller-Pilot Voice Communication

DOT National Transportation Integrated Search

1996-06-01

The purpose of this analysis was to examine pilot-controller communication practices in the TRACON (Terminal Radar Approach Control) environment. Forty-eight hours of communications recorded on the voice tapes from eight TRACONs were analyzed. T...
Acoustic analysis of trill sounds.

PubMed

Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri

2012-04-01

In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
Voice - How humans communicate?

PubMed

Tiwari, Manjul; Tiwari, Maneesha

2012-01-01

Voices are important things for humans. They are the medium through which we do a lot of communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message; through accent, tone of voice and habitual voice quality, it is at the same time an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. This subject has unsolved questions, but there is no direct way to present the information that is necessary to understand how voices can be related, or not, to their owners.
Acoustic emission spectral analysis of fiber composite failure mechanisms

NASA Technical Reports Server (NTRS)

Egan, D. M.; Williams, J. H., Jr.

1978-01-01

The acoustic emission of graphite fiber polyimide composite failure mechanisms was investigated with emphasis on frequency spectrum analysis. Although visual examination of spectral densities could not distinguish among fracture sources, a paired-sample t statistical analysis of mean normalized spectral densities did provide quantitative discrimination among acoustic emissions from 10 deg, 90 deg, and [±45 deg, ±45 deg]s specimens. Comparable discrimination was not obtained for 0 deg specimens.
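A paired-sample t statistic of the kind mentioned above can be computed from per-frequency-band differences between two mean normalized spectra. The sketch below uses the textbook formula; the function name and the spectral values are hypothetical and are not taken from the report.

```python
import math
import statistics

def paired_t(a, b):
    """Paired-sample t statistic: mean difference divided by its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Hypothetical normalized spectral densities in three matching bands.
spectrum_a = [2.0, 4.0, 6.0]
spectrum_b = [1.0, 2.0, 3.0]
t_value = paired_t(spectrum_a, spectrum_b)  # compare against t with n - 1 degrees of freedom
```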
Quantitative high-speed laryngoscopic analysis of vocal fold vibration in fatigued voice of young karaoke singers.

PubMed

Yiu, Edwin M-L; Wang, Gaowu; Lo, Andy C Y; Chan, Karen M-K; Ma, Estella P-M; Kong, Jiangping; Barrett, Elizabeth Ann

2013-11-01

The present study aimed to determine whether there were physiological differences in the vocal fold vibration between nonfatigued and fatigued voices using high-speed laryngoscopic imaging and quantitative analysis. Twenty participants aged from 18 to 23 years (mean, 21.2 years; standard deviation, 1.3 years) with normal voice were recruited to participate in an extended singing task. Vocal fatigue was induced using a singing task. High-speed laryngoscopic image recordings of /i/ phonation were taken before and after the singing task. The laryngoscopic images were semiautomatically analyzed with the quantitative high-speed video processing program to extract indices related to the anteroposterior dimension (length), transverse dimension (width), and the speed of opening and closing. Significant reduction in the glottal length-to-width ratio index was found after vocal fatigue. Physiologically, this indicated either a significantly shorter (anteroposteriorly) or a wider (transversely) glottis after vocal fatigue. The high-speed imaging technique using quantitative analysis has the potential for early identification of vocally fatigued voice. Copyright © 2013 The Voice Foundation. All rights reserved.
Vocal therapy of hyperkinetic dysphonia.

PubMed

Mumović, Gordana; Veselinović, Mila; Arbutina, Tanja; Škrbić, Renata

2014-01-01

Hyperkinetic (hyperfunctional) dysphonia is a common pathology. The disorder is often found in vocal professionals faced with high vocal requirements. The objective of this study was to evaluate the effects of vocal therapy on voice condition characterized by hyperkinetic dysphonia with prenodular lesions and soft nodules. The study included 100 adult patients and 27 children aged 4-16 years with prenodular lesions and soft nodules. A subjective acoustic analysis using the GIRBAS scale was performed prior to and after vocal therapy. Twenty adult patients and 10 children underwent objective acoustic analysis including several acoustic parameters. Pathological vocal qualities (hoarse, harsh and breathy voice) were also obtained by computer analysis. The subjective acoustic analysis revealed a significant (p<0.01) reduction in all dysphonia parameters after vocal treatment in adults and children. After treatment, all levels of dysphonia were lowered in 85% (85/100) of adult patients and 29% (29/100) had a normal voice. Before vocal therapy 9 children had severe, 13 had moderate and 8 slight dysphonia. After vocal therapy only 1 child had severe dysphonia, 7 had moderate, 10 had slight levels of dysphonia and 9 were without voice disorder. The objective acoustic analysis in adults revealed a significant improvement (p≤0.025) in all dysphonia parameters except SD F0 and jitter %. In children, the acoustic parameters SD F0, jitter % and NNE (normalized noise energy) were significantly improved (p=0.003-0.03). Pathological voice qualities were also improved in adults and children (p<0.05). Vocal therapy effectively improves the voice in hyperkinetic dysphonia with prenodular lesions and soft nodules in both adults and children, affecting diverse acoustic parameters.
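For readers unfamiliar with the jitter % parameter, one common definition (local jitter) is the mean absolute difference between consecutive glottal periods, divided by the mean period. The sketch below assumes that definition; the abstract does not say which variant or software was used, and the period values are hypothetical.

```python
def jitter_percent(periods):
    """Local jitter in percent from a list of consecutive period lengths."""
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period

# Hypothetical glottal periods in milliseconds (about 200 Hz).
example_periods_ms = [5.0, 5.1, 4.9, 5.0]
example_jitter = jitter_percent(example_periods_ms)
```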
A Meta-Analysis: Acoustic Measurement of Roughness and Breathiness

ERIC Educational Resources Information Center

v. Latoszek, Ben Barsties; Maryn, Youri; Gerrits, Ellen; De Bodt, Marc

2018-01-01

Purpose: Over the last 5 decades, many acoustic measures have been created to measure roughness and breathiness. The aim of this study is to present a meta-analysis of correlation coefficients (r) between auditory-perceptual judgment of roughness and breathiness and various acoustic measures in both sustained vowels and continuous speech. Method:…
The effect of stretch-and-flow voice therapy on measures of vocal function and handicap.

PubMed

Watts, Christopher R; Diviney, Shelby S; Hamilton, Amy; Toles, Laura; Childs, Lesley; Mau, Ted

2015-03-01

To investigate the efficacy of stretch-and-flow voice therapy as a primary physiological treatment for patients with hyperfunctional voice disorders. Prospective case series. Participants with a diagnosis of primary muscle tension dysphonia or phonotraumatic lesions due to hyperfunctional vocal behaviors were included. Participants received stretch-and-flow voice therapy structured once weekly for 6 weeks. Outcome variables consisted of two physiologic measures (s/z ratio and maximum phonation time), an acoustic measure (cepstral peak prominence [CPP]), and a measure of vocal handicap (voice handicap index [VHI]). All measures were obtained at baseline before treatment and within 2 weeks posttreatment. The s/z ratio, maximum phonation time, sentence CPP, and VHI showed statistically significant (P < 0.05) improvement through therapy. Effect sizes reflecting the magnitude of change were large for s/z ratio and VHI (d = 1.25 and 1.96 respectively), and moderate for maximum phonation time and sentence CPP (d = 0.79 and 0.74, respectively). This study provides supporting evidence for preliminary efficacy of stretch-and-flow voice therapy in a small sample of patients. The treatment effect was large or moderate for multiple outcome measures. The data provide justification for larger, controlled clinical trials on the application of stretch-and-flow voice therapy in the treatment of hyperfunctional voice disorders. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
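The effect sizes quoted above are Cohen's d values. The abstract does not state which variant was computed, so the sketch below assumes one common choice for pre/post designs: the mean change divided by the pooled standard deviation of the two measurement occasions. The function name and scores are hypothetical.

```python
import math
import statistics

def cohens_d(pre, post):
    """Mean change divided by the pooled SD of the pre and post scores (assumed variant)."""
    pooled_sd = math.sqrt((statistics.variance(pre) + statistics.variance(post)) / 2)
    return (statistics.mean(post) - statistics.mean(pre)) / pooled_sd

# Hypothetical pre- and post-treatment scores for five participants.
pre_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
post_scores = [2.0, 3.0, 4.0, 5.0, 6.0]
d_value = cohens_d(pre_scores, post_scores)
```

By the usual rule of thumb, d near 0.5 is read as moderate and d of 0.8 or more as large, which matches how the abstract labels its values.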
Voice, Schooling, Inequality, and Scale

ERIC Educational Resources Information Center

Collins, James

2013-01-01

The rich studies in this collection show that the investigation of voice requires analysis of "recognition" across layered spatial-temporal and sociolinguistic scales. I argue that the concepts of voice, recognition, and scale provide insight into contemporary educational inequality and that their study benefits, in turn, from paying attention to…
A Longitudinal Study of Voice before and after Phonosurgery for Removal of a Polyp

ERIC Educational Resources Information Center

Stajner-Katusic, Smiljka; Horga, Damir; Zrinski, Karolina Vrban

2008-01-01

The aim of the present investigation was to evaluate the acoustic parameters, perceptual estimation, and self-estimation of voice before, 1 month after, and 6 years after surgical removal of a vocal fold polyp. Subjects were five male patients who came to the Phoniatric Clinic because of breathiness. For all patients, a polyp of one vocal fold was…
Numerical analysis of acoustic impedance microscope utilizing acoustic lens transducer to examine cultured cells.

PubMed

Gunawan, Agus Indra; Hozumi, Naohiro; Takahashi, Kenta; Yoshida, Sachiko; Saijo, Yoshifumi; Kobayashi, Kazuto; Yamamoto, Seiji

2015-12-01

A new technique is proposed for non-contact quantitative cell observation using focused ultrasonic waves. This technique interprets acoustic reflection intensity into the characteristic acoustic impedance of the biological cell. The cells are cultured on a plastic film substrate. A focused acoustic beam is transmitted through the substrate to its interface with the cell. A two-dimensional (2-D) reflection intensity profile is obtained by scanning the focal point along the interface. A reference substance is observed under the same conditions. These two reflections are compared and interpreted into the characteristic acoustic impedance of the cell based on a calibration curve that was created prior to the observation. To create the calibration curve, a numerical analysis of the sound field is performed using Fourier Transforms and is verified using several saline solutions. Because the cells are suspended by two plastic films, no contamination is introduced during the observation. In a practical observation, a sapphire lens transducer with a center frequency of 300 MHz was employed using ZnO thin film. The objects studied were co-cultured rat-derived glial (astrocyte) cells and glioma cells. The result was the clear observation of the internal structure of the cells. The acoustic impedance of the cells ranged between 1.62 and 1.72 MN·s/m³. Cytoskeleton was indicated by high acoustic impedance. The introduction of cytochalasin-B led to a significant reduction in the acoustic impedance of the glioma cells; its effect on the glial cells was less significant. It is believed that this non-contact observation method will be useful for continuous cell inspections. Copyright © 2015 Elsevier B.V. All rights reserved.
Fourier descriptor analysis and unification of voice range profile contours: method and applications.

PubMed

Pabon, Peter; Ternström, Sten; Lamarche, Anick

2011-06-01

To describe a method for unified description, statistical modeling, and comparison of voice range profile (VRP) contours, even from diverse sources. A morphologic modeling technique, which is based on Fourier descriptors (FDs), is applied to the VRP contour. The technique, which essentially involves resampling of the curve of the contour, is assessed and also is compared to density-based VRP averaging methods that use the overlap count. VRP contours can be usefully described and compared using FDs. The method also permits the visualization of the local covariation along the contour average. For example, the FD-based analysis shows that the population variance for ensembles of VRP contours is usually smallest at the upper left part of the VRP. To illustrate the method's advantages and possible further application, graphs are given that compare the averaged contours from different authors and recording devices--for normal, trained, and untrained male and female voices as well as for child voices. The proposed technique allows any VRP shape to be brought to the same uniform base. On this uniform base, VRP contours or contour elements coming from a variety of sources may be placed within the same graph for comparison and for statistical analysis.
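The general idea of Fourier descriptors for a closed contour can be sketched as follows: resample the contour to a fixed number of points equally spaced along its perimeter, treat each point as a complex number, and take the discrete Fourier transform. This is a generic illustration, not the authors' implementation; the function names, point counts, and the mapping of x to fundamental frequency and y to sound pressure level are assumptions.

```python
import cmath
import math

def resample_closed(points, n):
    """Resample a closed polygonal contour to n points equally spaced along its perimeter."""
    pts = list(points) + [points[0]]
    seg = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    total = sum(seg)
    out, i, acc = [], 0, 0.0
    for k in range(n):
        target = total * k / n
        # Advance to the segment that contains the target arc length.
        while i < len(seg) - 1 and acc + seg[i] < target:
            acc += seg[i]
            i += 1
        t = (target - acc) / seg[i] if seg[i] > 0 else 0.0
        out.append((pts[i][0] + t * (pts[i + 1][0] - pts[i][0]),
                    pts[i][1] + t * (pts[i + 1][1] - pts[i][1])))
    return out

def fourier_descriptors(points, n_points=64, n_coeffs=8):
    """First n_coeffs DFT coefficients of the resampled contour, with z = x + iy."""
    z = [complex(x, y) for x, y in resample_closed(points, n_points)]
    return [sum(zn * cmath.exp(-2j * math.pi * k * n / n_points)
                for n, zn in enumerate(z)) / n_points
            for k in range(n_coeffs)]

# Synthetic contour: a circle of radius 2 centred at (3, 1), traced counterclockwise.
circle = [(3 + 2 * math.cos(2 * math.pi * k / 200),
           1 + 2 * math.sin(2 * math.pi * k / 200)) for k in range(200)]
fd = fourier_descriptors(circle, n_points=64, n_coeffs=4)
```

Coefficient 0 is the contour centroid and coefficient 1 carries the overall size, so contours sampled at different resolutions end up on the same fixed-length base and can be averaged or compared coefficient by coefficient.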
Emotional self-other voice processing in schizophrenia and its relationship with hallucinations: ERP evidence.

PubMed

Pinheiro, Ana P; Rezaii, Neguine; Rauber, Andréia; Nestor, Paul G; Spencer, Kevin M; Niznikiewicz, Margaret

2017-09-01

Abnormalities in self-other voice processing have been observed in schizophrenia, and may underlie the experience of hallucinations. More recent studies demonstrated that these impairments are enhanced for speech stimuli with negative content. Nonetheless, few studies probed the temporal dynamics of self versus nonself speech processing in schizophrenia and, particularly, the impact of semantic valence on self-other voice discrimination. In the current study, we examined these questions, and additionally probed whether impairments in these processes are associated with the experience of hallucinations. Fifteen schizophrenia patients and 16 healthy controls listened to 420 prerecorded adjectives differing in voice identity (self-generated [SGS] versus nonself speech [NSS]) and semantic valence (neutral, positive, and negative), while EEG data were recorded. The N1, P2, and late positive potential (LPP) ERP components were analyzed. ERP results revealed group differences in the interaction between voice identity and valence in the P2 and LPP components. Specifically, LPP amplitude was reduced in patients compared with healthy subjects for SGS and NSS with negative content. Further, auditory hallucinations severity was significantly predicted by LPP amplitude: the higher the SAPS "voices conversing" score, the larger the difference in LPP amplitude between negative and positive NSS. The absence of group differences in the N1 suggests that self-other voice processing abnormalities in schizophrenia are not primarily driven by disrupted sensory processing of voice acoustic information. The association between LPP amplitude and hallucination severity suggests that auditory hallucinations are associated with enhanced sustained attention to negative cues conveyed by a nonself voice. © 2017 Society for Psychophysiological Research.
Hands-free human-machine interaction with voice

NASA Astrophysics Data System (ADS)

Juang, B. H.

2004-05-01

Voice is a natural communication interface between a human and a machine. The machine, when placed in today's communication networks, may be configured to provide automation to save substantial operating cost, as demonstrated in AT&T's VRCP (Voice Recognition Call Processing), or to facilitate intelligent services, such as virtual personal assistants, to enhance individual productivity. These intelligent services often need to be accessible anytime, anywhere (e.g., in cars when the user is in a hands-busy-eyes-busy situation or during meetings where constantly talking to a microphone is either undesirable or impossible), and thus call for advanced signal processing and automatic speech recognition techniques which support what we call "hands-free" human-machine communication. These techniques entail a broad spectrum of technical ideas, ranging from use of directional microphones and acoustic echo cancellation to robust speech recognition. In this talk, we highlight a number of key techniques that were developed for hands-free human-machine communication in the mid-1990s after Bell Labs became a unit of Lucent Technologies. A video clip will be played to demonstrate the accomplishment.

Voice quality change in future professional voice users after 9 months of voice training.

PubMed

Timmermans, Bernadette; De Bodt, Marc; Wuyts, Floris; Van de Heyning, Paul

2004-01-01

Sixty-eight students of a school for audiovisual communication participated in this study. Of these, 49 students received voice training for 9 months (the trained group) and 19 received no specific voice training (the untrained group). A multidimensional test battery containing the GRBAS scale, videolaryngostroboscopy, Maximum Phonation Time (MPT), jitter, lowest intensity (IL), highest frequency (FoH), Dysphonia Severity Index (DSI) and Voice Handicap Index (VHI) was applied before and after training to evaluate training outcome. The voice training is made up of technical workshops in small groups (five to eight subjects) and vocal coaching in the ateliers. In the technical workshops, basic skills are trained (posture, breathing technique, articulation and diction), and in the ateliers, the speech and language pathologist assists the subjects in the practice of their voice work. This study revealed a significant improvement over time in the objective measurements [Dysphonia Severity Index: from 2.3 to 4.5 (P<0.001)] and the self-evaluation [Voice Handicap Index: from 23 to 18.4 (P=0.016)] for the trained group only. This outcome favors the systematic introduction of voice training during the schooling of professional voice users.
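The Dysphonia Severity Index combines four of the measures listed in the test battery (MPT, highest frequency, lowest intensity, and jitter) into one score. The abstract does not restate the formula; the sketch below uses the weighting commonly reported for the DSI (Wuyts et al., 2000), and the input values are hypothetical.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """DSI = 0.13*MPT + 0.0053*F0-high - 0.26*I-low - 1.18*jitter(%) + 12.4.
    Weights as commonly reported for the DSI; not stated in the abstract."""
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)

# Hypothetical speaker: MPT 20 s, highest F0 800 Hz, lowest intensity 50 dB, jitter 0.5 %.
example_dsi = dysphonia_severity_index(20.0, 800.0, 50.0, 0.5)
```

Higher scores indicate better voice quality, so the reported change from 2.3 to 4.5 in the trained group is an improvement.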
Characteristics and professional use of voice in street children in Aracaju, Brazil.

PubMed

Sales, Neuza Josina; Gurgel, Ricardo Queiroz; Gonçalves, Maria Inês Rebelo; Cunha, Edílson; Barreto, Valeria Maria Prado; Todt Neto, João Carlos; D'Avila, Jeferson Sampaio

2010-07-01

The objective of the study was to evaluate voice characteristics of children engaged in street selling, which involves an essentially professional use of voice in this population. A controlled cross-sectional study was carried out. A randomly chosen sample of 200 school children with a history of street selling assisted by public social services and 400 school children without this experience was selected. Seven- to 10-year-old children of both sexes were studied. Both groups were interviewed and given vocal assessment (auditory-perceptual assessment and spectrographic acoustic measures) and otorhinolaryngological evaluation (physical and videonasolaryngoscopic examination). Children with abnormal results in both groups were compared using the chi-squared (χ²) test. The significance level was established at 5% (P<0.05). Voice problems were detected more frequently in working children (106-53%) than in regular school children (90-22.5%). The control group achieved better school performance as more children in this group attend school regularly than street children, although age-for-grade deficit was similar. The control group had more access to medical visits (80-40%) and treatment with a doctor (34-17%). Language assessment has shown that the control group had more dysphonia (73-37%) and myofunctional orofacial disorders (20-10%). Street children had more normal voice but had more nasal disorders and greater glottal closure than the school control group. Voice disorders were present in both groups, but less frequently in street children. Although subject to inadequate living conditions, street children had better voice quality than the control group. An explanation could be that by adapting their voice professionally for selling goods in the streets, they developed adequate resilience to their difficult living conditions. Copyright (c) 2010 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
"Who" is saying "what"? Brain-based decoding of human voice and speech.

PubMed

Formisano, Elia; De Martino, Federico; Bonte, Milene; Goebel, Rainer

2008-11-07

Can we decipher speech content ("what" is being said) and speaker identity ("who" is saying it) from observations of brain activity of a listener? Here, we combine functional magnetic resonance imaging with a data-mining algorithm and retrieve what and whom a person is listening to from the neural fingerprints that speech and voice signals elicit in the listener's auditory cortex. These cortical fingerprints are spatially distributed and insensitive to acoustic variations of the input so as to permit the brain-based recognition of learned speech from unknown speakers and of learned voices from previously unheard utterances. Our findings unravel the detailed cortical layout and computational properties of the neural populations at the basis of human speech recognition and speaker identification.
Validity and Reliability Study of Bahasa Malaysia Version of Voice Handicap Index-10.

PubMed

Ong, Fei Ming; Husna Nik Hassan, Nik Fariza; Azman, Mawaddah; Sani, Abdullah; Mat Baki, Marina

2018-05-21

This study aimed to determine the validity and reliability of Bahasa Malaysia version of Voice Handicap Index-10 (mVHI-10). This cross-sectional study was carried out in the Otorhinolaryngology, Head and Neck Surgery Department of Universiti Kebangsaan Malaysia Medical Centre (UKMMC) from June 2015 to May 2016. The mVHI-10 was produced following a rigorous forward and backward translation. One hundred participants, including 50 healthy volunteers (17 male, 33 female) and 50 patients with voice disorders (26 male, 24 female), were recruited to complete the mVHI-10 before flexible laryngoscopic examinations and acoustic analysis. The mVHI-10 was repeated in 2 weeks via telephone interview or clinic visit. Its reliability and validity were assessed using interclass correlation. The test-retest reliability for total mVHI-10 and each item score was high, with the Cronbach alpha of >0.90. The total mVHI-10 score and domain scores were significantly higher (P < 0.001) in the pathology groups (20.92 ± 8.74) than healthy volunteers (1.54 ± 1.97), depicting excellent discriminant validity. The Kaiser-Meyer-Olkin measure was 0.92, which depicted excellent construct validity. There was a significant positive correlation between the mVHI-10 score and jitter and shimmer result (P < 0.001). The present study showed good reliability and validity of the mVHI-10 when applied to both healthy volunteers and patients with voice disorders. We recommend the use of the mVHI-10 in daily clinical practice among Bahasa Malaysia-speaking population. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
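For readers unfamiliar with the Cronbach alpha quoted above, the following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of questionnaire items. The function name and the toy score matrix are hypothetical; the abstract does not give the study's item-level data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows -- one list per respondent, one score per item
            (for a 10-item questionnaire each row would have 10 entries).
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: two items that always agree give the maximum alpha of 1.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```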
Spatio-Temporal Analysis of Urban Acoustic Environments with Binaural Psycho-Acoustical Considerations for IoT-Based Applications.

PubMed

Segura-Garcia, Jaume; Navarro-Ruiz, Juan Miguel; Perez-Solano, Juan J; Montoya-Belmonte, Jose; Felici-Castell, Santiago; Cobos, Maximo; Torres-Aranda, Ana M

2018-02-26

Sound pleasantness or annoyance perceived in urban soundscapes is a major concern in environmental acoustics. Binaural psychoacoustic parameters are helpful to describe generic acoustic environments, as it is stated within the ISO 12913 framework. In this paper, the application of a Wireless Acoustic Sensor Network (WASN) to evaluate the spatial distribution and the evolution of urban acoustic environments is described. Two experiments are presented using an indoor and an outdoor deployment of a WASN with several nodes using an Internet of Things (IoT) environment to collect audio data and calculate meaningful parameters such as the sound pressure level, binaural loudness and binaural sharpness. A chunk of audio is recorded in each node periodically with a microphone array and the binaural rendering is conducted by exploiting the estimated directional characteristics of the incoming sound by means of DOA estimation. Each node computes the parameters in a different location and sends the values to a cloud-based broker structure that allows spatial statistical analysis through Kriging techniques. A cross-validation analysis is also performed to confirm the usefulness of the proposed system.
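Of the parameters each node computes, the sound pressure level is the simplest to illustrate. The sketch below assumes the audio chunk has already been calibrated to pascals, which the abstract does not detail, and uses the standard 20 µPa reference pressure for air. The function name is hypothetical. Binaural loudness and sharpness need full psychoacoustic models and are not attempted here.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure in air, 20 micropascals


def sound_pressure_level_db(samples_pa):
    """Sound pressure level in dB re 20 µPa.

    samples_pa -- sequence of calibrated pressure samples in pascals
                  (assumed; raw ADC counts would need a calibration factor).
    """
    mean_square = sum(x * x for x in samples_pa) / len(samples_pa)
    rms = math.sqrt(mean_square)
    return 20.0 * math.log10(rms / P_REF_PA)


# A constant 1 Pa RMS signal corresponds to roughly 94 dB SPL,
# the level commonly used by acoustic calibrators.
print(round(sound_pressure_level_db([1.0, -1.0, 1.0, -1.0]), 2))
```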
Spatio-Temporal Analysis of Urban Acoustic Environments with Binaural Psycho-Acoustical Considerations for IoT-Based Applications

PubMed Central

Montoya-Belmonte, Jose; Cobos, Maximo; Torres-Aranda, Ana M.

2018-01-01

Sound pleasantness or annoyance perceived in urban soundscapes is a major concern in environmental acoustics. Binaural psychoacoustic parameters are helpful to describe generic acoustic environments, as it is stated within the ISO 12913 framework. In this paper, the application of a Wireless Acoustic Sensor Network (WASN) to evaluate the spatial distribution and the evolution of urban acoustic environments is described. Two experiments are presented using an indoor and an outdoor deployment of a WASN with several nodes using an Internet of Things (IoT) environment to collect audio data and calculate meaningful parameters such as the sound pressure level, binaural loudness and binaural sharpness. A chunk of audio is recorded in each node periodically with a microphone array and the binaural rendering is conducted by exploiting the estimated directional characteristics of the incoming sound by means of DOA estimation. Each node computes the parameters in a different location and sends the values to a cloud-based broker structure that allows spatial statistical analysis through Kriging techniques. A cross-validation analysis is also performed to confirm the usefulness of the proposed system. PMID:29495407
Voice-stress measure of mental workload

NASA Technical Reports Server (NTRS)

Alpert, Murray; Schneider, Sid J.

1988-01-01

In a planned experiment, male subjects between the age of 18 and 50 will be required to produce speech while performing various tasks. Analysis of the speech produced should reveal which aspects of voice prosody are associated with increased workloads. Preliminary results with two female subjects suggest a possible trend for voice frequency and amplitude to be higher and the variance of the voice frequency to be lower in the high workload condition.
Maxillary arch dimensions associated with acoustic parameters in prepubertal children.

PubMed

Hamdan, Abdul-Latif; Khandakji, Mohannad; Macari, Anthony Tannous

2018-04-18

To evaluate the association between maxillary arch dimensions and fundamental frequency and formants of voice in prepubertal subjects. Thirty-five consecutive prepubertal patients seeking orthodontic treatment were recruited (mean age = 11.41 ± 1.46 years; range, 8 to 13.7 years). Participants with a history of respiratory infection, laryngeal manipulation, dysphonia, congenital facial malformations, or history of orthodontic treatment were excluded. Dental measurements included maxillary arch length, perimeter, depth, and width. Voice parameters comprising fundamental frequency (f0_sustained), habitual pitch (f0_count), jitter, shimmer, and different formant frequencies (F1, F2, F3, and F4) were measured using acoustic analysis prior to initiation of any orthodontic treatment. Pearson's correlation coefficients were used to measure the strength of associations between different dental and voice parameters. Multiple linear regressions were computed for the predictions of different dental measurements. Arch width and arch depth had moderate significant negative correlations with f0 (r = -0.52; P = .001 and r = -0.39; P = .022, respectively) and with habitual frequency (r = -0.51; P = .0014 and r = -0.34; P = .04, respectively). Arch depth and arch length were significantly correlated with formant F3 and formant F4, respectively. Predictors of arch depth included frequencies of F3 vowels, with a significant regression equation (P-value < .001; R² = 0.49). Similarly, fundamental frequency f0 and frequencies of formant F3 vowels were predictors of arch width, with a significant regression equation (P-value < .001; R² = 0.37). There is a significant association between arch dimensions, particularly arch length and depth, and voice parameters. The formant most predictive of arch depth and width is the third formant, along with fundamental frequency of voice.
Overall intelligibility, articulation, resonance, voice and language in a child with Nager syndrome.

PubMed

Van Lierde, Kristiane M; Luyten, Anke; Mortier, Geert; Tijskens, Anouk; Bettens, Kim; Vermeersch, Hubert

2011-02-01

The purpose of this study was to provide a description of the language and speech (intelligibility, voice, resonance, articulation) in a 7-year-old Dutch speaking boy with Nager syndrome. To reveal these features comparison was made with an age and gender related child with a similar palatal or hearing problem. Language was tested with an age appropriate language test namely the Dutch version of the Clinical Evaluation of Language Fundamentals. Regarding articulation a phonetic inventory, phonetic analysis and phonological process analysis was performed. A nominal scale with four categories was used to judge the overall speech intelligibility. A voice and resonance assessment included a videolaryngostroboscopy, a perceptual evaluation, acoustic analysis and nasometry. The most striking communication problems in this child were expressive and receptive language delay, moderately impaired speech intelligibility, the presence of phonetic and phonological disorders, resonance disorders and a high-pitched voice. The explanation for this pattern of communication is not completely straightforward. The language and the phonological impairment, only present in the child with the Nager syndrome, are not part of a more general developmental delay. The resonance disorders can be related to the cleft palate, but were not present in the child with the isolated cleft palate. One might assume that the cul-de-sac resonance and the much decreased mandibular movement and the restricted tongue lifting are caused by the restricted jaw mobility and micrognathia. To what extent the suggested mandibular distraction osteogenesis in early childhood allows increased mandibular movement and better speech outcome with increased oral resonance is subject for further research. According to the results of this study the speech and language management must be focused on receptive and expressive language skills and linguistic conceptualization, correct phonetic placement and the modification of
A Spectral Analysis Approach for Acoustic Radiation from Composite Panels

NASA Technical Reports Server (NTRS)

Turner, Travis L.; Singh, Mahendra P.; Mei, Chuh

2004-01-01

A method is developed to predict the vibration response of a composite panel and the resulting far-field acoustic radiation due to acoustic excitation. The acoustic excitation is assumed to consist of obliquely incident plane waves. The panel is modeled by a finite element analysis and the radiated field is predicted using Rayleigh's integral. The approach can easily include other effects such as shape memory alloy (SMA) fiber reinforcement, large deflection thermal postbuckling, and non-symmetric SMA distribution or lamination. Transmission loss predictions for the case of an aluminum panel excited by a harmonic acoustic pressure are shown to compare very well with a classical analysis. Results for a composite panel with and without shape memory alloy reinforcement are also presented. The preliminary results demonstrate that the transmission loss can be significantly increased with shape memory alloy reinforcement. The mechanisms for further transmission loss improvement are identified and discussed.
Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)

ERIC Educational Resources Information Center

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…
Listening to Schneiderian Voices: A Novel Phenomenological Analysis

PubMed Central

Rosen, Cherise; Chase, Kayla A.; Jones, Nev; Grossman, Linda S.; Gin, Hannah; Sharma, Rajiv P.

2016-01-01

Background/Aims This paper reports on analyses designed to elucidate phenomenological characteristics, content and experience specifically targeting participants with Schneiderian voices conversing/commenting (VC) while exploring difference in clinical presentation and quality of life compared to those with voices not conversing (VNC). Methods This mixed-method investigation of Schneiderian voices included standardized clinical metrics and exploratory phenomenological interviews designed to elicit in-depth information about characteristics, content, meaning and personification of AVHs. Results The subjective experience of VC show a striking pattern of VC that are experienced as internal at initial onset and during longer-term course of illness when compared to the VNC group. Participants in the VC group were more likely to attribute origins of their voices to an external source such as God, telepathic communication, or mediumistic sources. VC and VNC were described as characterological entities that were distinct from self (I/we versus you). We also found an association between VC and positive, cognitive, and depression symptom profile. However, we did not find a significant group difference in overall quality of life. Conclusions The clinical portrait of VC is complex, multisensory, and distinct, and suggests a need for further research into biopsychosocial interface between subjective experience, socioenvironmental constraints, individual psychology, and biological architecture of intersecting symptoms. PMID:27304081
Parameterization of the Voice Source by Combining Spectral Decay and Amplitude Features of the Glottal Flow.

ERIC Educational Resources Information Center

Alku, Paavo; Vilkman, Erkki; Laukkanen, Anne-Maria

1998-01-01

A new method is presented for the parameterization of glottal volume velocity waveforms that have been estimated by inverse filtering acoustic speech pressure signals. The new technique combines two features of voice production: the AC value and the spectral decay of the glottal flow. Testing found the new parameter correlates strongly with the…
[Estimation of quality of voice after removal of neoplasms T1 and T2 of glottis with simultaneous reconstruction of vocal fold with pedunculated sterno-thyroid muscle flap].

PubMed

Ratajczak, Jan; Wójtowicz, Piotr; Krzeski, Antoni

2014-01-01

In recent years the number of cancer cases, including cancer of the larynx, has been increasing. The choice of treatment should be dictated primarily by the complete elimination of the cancer, but from the patient's point of view an important factor to keep in mind is the quality of the voice that will be created at the end of the therapeutic process. The aim of this study was to evaluate the voice quality of patients after partial surgery of the larynx with vocal fold reconstruction using a pedunculated sterno-thyroid muscle flap. The study included 30 men aged 53-72 years who were treated at the Clinic of Otorhinolaryngology Department of Medical-Dental Medical University of Warsaw for cancer of the larynx classified as T1 or T2 according to the TNM classification. Radical removal of the cancer involved resection of one vocal fold, the laryngeal pouches and the ventricular fold. In 15 patients (group I), reconstruction of the "vocal fold" with a pedunculated sterno-thyroid muscle flap was performed simultaneously at the end of the oncological phase of surgery. Group II consisted of 15 patients who underwent surgery that removed only the cancerous lesions. The impact of postoperative voice disorders on quality of life was assessed using a voice self-assessment questionnaire (Voice Handicap Index in the Pruszewicz modification). The nature of the created voice was studied using the GRBAS scale. All patients underwent laryngostroboscopic examination. The voice was recorded with the "IRIS" program, prepared by a team at Wrocław University of Technology, and then subjected to acoustic analysis. In addition, noise level and maximum phonation time were measured. The results indicate that the patients in group I gained a better voice, as confirmed by the values of objective acoustic analysis. The assessment made by the scale GRBAS patients who supplemented the resulting loss after tumour removal, with much less hoarseness of voice, did not have the
Vocal and neural responses to unexpected changes in voice pitch auditory feedback during register transitions

PubMed Central

Patel, Sona; Lodhavia, Anjli; Frankford, Saul; Korzyukov, Oleg; Larson, Charles R.

2016-01-01

Objective/Hypothesis It is known that singers are able to control their voice to maintain a relatively constant vocal quality while transitioning between vocal registers; however, the neural mechanisms underlying this effect are not understood. It was hypothesized that greater attention to the acoustical feedback of the voice and increased control of the vocal musculature during register transitions compared to singing within a register would be represented as neurological differences in event-related potentials (ERPs). Study Design/Methods Nine singers sang musical notes at the high end of the modal register (the boundary between the modal and head/falsetto registers) and at the low end (the boundary between the modal and fry/pulse registers). While singing, the pitch of the voice auditory feedback was unexpectedly shifted either into the adjacent register (“toward” the register boundary) or within the modal register (“away from” the boundary). Singers were instructed to maintain a constant pitch and ignore any changes to their voice feedback. Results Vocal response latencies and magnitude of the accompanying N1 and P2 ERPs were greatest at the lower (modal-fry) boundary when the pitch shift carried the subjects’ voices into the fry register as opposed to remaining within the modal register. Conclusions These findings suggest that when a singer lowers the pitch of their voice such that it enters the fry register from the modal register, there is increased sensory-motor control of the voice, reflected as increased magnitude of the neural potentials to help minimize qualitative changes in the voice. PMID:26739860
The Voice of Emotion: Acoustic Properties of Six Emotional Expressions.

NASA Astrophysics Data System (ADS)

Baldwin, Carol May

Studies in the perceptual identification of emotional states suggested that listeners seemed to depend on a limited set of vocal cues to distinguish among emotions. Linguistics and speech science literatures have indicated that this small set of cues included intensity, fundamental frequency, and temporal properties such as speech rate and duration. Little research has been done, however, to validate these cues in the production of emotional speech, or to determine if specific dimensions of each cue are associated with the production of a particular emotion for a variety of speakers. This study addressed deficiencies in understanding of the acoustical properties of duration and intensity as components of emotional speech by means of speech science instrumentation. Acoustic data were conveyed in a brief sentence spoken by twelve English speaking adult male and female subjects, half with dramatic training, and half without such training. Simulated expressions included: happiness, surprise, sadness, fear, anger, and disgust. The study demonstrated that the acoustic property of mean intensity served as an important cue for a vocal taxonomy. Overall duration was rejected as an element for a general taxonomy due to interactions involving gender and role. Findings suggested a gender-related taxonomy, however, based on differences in the ways in which men and women use the duration cue in their emotional expressions. Results also indicated that speaker training may influence greater use of the duration cue in expressions of emotion, particularly for male actors. Discussion of these results provided linkages to (1) practical management of emotional interactions in clinical and interpersonal environments, (2) implications for differences in the ways in which males and females may be socialized to express emotions, and (3) guidelines for future perceptual studies of emotional sensitivity.
Native voice, self-concept and the moral case for personalized voice technology.

PubMed

Nathanson, Esther

2017-01-01

Purpose (1) To explore the role of native voice and effects of voice loss on self-concept and identity, and survey the state of assistive voice technology; (2) to establish the moral case for developing personalized voice technology. Methods This narrative review examines published literature on the human significance of voice, the impact of voice loss on self-concept and identity, and the strengths and limitations of current voice technology. Based on the impact of voice loss on self and identity, and voice technology limitations, the moral case for personalized voice technology is developed. Results Given the richness of information conveyed by voice, loss of voice constrains expression of the self, but the full impact is poorly understood. Augmentative and alternative communication (AAC) devices facilitate communication but, despite advances in this field, voice output cannot yet express the unique nuances of individual voice. The ethical principles of autonomy, beneficence and equality of opportunity establish the moral responsibility to invest in accessible, cost-effective, personalized voice technology. Conclusions Although further research is needed to elucidate the full effects of voice loss on self-concept, identity and social functioning, current understanding of the profoundly negative impact of voice loss establishes the moral case for developing personalized voice technology. Implications for Rehabilitation Rehabilitation of voice-disordered patients should facilitate self-expression, interpersonal connectedness and social/occupational participation. Proactive questioning about the psychological and social experiences of patients with voice loss is a valuable entry point for rehabilitation planning. Personalized voice technology would enhance sense of self, communicative participation and autonomy and promote shared healthcare decision-making. Further research is needed to identify the best strategies to preserve and strengthen identity and sense of
Perturbation of voice signals in register transitions on sustained frequency in professional tenors.

PubMed

Echternach, Matthias; Traser, Louisa; Richter, Bernhard

2012-09-01

Vocal register transitions in the passaggio region remain an unclarified field in classically trained male singers. We examined the acoustic and electroglottographic signals of seven tenors' transitions from voix mixte to falsetto on a sustained pitch F4 (349Hz) on the vowels /a, e, i, o, u, and æ/. It was found that in many of the tested subjects, register transitions between voix mixte and falsetto were performed very continuously without clear register transition events. However, an increase of frequency and amplitude perturbation (jitter, relative average perturbation, and shimmer) was observed during register transitions. These data suggest that professional tenors are able to avoid sudden registration events frequently observed in untrained voices. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Acoustic Facies Analysis of Side-Scan Sonar Data

NASA Astrophysics Data System (ADS)

Dwan, Fa Shu

Acoustic facies analysis methods have allowed the generation of system-independent values for the quantitative seafloor acoustic parameter, backscattering strength, from GLORIA and (TAMU)² side-scan sonar data. The resulting acoustic facies parameters enable quantitative comparisons of data collected by different sonar systems, data from different environments, and measurements made with survey geometries. Backscattering strength values were extracted from the sonar amplitude data by inversion based on the sonar equation. Image processing products reveal seafloor features and patterns of relative intensity. To quantitatively compare data collected at different times or by different systems, and to ground-truth measurements and geoacoustic models, quantitative corrections must be made on any given data set for system source level, beam pattern, time-varying gain, processing gain, transmission loss, absorption, insonified area contribution, and grazing angle effects. In the sonar equation, backscattering strength is the sonar parameter which is directly related to seafloor properties. The GLORIA data used in this study are from the edge of a distal lobe of the Monterey Fan. An interfingered region of strong and weak seafloor signal returns from a flat seafloor region provides an ideal data set for this study. Inversion of imagery data from the region allows the quantitative definition of different acoustic facies. The (TAMU)² data used are from a calibration site near the Green Canyon area of the Gulf of Mexico. Acoustic facies analysis techniques were implemented to generate statistical information for acoustic facies based on the estimates of backscattering strength. The backscattering strength values have been compared with Lambert's Law and other functions to parameterize the description of the acoustic facies. The resulting Lambertian constant values range from -26 dB to -36 dB. A modified Lambert relationship, which consists of both intercept and slope
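As background for the Lambert's Law comparison in the record above, the sketch below uses the form in which that law is usually written for monostatic seafloor backscatter: BS(θ) = 10 log10(μ) + 20 log10(sin θ), with θ the grazing angle and 10 log10(μ) the Lambertian constant in dB. The function name is hypothetical, and the record does not state which exact parameterization the author used.

```python
import math


def lambert_backscatter_db(lambert_constant_db, grazing_angle_deg):
    """Backscattering strength in dB predicted by Lambert's Law.

    lambert_constant_db -- 10*log10(mu), e.g. a value in the
                           -26 dB to -36 dB range quoted in the record
    grazing_angle_deg   -- grazing angle in degrees (0 < angle <= 90)
    """
    theta = math.radians(grazing_angle_deg)
    return lambert_constant_db + 20.0 * math.log10(math.sin(theta))


# At a 30 degree grazing angle sin(theta) = 0.5, so the prediction
# sits about 6 dB below the Lambertian constant.
print(round(lambert_backscatter_db(-26.0, 30.0), 2))
```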
Relationship between acoustic voice onset and offset and selected instances of oscillatory onset and offset in young healthy males and females

PubMed Central

Patel, Rita; Forrest, Karen; Hedges, Drew

2016-01-01

Objective To investigate the relationship between (1) onset of the acoustic signal and pre-phonatory phases associated with oscillatory onset and (2) offset of the acoustic signal with the post-phonatory events associated with oscillatory offset across vocally healthy adults. Subjects and Methods High-speed videoendoscopy was captured simultaneously with the acoustic signal during repeated production of /hi.hi.hi/ at typical pitch and loudness from 56 vocally healthy adults (age 20–42 years; 21 male, 35 female). The relationship between the acoustic sound pressure signal and oscillatory onset/offset events from the glottal area waveforms (GAW) was statistically investigated using a multivariate linear regression analysis. Results The onset of the acoustic signal (X1a) is a significant predictor of the onset of first oscillations (X1g) and onset of sustained oscillations (X2g). X1a as well as gender are significant predictors of the first instance of medial contact (X1.5g). The offset of the acoustic signal (X2a) is a significant predictor of the first instance of oscillatory offset (X3g), first instance of incomplete glottal closure (X3.5g), and cessation of vocal fold motion (X4g). Conclusions The acoustic signal onset is closely related to the first medial contact of the vocal folds but the latency between these events is longer for females compared to males. The offset of the acoustic signal occurs immediately after incomplete glottal adduction. The emerging normative group latencies between the onset/offset of the acoustic and the GAW from this study appear promising for future investigations. PMID:27769696

Thermal welding vs. cold knife tonsillectomy: a comparison of voice and speech.

PubMed

Celebi, Saban; Yelken, Kursat; Celik, Oner; Taskin, Umit; Topak, Murat

2011-01-01

To compare acoustic, aerodynamic and perceptual voice and speech parameters in thermal welding system tonsillectomy and cold knife tonsillectomy patients in order to determine the impact of operation technique on voice and speech. Thirty tonsillectomy patients (22 children, 8 adults) participated in this study. The preferred technique was cold knife tonsillectomy in 15 patients and thermal welding system tonsillectomy in the remaining 15 patients. One week before and 1 month after surgery the following parameters were estimated: average of fundamental frequency, Jitter, Shimmer, harmonic to noise ratio, formant frequency analyses of sustained vowels. Perceptual speech analysis and aerodynamic measurements (maximum phonation time and s/z ratio) were also conducted. There was no significant difference in any of the parameters between cold knife tonsillectomy and thermal welding system tonsillectomy groups (p>0.05). When the groups were contrasted among themselves with regards to preoperative and postoperative rates, fundamental frequency was found to be significantly decreased after tonsillectomy in both of the groups (p<0.001). First formant for the vowel /a/ in the cold knife tonsillectomy group and for the vowel /i/ in the thermal welding system tonsillectomy group, second formant for the vowel /u/ in the thermal welding system tonsillectomy group and third formant for the vowel /u/ in the cold knife tonsillectomy group were found to be significantly decreased (p<0.05). The surgical technique, whether it is cold knife or thermal welding system, does not appear to affect voice and speech in tonsillectomy patients. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Acoustic Correlates of Compensatory Adjustments to the Glottic and Supraglottic Structures in Patients with Unilateral Vocal Fold Paralysis

PubMed Central

2015-01-01

The goal of this study was to analyse perceptually and acoustically the voices of patients with Unilateral Vocal Fold Paralysis (UVFP) and compare them to the voices of normal subjects. These voices were analysed perceptually with the GRBAS scale and acoustically using the following parameters: mean fundamental frequency (F0), standard-deviation of F0, jitter (ppq5), shimmer (apq11), mean harmonics-to-noise ratio (HNR), mean first (F1) and second (F2) formants frequency, and standard-deviation of F1 and F2 frequencies. Statistically significant differences were found in all of the perceptual parameters. Also the jitter, shimmer, HNR, standard-deviation of F0, and standard-deviation of the frequency of F2 were statistically different between groups, for both genders. In the male data differences were also found in F1 and F2 frequencies values and in the standard-deviation of the frequency of F1. This study allowed the documentation of the alterations resulting from UVFP and addressed the exploration of parameters with limited information for this pathology. PMID:26557690
A real-time LPC-based vocal tract area display for voice development.

PubMed

Rossiter, D; Howard, D M; Downes, M

1994-12-01

This article reports the design and implementation of a graphical display that presents an approximation to vocal tract area in real time for voiced vowel articulation. The acoustic signal is digitally sampled by the system. From these data a set of reflection coefficients is derived using linear predictive coding. A matrix of area coefficients is then determined that approximates the vocal tract area of the user. From this information a graphical display is then generated. The complete cycle of analysis and display is repeated at approximately 20 times/s. Synchronised audio and visual sequences can be recorded and used as dynamic targets for articulatory development. Use of the system is illustrated by diagrams of system output for spoken cardinal vowels and for vowels sung in a trained and untrained style.
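The analysis chain described in this abstract (sampled signal, LPC reflection coefficients, area coefficients) can be illustrated with a minimal sketch. It is not the authors' implementation: the Levinson-Durbin recursion, the function names, the synthetic test frame, and the lossless-tube area relation used below are assumptions chosen for illustration, and the sign and direction of the area ratio vary between texts.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Unnormalised autocorrelation of one analysis frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion; returns the reflection coefficients k_1..k_order."""
    a = [1.0] + [0.0] * order   # prediction polynomial, a[0] is always 1
    err = r[0]                  # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return ks


def area_function(ks, start_area=1.0):
    """Relative tube areas from reflection coefficients.

    Assumed convention: A[i+1] = A[i] * (1 - k) / (1 + k). Other texts use the
    reciprocal ratio, so treat the orientation as illustrative only.
    """
    areas = [start_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


def synthetic_frame(n=240, seed=0):
    """Hypothetical Hamming-windowed test frame: two sinusoids plus weak noise."""
    rng = random.Random(seed)
    frame = []
    for t in range(n):
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * t / (n - 1))
        sample = (math.sin(2.0 * math.pi * 0.05 * t)
                  + 0.5 * math.sin(2.0 * math.pi * 0.12 * t)
                  + 0.1 * rng.uniform(-1.0, 1.0))
        frame.append(window * sample)
    return frame
```

A display loop in the spirit of the abstract would call `autocorrelation`, `reflection_coefficients`, and `area_function` on each new frame, roughly 20 times per second, and redraw the resulting areas.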
Digital signal processing algorithms for automatic voice recognition

NASA Technical Reports Server (NTRS)

Botros, Nazeih M.

1987-01-01

This work investigates the digital signal analysis algorithms currently implemented in automatic voice recognition systems. Automatic voice recognition means the capability of a computer to recognize and interact with verbal commands. The focus is on the digital signal analysis, rather than the linguistic analysis, of the speech signal. Several digital signal processing algorithms are available for voice recognition. Some of these algorithms are Linear Predictive Coding (LPC), Short-time Fourier Analysis, and Cepstrum Analysis. Among these algorithms, LPC is the most widely used. This algorithm has a short execution time and does not require large memory storage. However, it has several limitations due to the assumptions used to develop it. The other two algorithms are frequency domain algorithms with few assumptions, but they are not widely implemented or investigated. However, with recent advances in digital technology, namely signal processors, these two frequency domain algorithms may be investigated in order to implement them in voice recognition. This research is concerned with real time, microprocessor based recognition algorithms.
Voice care knowledge among clinicians and people with healthy voices or dysphonia.

PubMed

Fletcher, Helen M; Drinnan, Michael J; Carding, Paul N

2007-01-01

An important clinical component in the prevention and treatment of voice disorders is voice care and hygiene. Research in voice care knowledge has mainly focussed on specific groups of professional voice users with limited reporting on the tool and evidence base used. In this study, a questionnaire to measure voice care knowledge was developed based on "best evidence." The questionnaire was validated by measuring specialist voice clinicians' agreement. Preliminary data are then presented using the voice care knowledge questionnaire with 17 subjects with nonorganic dysphonia and 17 with healthy voices. There was high (89%) agreement among the clinicians. There was a highly significant difference between the dysphonic and the healthy group scores (P = 0.00005). Furthermore, the dysphonic subjects (63% agreement) presented with less voice care knowledge than the subjects with healthy voices (72% agreement). The questionnaire provides a useful and valid tool to investigate voice care knowledge. The findings have implications for clinical intervention, voice therapy, and health prevention.
Voice, Articulation, and Prosody Contribute to Listener Perceptions of Speaker Gender: A Systematic Review and Meta-Analysis.

PubMed

Leung, Yeptain; Oates, Jennifer; Chan, Siew Pang

2018-02-15

The aim of this study was to provide a systematic review of the aspects of verbal communication contributing to listener perceptions of speaker gender with a view to providing clinicians with guidance for the selection of the training goals when working with transsexual individuals. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) guidelines were adopted in this systematic review. Studies evaluating the contribution of aspects of verbal communication to listener perceptions of speaker gender were rated against a new risk of bias assessment tool. Relevant data were extracted, and narrative synthesis was then conducted. Meta-analyses were conducted when appropriate data were available. Thirty-eight articles met the eligibility criteria. Meta-analysis showed speaking fundamental frequency contributing to 41.6% of the variance in gender perception. Auditory-perceptual and acoustic measures of pitch, resonance, loudness, articulation, and intonation were found to be associated with listeners' perceptions of speaker gender. Tempo and stress were not significantly associated. Mixed findings were found as to the contribution of a breathy voice quality to gender perception. Nonetheless, there exists significant risk of bias in this body of research. Speech and language clinicians working with transsexual individuals may use the results of this review for goal setting. Further research is required to redress the significant risk of bias.
Voice Onset Time for the Word-Initial Voiceless Consonant /t/ in Japanese Spasmodic Dysphonia-A Comparison With Normal Controls.

PubMed

Yanagida, Saori; Nishizawa, Noriko; Mizoguchi, Kenji; Hatakeyama, Hiromitsu; Fukuda, Satoshi

2015-07-01

Voice onset time (VOT) for word-initial voiceless consonants in adductor spasmodic dysphonia (ADSD) and abductor spasmodic dysphonia (ABSD) patients was measured to determine (1) which acoustic measures differed from the controls and (2) whether acoustic measures were related to the pause or silence between the test word and the preceding word. Forty-eight patients with ADSD and nine patients with ABSD, as well as 20 matched normal controls, read a story in which the word "taiyo" (the sun) was repeated three times, each differentiated by the position of the word in the sentence. The target of measurement was the VOT for the word-initial voiceless consonant /t/. When the target syllable appeared in a sentence following a comma, or at the beginning of a sentence following a period, the ABSD patients' VOTs were significantly longer than those of the ADSD patients and controls. Abnormal prolongation of the VOTs was related to the pause or silence between the test word and the preceding word. VOTs in spasmodic dysphonia (SD) may vary according to the SD subtype or speaking conditions. VOT measurement was suggested to be a useful method for quantifying voice symptoms in SD. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Perceived control and voice handicap in patients with voice disorders.

PubMed

Frazier, Patricia; Merians, Addie; Misono, Stephanie

2017-11-01

The purpose of the study was to replicate and extend previous research on the relation between perceived present control and voice handicap and to further examine the psychometric properties of a present control scale adapted for patients with voice disorders (Misono, Meredith, Peterson, & Frazier, 2016). Sample 1 consisted of 1,129 patients recruited from a voice disorder clinic who completed measures of perceived present control, distress, and voice handicap in the clinic. Sample 2 consisted of 62 patients from the same clinic who completed measures of present control, distress, voice handicap, and general control beliefs online at baseline and measures of present control and voice handicap again 3 weeks later (n = 59). With regard to the psychometric properties of the voice-adapted present control scale, alpha coefficients were above .80 and the 3-week test-retest reliability coefficient was .69. There was mixed support for the hypothesized 1-factor structure of the scale. In Sample 1, present control was more strongly associated with lower voice handicap than was distress and accounted for significant variance in voice handicap controlling for distress. In Sample 2, present control at baseline predicted later voice handicap, controlling for general control beliefs and distress. Present control appears to be a promising target for adjunctive interventions for patients with voice disorders. An evidence-based online present control intervention (Hintz, Frazier, & Meredith, 2015) is being adapted for this patient population. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Acoustic vibration analysis for utilization of woody plant in space environment

NASA Astrophysics Data System (ADS)

Chida, Yukari; Yamashita, Masamichi; Hashimoto, Hirofumi; Sato, Seigo; Tomita-Yokotani, Kaori; Baba, Keiichi; Suzuki, Toshisada; Motohashi, Kyohei; Sakurai, Naoki; Nakagawa-izumi, Akiko

2012-07-01

We propose raising woody plants for space agriculture on Mars, where wood would be utilized within the ecosystem. Nobody knows the shape a tree actually takes when grown in a space environment under low or micro gravitational conditions. Angiosperm trees form tension wood to keep their shape. Tension wood formation is deeply related to gravity, but the details of the mechanism of its formation have not yet been clarified. As a first step toward clarifying the mechanism, a space experiment on the International Space Station (ISS) is the best way to investigate it. It is necessary to establish an easy method for the crew who conduct the experiments on the ISS. Here, we propose to investigate the feasibility of acoustic vibration analysis for the experiment on the ISS. Two types of Japanese cherry tree, weeping and upright types of Prunus sp., were analyzed by the acoustic vibration method. The coefficient of variation (CV) of sound speed was calculated by the acoustic vibration analysis. The amounts of lignin and decomposed lignin were estimated by the Klason and Py-GC/MS methods, respectively. The relationships between the results of acoustic vibration analysis and the inner components of the tested woody materials were investigated. After the experiments, we confirmed the correlation between them. Our results indicated that acoustic vibration analysis would be useful as a nondestructive method for determining the inside composition in an outer space environment.
Telephony-based voice pathology assessment using automated speech analysis.

PubMed

Moran, Rosalyn J; Reilly, Richard B; de Chazal, Philip; Lacy, Peter D

2006-03-01

A system for remotely detecting vocal fold pathologies using telephone-quality speech is presented. The system uses a linear classifier, processing measurements of pitch perturbation, amplitude perturbation and harmonic-to-noise ratio derived from digitized speech recordings. Voice recordings from the Disordered Voice Database Model 4337 system were used to develop and validate the system. Results show that while a sustained phonation, recorded in a controlled environment, can be classified as normal or pathologic with accuracy of 89.1%, telephone-quality speech can be classified as normal or pathologic with an accuracy of 74.2%, using the same scheme. Amplitude perturbation features prove most robust for telephone-quality speech. The pathologic recordings were then subcategorized into four groups, comprising normal, neuromuscular pathologic, physical pathologic and mixed (neuromuscular with physical) pathologic. A separate classifier was developed for classifying the normal group from each pathologic subcategory. Results show that neuromuscular disorders could be detected remotely with an accuracy of 87%, physical abnormalities with an accuracy of 78% and mixed pathology voice with an accuracy of 61%. This study highlights the real possibility for remote detection and diagnosis of voice pathology.
Acoustic and laryngographic measures of the laryngeal reflexes of linguistic prominence and vocal effort in German

PubMed Central

Mooshammer, Christine

2010-01-01

This study uses acoustic and physiological measures to compare laryngeal reflexes of global changes in vocal effort to the effects of modulating such aspects of linguistic prominence as sentence accent, induced by focus variation, and word stress. Seven speakers were recorded by using a laryngograph. The laryngographic pulses were preprocessed to normalize time and amplitude. The laryngographic pulse shape was quantified using open and skewness quotients and also by applying a functional version of the principal component analysis. Acoustic measures included the acoustic open quotient and spectral balance in the vowel /e/ during the test syllable. The open quotient and the laryngographic pulse shape indicated a significantly shorter open phase for loud speech than for soft speech. Similar results were found for lexical stress, suggesting that lexical stress and loud speech are produced with a similar voice source mechanism. Stressed syllables were distinguished from unstressed syllables by their open phase and pulse shape, even in the absence of sentence accent. Evidence for laryngeal involvement in signaling focus, independent of fundamental frequency changes, was not as consistent across speakers. Acoustic results on various spectral balance measures were generally much less consistent compared to results from laryngographic data. PMID:20136226
Sasak Voice

ERIC Educational Resources Information Center

Asikin-Garmager, Eli Scott

2017-01-01

This dissertation provides a formal and functional analysis of grammatical voice in Sasak, an Austronesian language spoken in Eastern Indonesia. The research addresses two primary questions, which are (1) how does Sasak clause structure and morphosyntax vary across dialects? and (2) what shapes speakers' syntactic production, namely grammatical…
The Effect of Microphone Type on Acoustical Measures of Synthesized Vowels.

PubMed

Kisenwether, Jessica Sofranko; Sataloff, Robert T

2015-09-01

The purpose of this study was to compare microphones of different directionality, transducer type, and cost, with attention to their effects on acoustical measurements of period perturbation, amplitude perturbation, and noise using synthesized sustained vowel samples. This was a repeated measures design. Synthesized sustained vowel stimuli (with known acoustic characteristics and systematic changes in jitter, shimmer, and noise-to-harmonics ratio) were recorded by a variety of dynamic and condenser microphones. Files were then analyzed for mean fundamental frequency (fo), fo standard deviation, absolute jitter, shimmer in dB, peak-to-peak amplitude variation, and noise-to-harmonics ratio. Acoustical measures following recording were compared with the synthesized, known acoustical measures before recording. Although informal analyses showed some differences among microphones, and analyses of variance showed that type of microphone is a significant predictor, t-tests revealed that none of the microphones generated different means compared with the generated acoustical measures. In this sample, microphone type, directionality, and cost did not have a significant effect on the validity of acoustic measures. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
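Two of the perturbation measures named in this abstract, absolute jitter and shimmer in dB, can be written down compactly. The sketch below uses the commonly cited cycle-to-cycle definitions; it assumes the glottal periods and peak amplitudes have already been extracted, and the function names are illustrative rather than taken from the study or from any analysis package.

```python
import math


def absolute_jitter(periods_s):
    """Mean absolute difference between consecutive cycle periods, in seconds."""
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    return sum(diffs) / len(diffs)


def shimmer_db(peak_amplitudes):
    """Mean absolute level difference between consecutive cycle amplitudes, in dB."""
    steps = [abs(20.0 * math.log10(b / a))
             for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    return sum(steps) / len(steps)
```

With a synthesized vowel, the perturbation values built into the stimulus are known, so the same two functions can be applied to the recording from each microphone and the results compared against the known values, which mirrors the comparison the study describes.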
Auditory-Perceptual and Acoustic Methods in Measuring Dysphonia Severity of Korean Speech.

PubMed

Maryn, Youri; Kim, Hyung-Tae; Kim, Jaeock

2016-09-01

The purpose of this study was to explore the criterion-related concurrent validity of two standardized auditory-perceptual rating protocols and the Acoustic Voice Quality Index (AVQI) for measuring dysphonia severity in Korean speech. Sixty native Korean subjects with various voice disorders were asked to sustain the vowel [a:] and to read aloud the Korean text "Walk." A 3-second midvowel portion of the sustained vowel and two sentences (with 25 syllables) were edited, concatenated, and analyzed according to methods described elsewhere. From 56 participants, both continuous speech and sustained vowel recordings had sufficiently high signal-to-noise ratios (35.5 dB and 37 dB on average, respectively) and were therefore subjected to further dysphonia severity analysis with (1) "G" or Grade from the GRBAS protocol, (2) "OS" or Overall Severity from the Consensus Auditory-Perceptual Evaluation of Voice protocol, and (3) AVQI. First, high correlations were found between G and OS (rS = 0.955 for sustained vowels; rS = 0.965 for continuous speech). Second, the AVQI showed a strong correlation with G (rS = 0.911) as well as OS (rP = 0.924). These findings are in agreement with similar studies dealing with continuous speech in other languages. The present study highlights the criterion-related concurrent validity of these methods in Korean speech. Furthermore, it supports the cross-linguistic robustness of the AVQI as a valid and objective marker of overall dysphonia severity. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Vibro-acoustic modeling and analysis of a coupled acoustic system comprising a partially opened cavity coupled with a flexible plate

NASA Astrophysics Data System (ADS)

Shi, Shuangxia; Su, Zhu; Jin, Guoyong; Liu, Zhigang

2018-01-01

This paper is concerned with the modeling and solution method of a three-dimensional (3D) coupled acoustic system comprising a partially opened cavity coupled with a flexible plate and an exterior field of semi-infinite size, which is ubiquitously encountered in architectural acoustics and is a reasonable representation of many engineering situations. A general solution method is presented to predict the dynamic behaviors of the 3D coupled acoustic system, in which the displacement of the plate and the sound pressure in the cavity are respectively constructed in the form of two-dimensional and three-dimensional modified Fourier series, with several auxiliary functions introduced to ensure the uniform convergence of the solution over the entire solution domain. The effect of the opening is taken into account via the work done by the sound pressure acting at the coupling aperture that is contributed from the vibration of particles on the acoustic coupling interface and on the structural-acoustic coupling interface. Both the acoustic coupling between the finite cavity and the exterior field and the structural-acoustic coupling between the flexible plate and the interior acoustic field are considered in the vibro-acoustic modeling of the 3D coupled acoustic system. The dynamic responses of the coupled structural-acoustic system are obtained using the Rayleigh-Ritz procedure based on the energy expressions for the coupled system. The accuracy and effectiveness of the proposed method are validated through numerical examples and comparison with results obtained by the boundary element analysis. Furthermore, the influence of the opening and the cavity volume on the acoustic behaviors of the opened cavity system is studied.
Effects of vocal training and phonatory task on voice onset time.

PubMed

McCrea, Christopher R; Morris, Richard J

2007-01-01

The purpose of this study was to examine the temporal-acoustic differences between trained singers and nonsingers during speech and singing tasks. Thirty male participants were separated into two groups of 15 according to level of vocal training (ie, trained or untrained). The participants spoke and sang carrier phrases containing English voiced and voiceless bilabial stops, and voice onset time (VOT) was measured for the stop consonant productions. Mixed analyses of variance revealed a significant main effect between speech and singing for /p/ and /b/, with VOT durations longer during speech than singing for /p/, and the opposite true for /b/. Furthermore, a significant phonatory task by vocal training interaction was observed for /p/ productions. The results indicated that the type of phonatory task influences VOT and that these influences are most obvious in trained singers secondary to the articulatory and phonatory adjustments learned during vocal training.
Speech Recognition: Acoustic-Phonetic Knowledge Acquisition and Representation.

DTIC Science & Technology

1987-09-25

the release duration is the voice onset time, or VOT. For the purpose of this investigation, alveolar flaps (as in "butter") and glottalized /t/’s...Cambridge, Massachusetts 02139 Abstract females and 8 males. The other sentence was said by 7 females We discuss a framework for an acoustic-phonetic...tarned a number of semivowels. One sentence was said by 6 vowels + + supported by a Xerox Fellowship Table It Features which characterize
Quantitative evaluation of the voice range profile in patients with voice disorder.

PubMed

Ikeda, Y; Masuda, T; Manako, H; Yamashita, H; Yamamoto, T; Komiyama, S

1999-01-01

In 1953, Calvet first displayed the fundamental frequency (pitch) and sound pressure level (intensity) of a voice on a two-dimensional plane and created a voice range profile. This profile has been used to evaluate clinically various vocal disorders, although such evaluations to date have been subjective without quantitative assessment. In the present study, a quantitative system was developed to evaluate the voice range profile utilizing a personal computer. The area of the voice range profile was defined as the voice volume. This volume was analyzed in 137 males and 175 females who were treated for various dysphonias at Kyushu University between 1984 and 1990. Ten normal subjects served as controls. The voice volume in cases with voice disorders significantly decreased irrespective of the disease and sex. Furthermore, cases having better improvement after treatment showed a tendency for the voice volume to increase. These findings illustrated the voice volume as a useful clinical test for evaluating voice control in cases with vocal disorders.
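The abstract defines the "voice volume" as the area of the voice range profile but does not say how that area was computed. One simple possibility, shown here purely as an assumption, is to sum the intensity span (loudest minus softest phonation) over every semitone the speaker can produce, giving an area in semitone-dB units. The data layout and function name are hypothetical.

```python
def voice_range_area(profile):
    """Area of a voice range profile in semitone-dB units.

    `profile` maps a semitone index to a (softest_db, loudest_db) pair.
    Each semitone is treated as a strip one semitone wide (an assumption).
    """
    area = 0.0
    for semitone, (softest_db, loudest_db) in profile.items():
        if loudest_db < softest_db:
            raise ValueError(f"inverted intensity range at semitone {semitone}")
        area += loudest_db - softest_db
    return area
```

Under this definition a profile that shrinks in either pitch range or dynamic range yields a smaller area, which is consistent with the decrease the study reports for disordered voices.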
Voice outcomes after concurrent chemoradiotherapy for advanced nonlaryngeal head and neck cancer: a prospective study.

PubMed

Paleri, Vinidh; Carding, Paul; Chatterjee, Sanjoy; Kelly, Charles; Wilson, Janet Ann; Welch, Andrew; Drinnan, Michael

2012-12-01

The voice impact of treatment for nonlaryngeal head and neck primary sites remains unknown. We conducted a prospective study of a consecutive sample of patients undergoing chemoradiation for nonlaryngeal head and neck cancer. The Voice Symptom Scale (VoiSS) was completed, and voice recordings were made at 3 time-points. Of 42 recruited patients, 34 completed the measures before and in the early posttreatment phase (mean 16.5 weeks), while 21 patients were assessed at the final time-point (mean, 20.4 months). VoiSS scores showed statistically significant progressive deterioration in the total score (p = .02) and impairment subscale (p < .0001) through to the final assessment. Acoustic measures and perceptual ratings deteriorated significantly (p < .001) in the early posttreatment weeks and improved at the final assessment, but not to the baseline. Interrater agreement was excellent for expert measures. To the best of our knowledge, this is the first prospective study to show that chemoradiation therapy for nonlaryngeal head and neck cancer has a significant effect on the patients' self-reported voice quality, even in the long term. Copyright © 2012 Wiley Periodicals, Inc.
Voice symptoms and voice-related quality of life in college students.

PubMed

Merrill, Ray M; Tanner, Kristine; Merrill, Joseph G; McCord, Matthew D; Beardsley, Melissa M; Steele, Brittanie A

2013-08-01

The purpose of this study was to examine the prevalence of voice disorders in college students and their effect on the students as shown by quality-of-life indicators. A cross-sectional survey was completed by 545 college students in 2012. The survey included 10 questions from the Voice-Related Quality of Life (V-RQOL), selected voice symptoms, and quality-of-life indicators of functional health and well-being based on the Short Form 36-item Health Survey (SF-36). Twenty-nine percent of the college students (mean age, 22.7 years) reported a history of a voice disorder. Hoarseness was the most prevalent voice symptom, but was not correlated with V-RQOL scores. A wobbly or shaky voice, throat dryness, vocal fatigue, and vocal effort explained a significant amount of variance on the social-emotional and physical domains of the V-RQOL index (p < 0.05). Voice symptoms limited emotional and physical functioning as indicated by SF-36 scores. Voice disorders significantly influence psychosocial and physical functioning in college students. These findings have important implications for voice-care services in this population.

[Fundamental frequency analysis - a contribution to the objective examination of the speaking and singing voice (author's transl)].

PubMed

Schultz-Coulon, H J

1975-07-01

The applicability of a newly developed fundamental frequency analyzer to diagnosis in phoniatrics is reviewed. During routine voice examination, the analyzer allows a quick and accurate measurement of fundamental frequency and sound level of the speaking voice, and of vocal range and maximum phonation time. By computing fundamental frequency histograms, the median fundamental frequency and the total pitch range can be better determined and compared. Objective studies of certain technical faculties of the singing voice, which usually are estimated subjectively by the speech therapist, may now be done by means of this analyzer. Several examples demonstrate the differences between correct and incorrect phonation. These studies compare the pitch perturbations during the crescendo and decrescendo of a swell-tone, and show typical traces of staccato, trill and yodel. Conclusions of the study indicate that fundamental frequency analysis is a valuable supplemental method for objective voice examination.
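The histogram-based summary mentioned in this abstract can be sketched in a few lines. The example assumes a fundamental frequency track in Hz is already available, with 0 marking unvoiced frames; the semitone binning relative to A4 = 440 Hz and all names are illustrative choices, not details of the analyzer described.

```python
import math
import statistics
from collections import Counter


def hz_to_semitone(f0_hz, ref_hz=440.0):
    """Pitch on a semitone scale where 69 corresponds to ref_hz (MIDI-style)."""
    return 69.0 + 12.0 * math.log2(f0_hz / ref_hz)


def f0_summary(f0_track_hz):
    """Median F0, total pitch range, and a semitone histogram of voiced frames."""
    voiced = [f for f in f0_track_hz if f > 0]
    return {
        "median_hz": statistics.median(voiced),
        "range_semitones": 12.0 * math.log2(max(voiced) / min(voiced)),
        "histogram": Counter(round(hz_to_semitone(f)) for f in voiced),
    }
```

Reading the median from the distribution rather than averaging raw values keeps occasional octave errors in the F0 track from dominating the summary.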
Women use voice parameters to assess men's characteristics

PubMed Central

Bruckert, Laetitia; Liénard, Jean-Sylvain; Lacroix, André; Kreutzer, Michel; Leboucher, Gérard

2005-01-01

The purpose of this study was: (i) to provide additional evidence regarding the existence of human voice parameters, which could be reliable indicators of a speaker's physical characteristics and (ii) to examine the ability of listeners to judge voice pleasantness and a speaker's characteristics from speech samples. We recorded 26 men enunciating five vowels. Voices were played to 102 female judges who were asked to assess vocal attractiveness and speakers' age, height and weight. Statistical analyses were used to determine: (i) which physical component predicted which vocal component and (ii) which vocal component predicted which judgment. We found that men with low-frequency formants and small formant dispersion tended to be older, taller and tended to have a high level of testosterone. Female listeners were consistent in their pleasantness judgment and in their height, weight and age estimates. Pleasantness judgments were based mainly on intonation. Female listeners were able to correctly estimate age by using formant components. They were able to estimate weight but we could not explain which acoustic parameters they used. However, female listeners were not able to estimate height, possibly because they used intonation incorrectly. Our study confirms that in all mammal species examined thus far, including humans, formant components can provide a relatively accurate indication of a vocalizing individual's characteristics. Human listeners have the necessary information at their disposal; however, they do not necessarily use it. PMID:16519239
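Formant dispersion is not defined in this abstract. A widely used definition is the average spacing between successive formant frequencies, and the sketch below assumes that definition; the function name and the example values are illustrative, not data from the study.

```python
def formant_dispersion(formants_hz):
    """Average spacing in Hz between successive formants (assumed definition)."""
    if len(formants_hz) < 2:
        raise ValueError("at least two formant frequencies are required")
    ordered = sorted(formants_hz)
    spacings = [upper - lower for lower, upper in zip(ordered, ordered[1:])]
    return sum(spacings) / len(spacings)
```

Because the spacings telescope, the result equals the distance between the highest and lowest formant divided by the number of gaps, so a longer vocal tract with more closely packed formants gives a smaller value.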
Pulse analysis of acoustic emission signals. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Houghton, J. R.

1976-01-01

A method for the signature analysis of pulses in the frequency domain and the time domain is presented. Fourier spectrum, Fourier transfer function, shock spectrum and shock spectrum ratio are examined in the frequency domain analysis, and pulse shape deconvolution is developed for use in the time domain analysis. To demonstrate the relative sensitivity of each of the methods to small changes in the pulse shape, signatures of computer modeled systems with analytical pulses are presented. Optimization techniques are developed and used to indicate the best design parameter values for deconvolution of the pulse shape. Several experiments are presented that test the pulse signature analysis methods on different acoustic emission sources. These include acoustic emissions associated with: (1) crack propagation, (2) ball dropping on a plate, (3) spark discharge and (4) defective and good ball bearings.
An acoustic analysis of laughter produced by congenitally deaf and normally hearing college students.

PubMed

Makagon, Maja M; Funayama, E Sumie; Owren, Michael J

2008-07-01

Relatively few empirical data are available concerning the role of auditory experience in nonverbal human vocal behavior, such as laughter production. This study compared the acoustic properties of laughter in 19 congenitally, bilaterally, and profoundly deaf college students and in 23 normally hearing control participants. Analyses focused on degree of voicing, mouth position, air-flow direction, temporal features, relative amplitude, fundamental frequency, and formant frequencies. Results showed that laughter produced by the deaf participants was fundamentally similar to that produced by the normally hearing individuals, which in turn was consistent with previously reported findings. Finding comparable acoustic properties in the sounds produced by deaf and hearing vocalizers confirms the presumption that laughter is importantly grounded in human biology, and that auditory experience with this vocalization is not necessary for it to emerge in species-typical form. Some differences were found between the laughter of deaf and hearing groups; the most important being that the deaf participants produced lower-amplitude and longer-duration laughs. These discrepancies are likely due to a combination of the physiological and social factors that routinely affect profoundly deaf individuals, including low overall rates of vocal fold use and pressure from the hearing world to suppress spontaneous vocalizations.
Identifying hidden voice and video streams

NASA Astrophysics Data System (ADS)

Fan, Jieyan; Wu, Dapeng; Nucci, Antonio; Keralapura, Ram; Gao, Lixin

2009-04-01

Given the rising popularity of voice and video services over the Internet, accurately identifying voice and video traffic that traverses their networks has become a critical task for Internet service providers (ISPs). As the number of proprietary applications that deliver voice and video services to end users increases over time, the search for the one methodology that can accurately detect such services while being application independent still remains open. This problem becomes even more complicated when voice and video service providers like Skype, Microsoft, and Google bundle their voice and video services with other services like file transfer and chat. For example, a bundled Skype session can contain both voice stream and file transfer stream in the same layer-3/layer-4 flow. In this context, traditional techniques to identify voice and video streams do not work. In this paper, we propose a novel self-learning classifier, called VVS-I, that detects the presence of voice and video streams in flows with minimum manual intervention. Our classifier works in two phases: training phase and detection phase. In the training phase, VVS-I first extracts the relevant features, and subsequently constructs a fingerprint of a flow using the power spectral density (PSD) analysis. In the detection phase, it compares the fingerprint of a flow to the existing fingerprints learned during the training phase, and subsequently classifies the flow. Our classifier is not only capable of detecting voice and video streams that are hidden in different flows, but is also capable of detecting different applications (like Skype, MSN, etc.) that generate these voice/video streams. We show that our classifier can achieve close to 100% detection rate while keeping the false positive rate to less than 1%.
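The two-phase idea in this abstract (build a PSD fingerprint per flow during training, then match new flows against stored fingerprints) can be illustrated with a toy sketch. This is not VVS-I: the choice of a per-interval packet-size series as the feature, the plain periodogram, the Euclidean nearest-fingerprint rule, and every name below are assumptions made for illustration.

```python
import cmath
import math


def normalized_psd(series):
    """Periodogram of a mean-removed series, scaled so the bins sum to 1."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power.append(abs(coeff) ** 2)
    total = sum(power) or 1.0   # avoid dividing by zero for a constant series
    return [p / total for p in power]


def classify(series, fingerprints):
    """Return the label of the stored fingerprint closest to the series' PSD."""
    psd = normalized_psd(series)

    def distance(reference):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(psd, reference)))

    return min(fingerprints, key=lambda label: distance(fingerprints[label]))
```

In this toy setting, training amounts to storing `normalized_psd(flow)` under a label for each known traffic type, and detection is a call to `classify` on an unseen flow. A real system would need a more robust spectral estimate and a decision threshold for flows that match nothing.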
The singing power ratio as an objective measure of singing voice quality in untrained talented and nontalented singers.

PubMed

Watts, Christopher; Barnes-Burroughs, Kathryn; Estis, Julie; Blanton, Debra

2006-03-01

A growing body of contemporary research has investigated differences between trained and untrained singing voices. However, few studies have separated untrained singers into those who do and do not express abilities related to singing talent, including accurate pitch control and production of a pleasant timbre (voice quality). This investigation studied measures of the singing power ratio (SPR), which is a quantitative measure of the resonant quality of the singing voice. SPR reflects the amplification or suppression in the vocal tract of the harmonics produced by the sound source. This measure was acquired from the voices of untrained talented and nontalented singers as a means to objectively investigate voice quality differences. Measures of SPR were acquired from vocal samples with fast Fourier transform (FFT) power spectra to analyze the amplitude level of the partials in the acoustic spectrum. Long-term average spectra (LTAS) were also analyzed. Results indicated significant differences in SPR between groups, which suggest that vocal tract resonance, and its effect on perceived vocal timbre or quality, may be an important variable related to the perception of singing talent. LTAS confirmed group differences in the tuning of vocal tract harmonics.
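As a rough illustration of how an SPR value might be obtained from an FFT power spectrum, the sketch below compares the strongest spectral peak in the 2-4 kHz band with the strongest peak in the 0-2 kHz band. These band edges follow a commonly cited definition of SPR and are an assumption here, since the abstract does not state them; the function name `singing_power_ratio` is likewise hypothetical.

```python
import numpy as np

def singing_power_ratio(signal, sample_rate):
    """SPR in dB: strongest peak in 2-4 kHz relative to strongest peak in 0-2 kHz.

    The band edges are an assumption (a commonly cited definition),
    not a detail given in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    # A Hann window limits spectral leakage between the two bands.
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    low_peak = power[(freqs > 0) & (freqs < 2000)].max()
    high_peak = power[(freqs >= 2000) & (freqs <= 4000)].max()
    return 10.0 * np.log10(high_peak / low_peak)
```

Under this definition a value closer to zero means relatively more energy in the upper band, which is the kind of resonance difference the study examines between groups.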
Particle analysis in an acoustic cytometer

DOEpatents

Kaduchak, Gregory; Ward, Michael D

2012-09-18

The present invention is a method and apparatus for acoustically manipulating one or more particles. Acoustically manipulated particles may be separated by size. The particles may be flowed in a flow stream and acoustic radiation pressure, which may be radial, may be applied to the flow stream. This application of acoustic radiation pressure may separate the particles. In one embodiment, the particles may be separated by size, and as a further example, the larger particles may be transported to a central axis.
Issues in forensic voice.

PubMed

Hollien, Harry; Huntley Bahr, Ruth; Harnsberger, James D

2014-03-01

The following article provides a general review of an area that can be referred to as Forensic Voice. Its goals will be outlined and that discussion will be followed by a description of its major elements. Considered are (1) the processing and analysis of spoken utterances, (2) distorted speech, (3) enhancement of speech intelligibility (re: surveillance and other recordings), (4) transcripts, (5) authentication of recordings, (6) speaker identification, and (7) the detection of deception, intoxication, and emotions in speech. Stress in speech and the psychological stress evaluation systems (that some individuals attempt to use as lie detectors) also will be considered. Points of entry will be suggested for individuals with the kinds of backgrounds possessed by professionals already working in the voice area. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Acoustic sensor array extracts physiology during movement

NASA Astrophysics Data System (ADS)

Scanlon, Michael V.

2001-08-01

An acoustic sensor attached to a person's neck can extract heart and breath sounds, as well as voice and other physiology related to their health and performance. Soldiers, firefighters, law enforcement, and rescue personnel, as well as people at home or in health care facilities, can benefit from being remotely monitored. ARL's acoustic sensor, when worn around a person's neck, picks up the carotid artery and breath sounds very well by matching the sensor's acoustic impedance to that of the body via a gel pad, while airborne noise is minimized by an impedance mismatch. Although the physiological sounds have high SNR, the acoustic sensor also responds to motion-induced artifacts that obscure the meaningful physiology. To complicate signal extraction further, these interfering signals are usually covariant with the heart sounds, in that as a person walks faster the heart tends to beat faster, and motion noises tend to contain low-frequency components similar to the heart sounds. A noise-canceling configuration developed by ARL uses two acoustic sensors on the front sides of the neck as physiology sensors, and two additional acoustic sensors on the back sides of the neck as noise references. Breath and heart sounds, which occur with near symmetry and simultaneously at the two front sensors, will correlate well. The motion noise present on all four sensors will be used to cancel the noise on the two physiology sensors. This report will compare heart rate variability derived from both the acoustic array and from ECG data taken simultaneously on a treadmill test. Acoustically derived breath rate and volume approximations will be introduced as well. A miniature 3-axis accelerometer on the same neckband provides additional noise references to validate footfall and motion activity.
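The reference-sensor idea described above resembles classical adaptive noise cancellation. The sketch below uses a normalized LMS filter with a single reference channel. It is a substituted textbook technique, not ARL's documented method, and the function name `nlms_cancel`, the tap count, and the step size are illustrative assumptions.

```python
import numpy as np

def nlms_cancel(primary, reference, n_taps=8, step=0.1, eps=1e-8):
    """Subtract the part of `primary` that is predictable from `reference`.

    Normalized LMS adaptive filter; an assumed stand-in for the
    noise-canceling configuration, which the abstract does not specify.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    weights = np.zeros(n_taps)
    cleaned = np.zeros_like(primary)
    for n in range(len(primary)):
        # Most recent n_taps reference samples, newest first, zero-padded early on.
        recent = reference[max(0, n - n_taps + 1):n + 1][::-1]
        x = np.zeros(n_taps)
        x[:len(recent)] = recent
        noise_estimate = weights @ x
        error = primary[n] - noise_estimate  # the error is the cleaned sample
        weights += step * error * x / (x @ x + eps)
        cleaned[n] = error
    return cleaned
```

Here `primary` would be a front (physiology) sensor and `reference` a back (noise) sensor. The approach only works to the extent that the reference contains motion noise but little of the heart or breath signal.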
Electrical circuit modeling and analysis of microwave acoustic interaction with biological tissues.

PubMed

Gao, Fei; Zheng, Qian; Zheng, Yuanjin

2014-05-01

Numerical study of microwave imaging and microwave-induced thermoacoustic imaging utilizes finite difference time domain (FDTD) analysis for simulation of microwave and acoustic interaction with biological tissues, which is time consuming due to complex grid segmentation and numerous calculations, not straightforward due to the lack of an analytical solution and physical explanation, and incompatible with hardware development requiring a circuit simulator such as SPICE. In this paper, instead of conventional FDTD numerical simulation, an equivalent electrical circuit model is proposed to model the microwave acoustic interaction with biological tissues for fast simulation and quantitative analysis in both one and two dimensions (2D). The equivalent circuit of an ideal point-like tissue for microwave-acoustic interaction is proposed, including a transmission line, voltage-controlled current source, envelope detector, and resistor-inductor-capacitor (RLC) network, to model the microwave scattering, thermal expansion, and acoustic generation. On this basis, a two-port network of the point-like tissue is built and characterized using pseudo S-parameters and transducer gain. A two-dimensional circuit network including an acoustic scatterer and acoustic channel is also constructed to model the 2D spatial information and acoustic scattering effect in a heterogeneous medium. FDTD simulation, circuit simulation, and experimental measurement are all performed to compare the results in terms of time domain, frequency domain, and pseudo S-parameter characterization. 2D circuit network simulation is also performed under different scenarios including different sizes of tumors and the effect of the acoustic scatterer. The proposed circuit model of microwave acoustic interaction with biological tissue could give good agreement with FDTD-simulated and experimentally measured results. The pseudo S-parameters and characteristic gain could globally evaluate the performance of tumor detection. The 2D circuit network
Voice to Voice: Developing In-Service Teachers' Personal, Collaborative, and Public Voices.

ERIC Educational Resources Information Center

Thurber, Frances; Zimmerman, Enid

1997-01-01

Describes a model for inservice education that begins with an interchange of teachers' voices with those of the students in an interactive dialog. The exchange allows them to develop their private voices through self-reflection and validation of their own experiences. (JOW)
Interventions for preventing voice disorders in adults.

PubMed

Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H

2007-10-17

Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. To evaluate the effectiveness of interventions to prevent voice disorders in adults. We searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2 2006), CINAHL (1983 to 2006), PsychINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. Randomised controlled clinical trials (RCTs) of interventions evaluating the effectiveness of treatments to prevent voice disorders in adults. For work-directed interventions interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: One study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: One study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: One study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention. The same study did however find an improvement in maximum phonation time (Mean Difference -3.18 sec; 95% CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no
What can vortices tell us about vocal fold vibration and voice production.

PubMed

Khosla, Sid; Murugappan, Shanmugam; Gutmark, Ephraim

2008-06-01

Much clinical research on laryngeal airflow has assumed that airflow is unidirectional. This review will summarize what additional knowledge can be obtained about vocal fold vibration and voice production by studying rotational motion, or vortices, in laryngeal airflow. Recent work suggests two types of vortices that may strongly contribute to voice quality. The first kind forms just above the vocal folds during glottal closing, and is formed by flow separation in the glottis; these flow separation vortices significantly contribute to rapid closing of the glottis, and hence, to producing loudness and high frequency harmonics in the acoustic spectrum. The second is a group of highly three-dimensional and coherent supraglottal vortices, which can produce sound by interaction with structures in the vocal tract. Present work is also described that suggests that certain laryngeal pathologies, such as asymmetric vocal fold tension, will significantly modify both types of vortices, with adverse impact on sound production: decreased rate of glottal closure, increased broadband noise, and a decreased signal to noise ratio. Recent research supports the hypothesis that glottal airflow contains certain vortical structures that significantly contribute to voice quality.
Acoustic Analysis of Nasal Vowels in Monguor Language

NASA Astrophysics Data System (ADS)

Zhang, Hanbin

2017-09-01

The purpose of the study is to analyze the spectral characteristics and acoustic features of the nasal vowels [ɑ˜] and [ɔ˜] in the Monguor language. On the basis of an acoustic parameter database of Monguor speech, the study finds that five main zero-pole pairs appear for the nasal vowel [ɔ˜] and two zero-pole pairs appear for the nasal vowel [ɔ˜]. The results of regression analysis demonstrate that the duration of the nasal vowel [ɔ˜] or the nasal vowel [ɔ˜] can be predicted by its F1, F2 and F3 respectively.
Applicability of the Arabic version of Vocal Tract Discomfort Scale (VTDS) with student singers as professional voice users.

PubMed

Darawsheh, Wesam B; Natour, Yaser S; Sada, Eve G

2018-07-01

This pilot study aimed to evaluate the internal consistency, convergent construct validity and criterion validity of the Arabic version of the Vocal Tract Discomfort Scale (VTDS), and to investigate the correlation between the scores of the VTDS, the VHI and the acoustic measures of fundamental frequency (F0), shimmer, jitter and signal-to-noise ratio (SNR). This was a cross-sectional study of 97 participants (47 males and 50 females; mean age 20.5 ± 2.1 years; 31 student singers and 66 other non-professional voice user students). Participants had no self-perceived voice disorders; they completed the VTDS-Arab scale and the Voice Handicap Index (VHI-Arab), and recorded a vocal sample of /a:/ at a comfortable level. Internal consistency indicating reliability was confirmed by Cronbach's α = 0.884 and 0.874 for the VTDS-Arab frequency and severity subscales, respectively. A moderate positive correlation was found between the VTDS-Arab (frequency, severity, total) and the VHI-Arab total, with Pearson's correlation coefficients of r = 0.459, 0.430 and 0.451, respectively. Weak correlations were found between all of the acoustic measures and the scores of the VTDS-Arab and VHI-Arab (total and subscales). The area under the curve for the VTDS was AUC = 0.824, 0.804 and 0.817 for the VTDS frequency, VTDS severity and VTDS total, respectively. The VTDS-Arab is a valid and reliable tool in measuring vocal tract sensations and predicting the perception of vocal handicap in student singers and can be used to predict the vocal load among professional voice users.
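For readers unfamiliar with the internal-consistency statistic reported above, the following sketch computes Cronbach's α from a participants-by-items score matrix using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the input layout are assumptions for illustration; the study's own software is not described in the abstract.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a 2-D array: rows are participants, columns are items."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    # Sum of the sample variances of the individual items.
    item_variance_sum = scores.var(axis=0, ddof=1).sum()
    # Sample variance of each participant's total score.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return n_items / (n_items - 1) * (1.0 - item_variance_sum / total_variance)
```

Values approach 1 when items rise and fall together across participants, which is why figures such as 0.884 and 0.874 are read as good internal consistency.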
Voice and endocrinology

PubMed Central

Hari Kumar, K. V. S.; Garg, Anurag; Ajai Chandra, N. S.; Singh, S. P.; Datta, Rakesh

2016-01-01

Voice is one of the advanced features of natural evolution that differentiates human beings from other primates. The human voice is capable of conveying thoughts as spoken words along with a subtle emotion in the tone. This extraordinary character of the voice in expressing multiple emotions is the gift of God to human beings and helps in effective interpersonal communication. Voice generation involves close interaction between cerebral signals and the peripheral apparatus consisting of the larynx, vocal cords, and trachea. The human voice is susceptible to hormonal changes throughout life, from puberty until senescence. Thyroid, gonadal and growth hormones have tremendous impact on the structure and function of the vocal apparatus. The alteration of voice is observed even in physiological states such as puberty and menstruation. Astute clinical observers make out the changes in the voice and refer the patients for endocrine evaluation. In this review, we shall discuss the hormonal influence on the voice apparatus in normal states and in endocrine disorders. PMID:27730065
Distinguishing between forensic science and forensic pseudoscience: testing of validity and reliability, and approaches to forensic voice comparison.

PubMed

Morrison, Geoffrey Stewart

2014-05-01

In this paper it is argued that one should not attempt to directly assess whether a forensic analysis technique is scientifically acceptable. Rather one should first specify what one considers to be appropriate principles governing acceptable practice, then consider any particular approach in light of those principles. This paper focuses on one principle: the validity and reliability of an approach should be empirically tested under conditions reflecting those of the case under investigation using test data drawn from the relevant population. Versions of this principle have been key elements in several reports on forensic science, including forensic voice comparison, published over the last four-and-a-half decades. The aural-spectrographic approach to forensic voice comparison (also known as "voiceprint" or "voicegram" examination) and the currently widely practiced auditory-acoustic-phonetic approach are considered in light of this principle (these two approaches do not appear to be mutually exclusive). Approaches based on data, quantitative measurements, and statistical models are also considered in light of this principle. © 2013.
Measuring positive and negative affect in the voiced sounds of African elephants (Loxodonta africana).

PubMed

Soltis, Joseph; Blowers, Tracy E; Savage, Anne

2011-02-01

As in other mammals, there is evidence that the African elephant voice reflects affect intensity, but it is less clear if positive and negative affective states are differentially reflected in the voice. An acoustic comparison was made between African elephant "rumble" vocalizations produced in negative social contexts (dominance interactions), neutral social contexts (minimal social activity), and positive social contexts (affiliative interactions) by four adult females housed at Disney's Animal Kingdom®. Rumbles produced in the negative social context exhibited higher and more variable fundamental frequencies (F(0)) and amplitudes, longer durations, increased voice roughness, and higher first formant locations (F1), compared to the neutral social context. Rumbles produced in the positive social context exhibited similar shifts in most variables (F(0) variation, amplitude, amplitude variation, duration, and F1), but the magnitude of response was generally less than that observed in the negative context. Voice roughness and F(0) observed in the positive social context remained similar to that observed in the neutral context. These results are most consistent with the vocal expression of affect intensity, in which the negative social context elicited higher intensity levels than the positive context, but differential vocal expression of positive and negative affect cannot be ruled out.
Toward the Development of an Objective Index of Dysphonia Severity: A Four-Factor Acoustic Model

ERIC Educational Resources Information Center

Awan, Shaheen N.; Roy, Nelson

2006-01-01

During assessment and management of individuals with voice disorders, clinicians routinely attempt to describe or quantify the severity of a patient's dysphonia. This investigation used acoustic measures derived from sustained vowel samples to predict dysphonia severity (as determined by auditory-perceptual ratings), for a diverse set of voice…
Factors Predicting the Use of Passive Voice in Newspaper Headlines

ERIC Educational Resources Information Center

Micciulla, Linnea Margaret

2011-01-01

Information packaging researchers have found that certain factors influence active/passive voice alternations: Animacy, Definiteness and Weight influence argument order and thus choice of voice. Researchers in Critical Discourse Analysis (CDA) and psycholinguistics claim that voice is influenced by social factors, e.g. gender, social standing, or…
