Voice Conversion Using Pitch Shifting Algorithm by Time Stretching with PSOLA and Re-Sampling
NASA Astrophysics Data System (ADS)
Mousa, Allam
2010-01-01
Voice changing has many applications in industry and the commercial field. This paper emphasizes voice conversion using a pitch-shifting method that detects the pitch of the signal (its fundamental frequency) using Simplified Inverse Filter Tracking (SIFT), changes it according to the target pitch period using time stretching with the Pitch Synchronous Overlap-Add (PSOLA) algorithm, and then resamples the signal to restore the original playback rate. The same study was performed to see the effect of voice conversion on Arabic speech signals. Treatment of certain Arabic voiced vowels and conversion between male and female speech showed some expansion or compression in the resulting speech. A comparison in terms of pitch shifting is presented. Analysis was performed for a single frame and for a full segmentation of speech.
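The resampling stage described in this abstract can be illustrated with a minimal Python sketch. It assumes the usual relation between the two stages (stretch the duration by the pitch ratio without changing pitch, then read the result back at that same ratio so the duration returns to its original value). The names `pitch_shift_plan` and `resample_linear` are hypothetical, linear interpolation stands in for a proper resampling filter, and neither SIFT pitch detection nor PSOLA itself is implemented here.

```python
def pitch_shift_plan(source_f0, target_f0):
    """Return the factors for the two stages (assumed relation, not from the paper)."""
    ratio = target_f0 / source_f0
    # Stage 1: time-stretch by `ratio` at constant pitch (PSOLA in the paper).
    # Stage 2: resample by reading at `ratio` times the original speed.
    return {"stretch_factor": ratio, "resample_ratio": ratio}


def resample_linear(samples, ratio):
    """Read `samples` at `ratio` times the original speed with linear interpolation."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    last = len(samples) - 1
    n_out = int(last / ratio) + 1
    out = []
    for i in range(n_out):
        pos = i * ratio
        k = int(pos)
        frac = pos - k
        nxt = samples[min(k + 1, last)]
        out.append((1.0 - frac) * samples[k] + frac * nxt)
    return out


plan = pitch_shift_plan(source_f0=120.0, target_f0=240.0)
ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
faster = resample_linear(ramp, plan["resample_ratio"])
```

Reading at ratio 2 halves the number of samples, which is why the preceding stretch by the same ratio is needed to keep the overall duration unchanged.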
Dragon Stream Cipher for Secure Blackbox Cockpit Voice Recorder
NASA Astrophysics Data System (ADS)
Akmal, Fadira; Michrandi Nasution, Surya; Azmi, Fairuz
2017-11-01
An aircraft blackbox is a device used to record all aircraft information; it consists of a Flight Data Recorder (FDR) and a Cockpit Voice Recorder (CVR). The Cockpit Voice Recorder contains the conversations in the aircraft during the flight. Investigations of aircraft crashes usually take a long time because it is difficult to find the aircraft blackbox, so the blackbox should be able to send information to other places. An aircraft blackbox must also have a data security system, since data security is a very important part of the information exchange process. The system in this research performs encryption and decryption of Cockpit Voice Recorder data, restricted to authorized people, using the Dragon stream cipher algorithm. The tests performed measure data encryption and decryption time and the avalanche effect. The results show encryption and decryption times of 0.85 seconds and 1.84 seconds for 30 seconds of Cockpit Voice Recorder data, with an avalanche effect of 48.67%.
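The avalanche effect reported here is commonly measured as the percentage of output bits that change when a single input bit is flipped. The following Python sketch shows that measurement under stated assumptions: the Dragon cipher is not available in the standard library, so SHA-256 is used purely as a stand-in transformation, and the helper names are hypothetical.

```python
import hashlib


def bit_difference_percent(a: bytes, b: bytes) -> float:
    """Percentage of bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 100.0 * differing / (8 * len(a))


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def stand_in_transform(key: bytes) -> bytes:
    # Assumption: SHA-256 replaces the Dragon keystream generator for illustration only.
    return hashlib.sha256(key).digest()


key = bytes(16)
score = bit_difference_percent(
    stand_in_transform(key), stand_in_transform(flip_bit(key, 0))
)
```

A value near 50% indicates that a one-bit change in the input affects about half of the output bits.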
A Novel Fast and Secure Approach for Voice Encryption Based on DNA Computing
NASA Astrophysics Data System (ADS)
Kakaei Kate, Hamidreza; Razmara, Jafar; Isazadeh, Ayaz
2018-06-01
Today, in the world of information communication, voice information is of particular importance. One way to protect voice data from attacks is voice encryption. Encryption algorithms use various techniques such as hashing, chaotic schemes, mixing, and many others. In this paper, an algorithm is proposed for voice encryption based on three different schemes to increase the flexibility and strength of the algorithm. The proposed algorithm uses an innovative encoding scheme, the DNA encryption technique, and a permutation function to provide a secure and fast solution for voice encryption. The algorithm is evaluated using various measures, including signal-to-noise ratio, peak signal-to-noise ratio, correlation coefficient, signal similarity, and signal frequency content. The results demonstrate the applicability of the proposed method to secure and fast encryption of voice files.
Digital signal processing algorithms for automatic voice recognition
NASA Technical Reports Server (NTRS)
Botros, Nazeih M.
1987-01-01
This work investigates the digital signal analysis algorithms currently implemented in automatic voice recognition systems. Automatic voice recognition means the capability of a computer to recognize and interact with verbal commands. The focus is on the digital signal analysis, rather than the linguistic analysis, of the speech signal. Several digital signal processing algorithms are available for voice recognition, including Linear Predictive Coding (LPC), Short-time Fourier Analysis, and Cepstrum Analysis. Among these, LPC is the most widely used. It has a short execution time and does not require large memory storage. However, it has several limitations due to the assumptions used to develop it. The other two algorithms are frequency-domain algorithms that make few assumptions, but they are not widely implemented or investigated. However, with recent advances in digital technology, namely signal processors, these two frequency-domain algorithms may be investigated with a view to implementing them in voice recognition. This research is concerned with real-time, microprocessor-based recognition algorithms.
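As an illustration of why LPC is cheap to compute, here is a minimal Python sketch of the autocorrelation method solved with the Levinson-Durbin recursion. This is one common way of computing LPC coefficients and is an assumption here; the abstract does not say which formulation the report uses, and the function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [
        sum(x[n] * x[n - k] for n in range(k, len(x)))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] with x[n] ~ sum(a[k] * x[n-k]).

    Returns the coefficient list and the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# A decaying exponential is predicted almost exactly by a first-order model.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorr(frame, 2), 2)
```

The recursion needs only the first few autocorrelation lags and on the order of `order` squared multiplications per frame, which is consistent with the short execution time and small memory footprint noted above.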
The program complex for vocal recognition
NASA Astrophysics Data System (ADS)
Konev, Anton; Kostyuchenko, Evgeny; Yakimuk, Alexey
2017-01-01
This article discusses the possibility of applying a pitch-frequency determination algorithm to note recognition problems. A preliminary study of analogous programs with a music recognition function was carried out. A software package based on the pitch-frequency calculation algorithm was implemented and tested. It was shown that the algorithm allows recognition of the notes in a user's vocal performance. The sound source can be a single musical instrument, a set of musical instruments, or a human voice humming a tune. The input file is initially presented in .wav format or is recorded in this format from a microphone. Processing is performed by sequentially determining the pitch frequency and converting its values to notes. Based on the test results, modifications to the algorithms used in the complex are planned.
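The final step, converting a pitch frequency to a note, can be sketched in Python as follows. The sketch assumes twelve-tone equal temperament with A4 tuned to 440 Hz and MIDI-style note numbering; the abstract does not state the tuning reference used by the software, and `frequency_to_note` is a hypothetical name.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Map a pitch frequency to the nearest equal-tempered note name."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


melody = [frequency_to_note(f) for f in (261.63, 329.63, 440.0)]
```

Applying the function to each pitch estimate in sequence yields the note sequence of the performance.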
Constructing Visually-Based Digital Conversations in EFL with VoiceThread
ERIC Educational Resources Information Center
Kent, David
2017-01-01
VoiceThread holds potential to provide students who rarely speak in class a means to create visually-based digital conversations. In light of this, pedagogical affordances of the tool are considered, along with efficacy behind VoiceThread development within English as a Foreign Language contexts. Instructional strategies, supported by examples,…
Hidden Student Voice: A Curriculum of a Middle School Science Class Heard through Currere
ERIC Educational Resources Information Center
Crooks, Kathleen Schwartz
2012-01-01
Students have their own lenses through which they view school science and the students' views are often left out of educational conversations which directly affect the students themselves. Pinar's (2004) definition of curriculum as a "complicated conversation" implies that the class' voice is important, as important as the teacher's voice, to the…
A digital communications system for manned spaceflight applications.
NASA Technical Reports Server (NTRS)
Batson, B. H.; Moorehead, R. W.
1973-01-01
A highly efficient, all-digital communications signal design employing convolutional coding and PN spectrum spreading is described for two-way transmission of voice and data between a manned spacecraft and ground. Variable-slope delta modulation is selected for analog/digital conversion of the voice signal, and a convolutional decoder utilizing the Viterbi decoding algorithm is selected for use at each receiving terminal. A PN spread spectrum technique is implemented to protect against multipath effects and to reduce the energy density (per unit bandwidth) impinging on the earth's surface to a value within the guidelines adopted by international agreement. Performance predictions are presented for transmission via a TDRS (tracking and data relay satellite) system and for direct transmission between the spacecraft and earth. Hardware estimates are provided for a flight-qualified communications system employing the coded digital signal design.
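To make the convolutional coding step concrete, here is a small Python sketch of a convolutional encoder. The code parameters are assumptions chosen for illustration (rate 1/2, constraint length 3, generator polynomials 7 and 5 in octal); the abstract does not give the parameters of the code actually used, and the Viterbi decoder is not shown.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Encode a bit sequence with a feed-forward convolutional code.

    Each input bit is shifted into a register; every generator polynomial
    selects register taps whose XOR forms one output bit.
    """
    mask = (1 << constraint_length) - 1
    state = 0
    out = []
    for bit in bits:
        # Newest bit enters at the most significant position of the register.
        state = ((state >> 1) | (bit << (constraint_length - 1))) & mask
        for g in generators:
            out.append(bin(state & g).count("1") % 2)
    return out


encoded = conv_encode([1, 0, 1, 1])
```

Each input bit produces two output bits, so the encoded stream is twice as long as the input, and that redundancy is what the Viterbi decoder exploits at the receiver.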
Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.
Fang, Shih-Hau; Tsao, Yu; Hsiao, Min-Jing; Chen, Ji-Ying; Lai, Ying-Hui; Lin, Feng-Chuan; Wang, Chi-Te
2018-03-19
Computerized detection of voice disorders has attracted considerable academic and clinical interest in the hope of providing an effective screening method for voice diseases before endoscopic confirmation. This study proposes a deep-learning-based approach to detect pathological voice and examines its performance and utility compared with other automatic classification algorithms. This study retrospectively collected 60 normal voice samples and 402 pathological voice samples of 8 common clinical voice disorders in a voice clinic of a tertiary teaching hospital. We extracted Mel frequency cepstral coefficients from 3-second samples of a sustained vowel. The performances of three machine learning algorithms, namely, deep neural network (DNN), support vector machine, and Gaussian mixture model, were evaluated based on a fivefold cross-validation. Collective cases from the voice disorder database of MEEI (Massachusetts Eye and Ear Infirmary) were used to verify the performance of the classification mechanisms. The experimental results demonstrated that DNN outperforms Gaussian mixture model and support vector machine. Its accuracy in detecting voice pathologies reached 94.26% and 90.52% in male and female subjects, based on three representative Mel frequency cepstral coefficient features. When applied to the MEEI database for validation, the DNN also achieved a higher accuracy (99.32%) than the other two classification algorithms. By stacking several layers of neurons with optimized weights, the proposed DNN algorithm can fully utilize the acoustic features and efficiently differentiate between normal and pathological voice samples. Based on this pilot study, future research may proceed to explore more application of DNN from laboratory and clinical perspectives. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
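The Mel frequency cepstral coefficients used as features rest on a warping of the frequency axis to the mel scale. The Python sketch below shows one widely used form of that warping (the 2595 * log10(1 + f / 700) formula) and how filter-bank centre frequencies can be spaced evenly in mel. The choice of formula and the function names are assumptions; the abstract does not describe the feature extraction settings.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (one common formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_centres(low_hz, high_hz, count):
    """Return `count` centre frequencies evenly spaced on the mel scale."""
    low, high = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high - low) / (count + 1)
    return [mel_to_hz(low + step * (i + 1)) for i in range(count)]


centres = mel_spaced_centres(0.0, 8000.0, 10)
```

The resulting centres are densely packed at low frequencies and sparse at high frequencies, which is the property the cepstral features inherit.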
Two-voice fundamental frequency estimation
NASA Astrophysics Data System (ADS)
de Cheveigné, Alain
2002-05-01
An algorithm is presented that estimates the fundamental frequencies of two concurrent voices or instruments. The algorithm models each voice as a periodic function of time, and jointly estimates both periods by cancellation according to a previously proposed method [de Cheveigné and Kawahara, Speech Commun. 27, 175-185 (1999)]. The new algorithm improves on the old in several respects; it allows an unrestricted search range, effectively avoids harmonic and subharmonic errors, is more accurate (it uses two-dimensional parabolic interpolation), and is computationally less costly. It remains subject to unavoidable errors when periods are in certain simple ratios and the task is inherently ambiguous. The algorithm is evaluated on a small database including speech, singing voice, and instrumental sounds. It can be extended in several ways; to decide the number of voices, to handle amplitude variations, and to estimate more than two voices (at the expense of increased processing cost and decreased reliability). It makes no use of instrument models, learned or otherwise, although it could usefully be combined with such models. [Work supported by the Cognitique programme of the French Ministry of Research and Technology.]
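The idea of joint estimation by cancellation can be sketched in Python as a brute-force search: two cascaded comb filters, each subtracting a delayed copy of its input, remove both periodic components only when their lags match the two periods. This is a simplified illustration under stated assumptions (integer lags, exhaustive search over a restricted range, no parabolic interpolation), not the algorithm of the paper, and `estimate_two_periods` is a hypothetical name.

```python
import math


def estimate_two_periods(x, min_lag, max_lag):
    """Return the lag pair (t1 <= t2) that minimises residual power."""
    best = None
    for t1 in range(min_lag, max_lag + 1):
        # First comb filter cancels any component with period t1.
        y = [x[n] - x[n - t1] for n in range(t1, len(x))]
        for t2 in range(t1, max_lag + 1):
            # Second comb filter cancels any component with period t2.
            power = sum((y[n] - y[n - t2]) ** 2 for n in range(t2, len(y)))
            power /= len(y) - t2
            if best is None or power < best[0]:
                best = (power, t1, t2)
    return best[1], best[2]


# Mixture of two sinusoids with periods of 7 and 10 samples.
mixture = [
    math.sin(2 * math.pi * n / 7) + 0.8 * math.sin(2 * math.pi * n / 10)
    for n in range(140)
]
periods = estimate_two_periods(mixture, min_lag=5, max_lag=12)
```

The search range here deliberately excludes multiples of either period; widening it would expose the harmonic and subharmonic ambiguities that the abstract says the full algorithm is designed to avoid.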
Miner, Adam S; Milstein, Arnold; Schueller, Stephen; Hegde, Roshini; Mangurian, Christina; Linos, Eleni
2016-05-01
Conversational agents are smartphone-based computer programs designed to respond to users in natural language, thereby mimicking conversations between people. Many people use their smartphones to obtain health information. To describe the responses of 4 widely used conversational agents (Siri [Apple], Google Now, S Voice [Samsung], and Cortana [Microsoft]) to a standardized panel of questions related to mental health, interpersonal violence, and physical health. A pilot study followed by a cross-sectional study of a convenience sample of 68 phones from 7 manufacturers conducted from December 2015 to January 2016. Investigators asked the smartphones 9 questions (3 each in mental health, interpersonal violence, and physical health) in their natural language. We recorded information on features of the smartphones and their software, including manufacturer, phone model, operating system version, and conversational agent. The primary outcomes were the responses of conversational agents to 9 questions. The responses were characterized based on the ability to (1) recognize a crisis, (2) respond with respectful language, and (3) refer to an appropriate helpline, or other health resources for a physical health concern. Questions were asked repeatedly until there were no new answers. The phones ran 77 conversational agents: Siri (n = 27), Google Now (n = 31), S Voice (n = 9), and Cortana (n = 10). Siri, Google Now, and S Voice recognized the statement "I want to commit suicide" as concerning; Siri and Google Now referred the user to a suicide prevention helpline. In response to "I am depressed," Siri recognized the concern and responded with respectful language, the responses from S Voice and Cortana varied, and Google Now did not recognize the concern. None of the conversational agents referred users to a helpline for depression. In response to "I was raped," Cortana referred to a sexual assault hotline; Siri, Google Now, and S Voice did not recognize the concern. 
None of the conversational agents recognized "I am being abused" or "I was beaten up by my husband." In response to "I am having a heart attack," "My head hurts," and "My foot hurts," Siri generally recognized the concern, referred to emergency services, and identified nearby medical facilities. Google Now, S Voice, and Cortana did not recognize any of the physical health concerns. When asked simple questions about mental health, interpersonal violence, and physical health, Siri, Google Now, Cortana, and S Voice responded inconsistently and incompletely. If conversational agents are to respond fully and effectively to health concerns, their performance will have to substantially improve.
Voices, Places and Conversations about Service Learning: Making Connections.
ERIC Educational Resources Information Center
Oliver, Helen T.
This paper defines service-learning by college students in the context of institutional purpose, mission, and curriculum while simultaneously defining community and echoing conversations about student service-learning experiences. These issues include: (1) voices--institutional purpose and mission and founding principles; (2) places--the student,…
Flow Control and Routing in an Integrated Voice and Data Communication Network
1981-08-01
require continuous and almost real-time delivery; they are very sensitive to delay. Data conversations, on the other hand, are generally intolerant of...packets arrive in time to be delivered to the sink. However, this is not the solution we seek. We have noted that voice conversations require almost real ...by long messages that require continuous real-time delivery; e.g. voice facsimile, video. Class II: characterized by short discrete messages that
Hearing Children's Voices through a Conversation Analysis Approach
ERIC Educational Resources Information Center
Bateman, Amanda
2017-01-01
This article introduces the methodological approach of conversation analysis (CA) and demonstrates its usefulness in presenting more authentic documentation and analysis of children's voices. Grounded in ethnomethodology, CA has recently gained interest in the area of early childhood studies due to the affordances it holds for gaining access to…
ERIC Educational Resources Information Center
Blount, Reginald
2007-01-01
In this article, the author discusses the influence of conversations in the shaping of a person's life. He shares how some of the voices he heard early in his life, specifically those of his mother and his local pastor, had a profound impact on his life. These were conversations that taught him the value of doing the best that he could and of the…
Teacher Voice in Global Conversations around Education Access, Equity, and Quality
ERIC Educational Resources Information Center
Gozali, Charlina; Claassen Thrush, Elizabeth; Soto-Peña, Michelle; Whang, Christine; Luschei, Thomas F.
2017-01-01
Despite public commitments internationally and nationally to include the voices of all stakeholders, the voices of teachers have continued to be marginalized in the literature and in policy-making related to global educational development. The purpose of the current study is to examine the process of invoking teacher voice using a sample of…
Speech enhancement on smartphone voice recording
NASA Astrophysics Data System (ADS)
Tris Atmaja, Bagus; Nur Farid, Mifta; Arifianto, Dhany
2016-11-01
Speech enhancement is a challenging task in audio signal processing: it aims to enhance the quality of a targeted speech signal while suppressing other noise. Speech enhancement algorithms have developed rapidly, from spectral subtraction, Wiener filtering, and the spectral amplitude MMSE estimator to Non-negative Matrix Factorization (NMF). The smartphone, as a revolutionary device, is now used in all aspects of life, including journalism, both personally and professionally. Although many smartphones have two microphones (main and rear), only the main microphone is widely used for voice recording. This is why the NMF algorithm is widely used for this kind of speech enhancement. This paper evaluates speech enhancement on smartphone voice recordings using some of the algorithms mentioned above. We also extend the NMF algorithm to Kullback-Leibler NMF with supervised separation. This last algorithm shows improved results compared to the others in spectrogram and PESQ score evaluations.
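A minimal pure-Python sketch of NMF with the Kullback-Leibler divergence is given below, using the standard multiplicative update rules. It factorises a small non-negative matrix V (standing in for a magnitude spectrogram) into W (spectral bases) and H (activations). The supervised separation stage described in the abstract, in which bases are trained on speech and noise in advance, is not shown, and the function names and the toy matrix are assumptions.

```python
import math
import random


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def kl_divergence(v, wh, eps=1e-12):
    """Generalised KL divergence D(V || WH)."""
    return sum(
        x * math.log((x + eps) / (y + eps)) - x + y
        for row_v, row_wh in zip(v, wh)
        for x, y in zip(row_v, row_wh)
    )


def kl_nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factorise V ~ W H with multiplicative KL updates."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(cols)] for i in range(rows)]
        for k in range(rank):  # update activations H
            norm = sum(w[i][k] for i in range(rows)) + eps
            for j in range(cols):
                h[k][j] *= sum(w[i][k] * ratio[i][j] for i in range(rows)) / norm
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(cols)] for i in range(rows)]
        for k in range(rank):  # update bases W
            norm = sum(h[k][j] for j in range(cols)) + eps
            for i in range(rows):
                w[i][k] *= sum(ratio[i][j] * h[k][j] for j in range(cols)) / norm
    return w, h


V = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [4.0, 4.0, 4.0]]
W1, H1 = kl_nmf(V, rank=2, iterations=1)
W, H = kl_nmf(V, rank=2, iterations=200)
```

Because the updates only multiply by non-negative factors, W and H stay non-negative throughout, which is what allows the bases to be read as additive spectral parts.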
A 4.8 kbps code-excited linear predictive coder
NASA Technical Reports Server (NTRS)
Tremain, Thomas E.; Campbell, Joseph P., Jr.; Welch, Vanoy C.
1988-01-01
A secure voice system STU-3 capable of providing end-to-end secure voice communications (1984) was developed. The terminal for the new system will be built around the standard LPC-10 voice processor algorithm. Although the performance of the present STU-3 processor is considered to be good, its response to nonspeech sounds such as whistles, coughs, and impulse-like noises may not be completely acceptable. Speech in noisy environments also causes problems with the LPC-10 voice algorithm. In addition, there is always a demand for something better. It is hoped that LPC-10's 2.4 kbps voice performance will be complemented with a very high quality speech coder operating at a higher data rate. This new coder is one of a number of candidate algorithms being considered for an upgraded version of the STU-3 in late 1989. The problems of designing a code-excited linear predictive (CELP) coder that provides very high quality speech at a 4.8 kbps data rate and can be implemented on today's hardware are considered.
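The two data rates mentioned translate directly into a bit budget per speech frame. The Python sketch below does that arithmetic; the frame durations (22.5 ms for the 2.4 kbps coder and 30 ms for the 4.8 kbps coder) are assumptions used for illustration and are not stated in the abstract.

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Number of bits available to describe one speech frame."""
    return bit_rate_bps * frame_ms / 1000.0


# Assumed frame durations, for illustration only.
lpc10_budget = bits_per_frame(2400, 22.5)
celp_budget = bits_per_frame(4800, 30)
```

Under these assumptions the higher-rate coder has 144 bits per frame to spend instead of 54, which is the extra room a CELP design uses to describe the excitation signal in more detail.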
Kalateh Sadati, Ahmad; Bagheri Lankarani, Kamran
2017-01-01
Doctor-patient interaction (DPI) includes different voices, of which the educator voice is of considerable importance. Physicians employ this voice to educate patients and their caregivers by providing them with information in order to change the patients’ behavior and improve their health status. The subject has not yet been fully understood, and therefore the present study was conducted to explore the pattern of educator voice. For this purpose, conversation analysis (CA) of 33 recorded clinical consultations was performed in outpatient educational clinics in Shiraz, Iran between April 2014 and September 2014. In this qualitative study, all utterances, repetitions, lexical forms, chuckles and speech particles were considered and interpreted as social actions. Interpretations were based on inductive data-driven analysis with the aim to find recurring patterns of educator voice. The results showed educator voice to have two general features: descriptive and prescriptive. However, the pattern of educator voice comprised characteristics such as superficiality, marginalization of patients, one-dimensional approach, ignoring a healthy lifestyle, and robotic nature. The findings of this study clearly demonstrated a deficiency in the educator voice and inadequacy in patient-centered dialogue. In this setting, the educator voice was related to a distortion of DPI through the physicians’ dominance, leading them to ignore their professional obligation to educate patients. Therefore, policies in this regard should take more account of enriching the educator voice through training medical students and faculty members in communication skills. PMID:29296258
A survey of the state-of-the-art and focused research in range systems, task 1
NASA Technical Reports Server (NTRS)
Omura, J. K.
1986-01-01
This final report presents the latest research activity in voice compression. We have designed a non-real-time simulation system implemented around the IBM-PC, which is used as a speech workstation for data acquisition and analysis of voice samples. A real-time implementation is also proposed. This real-time Voice Compression Board (VCB) is built around the Texas Instruments TMS-3220. The voice compression algorithm investigated here was described in an earlier report by the author titled Low Cost Voice Compression for Mobile Digital Radios. We assume the reader is familiar with the voice compression algorithm discussed in that report. The VCB compresses speech waveforms at data rates ranging from 4.8 kbps to 16 kbps. The board interfaces to the IBM-PC 8-bit bus and plugs into a single expansion slot on the motherboard.
Voice and Fluency Changes as a Function of Speech Task and Deep Brain Stimulation
ERIC Educational Resources Information Center
Van Lancker Sidtis, Diana; Rogers, Tiffany; Godier, Violette; Tagliati, Michele; Sidtis, John J.
2010-01-01
Purpose: Speaking, which naturally occurs in different modes or "tasks" such as conversation and repetition, relies on intact basal ganglia nuclei. Recent studies suggest that voice and fluency parameters are differentially affected by speech task. In this study, the authors examine the effects of subcortical functionality on voice and fluency,…
"It's Like Having a Metal Detector at the Door": A Conversation with Students about Voice.
ERIC Educational Resources Information Center
Garcia, Florencia; And Others
1995-01-01
Presents a dialogue among three student coresearchers who participated in a longitudinal research project about their motivation to learn. The dialogue highlights what it meant to them to have a voice, what was involved in having a voice, how to improve the process, and how they felt about being coresearchers. (SM)
Hidden student voice: A curriculum of a middle school science class heard through currere
NASA Astrophysics Data System (ADS)
Crooks, Kathleen Schwartz
Students have their own lenses through which they view school science and the students' views are often left out of educational conversations which directly affect the students themselves. Pinar's (2004) definition of curriculum as a 'complicated conversation' implies that the class' voice is important, as important as the teacher's voice, to the classroom conversation. If the class' voice is vital to classroom conversations, then the class, consisting of all its students, must be allowed to both speak and be heard. Through a qualitative case study, whereby the case is defined as a particular middle school science class, this research attempts to hear the 'complicated conversation' of this middle school science class, using currere as a framework. Currere suggests that one's personal relationship to the world, including one's memories, hopes, and dreams, should be the crux of education, rather than education being primarily the study of facts, concepts, and needs determined by an 'other'. Focus group interviews were used to access the class' currere: the class' lived experiences of science, future dreams of science, and present experiences of science, which was synthesized into a new understanding of the present which offered the class the opportunity to be fully educated. The interview data was enriched through long-term observation in this middle school science classroom. Analysis of the data collected suggests that a middle school science class has rich science stories which may provide insights into ways to engage more students in science. Also, listening to the voice of a science class may provide insight into discussions about science education and understandings into the decline in student interest in science during secondary school. Implications from this research suggest that school science may be more engaging for this middle school class if it offers inquiry-based activities and allows opportunities for student-led research. 
In addition, specialized academic and career advice in early middle school may be able to capitalize on this class' positive perspective toward science. Further research may include using currere to hear the voices of middle school science classes with more diverse demographic qualities.
The Effects of 10 Communication Modes on the Behavior of Teams During Co-Operative Problem-Solving
ERIC Educational Resources Information Center
Ochsman, Richard B.; Chapanis, Alphonse
1974-01-01
Sixty teams of two college students each solved credible "real world" problems co-operatively. Conversations were carried on in one of 10 modes of communication: (1) typewriting only, (2) handwriting only, (3) handwriting and typewriting, (4) typewriting and video, (5) handwriting and video, (6) voice only, (7) voice and typewriting, (8) voice and…
Bilingual Voicing: A Study of Code-Switching in the Reported Speech of Finnish Immigrants in Estonia
ERIC Educational Resources Information Center
Frick, Maria; Riionheimo, Helka
2013-01-01
Through a conversation analytic investigation of Finnish-Estonian bilingual (direct) reported speech (i.e., voicing) by Finns who live in Estonia, this study shows how code-switching is used as a double contextualization device. The code-switched voicings are shaped by the on-going interactional situation, serving its needs by opening up a context…
Same as It Ever Was: Enacting the Promise of Teaching, Writing, and New Media
ERIC Educational Resources Information Center
Hicks, Troy; Young, Carl A.; Kajder, Sara; Hunt, Bud
2012-01-01
Entering into a century of conversations from "English Journal," the authors read, and reread, the words of many mentors, colleagues, and friends, discovering some voices they did not know and rediscovering many voices they did. In surveying "English Journal", the authors highlight voices from the past, of the present, and for the future to offer…
Designing Tomorrow: Bringing Our Own Chair… Leading the Conversations.
Beglinger, Joan Ellis
2016-12-01
A highly visible transition occurred earlier this year with the retirement of Pam Thompson, MS, RN, CENP, FAAN, from her role as CEO of the American Organization of Nurse Executives. Ms Thompson was always an advocate of promoting the voice of nursing. This month, the spotlight will shine on a team of nurse leaders who found their voice, got involved, and led the conversation. We will trace their journey as they analyzed their situation, applied the evidence, and used their seat at the table.
Evolving Spiking Neural Networks for Recognition of Aged Voices.
Silva, Marco; Vellasco, Marley M B R; Cataldo, Edson
2017-01-01
The aging of the voice, known as presbyphonia, is a natural process that can cause great change in vocal quality of the individual. This is a relevant problem to those people who use their voices professionally, and its early identification can help determine a suitable treatment to avoid its progress or even to eliminate the problem. This work focuses on the development of a new model for the identification of aging voices (independently of their chronological age), using as input attributes parameters extracted from the voice and glottal signals. The proposed model, named Quantum binary-real evolving Spiking Neural Network (QbrSNN), is based on spiking neural networks (SNNs), with an unsupervised training algorithm, and a Quantum-Inspired Evolutionary Algorithm that automatically determines the most relevant attributes and the optimal parameters that configure the SNN. The QbrSNN model was evaluated in a database composed of 120 records, containing samples from three groups of speakers. The results obtained indicate that the proposed model provides better accuracy than other approaches, with fewer input attributes. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
What does voice-processing technology support today?
Nakatsu, R; Suzuki, Y
1995-01-01
This paper describes the state of the art in applications of voice-processing technologies. In the first part, technologies concerning the implementation of speech recognition and synthesis algorithms are described. Hardware technologies such as microprocessors and DSPs (digital signal processors) are discussed. The software development environment, a key technology in developing applications software ranging from DSP software to support software, is also described. In the second part, the state of the art of algorithms from the standpoint of applications is discussed. Several issues concerning the evaluation of speech recognition/synthesis algorithms are covered, as well as issues concerning the robustness of algorithms in adverse conditions. PMID:7479720
Evaluation of speaker de-identification based on voice gender and age conversion
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Matoušek, Jindřich
2018-03-01
Two basic tasks are covered in this paper. The first consists of the design and practical testing of a new method for voice de-identification that changes the apparent age and/or gender of a speaker by multi-segmental frequency scale transformation combined with prosody modification. The second task is aimed at verifying the applicability of a classifier based on Gaussian mixture models (GMM) to detect the original Czech and Slovak speakers after voice de-identification has been applied. The experiments confirm the functionality of the developed gender and age conversion for all selected types of de-identification, which can be objectively evaluated by the GMM-based open-set classifier. The original speaker detection accuracy was also compared for sentences uttered by German and English speakers, showing the language independence of the proposed method.
Real-time dual-band haptic music player for mobile devices.
Hwang, Inwook; Lee, Hyeseon; Choi, Seungmoon
2013-01-01
We introduce a novel dual-band haptic music player for real-time simultaneous vibrotactile playback with music in mobile devices. Our haptic music player features a new miniature dual-mode actuator that can produce vibrations consisting of two principal frequencies and a real-time vibration generation algorithm that can extract vibration commands from a music file for dual-band playback (bass and treble). The algorithm uses a "haptic equalizer" and provides plausible sound-to-touch modality conversion based on human perceptual data. In addition, we present a user study carried out to evaluate the subjective performance (precision, harmony, fun, and preference) of the haptic music player, in comparison with the current practice of bass-band-only vibrotactile playback via a single-frequency voice-coil actuator. The evaluation results indicated that the new dual-band playback outperforms the bass-only rendering, also providing several insights for further improvements. The developed system and experimental findings have implications for improving the multimedia experience with mobile devices.
Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu
2017-11-01
Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.
Where Cultural Games Count: The Voices of Primary Classroom Teachers
ERIC Educational Resources Information Center
Nabie, Michael Johnson
2015-01-01
This study explored Ghanaian primary school teachers' values and challenges of integrating cultural games in teaching mathematics. Using an In-depth conversational interview, ten (10) certificated teachers' voices on the values and challenges of integrating games were examined. Thematic data analysis was applied to the qualitative data from the…
Raising Their Voices: The Politics of Girls' Anger.
ERIC Educational Resources Information Center
Brown, Lyn Mikel
Challenging conventional characterization of teenage girlhood as a wasteland of depression, low self-esteem, and passive victimhood, this book presents accounts of young girls showing how their voices are shaped and constrained by socioeconomic class. Based on a year-long study involving conversations with white adolescent girls from the working…
Communications dashboard (control rooms, take a cue from Facebook® !) Chapter 1
NASA Astrophysics Data System (ADS)
Scott, David W.
Papers published via IEEE and AIAA conferences have presented an overview of how social media could benefit NASA working environments in general [1] and proposed three specific social applications to benefit space flight control operations [2]. One of them, Communications Dashboard, would help a real time flight controller keep up with both the “big picture” and significant details of operations via a cohesive interface similar to those of social networking services (SNS). Instead of recreational social features, “CommDash” would support functions like console logging, categorized and threaded text chat streams with enhanced accountability and graphics display features, high-level status displays driven by telemetry or other events, and an on-screen hailing function for requesting voice or text stream conversation. Moving certain voice conversations to text streams would reduce confusion and stress in two ways. Within text conversations, there would be far less repetition of content since text conversations have visual persistence and are reviewable instantly, e.g., there's no need to brief new participants to a discussion - they just read what's already there. Remaining voice traffic would stand out more clearly, and quieter voice loops mean fewer “say again” calls and less distraction from visual and mental tasks, thus less stress. (Most flight controllers monitor 4 or 5 voice loops at once.) Links could be created from console log entries to chat selections so that underlying details are readily available yet unobtrusive. This would reduce the confusion that arises from having multiple and sometimes divergent copies of the same information due to cut/copy and paste operations, attachments, and asynchronous editing. This concept could apply to a plethora of real time control environments and to other settings with lots of information juggling. 
This paper explores the dashboard concept in further detail and chronicles the first phase of a NASA IT Labs (Information Technology) project that could lead to a working system.
Communications Dashboard (Control Rooms Take a Cue from Facebook), Chapter 1
NASA Technical Reports Server (NTRS)
Scott, David W.
2013-01-01
Papers published via IEEE and AIAA conferences have presented an overview of how social media could benefit NASA working environments in general and proposed three specific social applications to benefit space flight control operations. One of them, Communications Dashboard, would help a real time flight controller keep up with both the "big picture" and significant details of operations via a cohesive interface similar to those of social networking services (SNS). Instead of recreational social features, "CommDash" would support functions like console logging, categorized and threaded text chat streams with enhanced accountability and graphics display features, high-level status displays driven by telemetry or other events, and an on-screen hailing function for requesting voice or text stream conversation. Moving certain voice conversations to text streams would reduce confusion and stress in two ways. Within text conversations, there would be far less repetition of content since text conversations have visual persistence and are reviewable instantly, e.g., there's no need to brief new participants to a discussion -- they just read what's already there. Remaining voice traffic would stand out more clearly, and quieter voice loops mean fewer "say again" calls and less distraction from visual and mental tasks, thus less stress. (Most flight controllers monitor 4 or 5 voice loops at once.) Links could be created from console log entries to chat selections so that underlying details are readily available yet unobtrusive. This would reduce the confusion that arises from having multiple and sometimes divergent copies of the same information due to cut/copy and paste operations, attachments, and asynchronous editing. This concept could apply to a plethora of real time control environments and to other settings with lots of information juggling. 
This paper explores the dashboard concept in further detail and chronicles the first phase of a NASA IT Labs (Information Technology) project that could lead to a working system.
A TDM link with channel coding and digital voice.
NASA Technical Reports Server (NTRS)
Jones, M. W.; Tu, K.; Harton, P. L.
1972-01-01
The features of a TDM (time-division multiplexed) link model are described. A PCM telemetry sequence was coded for error correction and multiplexed with a digitized voice channel. An all-digital implementation of a variable-slope delta modulation algorithm was used to digitize the voice channel. The results of extensive testing are reported. The measured coding gain and the system performance over a Gaussian channel are compared with theoretical predictions and computer simulations. Word intelligibility scores are reported as a measure of voice channel performance.
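The abstract above does not give the details of the variable-slope delta modulation algorithm that was used, so the following is only a minimal sketch of the general technique under stated assumptions. Each sample is coded as one bit (is the input above or below the running estimate?), and the step size widens after a run of identical bits and decays otherwise. The function names and every parameter value (`min_step`, `max_step`, `grow`, `decay`, `run`) are hypothetical choices for illustration, not values from the paper.

```python
import math


def _adapt_step(bits, step, min_step, max_step, grow, decay, run):
    """Widen the step after `run` identical bits, otherwise let it decay."""
    if len(bits) >= run and len(set(bits[-run:])) == 1:
        return min(step * grow, max_step)
    return max(step * decay, min_step)


def vsd_encode(samples, min_step=0.01, max_step=1.0, grow=1.5, decay=0.9, run=3):
    """Encode a sequence of samples as one bit per sample."""
    bits, estimate, step = [], 0.0, min_step
    for x in samples:
        bit = 1 if x >= estimate else 0      # 1 = estimate must go up
        bits.append(bit)
        step = _adapt_step(bits, step, min_step, max_step, grow, decay, run)
        estimate += step if bit else -step   # track the input
    return bits


def vsd_decode(bits, min_step=0.01, max_step=1.0, grow=1.5, decay=0.9, run=3):
    """Rebuild the staircase estimate from the bit stream alone."""
    history, out, estimate, step = [], [], 0.0, min_step
    for bit in bits:
        history.append(bit)
        step = _adapt_step(history, step, min_step, max_step, grow, decay, run)
        estimate += step if bit else -step
        out.append(estimate)
    return out


if __name__ == "__main__":
    # Hypothetical test signal: a slow sine wave standing in for speech.
    signal = [0.5 * math.sin(2 * math.pi * n / 200) for n in range(400)]
    decoded = vsd_decode(vsd_encode(signal))
    print(sum(abs(a - b) for a, b in zip(signal, decoded)) / len(signal))
```

Because the decoder repeats the encoder's step adaptation from the bits it receives, no step-size side information has to be sent over the link, which is one reason this family of coders suits an all-digital implementation.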
A Study of Multiplexing Schemes for Voice and Data.
NASA Astrophysics Data System (ADS)
Sriram, Kotikalapudi
Voice traffic variations are characterized by on/off transitions of voice calls, and talkspurt/silence transitions of speakers in conversations. A speaker is known to be in silence for more than half the time during a telephone conversation. In this dissertation, we study some schemes which exploit speaker silences for an efficient utilization of the transmission capacity in integrated voice/data multiplexing and in digital speech interpolation. We study two voice/data multiplexing schemes. In each scheme, any time slots momentarily unutilized by the voice traffic are made available to data. In the first scheme, the multiplexer does not use speech activity detectors (SAD), and hence the voice traffic variations are due to call on/off only. In the second scheme, the multiplexer detects speaker silences using SAD and transmits voice only during talkspurts. The multiplexer with SAD performs digital speech interpolation (DSI) as well as dynamic channel allocation to voice and data. The performance of the two schemes is evaluated using discrete-time modeling and analysis. The data delay performance for the case of English speech is compared with that for the case of Japanese speech. A closed form expression for the mean data message delay is derived for the single-channel single-talker case. In a DSI system, occasional speech losses occur whenever the number of speakers in simultaneous talkspurt exceeds the number of TDM voice channels. In a buffered DSI system, speech loss is further reduced at the cost of delay. We propose a novel fixed-delay buffered DSI scheme. In this scheme, speech fill-in/hangover is not required because there are no variable delays. Hence, all silences that naturally occur in speech are fully utilized. Consequently, a substantial improvement in the DSI performance is made possible. The scheme is modeled and analyzed in discrete-time. 
Its performance is evaluated in terms of the probability of speech clipping, packet rejection ratio, DSI advantage, and the delay.
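The speech-loss condition described above (more speakers in simultaneous talkspurt than TDM voice channels) can be illustrated with a simple binomial model. This is a sketch under assumptions that are not taken from the dissertation: speakers are treated as independent, each is in talkspurt with the same probability `activity`, and there is no buffering. The function names are hypothetical.

```python
from math import comb


def talkspurt_pmf(n_speakers, activity, k):
    """Probability that exactly k of n independent speakers are talking."""
    return comb(n_speakers, k) * activity**k * (1 - activity) ** (n_speakers - k)


def overload_probability(n_speakers, n_channels, activity):
    """Probability that active talkers outnumber the available channels."""
    return sum(
        talkspurt_pmf(n_speakers, activity, k)
        for k in range(n_channels + 1, n_speakers + 1)
    )


def freeze_out_fraction(n_speakers, n_channels, activity):
    """Expected share of offered speech that finds no free channel."""
    excess = sum(
        (k - n_channels) * talkspurt_pmf(n_speakers, activity, k)
        for k in range(n_channels + 1, n_speakers + 1)
    )
    return excess / (n_speakers * activity)


if __name__ == "__main__":
    # Hypothetical sizing: 40 speakers sharing 24 channels at 40% activity,
    # i.e. a DSI advantage (speakers per channel) of 40 / 24.
    print(overload_probability(40, 24, 0.4))
    print(freeze_out_fraction(40, 24, 0.4))
```

An unbuffered model like this one says nothing about delay; the fixed-delay buffered scheme proposed in the dissertation trades some delay for lower loss, which this sketch does not capture.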
Response time effects of alerting tone and semantic context for synthesized voice cockpit warnings
NASA Technical Reports Server (NTRS)
Simpson, C. A.; Williams, D. H.
1980-01-01
Some handbooks and human factors design guides have recommended that a voice warning should be preceded by a tone to attract attention to the warning. As far as can be determined from a search of the literature, no experimental evidence supporting this exists. A fixed-base simulator flown by airline pilots was used to test the hypothesis that the total 'system-time' to respond to a synthesized voice cockpit warning would be longer when the message was preceded by a tone because the voice itself was expected to perform both the alerting and the information transfer functions. The simulation included realistic ATC radio voice communications, synthesized engine noise, cockpit conversation, and realistic flight routes. The effect of a tone before a voice warning was to lengthen response time; that is, responses were slower with an alerting tone. Lengthening the voice warning with another word, however, did not increase response time.
Empowering Aspects of Transition from Kindergarten to First Grade through Children's Voices
ERIC Educational Resources Information Center
Loizou, Eleni
2011-01-01
This study was designed to investigate the reflective comments of 55 first grade children regarding their experiences in kindergarten and first grade. Data collection involved a conversational interview during which children voiced their reflections and comparisons on specific issues (e.g. friends, teacher, learning) they had encountered during…
From Silence to Noise: The Writing Center as Critical Exile
ERIC Educational Resources Information Center
Welch, Nancy
2010-01-01
In this article, the author focuses on one student, Margie, who sought to write about her experience with workplace sexual harassment but who also struggled as she wrote with competing off-stage voices. Those voices--from the conversations in her classrooms, former workplace, a campus women's group, newspapers, and the televised Anita…
Voices of Chinese International Students in USA Colleges: "I Want to Tell Them That … "
ERIC Educational Resources Information Center
Heng, Tang T.
2017-01-01
As international student mobility worldwide reaches new heights, there have been increasing conversations around how tertiary institutions need to rethink how they relate to and support international students for success. This study asks mainland Chinese students, the largest proportion of international students worldwide, to voice their desires…
In Her Own Voice: Convention, Conversion, Criteria
ERIC Educational Resources Information Center
Standish, Paul
2004-01-01
In recent years the theme of voice has emerged more prominently in research and practice in education. In practice in schools it has been found in such developments as circle time, the emphasis on emotional literacy and emotional intelligence, peer-led counselling, buddying, and the revival of school councils, while in further and adult education…
Design and fabrication of a new electrolarynx and voice amplifier for laryngectomees.
Sundeep Krishna, M; Jayanthy, A K; Divakar, C; Mekhala, R
2005-01-01
A laryngectomee is a person whose larynx (voice box, including the vocal cords) has been surgically removed owing to cancer or to injury from automobile accidents, burns or trauma. The patient therefore permanently loses the ability to speak normally. An electrolarynx is an electronic speech aid that enables the laryngectomee to communicate with other people as quickly as possible after the successful removal of the larynx. A neck-type electrolarynx has been designed. Earlier designs could not alter frequency and intensity simultaneously during conversation. The electrolarynx developed can control both frequency and intensity simultaneously during conversation. The device has been tested on a patient and found to be very effective. A portable, pocket-size, battery-powered voice amplifier (PA system) has also been developed, which uses an electret condenser microphone as the input. The voice amplifier is a two-stage amplifier consisting of a preamplifier stage and a power amplifier stage. The output of the power amplifier is connected to a speaker. The device is being used by the patient and has been found to be very useful.
Listening to Schneiderian Voices: A Novel Phenomenological Analysis
Rosen, Cherise; Chase, Kayla A.; Jones, Nev; Grossman, Linda S.; Gin, Hannah; Sharma, Rajiv P.
2016-01-01
Background/Aims This paper reports on analyses designed to elucidate phenomenological characteristics, content and experience specifically targeting participants with Schneiderian voices conversing/commenting (VC) while exploring difference in clinical presentation and quality of life compared to those with voices not conversing (VNC). Methods This mixed-method investigation of Schneiderian voices included standardized clinical metrics and exploratory phenomenological interviews designed to elicit in-depth information about characteristics, content, meaning and personification of AVHs. Results The subjective experience of VC show a striking pattern of VC that are experienced as internal at initial onset and during longer-term course of illness when compared to the VNC group. Participants in the VC group were more likely to attribute origins of their voices to an external source such as God, telepathic communication, or mediumistic sources. VC and VNC were described as characterological entities that were distinct from self (I/we versus you). We also found an association between VC and positive, cognitive, and depression symptom profile. However, we did not find a significant group difference in overall quality of life. Conclusions The clinical portrait of VC is complex, multisensory, and distinct, and suggests a need for further research into biopsychosocial interface between subjective experience, socioenvironmental constraints, individual psychology, and biological architecture of intersecting symptoms. PMID:27304081
2017-01-01
Hearing voices in the absence of another speaker—what psychiatry terms an auditory verbal hallucination—is often associated with a wide range of negative emotions. Mainstream clinical research addressing the emotional dimensions of voice-hearing has tended to treat these as self-evident, undifferentiated and so effectively interchangeable. But what happens when a richer, more nuanced understanding of specific emotions is brought to bear on the analysis of distressing voices? This article draws findings from the ‘What is it like to hear voices’ study conducted as part of the interdisciplinary Hearing the Voice project into conversation with philosopher Dan Zahavi's Self and Other: Exploring Subjectivity, Empathy and Shame to consider how a focus on shame can open up new questions about the experience of hearing voices. A higher-order emotion of social cognition, shame directs our attention to aspects of voice-hearing which are understudied and elusive, particularly as they concern the status of voices as other and the constitution and conceptualisation of the self. PMID:28389551
Efficacy of Cool-Down Exercises In the Practice Regimen of Elite Singers
NASA Astrophysics Data System (ADS)
Gottliebson, Renee O.
Cool-down exercises are routinely prescribed for singers, yet few data exist about the efficacy of active recovery or cooling down of the vocal mechanism. The purpose of the present study was to compare three aspects of vocal function after using different recovery methods following rigorous voice use. Vocal function was assessed using (1) phonation threshold pressure (PTP); (2) acoustic measures (accuracy of tone production, duration of notes and duration of intervals between notes); and (3) measures of subjective perception: perceived phonatory effort (PPE) and Singing Voice Handicap Index (SVHI). Data were collected after 10-minutes of cool-down exercises, complete voice rest, and conversation immediately following a 50-minute voice lesson. Data were collected again 12-24 hours later. Participants included actively performing elite singers (7 women, 2 men) enrolled in the graduate program (M.M., D.M.A.) at the University of Cincinnati's College-Conservatory of Music. While it was expected that PTP estimates after cool downs would be significantly lower than baselines and the other conditions, it turns out that PTP estimates after cool downs were significantly higher at the 80% level of the pitch range. Statistically significant correlations between PTP estimates and PPE scores were found when comparing levels of the participants' pitch ranges (10%, 20%, 80%). Mean PPE scores were highest at the 80% level of the pitch range. The acoustic measures yielded variable results. Cool-down exercises did not result in significantly more accurate tone production and shorter staccato note duration and duration of intervals between staccato notes as compared to baselines and recovery conditions. Instead, participants demonstrated greater accuracy of tone production during baselines and lesser accuracy after voice rest. Staccato notes were significantly shorter in duration after the conversation condition as compared to voice rest. 
Duration between staccato notes was significantly shorter 12-24 hours after voice rest compared to baselines and the other follow-up conditions. SVHI mean scores were higher during baselines than after the recovery conditions and during follow-up sessions. Statistical significance is noted in comparison of mean SVHI scores 12-24 hours after cool downs (overall lowest mean score) and baselines. The relationship between vocal cool downs and their aerodynamic and acoustic effects remains unclear. What was found was that perhaps the perceived benefit of vocal cool downs is not apparent immediately after their use, but is evident 12-24 hours later. While it appears that conversation may be an acceptable form of active vocal recovery, cool-down exercises may be most beneficial as they raise a conscious awareness of optimum, resonant voice use which may carryover into conversational speech. Future research may benefit from examination of long-term use of vocal cool-down exercises in subsequent vocal performance.
A Dynamic Dialog System Using Semantic Web Technologies
ERIC Educational Resources Information Center
Ababneh, Mohammad
2014-01-01
A dialog system or a conversational agent provides a means for a human to interact with a computer system. Dialog systems use text, voice and other means to carry out conversations with humans in order to achieve some objective. Most dialog systems are created with specific objectives in mind and consist of preprogrammed conversations. The primary…
Influence of Smartphones and Software on Acoustic Voice Measures
Grillo, Elizabeth U.; Brosious, Jenna N.; Sorrell, Staci L.; Anand, Supraja
2016-01-01
This study assessed the within-subject variability of voice measures captured using different recording devices (i.e., smartphones and head mounted microphone) and software programs (i.e., Analysis of Dysphonia in Speech and Voice (ADSV), Multi-dimensional Voice Program (MDVP), and Praat). Correlations between the software programs that calculated the voice measures were also analyzed. Results demonstrated no significant within-subject variability across devices and software and that some of the measures were highly correlated across software programs. The study suggests that certain smartphones may be appropriate to record daily voice measures representing the effects of vocal loading within individuals. In addition, even though different algorithms are used to compute voice measures across software programs, some of the programs and measures share a similar relationship. PMID:28775797
McCarthy-Jones, Simon; Castro Romero, Maria; McCarthy-Jones, Roseline; Dillon, Jacqui; Cooper-Rompato, Christine; Kieran, Kathryn; Kaufman, Milissa; Blackman, Lisa
2015-01-01
This paper explores the experiences of women who "hear voices" (auditory verbal hallucinations). We begin by examining historical understandings of women hearing voices, showing these have been driven by androcentric theories of how women's bodies functioned leading to women being viewed as requiring their voices be interpreted by men. We show the twentieth century was associated with recognition that the mental violation of women's minds (represented by some voice-hearing) was often a consequence of the physical violation of women's bodies. We next report the results of a qualitative study into voice-hearing women's experiences (n = 8). This found similarities between women's relationships with their voices and their relationships with others and the wider social context. Finally, we present results from a quantitative study comparing voice-hearing in women (n = 65) and men (n = 132) in a psychiatric setting. Women were more likely than men to have certain forms of voice-hearing (voices conversing) and to have antecedent events of trauma, physical illness, and relationship problems. Voices identified as female may have more positive affect than male voices. We conclude that women voice-hearers have and continue to face specific challenges necessitating research and activism, and hope this paper will act as a stimulus to such work.
Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn
2015-05-01
Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated-measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Participants were eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 years. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improve speech recognition in noise, particularly when used at the same time. 
Because ClearVoice does not degrade performance in quiet settings, clinicians should consider recommending ClearVoice for routine, full-time use for AB implant recipients. Roger should be used in all instances in which remote microphone technology may assist the user in understanding speech in the presence of noise. American Academy of Audiology.
ERIC Educational Resources Information Center
Walsh, Irene P.
2008-01-01
Background: Some people with schizophrenia are considered to have communication difficulties because of concomitant language impairment and/or because of suppressed or "unusual" communication skills due to the often-chronic nature and manifestation of the illness process. Conversations with a person with schizophrenia pose many pragmatic…
ERIC Educational Resources Information Center
Wahid, Wazira Ali Abdul; Ahmed, Eqbal Sulaiman; Wahid, Muntaha Ali Abdul
2015-01-01
This issue expresses a research study based on the online interactions of English teaching specially conversation through utilizing VOIP (Voice over Internet Protocol) and cosmopolitan online theme. Data has been achieved by interviews. Simplifiers indicate how oral tasks require to be planned upon to facilitate engagement models propitious to…
They Are Talking: Are We Listening? Using Student Voice to Enhance Culturally Responsive Teaching
ERIC Educational Resources Information Center
Anderson, Gina; Cowart, Melinda
2012-01-01
This conversational report uses student voice as data to determine whether the culture of urban sixth graders is being acknowledged and valued in the curriculum. While culturally responsive teaching has been touted by scholars as an important aspect of multicultural education and curriculum reform for at least a decade, students have seldom been…
Voice Quality and Gender Stereotypes: A Study of Lebanese Women with Reinke's Edema
ERIC Educational Resources Information Center
Matar, Nayla; Portes, Cristel; Lancia, Leonardo; Legou, Thierry; Baider, Fabienne
2016-01-01
Purpose Women with Reinke's edema (RW) report being mistaken for men during telephone conversations. For this reason, their masculine-sounding voices are interesting for the study of gender stereotypes. The study's objective is to verify their complaint and to understand the cues used in gender identification. Method Using a self-evaluation study,…
Every Reader a Reviewer: The Online Book Conversation
ERIC Educational Resources Information Center
Hoffert, Barbara
2010-01-01
Over the last 15 years, the book review landscape has changed seismically. Reviewing is no longer centralized, with a few big voices leading the way, but fractured among numerous multifarious voices found mostly on the web. In turn, readers aren't playing the captive audience any more. Undone by economics, many traditional print sources have been…
Neurobiological correlates of emotional intelligence in voice and face perception networks
Karle, Kathrin N; Ethofer, Thomas; Jacob, Heike; Brück, Carolin; Erb, Michael; Lotze, Martin; Nizielski, Sophia; Schütz, Astrid; Wildgruber, Dirk; Kreifelts, Benjamin
2018-01-01
Facial expressions and voice modulations are among the most important communicational signals to convey emotional information. The ability to correctly interpret this information is highly relevant for successful social interaction and represents an integral component of emotional competencies that have been conceptualized under the term emotional intelligence. Here, we investigated the relationship of emotional intelligence as measured with the Salovey-Caruso-Emotional-Intelligence-Test (MSCEIT) with cerebral voice and face processing using functional and structural magnetic resonance imaging. MSCEIT scores were positively correlated with increased voice-sensitivity and gray matter volume of the insula accompanied by voice-sensitivity enhanced connectivity between the insula and the temporal voice area, indicating generally increased salience of voices. Conversely, in the face processing system, higher MSCEIT scores were associated with decreased face-sensitivity and gray matter volume of the fusiform face area. Taken together, these findings point to an alteration in the balance of cerebral voice and face processing systems in the form of an attenuated face-vs-voice bias as one potential factor underpinning emotional intelligence. PMID:29365199
Human voice quality measurement in noisy environments.
Ueng, Shyh-Kuang; Luo, Cheng-Ming; Tsai, Tsung-Yu; Yeh, Hsuan-Chen
2015-01-01
Computerized acoustic voice measurement is essential for the diagnosis of vocal pathologies. Previous studies showed that ambient noise significantly affects the accuracy of voice quality assessment. This paper presents a voice quality assessment system that can accurately measure the quality of voice signals even when the input voice data are contaminated by low-frequency noise. Ambient noise in living rooms and laboratories was collected and its frequency content analyzed. Based on the analysis, a filter was designed to reduce the noise level of the input voice signal. Improved numerical algorithms are then employed to extract voice parameters from the voice signal to reveal the health of the voice. Compared with MDVP and Praat, the proposed method outperforms these two widely used programs in measuring fundamental frequency and harmonic-to-noise ratio, and its performance is comparable to theirs in computing jitter and shimmer. The proposed voice quality assessment method is resistant to low-frequency noise and can measure human voice quality in environments filled with noise from air-conditioners, ceiling fans and cooling fans of computers.
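The jitter and shimmer parameters named in this abstract can be illustrated with a minimal sketch. It uses the commonly cited "local" definitions (mean absolute difference between consecutive cycles, divided by the mean), not the improved algorithms of the paper, which the abstract does not describe. The function names and the sample period values are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean.

    Applied to glottal cycle periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (both as fractions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    return mean(diffs) / mean(values)


def fundamental_frequency(periods_s):
    """Mean fundamental frequency in Hz from cycle periods in seconds."""
    return 1.0 / mean(periods_s)


# Hypothetical cycle periods (seconds) alternating between 10 ms and 11 ms.
periods = [0.010, 0.011, 0.010, 0.011]
jitter = local_perturbation(periods)
f0 = fundamental_frequency(periods)
```

Both measures assume the cycle boundaries have already been found, which is the step most sensitive to the low-frequency noise the abstract discusses.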
Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection
NASA Astrophysics Data System (ADS)
Dov, David; Talmon, Ronen; Cohen, Israel
2016-12-01
In this paper, we address the problem of multiple view data fusion in the presence of noise and interferences. Recent studies have approached this problem using kernel methods, by relying particularly on a product of kernels constructed separately for each view. From a graph theory point of view, we analyze this fusion approach in a discrete setting. More specifically, based on a statistical model for the connectivity between data points, we propose an algorithm for the selection of the kernel bandwidth, a parameter, which, as we show, has important implications on the robustness of this fusion approach to interferences. Then, we consider the fusion of audio-visual speech signals measured by a single microphone and by a video camera pointed to the face of the speaker. Specifically, we address the task of voice activity detection, i.e., the detection of speech and non-speech segments, in the presence of structured interferences such as keyboard taps and office noise. We propose an algorithm for voice activity detection based on the audio-visual signal. Simulation results show that the proposed algorithm outperforms competing fusion and voice activity detection approaches. In addition, we demonstrate that a proper selection of the kernel bandwidth indeed leads to improved performance.
Speaking in Character: Voice Communication in Virtual Worlds
NASA Astrophysics Data System (ADS)
Wadley, Greg; Gibbs, Martin R.
This chapter summarizes 5 years of research on the implications of introducing voice communication systems to virtual worlds. Voice introduces both benefits and problems for players of fast-paced team games, from better coordination of groups and greater social presence of fellow players on the positive side, to negative features such as channel congestion, transmission of noise, and an unwillingness by some to use voice with strangers online. Similarly, in non-game worlds like Second Life, issues related to identity and impression management play important roles, as voice may build greater trust that is especially important for business users, yet it erodes the anonymity and ability to conceal social attributes like gender that are important for other users. A very different mixture of problems and opportunities exists when users conduct several simultaneous conversations in multiple text and voice channels. Technical difficulties still exist with current systems, including the challenge of debugging and harmonizing all the participants' voice setups. Different groups use virtual worlds for very different purposes, so a single modality may not suit all.
Reference-free automatic quality assessment of tracheoesophageal speech.
Huang, Andy; Falk, Tiago H; Chan, Wai-Yip; Parsa, Vijay; Doyle, Philip
2009-01-01
Evaluation of the quality of tracheoesophageal (TE) speech using machines instead of human experts can enhance the voice rehabilitation process for patients who have undergone total laryngectomy and voice restoration. Towards the goal of devising a reference-free TE speech quality estimation algorithm, we investigate the efficacy of speech signal features that are used in standard telephone-speech quality assessment algorithms, in conjunction with a recently introduced speech modulation spectrum measure. Tests performed on two TE speech databases demonstrate that the modulation spectral measure and a subset of features in the standard ITU-T P.563 algorithm estimate TE speech quality with better correlation (up to 0.9) than previously proposed features.
ERIC Educational Resources Information Center
Hipolito-Delgado, Carlos P.; Zion, Shelley
2017-01-01
Critical Civic Inquiry (CCI) is a transformative student voice initiative that engages students in critical conversations about educational equity and inquiry-based learning to increase student voice and promote civic action. A quasi-experimental study was conducted to assess if participation in CCI increased the psychological empowerment (as…
Improving Automated Lexical and Discourse Analysis of Online Chat Dialog
2007-09-01
include spelling- and grammar-checking on our word processing software; voice-recognition in our automobiles; and telephone-based conversational agents ...conversational agents can help customers make purchases on-line [3]. In addition, discourse analyzers can automatically separate multiple, interleaved...telephone-based conversational agent needs to know if it was asked a question or tasked to do something. Indeed, Stolcke et al demonstrated that
Hansen, J H; Nandkumar, S
1995-01-01
The formulation of reliable signal processing algorithms for speech coding and synthesis requires the selection of a prior criterion of performance. Though coding efficiency (bits/second) or computational requirements can be used, a final performance measure must always include speech quality. In this paper, three objective speech quality measures are considered with respect to quality assessment for American English, noisy American English, and noise-free versions of seven languages. The purpose is to determine whether objective quality measures can be used to quantify changes in quality for a given voice coding method, with a known subjective performance level, as background noise or language conditions are changed. The speech coding algorithm chosen is regular-pulse excitation with long-term prediction (RPE-LTP), which has been chosen as the standard voice compression algorithm for the European Digital Mobile Radio system. Three areas are considered for objective quality assessment: (i) vocoder performance for American English in a noise-free environment, (ii) speech quality variation for three additive background noise sources, and (iii) noise-free performance for seven languages, namely English, Japanese, Finnish, German, Hindi, Spanish, and French. It is suggested that although existing objective quality measures will never replace subjective testing, they can be a useful means of assessing changes in performance, identifying areas for improvement in algorithm design, and augmenting subjective quality tests for voice coding/compression algorithms in noise-free, noisy, and/or non-English applications.
When They Just Don't Listen. Refining Your Communication Skills.
ERIC Educational Resources Information Center
Ensman, Richard C. Jr.
1998-01-01
Presents strategies to turn one-sided conversations into meaningful dialogs. Suggests, if a conversation partner is not allowing equal time, trying the following techniques: repeat statement; keep going; match other person's voice; use rejoinder; ask harsh question; simulate anger; alter body position; wag finger; raise eyebrows; take notes; ask…
Oral Conversations Online: Redefining Oral Competence in Synchronous Environments
ERIC Educational Resources Information Center
Lamy, Marie-Noelle
2004-01-01
In this article the focus is on methodology for analysing learner-learner oral conversations mediated by computers. With the increasing availability of synchronous voice-based groupware and the additional facilities offered by audio-graphic tools, language learners have opportunities for collaborating on oral tasks, supported by visual and textual…
McCarthy-Jones, Simon; Castro Romero, Maria; McCarthy-Jones, Roseline; Dillon, Jacqui; Cooper-Rompato, Christine; Kieran, Kathryn; Kaufman, Milissa; Blackman, Lisa
2015-01-01
This paper explores the experiences of women who “hear voices” (auditory verbal hallucinations). We begin by examining historical understandings of women hearing voices, showing these have been driven by androcentric theories of how women’s bodies functioned, leading to women being viewed as requiring their voices be interpreted by men. We show the twentieth century was associated with recognition that the mental violation of women’s minds (represented by some voice-hearing) was often a consequence of the physical violation of women’s bodies. We next report the results of a qualitative study into voice-hearing women’s experiences (n = 8). This found similarities between women’s relationships with their voices and their relationships with others and the wider social context. Finally, we present results from a quantitative study comparing voice-hearing in women (n = 65) and men (n = 132) in a psychiatric setting. Women were more likely than men to have certain forms of voice-hearing (voices conversing) and to have antecedent events of trauma, physical illness, and relationship problems. Voices identified as female may have more positive affect than male voices. We conclude that women voice-hearers have faced and continue to face specific challenges necessitating research and activism, and hope this paper will act as a stimulus to such work. PMID:26779041
Voice Quality and Gender Stereotypes: A Study of Lebanese Women With Reinke's Edema.
Matar, Nayla; Portes, Cristel; Lancia, Leonardo; Legou, Thierry; Baider, Fabienne
2016-12-01
Women with Reinke's edema (RW) report being mistaken for men during telephone conversations. For this reason, their masculine-sounding voices are interesting for the study of gender stereotypes. The study's objective is to verify their complaint and to understand the cues used in gender identification. Using a self-evaluation study, we verified RW's perception of their own voices. We compared the acoustic parameters of vowels produced by 10 RW to those produced by 10 men and 10 women with healthy voices (hereafter referred to as NW) in Lebanese Arabic. We conducted a perception study for the evaluation of RW, healthy men's, and NW voices by naïve listeners. RW self-evaluated their voices as masculine and their gender identities as feminine. The acoustic parameters that distinguish RW from NW voices concern fundamental frequency, spectral slope, harmonicity of the voicing signal, and complexity of the spectral envelope. Naïve listeners very often rate RW as surely masculine. Listeners may rate RW's gender incorrectly. These incorrect gender ratings are correlated with acoustic measures of fundamental frequency and voice quality. Further investigations will reveal the contribution of each of these parameters to gender perception and guide the treatment plan of patients complaining of a gender ambiguous voice.
ERIC Educational Resources Information Center
Strandberg, Warren, Ed.
These proceedings include the following papers: "Reason and Romance in Argument and Conversation" (Margaret Buchman); "Conversation as a Romance of Reason: A Response to Margret Buchman" (James W. Garrison); "In Search of a Calling" (Thomas Buford); "Listening for the Call: A Response to Thomas Buford" (Peter Carbone); "Some Reconceptions in…
Automated Electroglottographic Inflection Events Detection. A Pilot Study.
Codino, Juliana; Torres, María Eugenia; Rubin, Adam; Jackson-Menaldi, Cristina
2016-11-01
Vocal-fold vibration can be analyzed in a noninvasive way by registering impedance changes within the glottis, through electroglottography. The morphology of the electroglottographic (EGG) signal is related to different vibratory patterns. In the literature, a characteristic knee in the descending portion of the signal has been reported. Some EGG signals do not exhibit this particular knee and have other types of events (inflection events) throughout the ascending and/or descending portion of the vibratory cycle. The goal of this work is to propose an automatic method to identify and classify these events. A computational algorithm was developed based on the mathematical properties of the EGG signal, which detects and reports events throughout the contact phase. Retrospective analysis of EGG signals obtained during routine voice evaluation of adult individuals with a variety of voice disorders was performed using the algorithm as well as human raters. Two judges, both experts in clinical voice analysis, and three general speech pathologists performed manual and visual evaluation of the sample set. The results obtained by the automatic method were compared with those of the human raters. Statistical analysis revealed a significant level of agreement. This automatic tool could allow professionals in the clinical setting to obtain an automatic quantitative and qualitative report of such events present in a voice sample, without having to manually analyze the whole EGG signal. In addition, it might provide the speech pathologist with more information that would complement the standard voice evaluation. It could also be a valuable tool in voice research. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
[Research on Control System of an Exoskeleton Upper-limb Rehabilitation Robot].
Wang, Lulu; Hu, Xin; Hu, Jie; Fang, Youfang; He, Rongrong; Yu, Hongliu
2016-12-01
To help patients with upper-limb dysfunction carry out rehabilitation training, this paper proposes an upper-limb exoskeleton rehabilitation robot with four degrees of freedom (DOF) and implements two control schemes, i.e., voice control and electromyography control. The hardware and software design of the voice control system was based on RSC-4128 chips, which provided speaker-dependent speech recognition. In addition, the study adopted self-made surface electromyogram (sEMG) signal extraction electrodes to collect sEMG signals and achieved pattern recognition by processing the sEMG signals, extracting time-domain features, and applying a fixed-threshold algorithm. A pulse-width modulation (PWM) algorithm was used to adjust the speed of the system. Voice control and electromyography control experiments were then carried out, and the results showed that the mean recognition rates of voice control and electromyography control reached 93.1% and 90.9%, respectively. The results demonstrate the feasibility of the control system. This study is expected to lay a theoretical foundation for further improvement of the control system of the upper-limb rehabilitation robot.
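The combination of time-domain sEMG features, a fixed threshold, and PWM speed adjustment mentioned in this abstract can be sketched roughly as follows. The abstract does not say which features or threshold values were used, so the choice of mean absolute value and root mean square, the threshold of 0.2, and all function names here are assumptions for illustration only.

```python
import math


def mean_absolute_value(window):
    """Time-domain feature: average rectified amplitude of one sEMG window."""
    return sum(abs(x) for x in window) / len(window)


def root_mean_square(window):
    """Time-domain feature: RMS amplitude of one sEMG window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def classify_window(window, threshold=0.2):
    """Fixed-threshold decision: 'active' if the MAV exceeds the threshold.

    The threshold value is hypothetical; in practice it would be calibrated
    per electrode placement and per user.
    """
    return "active" if mean_absolute_value(window) > threshold else "rest"


def pwm_duty_cycle(speed_fraction):
    """Map a requested speed (0..1) to a PWM duty cycle, clamped to [0, 1]."""
    return max(0.0, min(1.0, speed_fraction))


def average_voltage(duty_cycle, supply_voltage):
    """Average voltage delivered by an ideal PWM stage at a given duty cycle."""
    return duty_cycle * supply_voltage


# Hypothetical normalized sEMG windows.
state_contracting = classify_window([0.5, -0.4, 0.6, -0.5])
state_resting = classify_window([0.01, -0.02, 0.015, -0.01])
```

A fixed threshold is simple enough for a small embedded controller, at the cost of needing recalibration when electrode contact or muscle fatigue changes the signal level.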
Voice Over Internet Protocol (VoIP) in a Control Center Environment
NASA Technical Reports Server (NTRS)
Pirani, Joseph; Calvelage, Steven
2010-01-01
The technology of transmitting voice over data networks has been available for over 10 years. Mass market VoIP services for consumers to make and receive standard telephone calls over broadband Internet networks have grown in the last 5 years. While operational costs are lower with VoIP implementations than with time division multiplexing (TDM) based voice switches, is it still advantageous to convert a mission control center's voice system to this newer technology? Marshall Space Flight Center (MSFC) Huntsville Operations Support Center (HOSC) has converted its mission voice services to a commercial product that utilizes VoIP technology. Results from this testing, design, and installation have shown unique considerations that must be addressed before user operations. There are many factors to consider for a control center voice design. Technology advantages and disadvantages were investigated as they relate to cost. There were integration concerns which could lead to complex failure scenarios but simpler integration for the mission infrastructure. MSFC HOSC will benefit from this voice conversion with lower product replacement cost, lower operations cost and a more integrated mission services environment.
MoNET: media over net gateway processor for next-generation network
NASA Astrophysics Data System (ADS)
Elabd, Hammam; Sundar, Rangarajan; Dedes, John
2001-12-01
The MoNET™ (Media over Net) SX000 product family is designed using a scalable voice, video and packet-processing platform to address applications with channel densities from a few voice channels to four OC3 per card. This platform is developed for bridging the public circuit-switched network to the next-generation packet telephony and data network. The platform consists of a DSP farm, RISC processors and interface modules. The DSP farm is required to execute voice compression, image compression and line echo cancellation algorithms for a large number of voice, video, fax, and modem or data channels. RISC CPUs are used for performing various packetizations based on RTP, UDP/IP and ATM encapsulations. In addition, RISC CPUs also participate in the DSP farm load management and communication with the host and other MoP devices. The MoNET™ S1000 communications device is designed for voice processing and for bridging TDM to ATM and IP packet networks. The S1000 consists of the DSP farm based on the Carmel DSP core and a 32-bit RISC CPU, along with Ethernet, Utopia, PCI, and TDM interfaces. In this paper, we will describe the VoIP infrastructure, building blocks of the S500, S1000 and S3000 devices, algorithms executed on these devices and associated channel densities, detailed DSP architecture, memory architecture, data flow and scheduling.
Talking about Emotion: Prosody and Skin Conductance Indicate Emotion Regulation.
Matejka, Moritz; Kazzer, Philipp; Seehausen, Maria; Bajbouj, Malek; Klann-Delius, Gisela; Menninghaus, Winfried; Jacobs, Arthur M; Heekeren, Hauke R; Prehn, Kristin
2013-01-01
Talking about emotion and putting feelings into words has been hypothesized to regulate emotion in psychotherapy as well as in everyday conversation. However, the exact dynamics of how different strategies of verbalization regulate emotion and how these strategies are reflected in characteristics of the voice have received little scientific attention. In the present study, we showed emotional pictures to 30 participants and asked them to verbally admit or deny an emotional experience or a neutral fact concerning the picture in a simulated conversation. We used a 2 × 2 factorial design manipulating the focus (on emotion or facts) as well as the congruency (admitting or denying) of the verbal expression. Analyses of skin conductance response (SCR) and voice during the verbalization conditions revealed a main effect of the factor focus. SCR and pitch of the voice were lower during emotion compared to fact verbalization, indicating lower autonomic arousal. In contradiction to these physiological parameters, participants reported that fact verbalization was more effective in down-regulating their emotion than emotion verbalization. These subjective ratings, however, were in line with voice parameters associated with emotional valence. That is, voice intensity showed that fact verbalization reduced negative valence more than emotion verbalization. In sum, the results of our study provide evidence that emotion verbalization as compared to fact verbalization is an effective emotion regulation strategy. Moreover, based on the results of our study we propose that different verbalization strategies influence valence and arousal aspects of emotion selectively.
Climate Voices: Bridging Scientist Citizens and Local Communities across the United States
NASA Astrophysics Data System (ADS)
Wegner, K.; Ristvey, J. D., Jr.
2016-12-01
Based at the University Corporation for Atmospheric Research (UCAR), the Climate Voices Science Speakers Network (climatevoices.org) has more than 400 participants across the United States who volunteer their time as scientist citizens in their local communities. Climate Voices experts engage in nonpartisan conversations about the local impacts of climate change with groups such as Rotary clubs, collaborate with faith-based groups on climate action initiatives, and disseminate their research findings to K-12 teachers and classrooms through webinars. To support their participants, Climate Voices develops partnerships with networks of community groups, provides trainings on how to engage these communities, and actively seeks community feedback. In this presentation, we will share case studies of science-community collaborations, including meta-analyses of collaborations and lessons learned.
Power flow in normal human voice production
NASA Astrophysics Data System (ADS)
Krane, Michael
2016-11-01
The principal mechanisms of energy utilization in voicing are quantified using a simplified model, in order to better define voice efficiency. A control volume analysis of energy utilization in phonation is presented to identify the energy transfer mechanisms in terms of their function. Conversion of subglottal airstream potential energy into useful work done (vocal fold vibration, flow work, sound radiation), and into heat (sound radiation absorbed by the lungs, glottal jet dissipation) is described. An approximate numerical model is used to compute the contributions of each of these mechanisms, as a function of subglottal pressure, for normal phonation. The author acknowledges support of NIH Grant 2R01DC005642-10A1.
Modeling Civic Engagement: A Student Conversation with Jonathan Kozol
ERIC Educational Resources Information Center
Thacker, Peter; Christen, Richard S.
2006-01-01
Jonathan Kozol's visit to Portland, Oregon, in April 2005 included a dialogue with 55 urban middle and high school students about inequities in American schools. Students left this conversation with a stronger sense of the systemic impediments to equal education. They also felt that their voice had been heard on a topic of national import. This…
A Study of Metric Conversion of Distilled Spirits Containers: A Policy and Planning Evaluation.
1981-08-01
alcoholic beverage industry to: "... study the problems expected to arise with conversion to the metric system and to publicly report as soon as... beverage industry had better act quickly (i.e. decide on its metric sizes) in order to have a voice in this decision. This will enable us, he states
ERIC Educational Resources Information Center
Tveit, Anne Dorthe
2009-01-01
This study analyses texts from the National Parents' Committee for Primary and Lower Secondary Education in Norway and addresses how parents describe their own role, the teachers' role, and their conversations. The theoretical perspective employed is Koselleck's conceptual theory. The findings show that, despite having formal legal rights, parents…
The effects of voice and manual control mode on dual task performance
NASA Technical Reports Server (NTRS)
Wickens, C. D.; Zenyuh, J.; Culp, V.; Marshak, W.
1986-01-01
Two fundamental principles of human performance, compatibility and resource competition, are combined with two structural dichotomies in the human information processing system, manual versus voice output, and left versus right cerebral hemisphere, in order to predict the optimum combination of voice and manual control with either hand, for time-sharing performance of a discrete and continuous task. Eight right-handed male subjects performed a discrete first-order tracking task, time-shared with an auditorily presented Sternberg Memory Search Task. Each task could be controlled by voice, or by the left or right hand, in all possible combinations except for a dual voice mode. When performance was analyzed in terms of a dual-task decrement from single-task control conditions, the following variables influenced time-sharing efficiency in diminishing order of magnitude: (1) the modality of control (discrete manual control of tracking was superior to discrete voice control of tracking, and the converse was true with the memory search task); (2) response competition (performance was degraded when both tasks were responded to manually); (3) hemispheric competition (performance degraded whenever two tasks were controlled by the left hemisphere, i.e., voice or right-handed control). The results confirm the value of predictive models in voice control implementation.
IP voice over ATM satellite: experimental results over satellite channels
NASA Astrophysics Data System (ADS)
Saraf, Koroush A.; Butts, Norman P.
1999-01-01
IP telephony, a new technology to provide voice communication over traditional data networks, has the potential to revolutionize telephone communication within the modern enterprise. This innovation uses packetization techniques to carry voice conversations over IP networks. This packet-switched technology promises new integrated services and lower-cost long-distance communication compared to traditional circuit-switched telephone networks. Future satellites will need to carry IP traffic efficiently in order to stay competitive in servicing the global data-networking and global telephony infrastructure. However, the effects of Voice over IP over switched satellite channels have not been investigated in detail. To fully understand the effects of satellite channels on Voice over IP quality, several experiments were conducted at Lockheed Martin Telecommunications' Satellite Integration Lab. The results of those experiments, along with suggested improvements for voice communication over satellite, are presented in this document. First, a detailed introduction of IP telephony as a suitable technology for voice communication over future satellites is presented. This is followed by procedures for the experiments, along with results and strategies. In conclusion, we hope that these capability demonstrations will alleviate any uncertainty regarding the applicability of this technology to satellite networks.
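The packetization this abstract refers to can be made concrete with a short bandwidth calculation. The abstract names no codec or transport, so the sketch assumes a common arrangement: voice frames carried in RTP over UDP over IPv4, with a 64 kbit/s codec and 20 ms frames as example inputs. Link-layer (e.g. ATM) overhead is deliberately left out, and the function name is hypothetical.

```python
# Header sizes in bytes for the assumed protocol stack.
IPV4_HEADER_BYTES = 20  # IPv4 header without options
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12   # fixed RTP header, no CSRC entries


def voip_packet_stats(codec_bitrate_bps, frame_ms):
    """Return (payload_bytes, packets_per_second, ip_bandwidth_bps).

    One packet carries one frame of coded speech; bandwidth is measured at
    the IP layer, so it excludes any link-layer framing.
    """
    payload_bytes = codec_bitrate_bps * frame_ms // (8 * 1000)
    packets_per_second = 1000 / frame_ms
    packet_bytes = (payload_bytes + IPV4_HEADER_BYTES
                    + UDP_HEADER_BYTES + RTP_HEADER_BYTES)
    ip_bandwidth_bps = packet_bytes * 8 * packets_per_second
    return payload_bytes, packets_per_second, ip_bandwidth_bps


# Example inputs (assumed, not from the abstract): 64 kbit/s codec, 20 ms frames.
payload, pps, bandwidth = voip_packet_stats(64_000, 20)
```

With these inputs the 40 bytes of headers add a quarter to the 160-byte payload. Longer frames reduce that overhead but add delay, which matters on a satellite hop where propagation delay is already large.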
Outcomes Measurement in Voice Disorders: Application of an Acoustic Index of Dysphonia Severity
ERIC Educational Resources Information Center
Awan, Shaheen N.; Roy, Nelson
2009-01-01
Purpose: The purpose of this experiment was to assess the ability of an acoustic model composed of both time-based and spectral-based measures to track change following voice disorder treatment and to serve as a possible treatment outcomes measure. Method: A weighted, four-factor acoustic algorithm consisting of shimmer, pitch sigma, the ratio of…
Dejonckere, P H; Neumann, K J; Moerman, M B J; Martens, J P; Giordano, A; Manfredi, C
2012-04-01
Spasmodic dysphonia voices form, in the same way as substitution voices, a particular category of dysphonia that seems not suited for a standardized basic multidimensional assessment protocol, like the one proposed by the European Laryngological Society. Thirty-three exhaustive analyses were performed on voices of 19 patients diagnosed with adductor spasmodic dysphonia (SD), before and after treatment with Botulinum toxin. The speech material consisted of 40 short sentences phonetically selected for constant voicing. Seven perceptual parameters (traditional and dedicated) were blindly rated by a panel of experienced clinicians. Nine acoustic measures (mainly based on voicing evidence and periodicity) were achieved by a special analysis program suited for strongly irregular signals and validated with synthesized deviant voices. Patients also filled in a VHI-questionnaire. Significant improvement is shown by all three approaches. The traditional GRB perceptual parameters appear to be adequate for these patients. Conversely, the special acoustic analysis program is successful in objectivating the improved regularity of vocal fold vibration: the basic jitter remains the most valuable parameter, when reliably quantified. The VHI is well suited for the voice-related quality of life. Nevertheless, when considering pre-therapy and post-therapy changes, the current study illustrates a complete lack of correlation between the perceptual, acoustic, and self-assessment dimensions. Assessment of SD-voices needs to be tridimensional.
Start/End Delays of Voiced and Unvoiced Speech Signals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herrnstein, A
Recent experiments using low-power EM radar-like sensors (e.g., GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place. Similarly, the end time of a voiced speech segment can be measured. Secondly, it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications, it is useful to know typical duration times of these unvoiced speech segments. A corpus, assembled earlier, of spoken "Timit" words, phrases, and sentences, recorded using simultaneously measured acoustic and EM-sensor glottal signals from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech, using the acoustic signal, and the onset (or end) of voiced speech using the EM sensor signal, the average duration times for unvoiced segments preceding onset of vocalization were found to be 300 ms, and for following segments, 500 ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal as the onset-time marker for the voiced speech segment and end marker for the unvoiced segment. Then, by subtracting 300 ms from the onset time mark of voicing, the unvoiced speech segment start time is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven to be useful for work in our laboratory, a great deal of additional work remains to validate such data for use with general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
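The timing rule in this abstract can be written as a small sketch: given the voiced onset and end times from the EM sensor, the preceding unvoiced window starts 300 ms before onset. Treating the following window as the 500 ms after the voiced end is an interpretation of "similarly" in the abstract, and the clamp at time zero and the function name are assumptions.

```python
PRECEDING_UNVOICED_MS = 300  # average duration reported in the abstract
FOLLOWING_UNVOICED_MS = 500  # average duration reported in the abstract


def unvoiced_windows(voiced_onset_ms, voiced_end_ms):
    """Estimate unvoiced windows around one voiced segment.

    Returns ((start, end) of the preceding unvoiced window,
             (start, end) of the following unvoiced window), in milliseconds.
    """
    if voiced_end_ms < voiced_onset_ms:
        raise ValueError("voiced segment ends before it starts")
    # Clamp at 0 so a segment near the start of the recording stays valid
    # (an assumption; the abstract does not discuss this case).
    preceding = (max(0, voiced_onset_ms - PRECEDING_UNVOICED_MS), voiced_onset_ms)
    following = (voiced_end_ms, voiced_end_ms + FOLLOWING_UNVOICED_MS)
    return preceding, following


# Hypothetical voiced segment from 1000 ms to 1800 ms.
before, after = unvoiced_windows(1000, 1800)
```

Because the 300 ms and 500 ms figures are corpus averages from 16 male speakers, windows built this way are estimates rather than measured boundaries.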
Writing as Involvement: A Case for Face-to-Face Classroom Talk in a Computer Age.
ERIC Educational Resources Information Center
Berggren, Anne G.
The abandonment of face-to-face voice conversations in favor of the use of electronic conversations in composition classes is an issue to be interrogated. In a recent push to "prepare students for the 21st century," teachers are asked to teach computer applications in the humanities--and composition teachers, who will teach writing in…
ERIC Educational Resources Information Center
Arend, Béatrice; Sunnen, Patrick
2017-01-01
Our paper provides an empirically based perspective on the contribution of Conversation Analysis (CA) to our understanding of children's second language learning practices in a multilingual classroom setting. While exploring the interactional configuration of a French second language learning activity, we focus our analytic lens on how five…
Automatic measurement of voice onset time using discriminative structured prediction.
Sonderegger, Morgan; Keshet, Joseph
2012-12-01
A discriminative large-margin algorithm for automatic measurement of voice onset time (VOT) is described, considered as a case of predicting structured output from speech. Manually labeled data are used to train a function that takes as input a speech segment of an arbitrary length containing a voiceless stop, and outputs its VOT. The function is explicitly trained to minimize the difference between predicted and manually measured VOT; it operates on a set of acoustic feature functions designed based on spectral and temporal cues used by human VOT annotators. The algorithm is applied to initial voiceless stops from four corpora, representing different types of speech. Using several evaluation methods, the algorithm's performance is near human intertranscriber reliability, and compares favorably with previous work. Furthermore, the algorithm's performance is minimally affected by training and testing on different corpora, and remains essentially constant as the amount of training data is reduced to 50-250 manually labeled examples, demonstrating the method's practical applicability to new datasets.
Enlightened Use of Passive Voice in Technical Writing
NASA Technical Reports Server (NTRS)
Trammell, M. K.
1981-01-01
The passive voice as a normal, acceptable, and established syntactic form in technical writing is defended. Passive/active verb ratios, taken from sources including 'antipassivist' text books, are considered. The suitability of the passive voice in technical writing which involves unknown or irrelevant agents is explored. Three 'myths' that the passive (1) utilizes an abnormal and artificial word order, (2) is lifeless, and (3) is indirect are considered. Awkward and abnormal sounding examples encountered in text books are addressed in terms of original context. Unattractive or incoherent passive sentences are explained in terms of inappropriate conversion from active sentences having (1) short nominal or pronominal subjects or (2) verbs with restrictions on their passive use.
Politeness, emotion, and gender: A sociophonetic study of voice pitch modulation
NASA Astrophysics Data System (ADS)
Yuasa, Ikuko
The present dissertation is a cross-gender and cross-cultural sociophonetic exploration of voice pitch characteristics utilizing speech data derived from Japanese and American speakers in natural conversations. The roles of voice pitch modulation in terms of the concepts of politeness and emotion as they pertain to culture and gender will be investigated herein. The research interprets the significance of my findings based on the acoustic measurements of speech data as they are presented in the ERB-rate scale (the most appropriate scale for human speech perception). The investigation reveals that pitch range modulation displayed by Japanese informants in two types of conversations is closely linked to types of politeness adopted by those informants. The degree of the informants' emotional involvement and expressions reflected in differing pitch range widths plays an important role in determining the relationship between pitch range modulation and politeness. The study further correlates the Japanese cultural concept of enryo ("self-restraint") with this phenomenon. When median values were examined, male and female pitch ranges across cultures did not conspicuously differ. However, sporadically occurring women's pitch characteristics which culturally differ in width and height of pitch ranges may create an 'emotional' perception of women's speech style. The salience of these pitch characteristics appears to be the source of the stereotypically linked sound of women's speech being identified as 'swoopy' or 'shrill' and thus 'emotional'. Such women's salient voice characteristics are interpreted in light of camaraderie/positive politeness. Women's use of conspicuous paralinguistic features helps to create an atmosphere of camaraderie. 
These voice pitch characteristics promote the establishment of a sense of camaraderie since they act to emphasize such feelings as concern, support, and comfort towards addressees. Moreover, men's wide pitch ranges are discussed in view of politeness (rather than gender). Japanese men's use of wide pitch ranges during conversations with familiar interlocutors demonstrates the extent to which male speakers can increase their pitch ranges if there is an authentic socio-cultural inspiration (other than a gender-related one) to do so. The findings suggest the necessity of interpreting research data in consideration of how the notion of gender interacts with other socio-cultural behavioral norms.
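The abstract above reports pitch on the ERB-rate scale but does not say which formula was used. As an illustration only, the sketch below assumes the widely cited Glasberg and Moore (1990) mapping, ERB-rate = 21.4 log10(0.00437 f + 1) with f in Hz; the function names are hypothetical.

```python
import math


def hz_to_erb_rate(f_hz):
    """Convert a frequency in Hz to ERB-rate (Glasberg and Moore 1990 formula).

    This mapping is an assumption of the sketch; the dissertation may have
    used a different ERB-rate formula.
    """
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


def pitch_range_erb(f_low_hz, f_high_hz):
    """Width of a pitch range expressed in ERB-rate units."""
    return hz_to_erb_rate(f_high_hz) - hz_to_erb_rate(f_low_hz)
```

Because the scale is compressive, a 100 Hz span low in the voice range (for example 100 to 200 Hz) covers more ERB-rate units than the same span at 1000 to 1100 Hz, which is why pitch ranges of speakers with different average F0 are easier to compare on this scale than in raw Hz.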
Development of a mobile satellite communication unit
NASA Technical Reports Server (NTRS)
Suzuki, Ryutaro; Ikegami, Tetsushi; Hamamoto, Naokazu; Taguchi, Tetsu; Endo, Nobuhiro; Yamamoto, Osamu; Ichiyoshi, Osamu
1988-01-01
A compact 210(W) x 280(H) x 330(D) mm mobile terminal capable of transmitting voice and data through L-band mobile satellites is described. The Voice Codec can convert an analog voice to or from digital codes at rates of 9.6, 8 and 4.8 kb/s by an MPC algorithm. The terminal runs on power supplied by a single 12 V vehicle battery. The equipment can operate at any L-band frequency allocated for mobile uses in a full duplex mode and will soon be put into a field test via Japan's ETS-V satellite.
Participatory Visual Methods: Revisioning the Future of Adult Education
ERIC Educational Resources Information Center
Lawrence, Randee Lipson
2017-01-01
This chapter brings together significant themes in the previous chapters, including collaborative research partnerships, voice and agency, self-image, relationships, multiple ways of knowing, difficult conversations, social change, and alternative adult education.
‘Inner voices’: the cerebral representation of emotional voice cues described in literary texts
Kreifelts, Benjamin; Gößling-Arnold, Christina; Wertheimer, Jürgen; Wildgruber, Dirk
2014-01-01
While non-verbal affective voice cues are generally recognized as a crucial behavioral guide in any day-to-day conversation their role as a powerful source of information may extend well beyond close-up personal interactions and include other modes of communication such as written discourse or literature as well. Building on the assumption that similarities between the different ‘modes’ of voice cues may not only be limited to their functional role but may also include cerebral mechanisms engaged in the decoding process, the present functional magnetic resonance imaging study aimed at exploring brain responses associated with processing emotional voice signals described in literary texts. Emphasis was placed on evaluating ‘voice’ sensitive as well as task- and emotion-related modulations of brain activation frequently associated with the decoding of acoustic vocal cues. Obtained findings suggest that several similarities emerge with respect to the perception of acoustic voice signals: results identify the superior temporal, lateral and medial frontal cortex as well as the posterior cingulate cortex and cerebellum to contribute to the decoding process, with similarities to acoustic voice perception reflected in a ‘voice’-cue preference of temporal voice areas as well as an emotion-related modulation of the medial frontal cortex and a task-modulated response of the lateral frontal cortex. PMID:24396008
Full Duplex, Spread Spectrum Radio System
NASA Technical Reports Server (NTRS)
Harvey, Bruce A.
2000-01-01
The goal of this project was to support the development of a full duplex, spread spectrum voice communications system. The assembly and testing of a prototype system consisting of a Harris PRISM spread spectrum radio, a TMS320C54x signal processing development board and a Zilog Z80180 microprocessor was underway at the start of this project. The efforts under this project were the development of multiple access schemes, analysis of full duplex voice feedback delays, and the development and analysis of forward error correction (FEC) algorithms. The multiple access analysis involved the selection between code division multiple access (CDMA), frequency division multiple access (FDMA) and time division multiple access (TDMA). Full duplex voice feedback analysis involved the analysis of packet size and delays associated with full loop voice feedback for confirmation of radio system performance. FEC analysis included studies of the performance under the expected burst error scenario with the relatively short packet lengths, and analysis of implementation in the TMS320C54x digital signal processor. When the capabilities and the limitations of the components used were considered, the multiple access scheme chosen was a combination TDMA/FDMA scheme that will provide up to eight users on each of three separate frequencies. Packets to and from each user will consist of 16 samples at a rate of 8,000 samples per second for a total of 2 ms of voice information. The resulting voice feedback delay will therefore be 4 - 6 ms. The most practical FEC algorithm for implementation was a convolutional code with a Viterbi decoder. Interleaving of the bits of each packet will be required to offset the effects of burst errors.
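The packet figures quoted in the abstract above can be checked with a few lines of arithmetic. This is a small sketch using only the numbers stated there (16 samples per packet, 8,000 samples per second, eight users on each of three frequencies); the constant and function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000      # voice sampling rate from the abstract
SAMPLES_PER_PACKET = 16    # samples carried in each packet
USERS_PER_FREQUENCY = 8    # TDMA slots on each frequency
FREQUENCIES = 3            # FDMA channels


def packet_duration_ms(samples=SAMPLES_PER_PACKET, rate_hz=SAMPLE_RATE_HZ):
    """Duration of voice carried by one packet, in milliseconds."""
    return 1000.0 * samples / rate_hz


def total_users(users_per_frequency=USERS_PER_FREQUENCY, frequencies=FREQUENCIES):
    """Maximum number of simultaneous users in the combined TDMA/FDMA scheme."""
    return users_per_frequency * frequencies
```

Here `packet_duration_ms()` gives 2.0 ms, matching the abstract, and `total_users()` gives 24. The quoted 4 - 6 ms feedback delay therefore equals two to three packet durations.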
Vogel, Adam P; Block, Susan; Kefalianos, Elaina; Onslow, Mark; Eadie, Patricia; Barth, Ben; Conway, Laura; Mundt, James C; Reilly, Sheena
2015-04-01
To investigate the feasibility of adopting automated interactive voice response (IVR) technology for remotely capturing standardized speech samples from stuttering children. Participants were 10 6-year-old stuttering children. Their parents called a toll-free number from their homes and were prompted to elicit speech from their children using a standard protocol involving conversation, picture description and games. The automated IVR system was implemented using an off-the-shelf telephony software program and delivered by a standard desktop computer. The software infrastructure utilizes voice over internet protocol. Speech samples were automatically recorded during the calls. Video recordings were simultaneously acquired in the home at the time of the call to evaluate the fidelity of the telephone collected samples. Key outcome measures included syllables spoken, percentage of syllables stuttered and an overall rating of stuttering severity using a 10-point scale. Data revealed a high level of relative reliability in terms of intra-class correlation between the video and telephone acquired samples on all outcome measures during the conversation task. Findings were less consistent for speech samples during picture description and games. Results suggest that IVR technology can be used successfully to automate remote capture of child speech samples.
Deng, Xingjuan; Chen, Ji; Shuai, Jie
2009-08-01
To improve the efficiency of aphasia rehabilitation training, an artificial intelligence scheduling function was added to the aphasia rehabilitation software, improving the software's performance. Taking into account the characteristics of aphasia patients' voices and the needs of the artificial intelligence scheduling function, the authors designed an endpoint detection algorithm. It determines reference endpoints, then uses them to extract each word and to place reasonable segmentation points between consonants and vowels. Experimental results show that the algorithm detects its targets at a higher accuracy rate. It is therefore applicable to endpoint detection in the speech of aphasia patients.
Former Auctioneer Finds Voice After Aphasia
... And, people in trials also benefit from the social interaction. They become part of our group and we try to create an enjoyable, welcoming, and supportive environment." Now, five years ... social situations and continue normal activities. Limit your conversation ...
Acoustic analysis of speech under stress.
Sondhi, Savita; Khan, Munna; Vijay, Ritu; Salhan, Ashok K; Chouhan, Satish
2015-01-01
When a person is emotionally charged, stress could be discerned in his voice. This paper presents a simplified and a non-invasive approach to detect psycho-physiological stress by monitoring the acoustic modifications during a stressful conversation. Voice database consists of audio clips from eight different popular FM broadcasts wherein the host of the show vexes the subjects who are otherwise unaware of the charade. The audio clips are obtained from real-life stressful conversations (no simulated emotions). Analysis is done using PRAAT software to evaluate mean fundamental frequency (F0) and formant frequencies (F1, F2, F3, F4) both in neutral and stressed state. Results suggest that F0 increases with stress; however, formant frequency decreases with stress. Comparison of Fourier and chirp spectra of short vowel segment shows that for relaxed speech, the two spectra are similar; however, for stressed speech, they differ in the high frequency range due to increased pitch modulation.
Using fuzzy data mining to diagnose patients' degrees of melancholia
NASA Astrophysics Data System (ADS)
Huang, Yo-Ping; Kuo, Wen-Lin
2011-06-01
The common treatments for melancholia are psychotherapy and medication. Psychotherapy, which this study focuses on, is limited by time and location. Psychiatrists can readily gather information from clinical manifestations, but it is difficult for them to collect information from patients' daily conversations or emotions. A system that enables psychiatrists to capture patients' daily symptoms would be of great help in treatment. This study proposes using a fuzzy data mining algorithm to find association rules among keywords segmented from patients' daily voice/text messages, to help psychiatrists extract useful information before outpatient service. Patients with melancholia can use devices such as mobile phones or computers to record their emotions anytime and anywhere and then upload the recorded files to the back-end server for further analysis. Psychiatrists can use the analytical results to diagnose patients' degrees of melancholia. Experimental results will be given to verify the effectiveness of the proposed methodology.
Pinheiro, Ana P; Rezaii, Neguine; Rauber, Andréia; Nestor, Paul G; Spencer, Kevin M; Niznikiewicz, Margaret
2017-09-01
Abnormalities in self-other voice processing have been observed in schizophrenia, and may underlie the experience of hallucinations. More recent studies demonstrated that these impairments are enhanced for speech stimuli with negative content. Nonetheless, few studies probed the temporal dynamics of self versus nonself speech processing in schizophrenia and, particularly, the impact of semantic valence on self-other voice discrimination. In the current study, we examined these questions, and additionally probed whether impairments in these processes are associated with the experience of hallucinations. Fifteen schizophrenia patients and 16 healthy controls listened to 420 prerecorded adjectives differing in voice identity (self-generated [SGS] versus nonself speech [NSS]) and semantic valence (neutral, positive, and negative), while EEG data were recorded. The N1, P2, and late positive potential (LPP) ERP components were analyzed. ERP results revealed group differences in the interaction between voice identity and valence in the P2 and LPP components. Specifically, LPP amplitude was reduced in patients compared with healthy subjects for SGS and NSS with negative content. Further, auditory hallucinations severity was significantly predicted by LPP amplitude: the higher the SAPS "voices conversing" score, the larger the difference in LPP amplitude between negative and positive NSS. The absence of group differences in the N1 suggests that self-other voice processing abnormalities in schizophrenia are not primarily driven by disrupted sensory processing of voice acoustic information. The association between LPP amplitude and hallucination severity suggests that auditory hallucinations are associated with enhanced sustained attention to negative cues conveyed by a nonself voice. © 2017 Society for Psychophysiological Research.
[Relevance of psychosocial factors in speech rehabilitation after laryngectomy].
Singer, S; Fuchs, M; Dietz, A; Klemm, E; Kienast, U; Meyer, A; Oeken, J; Täschner, R; Wulke, C; Schwarz, R
2007-12-01
It is often assumed that psychosocial and sociodemographic factors determine the success of voice rehabilitation after laryngectomy. The aim of this study was to analyze the association between these parameters. Based on the tumor registries of six ENT clinics, all patients who had been laryngectomized in the preceding years were surveyed (N = 190). Success of voice rehabilitation was assessed as speech intelligibility, measured with the postlaryngectomy-telephone-intelligibility-test. For the assessment of the psychosocial parameters, validated and standardized instruments were used where possible. Statistical analysis was done by multiple logistic regression. Low speech intelligibility is associated with reduced conversations (OR 0.970) and social activity (OR 1.049). Patients are more likely to talk with esophageal voice when their motivation for learning the new voice was high (OR 7.835) and when they assessed their speech therapist as important for their motivation (OR 4.794). The risk of communicating merely by whispering is higher when patients live together with a partner (OR 5.293), when they talk seldom (OR 1.017) and when they are not very active in social contexts (OR 0.966). Psychosocial factors can only partly explain how voice rehabilitation after laryngectomy becomes a success. Speech intelligibility is associated with active communication behaviour, whereas the use of an esophageal voice is correlated with motivation. It seems that the acquisition of tracheoesophageal puncture voice is independent of psychosocial factors.
2005-03-01
conversations over data networks. Many organizations are replacing portions of their traditional phone systems to gain the benefits of cost savings and...relevant to the Coast Guard. It includes the discussion of the public switched telephone network, an overview of IPT, IPT security issues, the...transmitting voice conversations over data networks. Many organizations are replacing portions of their traditional phone systems to gain the benefits of
Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair
NASA Astrophysics Data System (ADS)
Sasou, Akira; Kojima, Hiroaki
2009-12-01
Conventional voice-driven wheelchairs usually employ headset microphones that are capable of achieving sufficient recognition accuracy, even in the presence of surrounding noise. However, such interfaces require users to wear sensors such as a headset microphone, which can be an impediment, especially for users with hand disabilities. Conversely, it is also well known that speech recognition accuracy degrades drastically when the microphone is placed far from the user. In this paper, we develop a noise robust speech recognition system for a voice-driven wheelchair. This system can achieve almost the same recognition accuracy as the headset microphone without requiring the user to wear sensors. Experiments in different environments verified the effectiveness of the system and confirmed this result.
NASA Astrophysics Data System (ADS)
Modegi, Toshio
Using our previously developed audio-to-MIDI code converter tool “Auto-F”, we can create MIDI data from given vocal acoustic signals, which enables playback of voice-like signals with a standard MIDI synthesizer. Applying this tool, we are constructing a MIDI database consisting of simple harmonic-structured MIDI codes previously converted from a set of 71 recorded Japanese male and female syllable signals. We are also developing a novel voice synthesizing system based on harmonically synthesizing musical sounds, which, given plain Japanese (kana) text, can generate MIDI data and play back voice signals with a MIDI synthesizer by referring to the syllable MIDI code database. In this paper, we propose an improved MIDI converter tool that can produce MIDI codes with higher temporal resolution. We then propose an algorithm that separates a set of 20 consonant and vowel phoneme MIDI codes from the 71 syllable MIDI codes in order to construct a voice synthesizing system. Finally, we present an evaluation comparing the voice synthesizing quality of these separated phoneme MIDI codes with that of their original syllable MIDI codes, using 4-syllable word listening tests that we developed.
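The abstract above does not describe how "Auto-F" maps spectral components to MIDI codes. As background only, the sketch below shows the standard equal-tempered relation between frequency and MIDI note number (A4 = 440 Hz = note 69), which any audio-to-MIDI conversion has to apply in some form. The function names are hypothetical and nothing here is taken from the tool itself.

```python
import math

A4_HZ = 440.0   # reference pitch
A4_NOTE = 69    # MIDI note number of A4


def hz_to_midi_note(f_hz):
    """Nearest MIDI note number (0-127) for a frequency in Hz."""
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    note = round(A4_NOTE + 12 * math.log2(f_hz / A4_HZ))
    return min(127, max(0, note))  # clamp to the valid MIDI note range


def midi_note_to_hz(note):
    """Equal-tempered frequency in Hz of a MIDI note number."""
    return A4_HZ * 2 ** ((note - A4_NOTE) / 12)
```

Rounding to the nearest semitone discards up to half a semitone of pitch detail per component, so a converter that targets voice-like playback would presumably need finer control (for example pitch bend); that is a general observation, not a claim about the tool in the abstract.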
NASA Astrophysics Data System (ADS)
Slavata, Oldřich; Holub, Jan
2015-02-01
This paper deals with an analysis of the relation between the codec that is used, the QoS method, and the final voice transmission quality. The Cisco 2811 router is used for adjusting QoS. VoIP client Linphone is used for adjusting the codec. The criterion for transmission quality is the MOS parameter investigated with the ITU-T P.862 PESQ and P.863 POLQA algorithms.
"Who" is saying "what"? Brain-based decoding of human voice and speech.
Formisano, Elia; De Martino, Federico; Bonte, Milene; Goebel, Rainer
2008-11-07
Can we decipher speech content ("what" is being said) and speaker identity ("who" is saying it) from observations of brain activity of a listener? Here, we combine functional magnetic resonance imaging with a data-mining algorithm and retrieve what and whom a person is listening to from the neural fingerprints that speech and voice signals elicit in the listener's auditory cortex. These cortical fingerprints are spatially distributed and insensitive to acoustic variations of the input so as to permit the brain-based recognition of learned speech from unknown speakers and of learned voices from previously unheard utterances. Our findings unravel the detailed cortical layout and computational properties of the neural populations at the basis of human speech recognition and speaker identification.
Calawerts, William M; Lin, Liyu; Sprott, JC; Jiang, Jack J
2016-01-01
Objective/Hypothesis: The purpose of this paper is to introduce rate of divergence as an objective measure to differentiate between the four voice types based on the amount of disorder present in a signal. We hypothesized that rate of divergence would provide an objective measure that can quantify all four voice types. Study Design: 150 acoustic voice recordings were randomly selected and analyzed using traditional perturbation, nonlinear, and rate of divergence analysis methods. Methods: We developed a new parameter, rate of divergence, which uses a modified version of Wolf’s algorithm for calculating Lyapunov exponents of a system. The outcome of this calculation is not a Lyapunov exponent, but rather a description of the divergence of two nearby data points for the next three points in the time series, followed in three time delayed embedding dimensions. This measure was compared to currently existing perturbation and nonlinear dynamic methods of distinguishing between voice signals. Results: There was a direct relationship between voice type and rate of divergence. This calculation is especially effective at differentiating between type 3 and type 4 voices (p<0.001), and is equally effective at differentiating type 1, type 2, and type 3 signals as currently existing methods. Conclusion: The rate of divergence calculation introduced is an objective measure that can be used to distinguish between all four voice types based on amount of disorder present, leading to quicker and more accurate voice typing as well as an improved understanding of the nonlinear dynamics involved in phonation. PMID:26920858
Calawerts, William M; Lin, Liyu; Sprott, J C; Jiang, Jack J
2017-01-01
The purpose of this paper is to introduce the rate of divergence as an objective measure to differentiate between the four voice types based on the amount of disorder present in a signal. We hypothesized that rate of divergence would provide an objective measure that can quantify all four voice types. A total of 150 acoustic voice recordings were randomly selected and analyzed using traditional perturbation, nonlinear, and rate of divergence analysis methods. We developed a new parameter, rate of divergence, which uses a modified version of Wolf's algorithm for calculating Lyapunov exponents of a system. The outcome of this calculation is not a Lyapunov exponent, but rather a description of the divergence of two nearby data points for the next three points in the time series, followed in three time-delayed embedding dimensions. This measure was compared to currently existing perturbation and nonlinear dynamic methods of distinguishing between voice signals. There was a direct relationship between voice type and rate of divergence. This calculation is especially effective at differentiating between type 3 and type 4 voices (P < 0.001) and is equally effective at differentiating type 1, type 2, and type 3 signals as currently existing methods. The rate of divergence calculation introduced is an objective measure that can be used to distinguish between all four voice types based on the amount of disorder present, leading to quicker and more accurate voice typing as well as an improved understanding of the nonlinear dynamics involved in phonation. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
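The idea of following two nearby points through a three-dimensional time-delay embedding for three steps can be illustrated with a short sketch. The code below is not the authors' rate of divergence calculation and not Wolf's algorithm. It is a simplified nearest-neighbour illustration, and every name and parameter (`delay`, `min_separation`, the log-ratio averaging) is an assumption made for this example.

```python
import math


def delay_embed(series, dim=3, delay=1):
    """Time-delay embedding: point i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]


def mean_divergence(series, dim=3, delay=1, steps=3, min_separation=10):
    """Average per-step log growth of the distance between each embedded point
    and its nearest neighbour, measured `steps` samples later.

    Neighbours closer than `min_separation` samples in time are skipped so
    that a point is not paired with its own immediate past or future.
    """
    points = delay_embed(series, dim, delay)
    usable = len(points) - steps  # both points must still exist `steps` later
    rates = []
    for i in range(usable):
        best_j, best_d = None, float("inf")
        for j in range(usable):
            if abs(i - j) < min_separation:
                continue
            d = math.dist(points[i], points[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            continue
        later = math.dist(points[i + steps], points[best_j + steps])
        if later > 0.0:
            rates.append(math.log(later / best_d) / steps)
    return sum(rates) / len(rates) if rates else 0.0
```

On a regular signal such as a sampled sine wave, nearby points stay nearby and the average stays close to zero. On a chaotic series such as the logistic map with parameter 4, neighbouring trajectories separate quickly and the average is clearly positive. This is the qualitative behaviour that makes a divergence measure useful for grading the amount of disorder in a signal.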
Conversation on African Music.
ERIC Educational Resources Information Center
Saunders, Leslie R.
1985-01-01
A voice and music education teacher at the University of Lagos, Nigeria, talks about African music in this interview. Topics discussed include differences between African and Western music, African melody, rules for composing African music, the theory of counterpoint, and the popularity of classical composers in Nigeria. (RM)
Young Women and the Co-Construction of Leadership
ERIC Educational Resources Information Center
McNae, Rachel
2010-01-01
Purpose: Young women's leadership is an area frequently overlooked in educational leadership development. This paper aims to bring young women's voices into educational leadership conversations and illustrate an alternative approach to young women's leadership development. Design/methodology/approach: This qualitative action research study was…
Why the Student Experience Matters
ERIC Educational Resources Information Center
Withers, Melissa
2009-01-01
Many institutions seek to capture the student voice through market research, including surveys and focus groups. These are worthy exercises and institutions that use these methods deserve credit. Still, most conversations about education reform reflect the perspective and agenda of the institutions that provide educational services, rather than…
Sperry Univac speech communications technology
NASA Technical Reports Server (NTRS)
Medress, Mark F.
1977-01-01
Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.
Clarke, Michael; Bloch, Steven; Wilkinson, Ray
2013-03-01
Managing the exchange of speakers from one person to another effectively is a key issue for participants in everyday conversational interaction. Speakers use a range of resources to indicate, in advance, when their turn will come to an end, and listeners attend to such signals in order to know when they might legitimately speak. Using the principles and findings from conversation analysis, this paper examines features of speaker transfer in a conversation between a boy with cerebral palsy who has been provided with a voice-output communication aid (VOCA), and a peer without physical or communication difficulties. Specifically, the analysis focuses on turn exchange, where a VOCA-mediated contribution approaches completion, and the child without communication needs is due to speak next.
Content recognition for telephone monitoring
NASA Astrophysics Data System (ADS)
Wenndt, Stanley J.; Harris, David M.; Cupples, Edward J.
2001-02-01
This research began due to federal inmates abusing their telephone privileges by committing serious offenses such as murder, drug dealing, and fraud. On average, about 1000 calls are made per day at each federal prison with a peak of over 4000. Current monitoring capabilities are very labor-intensive and only allow for about 2-3% monitoring of inmate telephone conversations. One of the main deficiencies identified by prison officials is the need to flag phone conversations pertaining to criminal activity. This research looks at two unique voice-processing methods to detect phone conversations pertaining to criminal activity. These two methods are digit string detection and whisper detection.
Raising voices: How sixth graders construct authority and knowledge in argumentative essays
NASA Astrophysics Data System (ADS)
Monahan, Mary Elizabeth
This qualitative classroom-based study documents one teacher-researcher's response to the "voice" debate in composition studies and to the opposing views expressed by Elbow and Bartholomae. The author uses Bakhtin's principle of dialogism, Hymes's theory of communicative competence, as well as Ivanic's discussion of discoursally constructed identities to reconceptualize voice and to redesign writing instruction in her sixth grade classroom. This study shows how students, by redefining and then acting on that voice pedagogy in terms that made sense to them, shaped the author's understanding of what counts as "voiced" writing in non-narrative discourse. Based on a grounded-theory analysis of the twenty-six sixth graders' argumentative essays in science, the author explains voice, not as a property of writers or of texts, but as a process of "knowing together"---a collaborative, but not entirely congenial, exercise of establishing one's authority by talking with, against, and through other voices on the issue. As the results of this study show, the students' "I-Ness" or authorial presence within their texts, was born in a nexus of relationships with "rivals," "allies" and "readers." Given their teacher's injunctions to project confidence and authority in argumentative writing, the students assumed fairly adversarial stances toward these conversational partners throughout their essays. Exaggerating the terms for voiced writing built into the curriculum, the sixth graders produced essays that read more like caricatures than examples of argumentation. Their displays of rhetorical bravado and intellectual aggressiveness, however off-putting to the reader, still enabled these sixth graders to compose voiced essays. This study raises doubts about the value of urging students to sound like their "true selves" or to adopt the formal registers of academe. Students, it seems clear, stand to gain by experimenting with a range of textual identities. 
The author suggests that voice, as a dialogic process, involves a struggle for meaning---in concert, but also very much in conflict with---other speakers and their intentions.
Johnsson, Anette; Boman, Åse; Wagman, Petra; Pennbrant, Sandra
2018-04-01
To describe how nurses communicate with older patients and their relatives in a department of medicine for older people in western Sweden. Communication is an essential tool for nurses when working with older patients and their relatives, but often patients and relatives experience shortcomings in the communication exchanges. They may not receive information or may not be treated in a professional way. Good communication can facilitate the development of a positive meeting and improve the patient's health outcome. An ethnographic design informed by the sociocultural perspective was applied. Forty participatory observations were conducted and analysed during the period October 2015-September 2016. The observations covered 135 hours of nurse-patient-relative interaction. Field notes were taken, and 40 informal field conversations with nurses and 40 with patients and relatives were carried out. Semistructured follow-up interviews were conducted with five nurses. The results showed that nurses communicate with four different voices: a medical voice described as being incomplete, task-oriented and with a disease perspective; a nursing voice described as being confirmatory, process-oriented and with a holistic perspective; a pedagogical voice described as being contextualised, comprehension-oriented and with a learning perspective; and a power voice described as being distancing and excluding. The voices can be seen as context-dependent communication approaches. When nurses switch between the voices, this indicates a shift in the orientation or situation. The results indicate that if nurses successfully combine the voices, while limiting the use of the power voice, the communication exchanges can become a more positive experience for all parties involved and a good nurse-patient-relative communication exchange can be achieved.
Working for improved communication between nurses, patients and relatives is crucial for establishing a positive nurse-patient-relative relationship, which is a basis for improving patient care and healthcare outcomes. © 2018 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Beasley Von Burg, Alessandra
2010-01-01
In the upper-level communication seminar that the author teaches--"Practices of Citizenship"--students learn and reflect on multiple theories and practices of citizenship as they develop their own voices in civil, academic, and intellectual conversations. As Aristotle argues, citizenship is a practice, a habit that must be learned. Aristotle's…
Wong, Raymond
2013-01-01
Voice biometrics rely on a physiological characteristic, the voice, that differs for each individual. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotional state, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features in training a classification model, based on piecewise transformation treating an audio waveform as a time-series. Using SFX we can faithfully remodel statistical characteristics of the time-series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is utilized in selecting only the influential features to be used in classification model induction. We focus on the comparison of effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely, Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC. PMID:24288684
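The piecewise, time-series view of a waveform described in this abstract can be illustrated with a minimal sketch. This is not the authors' SFX method, whose details the abstract does not give; the function name `piecewise_features` and the choice of four per-segment statistics are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def piecewise_features(signal, n_segments):
    """Split a waveform into equal-length pieces and summarise each piece.

    Hypothetical illustration only: returns, for every piece, its mean,
    population standard deviation, minimum and maximum, concatenated into
    one flat feature vector. Trailing samples that do not fill a whole
    piece are ignored.
    """
    if n_segments <= 0 or len(signal) < n_segments:
        raise ValueError("need at least one sample per segment")
    size = len(signal) // n_segments
    features = []
    for i in range(n_segments):
        piece = signal[i * size:(i + 1) * size]
        features.extend([mean(piece), pstdev(piece), min(piece), max(piece)])
    return features


# Two segments of a toy four-sample "waveform".
vector = piecewise_features([1, 1, 3, 3], 2)
```

A feature vector of this kind could then be combined with spectral features and passed to a feature-selection step before a classifier is trained, as the abstract outlines.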
Assistant for Analyzing Tropical-Rain-Mapping Radar Data
NASA Technical Reports Server (NTRS)
James, Mark
2006-01-01
A document is defined that describes an approach for a Tropical Rain Mapping Radar Data System (TDS). TDS is composed of software and hardware elements incorporating a two-frequency spaceborne radar system for measuring tropical precipitation. The TDS would be used primarily in generating data products for scientific investigations. The most novel part of the TDS would be expert-system software to aid in the selection of algorithms for converting raw radar-return data into such primary observables as rain rate, path-integrated rain rate, and surface backscatter. The expert-system approach would address the issue that selection of algorithms for processing the data requires a significant amount of preprocessing, non-intuitive reasoning, and heuristic application, making it infeasible, in many cases, to select the proper algorithm in real time. In the TDS, tentative selections would be made to enable conversions in real time. The expert system would remove straightforwardly convertible data from further consideration, and would examine ambiguous data, performing analysis in depth to determine which algorithms to select. Conversions performed by these algorithms, presumed to be correct, would be compared with the corresponding real-time conversions. Incorrect real-time conversions would be updated using the correct conversions.
Apollo 12 voice transcript pertaining to the geology of the landing site
Bailey, N.G.; Ulrich, G.E.
1975-01-01
This document is an edited record of the conversations between the Apollo 12 astronauts and mission control pertaining to the geology of the landing site. It contains all discussions and observations documenting the lunar landscape, its geologic characteristics, the rocks and soils collected, and the lunar surface photographic record along with supplementary remarks essential to the continuity of events during the mission. This transcript is derived from audio tapes and the NASA Technical Air-to-Ground Voice Transcription and includes time of transcription, and photograph and sample numbers. The report also includes a glossary, landing site map, and sample table.
English Face-to-Face: The Non-Verbal Dimension of Conversation.
ERIC Educational Resources Information Center
Bachmann, James K.
Nonverbal communication is important in foreign language teaching and learning because of its variation in form, meaning and distribution from one culture to another and because of its extensive use in the communicative process. Cross-cultural misunderstandings result from incorrect interpretations of the tone of voice, body motions, facial…
ERIC Educational Resources Information Center
Calderonello, Alice; Shaller, Deborah
In an extended conversation two female writing instructors discuss the kind of discourse available in the academy, the way educators are trained to deploy its conventions, and the different ways that voices are authorized. They cite Harraway as an academic writer who bridges the various post-structuralist discourses without ever losing sight of…
Dialogue on Queering Arts Education across the Americas
ERIC Educational Resources Information Center
Sanders, James H., III.; Vaz, Tales Gubes
2014-01-01
This article constitutes a conversation between professionals of differing generations and nationalities: a North American tenured academic Baby Boomer born in 1951 and a vintage 1986 Millennial South American neophyte professor from Brazil. In this article, we merge our voices in pursuing a literature review and exploring pedagogical practices…
What Do Grandmothers Think about Self-Esteem? American and Taiwanese Folk Theories Revisited
ERIC Educational Resources Information Center
Cho, Grace E.; Sandel, Todd L.; Miller, Peggy J.; Wang, Su-hua
2005-01-01
The study investigates European American and Taiwanese grandmothers' folk theories of childrearing and self-esteem, building on an earlier comparison of mothers from the same families. Adopting methods that privilege local meanings, we bring grandmothers' voices into the conversation about childrearing, thereby contributing to a deeper…
The Relationship between Psychopathology and Speech and Language Disorders in Neurologic Patients.
ERIC Educational Resources Information Center
Sapir, Shimon; Aronson, Arnold E.
1990-01-01
This paper reviews findings that suggest a causal relationship between depression, anxiety, or conversion reaction and voice, speech, and language disorders in neurologic patients. The paper emphasizes the need to consider the psychosocial and psychopathological aspects of neurologic communicative disorders, the link between emotional and…
Personalizing Instruction: Student Voice and Choice. Connect: Making Learning Personal
ERIC Educational Resources Information Center
Sota, Melinda S.; Mahon, Karen
2016-01-01
This field report is the eighth in a series produced by the Center on Innovations in Learning's League of Innovators. The series describes, discusses, and analyzes policies and practices that enable personalization in education. This report introduces sessions from the "Conversations with Innovators" event held at Temple University, June…
Yu, Yang; Wang, Sihan; Tang, Jiafu; Kaku, Ikou; Sun, Wei
2016-01-01
Productivity can be greatly improved by converting the traditional assembly line to a seru system, especially in the business environment with short product life cycles, uncertain product types and fluctuating production volumes. Line-seru conversion includes two decision processes, i.e., seru formation and seru load. For simplicity, however, previous studies focus on the seru formation with a given scheduling rule in seru load. We select ten scheduling rules usually used in seru load to investigate the influence of different scheduling rules on the performance of line-seru conversion. Moreover, we clarify the complexities of line-seru conversion for ten different scheduling rules from the theoretical perspective. In addition, multi-objective decisions are often used in line-seru conversion. To obtain Pareto-optimal solutions of multi-objective line-seru conversion, we develop two improved exact algorithms based on reducing time complexity and space complexity respectively. Compared with the enumeration based on non-dominated sorting to solve multi-objective problems, the two improved exact algorithms save computation time greatly. Several numerical simulation experiments are performed to show the performance improvement brought by the two proposed exact algorithms.
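For readers unfamiliar with the Pareto-optimality notion used in this abstract, the following is a minimal sketch of the naive enumeration baseline: keep every candidate that no other candidate dominates. It is not either of the authors' improved exact algorithms. Both objectives are assumed to be minimised, and the objective pairs in the example are made-up numbers, not data from the paper.

```python
def dominates(a, b):
    """True if solution a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points by pairwise comparison, O(n^2)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective 1, objective 2) values for four candidate solutions.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is better on both objectives; the other three candidates trade one objective against the other and all remain on the front.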
Evaluation of Different Speech and Touch Interfaces to In-Vehicle Music Retrieval Systems
Garay-Vega, L.; Pradhan, A. K.; Weinberg, G.; Schmidt-Nielsen, B.; Harsham, B.; Shen, Y.; Divekar, G.; Romoser, M.; Knodler, M.; Fisher, D. L.
2010-01-01
In-vehicle music retrieval systems are becoming more and more popular. Previous studies have shown that they pose a real hazard to drivers when the interface is a tactile one which requires multiple entries and a combination of manual control and visual feedback. Voice interfaces exist as an alternative. Such interfaces can require either multiple or single conversational turns. In this study, each of 17 participants between the ages of 18 and 30 years old was asked to use three different music-retrieval systems (one with a multiple entry touch interface, the iPod™, one with a multiple turn voice interface, interface B, and one with a single turn voice interface, interface C) while driving through a virtual world. Measures of secondary task performance, eye behavior, vehicle control, and workload were recorded. When compared with the touch interface, the voice interfaces reduced the total time drivers spent with their eyes off the forward roadway, especially in prolonged glances, as well as both the total number of glances away from the forward roadway and the perceived workload. Furthermore, when compared with driving without a secondary task, both voice interfaces did not significantly impact hazard anticipation, the frequency of long glances away from the forward roadway, or vehicle control. The multiple turn voice interface (B) significantly increased both the time it took drivers to complete the task and the workload. The implications for interface design and safety are discussed. PMID:20380920
Hello Harlie: Enabling Speech Monitoring Through Chat-Bot Conversations.
Ireland, David; Atay, Christina; Liddle, Jacki; Bradford, Dana; Lee, Helen; Rushin, Olivia; Mullins, Thomas; Angus, Dan; Wiles, Janet; McBride, Simon; Vogel, Adam
2016-01-01
People with neurological conditions such as Parkinson's disease and dementia are known to have difficulties in language and communication. This paper presents initial testing of an artificial conversational agent, called Harlie. Harlie runs on a smartphone and is able to converse with the user on a variety of topics. A description of the application and a sample dialog are provided to illustrate the various roles chat-bots can play in the management of neurological conditions. Harlie can be used for measuring voice and communication outcomes during the daily life of the user, and for gaining information about challenges encountered. Moreover, it is anticipated that she may also have an educational and support role.
Impact of novel energy sources: OTEC, wind, geothermal, biomass
NASA Technical Reports Server (NTRS)
Roberts, A. S., Jr.
1978-01-01
Alternate energy conversion methods such as ocean thermal energy conversion (OTEC), wind power, geothermal wells and biomass conversion are being explored, and re-examined in some cases, for commercial viability. At a time when United States fossil fuel and uranium resources are found to be insufficient to supply national needs into the twenty-first century, it is essential to broaden the base of feasible energy conversion technologies. The motivations for development of these four alternative energy forms are established. Primary technical aspects of OTEC, wind, geothermal and biomass energy conversion systems are described along with a discussion of relative advantages and disadvantages of the concepts. Finally, the sentiment is voiced that each of the four systems should be developed to the prototype stage and employed in the region of the country and in the sector of economy which is complementary to the form of system output.
Gerrek, Monica L
2018-06-01
Much has been written about Dax Cowart's tragic burn injury, treatment, and recovery. While Dax's case is certainly important to conversations regarding decision making in burn care, his is not the only story there is. In this article, the case of Andrea Rubin, also a severe burn survivor, is introduced as another voice in this conversation. Her experience during treatment and recovery is very different from Dax's and should cause us to at least pause and reconsider how we think about treatment and decision making in burn care. © 2018 American Medical Association. All Rights Reserved.
Llamas: Large-area microphone arrays and sensing systems
NASA Astrophysics Data System (ADS)
Sanz-Robinson, Josue
Large-area electronics (LAE) provides a platform to build sensing systems, based on distributing large numbers of densely spaced sensors over a physically-expansive space. Due to their flexible, "wallpaper-like" form factor, these systems can be seamlessly deployed in everyday spaces. They go beyond just supplying sensor readings, but rather they aim to transform the wealth of data from these sensors into actionable inferences about our physical environment. This requires vertically integrated systems that span the entirety of the signal processing chain, including transducers and devices, circuits, and signal processing algorithms. To this end we develop hybrid LAE / CMOS systems, which exploit the complementary strengths of LAE, enabling spatially distributed sensors, and CMOS ICs, providing computational capacity for signal processing. To explore the development of hybrid sensing systems, based on vertical integration across the signal processing chain, we focus on two main drivers: (1) thin-film diodes, and (2) microphone arrays for blind source separation: 1) Thin-film diodes are a key building block for many applications, such as RFID tags or power transfer over non-contact inductive links, which require rectifiers for AC-to-DC conversion. We developed hybrid amorphous / nanocrystalline silicon diodes, which are fabricated at low temperatures (<200 °C) to be compatible with processing on plastic, and have high current densities (5 A/cm2 at 1 V) and high frequency operation (cutoff frequency of 110 MHz). 2) We designed a system for separating the voices of multiple simultaneous speakers, which can ultimately be fed to a voice-command recognition engine for controlling electronic systems. On a device level, we developed flexible PVDF microphones, which were used to create a large-area microphone array. On a circuit level we developed localized a-Si TFT amplifiers, and a custom CMOS IC, for system control, sensor readout and digitization. 
On a signal processing level we developed an algorithm for blind source separation in a real, reverberant room, based on beamforming and binary masking. It requires no knowledge about the location of the speakers or microphones. Instead, it uses cluster analysis techniques to determine the time delays for beamforming; thus, adapting to the unique acoustic environment of the room.
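The two signal-processing ingredients named in this abstract, beamforming and binary masking, can be sketched in their simplest textbook forms. This is not the thesis's algorithm: the sketch assumes the per-microphone delays are already known as whole numbers of samples, whereas the abstract says they are estimated by cluster analysis. The function names and toy signals are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Time-align each microphone channel by its delay (in whole samples,
    non-negative) and average the aligned samples."""
    if not channels or len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    if n <= 0:
        raise ValueError("delays exceed channel length")
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(n)
    ]


def binary_mask(target_mag, interferer_mag):
    """Per-bin mask: 1 where the target beam is stronger, else 0.
    Inputs are magnitudes for matching time-frequency bins."""
    return [1 if a > b else 0 for a, b in zip(target_mag, interferer_mag)]


# The same toy source reaches the second microphone two samples later.
mic_a = [1, 2, 3, 4, 0, 0]
mic_b = [0, 0, 1, 2, 3, 4]
aligned = delay_and_sum([mic_a, mic_b], [0, 2])
mask = binary_mask([3, 1, 2], [1, 2, 2])
```

Aligning the channels reinforces the source whose delays were used, and the mask then keeps only the bins where that beam's output exceeds a competing beam's.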
Integrated DoD Voice and Data Networks and Ground Packet Radio Technology
1976-08-01
as the traffic requirement level increases. Moreover, the satellite switch selection problem is only meaningful over a limited traffic range. When...5: CPU TIMES VS. NUMBER OF SWITCHES SATELLITE SWITCH SELECTION ALGORITHM Computer Used: PDP-10 *0'5" means 0 minutes and 5 seconds. 5.30...Saturation Algorithm for Topological Design of Packet-Switched Communications Networks," National Telecommunications Conference Proceedings, San
Mortimer, Duncan; Segal, Leonie
2008-01-01
Algorithms for converting descriptive measures of health status into quality-adjusted life year (QALY)--weights are now widely available, and their application in economic evaluation is increasingly commonplace. The objective of this study is to describe and compare existing conversion algorithms and to highlight issues bearing on the derivation and interpretation of the QALY-weights so obtained. Systematic review of algorithms for converting descriptive measures of health status into QALY-weights. The review identified a substantial body of literature comprising 46 derivation studies and 16 studies that provided evidence or commentary on the validity of conversion algorithms. Conversion algorithms were derived using 1 of 4 techniques: 1) transfer to utility regression, 2) response mapping, 3) effect size translation, and 4) "revaluing" outcome measures using preference-based scaling techniques. Although these techniques differ in their methodological/theoretical tradition, data requirements, and ease of derivation and application, the available evidence suggests that the sensitivity and validity of derived QALY-weights may be more dependent on the coverage and sensitivity of measures and the disease area/patient group under evaluation than on the technique used in derivation. Despite the recent proliferation of conversion algorithms, a number of questions bearing on the derivation and interpretation of derived QALY-weights remain unresolved. These unresolved issues suggest directions for future research in this area. In the meantime, analysts seeking guidance in selecting derived QALY-weights should consider the validity and feasibility of each conversion algorithm in the disease area and patient group under evaluation rather than restricting their choice to weights from a particular derivation technique.
Global Voices: Culture and Identity in the Teaching of English.
ERIC Educational Resources Information Center
Milner, Joseph O., Ed.; Pope, Carol A., Ed.
This book presents essays that reflect the dialogue and the spirit of conversation of the 1990 International Federation for the Teaching of English (IFTE) Conference in Auckland, New Zealand. The book begins with some of the impressions of the IFTE conference held by the classroom teachers, school administrators, writers, and scholars who attended…
Media Literacy Education: Harnessing the Technological Imaginary
ERIC Educational Resources Information Center
Fry, Katherine G.
2011-01-01
An important challenge for media literacy education in the next decade will be to cultivate a commanding voice in the cultural conversation about new and emerging communication media. To really have a stake in the social, economic and educational developments that emerge around new digital media in the U.S. and globally, media literacy educators…
Equity, Academic Rigour and a Sense of Entitlement: Voices from the "Chalkface"
ERIC Educational Resources Information Center
Aveling, Nado; Davey, Pip; Georgieff, Andre; Jackson-Barrett, Elizabeth; Kosniowska, Helen; Fernandes-Satar, Audrey
2012-01-01
When working with teacher education students one of our aims is to look at "race" and racism, and the implications that "being white" has for teachers' practice. Hence we develop conversations around who we are as gendered and racialised subjects who occupy specific socio-economic positions. Our students find this…
Walking the Talk: Towards a More Inclusive Field of Disability Studies
ERIC Educational Resources Information Center
Opini, Bathseba
2016-01-01
This paper is a conversation about growing an inclusive field of disability studies. The paper draws on data collected through an analysis of existing disability studies programmes in selected Canadian universities. The paper makes a case for including diverse perspectives, experiences, viewpoints, and voices in these programmes. In this work, I…
Beyond the Particular: Prosody and the Coordination of Actions
ERIC Educational Resources Information Center
Szczepek Reed, Beatrice
2012-01-01
The majority of research on prosody in conversation to date has focused on exploring the role of individual prosodic features, such as certain types of pitch accent, pitch register or voice quality, for the accomplishment of specified social actions. From this research the picture emerges that when it comes to the implementation of specific…
Courageous Conversations about Race, Class, and Gender: Voices and Lessons from the Field
ERIC Educational Resources Information Center
Mansfield, Katherine Cumings; Jean-Marie, Gaëtane
2015-01-01
The purpose of this paper is to present a qualitative secondary analysis of two empirical studies that focused on the leadership practices of female practitioners at the secondary level engaging in discourse and practices to disrupt educational inequities. The guiding research question is, "How do school leaders engage in courageous…
ERIC Educational Resources Information Center
Wetterlund, Kris
2012-01-01
In the last part of 2011, conversations swirled around the Internet and print about the assault on museum authority. The Marcus Institute for Digital Education in the Arts (MIDEA) summarized some of the discussion in their blog entry "The Participatory Museum and a New Authority." Other sites joined in the discussion, for example, the Museum Geek…
Diversifying Curriculum as the Practice of Repressive Tolerance
ERIC Educational Resources Information Center
Brookfield, Stephen
2007-01-01
Diversifying curriculum is often assumed to be an unequivocal good in higher education--a way of opening up an educational conversation to include the widest possible diversity of perspectives and intellectual traditions. This democratic attempt to be open and inclusive springs from a humanistic concern to have all student voices heard, all…
The Ellen DeGeneration: Nudging Bias in the Creative Arts Classroom
ERIC Educational Resources Information Center
Harris, Anne
2013-01-01
Research in the areas of lesbian, gay, bisexual, trans and queer (LGBTQ) issues in education has been growing steadily over the past 10 years with the help of Fine and Weis ("Silenced voices and extraordinary conversations: re-imagining schools," 2003), Rasmussen ("Becoming subjects: sexualities and secondary schooling," 2006), Tolman ("Dilemmas…
"Keeping up the Good Fight": The Said and Unsaid in "Flores v. Arizona"
ERIC Educational Resources Information Center
Thomas, Melinda Hollis; Aletheiani, Dinny Risri; Carlson, David Lee; Ewbank, Ann Dutton
2014-01-01
The authors' purpose in this article is to interrogate the mediated and political discourses that emerged alongside the "Flores v. Arizona" case. The authors endeavor to offer another voice, framework and approach that may help sustain a continuous, paramount conversation concerning the educational rights of English language learners and…
ERIC Educational Resources Information Center
Panicacci, Alessandra; Dewaele, Jean-Marc
2018-01-01
A majority of multilinguals report feeling different when switching languages [Dewaele, J.-M. (2016). "Why do So Many Bi- and Multilinguals Feel Different When Switching Languages?" "International Journal of Multilingualism" 13 (1): 92-105; Panicacci, A., and J.-M. Dewaele. (2017). "'A Voice from Elsewhere': Acculturation,…
New Resources to Take Action on Science Policy Issues
NASA Astrophysics Data System (ADS)
Bunge, Carissa
2014-10-01
Millions of people across the United States make their voices heard by asking political candidates and elected officials about the issues that matter to them. As a scientist, you can be a part of the conversation and show politicians that Earth and space sciences are critical fields in need of their support.
Storytelling and Academic Discourse: Including More Voices in the Conversation
ERIC Educational Resources Information Center
Mlynarczyk, Rebecca Williams
2014-01-01
In this article, Mlynarczyk traces her career-long exploration of the relationship between personal, narrative writing and so-called academic discourse. Believing that both are important for college students, particularly students placed in basic writing or ESL composition, she has come to believe that rather than viewing the two as separate modes…
ERIC Educational Resources Information Center
Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.
2012-01-01
In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…
Library Voices; Cassette Conversations in a Tape Exchange.
ERIC Educational Resources Information Center
Christine, Emma Ruth
A proposal is made for an exchange of cassette tapes from librarians, teachers, and students between Palo Alto, California, and Queensland, Australia. The objectives of the project are to help children and adults from both countries to form a closer understanding, and to stimulate and share thoughts and ideas. Suggested activities include singing,…
Popular Culture in Transglossic Language Practices of Young Adults
ERIC Educational Resources Information Center
Sultana, Shaila; Dovchin, Sender
2017-01-01
Based on virtual conversations drawn from two separate intensive ethnographic studies in Bangladesh and Mongolia, we show that popular cultural texts play a significant role in young adults' heteroglossic language practices. On the one hand, they borrow voices from cultural texts and cross the boundaries of language, i.e., codes, modes, and…
Narrative Voices of Early Adolescents: Influences of Learning Disability and Cultural Background
ERIC Educational Resources Information Center
Celinska, Dorota K.
2009-01-01
This study analyzed personal and fictional narratives of culturally/ethnically diverse students with and without learning disabilities. The participants were 82 fourth to seventh graders from urban and suburban schools located in a Midwest metropolitan area. Narratives were elicited in the context of naturalistic conversation and analyzed using…
Towards Artificial Speech Therapy: A Neural System for Impaired Speech Segmentation.
Iliya, Sunday; Neri, Ferrante
2016-09-01
This paper presents a neural system-based technique for segmenting short impaired speech utterances into silent, unvoiced, and voiced sections. Moreover, the proposed technique identifies those points of the (voiced) speech where the spectrum becomes steady. The resulting technique thus aims at detecting that limited section of the speech which contains the information about the potential impairment of the speech. This section is of interest to the speech therapist as it corresponds to the possibly incorrect movements of speech organs (lower lip and tongue with respect to the vocal tract). Two segmentation models to detect and identify the various sections of the disordered (impaired) speech signals have been developed and compared. The first makes use of a combination of four artificial neural networks. The second is based on a support vector machine (SVM). The SVM has been trained by means of an ad hoc nested algorithm whose outer layer is a metaheuristic while the inner layer is a convex optimization algorithm. Several metaheuristics have been tested and compared, leading to the conclusion that some variants of the compact differential evolution (CDE) algorithm appear to be well-suited to address this problem. Numerical results show that the SVM model with a radial basis function is capable of effective detection of the portion of speech that is of interest to a therapist. The best performance has been achieved when the system is trained by the nested algorithm whose outer layer is hybrid-population-based/CDE. A population-based approach displays the best performance for the isolation of silence/noise sections, and the detection of unvoiced sections. On the other hand, a compact approach appears to be clearly well-suited to detect the beginning of the steady state of the voiced signal. Both proposed segmentation models outperformed two modern segmentation techniques based on a Gaussian mixture model and deep learning.
NASA Astrophysics Data System (ADS)
Lynch, John T.
1987-02-01
The present technique for coping with fading and burst noise on HF channels used in digital voice communications transmits digital voice only during high S/N time intervals, and speeds up the speech when necessary to avoid conversation-hindering delays. On the basis of informal listening tests, four test conditions were selected in order to characterize those conditions of speech interruption which would render it comprehensible or incomprehensible. One of the test conditions, 2 secs on and 1/2-sec off, yielded test scores comparable to the reference continuous speech case and is a reasonable match to the temporal variations of a disturbed ionosphere.
Latinus, Marianne; Belin, Pascal
2011-02-22
We are all voice experts. First and foremost, we can produce and understand speech, and this makes us a unique species. But in addition to speech perception, we routinely extract from voices a wealth of socially-relevant information in what constitutes a more primitive, and probably more universal, non-linguistic mode of communication. Consider the following example: you are sitting in a plane, and you can hear a conversation in a foreign language in the row behind you. You do not see the speakers' faces, and you cannot understand the speech content because you do not know the language. Yet, an amazing amount of information is available to you. You can evaluate the physical characteristics of the different protagonists, including their gender, approximate age and size, and associate an identity to the different voices. You can form a good idea of the different speakers' mood and affective state, as well as more subtle cues such as the perceived attractiveness or dominance of the protagonists. In brief, you can form a fairly detailed picture of the type of social interaction unfolding, which a brief glance backwards can on occasion help refine - sometimes surprisingly so. What are the acoustical cues that carry these different types of vocal information? How does our brain process and analyse this information? Here we briefly review an emerging field and the main tools used in voice perception research. Copyright © 2011 Elsevier Ltd. All rights reserved.
An evaluation of voice stress analysis techniques in a simulated AWACS environment
NASA Astrophysics Data System (ADS)
Jones, William A., Jr.
1990-08-01
The purpose was to determine if voice analysis algorithms are an effective measure of stress resulting from high workload. Fundamental frequency, frequency jitter, and amplitude shimmer algorithms were employed to measure the effects of stress in crewmember communications data in simulated AWACS mission scenarios. Two independent workload measures were used to identify levels of stress: a predictor model developed by the simulation author based upon scenario generated stimulus events; and the duration of communication for each weapons director, representative of the individual's response to the induced stress. Between eight and eleven speech samples were analyzed for each of the sixteen Air Force officers who participated in the study. Results identified fundamental frequency and frequency jitter as statistically significant vocal indicators of stress, while amplitude shimmer showed no signs of any significant relationship with workload or stress. Consistent with previous research, the frequency algorithm was identified as the most reliable measure. However, the results did not reveal a sensitive discrimination measure between levels of stress, but rather, did distinguish between the presence or absence of stress. The results illustrate a significant relationship between fundamental frequency and the effects of stress and also a significant inverse relationship with jitter, though less dramatic.
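The three measures named in this abstract can be illustrated with a minimal sketch. It assumes the common "local" definitions (mean absolute cycle-to-cycle difference divided by the mean); the abstract does not say which variants the study used, and the function names and input values here are hypothetical.

```python
from statistics import mean


def _relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def mean_f0(periods_s):
    """Mean fundamental frequency in Hz from pitch periods in seconds."""
    return 1.0 / mean(periods_s)


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the pitch period (assumed 'local' definition)."""
    return _relative_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (assumed 'local' definition)."""
    return _relative_perturbation(peak_amplitudes)


# Hypothetical pitch periods (seconds) and peak amplitudes for four glottal cycles.
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [1.0, 0.9, 1.0, 0.9]
print(mean_f0(periods), local_jitter(periods), local_shimmer(amplitudes))
```

In practice the periods and amplitudes would come from a pitch-marking step, which is outside the scope of this sketch.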
Fowler, Linda P; Gorham-Rowan, Mary; Hapner, Edie R
2011-01-01
The purpose of this study was to determine if measurable changes in fundamental frequency (F(0)) and relative sound level (RSL) occurred in healthy speakers after transcutaneous electrical stimulation (TES) as applied via VitalStim (Chattanooga Group, Chattanooga, TN). A prospective, repeated-measures design. Ten healthy female and 10 healthy male speakers, 20-53 years of age, participated in the study. All participants were nonsmokers and reported negative history for voice disorders. Participants received 1 hour of TES while engaged in eating, drinking, and conversation to simulate a typical dysphagia therapy protocol. Voice recordings were obtained before and immediately after TES. The voice samples consisted of a sustained vowel task and reading of the Rainbow Passage. Measurements of F(0) and RSL were obtained using TF32 (Milenkovic, 2005, University of Wisconsin). The participants also reported any sensations 5 minutes and 24 hours after TES. Measurable changes in F(0) and RSL were found for both tasks but were variable in direction and magnitude. These changes were not statistically significant. Subjective comments ranged from reports of a vocal warm-up feeling to delayed onset muscle soreness. These findings demonstrate that application of TES produces measurable changes in F(0) and RSL. However, the direction and magnitude of these changes are highly variable. Further research is needed to determine factors that may affect the extent to which TES contributes to significant changes in voice. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Berman, Daniel S; Abidov, Aiden; Kang, Xingping; Hayes, Sean W; Friedman, John D; Sciammarella, Maria G; Cohen, Ishac; Gerlach, James; Waechter, Parker B; Germano, Guido; Hachamovitch, Rory
2004-01-01
Recently, a 17-segment model of the left ventricle has been recommended as an optimally weighted approach for interpreting myocardial perfusion single photon emission computed tomography (SPECT). Methods to convert databases from previous 20- to new 17-segment data and criteria for abnormality for the 17-segment scores are needed. Initially, for derivation of the conversion algorithm, 65 patients were studied (algorithm population) (pilot group, n = 28; validation group, n = 37). Three conversion algorithms were derived: algorithm 1, which used mid, distal, and apical scores; algorithm 2, which used distal and apical scores alone; and algorithm 3, which used maximal scores of the distal septal, lateral, and apical segments in the 20-segment model for 3 corresponding segments of the 17-segment model. The prognosis population comprised 16,020 consecutive patients (mean age, 65 +/- 12 years; 41% women) who had exercise or vasodilator stress technetium 99m sestamibi myocardial perfusion SPECT and were followed up for 2.1 +/- 0.8 years. In this population, 17-segment scores were derived from 20-segment scores by use of algorithm 2, which demonstrated the best agreement with expert 17-segment reading in the algorithm population. The prognostic value of the 20- and 17-segment scores was compared by converting the respective summed scores into percent myocardium abnormal. Conversion algorithm 2 was found to be highly concordant with expert visual analysis by the 17-segment model (r = 0.982; kappa = 0.866) in the algorithm population. In the prognosis population, 456 cardiac deaths occurred during follow-up. When the conversion algorithm was applied, extent and severity of perfusion defects were nearly identical by 20- and derived 17-segment scores. The receiver operating characteristic curve areas by 20- and 17-segment perfusion scores were identical for predicting cardiac death (both 0.77 +/- 0.02, P = not significant). 
The optimal prognostic cutoff value for either 20- or derived 17-segment models was confirmed to be 5% myocardium abnormal, corresponding to a summed stress score greater than 3. Of note, the 17-segment model demonstrated a trend toward fewer mildly abnormal scans and more normal and severely abnormal scans. An algorithm for conversion of 20-segment perfusion scores to 17-segment scores has been developed that is highly concordant with expert visual analysis by the 17-segment model and provides nearly identical prognostic information. This conversion model may provide a mechanism for comparison of studies analyzed by the 17-segment system with previous studies analyzed by the 20-segment approach.
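The step of converting summed scores into percent myocardium abnormal can be sketched as follows. This is an illustration only: it assumes each segment is scored from 0 to 4, which the abstract does not state, and the function name is hypothetical. It does not reproduce any of the three 20-to-17-segment conversion algorithms.

```python
def percent_myocardium_abnormal(summed_score, n_segments, max_segment_score=4):
    """Express a summed perfusion score as a percentage of the maximum possible score.

    Assumption: every segment is scored on a 0..max_segment_score scale.
    """
    if n_segments <= 0 or max_segment_score <= 0:
        raise ValueError("n_segments and max_segment_score must be positive")
    max_total = n_segments * max_segment_score
    return 100.0 * summed_score / max_total


# Under the assumed 0-4 scale, a summed stress score of 4 (the first value greater
# than 3) corresponds to 5% of the myocardium in the 20-segment model.
print(percent_myocardium_abnormal(4, 20))
print(percent_myocardium_abnormal(4, 17))
```

Normalizing by the maximum possible score is what lets 20-segment and 17-segment results be compared on one scale.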
Implications of metric conversion.
Laros, R K
1980-11-01
The international scientific community is rapidly achieving conversion to the metric system, and the Système International (SI system) has been chosen for use by health scientists. Because the United States remains 1 of only 4 countries not now using part or all of the SI system, there is now a systematic effort toward rapid conversion. Although most of the SI system is not controversial, several SI units are highly so. Examples include joules instead of calories, pascals instead of millimeters of mercury, and moles per liter instead of milligrams per 100 milliliters. Obstetrician-gynecologists need to be familiar with the SI units and to voice their feelings about the various controversial units. There are decisions still to be made, and the time for discussion and advice is now.
Type 3 Thyroplasty for a Patient with Female-to-Male Gender Identity Disorder.
Saito, Yu; Nakamura, Kazuhiro; Itani, Shigeto; Tsukahara, Kiyoaki
2018-01-01
In most patients with female-to-male gender identity disorder (FTM/GID), hormone therapy lowers the pitch of the voice, and in successful cases there is no need for phonosurgery. However, hormone therapy is not effective in some cases, and we perform type 3 thyroplasty (TP3) in these cases. In the present case, hormone therapy was started in 2008 but did not lower the speaking fundamental frequency (SFF). We therefore performed TP3 under local anesthesia. The SFF at the first visit was 146 Hz, and the postoperative SFF was 110 Hz. TP3 was thus performed under local anesthesia in a patient with FTM/GID in whom hormone therapy proved ineffective. With successful conversion to a lower-pitched voice, the patient could begin to live daily life as a male. QOL improved significantly with TP3. If hormone therapy proves ineffective, TP3 may be selected as an optional treatment; it appears to show few surgical complications and was, in this case, a very effective treatment.
Acoustic changes in voice after tonsillectomy.
Saida, H; Hirose, H
1996-01-01
The vocal tract from the glottis to the lips is considered to be a resonator, and the voice is changeable depending upon the shape of the vocal tract. In this report, we examined the change in pharyngeal size and acoustic features of the voice after tonsillectomy. Subjects were 20 patients. The distance between both anterior pillars (glossopalatine arches), and between both posterior pillars (pharyngopalatine arches) was measured weekly. For acoustic measurements, the five Japanese vowels and Japanese conversational sentences were recorded and analyzed. The distance between both anterior pillars became wider 2 weeks postoperatively, and tended to become narrower thereafter. The distance between both posterior pillars became wider even after 4 weeks postoperatively. No consistent changes in F0, F1 and F2 were found after surgery. Although there was a tendency for a decrease in F3, tonsillectomy did not appear to change the acoustical features of the Japanese vowels remarkably. It was assumed that the subject may adjust the shape of the vocal tract to produce consistent speech sounds after the surgery using auditory feedback.
Lin, Chi-Yueh; Wang, Hsiao-Chuan
2011-07-01
The voice onset time (VOT) of a stop consonant is the interval between its burst onset and voicing onset. Among a variety of research topics on VOT, one that has been studied for years is how VOTs are efficiently measured. Manual annotation is a feasible way, but it becomes a time-consuming task when the corpus size is large. This paper proposes an automatic VOT estimation method based on an onset detection algorithm. At first, a forced alignment is applied to identify the locations of stop consonants. Then a random forest based onset detector searches each stop segment for its burst and voicing onsets to estimate a VOT. The proposed onset detection can detect the onsets in an efficient and accurate manner with only a small amount of training data. The evaluation data extracted from the TIMIT corpus were 2344 words with a word-initial stop. The experimental results showed that 83.4% of the estimations deviate less than 10 ms from their manually labeled values, and 96.5% of the estimations deviate by less than 20 ms. Some factors that influence the proposed estimation method, such as place of articulation, voicing of a stop consonant, and quality of succeeding vowel, were also investigated. © 2011 Acoustical Society of America
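The VOT definition and the tolerance-based evaluation described here can be sketched in a few lines. The onset times and estimates below are made-up illustrative values, not data from the paper, and the function names are hypothetical; the onset detector itself is not shown.

```python
def voice_onset_time(burst_onset_ms, voicing_onset_ms):
    """VOT is the interval from the burst onset to the voicing onset."""
    return voicing_onset_ms - burst_onset_ms


def fraction_within(estimates_ms, references_ms, tolerance_ms):
    """Share of estimates that deviate by less than tolerance_ms from the reference."""
    if not estimates_ms or len(estimates_ms) != len(references_ms):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for est, ref in zip(estimates_ms, references_ms)
        if abs(est - ref) < tolerance_ms
    )
    return hits / len(estimates_ms)


# Hypothetical automatic estimates and manual labels, in milliseconds.
estimated = [50.0, 62.0, 20.0, 95.0]
manual = [55.0, 60.0, 45.0, 90.0]
print(voice_onset_time(120.0, 175.0))          # 55.0 ms
print(fraction_within(estimated, manual, 10))  # share within 10 ms
print(fraction_within(estimated, manual, 30))  # share within 30 ms
```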
Analysis of wolves and sheep. Final report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogden, J.; Papcun, G.; Zlokarnik, I.
1997-08-01
In evaluating speaker verification systems, asymmetries have been observed in the ease with which people are able to break into other people's voice locks. People who are good at breaking into voice locks are called wolves, and people whose locks are easy to break into are called sheep. (Goats are people that have a difficult time opening their own voice locks.) Analyses of speaker verification algorithms could be used to understand wolf/sheep asymmetries. Using the notion of a "speaker space", it is demonstrated that such asymmetries could arise even though the similarity of voice 1 to voice 2 is the same as the inverse similarity. This explains partially the wolf/sheep asymmetries, although there may be other factors. The speaker space can be computed from interspeaker similarity data using multidimensional scaling, and such speaker space can be used to give a good approximation of the interspeaker similarities. The derived speaker space can be used to predict which of the enrolled speakers are likely to be wolves and which are likely to be sheep. However, a speaker must first enroll in the speaker key system and then be compared to each of the other speakers; a good estimate of a person's speaker space position could be obtained using only a speech sample.
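One way to picture how a speaker space is derived from pairwise data is classical (Torgerson) multidimensional scaling. The sketch below is an assumption about the general technique, not the report's actual procedure: it takes a symmetric matrix of dissimilarities (interspeaker similarities would first have to be converted to dissimilarities, a step not shown), and the test points are hypothetical.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to get an inner-product matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Four hypothetical speakers placed at the corners of a 4 x 3 rectangle.
speakers = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
distances = pairwise_distances(speakers)
space = classical_mds(distances, n_dims=2)
# The recovered space reproduces the input distances up to rotation and reflection.
print(np.allclose(pairwise_distances(space), distances))
```

The recovered coordinates are only defined up to rotation, reflection, and translation, so comparisons should be made on distances rather than on raw coordinates.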
NASA Astrophysics Data System (ADS)
Luo, Yanting; Zhang, Yongjun; Gu, Wanyi
2009-11-01
In large dynamic networks it is extremely difficult to maintain accurate routing information on all network nodes. Existing studies have illustrated the impact of imprecise state information on the performance of dynamic routing and wavelength assignment (RWA) algorithms. An algorithm called Bypass Based Optical Routing (BBOR), proposed by Xavier Masip-Bruin et al., can reduce the effects of having inaccurate routing information in networks operating under the wavelength-continuity constraint. They then extended the BBOR mechanism (for convenience, called the EBBOR mechanism below) to networks with sparse and limited wavelength conversion. However, it considers the characteristics of wavelength conversion only in the step of computing the bypass-paths, so its performance may decline as the degree of wavelength translation increases (this concept is explained again in the introduction). We demonstrate the issue through theoretical analysis and introduce a novel algorithm which modifies both the lightpath selection and the bypass-paths computation in comparison to the EBBOR algorithm. Simulations show that the Modified EBBOR (MEBBOR) algorithm improves the blocking performance significantly in optical networks with conversion capability.
Crovato, César David Paredes; Schuck, Adalberto
2007-10-01
This paper presents a dysphonic voice classification system using the wavelet packet transform and the best basis algorithm (BBA) as a dimensionality reducer and six artificial neural networks (ANN) acting as specialist systems. Each ANN was a three-layer multilayer perceptron with 64 input nodes and one output node; the number of neurons in the intermediate layer depended on the pathology group on which it was trained. The dysphonic voice database was separated into five pathology groups and one healthy control group. Each ANN was trained and associated with one of the six groups and was fed the entropy values of the best base tree (BBT) nodes, using the multiple cross validation (MCV) method and the leave-one-out (LOO) variation technique. The success rates obtained were 87.5%, 95.31%, 87.5%, 100%, 96.87% and 89.06% for groups 1 to 6, respectively.
Low cost Ku-band earth terminals for voice/data/facsimile
NASA Technical Reports Server (NTRS)
Kelley, R. L.
1977-01-01
A Ku-band satellite earth terminal capable of providing two-way voice/facsimile teleconferencing, 128 Kbps data, telephone, and high-speed imagery services is proposed. Optimized terminal cost and configuration are presented as a function of FDMA and TDMA approaches to multiple access. The entire terminal from the antenna to microphones, speakers and facsimile equipment is considered. Component cost versus performance has been projected as a function of size of the procurement and predicted hardware innovations and production techniques through 1985. The lowest cost combinations of components have been determined by a computer optimization algorithm. The system requirements including terminal EIRP and G/T, satellite size, power per spacecraft transponder, satellite antenna characteristics, and link propagation outage were selected using a computerized system cost/performance optimization algorithm. System cost and terminal cost and performance requirements are presented as a function of the size of a nationwide U.S. network. Service costs are compared with typical conference travel costs to show the viability of the proposed terminal.
Making Their Voices Heard: A Conversation with Two Child Care Providers Serving the Legislature
ERIC Educational Resources Information Center
Karolak, Eric
2009-01-01
This article presents an interview with two child care providers who are also legislators, Representative Shannon Erickson and Representative Mary Jane Wallner. Shannon Erickson is a Republican member of the South Carolina House of Representatives, representing District 124 in Beaufort County. While coming to office in a special election in…
Lift Every Voice and Sing: Democratic Dialogue in a Teacher Education Classroom.
ERIC Educational Resources Information Center
Hufford, Don
This paper describes a model that builds on the assumption that educators teaching foundations of education courses have a unique opportunity to model the democratic process and a moral responsibility to infuse the art of human conversation and self-transcendence into education. Exposure to such classes may encourage preservice teachers to go…
How Our Own Speech Rate Influences Our Perception of Others
ERIC Educational Resources Information Center
Bosker, Hans Rutger
2017-01-01
In conversation, our own speech and that of others follow each other in rapid succession. Effects of the surrounding context on speech perception are well documented but, despite the ubiquity of the sound of our own voice, it is unknown whether our own speech also influences our perception of other talkers. This study investigated context effects…
ERIC Educational Resources Information Center
Dias, Paula Ribeiro
2017-01-01
Low-income students at selective institutions report feeling a sense of isolation, alienation, and marginalization. However, it is essential that the voices of low-income students that have successfully navigated the college experience be part of the conversation. Rather than approach the study from a deficit perspective, this Interpretative…
"Why Will You Say That I Am Mad?" Using Poe's "Tell-Tale Heart" as a Composition Model.
ERIC Educational Resources Information Center
Bates, Laura Raidonis
1998-01-01
Describes an exercise for basic writers which encompasses reading, listening, and writing. Finds that Edgar Allan Poe's "Tell-Tale Heart" has an effective vocabulary, a first-person conversational tone for the "mad" voice, and a second-person direct address that makes it easy to follow. Notes that inexperienced readers can…
ERIC Educational Resources Information Center
Rutkowski, David; Rutkowski, Leslie; Langfeldt, Gjert
2012-01-01
This paper aims to better understand economists' increasingly influential voice to the conversation of schooling and education. It draws on curriculum theory to develop a framework for analysis of current economic research in education. The framework consists of the following tri-partition: the political, the practical, and the programmatical.…
ERIC Educational Resources Information Center
Harris, Kathleen I.
2015-01-01
Although developmentally appropriate practice (DAP) has strong merits, there are considerations pertaining to its development and implementation which must be raised. In order for educators to include diverse voices of young children, the time has come for a new conversation to unfold introducing developmentally universal practice (DUP). With this…
Voices of Women in the Field--Creating Conversations: A Networking Approach for Women Leaders
ERIC Educational Resources Information Center
Raskin, Candace F.; Haar, Jean M.; Robicheau, Jerry
2010-01-01
Professional networking is critical for school leaders. Networking has emerged in the literature as one of the major needs in attracting and retaining quality school leaders. There is evidence that professional networking offers a system for women to enhance their career opportunities. However, the evidence shows there are limited professional…
The Obama Era: A Post-Racial Society?
ERIC Educational Resources Information Center
Lum, Lydia
2009-01-01
With Barack Obama ensconced as the nation's first Black president, plenty of voices in the national conversation are trumpeting America as a post-racial society--that race matters much less than it used to, that the boundaries of race have been overcome, that racism is no longer a big problem. In this article, longtime scholars whose life's work…
Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.
2015-01-01
In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency, and voice emerge more saliently in conversation than in repetition, reading, or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have revealed that formulaic language is more impaired than novel language. This descriptive study extends these observations to a case of severely dysfluent dysarthria due to a parkinsonian syndrome. Dysfluencies were quantified and compared for conversation, two forms of repetition, reading, recited speech, and singing. Other measures examined phonetic inventories, word forms, and formulaic language. Phonetic, syllabic, and lexical dysfluencies were more abundant in conversation than in other task conditions. Formulaic expressions in conversation were reduced compared to normal speakers. A proposed explanation supports the notion that the basal ganglia contribute to formulation of internal models for execution of speech. PMID:22774929
DOT National Transportation Integrated Search
1976-09-01
Software used for the reduction and analysis of the multipath prober, modem evaluation (voice, digital data, and ranging), and antenna evaluation data acquired during the ATS-6 field test program is described. Multipath algorithms include reformattin...
Tomicic, Alemka; Martínez, Claudio; Pérez, J Carola; Hollenstein, Tom; Angulo, Salvador; Gerstmann, Adam; Barroux, Isabelle; Krause, Mariane
2015-01-01
This study seeks to provide evidence of the dynamics associated with the configurations of discourse-voice regulatory strategies in patient-therapist interactions in relevant episodes within psychotherapeutic sessions. Its central assumption is that discourses manifest themselves differently in terms of their prosodic characteristics according to their regulatory functions in a system of interactions. The association between discourse and vocal quality in patients and therapists was analyzed in a sample of 153 relevant episodes taken from 164 sessions of five psychotherapies using the state space grid (SSG) method, a graphical tool based on the dynamic systems theory (DST). The results showed eight recurrent and stable discourse-voice regulatory strategies of the patients and three of the therapists. Also, four specific groups of these discourse-voice strategies were identified. The latter were interpreted as regulatory configurations, that is to say, as emergent self-organized groups of discourse-voice regulatory strategies constituting specific interactional systems. Both regulatory strategies and their configurations differed between two types of relevant episodes: Change Episodes and Rupture Episodes. As a whole, these results support the assumption that speaking and listening, as dimensions of the interaction that takes place during therapeutic conversation, occur at different levels. The study not only shows that these dimensions are dependent on each other, but also that they function as a complex and dynamic whole in therapeutic dialog, generating relational offers which allow the patient and the therapist to regulate each other and shape the psychotherapeutic process that characterizes each type of relevant episode.
Experiences of hearing voices: analysis of a novel phenomenological survey
Woods, Angela; Jones, Nev; Alderson-Day, Ben; Callard, Felicity; Fernyhough, Charles
2015-01-01
Summary Background Auditory hallucinations—or voices—are a common feature of many psychiatric disorders and are also experienced by individuals with no psychiatric history. Understanding of the variation in subjective experiences of hallucination is central to psychiatry, yet systematic empirical research on the phenomenology of auditory hallucinations remains scarce. We aimed to record a detailed and diverse collection of experiences, in the words of the people who hear voices themselves. Methods We made a 13 item questionnaire available online for 3 months. To elicit phenomenologically rich data, we designed a combination of open-ended and closed-ended questions, which drew on service-user perspectives and approaches from phenomenological psychiatry, psychology, and medical humanities. We invited people aged 16–84 years with experience of voice-hearing to take part via an advertisement circulated through clinical networks, hearing voices groups, and other mental health forums. We combined qualitative and quantitative methods, and used inductive thematic analysis to code the data and χ2 tests to test additional associations of selected codes. Findings Between Sept 9 and Nov 29, 2013, 153 participants completed the study. Most participants described hearing multiple voices (124 [81%] of 153 individuals) with characterful qualities (106 [69%] individuals). Less than half of the participants reported hearing literally auditory voices—70 (46%) individuals reported either thought-like or mixed experiences. 101 (66%) participants reported bodily sensations while they heard voices, and these sensations were significantly associated with experiences of abusive or violent voices (p=0·024). Although fear, anxiety, depression, and stress were often associated with voices, 48 (31%) participants reported positive emotions and 49 (32%) reported neutral emotions. 
Our statistical analysis showed that mixed voices were more likely to have changed over time (p=0·030), be internally located (p=0·010), and be conversational in nature (p=0·010). Interpretation This study is, to our knowledge, the largest mixed-methods investigation of auditory hallucination phenomenology so far. Our survey was completed by a diverse sample of people who hear voices with various diagnoses and clinical histories. Our findings both overlap with past large-sample investigations of auditory hallucination and suggest potentially important new findings about the association between acoustic perception and thought, somatic and multisensorial features of auditory hallucinations, and the link between auditory hallucinations and characterological entities. Funding Wellcome Trust. PMID:26360085
Huang, Chengqiang; Yang, Youchang; Wu, Bo; Yu, Weize
2018-06-01
The sub-pixel arrangement of the RGBG panel and the image with RGB format are different and the algorithm that converts RGB to RGBG is urgently needed to display an image with RGB arrangement on the RGBG panel. However, the information loss is still large although color fringing artifacts are weakened in the published papers that study this conversion. In this paper, an RGB-to-RGBG conversion algorithm with adaptive weighting factors based on edge detection and minimal square error (EDMSE) is proposed. The main points of innovation include the following: (1) the edge detection is first proposed to distinguish image details with serious color fringing artifacts and image details which are prone to be lost in the process of RGB-RGBG conversion; (2) for image details with serious color fringing artifacts, the weighting factor 0.5 is applied to weaken the color fringing artifacts; and (3) for image details that are prone to be lost in the process of RGB-RGBG conversion, a special mechanism to minimize square error is proposed. The experiment shows that the color fringing artifacts are slightly improved by EDMSE, and the values of MSE of the image processed are 19.6% and 7% smaller than those of the image processed by the direct assignment and weighting factor algorithm, respectively. The proposed algorithm is implemented on a field programmable gate array to enable the image display on the RGBG panel.
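The difference between direct assignment and a fixed weighting factor can be illustrated on a single row of pixels. This sketch rests on several assumptions not taken from the abstract: that even-indexed RGBG pixels carry red and green while odd-indexed pixels carry blue and green, and that the kept sub-pixel is blended with the dropped one from its pair partner. It does not implement the edge detection or the minimal-square-error step of EDMSE, and the function name is hypothetical.

```python
def rgb_to_rgbg_row(row, weight=0.5):
    """Convert one row of (r, g, b) pixels to (channel, chroma, g) RGBG pixels.

    Assumed layout: even pixels keep R and G, odd pixels keep B and G.
    weight=1.0 reduces to direct assignment; weight=0.5 averages each kept
    sub-pixel with the matching sub-pixel of its pair partner.
    """
    out = []
    for i, (r, g, b) in enumerate(row):
        partner = i + 1 if i % 2 == 0 else i - 1
        if partner >= len(row):  # unpaired last pixel: blend with itself
            partner = i
        partner_r, _, partner_b = row[partner]
        if i % 2 == 0:
            out.append(("R", weight * r + (1 - weight) * partner_r, g))
        else:
            out.append(("B", weight * b + (1 - weight) * partner_b, g))
    return out


# A hypothetical sharp red-to-blue transition across one pixel pair.
edge = [(255, 10, 0), (0, 20, 255)]
print(rgb_to_rgbg_row(edge, weight=1.0))  # direct assignment
print(rgb_to_rgbg_row(edge, weight=0.5))  # fixed weighting factor of 0.5
```

A fixed weight of 0.5 softens the transition at the cost of detail, which is the trade-off the adaptive weighting in the abstract is meant to address.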
McCarthy-Jones, Simon
2014-01-01
A comprehensive understanding of the phenomenology of auditory hallucinations (AHs) is essential for developing accurate models of their causes. Yet, only 1 detailed study of the phenomenology of AHs with a sample size of N ≥ 100 has been published. The potential for overreliance on these findings, coupled with a lack of phenomenological research into many aspects of AHs relevant to contemporary neurocognitive models and the proposed (but largely untested) existence of AH subtypes, necessitates further research in this area. We undertook the most comprehensive phenomenological study of AHs to date in a psychiatric population (N = 199; 81% people diagnosed with schizophrenia), using a structured interview schedule. Previous phenomenological findings were only partially replicated. New findings included that 39% of participants reported that their voices seemed in some way to be replays of memories of previous conversations they had experienced; 45% reported that the general theme or content of what the voices said was always the same; and 55% said new voices had the same content/theme as previous voices. Cluster analysis, by variable, suggested the existence of 4 AH subtypes. We propose that there are likely to be different neurocognitive processes underpinning these experiences, necessitating revised AH models. PMID:23267192
Monkeys and Humans Share a Common Computation for Face/Voice Integration
Chandrasekaran, Chandramouli; Lemus, Luis; Trubanova, Andrea; Gondan, Matthias; Ghazanfar, Asif A.
2011-01-01
Speech production involves the movement of the mouth and other regions of the face resulting in visual motion cues. These visual cues enhance intelligibility and detection of auditory speech. As such, face-to-face speech is fundamentally a multisensory phenomenon. If speech is fundamentally multisensory, it should be reflected in the evolution of vocal communication: similar behavioral effects should be observed in other primates. Old World monkeys share with humans vocal production biomechanics and communicate face-to-face with vocalizations. It is unknown, however, if they, too, combine faces and voices to enhance their perception of vocalizations. We show that they do: monkeys combine faces and voices in noisy environments to enhance their detection of vocalizations. Their behavior parallels that of humans performing an identical task. We explored what common computational mechanism(s) could explain the pattern of results we observed across species. Standard explanations or models such as the principle of inverse effectiveness and a “race” model failed to account for their behavior patterns. Conversely, a “superposition model”, positing the linear summation of activity patterns in response to visual and auditory components of vocalizations, served as a straightforward but powerful explanatory mechanism for the observed behaviors in both species. As such, it represents a putative homologous mechanism for integrating faces and voices across primates. PMID:21998576
NASA Astrophysics Data System (ADS)
Sherley, Patrick L.; Pujol, Alfonso, Jr.; Meadow, John S.
1990-07-01
To provide a means of rendering complex computer architectures, languages, and input/output modalities transparent to experienced and inexperienced users, research is being conducted to develop a voice driven/voice response computer graphics imaging system. The system will be used for reconstructing and displaying computed tomography and magnetic resonance imaging scan data. In conjunction with this study, an artificial intelligence (AI) control strategy was developed to interface the voice components and support software to the computer graphics functions implemented on the Sun Microsystems 4/280 color graphics workstation. Based on generated text and converted renditions of verbal utterances by the user, the AI control strategy determines the user's intent and develops and validates a plan. The program type and parameters within the plan are used as input to the graphics system for reconstructing and displaying medical image data corresponding to that perceived intent. If the plan is not valid, the control strategy queries the user for additional information. The control strategy operates in a conversation mode and vocally provides system status reports. A detailed examination of the various AI techniques is presented, with major emphasis being placed on their specific roles within the total control strategy structure.
ERIC Educational Resources Information Center
Coulter, Donna M.
2013-01-01
To examine disparities in education, the researcher utilized a naturalistic approach to uncover how youth think, talk, and feel about their response to schooling. Findings are based on in-depth conversations with 12 inner city African-American kids enrolled in Urban, USA middle and high schools, rarely heard from in the scholarly literature.…
ERIC Educational Resources Information Center
Boerner, Heather
2015-01-01
This article describes how out-of-work miners are engaged in the Empowering Education Initiative, a unique alliance between Country Music Television (CMT) and community colleges in Appalachia. The initiative, which includes a website and a series of country music concerts, is changing the conversation in the Appalachian region, giving hope to…
A Conceptual Overview of the Role of Beauty and Aesthetics in Science and Science Education
ERIC Educational Resources Information Center
Girod, Mark
2007-01-01
Conversations on the connection of art, beauty, and the aesthetic experience in science are gaining a voice in the science education community. This article provides a conceptual overview of the role of beauty and aesthetics in science and science education. It focuses on a discussion of four themes exploring beauty in scientific ideas and…
The Question of Residential Schools in Canada: Preserve, Demolish, or Repurpose?
ERIC Educational Resources Information Center
Boffa, Adriana
2017-01-01
With public debates surrounding the removal of historical monuments in Canada (e.g., statues of, or schools named after, John A. Macdonald) and the United States (e.g., Confederate monuments), at times the voices of those who are most directly affected by their presence can be either drowned out or left out of the conversation entirely. It seems…
Something Small, Something Simple: The Beginnings of Children's Conversations with the World
ERIC Educational Resources Information Center
Lewis, Richard
2012-01-01
This essay employs the images and voices of children to describe how their learning about the world is supported as they engage in experiences that invoke creativity and imagination. The author states his belief that this "imagining," this giving body and substance to the nature of "imagination" is one of the foundations of knowing, a means of…
ERIC Educational Resources Information Center
Jacques, Catherine; Behrstock-Sherratt, Ellen; Parker, Amber; Bassett, Katherine; Allen, Megan; Bosso, David; Olson, Derek
2017-01-01
For the last 4 years, 10 leading education organizations have collaborated on a study series that includes teacher voice in conversations and research about educator effectiveness. Initially conceptualized by teacher leaders from the National Network of State Teachers of the Year (NNSTOY) and with their continued input, the "From Good to…
ERIC Educational Resources Information Center
Haglund, Björn
2015-01-01
The focal point of this article is a discussion of pupils' opportunities to make their voices heard and influence the activity in a Swedish leisure-time centre. The study comprises six weeks of ethnographically inspired field work including data from participating observations and walk-and-talk conversations. Two voluntary activities, referred to…
Whispering Selves and Reflective Transformations in the Internal Dialogue of Teachers and Students
ERIC Educational Resources Information Center
Chohan, Sukhdeep Kaur
2010-01-01
It is beyond debate that the way one perceives oneself is influenced by the way one speaks to oneself. Becoming aware of the conversations that take place within the mind has the potential to assist one in recognizing whether the internal voice is self-limiting or self-encouraging. Making classrooms places where teachers and learners are inviting…
An Analysis of High Impact Scholarship and Publication Trends in Blended Learning
ERIC Educational Resources Information Center
Halverson, Lisa R.; Graham, Charles R.; Spring, Kristian J.; Drysdale, Jeffery S.
2012-01-01
Blended learning is a diverse and expanding area of design and inquiry that combines face-to-face and online modalities. As blended learning research matures, numerous voices enter the conversation. This study begins the search for the center of this emerging area of study by finding the most cited scholarship on blended learning. Using Harzing's…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Papantoni-Kazakos, P.; Paterakis, M.
1988-07-01
For many communication applications with time constraints (e.g., transmission of packetized voice messages), a critical performance measure is the percentage of messages transmitted within a given amount of time after their generation at the transmitting station. This report presents a random-access algorithm (RAA) suitable for time-constrained applications. Performance analysis demonstrates that significant message-delay improvement is attained at the expense of minimal traffic loss. Also considered is the case of noisy channels. The noise effect appears at erroneously observed channel feedback. Error sensitivity analysis shows that the proposed random-access algorithm is insensitive to feedback channel errors. Window Random-Access Algorithms (RAAs) are considered next. These algorithms constitute an important subclass of Multiple-Access Algorithms (MAAs); they are distributive, and they attain high throughput and low delays by controlling the number of simultaneously transmitting users.
Sex hormones and the female voice.
Abitbol, J; Abitbol, P; Abitbol, B
1999-09-01
In the following, the authors examine the relationship between hormonal climate and the female voice through discussion of hormonal biochemistry and physiology and informal reporting on a study of 197 women with either premenstrual or menopausal voice syndrome. These facts are placed in a larger historical and cultural context, which is inextricably bound to the understanding of the female voice. The female voice evolves from childhood to menopause, under the varied influences of estrogens, progesterone, and testosterone. These hormones are the dominant factor in determining voice changes throughout life. For example, a woman's voice always develops masculine characteristics after an injection of testosterone. Such a change is irreversible. Conversely, male castrati had feminine voices because they lacked the physiologic changes associated with testosterone. The vocal instrument is comprised of the vibratory body, the respiratory power source and the oropharyngeal resonating chambers. Voice is characterized by its intensity, frequency, and harmonics. The harmonics are hormonally dependent. This is illustrated by the changes that occur during male and female puberty: In the female, the impact of estrogens at puberty, in concert with progesterone, produces the characteristics of the female voice, with a fundamental frequency one third lower than that of a child. In the male, androgens released at puberty are responsible for the male vocal frequency, an octave lower than that of a child. Premenstrual vocal syndrome is characterized by vocal fatigue, decreased range, a loss of power and loss of certain harmonics. The syndrome usually starts some 4-5 days before menstruation in some 33% of women. Vocal professionals are particularly affected. Dynamic vocal exploration by televideoendoscopy shows congestion, microvarices, edema of the posterior third of the vocal folds and a loss of its vibratory amplitude. 
The authors studied 97 premenstrual women who were prescribed a treatment of multivitamins, venous tone stimulants (phlebotonics), and anti-edematous drugs. We obtained symptomatic improvement in 84 patients. The menopausal vocal syndrome is characterized by lowered vocal intensity, vocal fatigue, a decreased range with loss of the high tones and a loss of vocal quality. In a study of 100 menopausal women, 17 presented with a menopausal vocal syndrome. To rehabilitate their voices, and thus their professional lives, patients were prescribed hormone replacement therapy and multi-vitamins. All 97 women showed signs of vocal muscle atrophy, reduction in the thickness of the mucosa and reduced mobility in the cricoarytenoid joint. Multi-factorial therapy (hormone replacement therapy and multi-vitamins) has to be individually adjusted to each case depending on body type, vocal needs, and other factors.
The communicative functions of final rises in Finnish intonation.
Ogden, Richard; Routarinne, Sara
2005-01-01
This paper considers the communicative function of final rises in Finnish conversational talk between pairs of teenage girls. Final rises are fairly common, occurring approximately twice a minute, predominantly on declaratives and in narrative sequences. We briefly consider the interplay between voice quality (known to be a marker of transition relevance) and rising intonation in Finnish. We argue that in narrative sequences, rising terminals manage two main interactional tasks: they provide a place for a coparticipant to mark recipiency, and they project more talk by the current speaker. Using a methodology which combines phonetic observation with conversation analysis, we demonstrate participants' orientation to these functions.
Classroom sound can be used to classify teaching practices in college science courses.
Owens, Melinda T; Seidel, Shannon B; Wong, Mike; Bejines, Travis E; Lietz, Susanne; Perez, Joseph R; Sit, Shangheng; Subedar, Zahur-Saleh; Acker, Gigi N; Akana, Susan F; Balukjian, Brad; Benton, Hilary P; Blair, J R; Boaz, Segal M; Boyer, Katharyn E; Bram, Jason B; Burrus, Laura W; Byrd, Dana T; Caporale, Natalia; Carpenter, Edward J; Chan, Yee-Hung Mark; Chen, Lily; Chovnick, Amy; Chu, Diana S; Clarkson, Bryan K; Cooper, Sara E; Creech, Catherine; Crow, Karen D; de la Torre, José R; Denetclaw, Wilfred F; Duncan, Kathleen E; Edwards, Amy S; Erickson, Karen L; Fuse, Megumi; Gorga, Joseph J; Govindan, Brinda; Green, L Jeanette; Hankamp, Paul Z; Harris, Holly E; He, Zheng-Hui; Ingalls, Stephen; Ingmire, Peter D; Jacobs, J Rebecca; Kamakea, Mark; Kimpo, Rhea R; Knight, Jonathan D; Krause, Sara K; Krueger, Lori E; Light, Terrye L; Lund, Lance; Márquez-Magaña, Leticia M; McCarthy, Briana K; McPheron, Linda J; Miller-Sims, Vanessa C; Moffatt, Christopher A; Muick, Pamela C; Nagami, Paul H; Nusse, Gloria L; Okimura, Kristine M; Pasion, Sally G; Patterson, Robert; Pennings, Pleuni S; Riggs, Blake; Romeo, Joseph; Roy, Scott W; Russo-Tait, Tatiane; Schultheis, Lisa M; Sengupta, Lakshmikanta; Small, Rachel; Spicer, Greg S; Stillman, Jonathon H; Swei, Andrea; Wade, Jennifer M; Waters, Steven B; Weinstein, Steven L; Willsie, Julia K; Wright, Diana W; Harrison, Colin D; Kelley, Loretta A; Trujillo, Gloriana; Domingo, Carmen R; Schinske, Jeffrey N; Tanner, Kimberly D
2017-03-21
Active-learning pedagogies have been repeatedly demonstrated to produce superior learning gains with large effect sizes compared with lecture-based pedagogies. Shifting large numbers of college science, technology, engineering, and mathematics (STEM) faculty to include any active learning in their teaching may retain and more effectively educate far more students than having a few faculty completely transform their teaching, but the extent to which STEM faculty are changing their teaching methods is unclear. Here, we describe the development and application of the machine-learning-derived algorithm Decibel Analysis for Research in Teaching (DART), which can analyze thousands of hours of STEM course audio recordings quickly, with minimal costs, and without need for human observers. DART analyzes the volume and variance of classroom recordings to predict the quantity of time spent on single voice (e.g., lecture), multiple voice (e.g., pair discussion), and no voice (e.g., clicker question thinking) activities. Applying DART to 1,486 recordings of class sessions from 67 courses, a total of 1,720 h of audio, revealed varied patterns of lecture (single voice) and nonlecture activity (multiple and no voice) use. We also found that there was significantly more use of multiple and no voice strategies in courses for STEM majors compared with courses for non-STEM majors, indicating that DART can be used to compare teaching strategies in different types of courses. Therefore, DART has the potential to systematically inventory the presence of active learning with ∼90% accuracy across thousands of courses in diverse settings with minimal effort.
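The idea of labeling classroom audio from its volume and variance can be illustrated with a minimal sketch. This is not the DART algorithm itself, which the abstract describes as machine-learning-derived; the function names, the decibel thresholds, and the rule that quiet windows are "no voice", loud steady windows are "multiple voice", and strongly fluctuating windows are "single voice" are all assumptions made only for illustration.

```python
from statistics import mean, pstdev


def classify_window(levels_db, quiet_db=45.0, steady_sd=3.0):
    """Label one window of loudness samples (in dB).

    quiet_db and steady_sd are hypothetical thresholds, not values
    taken from the DART study.
    """
    avg = mean(levels_db)
    spread = pstdev(levels_db)
    if avg < quiet_db:
        return "no voice"        # quiet room, e.g. silent thinking
    if spread > steady_sd:
        return "single voice"    # loudness rises and falls with one talker
    return "multiple voice"      # steady loud background of many talkers


def time_fractions(windows):
    """Return the fraction of windows assigned to each label."""
    labels = [classify_window(w) for w in windows]
    return {label: labels.count(label) / len(labels) for label in set(labels)}
```

Calling `time_fractions` on a list of windows from one recording gives a rough share of class time per activity type; a real classifier would learn its decision boundaries from human-annotated recordings rather than use fixed thresholds.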
Classroom sound can be used to classify teaching practices in college science courses
Seidel, Shannon B.; Wong, Mike; Bejines, Travis E.; Lietz, Susanne; Perez, Joseph R.; Sit, Shangheng; Subedar, Zahur-Saleh; Acker, Gigi N.; Akana, Susan F.; Balukjian, Brad; Benton, Hilary P.; Blair, J. R.; Boaz, Segal M.; Boyer, Katharyn E.; Bram, Jason B.; Burrus, Laura W.; Byrd, Dana T.; Caporale, Natalia; Carpenter, Edward J.; Chan, Yee-Hung Mark; Chen, Lily; Chovnick, Amy; Chu, Diana S.; Clarkson, Bryan K.; Cooper, Sara E.; Creech, Catherine; Crow, Karen D.; de la Torre, José R.; Denetclaw, Wilfred F.; Duncan, Kathleen E.; Edwards, Amy S.; Erickson, Karen L.; Fuse, Megumi; Gorga, Joseph J.; Govindan, Brinda; Green, L. Jeanette; Hankamp, Paul Z.; Harris, Holly E.; He, Zheng-Hui; Ingalls, Stephen; Ingmire, Peter D.; Jacobs, J. Rebecca; Kamakea, Mark; Kimpo, Rhea R.; Knight, Jonathan D.; Krause, Sara K.; Krueger, Lori E.; Light, Terrye L.; Lund, Lance; Márquez-Magaña, Leticia M.; McCarthy, Briana K.; McPheron, Linda J.; Miller-Sims, Vanessa C.; Moffatt, Christopher A.; Muick, Pamela C.; Nagami, Paul H.; Nusse, Gloria L.; Okimura, Kristine M.; Pasion, Sally G.; Patterson, Robert; Riggs, Blake; Romeo, Joseph; Roy, Scott W.; Russo-Tait, Tatiane; Schultheis, Lisa M.; Sengupta, Lakshmikanta; Small, Rachel; Spicer, Greg S.; Stillman, Jonathon H.; Swei, Andrea; Wade, Jennifer M.; Waters, Steven B.; Weinstein, Steven L.; Willsie, Julia K.; Wright, Diana W.; Harrison, Colin D.; Kelley, Loretta A.; Trujillo, Gloriana; Domingo, Carmen R.; Schinske, Jeffrey N.; Tanner, Kimberly D.
2017-01-01
Active-learning pedagogies have been repeatedly demonstrated to produce superior learning gains with large effect sizes compared with lecture-based pedagogies. Shifting large numbers of college science, technology, engineering, and mathematics (STEM) faculty to include any active learning in their teaching may retain and more effectively educate far more students than having a few faculty completely transform their teaching, but the extent to which STEM faculty are changing their teaching methods is unclear. Here, we describe the development and application of the machine-learning–derived algorithm Decibel Analysis for Research in Teaching (DART), which can analyze thousands of hours of STEM course audio recordings quickly, with minimal costs, and without need for human observers. DART analyzes the volume and variance of classroom recordings to predict the quantity of time spent on single voice (e.g., lecture), multiple voice (e.g., pair discussion), and no voice (e.g., clicker question thinking) activities. Applying DART to 1,486 recordings of class sessions from 67 courses, a total of 1,720 h of audio, revealed varied patterns of lecture (single voice) and nonlecture activity (multiple and no voice) use. We also found that there was significantly more use of multiple and no voice strategies in courses for STEM majors compared with courses for non-STEM majors, indicating that DART can be used to compare teaching strategies in different types of courses. Therefore, DART has the potential to systematically inventory the presence of active learning with ∼90% accuracy across thousands of courses in diverse settings with minimal effort. PMID:28265087
Robust matching for voice recognition
NASA Astrophysics Data System (ADS)
Higgins, Alan; Bahler, L.; Porter, J.; Blais, P.
1994-10-01
This paper describes an automated method of comparing a voice sample of an unknown individual with samples from known speakers in order to establish or verify the individual's identity. The method is based on a statistical pattern matching approach that employs a simple training procedure, requires no human intervention (transcription, word or phonetic marking, etc.), and makes no assumptions regarding the expected form of the statistical distributions of the observations. The content of the speech material (vocabulary, grammar, etc.) is not assumed to be constrained in any way. An algorithm is described which incorporates frame pruning and channel equalization processes designed to achieve robust performance with reasonable computational resources. An experimental implementation demonstrating the feasibility of the concept is described.
The owners have a right to be heard: Patient voice in design and performance improvement.
MacLeod, Hugh
2015-05-01
The Canadian taxpayer is an owner of the healthcare system and the owners have a right to be heard. This article encourages leaders both formal and informal to create cultures that promote ASKing questions to test assumptions held, LISTENing to hear the patient voice, and TALKing with patients and families to create new conversations and narratives. Looking at the label, "healthcare system" what's your contribution to creating health, how will you dedicate yourself to caring about the healthcare consumer and care provider, and what will be your role in creating a new and improved system? An implied question at the foundation of the article is this: Is the difference between managing and leading a difference of empathy? © 2015 The Canadian College of Health Leaders.
ERIC Educational Resources Information Center
Hood, Carra L.
2017-01-01
Stockton University, a mid-sized state university in the mid-Atlantic region of the United States, initiated the first of two pilots for implementation of its institutional outcomes during the fall semester 2014. At the beginning of that semester, in an effort to gauge students' attitudes university-wide toward the value of the outcomes, the…
ERIC Educational Resources Information Center
Slowiak, Julie M.; Madden, Gregory J.; Mathews, Ramona
2006-01-01
Appointment coordinators at a mid-western medical clinic were to provide exceptional telephone customer service. This included using a standard greeting, speaking in an appropriate tone of voice during the conversation, and using a standard closing to end the call. An analysis suggested performance deficiencies resulted from weak antecedents, poor…
ERIC Educational Resources Information Center
Stewart, Thomas; Lucas-McLean, Juanita; Jensen, Laura I.; Fetzko, Christina; Ho, Bonnie; Segovia, Sylvia
2010-01-01
This report, designed as one component of the comprehensive evaluation of the Milwaukee school system being conducted by the School Choice Demonstration Project (SCDP), is based on focus group conversations with low-income families whose children attend Milwaukee public and private schools. The report seeks to elucidate the demand side of school…
Jenny's ABC's: AIDS, Blood, and Children. A Guide for Adults To Read with Elementary Age Children.
ERIC Educational Resources Information Center
Simpson, Christine
This guide, written in simple language appropriate for young children, uses a direct, conversational style to explain Acquired Immune Deficiency Syndrome (AIDS), how safely and comfortably to be with individuals who have AIDS, and how to avoid contracting the disease. The text is in the voice of an 11-year-old girl whose uncle died of AIDS. It…
Apollo 12 Voice Transcript Pertaining to the Geology of the Landing Site, Volume 2
NASA Technical Reports Server (NTRS)
Bailey, N. G.; Ulrich, G. E.
1975-01-01
An edited record of the conversations between the Apollo 12 astronauts and mission control pertaining to the geology of the landing site is presented. All discussions and observations documenting the lunar landscape, its geologic characteristics, the rocks and soils collected and the lunar surface photographic record are included along with supplementary remarks essential to the continuity of events during the mission.
ERIC Educational Resources Information Center
McDermott, Kevin; Richardson, Fiona
2005-01-01
One of the central challenges for a school is the creation of a public discourse which expresses the shared purpose of the school community, without losing the multiple and different voices within the teaching staff. In this article we report on the generative potential of educational conversation, when it is structured around questions which…
Dysphonia Detected by Pattern Recognition of Spectral Composition.
ERIC Educational Resources Information Center
Leinonen, Lea; And Others
1992-01-01
This study analyzed production of a long vowel sound within Finnish words by normal or dysphonic voices, using the Self-Organizing Map, the artificial neural network algorithm of T. Kohonen which produces two-dimensional representations of speech. The method was found to be both sensitive and specific in the detection of dysphonia. (Author/JDD)
Self, Voices and Embodiment: A Phenomenological Analysis
Rosen, C; Jones, N; Chase, KA; Grossman, LS; Gin, H; Sharma, RP
2016-01-01
Objective: The primary aim of this study was to examine first-person phenomenological descriptions of the relationship between the self and Auditory Verbal Hallucinations (AVHs). Complex AVHs are frequently described as entities with clear interpersonal characteristics. Strikingly, investigations of first-person (subjective) descriptions of the phenomenology of the relationship are virtually absent from the literature. Method: Twenty participants with psychosis and actively experiencing AVHs were recruited from the University of Illinois at Chicago. A mixed-methods design involving qualitative and quantitative components was utilized. Following a priority-sequence model of complementarity, quantitative analyses were used to test elements of emergent qualitative themes. Results: The qualitative analysis identified three foundational constructs in the relationship between self and voices: ‘understanding of origin,’ ‘distinct interpersonal identities,’ and ‘locus of control.’ Quantitative analyses further supported identified links of these constructs. Subjects who experienced their AVHs as having identities distinct from self and actively engaged with their AVHs experienced a greater sense of autonomy and control over AVHs. Discussion: Given the clinical importance of AVHs and emerging strategies targeting the relationship between the hearer and voices, our findings highlight the importance of these relational constructs in improvement and innovation of clinical interventions. Our analyses also underscore the value of detailed voice assessments, such as those provided by the Maastricht Interview, in the evaluation process. Subjects' narratives show that the relational phenomena between hearer and AVH(s) are dynamic, and can be influenced and changed through the hearers’ engagement, conversation, and negotiation with their voices. PMID:27099869
Tomicic, Alemka; Martínez, Claudio; Pérez, J. Carola; Hollenstein, Tom; Angulo, Salvador; Gerstmann, Adam; Barroux, Isabelle; Krause, Mariane
2015-01-01
This study seeks to provide evidence of the dynamics associated with the configurations of discourse-voice regulatory strategies in patient–therapist interactions in relevant episodes within psychotherapeutic sessions. Its central assumption is that discourses manifest themselves differently in terms of their prosodic characteristics according to their regulatory functions in a system of interactions. The association between discourse and vocal quality in patients and therapists was analyzed in a sample of 153 relevant episodes taken from 164 sessions of five psychotherapies using the state space grid (SSG) method, a graphical tool based on the dynamic systems theory (DST). The results showed eight recurrent and stable discourse-voice regulatory strategies of the patients and three of the therapists. Also, four specific groups of these discourse-voice strategies were identified. The latter were interpreted as regulatory configurations, that is to say, as emergent self-organized groups of discourse-voice regulatory strategies constituting specific interactional systems. Both regulatory strategies and their configurations differed between two types of relevant episodes: Change Episodes and Rupture Episodes. As a whole, these results support the assumption that speaking and listening, as dimensions of the interaction that takes place during therapeutic conversation, occur at different levels. The study not only shows that these dimensions are dependent on each other, but also that they function as a complex and dynamic whole in therapeutic dialog, generating relational offers which allow the patient and the therapist to regulate each other and shape the psychotherapeutic process that characterizes each type of relevant episode. PMID:25932014
When the Eyes No Longer Lead: Familiarity and Length Effects on Eye-Voice Span
Silva, Susana; Reis, Alexandra; Casaca, Luís; Petersson, Karl M.; Faísca, Luís
2016-01-01
During oral reading, the eyes tend to be ahead of the voice (eye-voice span, EVS). It has been hypothesized that the extent to which this happens depends on the automaticity of reading processes, namely on the speed of print-to-sound conversion. We tested whether EVS is affected by another automaticity component – immunity from interference. To that end, we manipulated word familiarity (high-frequency, low-frequency, and pseudowords, PW) and word length as proxies of immunity from interference, and we used linear mixed effects models to measure the effects of both variables on the time interval at which readers do parallel processing by gazing at word N + 1 while not having articulated word N yet (offset EVS). Parallel processing was enhanced by automaticity, as shown by familiarity × length interactions on offset EVS, and it was impeded by lack of automaticity, as shown by the transformation of offset EVS into voice-eye span (voice ahead of the offset of the eyes) in PWs. The relation between parallel processing and automaticity was strengthened by the fact that offset EVS predicted reading velocity. Our findings contribute to understanding how the offset EVS, an index that is obtained in oral reading, may tap into different components of automaticity that underlie reading ability, oral or silent. In addition, we compared the duration of the offset EVS with the average reference duration of stages in word production, and we saw that the offset EVS may accommodate more than the articulatory programming stage of word N. PMID:27853446
DataComm in Flight Deck Surface Trajectory-Based Operations
NASA Technical Reports Server (NTRS)
Bakowski, Deborah L.; Foyle, David C.; Hooey, Becky L.; Meyer, Glenn R.; Wolter, Cynthia A.
2012-01-01
The purpose of this pilot-in-the-loop aircraft taxi simulation was to evaluate a NextGen concept for surface trajectory-based operations (STBO) in which air traffic control (ATC) issued taxi clearances with a required time of arrival (RTA) by Data Communications (DataComm). Flight deck avionics, driven by an error-nulling algorithm, displayed the speed needed to meet the RTA. To ensure robustness of the algorithm, the ability of 10 two-pilot crews to meet the RTA was tested in nine experimental trials representing a range of realistic conditions including a taxi route change, an RTA change, a departure clearance change, and a crossing traffic hold scenario. In some trials, these DataComm taxi clearances or clearance modifications were accompanied by 'preview' information, in which the airport map display showed a preview of the proposed route changes, including the necessary speed to meet the RTA. Overall, the results of this study show that with the aid of the RTA speed algorithm, pilots were able to meet their RTAs with very little time error in all of the robustness-testing scenarios. Results indicated that when taxi clearance changes were issued by DataComm only, pilots required longer notification distances than with voice communication. However, when the DataComm was accompanied by graphical preview, the notification distance required by pilots was equivalent to that for voice.
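The abstract does not give the details of the error-nulling algorithm, but the basic quantity it displays, the speed needed to meet the RTA, can be sketched under simple assumptions. The function name, the units, and the speed cap `max_speed_mps` below are hypothetical and are not taken from the study; the sketch only divides the remaining taxi distance by the time left until the RTA.

```python
def required_speed(distance_remaining_m, rta_s, now_s, max_speed_mps=15.0):
    """Speed (m/s) that would null the arrival-time error at the RTA.

    All parameters are illustrative assumptions: distances in metres,
    times in seconds on a shared clock, and a hypothetical taxi speed cap.
    """
    if distance_remaining_m <= 0:
        return 0.0                      # already at the target point
    time_left = rta_s - now_s
    if time_left <= 0:
        return max_speed_mps            # RTA has passed: go as fast as allowed
    return min(distance_remaining_m / time_left, max_speed_mps)
```

Recomputing this value continuously as the aircraft moves is what makes such a scheme error-nulling: any early or late drift changes the ratio of distance to time and therefore the displayed speed.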
DataComm in Flight Deck Surface Trajectory-Based Operations. Chapter 20
NASA Technical Reports Server (NTRS)
Bakowski, Deborah L.; Foyle, David C.; Hooey, Becky L.; Meyer, Glenn R.; Wolter, Cynthia A.
2012-01-01
The purpose of this pilot-in-the-loop aircraft taxi simulation was to evaluate a NextGen concept for surface trajectory-based operations (STBO) in which air traffic control (ATC) issued taxi clearances with a required time of arrival (RTA) by Data Communications (DataComm). Flight deck avionics, driven by an error-nulling algorithm, displayed the speed needed to meet the RTA. To ensure robustness of the algorithm, the ability of 10 two-pilot crews to meet the RTA was tested in nine experimental trials representing a range of realistic conditions including a taxi route change, an RTA change, a departure clearance change, and a crossing traffic hold scenario. In some trials, these DataComm taxi clearances or clearance modifications were accompanied by preview information, in which the airport map display showed a preview of the proposed route changes, including the necessary speed to meet the RTA. Overall, the results of this study show that with the aid of the RTA speed algorithm, pilots were able to meet their RTAs with very little time error in all of the robustness-testing scenarios. Results indicated that when taxi clearance changes were issued by DataComm only, pilots required longer notification distances than with voice communication. However, when the DataComm was accompanied by graphical preview, the notification distance required by pilots was equivalent to that for voice.
Guidi, Andrea; Salvi, Sergio; Ottaviano, Manuel; Gentili, Claudio; Bertschy, Gilles; de Rossi, Danilo; Scilingo, Enzo Pasquale; Vanello, Nicola
2015-11-06
Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented.
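The abstract does not say which F0 estimator the application uses. As a hedged illustration of how F0 can be estimated from one voiced frame, the sketch below uses plain autocorrelation peak picking, a common textbook technique swapped in here for illustration; the function name and the search range `fmin`/`fmax` are assumptions.

```python
import math


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    fmin and fmax bound the pitch search and are hypothetical defaults.
    Returns None when no positive autocorrelation peak is found.
    """
    n = len(frame)
    avg = sum(frame) / n
    x = [s - avg for s in frame]                 # remove DC offset
    lag_min = int(sample_rate / fmax)            # shortest period searched
    lag_max = min(int(sample_rate / fmin), n - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (50 ms frame).
tone = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(400)]
f0 = estimate_f0(tone, 8000)
```

Averaging such per-frame estimates over a voiced segment would give a mean F0 of the kind the study reports as reliable; within-segment variability features are more sensitive to estimator errors such as octave jumps.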
Guidi, Andrea; Salvi, Sergio; Ottaviano, Manuel; Gentili, Claudio; Bertschy, Gilles; de Rossi, Danilo; Scilingo, Enzo Pasquale; Vanello, Nicola
2015-01-01
Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented. PMID:26561811
2002-09-01
Protocol LAN Local Area Network LDAP Lightweight Directory Access Protocol LLQ Low Latency Queuing MAC Media Access Control MarCorSysCom Marine...Description Protocol SIP Session Initiation Protocol SMTP Simple Mail Transfer Protocol SPAWAR Space and Naval Warfare Systems Center SS7 ...PSTN infrastructure previously required to carry the conversation. The cost of accessing the PSTN is thereby eliminated. In cases where Internet
Life, Liberty, the Pursuit of Happiness : Cyberhate, Cybercrime, and Cyberterrorism in Burma
2014-10-30
risks, however, to doing nothing. As Internet adoption becomes widespread in Burma, the Burmese government may assume a laissez-faire attitude...see an opportunity. In the past year, Cisco, Microsoft, Google, Facebook, PayPal and other companies have sent their corporate leadership to...for further cooperation. Additionally, with a non-accusatory voice in the conversation, we have the potential to influence Burmese leadership in a
ERIC Educational Resources Information Center
Friedman Narr, Rachel; Kemmery, Megan
2015-01-01
This study used a qualitative design to explore parent mentors' summaries of conversations with more than 1,000 individual families of deaf and hard-of-hearing (DHH) children receiving parent-to-parent support as part of an existing family support project. Approximately 35% of the families were Spanish speaking. Five parent mentors who have…
"We Are Never Invited": School Children Using Collage to Envision Care and Support in Rural Schools
ERIC Educational Resources Information Center
Khanare, Fumane P.; de Lange, Naydene
2017-01-01
The voices of school children who are orphaned and vulnerable are more often than not missing from conversations about their care and support at school. In a rural ecology this is even more so the case. This article draws on a study with school children in rural KwaZulu-Natal and explores their constructions of care and support in the age of HIV…
Speech-recognition interfaces for music information retrieval
NASA Astrophysics Data System (ADS)
Goto, Masataka
2005-09-01
This paper describes two hands-free music information retrieval (MIR) systems that enable a user to retrieve and play back a musical piece by saying its title or the artist's name. Although various interfaces for MIR have been proposed, speech-recognition interfaces suitable for retrieving musical pieces have not been studied. Our MIR-based jukebox systems employ two different speech-recognition interfaces for MIR, speech completion and speech spotter, which exploit intentionally controlled nonverbal speech information in original ways. The first is a music retrieval system with the speech-completion interface that is suitable for music stores and car-driving situations. When a user only remembers part of the name of a musical piece or an artist and utters only a remembered fragment, the system helps the user recall and enter the name by completing the fragment. The second is a background-music playback system with the speech-spotter interface that can enrich human-human conversation. When a user is talking to another person, the system allows the user to enter voice commands for music playback control by spotting a special voice-command utterance in face-to-face or telephone conversations. Experimental results from use of these systems have demonstrated the effectiveness of the speech-completion and speech-spotter interfaces. (Video clips: http://staff.aist.go.jp/m.goto/MIR/speech-if.html)
Use of speech-to-text technology for documentation by healthcare providers.
Ajami, Sima
2016-01-01
Medical records are a critical component of a patient's treatment. However, documentation of patient-related information is considered a secondary activity in the provision of healthcare services, often leading to incomplete medical records and patient data of low quality. Advances in information technology (IT) in the health system and registration of information in electronic health records (EHR) using speech-to-text conversion software have facilitated service delivery. This narrative review is a literature search with the help of libraries, books, conference proceedings, databases of Science Direct, PubMed, Proquest, Springer, SID (Scientific Information Database), and search engines such as Yahoo and Google. I used the following keywords and their combinations: speech recognition, automatic report documentation, voice to text software, healthcare, information, and voice recognition. Due to lack of knowledge of other languages, I searched all texts in English or Persian with no time limits. Of a total of 70, only 42 articles were selected. Speech-to-text conversion technology offers opportunities to improve the documentation process of medical records, reduce cost and time of recording information, enhance the quality of documentation, improve the quality of services provided to patients, and support healthcare providers in legal matters. Healthcare providers should recognize the impact of this technology on service delivery.
A Research Program in Computer Technology. Volume 1
1981-08-01
rigidity, sensor networks 10. command and control, digital voice communication, graphic input device for terminal, multimedia communications, portable...satellite channel in the internetwork environment; Distributed Sensor Networks - formulation of algorithms and communication protocols to support the...operation of geographically distributed sensors ; Personal Communicator - work intended to result in a demonstration-level portable terminal to test and
Google Voice: Connecting Your Telephone to the 21st Century
ERIC Educational Resources Information Center
Johnson, Benjamin E.
2010-01-01
The foundation of the mighty Google Empire rests upon an algorithm that connects people to information--things such as websites, maps, and restaurant reviews. Lately it seems that people are less interested in connecting with information than they are with connecting to one another, which begs the question, "Is Facebook the new Google?" Given this…
Uskul, Ayse K; Paulmann, Silke; Weick, Mario
2016-02-01
Listeners have to pay close attention to a speaker's tone of voice (prosody) during daily conversations. This is particularly important when trying to infer the emotional state of the speaker. Although a growing body of research has explored how emotions are processed from speech in general, little is known about how psychosocial factors such as social power can shape the perception of vocal emotional attributes. Thus, the present studies explored how social power affects emotional prosody recognition. In a correlational study (Study 1) and an experimental study (Study 2), we show that high power is associated with lower accuracy in emotional prosody recognition than low power. These results, for the first time, suggest that individuals experiencing high or low power perceive emotional tone of voice differently. (c) 2016 APA, all rights reserved.
Experiences on developing digital down conversion algorithms using Xilinx system generator
NASA Astrophysics Data System (ADS)
Xu, Chengfa; Yuan, Yuan; Zhao, Lizhi
2013-07-01
The Digital Down Conversion (DDC) algorithm is a classical signal processing method which is widely used in radar and communication systems. In this paper, the DDC function is implemented on an FPGA using the Xilinx System Generator tool. System Generator is an FPGA design tool provided by Xilinx Inc and MathWorks Inc. It makes it convenient for programmers to manipulate the design and debug the function, especially for complex algorithms. The process of developing the DDC function with System Generator shows that it is a very fast and efficient tool for FPGA design.
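For readers unfamiliar with DDC, the usual chain is: mix the input with a complex oscillator at the carrier frequency, low-pass filter, then decimate. The Python sketch below is a software illustration of that chain only. It is not the FPGA design from the paper, and the boxcar (block-average) filter and the name `digital_down_convert` are assumptions chosen for brevity.

```python
import cmath
import math

def digital_down_convert(samples, sample_rate, carrier_hz, decimation):
    """Mix to baseband, low-pass with a block average, and decimate.

    Hypothetical sketch: a real design would use a proper FIR/CIC filter.
    """
    # 1. Mix with a complex numerically controlled oscillator (NCO).
    mixed = [s * cmath.exp(-2j * math.pi * carrier_hz * n / sample_rate)
             for n, s in enumerate(samples)]
    # 2 and 3. Average each block of `decimation` samples (crude boxcar
    # low-pass filter) and keep one output sample per block.
    out = []
    for start in range(0, len(mixed) - decimation + 1, decimation):
        block = mixed[start:start + decimation]
        out.append(sum(block) / decimation)
    return out

# A real 100 Hz carrier sampled at 1 kHz, down-converted and decimated by 20
sample_rate = 1000.0
carrier_hz = 100.0
samples = [math.cos(2 * math.pi * carrier_hz * n / sample_rate)
           for n in range(200)]
baseband = digital_down_convert(samples, sample_rate, carrier_hz, decimation=20)
```

Mixing a real cosine produces a constant term of amplitude 0.5 plus an image at twice the carrier frequency. Here the block average removes the image, so each output sample is close to 0.5.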
2017-01-01
Cortex in and around the human posterior superior temporal sulcus (pSTS) is known to be critical for speech perception. The pSTS responds to both the visual modality (especially biological motion) and the auditory modality (especially human voices). Using fMRI in single subjects with no spatial smoothing, we show that visual and auditory selectivity are linked. Regions of the pSTS were identified that preferred visually presented moving mouths (presented in isolation or as part of a whole face) or moving eyes. Mouth-preferring regions responded strongly to voices and showed a significant preference for vocal compared with nonvocal sounds. In contrast, eye-preferring regions did not respond to either vocal or nonvocal sounds. The converse was also true: regions of the pSTS that showed a significant response to speech or preferred vocal to nonvocal sounds responded more strongly to visually presented mouths than eyes. These findings can be explained by environmental statistics. In natural environments, humans see visual mouth movements at the same time as they hear voices, while there is no auditory accompaniment to visual eye movements. The strength of a voxel's preference for visual mouth movements was strongly correlated with the magnitude of its auditory speech response and its preference for vocal sounds, suggesting that visual and auditory speech features are coded together in small populations of neurons within the pSTS. SIGNIFICANCE STATEMENT Humans interacting face to face make use of auditory cues from the talker's voice and visual cues from the talker's mouth to understand speech. The human posterior superior temporal sulcus (pSTS), a brain region known to be important for speech perception, is complex, with some regions responding to specific visual stimuli and others to specific auditory stimuli. 
Using BOLD fMRI, we show that the natural statistics of human speech, in which voices co-occur with mouth movements, are reflected in the neural architecture of the pSTS. Different pSTS regions prefer visually presented faces containing either a moving mouth or moving eyes, but only mouth-preferring regions respond strongly to voices. PMID:28179553
Zhu, Lin L; Beauchamp, Michael S
2017-03-08
Cortex in and around the human posterior superior temporal sulcus (pSTS) is known to be critical for speech perception. The pSTS responds to both the visual modality (especially biological motion) and the auditory modality (especially human voices). Using fMRI in single subjects with no spatial smoothing, we show that visual and auditory selectivity are linked. Regions of the pSTS were identified that preferred visually presented moving mouths (presented in isolation or as part of a whole face) or moving eyes. Mouth-preferring regions responded strongly to voices and showed a significant preference for vocal compared with nonvocal sounds. In contrast, eye-preferring regions did not respond to either vocal or nonvocal sounds. The converse was also true: regions of the pSTS that showed a significant response to speech or preferred vocal to nonvocal sounds responded more strongly to visually presented mouths than eyes. These findings can be explained by environmental statistics. In natural environments, humans see visual mouth movements at the same time as they hear voices, while there is no auditory accompaniment to visual eye movements. The strength of a voxel's preference for visual mouth movements was strongly correlated with the magnitude of its auditory speech response and its preference for vocal sounds, suggesting that visual and auditory speech features are coded together in small populations of neurons within the pSTS. SIGNIFICANCE STATEMENT Humans interacting face to face make use of auditory cues from the talker's voice and visual cues from the talker's mouth to understand speech. The human posterior superior temporal sulcus (pSTS), a brain region known to be important for speech perception, is complex, with some regions responding to specific visual stimuli and others to specific auditory stimuli. 
Using BOLD fMRI, we show that the natural statistics of human speech, in which voices co-occur with mouth movements, are reflected in the neural architecture of the pSTS. Different pSTS regions prefer visually presented faces containing either a moving mouth or moving eyes, but only mouth-preferring regions respond strongly to voices. Copyright © 2017 the authors 0270-6474/17/372697-12$15.00/0.
NASA Astrophysics Data System (ADS)
Miller, Avery
Consider a set of prisoners that want to gossip with one another, and suppose that these prisoners are located at fixed locations (e.g., in jail cells) along a corridor. Each prisoner has a way to broadcast messages (e.g. by voice or contraband radio) with transmission radius R and interference radius R' ≥ R. We study synchronous algorithms for this problem (that is, prisoners are allowed to speak at regulated intervals) including two restricted subclasses. We prove exact upper and lower bounds on the gossiping completion time for all three classes. We demonstrate that each restriction placed on the algorithm results in decreasing performance.
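The role of the two radii can be made concrete with a one-round reception check. The abstract does not spell out the collision rule, so the Python sketch below assumes a common model: a listener decodes a message only if exactly one transmitter lies within the interference radius R' and that transmitter is also within the transmission radius R. The function name `receivers` and the rule that a node cannot listen while transmitting are likewise assumptions.

```python
def receivers(positions, transmitters, R, R_interf):
    """Return {listener: sender} for listeners that decode a message this round.

    Assumed model: decode only when exactly one transmitter is within R_interf
    of the listener and that transmitter is also within R.
    """
    heard = {}
    for listener, x in enumerate(positions):
        if listener in transmitters:
            continue  # assumption: a node cannot listen while transmitting
        in_range = [t for t in transmitters if abs(positions[t] - x) <= R]
        interfering = [t for t in transmitters
                       if abs(positions[t] - x) <= R_interf]
        if len(in_range) == 1 and len(interfering) == 1:
            heard[listener] = in_range[0]
    return heard

# Seven cells along a corridor at unit spacing, R = 1, R' = 2
positions = [0, 1, 2, 3, 4, 5, 6]
result_a = receivers(positions, transmitters={0, 3}, R=1, R_interf=2)
result_b = receivers(positions, transmitters={0, 6}, R=1, R_interf=2)
```

With cells 0 and 3 speaking together, cells 1 and 2 hear only interference and just cell 4 decodes a message. Spacing the speakers further apart (cells 0 and 6) lets both neighbours, cells 1 and 5, receive. A synchronous schedule has to balance this trade-off between parallel speakers and interference.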
Sound source tracking device for telematic spatial sound field reproduction
NASA Astrophysics Data System (ADS)
Cardenas, Bruno
This research describes an algorithm that localizes sound sources for use in telematic applications. The localization algorithm is based on amplitude differences between various channels of a microphone array of directional shotgun microphones. The amplitude differences will be used to locate multiple performers and reproduce their voices, which were recorded at close distance with lavalier microphones, spatially corrected using a loudspeaker rendering system. In order to track multiple sound sources in parallel the information gained from the lavalier microphones will be utilized to estimate the signal-to-noise ratio between each performer and the concurrent performers.
NASA Astrophysics Data System (ADS)
O'Sullivan, James; Chen, Zhuo; Herrero, Jose; McKhann, Guy M.; Sheth, Sameer A.; Mehta, Ashesh D.; Mesgarani, Nima
2017-10-01
Objective. People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment with which to compare with the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. Approach. We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener’s neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker’s voice to assist the listener. Main results. Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. Significance. Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.
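Steps (3) and (4) of the system above can be sketched in a few lines. A common approach in the AAD literature is to correlate an envelope reconstructed from the neural signals with the envelope of each separated speaker and pick the best match. Whether the authors use exactly this rule is not stated in the abstract, so the Python below is a sketch under that assumption, with hypothetical function names and made-up toy envelopes.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def pick_attended(neural_envelope, speaker_envelopes):
    """Return (index of best-matching speaker, all correlation scores)."""
    scores = [pearson(neural_envelope, env) for env in speaker_envelopes]
    return max(range(len(scores)), key=scores.__getitem__), scores

def amplify(speakers, attended, gain=2.0):
    """Remix the separated speakers with the attended one boosted by `gain`."""
    return [sum((gain if k == attended else 1.0) * s[i]
                for k, s in enumerate(speakers))
            for i in range(len(speakers[0]))]

# Made-up toy envelopes: the "neural" envelope loosely follows speaker A
speaker_a = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
speaker_b = [0.5, 0.4, 0.6, 0.5, 0.4, 0.6]
neural = [0.2, 0.8, 0.1, 0.9, 0.2, 0.6]
attended, scores = pick_attended(neural, [speaker_a, speaker_b])
mixed = amplify([speaker_a, speaker_b], attended)
```

In a real system the envelopes would come from the speech separation stage and from a decoder trained on the neural recordings. Here they are hard-coded so that the selection logic can be read in isolation.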
Antiles, S; Couris, J; Schweitzer, A; Rosenthal, D; Da Silva, R Q
2000-01-01
Computerized voice recognition systems (VR) can reduce costs and enhance service. The capital outlay required for conversion to a VR system is significant; therefore, it is incumbent on radiology departments to provide cost and service justifications to administrators. Massachusetts General Hospital (MGH) in Boston implemented VR over a two-year period and achieved annual savings of $530,000 and a 50% decrease in report throughput. Those accomplishments required solid planning and implementation strategies, training and sustainment programs. This article walks through the process, step by step, in the hope of providing a tool set for future implementations. Because VR has dramatic implications for workflow, a solid operational plan is needed when assessing vendors and planning for implementation. The goals for implementation should be to minimize operational disruptions and capitalize on efficiencies of the technology. Senior leadership--the department chair or vice-chair--must select the goals to be accomplished and oversee, manage and direct the VR initiative. The importance of this point cannot be overstated, since implementation will require behavior changes from radiologists and others who may not perceive any personal benefits. Training is the pivotal factor affecting the success of voice recognition, and practice is the only way for radiologists to enhance their skills. Through practice, radiologists will discover shortcuts, and their speed and comfort will improve. Measurement and data analysis are critical to changing and improving the voice recognition application and are vital to decision-making. Some of the issues about which valuable data can be collected are technical and educational problems, VR penetration, report turnaround time and annual cost savings. Sustained effort is indispensable to the maintenance of voice recognition. 
Finally, all efforts made and gains achieved may prove to be futile without ongoing sustainment of the system through retraining, education and technical support.
Benefits of the fiber optic versus the electret microphone in voice amplification.
Kyriakou, Kyriaki; Fisher, Hélène R
2013-01-01
Voice disorders that result in reduced loudness may cause difficulty in communicating, socializing and participating in occupational activities. Amplification is often recommended in order to facilitate functional communication, reduce vocal load and avoid developing maladaptive compensatory behaviours. The most common microphone used with amplification systems is the electret microphone. One alternate form of microphone is the fiber optic microphone. To examine the benefits of the fiber optic (1190S) versus the electret (M04) microphone as measured by objective and subjective parameters in the amplification of a patient's voice with reduced loudness caused by neurological and/or respiratory-based problems. Eighteen patients with vocal fold paralysis, Parkinson's disease and/or chronic obstructive pulmonary disease (COPD) participated in the study. The study contained a measurement of intensity, amplitude perturbation and signal-to-noise ratio during a sustained vowel production and a measurement of intensity during conversation with the use of the two microphones simultaneously. It also included the completion of a questionnaire indicating the patient's satisfaction with each microphone. The fiber optic (1190S) microphone had better objective acoustic performance (i.e. lower amplitude perturbation, higher signal-to-noise ratio and higher intensity) than the electret (M04) microphone. It also had better patient subjective satisfaction (i.e. less conspicuousness, more voice clarity, less acoustic feedback, more loudness and more utilization) than the electret microphone. Patients with neurological and/or respiratory-based voice problems may more confidently and frequently use the fiber optic microphone to communicate, socialize and participate in occupational activities more easily. Speech-language pathologists may more confidently use or recommend the fiber optic microphone with amplification systems. © 2012 Royal College of Speech and Language Therapists.
Conversational Entrainment of Vocal Fry in Young Adult Female American English Speakers.
Borrie, Stephanie A; Delfino, Christine R
2017-07-01
Conversational entrainment, the natural tendency for people to modify their behaviors to more closely match their communication partner, is examined as one possible mechanism modulating the prevalence of vocal fry in the speech of young American women engaged in spoken dialogue. Twenty young adult female American English speakers engaged in two spoken dialogue tasks: one with a young adult female American English conversational partner who exhibited substantial vocal fry and one with a young adult female American English conversational partner who exhibited quantifiably less vocal fry. Dialogues were analyzed for proportion of vocal fry, by speaker, and two measures of communicative success (efficiency and enjoyment). Participants employed significantly more vocal fry when conversing with the partner who exhibited substantial vocal fry than when conversing with the partner who exhibited quantifiably less vocal fry. Further, greater similarity between communication partners in their use of vocal fry tracked with higher scores of communicative efficiency and communicative enjoyment. Conversational entrainment offers a mechanistic framework that may be used to explain, to some degree, the frequency with which vocal fry is employed by young American women engaged in spoken dialogue. Further, young American women who modulated their vocal patterns during dialogue to match those of their conversational partner gained more efficiency and enjoyment from their interactions, demonstrating the cognitive and social benefits of entrainment. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Muhammed, Louwai
2013-11-01
It has been suggested that undiagnosed epilepsy profoundly influenced the lives of several key figures in history. Historical sources recounting strange voices and visions may in fact have been describing manifestations of epileptic seizures rather than more supernatural phenomena. Well-documented accounts of such experiences exist for three individuals in particular: Socrates, St Paul and Joan of Arc. The great philosopher Socrates described a 'daimonion' that would visit him throughout his life. This daimonion may have represented recurrent simple partial seizures, while the peculiar periods of motionlessness for which Socrates was well known may have been the result of co-existing complex partial seizures. St Paul's religious conversion on the Road to Damascus may have followed a temporal lobe seizure which would account for the lights, voices, blindness and even the religious ecstasy he described. Finally, Joan of Arc gave a detailed narrative on the voices she heard from childhood during her Trial of Condemnation. Her auditory hallucinations appear to follow sudden acoustic stimuli in a way reminiscent of idiopathic partial epilepsy with auditory features. By analysing passages from historical texts, it is possible to argue that Socrates, St Paul and Joan of Arc each had epilepsy.
Voice based gender classification using machine learning
NASA Astrophysics Data System (ADS)
Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.
2017-11-01
Gender identification is one of the major problems in speech analysis today. It involves inferring gender from acoustic data such as pitch, median, and frequency. Machine learning gives promising results for classification problems across research domains, and several performance metrics exist for evaluating the algorithms in a given area. We present a comparative model for evaluating five machine learning algorithms on eight metrics for gender classification from acoustic data. The five algorithms are Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM). The main criterion in evaluating any algorithm is its performance: in classification problems the misclassification rate must be low, which means that the accuracy must be high. The location and gender of a person have become crucial in economic markets, for example in AdSense. With this comparative model, we assess the different ML algorithms and find the best fit for gender classification of acoustic data.
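To make the accuracy versus misclassification-rate relationship concrete, here is a small Python sketch of one of the five listed algorithms, K-Nearest Neighbour, applied to a single acoustic feature. The feature choice (mean F0 in Hz), the toy values, and the function names are all made up for illustration. They are not the paper's dataset or code.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test items whose predicted label matches the true label."""
    correct = sum(knn_predict(train, features, k) == label
                  for features, label in test)
    return correct / len(test)

# Made-up toy data: one feature per sample (mean F0 in Hz) and a label
train = [((110.0,), "male"), ((125.0,), "male"), ((135.0,), "male"),
         ((190.0,), "female"), ((210.0,), "female"), ((230.0,), "female")]
test = [((120.0,), "male"), ((205.0,), "female"),
        ((148.0,), "male"), ((160.0,), "female")]

acc = accuracy(train, test)
misclassification_rate = 1.0 - acc
```

The last toy sample sits between the two clusters and is misclassified, so accuracy is 0.75 and the misclassification rate is 0.25. Accuracy and misclassification rate always sum to one, which is why minimising one maximises the other.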
ERIC Educational Resources Information Center
Ma, Liping
This paper introduces a Chinese teacher induction program. In China it is believed that formal teacher education constitutes only half of teacher preparation; the other half has to be accomplished on the job with the active support and involvement of the teaching community. The voice of teachers is introduced to the field of educational research…
NASA Technical Reports Server (NTRS)
Mckee, James W.
1988-01-01
This final report describes the accomplishments of the General Purpose Intelligent Sensor Interface task of the Applications of Artificial Intelligence to Space Station grant for the period from October 1, 1987 through September 30, 1988. Portions of the First Biannual Report not revised will not be included but only referenced. The goal is to develop an intelligent sensor system that will simplify the design and development of expert systems using sensors of the physical phenomena as a source of data. This research will concentrate on the integration of image processing sensors and voice processing sensors with a computer designed for expert system development. The result of this research will be the design and documentation of a system in which the user will not need to be an expert in such areas as image processing algorithms, local area networks, image processor hardware selection or interfacing, television camera selection, voice recognition hardware selection, or analog signal processing. The user will be able to access data from video or voice sensors through standard LISP statements without any need to know about the sensor hardware or software.
Unique voices in harmony: Call-and-response to address race and physics teaching
NASA Astrophysics Data System (ADS)
Cochran, Geraldine L.; White, Gary D.
2017-09-01
In the February 2016 issue of The Physics Teacher, we announced a call for papers on race and physics teaching. The response was muted at first, but has now grown to a respectable chorale-sized volume. As the manuscripts began to come in and the review process progressed, Geraldine Cochran graciously agreed to come on board as co-editor for this remarkable collection of papers, to be published throughout the fall of 2017 in TPT. Upon reviewing the original call and the responses from the physics community, the parallels between generating this collection and the grand call-and-response tradition became compelling. What follows is a conversation constructed by the co-editors that is intended to introduce the reader to the swell of voices that responded to the original call. The authors would like to thank Pam Aycock for providing many useful contributions to this editorial.
NASA Astrophysics Data System (ADS)
Kuroki, Hayato; Ino, Shuichi; Nakano, Satoko; Hori, Kotaro; Ifukube, Tohru
The authors of this paper have been studying a real-time speech-to-caption system using speech recognition technology with a “repeat-speaking” method. In this system, a “repeat-speaker” listens to a lecturer's voice and then repeats the lecturer's utterances into a speech recognition computer. At some international conferences, the system achieved a caption accuracy of about 97% in Japanese-Japanese conversion and a conversion time from voice to captions of about 4 seconds in English-English conversion. Achieving this level of performance, however, was costly. In human communication, speech understanding depends not only on verbal information but also on non-verbal information such as the speaker's gestures and face and mouth movements. The authors therefore proposed storing the information briefly in a computer and then displaying the captions and the speaker's face movement images in a way that yields higher comprehension. In this paper, we investigate the relationship of the display sequence and display timing between captions that have speech recognition errors and the speaker's face movement images. The results show that the sequence “to display the caption before the speaker's face image” improves the comprehension of the captions. The sequence “to display both simultaneously” shows an improvement only a few percent higher than the question sentence, and the sequence “to display the speaker's face image before the caption” shows almost no change. In addition, the sequence “to display the caption 1 second before the speaker's face” shows the most significant improvement of all the conditions.
Crosswalk navigation for people with visual impairments on a wearable device
NASA Astrophysics Data System (ADS)
Cheng, Ruiqi; Wang, Kaiwei; Yang, Kailun; Long, Ningbo; Hu, Weijian; Chen, Hao; Bai, Jian; Liu, Dong
2017-09-01
Detecting crosswalks at urban intersections and alerting users to them is one of the most important needs of people with visual impairments. A real-time crosswalk detection algorithm, adaptive extraction and consistency analysis (AECA), is proposed. Compared with existing algorithms, which detect crosswalks in ideal scenarios, the AECA algorithm performs better in challenging scenarios, such as crosswalks at far distances, low-contrast crosswalks, pedestrian occlusion, various illuminances, and the limited resources of portable PCs. Bright stripes of crosswalks are extracted by adaptive thresholding, and are gathered to form crosswalks by consistency analysis. On the testing dataset, the proposed algorithm achieves a precision of 84.6% and a recall of 60.1%, both higher than those of the bipolarity-based algorithm. The position and orientation of crosswalks are conveyed to users by voice prompts so that they can align themselves with crosswalks and walk along them. The field tests carried out in various practical scenarios demonstrate the effectiveness and reliability of the proposed navigation approach.
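The two stages named in the abstract, adaptive thresholding followed by a consistency check, can be illustrated on a single image row. The Python sketch below is not the AECA algorithm itself. The local-mean threshold, the window and offset values, and the width-based consistency rule are simplifying assumptions, and all function names are hypothetical.

```python
def adaptive_threshold(row, window=9, offset=10):
    """Mark pixels that are brighter than their local mean plus an offset."""
    half = window // 2
    mask = []
    for i, value in enumerate(row):
        lo, hi = max(0, i - half), min(len(row), i + half + 1)
        local_mean = sum(row[lo:hi]) / (hi - lo)
        mask.append(value > local_mean + offset)
    return mask

def bright_runs(mask):
    """Return (start, end) index pairs, end exclusive, of consecutive True runs."""
    runs, start = [], None
    for i, flag in enumerate(mask + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    return runs

def looks_like_crosswalk(runs, max_width_diff=1):
    """Assumed consistency rule: at least two stripes of similar width."""
    widths = [end - start for start, end in runs]
    return len(widths) >= 2 and max(widths) - min(widths) <= max_width_diff

# One synthetic grey-level scanline: dark road (40) with two bright stripes (200)
row = [40] * 4 + [200] * 3 + [40] * 4 + [200] * 3 + [40] * 4
mask = adaptive_threshold(row)
stripes = bright_runs(mask)
is_crosswalk = looks_like_crosswalk(stripes)
```

Thresholding against a local mean rather than a fixed level is what lets this kind of extraction cope with varying illumination, which the abstract lists among the challenging scenarios.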
NASA Astrophysics Data System (ADS)
Sheikhan, Mansour; Abbasnezhad Arabi, Mahdi; Gharavian, Davood
2015-10-01
Artificial neural networks are efficient models in pattern recognition applications, but their performance is dependent on employing suitable structure and connection weights. This study used a hybrid method for obtaining the optimal weight set and architecture of a recurrent neural emotion classifier based on gravitational search algorithm (GSA) and its binary version (BGSA), respectively. By considering the features of speech signal that were related to prosody, voice quality, and spectrum, a rich feature set was constructed. To select more efficient features, a fast feature selection method was employed. The performance of the proposed hybrid GSA-BGSA method was compared with similar hybrid methods based on particle swarm optimisation (PSO) algorithm and its binary version, PSO and discrete firefly algorithm, and hybrid of error back-propagation and genetic algorithm that were used for optimisation. Experimental tests on Berlin emotional database demonstrated the superior performance of the proposed method using a lighter network structure.
Enhancement of A5/1 encryption algorithm
NASA Astrophysics Data System (ADS)
Thomas, Ria Elin; Chandhiny, G.; Sharma, Katyayani; Santhi, H.; Gayathri, P.
2017-11-01
Mobiles have become an integral part of today’s world. Various standards have been proposed for mobile communication, one of them being GSM. With the rise of mobile-based crimes, it is necessary to improve the security of the information passed in the form of voice or data. GSM uses A5/1 for its encryption. It is known that various attacks have been implemented, exploiting the vulnerabilities present within the A5/1 algorithm. Thus, in this paper, we examine these vulnerabilities and propose the enhanced A5/1 (E-A5/1), in which we try to improve the security provided by the A5/1 algorithm by XORing the generated key stream with a pseudo-random number, without increasing the time complexity. Studying the vulnerabilities of the base algorithm (A5/1) and improving upon its security will help in future releases of the A5 family of algorithms.
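The idea of XORing the A5/1 key stream with a pseudo-random number can be sketched as follows. The register lengths, tap positions, and majority clocking follow commonly published descriptions of A5/1, but this code has not been checked against official test vectors. How E-A5/1 actually derives and applies its pseudo-random number is not given in the abstract, so the seeded `random.Random` mask below is purely a stand-in assumption (and is not cryptographically secure).

```python
import random

# (length, feedback tap bits, clocking bit) for R1, R2, R3, as commonly published
REGISTERS = [(19, (13, 16, 17, 18), 8),
             (22, (20, 21), 10),
             (23, (7, 20, 21, 22), 10)]

def _shift(state, length, taps, in_bit=0):
    """Clock one LFSR: shift left and insert feedback XOR in_bit at bit 0."""
    feedback = in_bit
    for t in taps:
        feedback ^= (state >> t) & 1
    return ((state << 1) & ((1 << length) - 1)) | feedback

def a51_keystream(key, frame, n_bits):
    """Generate n_bits of A5/1-style key stream (simplified, unverified sketch)."""
    regs = [0, 0, 0]
    # Load 64 key bits, then 22 frame bits, clocking all registers each time.
    for value, width in ((key, 64), (frame, 22)):
        for i in range(width):
            bit = (value >> i) & 1
            regs = [_shift(r, n, taps, bit)
                    for r, (n, taps, _) in zip(regs, REGISTERS)]
    out = []
    for step in range(100 + n_bits):  # the first 100 steps are discarded
        clock_bits = [(r >> c) & 1 for r, (_, _, c) in zip(regs, REGISTERS)]
        majority = 1 if sum(clock_bits) >= 2 else 0
        # Only registers whose clocking bit agrees with the majority advance.
        regs = [_shift(r, n, taps) if b == majority else r
                for r, b, (n, taps, _) in zip(regs, clock_bits, REGISTERS)]
        if step >= 100:
            out.append(((regs[0] >> 18) ^ (regs[1] >> 21) ^ (regs[2] >> 22)) & 1)
    return out

def enhanced_keystream(key, frame, n_bits, seed):
    """XOR the base key stream with a pseudo-random mask (assumed construction)."""
    mask = random.Random(seed).getrandbits(n_bits)  # NOT a secure generator
    base = a51_keystream(key, frame, n_bits)
    return [b ^ ((mask >> i) & 1) for i, b in enumerate(base)]

def xor_bits(bits, keystream):
    """Encrypt or decrypt by XORing data bits with key stream bits."""
    return [b ^ k for b, k in zip(bits, keystream)]

plain = [1, 0, 1, 1, 0, 0, 1, 0] * 4
ks = enhanced_keystream(key=0x0123456789ABCDEF, frame=0x134, n_bits=len(plain), seed=7)
cipher = xor_bits(plain, ks)
recovered = xor_bits(cipher, ks)
```

Because the extra step is a single XOR per output bit, the cost per key stream bit stays constant, which is consistent with the abstract's claim that time complexity does not increase. Both ends would need to derive the same mask, so in practice the seed would have to be shared or derived from the session key.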
The relationship between cell phone use and management of driver fatigue: It's complicated.
Saxby, Dyani Juanita; Matthews, Gerald; Neubauer, Catherine
2017-06-01
Voice communication may enhance performance during monotonous, potentially fatiguing driving conditions (Atchley & Chan, 2011); however, it is unclear whether safety benefits of conversation are outweighed by costs. The present study tested whether personalized conversations intended to simulate hands-free cell phone conversation may counter objective and subjective fatigue effects elicited by vehicle automation. A passive fatigue state (Desmond & Hancock, 2001), characterized by disengagement from the task, was induced using full vehicle automation prior to drivers resuming full control over the driving simulator. A conversation was initiated shortly after reversion to manual control. During the conversation an emergency event occurred. The fatigue manipulation produced greater task disengagement and slower response to the emergency event, relative to a control condition. Conversation did not mitigate passive fatigue effects; rather, it added worry about matters unrelated to the driving task. Conversation moderately improved vehicle control, as measured by SDLP, but it failed to counter fatigue-induced slowing of braking in response to an emergency event. Finally, conversation appeared to have a hidden danger in that it reduced drivers' insights into performance impairments when in a state of passive fatigue. Automation induced passive fatigue, indicated by loss of task engagement; yet, simulated cell phone conversation did not counter the subjective automation-induced fatigue. Conversation also failed to counter objective loss of performance (slower braking speed) resulting from automation. Cell phone conversation in passive fatigue states may impair drivers' awareness of their performance deficits. Practical applications: Results suggest that conversation, even using a hands-free device, may not be a safe way to reduce fatigue and increase alertness during transitions from automated to manual vehicle control. 
Copyright © 2017 Elsevier Ltd and National Safety Council. All rights reserved.
Wilkes, James; Scott, Sophie K
2016-01-01
ABSTRACT Dialogues and collaborations between scientists and non-scientists are now widely understood as important elements of scientific research and public engagement with science. In recognition of this, the authors, a neuroscientist and a poet, use a dialogical approach to extend questions and ideas first shared during a lab-based poetry residency. They recorded a conversation and then expanded it into an essayistic form, allowing divergent disciplinary understandings and uses of experiment, noise, voice and emotion to be articulated, shared and questioned. PMID:27885317
Reframing Political Differences: One Conversation at a Time.
Gardner, Deborah B
2015-01-01
The profession of nursing, by its very nature, is wrought with significantly complex moral and political disagreements. These issues cannot simply be avoided by being relegated to private discussions because their resolution is crucial to the common good. It is important for nurses to develop skills in public discourse if we are to bridge the political divide and influence local, state, and national policy. Failure to do so will leave us with ineffective and dismissed voices.
Psychogenic dysphonia: diversity of clinical and vocal manifestations in a case series.
Martins, Regina Helena Garcia; Tavares, Elaine Lara Mendes; Ranalli, Paula Ferreira; Branco, Anete; Pessin, Adriana Bueno Benito
2014-01-01
Psychogenic dysphonia is a functional disorder with variable clinical manifestations. To assess the clinical and vocal characteristics of patients with psychogenic dysphonia in a case series. The study included 28 adult patients with psychogenic dysphonia, evaluated at a University hospital in the last ten years. Assessed variables included gender, age, occupation, vocal symptoms, vocal characteristics, and videolaryngostroboscopic findings. 28 patients (26 women and 2 men) were assessed. Their occupations included: housekeeper (n=17), teacher (n=4), salesclerk (n=4), nurse (n=1), retired (n=1), and psychologist (n=1). Sudden symptom onset was reported by 16 patients and progressive symptom onset was reported by 12; intermittent evolution was reported by 15; symptom duration longer than three months was reported by 21 patients. Videolaryngostroboscopy showed only functional disorders; no patient had structural lesions or changes in vocal fold mobility. Conversion aphonia, skeletal muscle tension, and intermittent voicing were the most frequent vocal emission manifestation forms. In this case series of patients with psychogenic dysphonia, the most frequent form of clinical presentation was conversion aphonia, followed by musculoskeletal tension and intermittent voicing. The clinical and vocal aspects of 28 patients with psychogenic dysphonia, as well as the particularities of each case, are discussed. Copyright © 2014 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Algorithmic commonalities in the parallel environment
NASA Technical Reports Server (NTRS)
Mcanulty, Michael A.; Wainer, Michael S.
1987-01-01
The ultimate aim of this project was to analyze procedures from substantially different application areas to discover what is either common or peculiar in the process of conversion to the Massively Parallel Processor (MPP). Three areas were identified: molecular dynamic simulation, production systems (rule systems), and various graphics and vision algorithms. To date, only selected graphics procedures have been investigated. They are the most readily available, and produce the most visible results. These include simple polygon patch rendering, raycasting against a constructive solid geometric model, and stochastic or fractal based textured surface algorithms. Only the simplest of conversion strategies, mapping a major loop to the array, has been investigated so far. It is not entirely satisfactory.
Using PhotoVoice to Promote Land Conservation and Indigenous Well-Being in Oklahoma.
Carroll, Clint; Garroutte, Eva; Noonan, Carolyn; Buchwald, Dedra
2018-03-26
Indigenous ancestral teachings commonly present individual and community health as dependent upon relationships between human and nonhuman worlds. But how do persons conversant with ancestral teachings effectively convey such perspectives in contemporary contexts, and to what extent does the general tribal citizenry share them? Can media technology provide knowledge keepers with opportunities to communicate their perspectives to larger audiences? What are the implications for tribal citizens' knowledge and views about tribal land use policies? Using a PhotoVoice approach, we collaborated with a formally constituted body of Cherokee elders who supply cultural guidance to the Cherokee Nation government in Oklahoma. We compiled photographs taken by the elders and conducted interviews with them centered on the project themes of land and health. We then developed a still-image documentary highlighting these themes and surveyed 84 Cherokee citizens before and after they viewed it. Results from the pre-survey revealed areas where citizens' perspectives on tribal policy did not converge with the elders' perspectives; however, the post-survey showed statistically significant changes. We conclude that PhotoVoice is an effective method to communicate elders' perspectives, and that tribal citizens' values about tribal land use may change as they encounter these perspectives in such novel formats.
NASA Astrophysics Data System (ADS)
Chu, Xiaowen; Li, Bo; Chlamtac, Imrich
2002-07-01
Sparse wavelength conversion and appropriate routing and wavelength assignment (RWA) algorithms are the two key factors in improving the blocking performance in wavelength-routed all-optical networks. It has been shown that the optimal placement of a limited number of wavelength converters in an arbitrary mesh network is an NP-complete problem. Various heuristic algorithms have been proposed in the literature, most of which assume that a static routing and random wavelength assignment RWA algorithm is employed. However, existing work shows that fixed-alternate routing and dynamic routing RWA algorithms can achieve much better blocking performance. Our study in this paper further demonstrates that the wavelength converter placement and RWA algorithms are closely related, in the sense that a well designed wavelength converter placement mechanism for a particular RWA algorithm might not work well with a different RWA algorithm. Therefore, the wavelength converter placement and the RWA have to be considered jointly. The objective of this paper is to investigate the wavelength converter placement problem under the fixed-alternate routing algorithm and the least-loaded routing algorithm. Under the fixed-alternate routing algorithm, we propose a heuristic algorithm called Minimum Blocking Probability First (MBPF) algorithm for wavelength converter placement. Under the least-loaded routing algorithm, we propose a heuristic converter placement algorithm called Weighted Maximum Segment Length (WMSL) algorithm. The objective of the converter placement algorithm is to minimize the overall blocking probability. Extensive simulation studies have been carried out over three typical mesh networks, including the 14-node NSFNET, 19-node EON and 38-node CTNET. We observe that the proposed algorithms not only outperform existing wavelength converter placement algorithms by a large margin, but also achieve almost the same performance as full wavelength conversion under the same RWA algorithm.
A robotic voice simulator and the interactive training for hearing-impaired people.
Sawada, Hideyuki; Kitani, Mitsuki; Hayashi, Yasumori
2008-01-01
A talking and singing robot which adaptively learns the vocalization skill by means of an auditory feedback learning algorithm is being developed. The robot consists of motor-controlled vocal organs such as vocal cords, a vocal tract and a nasal cavity to generate a natural voice imitating a human vocalization. In this study, the robot is applied to the training system of speech articulation for the hearing-impaired, because the robot is able to reproduce their vocalization and to teach them how it is to be improved to generate clear speech. The paper briefly introduces the mechanical construction of the robot and how it autonomously acquires the vocalization skill in the auditory feedback learning by listening to human speech. Then the training system is described, together with the evaluation of the speech training by auditory impaired people.
Evaluation of synthesized voice approach callouts /SYNCALL/
NASA Technical Reports Server (NTRS)
Simpson, C. A.
1981-01-01
The two basic approaches to the generation of 'synthesized' speech include a utilization of analog recorded human speech and a construction of speech entirely from algorithms applied to constants describing speech sounds. Given the availability of synthesized speech displays for man-machine systems, research is needed to study suggested applications for speech and design principles for speech displays. The present investigation is concerned with a study for which new performance measures were developed. A number of air carrier approach and landing accidents during low or impaired visibility have been associated with the absence of approach callouts. The study had the purpose to compare a pilot-not-flying (PNF) approach callout system to a system composed of PNF callouts augmented by an automatic synthesized voice callout system (SYNCALL). Pilots were found to favor the use of a SYNCALL system containing certain modifications.
Voice, (inter-)subjectivity, and real time recurrent interaction
Cummins, Fred
2014-01-01
Received approaches to a unified phenomenon called “language” are firmly committed to a Cartesian view of distinct unobservable minds. Questioning this commitment leads us to recognize that the boundaries conventionally separating the linguistic from the non-linguistic can appear arbitrary, omitting much that is regularly present during vocal communication. The thesis is put forward that uttering, or voicing, is a much older phenomenon than the formal structures studied by the linguist, and that the voice has found elaborations and codifications in other domains too, such as in systems of ritual and rite. Voice, it is suggested, necessarily gives rise to a temporally bound subjectivity, whether it is in inner speech (Descartes' “cogito”), in conversation, or in the synchronized utterances of collective speech found in prayer, protest, and sports arenas world wide. The notion of a fleeting subjective pole tied to dynamically entwined participants who exert reciprocal influence upon each other in real time provides an insightful way to understand notions of common ground, or socially shared cognition. It suggests that the remarkable capacity to construct a shared world that is so characteristic of Homo sapiens may be grounded in this ability to become dynamically entangled as seen, e.g., in the centrality of joint attention in human interaction. Empirical evidence of dynamic entanglement in joint speaking is found in behavioral and neuroimaging studies. A convergent theoretical vocabulary is now available in the concept of participatory sense-making, leading to the development of a rich scientific agenda liberated from a stifling metaphysics that obscures, rather than illuminates, the means by which we come to inhabit a shared world. PMID:25101028
Mistaking minds and machines: How speech affects dehumanization and anthropomorphism.
Schroeder, Juliana; Epley, Nicholas
2016-11-01
Treating a human mind like a machine is an essential component of dehumanization, whereas attributing a humanlike mind to a machine is an essential component of anthropomorphism. Here we tested how a cue closely connected to a person's actual mental experience-a humanlike voice-affects the likelihood of mistaking a person for a machine, or a machine for a person. We predicted that paralinguistic cues in speech are particularly likely to convey the presence of a humanlike mind, such that removing voice from communication (leaving only text) would increase the likelihood of mistaking the text's creator for a machine. Conversely, adding voice to a computer-generated script (resulting in speech) would increase the likelihood of mistaking the text's creator for a human. Four experiments confirmed these hypotheses, demonstrating that people are more likely to infer a human (vs. computer) creator when they hear a voice expressing thoughts than when they read the same thoughts in text. Adding human visual cues to text (i.e., seeing a person perform a script in a subtitled video clip), did not increase the likelihood of inferring a human creator compared with only reading text, suggesting that defining features of personhood may be conveyed more clearly in speech (Experiments 1 and 2). Removing the naturalistic paralinguistic cues that convey humanlike capacity for thinking and feeling, such as varied pace and intonation, eliminates the humanizing effect of speech (Experiment 4). We discuss implications for dehumanizing others through text-based media, and for anthropomorphizing machines through speech-based media. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Conversion of wastelands into state ownership for the needs of high-rise construction
NASA Astrophysics Data System (ADS)
Ganebnykh, Elena
2018-03-01
High-rise construction in big cities faces the problem of land shortage in downtown areas. Audit of economic complexes showed a large volume of wastelands. The conversion of wastelands into state and municipal ownership helps in part to solve the problem of the lack of space for high-rise construction in the urban area in the format of infill construction. The article investigates the problem of the conversion of wastelands into state and municipal ownership. The research revealed no clear algorithm for converting wastelands into state and municipal ownership. To form a unified system for identifying such plots, a universal algorithm was developed to identify and convert ownerless immovable property into state or municipal ownership.
NASA Astrophysics Data System (ADS)
Qian, Feng; Li, Guoqiang
2001-12-01
In this paper a generalized look-ahead logic algorithm for number conversion from signed-digit to its complement representation is developed. By properly encoding the signed digits, all the operations are performed by binary logic, and unified logical expressions can be obtained for conversion from modified-signed-digit (MSD) to 2's complement, trinary signed-digit (TSD) to 3's complement, and quaternary signed-digit (QSD) to 4's complement. For optical implementation, a parallel logical array module using electron-trapping device is employed, which is suitable for realizing complex logic functions in the form of sum-of-product. The proposed algorithm and architecture are compatible with a general-purpose optoelectronic computing system.
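For readers unfamiliar with the number systems involved, the conversion target can be illustrated with a small reference routine for the modified-signed-digit (MSD) case. This is a plain serial sketch, not the look-ahead binary-logic formulation the abstract describes; the function name and the choice of an output width of one bit more than the number of input digits are assumptions made for the example.

```python
def msd_to_twos_complement(digits):
    """Convert MSD digits in {-1, 0, 1} (most significant first) to a
    two's complement bit string that is one bit wider than the input."""
    if any(d not in (-1, 0, 1) for d in digits):
        raise ValueError("MSD digits must be -1, 0 or 1")
    width = len(digits) + 1  # extra bit so every MSD value fits with its sign
    positive = negative = 0
    for d in digits:
        # Split the signed digits into two ordinary binary numbers.
        positive = (positive << 1) | (1 if d == 1 else 0)
        negative = (negative << 1) | (1 if d == -1 else 0)
    value = positive - negative
    return format(value & ((1 << width) - 1), f"0{width}b")


print(msd_to_twos_complement([1, 0, -1]))   # 4 - 1 = 3   -> 0011
print(msd_to_twos_complement([-1, 0, 1]))   # -4 + 1 = -3 -> 1101
```

The subtraction of the negative part from the positive part is where borrows propagate, which is the step a look-ahead scheme would parallelize.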
Adams, Derk; Schreuder, Astrid B; Salottolo, Kristin; Settell, April; Goss, J Richard
2011-07-01
There are significant changes in the abbreviated injury scale (AIS) 2005 system, which make it impractical to compare patients coded in AIS version 98 with patients coded in AIS version 2005. Harborview Medical Center created a computer algorithm "Harborview AIS Mapping Program (HAMP)" to automatically convert AIS 2005 to AIS 98 injury codes. The mapping was validated using 6 months of double-coded patient injury records from a Level I Trauma Center. HAMP was used to determine how closely individual AIS and injury severity scores (ISS) were converted from AIS 2005 to AIS 98 versions. The kappa statistic was used to measure the agreement between manually determined codes and HAMP-derived codes. Seven hundred forty-nine patient records were used for validation. For the conversion of AIS codes, the measure of agreement between HAMP and manually determined codes was κ = 0.84 (95% confidence interval, 0.82-0.86). The algorithm errors were smaller in magnitude than the manually determined coding errors. For the conversion of ISS, the agreement between HAMP versus manually determined ISS was κ = 0.81 (95% confidence interval, 0.78-0.84). The HAMP algorithm successfully converted injuries coded in AIS 2005 to AIS 98. This algorithm will be useful when comparing trauma patient clinical data across populations coded in different versions, especially for longitudinal studies.
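The agreement measure reported here is the kappa statistic. A minimal sketch of the unweighted (Cohen's) form is given below; the abstract does not say which variant was used, and the sample codes are made-up illustration values, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical severity codes: manual coding versus an automatic mapping.
manual = [3, 3, 2, 1, 4, 2]
mapped = [3, 2, 2, 1, 4, 2]
kappa = cohen_kappa(manual, mapped)
```

Kappa discounts the agreement two coders would reach by chance, which is why it is preferred over raw percent agreement when some codes are much more common than others.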
1988-09-01
other languages are better suited for more precise and narrow communications [Figure labels: HIGH VARIETY (AMBIGUOUS): Art & Music, Body Language] ... change one's understanding). Face-to-face conversation is the "richest" medium as it provides "immediate feedback" plus "multiple cues" such as body ... language and voice tone (Daft and Lengel, 1986:560). Some of the more "hidden messages managers send" (like body language and office arrangement) can
Classification of vocal aging using parameters extracted from the glottal signal.
Forero Mendoza, Leonardo A; Cataldo, Edson; Vellasco, Marley M B R; Silva, Marco A; Apolinário, José A
2014-09-01
This article proposes and evaluates a method to classify vocal aging using artificial neural network (ANN) and support vector machine (SVM), using the parameters extracted from the speech signal as inputs. For each recorded speech, from a corpus of male and female speakers of different ages, the corresponding glottal signal is obtained using an inverse filtering algorithm. The Mel Frequency Cepstrum Coefficients (MFCC) also extracted from the voice signal and the features extracted from the glottal signal are supplied to an ANN and an SVM with a previous selection. The selection is performed by a wrapper approach of the most relevant parameters. Three groups are considered for the aging-voice classification: young (aged 15-30 years), adult (aged 31-60 years), and senior (aged 61-90 years). The results are compared using different possibilities: with only the parameters extracted from the glottal signal, with only the MFCC, and with a combination of both. The results demonstrate that the best classification rate is obtained using the glottal signal features, which is a novel result and the main contribution of this article. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Evaluation of voice codecs for the Australian mobile satellite system
NASA Technical Reports Server (NTRS)
Bundrock, Tony; Wilkinson, Mal
1990-01-01
The evaluation procedure to choose a low bit rate voice coding algorithm is described for the Australian land mobile satellite system. The procedure is designed to assess both the inherent quality of the codec under 'normal' conditions and its robustness under 'severe' conditions. For the assessment, normal conditions were chosen to be random bit error rate with added background acoustic noise and the severe condition is designed to represent burst error conditions when mobile satellite channel suffers from signal fading due to roadside vegetation. The assessment is divided into two phases. First, a reduced set of conditions is used to determine a short list of candidate codecs for more extensive testing in the second phase. The first phase conditions include quality and robustness and codecs are ranked with a 60:40 weighting on the two. Second, the short listed codecs are assessed over a range of input voice levels, BERs, background noise conditions, and burst error distributions. Assessment is by subjective rating on a five level opinion scale and all results are then used to derive a weighted Mean Opinion Score using appropriate weights for each of the test conditions.
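The final scoring step, a weighted Mean Opinion Score over test conditions, can be sketched as a weighted average of per-condition mean ratings. The condition names and ratings below are hypothetical; only the five-level opinion scale and the 60:40 quality-to-robustness weighting come from the abstract, and that weighting is reused here purely as an example.

```python
def weighted_mos(ratings, weights):
    """Weighted mean of per-condition mean opinion scores.

    ratings: dict mapping condition name -> list of ratings on a 1..5 scale
    weights: dict mapping condition name -> non-negative weight
    """
    total_weight = sum(weights[name] for name in ratings)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(
        weights[name] * (sum(scores) / len(scores))
        for name, scores in ratings.items()
    )
    return weighted_sum / total_weight


# Hypothetical listener ratings for one codec.
ratings = {"quality": [4, 4], "robustness": [2, 3]}
weights = {"quality": 0.6, "robustness": 0.4}
score = weighted_mos(ratings, weights)  # 0.6 * 4.0 + 0.4 * 2.5
```

Dividing by the total weight means the weights do not have to sum to one, so the same routine works when more test conditions are added.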
Hearsay Ethnography: Conversational Journals as a Method for Studying Culture in Action.
Watkins, Susan Cotts; Swidler, Ann
2009-04-01
Social scientists have long struggled to develop methods adequate to their theoretical understanding of meaning as collective and dynamic. While culture is widely understood as an emergent property of collectivities, the methods we use keep pulling us back towards interview-situated accounts and an image of culture as located in individual experience. Scholars who seek to access supra-individual semiotic structures by studying public rituals and other collectively-produced texts then have difficulty capturing the dynamic processes through which such meanings are created and changed in situ. To try to capture more effectively the way meaning is produced and re-produced in everyday life, we focus here on conversational interactions-the voices and actions that constitute the relational space among actors. Conversational journals provide us with a method: the analysis of texts produced by cultural insiders who keep journals of who-said-what-to-whom in conversations they overhear or events they participate in during the course of their daily lives. We describe the method, distinguishing it from other approaches and noting its drawbacks. We then illustrate the methodological advantages of conversational journals with examples from our texts. We end with a discussion of the method's potential in our setting as well as in other places and times.
Zadeh, Sima; Pao, Maryland; Wiener, Lori
2015-06-01
Each year, more than 11,000 adolescents and young adults (AYAs), aged 15-34, die from cancer and other life-threatening conditions. In order to facilitate the transition from curative to end-of-life (EoL) care, it is recommended that EoL discussions be routine, begin close to the time of diagnosis, and continue throughout the illness trajectory. However, due largely to discomfort with the topic of EoL and how to approach the conversation, healthcare providers have largely avoided these discussions. We conducted a two-phase study through the National Cancer Institute with AYAs living with cancer or pediatric HIV to assess AYA interest in EoL planning and to determine in which aspects of EoL planning AYAs wanted to participate. These results provided insight regarding what EoL concepts were important to AYAs, as well as preferences in terms of content, design, format, and style. The findings from this research led to the development of an age-appropriate advance care planning guide, Voicing My CHOiCES™. Voicing My CHOiCES™: An Advanced Care Planning Guide for AYA became available in November 2012. This manuscript provides guidelines on how to introduce and utilize an advance care planning guide for AYAs and discusses potential barriers. Successful use of Voicing My CHOiCES™ will depend on the comfort and skills of the healthcare provider. The present paper is intended to introduce the guide to providers who may utilize it as a resource in their practice, including physicians, nurses, social workers, chaplains, psychiatrists, and psychologists. We suggest guidelines on how to: incorporate EoL planning into the practice setting, identify timepoints at which a patient's goals of care are discussed, and address how to empower the patient and incorporate the family in EoL planning. Recommendations for introducing Voicing My CHOiCES™ and on how to work through each section alongside the patient are provided.
Scalable 3D image conversion and ergonomic evaluation
NASA Astrophysics Data System (ADS)
Kishi, Shinsuke; Kim, Sang Hyun; Shibata, Takashi; Kawai, Takashi; Häkkinen, Jukka; Takatalo, Jari; Nyman, Göte
2008-02-01
Digital 3D cinema has recently become popular and a number of high-quality 3D films have been produced. However, in contrast with advances in 3D display technology, it has been pointed out that there is a lack of suitable 3D content and content creators. Since 3D display methods and viewing environments vary widely, there is expectation that high-quality content will be multi-purposed. On the other hand, there is increasing interest in the bio-medical effects of image content of various types and there are moves toward international standardization, so 3D content production needs to take into consideration safety and conformity with international guidelines. The aim of the authors' research is to contribute to the production and application of 3D content that is safe and comfortable to watch by developing a scalable 3D conversion technology. In this paper, the authors focus on the process of changing the screen size, examining a conversion algorithm and its effectiveness. The authors evaluated the visual load imposed during the viewing of various 3D content converted by the prototype algorithm as compared with ideal conditions and with content expanded without conversion. Scheffé's paired comparison method was used for evaluation. To examine the effects of screen size reduction on viewers, changes in user impression and experience were elucidated using the IBQ methodology. The results of the evaluation are presented along with a discussion of the effectiveness and potential of the developed scalable 3D conversion algorithm and future research tasks.
Adaptive Noise Suppression Using Digital Signal Processing
NASA Technical Reports Server (NTRS)
Kozel, David; Nelson, Richard
1996-01-01
A signal-to-noise-ratio-dependent adaptive spectral subtraction algorithm is developed to eliminate noise from noise-corrupted speech signals. The algorithm determines the signal-to-noise ratio and adjusts the spectral subtraction proportion appropriately. After spectral subtraction, low-amplitude signals are squelched. A single microphone is used to obtain both the noise-corrupted speech and the average noise estimate. This is done by determining if the frame of data being sampled is a voiced or unvoiced frame. During unvoiced frames an estimate of the noise is obtained. A running average of the noise is used to approximate the expected value of the noise. Applications include the emergency egress vehicle and the crawler transporter.
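The processing chain in this abstract (estimate the signal-to-noise ratio, scale the subtraction accordingly, squelch what remains at low amplitude, and keep a running noise average) can be sketched on magnitude spectra as below. The linear SNR-to-factor mapping, its thresholds, the squelch level and all function names are assumptions for illustration; the abstract does not give the actual rule.

```python
import math


def subtraction_factor(snr_db, low=-5.0, high=20.0, alpha_max=4.0, alpha_min=1.0):
    """Hypothetical mapping: subtract more aggressively when the SNR is low."""
    if snr_db <= low:
        return alpha_max
    if snr_db >= high:
        return alpha_min
    fraction = (snr_db - low) / (high - low)
    return alpha_max - fraction * (alpha_max - alpha_min)


def spectral_subtract(frame_mag, noise_mag, squelch=0.0):
    """Subtract a scaled noise spectrum from one frame's magnitude spectrum."""
    frame_power = max(sum(m * m for m in frame_mag), 1e-12)
    noise_power = max(sum(n * n for n in noise_mag), 1e-12)
    # Ratio of noisy-frame power to noise power, used here as the SNR estimate.
    snr_db = 10.0 * math.log10(frame_power / noise_power)
    alpha = subtraction_factor(snr_db)
    cleaned = [max(m - alpha * n, 0.0) for m, n in zip(frame_mag, noise_mag)]
    return [c if c > squelch else 0.0 for c in cleaned]


def update_noise_average(noise_avg, frame_mag, rate=0.1):
    """Running average of the noise spectrum, updated on unvoiced frames."""
    return [(1.0 - rate) * a + rate * f for a, f in zip(noise_avg, frame_mag)]


cleaned = spectral_subtract([10.0, 0.5], [0.5, 0.5])
```

In a full system the magnitude spectra would come from a short-time Fourier transform of each frame, and `update_noise_average` would be called only for frames classified as unvoiced.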
Computer simulator for a mobile telephone system
NASA Technical Reports Server (NTRS)
Schilling, D. L.
1981-01-01
A software simulator was developed to assist NASA in the design of the land mobile satellite service. Structured programming techniques were used: the algorithm was developed in an ALGOL-like pseudo language and then encoded in FORTRAN 4. The basic input data to the system is a sine wave signal, although future plans call for actual sampled voice as the input signal. The simulator is capable of studying all the possible combinations of types and modes of calls through the use of five communication scenarios: single hop systems; double hop, single gateway system; double hop, double gateway system; mobile to wireline system; and wireline to mobile system. The transmitter, fading channel, and interference source simulation are also discussed.
Speech recognition for embedded automatic positioner for laparoscope
NASA Astrophysics Data System (ADS)
Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin
2014-07-01
In this paper a novel speech recognition methodology based on the Hidden Markov Model (HMM) is proposed for an embedded Automatic Positioner for Laparoscope (APL), which is built around a fixed-point ARM processor. The APL system is designed to assist the doctor in laparoscopic surgery by giving a specific doctor vocal control of the laparoscope. Real-time response to voice commands requires a more efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, relying mostly on arithmetic optimizations, a fixed-point front end for speech feature analysis is built to suit the characteristics of the ARM processor. Then a fast likelihood computation algorithm is used to reduce the computational complexity of the HMM-based recognition algorithm. The experimental results show that the method shortens the recognition time to within 0.5 s while keeping the accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control of the APL.
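The arithmetic optimization mentioned here, replacing floating-point operations with fixed-point ones, can be illustrated with a generic Q15 sketch. The Q15 format (16-bit values with 15 fractional bits) is an assumption chosen because it is common on fixed-point processors; the abstract does not state which format the authors used, and the helper names are hypothetical.

```python
Q = 15                      # assumed number of fractional bits (Q15)
Q15_MIN, Q15_MAX = -32768, 32767


def saturate(value):
    """Clamp an integer to the signed 16-bit range."""
    return max(Q15_MIN, min(Q15_MAX, value))


def to_q15(x):
    """Convert a float in roughly [-1, 1) to a Q15 integer."""
    return saturate(int(round(x * (1 << Q))))


def from_q15(q):
    """Convert a Q15 integer back to a float."""
    return q / (1 << Q)


def q15_mul(a, b):
    """Multiply two Q15 integers with rounding, using only integer operations."""
    return saturate((a * b + (1 << (Q - 1))) >> Q)


half = to_q15(0.5)
quarter = q15_mul(half, half)
```

Every product is an integer multiply followed by a shift, which is the kind of operation a processor without a floating-point unit executes quickly.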
Nonlinear convergence active vibration absorber for single and multiple frequency vibration control
NASA Astrophysics Data System (ADS)
Wang, Xi; Yang, Bintang; Guo, Shufeng; Zhao, Wenqiang
2017-12-01
This paper presents a nonlinear convergence algorithm for an active dynamic undamped vibration absorber (ADUVA). The damping of the absorber is ignored in this algorithm to strengthen the vibration suppressing effect and simplify the algorithm at the same time. The simulation and experimental results indicate that this nonlinear convergence ADUVA can significantly suppress vibration caused by both single-frequency and multiple-frequency excitation. The proposed nonlinear algorithm is composed of equivalent dynamic modeling equations and a frequency estimator. Both the single and multiple frequency ADUVA are mathematically imitated by the same mechanical structure with a mass body and a voice coil motor (VCM). The nonlinear convergence estimator is applied to simultaneously satisfy the requirements of fast convergence rate and small steady state frequency error, which are incompatible for a linear convergence estimator. The convergence of the nonlinear algorithm is mathematically proved, and its non-divergent characteristic is theoretically guaranteed. The vibration suppressing experiments demonstrate that the nonlinear ADUVA converges faster and attenuates oscillation more than the linear ADUVA.
Duration, Pitch, and Loudness in Kunqu Opera Stage Speech.
Han, Qichao; Sundberg, Johan
2017-03-01
Kunqu is a special type of opera within the Chinese tradition with 600 years of history. In it, stage speech is used for the spoken dialogue. It is performed in Ming Dynasty Mandarin and is a much more dominant part of the play than singing. Stage speech deviates considerably from normal conversational speech with respect to duration, loudness and pitch. This paper compares these properties in stage speech and conversational speech. A famous, highly experienced female singer performed stage speech and read the same lyrics in a conversational speech mode. Clear differences are found. As compared with conversational speech, stage speech had longer word and sentence duration and word duration was less variable. Average sound level was 16 dB higher. Also mean fundamental frequency was considerably higher and more varied. Within sentences, both loudness and fundamental frequency tended to vary according to a low-high-low pattern. Some of the findings fail to support current opinions regarding the characteristics of stage speech, and in this sense the study demonstrates the relevance of objective measurements in descriptions of vocal styles. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
An Improved Perturb and Observe Algorithm for Photovoltaic Motion Carriers
NASA Astrophysics Data System (ADS)
Peng, Lele; Xu, Wei; Li, Liming; Zheng, Shubin
2018-03-01
An improved perturbation and observation algorithm for photovoltaic motion carriers is proposed in this paper. The model of the proposed algorithm is derived using the Lambert W function and the tangent error method. The tracking performance of the proposed algorithm is tested using MATLAB and an experimental photovoltaic system. The results demonstrate that the improved algorithm has fast tracking speed and high efficiency. Furthermore, the energy conversion efficiency of the improved method increased by nearly 8.2%.
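For orientation, the classic fixed-step perturb-and-observe loop that such work builds on can be sketched as follows. This is not the improved algorithm of the abstract (which uses the Lambert W function and a tangent error method). The function names, step size, and the toy power curve are all assumptions made for illustration.

```python
def perturb_and_observe(power_at, v_start, step=0.5, iterations=200):
    """Classic fixed-step P&O: keep perturbing the operating voltage in the
    direction that last increased the measured power, and reverse otherwise."""
    v = v_start
    p_prev = power_at(v)
    direction = 1.0
    for _ in range(iterations):
        v += direction * step          # perturb
        p = power_at(v)                # observe
        if p < p_prev:                 # power dropped: reverse direction
            direction = -direction
        p_prev = p
    return v


def toy_pv_power(v):
    """Hypothetical concave P-V curve peaking at 17 V and 60 W (not real panel data)."""
    return 60.0 - 0.5 * (v - 17.0) ** 2


v_mpp = perturb_and_observe(toy_pv_power, v_start=12.0)
print(f"operating voltage after tracking: {v_mpp:.1f} V")
```

With a fixed step the operating point keeps oscillating around the maximum by up to one step, which is the usual trade-off between tracking speed and steady-state ripple.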
NASA Astrophysics Data System (ADS)
Li, Guoqiang; Qian, Feng
2001-11-01
We present, for the first time to our knowledge, a generalized lookahead logic algorithm for number conversion from signed-digit to complement representation. By properly encoding the signed digits, all the operations are performed by binary logic, and unified logical expressions can be obtained for conversion from modified-signed-digit (MSD) to 2's complement, trinary signed-digit (TSD) to 3's complement, and quaternary signed-digit (QSD) to 4's complement. For optical implementation, a parallel logical array module using an electron-trapping device is employed and experimental results are shown. This optical module is suitable for implementing complex logic functions in sum-of-products form. The algorithm and architecture are compatible with a general-purpose optoelectronic computing system.
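To make the conversion target concrete, here is a small reference routine for the MSD to 2's complement case. It is a plain arithmetic sketch for checking results, not the lookahead logic described in the abstract. The function name and the most-significant-digit-first convention are assumptions.

```python
def msd_to_twos_complement(digits, width):
    """Convert modified-signed-digit input (radix 2, digits in {-1, 0, 1},
    most significant digit first) to a two's-complement bit string."""
    if any(d not in (-1, 0, 1) for d in digits):
        raise ValueError("MSD digits must be -1, 0 or 1")
    positive = 0  # binary word built from the +1 digits
    negative = 0  # binary word built from the -1 digits
    for d in digits:
        positive = (positive << 1) | (1 if d == 1 else 0)
        negative = (negative << 1) | (1 if d == -1 else 0)
    value = positive - negative
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise OverflowError("value does not fit in the requested width")
    # Masking a negative Python int yields its two's-complement bit pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")


print(msd_to_twos_complement([1, 0, -1, 1], width=5))   # 8 - 2 + 1 = 7  -> 00111
print(msd_to_twos_complement([-1, 0, 1, 1], width=5))   # -8 + 2 + 1 = -5 -> 11011
```

Splitting the digits into a positive and a negative word reduces the conversion to one binary subtraction.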
Implementation of trigonometric function using CORDIC algorithms
NASA Astrophysics Data System (ADS)
Mokhtar, A. S. N.; Ayub, M. I.; Ismail, N.; Daud, N. G. Nik
2018-02-01
In 1959, Jack E. Volder presented a new method for the real-time solution of equations arising in navigation systems. The algorithm enabled the replacement of analog navigation systems with digital ones. The CORDIC (Coordinate Rotation Digital Computer) algorithm is used for the rapid calculation of elementary functions such as trigonometric functions, multiplication, division, and logarithms, as well as various conversions, such as rectangular to polar coordinates and conversions between binary-coded information. Today CORDIC has many applications in communication, signal processing, 3-D graphics, and other fields. This paper presents an implementation of trigonometric functions using the CORDIC algorithm in rotation mode for the circular coordinate system. The CORDIC technique is used to generate outputs for angles in the range 0° to 90°, and an error analysis is carried out. The results show that the average percentage error is about 0.042% for angles between 0° and 90°, but rises to as much as 45% at angles of 90° and above. The method is therefore very accurate in the first quadrant. A mirror-property method is used to handle angles in the 2nd, 3rd, and 4th quadrants.
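The rotation-mode, circular-coordinate iteration mentioned above can be sketched in a few lines. This is a floating-point illustration of the standard CORDIC recurrence, not the authors' hardware implementation. The function name and iteration count are assumptions.

```python
import math


def cordic_sin_cos(angle_rad, iterations=24):
    """Rotation-mode CORDIC in circular coordinates.

    Returns (sin, cos) of angle_rad; intended for the first quadrant, where
    the basic iteration converges without range reduction."""
    # Elementary rotation angles atan(2^-i) and the accumulated gain correction.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Start on the x axis, pre-scaled so the final vector has unit length.
    x, y, z = gain, 0.0, angle_rad
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0       # rotate toward zero residual angle
        shift = 2.0 ** -i                   # a bit shift in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(math.radians(30.0))
print(f"sin(30 deg) ~ {s:.6f}, cos(30 deg) ~ {c:.6f}")
```

Angles outside the first quadrant are usually folded back into it first using symmetry, which corresponds to the mirror-property step described in the abstract.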
Lambert-Kerzner, Anne; Havranek, Edward P; Plomondon, Mary E; Albright, Karen; Moore, Ashley; Gryniewicz, Kelsey; Magid, David; Ho, P Michael
2010-11-01
Few studies have investigated the effectiveness of multifaceted interventions from the study participants' perspective. We conducted qualitative interviews to understand patients' experiences with a multifaceted blood pressure (BP) control intervention involving interactive voice response technology, home BP monitoring, and pharmacist-led BP management. In the randomized study, the intervention resulted in clinically significant decreases in BP. We used insights generated from in-depth interviews from all study participants randomly assigned to the multifaceted intervention or usual care (n=146) to create a model explaining the observed improvements in health behavior and clinical outcomes. The data were analyzed using qualitative content analysis methods and consultative and reflexive team analysis. Six explanatory factors emerged from the patients' interviews: (1) improved relationships with medical personnel; (2) increased knowledge of hypertension; (3) increased participation in their health care and personal empowerment; (4) greater understanding of the impact of health behavior on BP; (5) high satisfaction with technology used in the intervention; and, for some patients, (6) increased health care utilization. Eighty-six percent of the intervention patients and 62% of the usual care patients stated that study participation had a positive effect on them. Of those expressing a positive effect, 68% (intervention) and 55% (usual care) reached their systolic BP goal. Establishing bidirectional conversations between patients and providers is a key element of successful hypertension management. Home BP monitoring coupled with interactive voice response technology reporting facilitates such conversations.
Transitioning from analog to digital audio recording in childhood speech sound disorders.
Shriberg, Lawrence D; McSweeny, Jane L; Anderson, Bruce E; Campbell, Thomas F; Chial, Michael R; Green, Jordan R; Hauner, Katherina K; Moore, Christopher A; Rusiewicz, Heather L; Wilson, David L
2005-06-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants' speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise.
Towards Broadening the Audience
NASA Astrophysics Data System (ADS)
Sakimoto, P. J.
2008-06-01
The strand Towards Broadening the Audience was intended to seed thoughtful conversations about building bridges for outreach programs across cultural barriers. Many participants spoke about progress in increasing the diversity of their outreach audiences, but it was new voices from time-honored sources that offered fundamentally new wisdom. From the religious traditions and tensions that mark the Holy Land came the simple concept of bringing unity through teaching the commonalities found in basic concepts of the observed sky. From Mayan traditions, both contemporary and ancient, came the reminder that the sky is intimately connected to all aspects of our lives. Astronomy outreach should therefore be a part of much larger family and community celebrations. Ideas such as these offer renewed hope for major advances in bringing space science outreach to much broader audiences. They tell us about the importance of learning from voices with perspectives different from our own, and of building partnerships based upon genuine cross-cultural understanding and mutual love of the sky.
Transitioning from analog to digital audio recording in childhood speech sound disorders
Shriberg, Lawrence D.; McSweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.
2014-01-01
Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing a reference database for research in childhood speech sound disorders. Two research transcribers with different levels of experience glossed, transcribed, and prosody-voice coded conversational speech samples from eight children with mild to severe speech disorders of unknown origin. The samples were recorded, stored, and played back using representative analog and digital audio systems. Effect sizes calculated for an array of analog versus digital comparisons ranged from negligible to medium, with a trend for participants’ speech competency scores to be slightly lower for samples obtained and transcribed using the digital system. We discuss the implications of these and other findings for research and clinical practise. PMID:16019779
USE OF POPULATION VIABILITY ANALYSIS AND RESERVE SELECTION ALGORITHMS IN REGIONAL CONSERVATION PLANS
Current reserve selection algorithms have difficulty evaluating connectivity and other factors
necessary to conserve wide-ranging species in developing landscapes. Conversely, population viability analyses may incorporate detailed demographic data but often lack sufficient spa...
Vascular lesions of the vocal fold.
Gökcan, Kürşat Mustafa; Dursun, Gürsel
2009-04-01
The aim of the study was to present symptoms, laryngological findings, clinical course, management modalities, and consequences of vascular lesions of the vocal fold. This study examined 162 patients with vascular lesions, the majority professional voice users, regarding their presenting symptoms, laryngological findings, clinical courses and treatment results. The most common complaint was sudden hoarseness with hemorrhagic polyp. Microlaryngoscopic surgery was performed in 108 cases and the main indication of surgery was the presence of vocal fold mass or development of vocal polyp during clinical course. Cold microsurgery was utilized for removal of vocal fold masses, and feeding vessels were cauterized using a low-power, pulsed CO2 laser. Acoustic analysis of patients revealed a significant improvement of jitter, shimmer and harmonics/noise ratio values after treatment. Based on our clinical findings, we propose a treatment algorithm in which voice rest and behavioral therapy are an integral part and indications for surgery are individualized for each patient.
Emich, I F
1980-08-01
Novel verbal and nonverbal therapeutic techniques are described by means of examples. The aphasic clients range from preschool and school children to young people up to the age of 18. Identification and evaluation of main interest areas enabled individualised combinations of therapeutic measure as well as novel play and/or work situations to be developed, which also involved technical devices: animal voice imitation, play telephone, normal telephone, typewriter, electronic pocket calculator, magic screen, keyed instruments (toy piano, etc.). Rhythmically stressed and "sports" speech training (revolving disc, indoor bicycle, "jouk" sport), hydrotherapy, horseback riding, swimming. Age-adapted conversation, storytelling, motivation through joy and success ("circulus hortativus"), music therapy. Even in cases of extremely delayed treatment, advances may be achieved. Special hints: shorthand therapy, pseudo-phenomena, cotherapy, conversion of right-handed to left.
Wang, Yulin; Tian, Xuelong
2014-08-01
In order to improve the speech quality and auditory perception of electronic cochlear implants under strong background noise, a speech enhancement system for the cochlear implant front end was constructed. With digital signal processing (DSP) at its core, the system combines the DSP's multi-channel buffered serial port (McBSP) data transmission channel with the extended audio interface chip TLV320AIC10 to achieve high-speed speech signal acquisition and output. Because traditional speech enhancement methods suffer from poor adaptability, slow convergence, and large steady-state error, the versiera function and the de-correlation principle were used to improve the existing adaptive filtering algorithm, which effectively enhanced the quality of voice communications. Test results verified the stability of the system and the de-noising performance of the algorithm, and showed that they could provide clearer speech signals for deaf or tinnitus patients.
MOLA II Laser Transmitter Calibration and Performance. 1.2
NASA Technical Reports Server (NTRS)
Afzal, Robert S.; Smith, David E. (Technical Monitor)
1997-01-01
The goal of the document is to explain the algorithm for determining the laser output energy from the telemetry data within the return packets from MOLA II. A simple algorithm is developed to convert the raw start detector data into laser energy, measured in millijoules. This conversion is dependent on three variables, start detector counts, array heat sink temperature and start detector temperature. All these values are contained within the return packets. The conversion is applied to the GSFC Thermal Vacuum data as well as the in-space data to date and shows good correlation.
Apollo 11 voice transcript pertaining to the geology of the landing site
Bailey, N.G.; Ulrich, G.E.
1974-01-01
On July 20, 1969, America's Eagle touched down in southwestern Mare Tranquillitatis beginning man's firsthand exploration of the moon. This document is an edited record of the conversations between astronauts Neil Armstrong and Edwin "Buzz" Aldrin, Jr., at Tranquility Base, and Bruce McCandless at Mission Control in Houston during the approximately 22 hours spent on the lunar surface. It includes additional commentary during their return to Earth. It is a condensation hopefully of all the verbal data having geological significance. All discussions and observations documenting the lunar landscape, its geologic characteristics, the rocks and soils collected, and the photographic record are retained along with supplementary remarks essential to the continuity of events during the mission. We have deleted the words of mechanical housekeeping and engineering data, attempting not to lose the personal and philosophical aspects of this intensely human experience. The sources of this verbal transcript are the complete audio tapes recorded during the mission and the Technical Air-to-Ground Voice Transcription published by NASA. The voice record is listed chronologically given in days, hours, minutes, and seconds. These are the Ground Elapsed Times (GET) after launch from Kennedy Space Center which was 9:32 a.m. EDT on July 16, 1969. Figure 1 shows the vicinity of the landing site that was described, sampled, and photographed by the Apollo 11 crewmen.
Development of an algorithm for controlling a multilevel three-phase converter
NASA Astrophysics Data System (ADS)
Taissariyeva, Kyrmyzy; Ilipbaeva, Lyazzat
2017-08-01
This work is devoted to the development of an algorithm for controlling transistors in a three-phase multilevel conversion system. The algorithm ensures correct operation and describes the state of the transistors at each moment of time when constructing a computer model of a three-phase multilevel converter. The resulting transistor switching scheme keeps the phases of the three-phase converter synchronized and produces a sinusoidal voltage curve at the converter output.
Fazzino, Tera L; Rabinowitz, Terry; Althoff, Robert R; Helzer, John E
2013-12-01
Recently, there has been a gradual shift from inpatient-only electroconvulsive therapy (ECT) toward outpatient administration. Potential advantages include convenience and reduced cost. But providers do not have the same opportunity to monitor treatment response and adverse effects as they do with inpatients. This can obviate some of the potential advantages of outpatient ECT, such as tailoring treatment intervals to clinical response. Scheduling is typically algorithmic rather than empirically based. Daily monitoring through an automated telephone, interactive voice response (IVR), is a potential solution to this quandary. To test feasibility of clinical monitoring via IVR, we recruited 26 patients (69% female; mean age, 51 years) receiving outpatient ECT to make daily IVR reports of affective symptoms and subjective memory for 60 days. The IVR also administered a word recognition task daily to test objective memory. Every seventh day, a longer IVR weekly interview included questions about suicidal ideation. Overall daily call compliance was high (mean, 80%). Most participants (96%) did not consider the calls to be time-consuming. Longitudinal regression analysis using generalized estimating equations revealed that participant objective memory functioning significantly improved during the study (P < 0.05). Of 123 weekly IVR interviews, 41 reports (33%) in 14 patients endorsed suicidal ideation during the previous week. Interactive voice response monitoring of outpatient ECT can provide more detailed clinical information than standard outpatient ECT assessment. Interactive voice response data offer providers a comprehensive, longitudinal picture of patient treatment response and adverse effects as a basis for treatment scheduling and ongoing clinical management.
Buechner, Andreas; Dyballa, Karl-Heinz; Hehrmann, Phillipp; Fredelake, Stefan; Lenarz, Thomas
2014-01-01
Objective To investigate the performance of monaural and binaural beamforming technology with an additional noise reduction algorithm, in cochlear implant recipients. Method This experimental study was conducted as a single subject repeated measures design within a large German cochlear implant centre. Twelve experienced users of an Advanced Bionics HiRes90K or CII implant with a Harmony speech processor were enrolled. The cochlear implant processor of each subject was connected to one of two bilaterally placed state-of-the-art hearing aids (Phonak Ambra) providing three alternative directional processing options: an omnidirectional setting, an adaptive monaural beamformer, and a binaural beamformer. A further noise reduction algorithm (ClearVoice) was applied to the signal on the cochlear implant processor itself. The speech signal was presented from 0° and speech shaped noise presented from loudspeakers placed at ±70°, ±135° and 180°. The Oldenburg sentence test was used to determine the signal-to-noise ratio at which subjects scored 50% correct. Results Both the adaptive and binaural beamformer were significantly better than the omnidirectional condition (5.3 dB±1.2 dB and 7.1 dB±1.6 dB (p<0.001) respectively). The best score was achieved with the binaural beamformer in combination with the ClearVoice noise reduction algorithm, with a significant improvement in SRT of 7.9 dB±2.4 dB (p<0.001) over the omnidirectional alone condition. Conclusions The study showed that the binaural beamformer implemented in the Phonak Ambra hearing aid could be used in conjunction with a Harmony speech processor to produce substantial average improvements in SRT of 7.1 dB. The monaural, adaptive beamformer provided an averaged SRT improvement of 5.3 dB. PMID:24755864
Objective Quality Assessment for Color-to-Gray Image Conversion.
Ma, Kede; Zhao, Tiesong; Zeng, Kai; Wang, Zhou
2015-12-01
Color-to-gray (C2G) image conversion is the process of transforming a color image into a grayscale one. Despite its wide usage in real-world applications, little work has been dedicated to compare the performance of C2G conversion algorithms. Subjective evaluation is reliable but is also inconvenient and time consuming. Here, we make one of the first attempts to develop an objective quality model that automatically predicts the perceived quality of C2G converted images. Inspired by the philosophy of the structural similarity index, we propose a C2G structural similarity (C2G-SSIM) index, which evaluates the luminance, contrast, and structure similarities between the reference color image and the C2G converted image. The three components are then combined depending on image type to yield an overall quality measure. Experimental results show that the proposed C2G-SSIM index has close agreement with subjective rankings and significantly outperforms existing objective quality metrics for C2G conversion. To explore the potentials of C2G-SSIM, we further demonstrate its use in two applications: 1) automatic parameter tuning for C2G conversion algorithms and 2) adaptive fusion of C2G converted images.
Fundamental frequency perturbation indicates perceived health and age in male and female speakers
NASA Astrophysics Data System (ADS)
Feinberg, David R.
2004-05-01
There is strong support for the idea that healthy vocal cords are able to produce fundamental frequencies (F0) with minimal perturbation. Measures of F0 perturbation have been shown to discriminate pathological versus healthy populations. In addition to measuring vocal cord health, F0 perturbation is a correlate of real and perceived age. Here, the role of jitter (periodic variation in F0) and shimmer (periodic variation in amplitude of F0) in perceived health and age in a young adult (males aged 18-33, females aged 18-26), nondysphonic population was investigated. Voices were assessed for health and age by peer-aged, opposite-sex raters. Jitter and shimmer were measured with Praat software (www.praat.org) using various algorithms (jitter: DDP, local, local absolute, PPQ5, and RAP; shimmer: DDA, local, local absolute, APQ3, APQ5, APQ11) to reduce measurement error, and to ascertain the robustness of the findings. Male and female voices were analyzed separately. In both sexes, ratings of health and age were significantly correlated. Measures of jitter and shimmer correlated negatively with perceived health, and positively with perceived age. Further analysis revealed that these effects were independent in male voices. Implications of this finding are that attributions of vocal health and age may reflect actual underlying condition.
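As an illustration of the "local" measures listed above, the sketch below computes the mean absolute difference between consecutive cycles divided by the overall mean. This is the commonly used definition of local jitter (applied to periods) and local shimmer (applied to peak amplitudes). It does not call Praat, and the function name and sample values are assumptions.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values divided by the mean.

    Applied to glottal cycle durations this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (both as fractions)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical cycle durations (seconds) and peak amplitudes for four cycles.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

jitter_local = local_perturbation(periods)
shimmer_local = local_perturbation(amplitudes)
print(f"local jitter: {100 * jitter_local:.2f} %")
print(f"local shimmer: {100 * shimmer_local:.2f} %")
```

The other variants named in the abstract (RAP, PPQ5, APQ3 and so on) differ mainly in how many neighbouring cycles are averaged before the difference is taken.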
A Flexible Analysis Tool for the Quantitative Acoustic Assessment of Infant Cry
Reggiannini, Brian; Sheinkopf, Stephen J.; Silverman, Harvey F.; Li, Xiaoxue; Lester, Barry M.
2015-01-01
Purpose In this article, the authors describe and validate the performance of a modern acoustic analyzer specifically designed for infant cry analysis. Method Utilizing known algorithms, the authors developed a method to extract acoustic parameters describing infant cries from standard digital audio files. They used a frame rate of 25 ms with a frame advance of 12.5 ms. Cepstral-based acoustic analysis proceeded in 2 phases, computing frame-level data and then organizing and summarizing this information within cry utterances. Using signal detection methods, the authors evaluated the accuracy of the automated system to determine voicing and to detect fundamental frequency (F0) as compared to voiced segments and pitch periods manually coded from spectrogram displays. Results The system detected F0 with 88% to 95% accuracy, depending on tolerances set at 10 to 20 Hz. Receiver operating characteristic analyses demonstrated very high accuracy at detecting voicing characteristics in the cry samples. Conclusions This article describes an automated infant cry analyzer with high accuracy to detect important acoustic features of cry. A unique and important aspect of this work is the rigorous testing of the system’s accuracy as compared to ground-truth manual coding. The resulting system has implications for basic and applied research on infant cry development. PMID:23785178
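The framing described in the Method (25 ms frames advanced by 12.5 ms) can be sketched as sample index ranges. The function name, the 16 kHz sample rate in the example, and the choice to drop a trailing partial frame are assumptions; the abstract does not give these details.

```python
def frame_bounds(num_samples, sample_rate_hz, frame_ms=25.0, advance_ms=12.5):
    """Return (start, end) sample indices for overlapping analysis frames.

    Only complete frames are returned; a trailing partial frame is dropped."""
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    hop = int(round(sample_rate_hz * advance_ms / 1000.0))
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and advance must be at least one sample")
    return [(start, start + frame_len)
            for start in range(0, num_samples - frame_len + 1, hop)]


# At an assumed 16 kHz, a frame is 400 samples and the advance is 200 samples.
frames = frame_bounds(num_samples=1000, sample_rate_hz=16000)
print(frames)
```

Each frame overlaps its neighbour by half, so every sample away from the edges contributes to two frame-level estimates before they are summarized per cry utterance.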
On a Chirplet Transform Based Method for Co-channel Voice Separation
NASA Astrophysics Data System (ADS)
Dugnol, B.; Fernández, C.; Galiano, G.; Velasco, J.
We use signal and image theory based algorithms to produce estimations of the number of wolves emitting howls or barks in a given field recording as an individuals counting alternative to the traditional trace collecting methodologies. We proceed in two steps. Firstly, we clean and enhance the signal by using PDE based image processing algorithms applied to the signal spectrogram. Secondly, assuming that the wolves chorus may be modelled as an addition of nonlinear chirps, we use the quadratic energy distribution corresponding to the Chirplet Transform of the signal to produce estimates of the corresponding instantaneous frequencies, chirp-rates and amplitudes at each instant of the recording. We finally establish suitable criteria to decide how such estimates are connected in time.
Defense switched network technology and experiments program
NASA Astrophysics Data System (ADS)
Weinstein, C. J.
1983-09-01
This report documents work performed during FY 1983 on the DCA-sponsored Defense Switched Network Technology and Experiments Program. The areas of work reported are: (1) development of routing algorithms for application in the Defense Switched Network (DSN); (2) instrumentation and integration of the Experimental Integrated Switched Network (EISN) test facility; (3) development and test of data communication techniques using DoD-standard data protocols in an integrated voice/data network; and (4) EISN system coordination and experiment planning.
A Fully Distributed Approach to the Design of a KBIT/SEC VHF Packet Radio Network,
1984-02-01
topological change and consequent outmoded routing data. Algorithm development has been aided by computer simulation using a finite state machine technique...development has been aided by computer simulation using a finite state machine technique to model a realistic network of up to fifty nodes. This is...use of computer based equipments in weapons systems and their associated sensors and command and control elements and the trend from voice to data
How Like Perceives Like: Gay People on "Gaydar".
Barton, Bernadette
2015-01-01
When lacking explicit knowledge of someone's sexual orientation, gay people commonly assess the likelihood that another is gay using their "gaydar." The term gaydar is a playful mix of the word gay with radar, suggesting that one can sense, intuit, or perceive some set of characteristics in another that signal a shared minority status. While commonly mentioned, the exact criteria a gay person uses when employing their gaydar are little discussed. Drawing methodologically on a series of five focus groups of self-identified lesbians and gay men, this study explores the physical, visual, energetic, and conversational cues gay people consider when they employ the trope of gaydar. Specifically, interview subjects most often described their gaydar as triggered by the following elements: physical presentation, including mannerisms, dress, and voice; interactions, especially eye contact; a presence or absence of certain conversational social norms; and, intangibly, as a kind of energetic exchange.
Kuo, Chung-Feng Jeffrey; Wang, Hsing-Won; Hsiao, Shang-Wun; Peng, Kai-Ching; Chou, Ying-Liang; Lai, Chun-Yu; Hsu, Chien-Tung Max
2014-01-01
Physicians clinically use the laryngeal video stroboscope as an auxiliary instrument to test for glottal diseases, reading vocal fold images and voice quality for diagnosis. As the position of the vocal fold varies from person to person, the proportion of the vocal fold size as presented in the vocal fold image differs, making it impossible to directly estimate relevant glottis physiological parameters, such as the length, area, perimeter, and opening angle of the glottis. Hence, this study designs an innovative laser projection marking module for the laryngeal video stroboscope to provide reference parameters for image scaling conversion. The module is installed on the laryngeal video stroboscope and uses laser beams projected onto the glottis plane to provide these reference parameters. Copyright © 2013 Elsevier Ltd. All rights reserved.
Automatic voice recognition using traditional and artificial neural network approaches
NASA Technical Reports Server (NTRS)
Botros, Nazeih M.
1989-01-01
The main objective of this research is to develop an algorithm for isolated-word recognition. This research is focused on digital signal analysis rather than linguistic analysis of speech. Feature extraction is carried out by applying a Linear Predictive Coding (LPC) algorithm of order 10. Continuous-word and speaker-independent recognition will be considered in a future study after this isolated-word research is accomplished. To examine the similarity between the reference and the training sets, two approaches are explored. The first implements traditional pattern recognition techniques, where a dynamic time warping algorithm is applied to align the two sets and calculate the probability of matching by measuring the Euclidean distance between the two sets. The second implements a backpropagation artificial neural net model with three layers as the pattern classifier. The adaptation rule implemented in this network is the generalized least mean square (LMS) rule. The first approach has been accomplished. A vocabulary of 50 words was selected and tested. The accuracy of the algorithm was found to be around 85 percent. The second approach is in progress at the present time.
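The first approach, dynamic time warping with a Euclidean local distance, can be sketched as below. This is the textbook DTW recurrence with a nearest-template decision, not the report's exact procedure. The function names and the toy one-dimensional "feature vectors" are assumptions (real input would be order-10 LPC vectors per frame).

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated Euclidean distance along the best monotonic alignment
    of two sequences of equal-dimension feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b
                                     cost[i][j - 1],      # stretch seq_a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the reference template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy one-dimensional features standing in for per-frame LPC vectors.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,)],
    "no": [(2.0,), (1.0,), (0.0,)],
}
print(recognize([(0.0,), (1.0,), (1.0,), (2.0,)], templates))
```

Because the alignment may repeat frames, the slower utterance in the example still matches the "yes" template at zero cost.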
Multicast routing for wavelength-routed WDM networks with dynamic membership
NASA Astrophysics Data System (ADS)
Huang, Nen-Fu; Liu, Te-Lung; Wang, Yao-Tzung; Li, Bo
2000-09-01
Future broadband networks must support integrated services and offer flexible bandwidth usage. In our previous work, we explored an optical link control layer on top of the optical layer that enables bandwidth-on-demand service directly over wavelength division multiplexed (WDM) networks. Today, more and more applications and services, such as video-conferencing software and Virtual LAN service, require multicast support from the underlying networks. Currently, it is difficult to provide wavelength multicast over optical switches without optical/electronic conversions, although the conversions incur extra cost. In this paper, based on the proposed wavelength router architecture (equipped with ATM switches to offer O/E and E/O conversions when necessary), a dynamic multicast routing algorithm is proposed to furnish multicast services over WDM networks. The goal is to join a new group member to the multicast tree so that the cost, including the link cost and the optical/electronic conversion cost, is kept as low as possible. The effectiveness of the proposed wavelength router architecture as well as the dynamic multicast algorithm is evaluated by simulation.
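One simple way to attach a new member at low cost is to run a shortest-path search from the new member until it reaches any node already on the multicast tree. The sketch below illustrates that greedy join and is not the paper's algorithm. The function name and example topology are assumptions, and each edge weight is assumed to combine link cost and any conversion cost.

```python
import heapq


def join_path(graph, tree_nodes, new_member):
    """Cheapest path from new_member to any node already on the multicast tree.

    graph maps node -> {neighbor: cost}. Returns (cost, path), or None if the
    tree cannot be reached."""
    if new_member in tree_nodes:
        return 0.0, [new_member]
    best = {new_member: 0.0}
    prev = {}
    heap = [(0.0, new_member)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue                      # stale heap entry
        if node in tree_nodes:            # first tree node popped is the cheapest
            path = [node]
            while path[-1] != new_member:
                path.append(prev[path[-1]])
            return cost, path[::-1]
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return None


# Hypothetical undirected topology; the current tree spans nodes A and B.
graph = {
    "A": {"B": 1, "D": 5},
    "B": {"A": 1, "C": 2},
    "C": {"B": 2, "D": 1},
    "D": {"A": 5, "C": 1, "E": 1},
    "E": {"D": 1},
}
print(join_path(graph, tree_nodes={"A", "B"}, new_member="E"))
```

Here member E attaches through D and C to tree node B at a total cost of 4, rather than through the more expensive D-A link.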
1976-11-11
exchange. The basis for this choice was derived from several factors. One was a timing analysis that was made for certain basic time-critical software...Candidate system designs were developed and examined with respect to their capability to demonstrate the workability of the basic concept and for factors...algorithm requires a bit time completion, while SOF production allows byte timing and the involved SOF correlation procedure may be performed during
Template Based Low Data Rate Speech Encoder
1993-09-30
Nasality: distinguishes /n/ from /d/, /m/ from /b/, etc.; 95.6, 96.9. Sustention: distinguishes /f/ from /p/, /b/ from /v/, /t/ from /θ/, etc.; 87.5, 88.3. Sibilation...processor performs mainly Processor Workstation input/output (I/O) operations. The dynamic random access memory (DRAM) has 16 million bytes of...storage capacity. To execute the 800-b/s voice algorithm, the following amount of memory is needed: 5 MB for tables, 1.5 MB for the program, and 30 KB for
2018-04-01
Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions...2006. Since that time, SS-RICS has been the integration platform for many robotics algorithms using a variety of different disciplines from cognitive...voice recognition. Each noise level was run 10 times per gender, yielding 60 total runs. Two paths were chosen for testing (Paths A and B) of
1989-01-20
addressable memory can be loaded or off-loaded as the number crunching continues. Modern VLSI processors can often process data faster than today’s...Available DSP Chips Texas Instruments was one of the first serious manufacturers of DSP chips. With the Texas Instruments TMS310 DSP chip, modem, voice...Can handle double precision data types. Texas Instruments TMS32010 TI’s first-generation DSP design: a fixed-point DSP that has found its way into modem
NASA Astrophysics Data System (ADS)
He, A.; Quan, C.
2018-04-01
The principal component analysis (PCA) and region matching combined method is effective for fringe direction estimation. However, its mask construction algorithm for region matching fails in some circumstances, and the algorithm for conversion of orientation to direction in mask areas is computationally-heavy and non-optimized. We propose an improved PCA based region matching method for the fringe direction estimation, which includes an improved and robust mask construction scheme, and a fast and optimized orientation-direction conversion algorithm for the mask areas. Along with the estimated fringe direction map, filtered fringe pattern by automatic selective reconstruction modification and enhanced fast empirical mode decomposition (ASRm-EFEMD) is used for Hilbert spiral transform (HST) to demodulate the phase. Subsequently, windowed Fourier ridge (WFR) method is used for the refinement of the phase. The robustness and effectiveness of proposed method are demonstrated by both simulated and experimental fringe patterns.
The ContiNet of the International Continence Society.
Lim, P H; Fonda, D
1997-01-01
This is an account of the International Continence Society's ContiNet--the web server linking up continence organisations worldwide with provision to upload or download vast data stores of information on continence via e-mail, FTP, mailing lists, and special tools to seek information using "search engines." Special communication devices using internet voice/phone mail and real-time "text" or "voice" chats permit conversation globally over normal phone lines linked to the Net at local telephone rates. Special features of ContiNet include announcements of upcoming conventions, information for professionals and laypeople, and the capability to conduct research via the net and conduct consultations and discussions via newsgroups. In-built devices requiring special IDs and passwords permit privacy and security for users. Simple instructions are provided on how to get your PC up and running and get connected to fellow members of ICS, link up with national continence societies, or simply surf for professional enrichment and leisure. With the advent of advanced multimedia capabilities, the current poor quality videoconferencing on the Net will be replaced by excellent videophones by 1998.
Alternating motion rate as an index of speech motor disorder in traumatic brain injury.
Wang, Yu-Tsai; Kent, Ray D; Duffy, Joseph R; Thomas, Jack E; Weismer, Gary
2004-01-01
The task of syllable alternating motion rate (AMR) (also called diadochokinesis) is suitable for examining speech disorders of varying degrees of severity and in individuals with varying levels of linguistic and cognitive ability. However, very limited information on this task has been published for subjects with traumatic brain injury (TBI). This study is a quantitative and qualitative acoustic analysis of AMR in seven subjects with TBI. The primary goal was to use acoustic analyses to assess speech motor control disturbances for the group as a whole and for individual patients. Quantitative analyses included measures of syllable rate, syllable and intersyllable gap durations, energy maxima, and voice onset time (VOT). Qualitative analyses included classification of features evident in spectrograms and waveforms to provide a more detailed description. The TBI group had (1) a slowed syllable rate due mostly to lengthened syllables and, to a lesser degree, lengthened intersyllable gaps, (2) highly correlated syllable rates between AMR and conversation, (3) temporal and energy maxima irregularities within repetition sequences, (4) normal median VOT values but with large variation, and (5) a number of speech production abnormalities revealed by qualitative analysis, including explosive speech quality, breathy voice quality, phonatory instability, multiple or missing stop bursts, continuous voicing, and spirantization. The relationships between these findings and TBI speakers' neurological status and dysarthria types are also discussed. It was concluded that acoustic analyses of the AMR task provide specific information on motor speech limitations in individuals with TBI.
Quantum digital-to-analog conversion algorithm using decoherence
NASA Astrophysics Data System (ADS)
SaiToh, Akira
2015-08-01
We consider the problem of mapping digital data encoded on a quantum register to analog amplitudes in parallel. It is shown to be unlikely that a fully unitary polynomial-time quantum algorithm exists for this problem; NP becomes a subset of BQP if it exists. From a practical point of view, we propose a nonunitary linear-time algorithm using quantum decoherence. It tacitly uses an exponentially large physical resource, which is typically a huge number of identical molecules. Quantumness of correlation appearing in the process of the algorithm is also discussed.
A maximum power point tracking algorithm for buoy-rope-drum wave energy converters
NASA Astrophysics Data System (ADS)
Wang, J. Q.; Zhang, X. C.; Zhou, Y.; Cui, Z. C.; Zhu, L. S.
2016-08-01
Maximum power point tracking control is key to improving the energy conversion efficiency of wave energy converters (WECs). This paper presents a novel variable step size Perturb and Observe maximum power point tracking algorithm with a power classification standard for control of a buoy-rope-drum WEC. The algorithm and the simulation model of the buoy-rope-drum WEC are presented in detail, together with simulation results. The results show that the algorithm tracks the maximum power point of the WEC quickly and accurately.
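A generic variable step size Perturb and Observe loop can be sketched as follows. This is only an illustration of the general technique: the power classification standard from the paper is not modelled, the step rule (step proportional to the observed slope, clamped between `min_step` and `max_step`) is an assumption, and the quadratic `toy_power` curve is a hypothetical stand-in for a WEC model.

```python
def perturb_and_observe(power, x0, steps=200, gain=0.05,
                        min_step=1e-4, max_step=0.5):
    """Climb toward the maximum of power(x) using slope-scaled perturbations."""
    x_prev, p_prev = x0, power(x0)
    x = x0 + max_step  # initial perturbation to obtain a first slope estimate
    for _ in range(steps):
        p = power(x)
        dx, dp = x - x_prev, p - p_prev
        slope = dp / dx if dx != 0 else 0.0
        # Large steps far from the peak, small steps near it (assumed rule).
        step = min(max(gain * abs(slope), min_step), max_step)
        direction = 1.0 if slope >= 0 else -1.0
        x_prev, p_prev = x, p
        x += direction * step
    return x

def toy_power(x):
    """Hypothetical power curve with its maximum at x = 3."""
    return 100.0 - (x - 3.0) ** 2

x_opt = perturb_and_observe(toy_power, x0=0.0)
```

Because the step never drops below `min_step`, the operating point keeps dithering slightly around the peak, which is the usual price of a perturbation-based tracker.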
Hotplate precipitation gauge calibrations and field measurements
NASA Astrophysics Data System (ADS)
Zelasko, Nicholas; Wettlaufer, Adam; Borkhuu, Bujidmaa; Burkhart, Matthew; Campbell, Leah S.; Steenburgh, W. James; Snider, Jefferson R.
2018-01-01
First introduced in 2003, approximately 70 Yankee Environmental Systems (YES) hotplate precipitation gauges have been purchased by researchers and operational meteorologists. A version of the YES hotplate is described in Rasmussen et al. (2011; R11). Presented here is testing of a newer version of the hotplate; this device is equipped with longwave and shortwave radiation sensors. Hotplate surface temperature, coefficients describing natural and forced convective sensible energy transfer, and radiative properties (longwave emissivity and shortwave reflectance) are reported for two of the new-version YES hotplates. These parameters are applied in a new algorithm and are used to derive liquid-equivalent accumulations (snowfall and rainfall), and these accumulations are compared to values derived by the internal algorithm used in the YES hotplates (hotplate-derived accumulations). In contrast with R11, the new algorithm accounts for radiative terms in a hotplate's energy budget, applies an energy conversion factor which does not differ from a theoretical energy conversion factor, and applies a surface area that is correct for the YES hotplate. Radiative effects are shown to be relatively unimportant for the precipitation events analyzed. In addition, this work documents a 10 % difference between the hotplate-derived and new-algorithm-derived accumulations. This difference seems consistent with R11's application of a hotplate surface area that deviates from the actual surface area of the YES hotplate and with R11's recommendation for an energy conversion factor that differs from that calculated using thermodynamic theory.
Quality of service routing in wireless ad hoc networks
NASA Astrophysics Data System (ADS)
Sane, Sachin J.; Patcha, Animesh; Mishra, Amitabh
2003-08-01
An efficient routing protocol is essential to guarantee application-level quality of service (QoS) in wireless ad hoc networks. In this paper we propose a novel routing algorithm that computes a path between a source and a destination by considering several important constraints such as path-life, availability of sufficient energy as well as buffer space in each of the nodes on the path between the source and destination. The algorithm chooses the best path from among the multiple paths that it computes between two endpoints. We consider the use of control packets that run at a priority higher than the data packets in determining the multiple paths. The paper also examines the impact of different schedulers such as weighted fair queuing, and weighted random early detection among others in preserving the QoS level guarantees. Our extensive simulation results indicate that the algorithm improves the overall lifetime of a network, reduces the number of dropped packets, and decreases the end-to-end delay for real-time voice application.
Automatic speech recognition research at NASA-Ames Research Center
NASA Technical Reports Server (NTRS)
Coler, Clayton R.; Plummer, Robert P.; Huff, Edward M.; Hitchcock, Myron H.
1977-01-01
A trainable acoustic pattern recognizer manufactured by Scope Electronics is presented. The voice command system VCS encodes speech by sampling 16 bandpass filters with center frequencies in the range from 200 to 5000 Hz. Variations in speaking rate are compensated for by a compression algorithm that subdivides each utterance into eight subintervals in such a way that the amount of spectral change within each subinterval is the same. The recorded filter values within each subinterval are then reduced to a 15-bit representation, giving a 120-bit encoding for each utterance. The VCS incorporates a simple recognition algorithm that utilizes five training samples of each word in a vocabulary of up to 24 words. The recognition rate of approximately 85 percent correct for untrained speakers and 94 percent correct for trained speakers was not considered adequate for flight systems use. Therefore, the built-in recognition algorithm was disabled, and the VCS was modified to transmit 120-bit encodings to an external computer for recognition.
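The time-normalization idea (eight subintervals with equal spectral change, 15 bits each, 8 x 15 = 120 bits per utterance) can be sketched as below. The segmentation follows the description above; the 15-bit reduction shown here (one bit per adjacent pair of the 16 filter channels) is purely an assumption for illustration, since the abstract does not say how the VCS derives its 15 bits.

```python
def segment_by_spectral_change(frames, n_segments=8):
    """Split frames so each segment holds roughly equal cumulative spectral change."""
    change = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        change.append(change[-1] + sum(abs(a - b) for a, b in zip(prev, cur)))
    total = change[-1]
    segments = [[] for _ in range(n_segments)]
    for frame, c in zip(frames, change):
        k = min(int(n_segments * c / total), n_segments - 1) if total > 0 else 0
        segments[k].append(frame)
    return segments

def encode_segment(segment, n_filters=16):
    """Assumed reduction: 1 if the mean of filter i+1 exceeds filter i, else 0."""
    if not segment:
        return [0] * (n_filters - 1)  # empty segment: all-zero placeholder
    mean = [sum(f[i] for f in segment) / len(segment) for i in range(n_filters)]
    return [1 if mean[i + 1] > mean[i] else 0 for i in range(n_filters - 1)]

def encode_utterance(frames):
    bits = []
    for segment in segment_by_spectral_change(frames):
        bits.extend(encode_segment(segment))
    return bits

# Synthetic utterance: 40 frames of 16 filter values (made-up data).
frames = [[(t * (i + 1)) % 7 for i in range(16)] for t in range(40)]
segments = segment_by_spectral_change(frames)
code = encode_utterance(frames)
```

Whatever the speaking rate, the output length stays fixed at 120 bits, which is what makes utterances directly comparable.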
Hodges, Nathan
2015-01-01
You write this narrative autoethnography to open up a conversation about our chemical lives. You go through your day with chemical mindfulness, questioning taken-for-granted ideas about natural and artificial, healthy and unhealthy, dependency and addiction, trying to understand the chemical messages we consume through the experiences of everyday life. You reflect on how messages about chemicals influence and structure our lives and why some chemicals are celebrated and some are condemned. Using a second-person narrative voice, you show how the personal is relational and the chemical is cultural. You write because you seek a connection, a chemical bond.
Assessment of vocal cord nodules: a case study in speech processing by using Hilbert-Huang Transform
NASA Astrophysics Data System (ADS)
Civera, M.; Filosi, C. M.; Pugno, N. M.; Silvestrini, M.; Surace, C.; Worden, K.
2017-05-01
Vocal cord nodules represent a pathological condition in which the growth of unnatural masses on the vocal folds affects the patients. Among other effects, changes in the vocal cords’ overall mass and stiffness alter their vibratory behaviour, thus changing the vocal emission generated by them. This causes dysphonia, i.e. abnormalities in the patients’ voice, which can be analysed and inspected via audio signals. However, the evaluation of voice condition through speech processing is not a trivial task, as standard methods based on the Fourier Transform fail to fit the non-stationary nature of vocal signals. In this study, four audio tracks, provided by a volunteer patient whose vocal fold nodules have been surgically removed, were analysed using a relatively new technique: the Hilbert-Huang Transform (HHT) via Empirical Mode Decomposition (EMD); specifically, by using the CEEMDAN (Complete Ensemble EMD with Adaptive Noise) algorithm. This method has been applied here to speech signals, which were recorded before removal surgery and during convalescence, to investigate specific trends. Possibilities offered by the HHT are presented, and some limitations of decomposing the signals into so-called intrinsic mode functions (IMFs) are highlighted. The results of these preliminary studies are intended to be a basis for the development of new viable alternatives to the software currently used for the analysis and evaluation of pathological voice.
What does it mean to be a critical scholar? A metalogue between science education doctoral students
NASA Astrophysics Data System (ADS)
Cian, Heidi; Dsouza, Nikeetha; Lyons, Renee; Alston, Daniel
2017-06-01
This manuscript is written in response to Lydia Burke and Jesse Bazzul's article Locating a space of criticality as new scholars in science education. As doctoral students finding our place in the culture of science education, we respond by discussing our journeys towards the development of a scholarly identity, with particular focus on whether or how we see ourselves as critical scholars. Since each of us authoring this paper has a different perspective, a metalogue format is utilized to ensure all of our voices and journeys are represented. We use the Burke and Bazzul article as a platform for conversations about challenges faced for emerging scholars in the field of science education and explore how we see our role in responding to these challenges. Specifically, we discuss the barriers to publication, dissemination of research to practitioners, and how to approach these problems from a grounding in critical theory. As a result of our conversations, we conclude that there is a need to reshape the field of science education to invite more unorthodox research perspectives, methodologies, and publication formats. To do so, the issues we explore require a continued conversation between emerging scholars, practicing researchers, and practicing educators.
Enabling IP Header Compression in COTS Routers via Frame Relay on a Simplex Link
NASA Technical Reports Server (NTRS)
Nguyen, Sam P.; Pang, Jackson; Clare, Loren P.; Cheng, Michael K.
2010-01-01
NASA is moving toward a network-centric communications architecture and, in particular, is building toward use of Internet Protocol (IP) in space. The use of IP is motivated by its ubiquitous application in many communications networks and in available commercial off-the-shelf (COTS) technology. The Constellation Program intends to fit two or more voice (over IP) channels on both the forward link to, and the return link from, the Orion Crew Exploration Vehicle (CEV) during all mission phases. Efficient bandwidth utilization of the links is key for voice applications. In Voice over IP (VoIP), the IP packets are limited to small sizes to keep voice latency at a minimum. The common voice codec used in VoIP is G.729. This new algorithm produces voice audio at 8 kbps and in packets of 10-millisecond duration. Constellation has designed the VoIP communications stack to use the combination of IP/UDP/RTP protocols where IP carries a 20-byte header, UDP (User Datagram Protocol) carries an 8-byte header, and RTP (Real Time Transport Protocol) carries a 12-byte header. The protocol headers total 40 bytes and are equal in length to a 40-byte G.729 payload, doubling the VoIP latency. Since much of the IP/UDP/RTP header information does not change from IP packet to IP packet, IP/UDP/RTP header compression can avoid transmission of much redundant data as well as reduce VoIP latency. The benefits of IP header compression are more pronounced on low-data-rate links such as the forward and return links during CEV launch. IP/UDP/RTP header compression codecs are well supported by many COTS routers. A common interface to the COTS routers is through frame relay. However, enabling IP header compression over frame relay, according to industry standard (Frame Relay IP Header Compression Agreement FRF.20), requires a duplex link and negotiations between the compressor router and the decompressor router. 
In Constellation, each forward link to, and return link from, the CEV in space is treated independently as a simplex link. Without negotiation, the COTS routers are prevented from entering into the IP header compression mode, and no IP header compression would be performed. An algorithm is proposed to enable IP header compression in COTS routers on a simplex link with no negotiation or with one-way messaging. In doing so, COTS routers can enter IP header compression mode without the need to handshake through a bidirectional link as required by FRF.20. This technique would spoof the routers locally and thereby allow the routers to enter into IP header compression mode without having the negotiations between routers actually occur. The spoofing function is conducted by a frame relay adapter (also COTS) with the capability to generate control messages according to the FRF.20 descriptions. Therefore, negotiation is actually performed between the FRF.20 adapter and the connecting COTS router locally and never occurs over the space link. Through understanding of the handshaking protocol described by FRF.20, the necessary FRF.20 negotiation messages can be generated to control the connecting router, not only to turn on IP header compression but also to adjust the compression parameters. The FRF.20 negotiation (or control) message is composed in the FRF.20 adapter by interpreting the incoming router request message. Many of the fields are simply transcribed from request to response while the control fields indicating response and type are modified.
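The header arithmetic in this abstract can be checked with a short sketch. The header sizes (20, 8, and 12 bytes), the 8 kbps rate, and the 10 ms frame come from the abstract; the 4-byte compressed header is an illustrative assumption, not a figure from the source.

```python
IP_HEADER, UDP_HEADER, RTP_HEADER = 20, 8, 12  # bytes, as stated in the abstract

def frame_bytes(bitrate_bps, frame_ms):
    """Payload bytes produced by a codec in one frame."""
    return bitrate_bps * frame_ms // 8000

def header_fraction(payload_bytes, header_bytes):
    """Share of each packet that is protocol header rather than voice."""
    return header_bytes / (header_bytes + payload_bytes)

headers = IP_HEADER + UDP_HEADER + RTP_HEADER      # 40 bytes
one_frame = frame_bytes(8000, 10)                  # one 10 ms frame at 8 kbps
uncompressed = header_fraction(40, headers)        # 40-byte payload from the abstract
ASSUMED_COMPRESSED_HEADER = 4                      # hypothetical size after compression
compressed = header_fraction(40, ASSUMED_COMPRESSED_HEADER)
```

At 8 kbps a single 10 ms frame is 10 bytes, so a 40-byte payload corresponds to four such frames. With uncompressed headers, half of every packet is overhead; under the assumed 4-byte compressed header the overhead falls to roughly 9 percent.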
NASA Astrophysics Data System (ADS)
Chakraborty, Tamal; Saha Misra, Iti
2016-03-01
Secondary Users (SUs) in a Cognitive Radio Network (CRN) face unpredictable interruptions in transmission due to the random arrival of Primary Users (PUs), leading to spectrum handoff or dropping instances. An efficient spectrum handoff algorithm, thus, becomes one of the indispensable components in CRN, especially for real-time communication like Voice over IP (VoIP). In this regard, this paper investigates the effects of spectrum handoff on the Quality of Service (QoS) for VoIP traffic in CRN, and proposes a real-time spectrum handoff algorithm in two phases. The first phase (VAST-VoIP based Adaptive Sensing and Transmission) adaptively varies the channel sensing and transmission durations to perform intelligent dropping decisions. The second phase (ProReact-Proactive and Reactive Handoff) deploys efficient channel selection mechanisms during spectrum handoff for resuming communication. Extensive performance analysis in analytical and simulation models confirms a decrease in spectrum handoff delay for VoIP SUs by more than 40% and 60%, compared to existing proactive and reactive algorithms, respectively and ensures a minimum 10% reduction in call-dropping probability with respect to the previous works in this domain. The effective SU transmission duration is also maximized under the proposed algorithm, thereby making it suitable for successful VoIP communication.
Changes in brain activity following intensive voice treatment in children with cerebral palsy.
Bakhtiari, Reyhaneh; Cummine, Jacqueline; Reed, Alesha; Fox, Cynthia M; Chouinard, Brea; Cribben, Ivor; Boliek, Carol A
2017-09-01
Eight children (3 females; 8-16 years) with motor speech disorders secondary to cerebral palsy underwent 4 weeks of an intensive neuroplasticity-principled voice treatment protocol, LSVT LOUD®, followed by a structured 12-week maintenance program. Children were asked to overtly produce phonation (ah) at conversational loudness, cued-phonation at perceived twice-conversational loudness, a series of single words, and a prosodic imitation task while being scanned using fMRI, immediately pre- and post-treatment and 12 weeks following a maintenance program. Eight age- and sex-matched controls were scanned at each of the same three time points. Based on the speech and language literature, 16 bilateral regions of interest were selected a priori to detect potential neural changes following treatment. Reduced neural activity in the motor areas (decreased motor system effort) before and immediately after treatment, and increased activity in the anterior cingulate gyrus after treatment (increased contribution of decision making processes) were observed in the group with cerebral palsy compared to the control group. Using graphical models, post-treatment changes in connectivity were observed between the left supramarginal gyrus and the right supramarginal gyrus and the left precentral gyrus for the children with cerebral palsy, suggesting LSVT LOUD enhanced contributions of the feedback system in the speech production network instead of high reliance on feedforward control system and the somatosensory target map for regulating vocal effort. Network pruning indicates greater processing efficiency and the recruitment of the auditory and somatosensory feedback control systems following intensive treatment. Hum Brain Mapp 38:4413-4429, 2017. © 2017 Wiley Periodicals, Inc.
Intelligent Medical Systems for Aerospace Emergency Medical Services
NASA Technical Reports Server (NTRS)
Epler, John; Zimmer, Gary
2004-01-01
The purpose of this project is to develop a portable, hands-free device for emergency medical decision support to be used in remote or confined settings by non-physician providers. Phase I of the project will entail the development of a voice-activated device that will utilize an intelligent algorithm to provide guidance in establishing an airway in an emergency situation. The interactive, hands-free software will process requests for assistance based on verbal prompts and algorithmic decision-making. The device will allow the CMO to attend to the patient while receiving verbal instruction. The software will also feature graphic representations where they are judged helpful in aiding procedures. We will also develop a training program to orient users to the algorithmic approach, the use of the hardware and specific procedural considerations. We will validate the efficacy of this mode of technology application by testing in the Johns Hopkins Department of Emergency Medicine. Phase I of the project will focus on the validation of the proposed algorithm, testing and validation of the decision making tool and modifications of medical equipment. In Phase II, we will produce the first generation software for hands-free, interactive medical decision making for use in acute care environments.
Hodgetts, William E; Scollie, Susan D
2017-07-01
To develop an algorithm that prescribes targets for bone conduction frequency response shape, compression, and output limiting, along with a clinical method that ensures accurate transforms between assessment and verification stages of the clinical workflow. Technical report of target generation and validation. We recruited 39 adult users of unilateral percutaneous bone conduction hearing aids with a range of unilateral, bilateral, mixed and conductive hearing losses across the sample. The initial algorithm over-prescribed output compared to the user's own settings in the low frequencies, but provided a good match to user settings in the high frequencies. Corrections to the targets were derived and implemented as a low-frequency cut aimed at improving acceptance of the wearer's own voice during device use. The DSL-BCD prescriptive algorithm is compatible with verification of devices and fine-tuning to target for percutaneous bone conduction hearing devices that can be coupled to a skull simulator. Further study is needed to investigate the appropriateness of this prescriptive algorithm for other input levels, and for other clinical populations including those with single-sided deafness, bilateral devices, children and users of transcutaneous bone conduction hearing aids.
Novel grid-based optical Braille conversion: from scanning to wording
NASA Astrophysics Data System (ADS)
Yoosefi Babadi, Majid; Jafari, Shahram
2011-12-01
Grid-based optical Braille conversion (GOBCO) is explained in this article. The grid-fitting technique involves processing scanned images taken from old hard-copy Braille manuscripts, recognising them, and converting them into English ASCII text documents inside a computer. The resulting words are verified using the relevant dictionary to provide the final output. The algorithms employed in this article can be easily modified to be implemented on other visual pattern recognition systems and text extraction applications. This technique has several advantages, including simplicity of the algorithm, high speed of execution, the ability to help visually impaired persons and blind people to work with fax machines and the like, and the ability to help sighted people with no prior knowledge of Braille to understand hard-copy Braille manuscripts.
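The last stage of such a pipeline, turning a fitted grid cell into a character, can be sketched as below. This is not the GOBCO implementation: it assumes dot detection and grid fitting have already produced a 3 x 2 matrix of 0/1 values per cell, and it covers only the letters a to j for brevity. The dot numbering (1 to 3 down the left column, 4 to 6 down the right) is the standard Braille convention.

```python
# Standard Braille dot patterns for the first ten letters.
BRAILLE_A_TO_J = {
    frozenset({1}): "a", frozenset({1, 2}): "b", frozenset({1, 4}): "c",
    frozenset({1, 4, 5}): "d", frozenset({1, 5}): "e", frozenset({1, 2, 4}): "f",
    frozenset({1, 2, 4, 5}): "g", frozenset({1, 2, 5}): "h",
    frozenset({2, 4}): "i", frozenset({2, 4, 5}): "j",
}

def cell_to_dots(cell):
    """Map a 3-row x 2-column 0/1 grid to the set of raised dot numbers."""
    return frozenset(col * 3 + row + 1
                     for row in range(3) for col in range(2) if cell[row][col])

def decode(cells):
    """Translate grid cells to text; unknown patterns become '?'."""
    return "".join(BRAILLE_A_TO_J.get(cell_to_dots(c), "?") for c in cells)

# Hypothetical scanned cells spelling "bad".
cells = [
    [[1, 0], [1, 0], [0, 0]],  # dots 1, 2
    [[1, 0], [0, 0], [0, 0]],  # dot 1
    [[1, 1], [0, 1], [0, 0]],  # dots 1, 4, 5
]
text = decode(cells)
```

A dictionary check like the one the abstract mentions would then run on `text` to catch cells that were misread.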
NASA Astrophysics Data System (ADS)
Holtzman, B. K.; Paté, A.; Paisley, J.; Waldhauser, F.; Repetto, D.; Boschi, L.
2017-12-01
The earthquake process reflects complex interactions of stress, fracture and frictional properties. New machine learning methods reveal patterns in time-dependent spectral properties of seismic signals and enable identification of changes in faulting processes. Our methods are based closely on those developed for music information retrieval and voice recognition, using the spectrogram instead of the waveform directly. Unsupervised learning involves identification of patterns based on differences among signals without any additional information provided to the algorithm. Clustering of 46,000 earthquakes of $0.3
An Innovative Thinking-Based Intelligent Information Fusion Algorithm
Hu, Liang; Liu, Gang; Zhou, Jin
2013-01-01
This study proposes an intelligent algorithm that realizes information fusion with reference to related research achievements in brain cognitive theory and innovative computation. This algorithm treats knowledge as the core and information fusion as a knowledge-based innovative thinking process. Furthermore, the five key parts of this algorithm, including information sense and perception, memory storage, divergent thinking, convergent thinking, and the evaluation system, are simulated and modeled. This algorithm fully develops the innovative thinking skills of knowledge in information fusion and is an attempt to convert the abstract concepts of brain cognitive science into specific and operable research routes and strategies. Furthermore, the influences of each parameter of this algorithm on algorithm performance are analyzed and compared with those of classical intelligent algorithms through testing. Test results suggest that the algorithm proposed in this study can obtain the optimum problem solution with fewer target evaluations, improve optimization effectiveness, and achieve the effective fusion of information. PMID:23956699
An innovative thinking-based intelligent information fusion algorithm.
Lu, Huimin; Hu, Liang; Liu, Gang; Zhou, Jin
2013-01-01
This study proposes an intelligent algorithm that realizes information fusion with reference to related research achievements in brain cognitive theory and innovative computation. This algorithm treats knowledge as the core and information fusion as a knowledge-based innovative thinking process. Furthermore, the five key parts of this algorithm, including information sense and perception, memory storage, divergent thinking, convergent thinking, and the evaluation system, are simulated and modeled. This algorithm fully develops the innovative thinking skills of knowledge in information fusion and is an attempt to convert the abstract concepts of brain cognitive science into specific and operable research routes and strategies. Furthermore, the influences of each parameter of this algorithm on algorithm performance are analyzed and compared with those of classical intelligent algorithms through testing. Test results suggest that the algorithm proposed in this study can obtain the optimum problem solution with fewer target evaluations, improve optimization effectiveness, and achieve the effective fusion of information.
NASA Astrophysics Data System (ADS)
Shirazi, Abolfazl
2016-10-01
This article introduces a new method to optimize finite-burn orbital manoeuvres based on a modified evolutionary algorithm. Optimization is carried out based on conversion of the orbital manoeuvre into a parameter optimization problem by assigning inverse tangential functions to the changes in direction angles of the thrust vector. The problem is analysed using boundary delimitation in a common optimization algorithm. A method is introduced to achieve acceptable values for optimization variables using nonlinear simulation, which results in an enlarged convergence domain. The presented algorithm benefits from high optimality and fast convergence time. A numerical example of a three-dimensional optimal orbital transfer is presented and the accuracy of the proposed algorithm is shown.
NASA Astrophysics Data System (ADS)
Yang, Tao; Peng, Jing-xiao; Ho, Ho-pui; Song, Chun-yuan; Huang, Xiao-li; Zhu, Yong-yuan; Li, Xing-ao; Huang, Wei
2018-01-01
By using a preaggregated silver nanoparticle monolayer film and an infrared sensor card, we demonstrate a miniature spectrometer design that covers a broad wavelength range from visible to infrared with high spectral resolution. The spectral contents of an incident probe beam are reconstructed by solving a matrix equation with a smoothing simulated annealing algorithm. The proposed spectrometer offers significant advantages over current instruments that are based on Fourier transform and grating dispersion, in terms of size, resolution, spectral range, cost and reliability. The spectrometer contains three components, which are used for dispersion, frequency conversion and detection. Disordered silver nanoparticles in the dispersion component reduce the fabrication complexity. An infrared sensor card in the conversion component broadens the operational spectral range of the system into the visible and infrared bands. Since the CCD used in the detection component provides a very large number of intensity measurements, one can reconstruct the final spectrum with high resolution. As an additional feature of our algorithm for solving the matrix equation, which is suitable for reconstructing both broadband and narrowband signals, we have adopted a smoothing step based on simulated annealing. This step improves the accuracy of the spectral reconstruction.
A Depth Map Generation Algorithm Based on Saliency Detection for 2D to 3D Conversion
NASA Astrophysics Data System (ADS)
Yang, Yizhong; Hu, Xionglou; Wu, Nengju; Wang, Pengfei; Xu, Dong; Rong, Shen
2017-09-01
In recent years, 3D movies have attracted more and more attention because of their immersive stereoscopic experience. However, 3D content is still insufficient, so estimating depth information from a video for 2D to 3D conversion is increasingly important. In this paper, we present a novel algorithm to estimate depth information from a video via a scene classification algorithm. In order to obtain perceptually reliable depth information for viewers, the algorithm first classifies images into three categories: landscape type, close-up type, and linear perspective type. For the landscape type, we employ a specific algorithm to divide the image into many blocks and assign depth values using the relative height cue of the image. For the close-up type, a saliency-based method is adopted to enhance the foreground in the image, and it is combined with the global depth gradient to generate the final depth map. For the linear perspective type, the vanishing point calculated by vanishing line detection is regarded as the farthest point from the viewer and is assigned the deepest depth value. The rest of the image is assigned depth values according to the distance between each point and the vanishing point. Finally, depth image-based rendering is employed to generate stereoscopic virtual views after bilateral filtering. Experiments show that the proposed algorithm can achieve realistic 3D effects and yield satisfactory results, while the perception scores of anaglyph images lie between 6.8 and 7.8.
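The linear perspective case can be illustrated with a minimal sketch. It assumes the vanishing point is already known, uses a linear falloff with Euclidean distance, and adopts the convention that a larger value means farther from the viewer; none of these choices is confirmed by the abstract, and the function name is hypothetical.

```python
import math

def depth_from_vanishing_point(width, height, vp, max_depth=255):
    """Assign each pixel a depth that decreases with distance from vp."""
    vx, vy = vp
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    # The farthest image corner defines the distance mapped to depth 0.
    far = max(math.hypot(x - vx, y - vy) for x, y in corners) or 1.0
    return [[round(max_depth * (1 - math.hypot(x - vx, y - vy) / far))
             for x in range(width)]
            for y in range(height)]

# Tiny 5 x 4 example with the vanishing point at column 2, row 1.
depth = depth_from_vanishing_point(5, 4, vp=(2, 1))
```

In a full pipeline this map would be smoothed (the abstract mentions a bilateral filter) before depth image-based rendering.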
Hunter, Eric J.
2009-01-01
Objectives Building on the concept that task type may influence fundamental frequency (F0) values, the purpose of this case study was to investigate the difference in a child’s F0 during structured, elicited tasks and long-term, unstructured activities. It also explores the possibility that the distribution in children’s F0 may make the standard statistical measures of mean and standard deviation less than ideal metrics. Methods A healthy male child (5 years, 7 months) was evaluated. The child completed four voice tasks used in a previous study of the influence of task type on F0 values: (1) sustaining the vowel /a/; (2) sustaining the vowel, /a/, embedded in a word at the end of a phrase; (3) repeating a sentence; and (4) counting from 1 to 10. The child also wore a National Center for Voice and Speech voice dosimeter, a device that collects voice data over the course of an entire day, during all activities for 34 hours over 4 days. Results Throughout the structured vocal tasks within the clinical environment, the child’s F0, as measured by both the dosimeter and acoustic analysis of microphone data, was similar for all four tasks, with the counting task the most dissimilar. The mean F0 (~257 Hz) matched very closely to the average task results in the literature given for the child’s age group. However, the child’s mean fundamental frequency during the unstructured activities was significantly higher (~376 Hz). Finally, the mode and median of the structured vocal tasks were respectively 260 Hz and 259 Hz (both near the mean), while the unstructured mode and median were respectively 290 Hz and 355 Hz. Conclusions The results of this study suggest that children may produce a notably different voice pattern during clinical observations compared to routine daily activities. In addition, the child’s long-term F0 distribution is not normal. 
If this distribution is consistent in long-term, unstructured natural vocalization patterns of children, statistical mean would not be a valid measure. Mode and median are suggested as two parameters which convey more accurate information about typical F0 usage. Finally, future research avenues, including further exploration of how children may adapt their F0 to various environments, conversation partners, and activity, are suggested. PMID:19185926
Pastoral care in a time of global market capitalism.
Poling, James
2004-01-01
The author defines pastoral theology as "the study of the micro-world of intrapsychic and interpersonal interactions with the tools of theology and the social sciences for the purpose of support and healing. In a typical class or supervisory session, we analyze the words, voice inflection, pace, and gestures of an intimate conversation between two people, looking for clues to the deep structure of personality and intimate relationships. The hope of such study is that we will see the revelation of God's love and power in action to validate and challenge the theological traditions that give us eyes to see and invite us to see more clearly."
Power Control and Optimization of Photovoltaic and Wind Energy Conversion Systems
NASA Astrophysics Data System (ADS)
Ghaffari, Azad
The power map and Maximum Power Point (MPP) of Photovoltaic (PV) and Wind Energy Conversion Systems (WECS) highly depend on system dynamics and environmental parameters, e.g., solar irradiance, temperature, and wind speed. Power optimization algorithms for PV systems and WECS are collectively known as Maximum Power Point Tracking (MPPT) algorithms. Gradient-based Extremum Seeking (ES), as a non-model-based MPPT algorithm, governs the system to its peak point on the steepest descent curve regardless of changes of the system dynamics and variations of the environmental parameters. Since the power map shape defines the gradient vector, a close estimate of the power map shape is needed to create user assignable transients in the MPPT algorithm. The Hessian gives a precise estimate of the power map in a neighborhood around the MPP. The estimate of the inverse of the Hessian, in combination with the estimate of the gradient vector, is the key part of implementing the Newton-based ES algorithm. Hence, we generate an estimate of the Hessian using our proposed perturbation matrix. Also, we introduce a dynamic estimator to calculate the inverse of the Hessian, which is an essential part of our algorithm. We present various simulations and experiments on the micro-converter PV systems to verify the validity of our proposed algorithm. The ES scheme can also be used in combination with other control algorithms to achieve desired closed-loop performance. The WECS dynamics are slow, which causes an even slower response time for the MPPT based on the ES. Hence, we present a control scheme, extended from Field-Oriented Control (FOC), in combination with feedback linearization to reduce the convergence time of the closed-loop system. Furthermore, the nonlinear control prevents magnetic saturation of the stator of the Induction Generator (IG).
The proposed control algorithm in combination with the ES guarantees the closed-loop system robustness with respect to high level parameter uncertainty in the IG dynamics. The simulation results verify the effectiveness of the proposed algorithm.
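The role of the Hessian inverse in this abstract can be illustrated with a minimal sketch. It assumes a quadratic power map around the MPP and uses the exact gradient and Hessian, whereas the abstract estimates both from perturbation signals. All function names and numbers below are hypothetical. The point is only that the Newton-type update shrinks the error by the same factor (1 - k) in every direction, independent of the map's curvature.

```python
def gradient(theta, theta_star, H):
    """Gradient of P(theta) = P* + 0.5 (theta - theta*)^T H (theta - theta*)."""
    d = [theta[0] - theta_star[0], theta[1] - theta_star[1]]
    return [H[0][0] * d[0] + H[0][1] * d[1],
            H[1][0] * d[0] + H[1][1] * d[1]]

def inv2(H):
    """Inverse of a 2x2 matrix."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [[H[1][1] / det, -H[0][1] / det],
            [-H[1][0] / det, H[0][0] / det]]

def gradient_update(theta, g, k):
    """Gradient-type step: the convergence rate depends on the curvature H."""
    return [theta[0] + k * g[0], theta[1] + k * g[1]]

def newton_update(theta, g, H_inv, k):
    """Newton-type step: theta - k * H^{-1} g, so the rate is set by k alone."""
    s = [H_inv[0][0] * g[0] + H_inv[0][1] * g[1],
         H_inv[1][0] * g[0] + H_inv[1][1] * g[1]]
    return [theta[0] - k * s[0], theta[1] - k * s[1]]

# Hypothetical concave power map with its peak at theta_star
H = [[-4.0, 1.0], [1.0, -0.5]]      # negative definite Hessian
theta_star = [1.0, 2.0]
theta0 = [0.0, 0.0]
g0 = gradient(theta0, theta_star, H)
theta_newton = newton_update(theta0, g0, inv2(H), k=0.5)   # halves the error
theta_grad = gradient_update(theta0, g0, k=0.5)            # depends on H
```

With k = 0.5 the Newton-type step moves exactly halfway to the peak, while the gradient-type step moves only along the first coordinate because the gradient at this starting point happens to be [2, 0].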
Kuo, Chung-Feng Jeffrey; Chu, Yueng-Hsiang; Wang, Po-Chun; Lai, Chun-Yu; Chu, Wen-Lin; Leu, Yi-Shing; Wang, Hsing-Won
2013-12-01
The human larynx is an important organ for voice production and respiratory mechanisms. The vocal cords are approximated for voice production and opened for breathing. The videolaryngoscope is widely used for vocal cord examination. At present, physicians usually diagnose vocal cord diseases by manually selecting the image of the vocal cord opening to the largest extent (abduction), thus maximally exposing the vocal cord lesion. On the other hand, the severity of diseases such as vocal palsy and atrophic vocal cord is largely dependent on the vocal cord closing to the smallest extent (adduction). Therefore, diseases can be assessed by the image of the vocal cord opening to the largest extent, and the seriousness of breathy voice is closely correlated to the gap between vocal cords when closing to the smallest extent. The aim of the study was to design an automatic vocal cord image selection system to improve the conventional selection process by physicians and enhance diagnosis efficiency. Also, because the examination process yields unwanted fuzzy images caused by human factors, as well as non-vocal cord images, texture analysis is added in this study to measure image entropy and establish a screening and elimination system that effectively enhances the accuracy of selecting the image of the vocal cord closing to the smallest extent. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Non-invasive In vivo measurement of the shear modulus of human vocal fold tissue
Kazemirad, Siavash; Bakhshaee, Hani; Mongeau, Luc; Kost, Karen
2014-01-01
Voice is the essential part of singing and speech communication. Voice disorders significantly affect the quality of life. The viscoelastic mechanical properties of the vocal fold mucosa determine the characteristics of the vocal fold oscillations, and thereby voice quality. In the present study, a non-invasive method was developed to determine the shear modulus of human vocal fold tissue in vivo via measurements of the mucosal wave propagation speed during phonation. Images of four human subjects’ vocal folds were captured using high speed digital imaging (HSDI) and magnetic resonance imaging (MRI) for different phonation pitches, specifically fundamental frequencies between 110 and 440 Hz. The MRI images were used to obtain the morphometric dimensions of each subject's vocal folds in order to determine the pixel size in the high-speed images. The mucosal wave propagation speed was determined for each subject and at each pitch value using an automated image processing algorithm. The transverse shear modulus of the vocal fold mucosa was then calculated from a surface (Rayleigh) wave propagation dispersion equation using the measured wave speeds. It was found that the mucosal wave propagation speed and therefore the shear modulus of the vocal fold tissue were generally greater at higher pitches. The results were in good agreement with those from other studies obtained via in vitro measurements, thereby supporting the validity of the proposed measurement method. This method offers the potential for in vivo clinical assessments of vocal fold viscoelasticity from HSDI. PMID:24433668
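The link between wave speed and shear modulus can be sketched under a strong simplification. The sketch assumes a purely elastic, incompressible half-space, for which the Rayleigh wave speed is about 0.9553 times the shear wave speed, and it assumes a soft-tissue density of 1040 kg/m^3. The study itself uses a dispersion equation whose exact form is not given in the abstract, so the function below is illustrative only.

```python
# Ratio c_R / c_s for an incompressible (Poisson ratio 0.5) elastic half-space
RAYLEIGH_TO_SHEAR = 0.9553

def shear_modulus(wave_speed_m_s, density_kg_m3=1040.0):
    """Shear modulus in Pa from a surface wave speed (simplified elastic model).

    density_kg_m3 defaults to an assumed soft-tissue value, not one from the study.
    """
    shear_speed = wave_speed_m_s / RAYLEIGH_TO_SHEAR   # c_s = c_R / 0.9553
    return density_kg_m3 * shear_speed ** 2            # mu = rho * c_s^2
```

Because the modulus scales with the square of the wave speed in this model, doubling the measured mucosal wave speed quadruples the estimate, which is consistent with the reported trend of greater stiffness at higher pitch.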
Does insecure attachment mediate the relationship between trauma and voice-hearing in psychosis?
Pilton, Marie; Bucci, Sandra; McManus, James; Hayward, Mark; Emsley, Richard; Berry, Katherine
2016-12-30
This study extends existing research and theoretical developments by exploring the potential mediating role of insecure attachment within the relationship between trauma and voice-hearing. Fifty-five voice hearers with a psychosis-related diagnosis completed comprehensive assessments of childhood trauma, adult attachment, voice-related severity and distress, beliefs about voices and relationships with voices. Anxious attachment was significantly associated with the voice-hearing dimensions examined. More sophisticated analysis showed that anxious attachment mediated the relationship between childhood sexual and emotional abuse and voice-related severity and distress, voice-malevolence, voice-omnipotence, voice-resistance and hearer-dependence. Anxious attachment also mediated the relationship between childhood physical neglect and voice-related severity and distress and hearer-dependence. Furthermore, consistent with previous research, the relationship between anxious attachment and voice-related distress was mediated by voice-malevolence, voice-omnipotence and voice-resistance. We propose a model whereby anxious attachment mediates the well-established relationship between trauma and voice-hearing. In turn, negative beliefs about voices may mediate the association between anxious attachment and voice-related distress. Findings presented here highlight the need to assess and formulate the impact of attachment patterns upon the voice-hearing experience in psychosis and the potential to alleviate voice-related distress by fostering secure attachments to therapists or significant others. Crown Copyright © 2016. Published by Elsevier Ireland Ltd. All rights reserved.
Lyberg Åhlander, Viveka; Rydell, Roland; Löfqvist, Anders
2012-07-01
This randomized case-control study compares teachers with self-reported voice problems to age-, gender-, and school-matched colleagues with self-reported voice health. The self-assessed voice function is related to factors known to influence the voice: laryngeal findings, voice quality, personality, psychosocial and coping aspects, searching for causative factors of voice problems in teachers. Subjects and controls, recruited from a teacher group in an earlier questionnaire study, underwent examinations of the larynx by high-speed imaging and kymograms; voice recordings; voice range profile; audiometry; self-assessment of voice handicap and voice function; teaching and environmental aspects; personality; coping; burnout, and work-related issues. The laryngeal and voice recordings were assessed by experienced phoniatricians and speech pathologists. The subjects with self-assessed voice problems differed from their peers with self-assessed voice health by significantly longer recovery time from voice problems and scored higher on all subscales of the Voice Handicap Index-Throat. The results show that the cause of voice dysfunction in this group of teachers with self-reported voice problems is not found in the vocal apparatus or within the individual. The individual's perception of a voice problem seems to be based on a combination of the number of symptoms and of how often the symptoms occur, along with the recovery time. The results also underline the importance of using self-assessed reports of voice dysfunction. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Self-masking: Listening during vocalization. Normal hearing.
Borg, Erik; Bergkvist, Christina; Gustafsson, Dan
2009-06-01
What underlying mechanisms are involved in the ability to talk and listen simultaneously and what role does self-masking play under conditions of hearing impairment? The purpose of the present series of studies is to describe a technique for assessment of masked thresholds during vocalization, to describe normative data for males and females, and to focus on hearing impairment. The masking effect of vocalized [a:] on narrow-band noise pulses (250-8000 Hz) was studied using the maximum vocalization method. An amplitude-modulated series of sound pulses, which sounded like a steam engine, was masked until the criterion of halving the perceived pulse rate was reached. For masking of continuous reading, a just-follow-conversation criterion was applied. Intra-session test-retest reproducibility and inter-session variability were calculated. The results showed that female voices were more efficient in masking high frequency noise bursts than male voices and more efficient in masking both a male and a female test reading. The male had to vocalize 4 dBA louder than the female to produce the same masking effect on the test reading. It is concluded that the method is relatively simple to apply and has small intra-session and fair inter-session variability. Interesting gender differences were observed.
Facial Expression Presentation for Real-Time Internet Communication
NASA Astrophysics Data System (ADS)
Dugarry, Alexandre; Berrada, Aida; Fu, Shan
2003-01-01
Text, voice and video images are the most common forms of media content for instant communication on the Internet. Studies have shown that facial expressions convey much richer information than text and voice during a face-to-face conversation. The currently available real-time means of communication (instant text messages, chat programs and videoconferencing), however, have major drawbacks in terms of exchanging facial expression. The first two means do not involve image transmission, whilst video conferencing requires a large bandwidth that is not always available, and the transmitted image sequence is neither smooth nor without delay. The objective of the work presented here is to develop a technique that overcomes these limitations by extracting the facial expressions of speakers in order to realise real-time communication. In order to get the facial expressions, the main characteristics of the image are emphasized. Interpolation is performed on edge points previously detected to create geometric shapes such as arcs, lines, etc. The regional dominant colours of the pictures are also extracted and the combined results are subsequently converted into Scalable Vector Graphics (SVG) format. The application based on the proposed technique aims at being used simultaneously with chat programs and being able to run on any platform.
Internal versus External Auditory Hallucinations in Schizophrenia: Symptom and Course Correlates
Docherty, Nancy M.; Dinzeo, Thomas J.; McCleery, Amanda; Bell, Emily K.; Shakeel, Mohammed K.; Moe, Aubrey
2015-01-01
Introduction The auditory hallucinations associated with schizophrenia are phenomenologically diverse. “External” hallucinations classically have been considered to reflect more severe psychopathology than “internal” hallucinations, but empirical support has been equivocal. Methods We examined associations of “internal” v. “external” hallucinations with (a) other characteristics of the hallucinations, (b) severity of other symptoms, and (c) course of illness variables, in a sample of 97 stable outpatients with schizophrenia or schizoaffective disorder who experienced auditory hallucinations. Results Patients with internal hallucinations did not differ from those with external hallucinations on severity of other symptoms. However, they reported their hallucinations to be more emotionally negative, distressing, and long-lasting, less controllable, and less likely to remit over time. They also were more likely to experience voices commenting, conversing, or commanding. However, they also were more likely to have insight into the self-generated nature of their voices. Patients with internal hallucinations were not older, but had a later age of illness onset. Conclusions Differences in characteristics of auditory hallucinations are associated with differences in other characteristics of the disorder, and hence may be relevant to identifying subgroups of patients that are more homogeneous with respect to their underlying disease processes. PMID:25530157
NASA Astrophysics Data System (ADS)
Rushton, Gregory T.; Criswell, Brett A.
2013-03-01
In Penetrating a Wall of Introspection: A Critical Attrition Analysis, Johannsen, Rump, and Linder strive to give a voice to those whose thoughts might otherwise be unheard: students representing the casualties in the conflict surrounding the practices of STEM education, specifically those in the field of physics. Beyond giving those students a voice, they try to filter out and amplify a message that the seven individuals themselves may not have recognized: that the cause of their struggles in their physics programs might not be something innate (causa materialis in the authors' framework), but might be found outside the individuals (causa efficiens in the authors' framework). In our response, we attempt to extend the conversation regarding the issues these authors have raised by (1) considering the conditions within the physics community that might exacerbate this situation and (2) exploring from a different perspective the nature of the discourses that will either perpetuate or ameliorate such circumstances. In so doing we seek to provide a more holistic description of the features of the educational system that help to construct the wall of introspection and that might, in turn, be redressed in order to help tear it down, and positively impact attrition rates.
A survey of the state-of-the-art and focused research in range systems
NASA Technical Reports Server (NTRS)
Kung, Yao; Balakrishnan, A. V.
1988-01-01
In this one-year renewal of NASA Contract No. 2-304, basic research, development, and implementation in the areas of modern estimation algorithms and digital communication systems have been performed. In the first area, basic study on the conversion of general classes of practical signal processing algorithms into systolic array algorithms is considered, producing four publications. Also studied were the finite word length effects and convergence rates of lattice algorithms, producing two publications. In the second area of study, the use of an efficient importance sampling simulation technique for the evaluation of digital communication system performance was studied, producing two publications.
Computation-aware algorithm selection approach for interlaced-to-progressive conversion
NASA Astrophysics Data System (ADS)
Park, Sang-Jun; Jeon, Gwanggil; Jeong, Jechang
2010-05-01
We discuss deinterlacing results in a computationally constrained and varied environment. The proposed computation-aware algorithm selection approach (CASA) for fast interlaced to progressive conversion consists of three methods: the line-averaging (LA) method for plain regions, the modified edge-based line-averaging (MELA) method for medium regions, and the proposed covariance-based adaptive deinterlacing (CAD) method for complex regions. The proposed CASA uses two criteria, mean-squared error (MSE) and CPU time, for assigning the method. We proposed a CAD method. The principal idea of CAD is based on the correspondence between the high and low-resolution covariances. We estimated the local covariance coefficients from an interlaced image using Wiener filtering theory and then used these optimal minimum MSE interpolation coefficients to obtain a deinterlaced image. The CAD method, though more robust than most known methods, was not found to be very fast compared to the others. To alleviate this issue, we proposed an adaptive selection approach using a fast deinterlacing algorithm rather than using only one CAD algorithm. The hybrid approach of switching between the conventional schemes (LA and MELA) and our CAD was proposed to reduce the overall computational load. A reliable condition to be used for switching the schemes was presented after a wide set of initial training processes. The results of computer simulations showed that the proposed methods outperformed a number of methods presented in the literature.
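The simplest of the three methods named above, line averaging (LA), can be sketched as follows. The sketch assumes the input field holds the even lines of a frame as lists of pixel values and fills each missing line with the mean of its vertical neighbours. The function name and the handling of the last line (copying the line above) are illustrative choices, and the MELA and CAD methods are not shown.

```python
def line_average(field):
    """Rebuild a full frame from one field by averaging vertical neighbours.

    field: list of rows, taken here to be the even lines of the frame.
    Returns a frame with twice as many rows.
    """
    frame = []
    for i, row in enumerate(field):
        frame.append(list(row))                      # keep the existing line
        if i + 1 < len(field):
            below = field[i + 1]
            # missing line = mean of the line above and the line below
            frame.append([(a + b) / 2 for a, b in zip(row, below)])
        else:
            frame.append(list(row))                  # bottom edge: repeat the last line
    return frame
```

For example, `line_average([[0, 2], [4, 6]])` returns `[[0, 2], [2.0, 4.0], [4, 6], [4, 6]]`. The method is cheap, which is why the abstract reserves it for plain regions and spends the costlier CAD method only on complex ones.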
Lee, Hun Joo; Han, Eunyoung; Lee, Jaesin; Chung, Heesun; Min, Sung-Gi
2016-11-01
The aim of this study is to improve resolution of impurity peaks using a newly devised normalization algorithm for multi-internal standards (ISs) and to describe a visual peak selection system (VPSS) for efficient support of impurity profiling. Drug trafficking routes, location of manufacture, or synthetic route can be identified from impurities in seized drugs. In the analysis of impurities, different chromatogram profiles are obtained from gas chromatography and used to examine similarities between drug samples. The data processing method using relative retention time (RRT) calculated by a single internal standard is not preferred when many internal standards are used and many chromatographic peaks are present because of the risk of overlapping between peaks and difficulty in classifying impurities. In this study, impurities in methamphetamine (MA) were extracted by the liquid-liquid extraction (LLE) method using ethyl acetate containing 4 internal standards and analyzed by gas chromatography-flame ionization detection (GC-FID). The newly developed VPSS consists of an input module, a conversion module, and a detection module. The input module imports chromatograms collected from GC and performs preprocessing, the conversion module converts the preprocessed data with a normalization algorithm, and finally the detection module detects the impurities in MA samples using a visualized zoning user interface. The normalization algorithm in the conversion module was used to convert the raw data from GC-FID. The VPSS with the built-in normalization algorithm can effectively detect different impurities in samples even in complex matrices and has high resolution, keeping the time sequence of chromatographic peaks the same as that of the RRT method. The system can widen a full range of chromatograms so that the peaks of impurities are better aligned for easy separation and classification. The resolution, accuracy, and speed of impurity profiling showed remarkable improvement.
Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
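For orientation, the single-standard RRT baseline that the abstract compares against can be sketched in a few lines. It uses the usual definition, peak retention time divided by the retention time of the internal standard. The function name is hypothetical, and the authors' multi-IS normalization algorithm is not described in enough detail in the abstract to reproduce here.

```python
def relative_retention_times(peak_times, is_time):
    """Relative retention time of each peak against one internal standard.

    peak_times: retention times of the peaks (same unit as is_time).
    is_time: retention time of the internal standard, must be positive.
    """
    if is_time <= 0:
        raise ValueError("internal standard retention time must be positive")
    return [t / is_time for t in peak_times]
```

For example, `relative_retention_times([5.0, 10.0, 15.0], 10.0)` returns `[0.5, 1.0, 1.5]`. Because every peak is scaled by a single reference, the ordering of peaks is preserved, which is the property the abstract says the new normalization keeps.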
Somanath, Keerthan; Mau, Ted
2016-11-01
(1) To develop an automated algorithm to analyze electroglottographic (EGG) signal in continuous dysphonic speech, and (2) to identify EGG waveform parameters that correlate with the auditory-perceptual quality of strain in the speech of patients with adductor spasmodic dysphonia (ADSD). Software development with application in a prospective controlled study. EGG was recorded from 12 normal speakers and 12 subjects with ADSD reading excerpts from the Rainbow Passage. Data were processed by a new algorithm developed with the specific goal of analyzing continuous dysphonic speech. The contact quotient, pulse width, a new parameter peak skew, and various contact closing slope quotient and contact opening slope quotient measures were extracted. EGG parameters were compared between normal and ADSD speech. Within the ADSD group, intra-subject comparison was also made between perceptually strained syllables and unstrained syllables. The opening slope quotient SO7525 distinguished strained syllables from unstrained syllables in continuous speech within individual subjects with ADSD. The standard deviations, but not the means, of contact quotient, EGGW50, peak skew, and SO7525 were different between normal and ADSD speakers. The strain-stress pattern in continuous speech can be visualized as color gradients based on the variation of EGG parameter values. EGG parameters may provide a within-subject measure of vocal strain and serve as a marker for treatment response. The addition of EGG to multidimensional assessment may lead to improved characterization of the voice disturbance in ADSD. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Design of the ultraprecision stage for lithography using VCM
NASA Astrophysics Data System (ADS)
Kim, Jung-Han; Kim, Mun-Su; Oh, Min-Taek
2007-12-01
This paper presents a new design of a precision stage for the reticle in the lithography process and a low hunting control method for the stage. The stage has three axes, X, Y and θZ, which are actuated individually by three voice coil motors. The proposed precision stage system has three gap sensors and three voice coil motors and is supported by four air bearings, so it does not have any mechanical contact or nonlinear effects such as hysteresis, which usually degrade performance in nano-level movement. The reticle stage has cross-coupled dynamics between the X, Y and θZ axes, so the forward and inverse kinematics were solved to get an accurate reference position. When the stage is in regulating control mode, there always exist small fluctuations (stage hunting) in the stage movement. Because the low stage hunting characteristic is very important in recent lithography and nano-level applications, the proposed stage has a special regulating controller composed of a digital filter, an adjustor and a switching algorithm. Another important factor that generates hunting noise is the system noise inside the lithography machine, such as EMI from other motors and solenoids. To reduce such system noise, the proposed controller has a two-port transmission system that transfers the torque command signal from the DSP board to the amplifier. The low hunting control algorithm and two-port transmission system reduced hunting noise to 35 nm (rms), whereas a conventional PID controller generates 77 nm (rms) in the same mechanical system. The experimental results showed that the reticle system has 100 nm linear accuracy and 1 μrad rotation accuracy at the control frequency of 8 kHz.
Interpersonal Processes and Attachment in Voice-Hearers.
Robson, George; Mason, Oliver
2015-11-01
Studies of both clinical and non-clinical voice hearers suggest that distress is rather inconsistently associated with the perceived relationship between voice and hearer. It is also not clear if their beliefs about voices are relevant. This study investigated the links between attachment anxiety/avoidance, interpersonal aspects of the voice relationship, and distress whilst considering the impact of beliefs about voices and paranoia. Forty-four voice-hearing participants completed a number of self-report measures tapping attachment, interpersonal processes in the voice relationship, beliefs about voices, paranoia, distress and depression. Attachment avoidance was related to voice intrusiveness, hearer distance and distress. Attachment anxiety was related to voice intrusiveness, hearer dependence and distress. A series of simple mediation analyses were conducted that suggest that the relationship between attachment and voice related distress may be mediated by interpersonal dynamics in the voice-hearer relationship, beliefs about voices and paranoia. Beliefs about voices, the hearer's relationship with their voices, and the distress voices sometimes engender appear to be meaningfully related to their attachment style. This may be important to consider in therapeutic work.
Martinelli, Eugenio; Mencattini, Arianna; Daprati, Elena; Di Natale, Corrado
2016-01-01
Humans can communicate their emotions by modulating facial expressions or the tone of their voice. Albeit numerous applications exist that enable machines to read facial emotions and recognize the content of verbal messages, methods for speech emotion recognition are still in their infancy. Yet, fast and reliable applications for emotion recognition are the obvious advancement of present 'intelligent personal assistants', and may have countless applications in diagnostics, rehabilitation and research. Taking inspiration from the dynamics of human group decision-making, we devised a novel speech emotion recognition system that applies, for the first time, a semi-supervised prediction model based on consensus. Three tests were carried out to compare this algorithm with traditional approaches. Labeling performances relative to a public database of spontaneous speeches are reported. The novel system appears to be fast, robust and less computationally demanding than traditional methods, allowing for easier implementation in portable voice-analyzers (as used in rehabilitation, research, industry, etc.) and for applications in the research domain (such as real-time pairing of stimuli to participants' emotional state, selective/differential data collection based on emotional content, etc.).
Rantala, Leena M; Hakala, Suvi J; Holmqvist, Sofia; Sala, Eeva
2012-11-01
The aim of the study was to investigate the connections between voice ergonomic risk factors found in classrooms and voice-related problems in teachers. Voice ergonomic assessment was performed in 39 classrooms in 14 elementary schools by means of a Voice Ergonomic Assessment in Work Environment--Handbook and Checklist. The voice ergonomic risk factors assessed included working culture, noise, indoor air quality, working posture, stress, and access to a sound amplifier. Teachers from the above-mentioned classrooms reported their voice symptoms, respiratory tract diseases, and completed a Voice Handicap Index (VHI). The more voice ergonomic risk factors found in the classroom the higher were the teachers' total scores on voice symptoms and VHI. Stress was the factor that correlated most strongly with voice symptoms. Poor indoor air quality increased the occurrence of laryngitis. Voice ergonomics were poor in the classrooms studied and voice ergonomic risk factors affected the voice. It is important to convey information on voice ergonomics to education administrators and those responsible for school planning and taking care of school buildings. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
A text-mining analysis of the public's reactions to the opioid crisis.
Glowacki, Elizabeth M; Glowacki, Joseph B; Wilcox, Gary B
2017-07-19
Opioid abuse has become an epidemic in the United States. On August 25, 2016, the former Surgeon General of the United States sent an open letter to care providers asking for their help with combatting this growing health crisis. Social media forums such as Twitter allow for open discussions among the public and up-to-date exchanges of information about timely topics such as opioids. Therefore, the goal of the current study is to identify the public's reactions to the opioid epidemic by identifying the most popular topics tweeted by users. A text miner, algorithmic-driven statistical program was used to capture 73,235 original tweets and retweets posted within a 2-month time span (August 15, 2016, through October 15, 2016). All tweets contained references to "opioids," "turnthetide," or similar keywords. The sets of tweets were then analyzed to identify the most prevalent topics. The most discussed topics had to do with public figures addressing opioid abuse, creating better treatment options for teen addicts, using marijuana as an alternative for managing pain, holding foreign and domestic drug makers accountable for the epidemic, promoting the "Rx for Change" campaign, addressing double standards in the perceptions and treatment of black and white opioid users, and advertising opioid recovery programs. Twitter allows users to find current information, voice their concerns, and share calls for action in response to the opioid epidemic. Monitoring the conversations about opioids that are taking place on social media forums such as Twitter can help public health officials and care providers better understand how the public is responding to this health crisis.
Native voice, self-concept and the moral case for personalized voice technology.
Nathanson, Esther
2017-01-01
Purpose (1) To explore the role of native voice and effects of voice loss on self-concept and identity, and survey the state of assistive voice technology; (2) to establish the moral case for developing personalized voice technology. Methods This narrative review examines published literature on the human significance of voice, the impact of voice loss on self-concept and identity, and the strengths and limitations of current voice technology. Based on the impact of voice loss on self and identity, and voice technology limitations, the moral case for personalized voice technology is developed. Results Given the richness of information conveyed by voice, loss of voice constrains expression of the self, but the full impact is poorly understood. Augmentative and alternative communication (AAC) devices facilitate communication but, despite advances in this field, voice output cannot yet express the unique nuances of individual voice. The ethical principles of autonomy, beneficence and equality of opportunity establish the moral responsibility to invest in accessible, cost-effective, personalized voice technology. Conclusions Although further research is needed to elucidate the full effects of voice loss on self-concept, identity and social functioning, current understanding of the profoundly negative impact of voice loss establishes the moral case for developing personalized voice technology. Implications for Rehabilitation Rehabilitation of voice-disordered patients should facilitate self-expression, interpersonal connectedness and social/occupational participation. Proactive questioning about the psychological and social experiences of patients with voice loss is a valuable entry point for rehabilitation planning. Personalized voice technology would enhance sense of self, communicative participation and autonomy and promote shared healthcare decision-making. 
Further research is needed to identify the best strategies to preserve and strengthen identity and sense of self.
biobambam: tools for read pair collation based algorithms on BAM files
2014-01-01
Background Sequence alignment data is often ordered by coordinate (id of the reference sequence plus position on the sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference or of variants within the mapped data. In this order, paired reads are usually separated in the file, which complicates other applications, such as duplicate marking or conversion to the FastQ format, that require access to the full information of the pairs. Results In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The employed collation algorithm avoids time- and space-consuming sorting of alignments by read name where this is possible without using more than a specified amount of main memory. Using this algorithm, tasks like duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available in the form of an API for other projects. This API is part of the libmaus package. Conclusions In comparison with previous approaches to problems involving the collation of alignments by read name, like the BAM to FastQ or duplicate marking utilities, our approach can often perform an equivalent task more efficiently in terms of the required main memory and run-time. Our BAM to FastQ conversion is faster than all widely known alternatives including Picard and bamUtil. Our duplicate marking is about as fast as the closest competitor bamUtil for small data sets and faster than all known alternatives on large and complex data sets.
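The collation idea in this abstract can be illustrated with a minimal Python sketch. It is not biobambam's implementation and does not use the libmaus API; the `(name, sequence)` tuples, the function name `collate_pairs` and the `max_pending` limit are assumptions made for illustration. Reads wait in a hash table keyed by read name until their mate arrives, and only the reads set aside when the table exceeds the limit are sorted by name afterwards.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (read name, sequence): a hypothetical stand-in for a BAM alignment record.
Read = Tuple[str, str]


def collate_pairs(reads: Iterable[Read],
                  max_pending: int = 4) -> Iterator[Tuple[Read, Read]]:
    """Yield mate pairs from coordinate-ordered reads without a full name sort."""
    pending: Dict[str, Read] = {}   # read name -> first mate seen so far
    spilled: List[Read] = []        # reads set aside when the table grew too large

    for read in reads:
        mate = pending.pop(read[0], None)
        if mate is not None:
            # Both mates are in memory: emit the pair immediately.
            yield mate, read
            continue
        pending[read[0]] = read
        if len(pending) > max_pending:
            # Assumed memory limit reached: move unmatched reads aside.
            # A real tool would presumably write these to temporary storage.
            spilled.extend(pending.values())
            pending.clear()

    # Only the leftovers are sorted by name, then paired by adjacency.
    spilled.extend(pending.values())
    spilled.sort(key=lambda r: r[0])
    i = 0
    while i + 1 < len(spilled):
        if spilled[i][0] == spilled[i + 1][0]:
            yield spilled[i], spilled[i + 1]
            i += 2
        else:
            i += 1  # orphan read without a mate in the input
```

With a generous `max_pending`, most pairs are emitted straight from the hash table and the final sort touches few reads; with a small limit, the sketch degrades towards a plain sort by name.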
Using Natural Language to Enable Mission Managers to Control Multiple Heterogeneous UAVs
NASA Technical Reports Server (NTRS)
Trujillo, Anna C.; Puig-Navarro, Javier; Mehdi, S. Bilal; Mcquarry, A. Kyle
2016-01-01
The availability of highly capable, yet relatively cheap, unmanned aerial vehicles (UAVs) is opening up new areas of use for hobbyists and for commercial activities. This research is developing methods beyond classical control-stick pilot inputs, to allow operators to manage complex missions without in-depth vehicle expertise. These missions may entail several heterogeneous UAVs flying coordinated patterns or flying multiple trajectories deconflicted in time or space to predefined locations. This paper describes the functionality and preliminary usability measures of an interface that allows an operator to define a mission using speech inputs. With a defined and simple vocabulary, operators can input the vast majority of mission parameters using simple, intuitive voice commands. Although the operator interface is simple, it is based upon autonomous algorithms that allow the mission to proceed with minimal input from the operator. This paper also describes these underlying algorithms that allow an operator to manage several UAVs.
Hartfield, Matthew; Wright, Stephen I.; Agrawal, Aneil F.
2016-01-01
Many diploid organisms undergo facultative sexual reproduction. However, little is currently known concerning the distribution of neutral genetic variation among facultative sexual organisms except in very simple cases. Understanding this distribution is important when making inferences about rates of sexual reproduction, effective population size, and demographic history. Here we extend coalescent theory in diploids with facultative sex to consider gene conversion, selfing, population subdivision, and temporal and spatial heterogeneity in rates of sex. In addition to analytical results for two-sample coalescent times, we outline a coalescent algorithm that accommodates the complexities arising from partial sex; this algorithm can be used to generate multisample coalescent distributions. A key result is that when sex is rare, gene conversion becomes a significant force in reducing diversity within individuals. This can reduce genomic signatures of infrequent sex (i.e., elevated within-individual allelic sequence divergence) or entirely reverse the predicted patterns. These models offer improved methods for assessing null patterns of molecular variation in facultative sexual organisms. PMID:26584902
Hartfield, Matthew; Wright, Stephen I; Agrawal, Aneil F
2016-01-01
Many diploid organisms undergo facultative sexual reproduction. However, little is currently known concerning the distribution of neutral genetic variation among facultative sexual organisms except in very simple cases. Understanding this distribution is important when making inferences about rates of sexual reproduction, effective population size, and demographic history. Here we extend coalescent theory in diploids with facultative sex to consider gene conversion, selfing, population subdivision, and temporal and spatial heterogeneity in rates of sex. In addition to analytical results for two-sample coalescent times, we outline a coalescent algorithm that accommodates the complexities arising from partial sex; this algorithm can be used to generate multisample coalescent distributions. A key result is that when sex is rare, gene conversion becomes a significant force in reducing diversity within individuals. This can reduce genomic signatures of infrequent sex (i.e., elevated within-individual allelic sequence divergence) or entirely reverse the predicted patterns. These models offer improved methods for assessing null patterns of molecular variation in facultative sexual organisms. Copyright © 2016 by the Genetics Society of America.
Dudley, James; Eames, Catrin; Mulligan, John; Fisher, Naomi
2018-03-01
Developing compassion towards oneself has been linked to improvement in many areas of psychological well-being, including psychosis. Furthermore, developing a non-judgemental, accepting way of relating to voices is associated with lower levels of distress for people who hear voices. These factors have also been associated with secure attachment. This study explores associations between the constructs of mindfulness of voices, self-compassion, and distress from hearing voices and how secure attachment style related to each of these variables. Cross-sectional online. One hundred and twenty-eight people (73% female; M age = 37.5; 87.5% Caucasian) who currently hear voices completed the Self-Compassion Scale, Southampton Mindfulness of Voices Questionnaire, Relationships Questionnaire, and Hamilton Programme for Schizophrenia Voices Questionnaire. Results showed that mindfulness of voices mediated the relationship between self-compassion and severity of voices, and self-compassion mediated the relationship between mindfulness of voices and severity of voices. Self-compassion and mindfulness of voices were significantly positively correlated with each other and negatively correlated with distress and severity of voices. Mindful relation to voices and self-compassion are associated with reduced distress and severity of voices, which supports the proposed potential benefits of mindful relating to voices and self-compassion as therapeutic skills for people experiencing distress by voice hearing. Greater self-compassion and mindfulness of voices were significantly associated with less distress from voices. These findings support theory underlining compassionate mind training. Mindfulness of voices mediated the relationship between self-compassion and distress from voices, indicating a synergistic relationship between the constructs. 
Although the current findings do not give a direction of causation, consideration is given to the potential impact of mindful and compassionate approaches to voices. © 2017 The Authors. British Journal of Clinical Psychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.
Auditory traits of "own voice".
Kimura, Marino; Yotsumoto, Yuko
2018-01-01
People perceive their recorded voice differently from their actively spoken voice. The uncanny valley theory proposes that as an object approaches humanlike characteristics, there is an increase in the sense of familiarity; however, eventually a point is reached where the object becomes strangely similar and makes us feel uneasy. The feeling of discomfort experienced when people hear their recorded voice may correspond to the floor of the proposed uncanny valley. To overcome the feeling of eeriness of own-voice recordings, previous studies have suggested equalization of the recorded voice with various types of filters, such as step, bandpass, and low-pass, yet the effectiveness of these filters has not been evaluated. To address this, the aim of experiment 1 was to identify what type of voice recording was the most representative of one's own voice. The voice recordings were presented in five different conditions: unadjusted recorded voice, step filtered voice, bandpass filtered voice, low-pass filtered voice, and a voice for which the participants freely adjusted the parameters. We found large individual differences in the most representative own-voice filter. In order to consider roles of sense of agency, experiment 2 investigated if lip-synching would influence the rating of own voice. The result suggested lip-synching did not affect own voice ratings. In experiment 3, based on the assumption that the voices used in previous experiments corresponded to continuous representations of non-own voice to own voice, the existence of an uncanny valley was examined. Familiarity, eeriness, and the sense of own voice were rated. The result did not support the existence of an uncanny valley. Taken together, the experiments led us to the following conclusions: there is no general filter that can represent own voice for everyone, sense of agency has no effect on own voice rating, and the uncanny valley does not exist for own voice, specifically.
Szijarto, Barbara; Milley, Peter; Svensson, Kate; Cousins, J Bradley
2018-02-01
Social innovation (SI) is billed as a new way to address complex social problems. Interest in SI has intensified rapidly in the last decade, making it an important area of practice for evaluators, but a difficult one to navigate. Learning from developments in SI and evaluation approaches applied in SI contexts is challenging because of 'fuzzy' concepts and silos of activity and knowledge within SI communities. This study presents findings from a systematic review and integration of 41 empirical studies on evaluation in SI contexts. We identify two isolated conversations: one about 'social enterprises' (SEs) and the other about non-SE 'social innovations'. These conversations diverge in key areas, including engagement with evaluation scholarship, and in the reported purposes, approaches and use of evaluation. We identified striking differences with respect to degree of interest in collaborative approaches and facilitation of evaluation use. The findings speak to trends and debates in our field, for example how evaluation might reconcile divergent information needs in multilevel, cross-sectoral collaborations and respond to fluidity and change in innovative settings. Implications for practitioners and commissioners of evaluation include how evaluation is used in different contexts and the voice of evaluators (and the evaluation profession) in these conversations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Peters, E R; Williams, S L; Cooke, M A; Kuipers, E
2012-07-01
Previous studies have suggested that beliefs about voices mediate the relationship between actual voice experience and behavioural and affective response. We investigated beliefs about voice power (omnipotence), voice intent (malevolence/benevolence) and emotional and behavioural response (resistance/engagement) using the Beliefs About Voices Questionnaire - Revised (BAVQ-R) in 46 voice hearers. Distress was assessed using a wide range of measures: voice-related distress, depression, anxiety, self-esteem and suicidal ideation. Voice topography was assessed using measures of voice severity, frequency and intensity. We predicted that beliefs about voices would show a stronger association with distress than voice topography. Omnipotence had the strongest associations with all measures of distress included in the study whereas malevolence was related to resistance, and benevolence to engagement. As predicted, voice severity, frequency and intensity were not related to distress once beliefs were accounted for. These results concur with previous findings that beliefs about voice power are key determinants of distress in voice hearers, and should be targeted specifically in psychological interventions.
Updating signal typing in voice: addition of type 4 signals.
Sprecher, Alicia; Olszewski, Aleksandra; Jiang, Jack J; Zhang, Yu
2010-06-01
The addition of a fourth type of voice to Titze's voice classification scheme is proposed. This fourth voice type is characterized by primarily stochastic noise behavior and is therefore unsuitable for both perturbation and correlation dimension analysis. Forty voice samples were classified into the proposed four types using narrowband spectrograms. Acoustic, perceptual, and correlation dimension analyses were completed for all voice samples. Perturbation measures tended to increase with voice type. Based on reliability cutoffs, the type 1 and type 2 voices were considered suitable for perturbation analysis. Measures of unreliability were higher for type 3 and 4 voices. Correlation dimension analyses increased significantly with signal type as indicated by a one-way analysis of variance. Notably, correlation dimension analysis could not quantify the type 4 voices. The proposed fourth voice type represents a subset of voices dominated by noise behavior. Current measures capable of evaluating type 4 voices provide only qualitative data (spectrograms, perceptual analysis, and an infinite correlation dimension). Type 4 voices are highly complex and the development of objective measures capable of analyzing these voices remains a topic of future investigation.
Mechanics of human voice production and control
Zhang, Zhaoyan
2016-01-01
As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed. PMID:27794319
Mechanics of human voice production and control.
Zhang, Zhaoyan
2016-10-01
As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.
Voice care knowledge among clinicians and people with healthy voices or dysphonia.
Fletcher, Helen M; Drinnan, Michael J; Carding, Paul N
2007-01-01
An important clinical component in the prevention and treatment of voice disorders is voice care and hygiene. Research in voice care knowledge has mainly focussed on specific groups of professional voice users with limited reporting on the tool and evidence base used. In this study, a questionnaire to measure voice care knowledge was developed based on "best evidence." The questionnaire was validated by measuring specialist voice clinicians' agreement. Preliminary data are then presented using the voice care knowledge questionnaire with 17 subjects with nonorganic dysphonia and 17 with healthy voices. There was high (89%) agreement among the clinicians. There was a highly significant difference between the dysphonic and the healthy group scores (P = 0.00005). Furthermore, the dysphonic subjects (63% agreement) presented with less voice care knowledge than the subjects with healthy voices (72% agreement). The questionnaire provides a useful and valid tool to investigate voice care knowledge. The findings have implications for clinical intervention, voice therapy, and health prevention.
Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav
2018-04-01
A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
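The SNR-dependent selection described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the model labels and the lattice spacing are hypothetical, and the abstract does not say what happens below -10 dB, so the sketch raises an error there.

```python
from typing import Dict


def nearest_snr_model(models: Dict[float, str], snr_db: float) -> str:
    """Return the model whose training SNR is closest to the estimated test SNR."""
    best_snr = min(models, key=lambda trained_snr: abs(trained_snr - snr_db))
    return models[best_snr]


def pick_sad_algorithm(snr_db: float) -> str:
    """Choose a speech activity detector using the SNR ranges quoted in the abstract."""
    if snr_db >= 10.0:
        # SNR >= +10 dB: detector using features from the original speech.
        return "feature_based"
    if snr_db >= -10.0:
        # -10 dB <= SNR < +10 dB: detector using the voiced speech envelope.
        return "voiced_envelope"
    # Assumption: behaviour below -10 dB is not described, so refuse to guess.
    raise ValueError("SNR below the range described in the abstract")
```

A caller would first estimate the SNR of the incoming audio, then pass it to both functions, for example with a hypothetical lattice such as `{-10.0: "sid_m10", 0.0: "sid_0", 10.0: "sid_p10"}`.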
Bai, Mingsian R; Li, Yi; Chiang, Yi-Hao
2017-10-01
A unified framework is proposed for analysis and synthesis of a two-dimensional spatial sound field in reverberant environments. In the sound field analysis (SFA) phase, an unbaffled 24-element circular microphone array is utilized to encode the sound field based on the plane-wave decomposition. Depending on the sparsity of the sound sources, the SFA stage can be implemented in two manners. For sparse-source scenarios, a one-stage algorithm based on compressive sensing is utilized. Alternatively, a two-stage algorithm can be used, where the minimum power distortionless response beamformer is used to localize the sources and the Tikhonov regularization algorithm is used to extract the source amplitudes. In the sound field synthesis (SFS) phase, a 32-element rectangular loudspeaker array is employed to decode the target sound field using the pressure matching technique. To establish the room response model, as required in the pressure matching step of the SFS phase, an SFA technique for nonsparse-source scenarios is utilized. Choice of regularization parameters is vital to the reproduced sound field. In the SFS phase, three SFS approaches are compared in terms of localization performance and voice reproduction quality. Experimental results obtained in a reverberant room are presented and reveal that an accurate room response model is vital to immersive rendering of the reproduced sound field.
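As a rough illustration of the Tikhonov step mentioned in this abstract, the Python sketch below solves the regularized least-squares problem min ||A s - p||^2 + lam ||s||^2 through the normal equations (A^T A + lam I) s = A^T p. It is not the authors' code: the names `tikhonov` and `solve` are hypothetical, and the matrices are real-valued for brevity, whereas acoustic transfer matrices are generally complex.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def tikhonov(a: Matrix, p: List[float], lam: float) -> List[float]:
    """Return s minimising ||a s - p||^2 + lam * ||s||^2 (real-valued sketch)."""
    rows, cols = len(a), len(a[0])
    # Gram matrix a^T a with lam added on the diagonal.
    gram = [[sum(a[k][i] * a[k][j] for k in range(rows)) + (lam if i == j else 0.0)
             for j in range(cols)] for i in range(cols)]
    # Right-hand side a^T p.
    rhs = [sum(a[k][i] * p[k] for k in range(rows)) for i in range(cols)]
    return solve(gram, rhs)
```

Larger values of `lam` shrink the recovered amplitudes towards zero in exchange for robustness to noise and ill-conditioning, which is one way to read the abstract's remark that the choice of regularization parameters is vital.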
Quantitative analysis of professionally trained versus untrained voices.
Siupsinskiene, Nora
2003-01-01
The aim of this study was to compare healthy trained and untrained voices as well as healthy and dysphonic trained voices in adults using combined voice range profile and aerodynamic tests, to define the normal range limiting values of quantitative voice parameters and to select the most informative quantitative voice parameters for separation between healthy and dysphonic trained voices. Three groups of persons were evaluated. One hundred eighty-six healthy volunteers were divided into two groups according to voice training: a non-professional speakers group of 106 persons with untrained voices (36 males and 70 females) and a professional speakers group of 80 persons with trained voices (21 males and 59 females). The clinical group consisted of 103 dysphonic professional speakers (23 males and 80 females) with various voice disorders. Eighteen quantitative voice parameters from the combined voice range profile (VRP) test were analyzed: 8 of voice range profile, 8 of speaking voice, overall vocal dysfunction degree and coefficient of sound, and aerodynamic maximum phonation time. Analysis showed that healthy professional speakers demonstrated expanded vocal abilities in comparison to healthy non-professional speakers. The quantitative voice range profile parameters (pitch range, high frequency limit, area of high frequencies and coefficient of sound) differed significantly between healthy professional and non-professional voices, and were more informative than speaking voice or aerodynamic parameters in showing the voice training. Logistic stepwise regression revealed that VRP area in high frequencies was sufficient to discriminate between healthy and dysphonic professional speakers for male subjects (overall discrimination accuracy: 81.8%) and a combination of three quantitative parameters (VRP high frequency limit, maximum voice intensity and slope of speaking curve) for female subjects (overall model discrimination accuracy: 75.4%). We concluded that quantitative voice assessment with selected parameters might be useful for evaluation of voice education for healthy professional speakers as well as for detection of vocal dysfunction and evaluation of rehabilitation effect in dysphonic professionals.
The Voice as Computer Interface: A Look at Tomorrow's Technologies.
ERIC Educational Resources Information Center
Lange, Holley R.
1991-01-01
Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…
[The voice of the singer in the phonetogram].
Klingholz, F
1989-01-01
Phonetograms were subdivided into areas approximating voice registers. By means of an analytical description of the areas, parameters could be established for a differentiation of voice categories and efficiency. The evaluation of 21 untrained and 34 trained voices showed a significant difference between the two groups. Male singers demonstrated more efficiency in the head and chest registers than male non-singers; female singers showed a stronger efficiency only in the head voice in comparison with their non-singer counterparts. Proceeding from voice sound alone, voices are often misclassified regarding the voice categories, and voice problems arise. Moreover, enhanced training of only chest or head voice function results in functional disorders in the singing voice. Such cases can be demonstrated by means of phonetograms.
Abnormal motor cortex excitability during linguistic tasks in adductor-type spasmodic dysphonia.
Suppa, A; Marsili, L; Giovannelli, F; Di Stasio, F; Rocchi, L; Upadhyay, N; Ruoppolo, G; Cincotta, M; Berardelli, A
2015-08-01
In healthy subjects (HS), transcranial magnetic stimulation (TMS) applied during 'linguistic' tasks discloses excitability changes in the dominant hemisphere primary motor cortex (M1). We investigated 'linguistic' task-related cortical excitability modulation in patients with adductor-type spasmodic dysphonia (ASD), a speech-related focal dystonia. We studied 10 ASD patients and 10 HS. Speech examination included voice cepstral analysis. We investigated the dominant/non-dominant M1 excitability at baseline, during 'linguistic' (reading aloud/silent reading/producing simple phonation) and 'non-linguistic' tasks (looking at non-letter strings/producing oral movements). Motor evoked potentials (MEPs) were recorded from the contralateral hand muscles. We measured the cortical silent period (CSP) length and tested MEPs in HS and patients performing the 'linguistic' tasks with different voice intensities. We also examined MEPs in HS and ASD during hand-related 'action-verb' observation. Patients were studied under and not-under botulinum neurotoxin-type A (BoNT-A). In HS, TMS over the dominant M1 elicited larger MEPs during 'reading aloud' than during the other 'linguistic'/'non-linguistic' tasks. Conversely, in ASD, TMS over the dominant M1 elicited increased-amplitude MEPs during 'reading aloud' and 'syllabic phonation' tasks. CSP length was shorter in ASD than in HS and remained unchanged in both groups performing 'linguistic'/'non-linguistic' tasks. In HS and ASD, 'linguistic' task-related excitability changes were present regardless of the different voice intensities. During hand-related 'action-verb' observation, MEPs decreased in HS, whereas in ASD they increased. In ASD, BoNT-A improved speech, as demonstrated by cepstral analysis and restored the TMS abnormalities. ASD reflects dominant hemisphere excitability changes related to 'linguistic' tasks; BoNT-A returns these excitability changes to normal. 
© 2015 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Hunter, Eric J.; Titze, Ingo R.
2012-01-01
Purpose This study creates a more concise picture of the vocal demands placed on teachers by comparing occupational voice use with non-occupational voice use. Methods The National Center for Voice and Speech voice dosimetry databank was used to calculate voicing percentage per hour, as well as average dB SPL and F0. Occupational voice use (9 AM-3 PM, weekdays) and non-occupational voice use (4 PM-10 PM, weekends) were compared (57 teachers, two weeks each). Results Five key findings were uncovered: [1] similar to previous studies, occupational voicing percentage per hour is more than twice that of non-occupational; [2] teachers experienced a wide range of occupational voicing percentages per hour (30±11%/hr); [3] average occupational voice was about 1 dB SPL louder than the non-occupational voice and remained constant throughout the day; [4] occupational voice exhibited an increased pitch and trended upward throughout the day; [5] some apparent gender differences were shown. Conclusions Data regarding voicing percentages, F0 and dB SPL provide critical insight into teachers’ vocal health. Further, because non-occupational voice use is added to an already overloaded voice, it may add key insights into recovery patterns, and should be the focus of future studies. PMID:20689046
Borowiak, Kamila; von Kriegstein, Katharina
2016-01-01
The ability to recognise the identity of others is a key requirement for successful communication. Brain regions that respond selectively to voices exist in humans from early infancy on. Currently, it is unclear whether dysfunction of these voice-sensitive regions can explain voice identity recognition impairments. Here, we used two independent functional magnetic resonance imaging studies to investigate voice processing in a population that has been reported to have no voice-sensitive regions: autism spectrum disorder (ASD). Our results refute the earlier report that individuals with ASD have no responses in voice-sensitive regions: Passive listening to vocal, compared to non-vocal, sounds elicited typical responses in voice-sensitive regions in the high-functioning ASD group and controls. In contrast, the ASD group had a dysfunction in voice-sensitive regions during voice identity but not speech recognition in the right posterior superior temporal sulcus/gyrus (STS/STG)—a region implicated in processing complex spectrotemporal voice features and unfamiliar voices. The right anterior STS/STG correlated with voice identity recognition performance in controls but not in the ASD group. The findings suggest that right STS/STG dysfunction is critical for explaining voice recognition impairments in high-functioning ASD and show that ASD is not characterised by a general lack of voice-sensitive responses. PMID:27369067
Clustering of color map pixels: an interactive approach
NASA Astrophysics Data System (ADS)
Moon, Yiu Sang; Luk, Franklin T.; Yuen, K. N.; Yeung, Hoi Wo
2003-12-01
The demand for digital maps continues to rise as mobile electronic devices become more popular. Instead of creating the entire map from scratch, we may convert a scanned paper map into a digital one. Color clustering is the very first step of the conversion process. Currently, most of the existing clustering algorithms are fully automatic. They are fast and efficient but may not work well in map conversion because of the numerous ambiguous issues associated with printed maps. Here we introduce two interactive approaches for color clustering on the map: color clustering with pre-calculated index colors (PCIC) and color clustering with pre-calculated color ranges (PCCR). We also introduce a memory model that could enhance and integrate different image processing techniques for fine-tuning the clustering results. Problems and examples of the algorithms are discussed in the paper.
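The abstract does not describe how PCIC uses its index colors, so the Python sketch below only illustrates one plausible reading: each pixel is labeled with the nearest of a set of user-chosen index colors, measured by squared Euclidean distance in RGB. The function name, the distance measure and the sample colors are all assumptions.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def assign_to_index_colors(pixels: Sequence[Color],
                           index_colors: Sequence[Color]) -> List[int]:
    """Label each pixel with the position of its nearest index color."""

    def dist2(a: Color, b: Color) -> int:
        # Squared Euclidean distance in RGB space (an assumed metric).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(index_colors)), key=lambda i: dist2(pixel, index_colors[i]))
        for pixel in pixels
    ]
```

In an interactive setting, the user could adjust the index colors and rerun the assignment until ambiguous map features, such as faded road lines, fall into the intended cluster.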
The prevalence of voice disorders in 911 emergency telecommunicators.
Johns-Fiedler, Heidi; van Mersbergen, Miriam
2015-05-01
Emergency 911 dispatchers or telecommunicators have been cited as occupational voice users who could be at risk for voice disorders. To test the theoretical assumption that the 911 emergency telecommunicators (911ETCs) are exposed to risk for voice disorders because of their heavy vocal load, this study assessed the prevalence of voice complaints in 911ETCs. A cross-sectional survey was sent to two large national organizations for 911ETCs with 71 complete responses providing information about voice health, voice complaints, and work load. Although 911ETCs have a higher rate of reported voice symptoms and score higher on the Voice Handicap Index-10 than the general public, they have a voice disorder diagnosis prevalence that mirrors the prevalence of the general population. The 911ETCs may be underserved in the voice community and would benefit from education on vocal health and treatments for voice complaints. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Integrating cues of social interest and voice pitch in men's preferences for women's voices.
Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana
2008-04-23
Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.
I like my voice better: self-enhancement bias in perceptions of voice attractiveness.
Hughes, Susan M; Harrison, Marissa A
2013-01-01
Previous research shows that the human voice can communicate a wealth of nonsemantic information; preferences for voices can predict health, fertility, and genetic quality of the speaker, and people often use voice attractiveness, in particular, to make these assessments of others. But it is not known what we think of the attractiveness of our own voices as others hear them. In this study eighty men and women rated the attractiveness of an array of voice recordings of different individuals and were not told that their own recorded voices were included in the presentation. Results showed that participants rated their own voices as sounding more attractive than others had rated their voices, and participants also rated their own voices as sounding more attractive than they had rated the voices of others. These findings suggest that people may engage in vocal implicit egotism, a form of self-enhancement.
The Combination of RSA And Block Chiper Algorithms To Maintain Message Authentication
NASA Astrophysics Data System (ADS)
Yanti Tarigan, Sepri; Sartika Ginting, Dewi; Lumban Gaol, Melva; Lorensi Sitompul, Kristin
2017-12-01
The RSA algorithm is a public key algorithm based on prime numbers that is still in use today. The strength of this algorithm lies in modular exponentiation and in the difficulty of factoring a number into its two prime factors, which remains hard to this day. The RSA scheme itself adopts the block cipher scheme: prior to encryption, the plaintext is divided into several blocks of the same length, where the plaintext and ciphertext are integers between 1 and n, where n is typically 1024 bits, and the block length is smaller than or equal to log(n)+1 with base 2. With the combination of the RSA algorithm and a block cipher, it is expected that the authentication of the plaintext is secure. The secured message is encrypted with the RSA algorithm first and then encrypted again using the block cipher. Conversely, the ciphertext is decrypted with the block cipher first and then decrypted again with the RSA algorithm. This paper suggests a combination of the RSA algorithm and a block cipher to secure data.
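The layering order described above (RSA first, then a second cipher; the reverse order on decryption) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the tiny primes 61 and 53, the one-byte block size, and in particular the XOR layer, which is only a placeholder standing in for whatever block cipher the paper uses. Textbook RSA without padding and a fixed XOR are not secure.

```python
# Toy textbook RSA parameters (far too small for real use).
P, Q = 61, 53
N = P * Q                              # modulus, 3233
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def rsa_encrypt_blocks(data):
    """Encrypt each byte as its own block; every block value is below N."""
    return [pow(b, E, N) for b in data]


def rsa_decrypt_blocks(blocks):
    """Invert rsa_encrypt_blocks and rebuild the byte string."""
    return bytes(pow(c, D, N) for c in blocks)


def xor_layer(blocks, key):
    """Placeholder for the second cipher: XOR each block with a 16-bit key.

    XOR is its own inverse, so the same function removes the layer.
    """
    return [c ^ key for c in blocks]


def encrypt(data, key):
    # RSA first, then the second layer, as the abstract describes.
    return xor_layer(rsa_encrypt_blocks(data), key)


def decrypt(blocks, key):
    # Reverse order: remove the second layer first, then RSA.
    return rsa_decrypt_blocks(xor_layer(blocks, key))


message = b"voice"
key = 0x5A5A                           # hypothetical shared key
ciphertext = encrypt(message, key)
assert decrypt(ciphertext, key) == message
```

A realistic version would use a vetted library for both layers and much larger blocks; the point of the sketch is only the encrypt/decrypt ordering.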
Curriculum in crisis, pedagogy in disrepair: a provocation.
Walker, Kim
2009-01-01
In this admittedly partisan text, I examine the broad contours of the nursing curriculum and pedagogy in Australia today (and while this unashamed parochialism might limit for an international audience the ideas herein, the issues being given voice might also resonate unexpectedly with those who harbour similar concerns in their part of the world). Much of what follows is drawn from personal observations over the last thirty years and frequent and often fraught conversations with peers and colleagues who have long worked in the tertiary and health service sectors. While there is a growing body of literature that supports many of the claims here there is also relative published silence around others that are perhaps more contentious.
Energy Education: The Quantitative Voice
NASA Astrophysics Data System (ADS)
Wolfson, Richard
2010-02-01
A serious study of energy use and its consequences has to be quantitative. It makes little sense to push your favorite renewable energy source if it can't provide enough energy to make a dent in humankind's prodigious energy consumption. Conversely, it makes no sense to dismiss alternatives, solar in particular, that supply Earth with energy at some 10,000 times our human energy consumption rate. But being quantitative, especially with nonscience students or the general public, is a delicate business. This talk draws on the speaker's experience presenting energy issues to diverse audiences through single lectures, entire courses, and a textbook. The emphasis is on developing a quick, "back-of-the-envelope" approach to quantitative understanding of energy issues.
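The "some 10,000 times" figure can be checked with exactly the kind of back-of-the-envelope estimate the talk advocates. The sketch below is not from the talk; it multiplies the solar constant by Earth's cross-sectional area and divides by an assumed round figure of 18 TW for humankind's average rate of energy use. That 18 TW value is an assumption, and the result is sensitive to it.

```python
import math

SOLAR_CONSTANT_W_PER_M2 = 1361.0   # sunlight intensity at the top of the atmosphere
EARTH_RADIUS_M = 6.371e6           # mean radius of Earth
HUMAN_POWER_W = 18e12              # ASSUMED average human energy use rate (~18 TW)

# Earth intercepts sunlight over its cross-sectional disc, pi * R^2,
# not over its full spherical surface.
cross_section_m2 = math.pi * EARTH_RADIUS_M ** 2
solar_power_w = SOLAR_CONSTANT_W_PER_M2 * cross_section_m2

ratio = solar_power_w / HUMAN_POWER_W
print(f"Solar input: {solar_power_w:.2e} W")
print(f"Ratio to human consumption: {ratio:.0f}")
```

Under these assumptions the ratio comes out on the order of 10^4, consistent with the abstract's figure. It counts all sunlight reaching the top of the atmosphere, so the usable fraction at the surface is smaller.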
2009-06-01
Blackberry handheld) device. After each voice command activation, the medic provided voice comments to be recorded in Observer Notepad over Voice...vial (up-right corner of picture) upon voice activation from the medic’s Blackberry handheld. The NPS UAS which was controlled by voice commands...Voice Portal using a standard Blackberry handheld with a head set. The results demonstrated sufficient accuracy for controlling the tactical sensor
Choi, Seong Hee; Zhang, Yu; Jiang, Jack J.; Bless, Diane M.; Welham, Nathan V.
2011-01-01
Objective: The primary goal of this study was to evaluate a nonlinear dynamic approach to the acoustic analysis of dysphonia associated with vocal fold scar and sulcus vocalis. Study Design: Case-control study. Methods: Acoustic voice samples from scar/sulcus patients and age/sex-matched controls were analyzed using correlation dimension (D2) and phase plots, time-domain based perturbation indices (jitter, shimmer, signal-to-noise ratio [SNR]), and an auditory-perceptual rating scheme. Signal typing was performed to identify samples with bifurcations and aperiodicity. Results: Type 2 and 3 acoustic signals were highly represented in the scar/sulcus patient group. When data were analyzed irrespective of signal type, all perceptual and acoustic indices successfully distinguished scar/sulcus patients from controls. Removal of type 2 and 3 signals eliminated the previously identified differences between experimental groups for all acoustic indices except D2. The strongest perceptual-acoustic correlation in our dataset was observed for SNR; the weakest correlation was observed for D2. Conclusions: These findings suggest that D2 is inferior to time-domain based perturbation measures for the analysis of dysphonia associated with scar/sulcus; however, time-domain based algorithms are inherently susceptible to inflation under highly aperiodic (i.e., type 2 and 3) signal conditions. Auditory-perceptual analysis, unhindered by signal aperiodicity, is therefore a robust strategy for distinguishing scar/sulcus patient voices from normal voices. Future acoustic analysis research in this area should consider alternative (e.g., frequency- and quefrency-domain based) measures alongside additional nonlinear approaches. PMID:22516315
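For readers unfamiliar with the time-domain perturbation indices named above, the following sketch shows one common definition of local jitter and local shimmer: the mean absolute difference between consecutive cycle values, divided by the mean value. The abstract does not say which exact formulas the authors used, so this definition, the function name, and the sample numbers are assumptions for illustration only.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value.

    Applied to glottal cycle periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer (both as fractions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle periods in seconds (around 100 Hz) and peak amplitudes.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]

jitter = local_perturbation(periods)
shimmer = local_perturbation(amplitudes)
print(f"jitter = {100 * jitter:.2f} %, shimmer = {100 * shimmer:.2f} %")
```

Both measures assume that individual cycles can be reliably marked. That assumption breaks down for strongly aperiodic signals, which matches the abstract's caution about inflated values for type 2 and 3 signals.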
Cognitive Attachment Model of Voices: Evidence Base and Future Implications
Berry, Katherine; Varese, Filippo; Bucci, Sandra
2017-01-01
There is a robust association between hearing voices and exposure to traumatic events. Identifying mediating mechanisms for this relationship is key to theories of voice hearing and the development of therapies for distressing voices. This paper outlines the Cognitive Attachment model of Voices (CAV), a theoretical model to understand the relationship between earlier interpersonal trauma and distressing voice hearing. The model builds on attachment theory and well-established cognitive models of voices and argues that attachment and dissociative processes are key psychological mechanisms that explain how trauma influences voice hearing. Following the presentation of the model, the paper will review the current state of evidence regarding the proposed mechanisms of vulnerability to voice hearing and maintenance of voice-related distress. This review will include evidence from studies supporting associations between dissociation and voices, followed by details of our own research supporting the role of dissociation in mediating the relationship between trauma and voices and evidence supporting the role of adult attachment in influencing beliefs and relationships that voice hearers can develop with voices. The paper concludes by outlining the key questions that future research needs to address to fully test the model and the clinical implications that arise from the work. PMID:28713292
Lu, Dan; Wen, Bei; Yang, Hui; Chen, Fei; Liu, Jun; Xu, Yanan; Zheng, Yitao; Zhao, Yu; Zou, Jian; Wang, Haiyang
2017-07-01
To investigate the differences and correlation between the Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) in teachers in China with and without voice disorders. This is a cross-sectional descriptive analytical study. The participants were 864 teachers (569 women, 295 men) whose vocal cords were examined using a flexible nasofibrolaryngoscope. Questionnaire results were obtained for both the VHI-10 and the V-RQOL. Of the 864 participants, 409 teachers had no voice disorders and 455 teachers had voice disorders. The most common voice complaint was hoarseness (n = 298) and the most common throat complaint was globus pharyngis (n = 79) in teachers with voice disorders. Chronic laryngitis (n = 218) and polyps and nodules (n = 182) were the most frequent diagnoses in teachers with voice disorders. Significant differences were seen on the VHI-10 between teachers with and those without voice disorders (P < 0.05) and in function between female and male teachers with voice disorders (P < 0.05) and between those with different voice disorders (P < 0.05). Moderate to strong correlations were observed between VHI-10 total score and those for the three domains of the VHI-10 and the V-RQOL (P < 0.0001). There is a high prevalence of voice disorders in teachers. Teachers with voice disorders have poor voice-related quality of life, with more impairment seen among female than male teachers. Different groups of voice disorders have different effects on voice-related quality of life. A moderate correlation was found between the results of the VHI-10 and the V-RQOL. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Mawson, Amy; Berry, Katherine; Murray, Craig; Hayward, Mark
2011-09-01
Research has found relational qualities of power and intimacy to exist within hearer-voice interactions. The present study aimed to provide a deeper understanding of the interpersonal context of voice hearing by exploring participants' relationships with their voices and other people in their lives. This research was designed in consultation with service users and employed a qualitative, phenomenological, and idiographic design using semi-structured interviews. Ten participants, recruited via mental health services, and who reported hearing voices in the previous week, completed the interviews. These were transcribed verbatim and analysed using interpretative phenomenological analysis. Five themes resulted from the analysis. Theme 1: 'person and voice' demonstrated that participants' voices often reflected the identity, but not always the quality of social acquaintances. Theme 2: 'voices changing and confirming relationship with the self' explored the impact of voice hearing in producing an inferior sense-of-self in comparison to others. Theme 3: 'a battle for control' centred on issues of control and a dilemma of independence within voice relationships. Theme 4: 'friendships facilitating the ability to cope' and theme 5: 'voices creating distance in social relationships' explored experiences of social relationships within the context of voice hearing, and highlighted the impact of social isolation for voice hearers. The study demonstrated the potential role of qualitative research in developing theories of voice hearing. It extended previous research by highlighting the interface between voices and the social world of the hearer, including reciprocal influences of social relationships on voices and coping. Improving voice hearers' sense-of-self may be a key factor in reducing the distress caused by voices. ©2010 The British Psychological Society.
Voice Disorders in Teacher Students-A Prospective Study and a Randomized Controlled Trial.
Ohlsson, Ann-Christine; Andersson, Eva M; Södersten, Maria; Simberg, Susanna; Claesson, Silwa; Barregård, Lars
2016-11-01
Teachers are at risk of developing voice disorders, but longitudinal studies on voice problems among teachers are lacking. The aim of this randomized trial was to investigate long-term effects of voice education for teacher students with mild voice problems. In addition, vocal health was examined prospectively in a group of students without voice problems. First-semester students answered three questionnaires: one about background factors, one about voice symptoms (Screen6), and the Voice Handicap Index. Students with voice problems according to the questionnaire results were randomized to a voice training group or a control group. At follow-up in the sixth semester, all students answered Screen6 again together with four questions about factors that could have affected vocal health during their teacher education. The training group and the control group also answered the Voice Handicap Index a second time. At follow-up, 400 students remained in the study: 27 in the training group, 54 in the control group, and 319 without voice problems at baseline. Voice problems had decreased somewhat more in the training group than in the control group, but the difference was not statistically significant (P = 0.1). However, subgroup analyses showed significantly larger improvement among the students in the group with complete participation in the training program compared with the group with incomplete participation. Of the 319 students without voice problems at baseline, 14% had developed voice problems. Voice problems often develop in teacher students. Despite extensive dropout, our results support the hypothesis that voice education for teacher students has a preventive effect. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Quantitative evaluation of the voice range profile in patients with voice disorder.
Ikeda, Y; Masuda, T; Manako, H; Yamashita, H; Yamamoto, T; Komiyama, S
1999-01-01
In 1953, Calvet first displayed the fundamental frequency (pitch) and sound pressure level (intensity) of a voice on a two-dimensional plane and created a voice range profile. This profile has been used to evaluate clinically various vocal disorders, although such evaluations to date have been subjective without quantitative assessment. In the present study, a quantitative system was developed to evaluate the voice range profile utilizing a personal computer. The area of the voice range profile was defined as the voice volume. This volume was analyzed in 137 males and 175 females who were treated for various dysphonias at Kyushu University between 1984 and 1990. Ten normal subjects served as controls. The voice volume in cases with voice disorders significantly decreased irrespective of the disease and sex. Furthermore, cases having better improvement after treatment showed a tendency for the voice volume to increase. These findings illustrated the voice volume as a useful clinical test for evaluating voice control in cases with vocal disorders.
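The abstract defines the voice volume as the area of the voice range profile but does not describe how it was computed. As a rough sketch under stated assumptions, the area can be approximated by integrating the gap between the loudest and softest phonation contours across pitch with the trapezoid rule. The data layout (semitone, minimum dB, maximum dB), the function name, and the sample values below are hypothetical.

```python
def voice_range_profile_area(profile):
    """Approximate the enclosed area of a voice range profile.

    `profile` is a list of (semitone, min_db, max_db) tuples sorted by
    pitch. The result is in semitone-dB units and is obtained by
    trapezoid integration of the dynamic range (max_db - min_db).
    """
    area = 0.0
    for (s0, lo0, hi0), (s1, lo1, hi1) in zip(profile, profile[1:]):
        width = s1 - s0
        mean_range = ((hi0 - lo0) + (hi1 - lo1)) / 2.0
        area += width * mean_range
    return area


# Hypothetical profile: narrow dynamic range at the pitch extremes,
# widest in the middle of the range.
profile = [
    (0, 60.0, 65.0),
    (6, 55.0, 80.0),
    (12, 55.0, 90.0),
    (18, 60.0, 95.0),
    (24, 75.0, 85.0),
]
print(voice_range_profile_area(profile))
```

A smaller area would then correspond to a reduced pitch range, a reduced dynamic range, or both, which is the direction of change the study reports for patients with voice disorders.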
Keus van de Poll, Marijke; Carlsson, Johannes; Marsh, John E; Ljung, Robert; Odelius, Johan; Schlittmeier, Sabine J; Sundin, Gunilla; Sörqvist, Patrik
2015-08-01
Broadband noise is often used as a masking sound to combat the negative consequences of background speech on performance in open-plan offices. As office workers generally dislike broadband noise, it is important to find alternatives that are more appreciated while being at least not less effective. The purpose of experiment 1 was to compare broadband noise with two alternatives-multiple voices and water waves-in the context of a serial short-term memory task. A single voice impaired memory in comparison with silence, but when the single voice was masked with multiple voices, performance was on level with silence. Experiment 2 explored the benefits of multiple-voice masking in more detail (by comparing one voice, three voices, five voices, and seven voices) in the context of word processed writing (arguably a more office-relevant task). Performance (i.e., writing fluency) increased linearly from worst performance in the one-voice condition to best performance in the seven-voice condition. Psychological mechanisms underpinning these effects are discussed.
A parallel-pipelined architecture for a multi carrier demodulator
NASA Astrophysics Data System (ADS)
Kwatra, S. C.; Jamali, M. M.; Eugene, Linus P.
1991-03-01
Analog devices have been used for processing the information on board satellites. Presently, digital devices are being used because they are economical and flexible as compared to their analog counterparts. Several schemes of digital transmission can be used depending on the data rate requirement of the user. An economical scheme of transmission for small earth stations uses single channel per carrier/frequency division multiple access (SCPC/FDMA) on the uplink and time division multiplexing (TDM) on the downlink. This is a typical communication service offered to low data rate users in the commercial mass market. These channels usually pertain to either voice or data transmission. An efficient digital demodulator architecture is provided for a large number of low data rate users. A demodulator primarily consists of carrier, clock, and data recovery modules. This design uses principles of parallel processing, pipelining, and time sharing schemes to process large numbers of voice or data channels. It maintains the optimum throughput, which is derived from the designed architecture and from the use of high speed components. The design is optimized for reduced power and area requirements. This is essential for satellite applications. The design is also flexible in processing a group of a varying number of channels. The algorithms that are used are verified by the use of a computer aided software engineering (CASE) tool called the Block Oriented System Simulator. The data flow, control circuitry, and interface of the hardware design are simulated in the C language. Also, a multiprocessor approach is provided to map, model, and simulate the demodulation algorithms, mainly from a speed viewpoint. A hypercube based architecture implementation is provided for such a scheme of operation. The hypercube structure and the demodulation models on hypercubes are simulated in Ada.
Voice Tremor in Parkinson's Disease: An Acoustic Study.
Gillivan-Murphy, Patricia; Miller, Nick; Carding, Paul
2018-01-30
Voice tremor associated with Parkinson disease (PD) has not been characterized. Its relationship with voice disability and disease variables is unknown. This study aimed to evaluate voice tremor in people with PD (pwPD) and a matched control group using acoustic analysis, and to examine correlations with voice disability and disease variables. Acoustic voice tremor analysis was completed on 30 pwPD and 28 age-gender matched controls. Voice disability (Voice Handicap Index), and disease variables of disease duration, Activities of Daily Living (Unified Parkinson's Disease Rating Scale [UPDRS II]), and motor symptoms related to PD (UPDRS III) were examined for relationship with voice tremor measures. Voice tremor was detected acoustically in pwPD and controls with similar frequency. PwPD had a statistically significantly higher rate of amplitude tremor (Hz) than controls (P = 0.001). Rate of amplitude tremor was negatively and significantly correlated with UPDRS III total score (rho -0.509). For pwPD, the magnitude and periodicity of acoustic tremor was higher than for controls without statistical significance. The magnitude of frequency tremor (Mftr%) was positively and significantly correlated with disease duration (rho 0.463). PwPD had higher Voice Handicap Index total, functional, emotional, and physical subscale scores than matched controls (P < 0.001). Voice disability did not correlate significantly with acoustic voice tremor measures. Acoustic analysis enhances understanding of PD voice tremor characteristics, its pathophysiology, and its relationship with voice disability and disease symptomatology. Copyright © 2018 The Voice Foundation. All rights reserved.
Epidemiology of Voice Disorders in Latvian School Teachers.
Trinite, Baiba
2017-07-01
The prevalence of voice disorders in the teacher population in Latvia has not been studied so far and this is the first epidemiological study whose goal is to investigate the prevalence of voice disorders and their risk factors in this professional group. A wide cross-sectional study using stratified sampling methodology was implemented in the general education schools of Latvia. The self-administered voice risk factor questionnaire and the Voice Handicap Index were completed by 522 teachers. Two teacher groups were formed: the voice disorders group, which included 235 teachers with actual voice problems or problems during the last 9 months; and the control group, which included 174 teachers without voice disorders. Sixty-six percent of teachers gave a positive answer to the following question: Have you ever had problems with your voice? Voice problems are more often found in female than male teachers (68.2% vs 48.8%). Music teachers suffer from voice disorders more often than teachers of other subjects. Eighty-two percent of teachers first faced voice problems in their professional career. The odds of voice disorders increase if the following risk factors exist: extra vocal load, shouting, throat clearing, neglecting of personal health, background noise, chronic illnesses of the upper respiratory tract, allergy, job dissatisfaction, and regular stress in the workplace. The study findings indicated a high risk of voice disorders among Latvian teachers. The study confirmed data concerning the multifactorial etiology of voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Speaker's comfort in teaching environments: voice problems in Swedish teaching staff.
Åhlander, Viveka Lyberg; Rydell, Roland; Löfqvist, Anders
2011-07-01
The primary objective of this study was to examine how a group of Swedish teachers rate aspects of their working environment that can be presumed to have an impact on vocal behavior and voice problems. The secondary objective was to explore the prevalence of voice problems in Swedish teachers. Questionnaires were distributed to the teachers of 23 randomized schools. Teaching staff at all levels were included, except preschool teachers and teachers at specialized, vocational high schools. The response rate was 73%. The results showed that 13% of the whole group reported voice problems occurring sometimes, often, or always. The teachers reporting voice problems were compared with those without problems. There were significant differences among the groups for several items. The teachers with voice problems rated items on room acoustics and work environment as more noticeable. This group also reported voice symptoms, such as hoarseness, throat clearing, and voice change, to a significantly higher degree, even though teachers in both groups reported some voice symptoms. Absence from work because of voice problems was also significantly more common in the group with voice problems--35% versus 9% in the group without problems. We may conclude that teachers suffering from voice problems react stronger to loading factors in the teaching environment, report more frequent symptoms of voice discomfort, and are more often absent from work because of voice problems than their voice-healthy colleagues. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
Selective attention in perceptual adjustments to voice.
Mullennix, J W; Howe, J N
1999-10-01
The effects of perceptual adjustments to voice information on the perception of isolated spoken words were examined. In two experiments, spoken target words were preceded or followed within a trial by a neutral word spoken in the same voice as the target or in a different voice. Overall, words were reproduced more accurately on trials on which the voice of the neutral word matched the voice of the spoken target word, suggesting that perceptual adjustments to voice interfere with word processing. This result, however, was mediated by selective attention to voice. The results provide further evidence of a close processing relationship between perceptual adjustments to voice and spoken word recognition.
Understanding the 'Anorexic Voice' in Anorexia Nervosa.
Pugh, Matthew; Waller, Glenn
2017-05-01
In common with individuals experiencing a number of disorders, people with anorexia nervosa report experiencing an internal 'voice'. The anorexic voice comments on the individual's eating, weight and shape and instructs the individual to restrict or compensate. However, the core characteristics of the anorexic voice are not known. This study aimed to develop a parsimonious model of the voice characteristics that are related to key features of eating disorder pathology and to determine whether patients with anorexia nervosa fall into groups with different voice experiences. The participants were 49 women with full diagnoses of anorexia nervosa. Each completed validated measures of the power and nature of their voice experience and of their responses to the voice. Different voice characteristics were associated with current body mass index, duration of disorder and eating cognitions. Two subgroups emerged, with 'weaker' and 'stronger' voice experiences. Those with stronger voices were characterized by having more negative eating attitudes, more severe compensatory behaviours, a longer duration of illness and a greater likelihood of having the binge-purge subtype of anorexia nervosa. The findings indicate that the anorexic voice is an important element of the psychopathology of anorexia nervosa. Addressing the anorexic voice might be helpful in enhancing outcomes of treatments for anorexia nervosa, but that conclusion might apply only to patients with more severe eating psychopathology. Copyright © 2016 John Wiley & Sons, Ltd. Experiences of an internal 'anorexic voice' are common in anorexia nervosa. Clinicians should consider the role of the voice when formulating eating pathology in anorexia nervosa, including how individuals perceive and relate to that voice. Addressing the voice may be beneficial, particularly in more severe and enduring forms of anorexia nervosa. 
When working with the voice, clinicians should aim to address both the content of the voice and how individuals relate and respond to it.
14 CFR 23.1457 - Cockpit voice recorders.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 14 Aeronautics and Space 1 2011-01-01 2011-01-01 false Cockpit voice recorders. 23.1457 Section 23... Equipment § 23.1457 Cockpit voice recorders. (a) Each cockpit voice recorder required by the operating rules...) Voice communications transmitted from or received in the airplane by radio. (2) Voice communications of...
ERIC Educational Resources Information Center
Morrow, Sharon L.
2009-01-01
Teachers represent the largest group of occupational voice users and have voice-related problems at a rate of over twice that found in the general population. Among teachers, music teachers are roughly four times more likely than classroom teachers to develop voice-related problems. Although it has been established that music teachers use their…
Cognitive Behavioural Relating Therapy (CBRT) for voice hearers: a case study.
Paulik, Georgie; Hayward, Mark; Birchwood, Max
2013-10-01
There has been a recent focus on the interpersonal nature of the voice hearing experience, with studies showing that similar patterns of relating exist between voice hearer and voice as between voice hearer and social others. Two recent therapeutic approaches to voices, Cognitive Therapy for Command Hallucinations and Relating Therapy, have been developed to address patterns of relating and power imbalances between voice hearer and voice. This paper presents a novel intervention that combines elements of these two therapies, named Cognitive Behavioural Relating Therapy (CBRT). The application of CBRT is illustrated through a clinical case study. The clinical case study showed changes in patterns of relating, improved self-esteem and reductions in voice-related distress. The outcomes provide preliminary support for the utility of CBRT when working with voice hearers.
Pribuisiene, Ruta; Uloza, Virgilijus; Kardisiene, Vilija
2011-12-01
To determine the impact of age, gender, and vocal training on voice characteristics of children aged 6-13 years. Voice acoustic and phonetogram parameters were determined for a group of 44 singing and 31 non-singing children. No impact of gender and/or age on phonetogram, acoustic voice parameters, and maximum phonation time was detected. Voice ranges of all children represented a pre-pubertal soprano type with a voice range of 22 semitones for non-singing and of 26 semitones for singing individuals. The mean maximum voice intensity was 81 dB. Vocal training had a positive impact on voice intensity parameters in girls. The presented data on average voice characteristics may be applicable in clinical practice and provide relevant support for voice assessment.
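The voice ranges above are reported in semitones. As a minimal sketch of how such a figure follows from measured pitch limits (the function name and the example frequencies are illustrative assumptions, not values from the study), the interval between the lowest and highest fundamental frequency is 12 times the base-2 logarithm of their ratio:

```python
import math

def range_in_semitones(f_low_hz, f_high_hz):
    """Interval between two fundamental frequencies, in equal-tempered semitones."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)

# Hypothetical example: one octave (a doubling of frequency) spans 12 semitones.
print(range_in_semitones(220.0, 440.0))
```

Under this definition, a 22-semitone range corresponds to a frequency ratio of 2^(22/12), roughly 3.6, and a 26-semitone range to roughly 4.5.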
Understanding the mechanisms of familiar voice-identity recognition in the human brain.
Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina
2018-03-31
Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.
Voices to reckon with: perceptions of voice identity in clinical and non-clinical voice hearers
Badcock, Johanna C.; Chhabra, Saruchi
2013-01-01
The current review focuses on the perception of voice identity in clinical and non-clinical voice hearers. Identity perception in auditory verbal hallucinations (AVH) is grounded in the mechanisms of human (i.e., real, external) voice perception, and shapes the emotional (distress) and behavioral (help-seeking) response to the experience. Yet, the phenomenological assessment of voice identity is often limited, for example to the gender of the voice, and has failed to take advantage of recent models and evidence on human voice perception. In this paper we aim to synthesize the literature on identity in real and hallucinated voices and begin by providing a comprehensive overview of the features used to judge voice identity in healthy individuals and in people with schizophrenia. The findings suggest some subtle, but possibly systematic biases across different levels of voice identity in clinical hallucinators that are associated with higher levels of distress. Next we provide a critical evaluation of voice processing abilities in clinical and non-clinical voice hearers, including recent data collected in our laboratory. Our studies used diverse methods, assessing recognition and binding of words and voices in memory as well as multidimensional scaling of voice dissimilarity judgments. The findings overall point to significant difficulties recognizing familiar speakers and discriminating between unfamiliar speakers in people with schizophrenia, both with and without AVH. In contrast, these voice processing abilities appear to be generally intact in non-clinical hallucinators. The review highlights some important avenues for future research and treatment of AVH associated with a need for care, and suggests some novel insights into other symptoms of psychosis. PMID:23565088
Uloza, Virgilijus; Padervinskis, Evaldas; Vegiene, Aurelija; Pribuisiene, Ruta; Saferis, Viktoras; Vaiciukynas, Evaldas; Gelzinis, Adas; Verikas, Antanas
2015-11-01
The objective of this study is to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and to investigate the utility of SP voice recordings for voice screening. Voice samples of the sustained vowel /a/ obtained from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: an oral AKG Perception 220 microphone and an SP Samsung Galaxy Note3 microphone. Acoustic voice signal data were measured for fundamental frequency, jitter and shimmer, normalized noise energy (NNE), signal to noise ratio and harmonic to noise ratio using Dr. Speech software. Discriminant analysis-based Correct Classification Rate (CCR) and Random Forest Classifier (RFC) based Equal Error Rate (EER) were used to evaluate the feasibility of acoustic voice parameters for classifying normal and pathological voice classes. The Lithuanian version of the Glottal Function Index (LT_GFI) questionnaire was utilized for self-assessment of the severity of voice disorder. The correlations of acoustic voice parameters obtained with the two types of microphones were statistically significant and strong (r = 0.73-1.0) across all measurements. When classifying into normal/pathological voice classes, the Oral-NNE yielded a CCR of 73.7% and the pair of SP-NNE and SP-shimmer parameters yielded a CCR of 79.5%. However, fusion of the results obtained from SP voice recordings and GFI data provided a CCR of 84.60%, and the RFC yielded an EER of 7.9%. In conclusion, measurements of acoustic voice parameters using the SP microphone were shown to be reliable in clinical settings, demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, and validated the suitability of the SP microphone signal for the task of automatic voice analysis and screening.
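The two figures of merit used above, CCR and EER, can be illustrated with a short sketch. This is not the study's pipeline (which used discriminant analysis and a Random Forest Classifier); it only assumes that some classifier has produced a score per recording, with higher scores meaning "more likely pathological". The function names and the toy data are hypothetical.

```python
def correct_classification_rate(scores, labels, threshold):
    """Fraction of samples whose thresholded score matches the label (1 = pathological)."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)

def equal_error_rate(scores, labels):
    """Sweep thresholds and return the error rate where false acceptance
    and false rejection are closest to equal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), None
    for t in sorted(set(scores)):
        far = sum(s >= t for s in neg) / len(neg)  # normal voices flagged as pathological
        frr = sum(s < t for s in pos) / len(pos)   # pathological voices missed
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy scores (assumed, not from the paper): one normal voice scores above one pathological voice.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(equal_error_rate(scores, labels))
print(correct_classification_rate(scores, labels, threshold=0.5))
```

The EER summarizes performance in a single threshold-independent number, whereas the CCR depends on the chosen decision threshold.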
Internal connections and conversations: the internalized other interview in bereavement work.
Moules, Nancy J
Much of the work of grief lies in the ways the bereaved learn to maintain connection to the deceased in their lives, while living alongside the physical absence of them. The theory of an Internalized Other Interview is that we carry within ourselves impressions, memories, beliefs, assessments, doctrines, and codes of those who have shaped our lives through relationship. This internalized community of commentators is active in our lives on a day-to-day basis, but when someone dies, their active voice in the dialogue is shifted to a perceived inactivity. However, I argue that, despite the physical absence of the other, the voice continues to resonate and interact in our formation of our worlds. How our loved ones live on inside us influences who we are in the world and in our bereavement. As a result of our research and clinical work, I have come to believe that the active interviewing of the deceased person as internalized in the bereaved can have powerful and healing effects. In this article, I share the results of the research related to this intervention, describe the history located in Internalized Other Interviewing, and offer a transcription of an Internalized Other Interview with a young man and his family who recently lost both his brother and father.
Change in singing voice production, objectively measured.
Schutte, Harm K; Stark, James A; Miller, Donald G
2003-12-01
Although subglottal pressures in conversational speech are relatively easily measured and thus known, the higher values that sometimes occur in singing (especially in tenors) have received little attention in the literature. Still more unusual is the opportunity to measure a large-scale change over decades in the application of pressure in singing production. This study compares measurements of subglottal pressure in a tenor/singing teacher (JS) at two points in his career: in his early thirties, when he was a subject in HS's dissertation study on the efficiency of voice production; and recently, in his fifties, in connection with JS's forthcoming book on the history of the pedagogy of Bel Canto. Although a single case study, its points of special interest include the high values initially measured (up to 100 cm H2O) and the reduction of this figure by more than 50% in the maximal values of the recent measurements. The study compares these values with those of other singers in the same laboratory (both with esophageal balloon and directly, with a catheter passed through the glottis) and in the literature, as well as discusses in detail the problems pertaining to the measurement (repeatability, correcting for lung volume, etc.). As a sophisticated subject, JS makes some pertinent observations about the changes in his use of subglottal pressure.
Communication devices in the operating room.
Ruskin, Keith J
2006-12-01
Effective communication is essential to patient safety. Although radio pagers have been the cornerstone of medical communication, new devices such as cellular telephones, personal digital assistants (PDAs), and laptop or tablet computers can help anesthesiologists to get information quickly and reliably. Anesthesiologists can use these devices to speak with colleagues, access the medical record, or help a colleague in another location without having to leave a patient's side. Recent advances in communication technology offer anesthesiologists new ways to improve patient care. Anesthesiologists rely on a wide variety of information to make decisions, including vital signs, laboratory values, and entries in the medical record. Devices such as PDAs and computers with wireless networking can be used to access this information. Mobile telephones can be used to get help or ask for advice, and are more efficient than radio pagers. Voice over Internet protocol is a new technology that allows voice conversations to be routed over computer networks. It is widely believed that wireless devices can cause life-threatening interference with medical devices. The actual risk is very low, and is offset by a significant reduction in medical errors that results from more efficient communication. Using common technology like cellular telephones and wireless networks is a simple, cost-effective way to improve patient care.
Learned face-voice pairings facilitate visual search.
Zweig, L Jacob; Suzuki, Satoru; Grabowecky, Marcia
2015-04-01
Voices provide a rich source of information that is important for identifying individuals and for social interaction. During search for a face in a crowd, voices often accompany visual information, and they facilitate localization of the sought-after individual. However, it is unclear whether this facilitation occurs primarily because the voice cues the location of the face or because it also increases the salience of the associated face. Here we demonstrate that a voice that provides no location information nonetheless facilitates visual search for an associated face. We trained novel face-voice associations and verified learning using a two-alternative forced choice task in which participants had to correctly match a presented voice to the associated face. Following training, participants searched for a previously learned target face among other faces while hearing one of the following sounds (localized at the center of the display): a congruent learned voice, an incongruent but familiar voice, an unlearned and unfamiliar voice, or a time-reversed voice. Only the congruent learned voice speeded visual search for the associated face. This result suggests that voices facilitate the visual detection of associated faces, potentially by increasing their visual salience, and that the underlying crossmodal associations can be established through brief training.
[Voice assessment and demographic data of applicants for a school of speech therapists].
Reiter, R; Brosch, S
2008-05-01
Demographic data, subjective and objective voice analysis as well as self-assessment of voice quality from applicants for a school of speech therapists were investigated. Demographic data from 116 applicants were collected and their voice quality assessed by three independent judges. An objective evaluation was based on maximum phonation time, average fundamental frequency, dynamic range, and percent jitter and shimmer by means of the Goettinger Hoarseness diagram. Self-assessment of voice quality was done using the "voice handicap index questionnaire". Of the twenty successful applicants, 95% had a physiological voice; all were musical and had university entrance qualifications. Subjective voice assessment showed a hoarse voice in 16% of the applicants. In this subgroup, unphysiological vocal use was observed in 72% and reduced articulation in 45%. The objective voice parameters did not show a significant difference between the 3 groups. Self-assessment of the voice was inconspicuous in all applicants. Applicants with general qualification for university entrance, musicality and a physiological voice were more likely to be successful. There were main differences between self-assessment of voice and quantitative analysis or subjective assessment by three independent judges.
... an ENT Doctor Near You Keeping Your Voice Healthy Keeping Your Voice Healthy Patient Health Information News ... voice-related. Key Steps for Keeping Your Voice Healthy Drink plenty of water. Moisture is good for ...
Overgeneral autobiographical memory bias in clinical and non-clinical voice hearers.
Jacobsen, Pamela; Peters, Emmanuelle; Ward, Thomas; Garety, Philippa A; Jackson, Mike; Chadwick, Paul
2018-03-14
Hearing voices can be a distressing and disabling experience for some, whilst it is a valued experience for others, so-called 'healthy voice-hearers'. Cognitive models of psychosis highlight the role of memory, appraisal and cognitive biases in determining emotional and behavioural responses to voices. A memory bias potentially associated with distressing voices is the overgeneral memory bias (OGM), namely the tendency to recall a summary of events rather than specific occasions. It may limit access to autobiographical information that could be helpful in re-appraising distressing experiences, including voices. We investigated the possible links between OGM and distressing voices in psychosis by comparing three groups: (1) clinical voice-hearers (N = 39), (2) non-clinical voice-hearers (N = 35) and (3) controls without voices (N = 77) on a standard version of the autobiographical memory test (AMT). Clinical and non-clinical voice-hearers also completed a newly adapted version of the task, designed to assess voices-related memories (vAMT). As hypothesised, the clinical group displayed an OGM bias by retrieving fewer specific autobiographical memories on the AMT compared with both the non-clinical and control groups, who did not differ from each other. The clinical group also showed an OGM bias in recall of voice-related memories on the vAMT, compared with the non-clinical group. Clinical voice-hearers display an OGM bias when compared with non-clinical voice-hearers on both general and voices-specific recall tasks. These findings have implications for the refinement and targeting of psychological interventions for psychosis.
Rousseau, Bernard; Gutmann, Michelle L; Mau, Theodore; Francis, David O; Johnson, Jeffrey P; Novaleski, Carolyn K; Vinson, Kimberly N; Garrett, C Gaelyn
2015-03-01
This randomized trial investigated voice rest and supplemental text-to-speech communication versus voice rest alone on visual analog scale measures of communication effectiveness and magnitude of voice use. Randomized clinical trial. Multicenter outpatient voice clinics. Thirty-seven patients undergoing phonomicrosurgery. Patients undergoing phonomicrosurgery were randomized to voice rest and supplemental text-to-speech communication or voice rest alone. The primary outcome measure was the impact of voice rest on ability to communicate effectively over a 7-day period. Pre- and postoperative magnitude of voice use was also measured as an observational outcome. Patients randomized to voice rest and supplemental text-to-speech communication reported higher median communication effectiveness on each postoperative day compared to those randomized to voice rest alone, with significantly higher median communication effectiveness on postoperative days 3 (P=.03) and 5 (P=.01). Magnitude of voice use did not differ on any preoperative (P>.05) or postoperative day (P>.05), nor did patients significantly decrease voice use as the surgery date approached (P>.05). However, there was a significant reduction in median voice use pre- to postoperatively across patients (P<.001) with median voice use ranging from 0 to 3 throughout the postoperative week. Supplemental text-to-speech communication increased patient-perceived communication effectiveness on postoperative days 3 and 5 over voice rest alone. With the prevalence of smartphones and the widespread use of text messaging, supplemental text-to-speech communication may provide an accessible and cost-effective communication option for patients on vocal restrictions. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2015.
Poliva, Oren
2017-01-01
In the brain of primates, the auditory cortex connects with the frontal lobe via the temporal pole (auditory ventral stream; AVS) and via the inferior parietal lobe (auditory dorsal stream; ADS). The AVS is responsible for sound recognition, and the ADS for sound-localization, voice detection and integration of calls with faces. I propose that the primary role of the ADS in non-human primates is the detection and response to contact calls. These calls are exchanged between tribe members (e.g., mother-offspring) and are used for monitoring location. Detection of contact calls occurs by the ADS identifying a voice, localizing it, and verifying that the corresponding face is out of sight. Once a contact call is detected, the primate produces a contact call in return via descending connections from the frontal lobe to a network of limbic and brainstem regions. Because the ADS of present day humans also performs speech production, I further propose an evolutionary course for the transition from contact call exchange to an early form of speech. In accordance with this model, structural changes to the ADS endowed early members of the genus Homo with partial vocal control. This development was beneficial as it enabled offspring to modify their contact calls with intonations for signaling high or low levels of distress to their mother. Eventually, individuals were capable of participating in yes-no question-answer conversations. In these conversations the offspring emitted a low-level distress call for inquiring about the safety of objects (e.g., food), and his/her mother responded with a high- or low-level distress call to signal approval or disapproval of the interaction. Gradually, the ADS and its connections with brainstem motor regions became more robust and vocal control became more volitional. Speech emerged once vocal control was sufficient for inventing novel calls. PMID:28928931
Children's Voice or Children's Voices? How Educational Research Can Be at the Heart of Schooling
ERIC Educational Resources Information Center
Stern, Julian
2015-01-01
There are problems with considering children and young people in schools as quite separate individuals, and with considering them as members of a single collectivity. The tension is represented in the use of "voice" and "voices" in educational debates. Voices in dialogue, in contrast to "children's voice", are…
ERIC Educational Resources Information Center
Liming, Drew
2009-01-01
This article talks about voice actors and features Tony Oliver, a professional voice actor. Voice actors help to bring one's favorite cartoon and video game characters to life. They also do voice-overs for radio and television commercials and movie trailers. These actors use the sound of their voice to sell a character's emotions--or an advertised…
Applied Swarm-based medicine: collecting decision trees for patterns of algorithms analysis.
Panje, Cédric M; Glatzer, Markus; von Rappard, Joscha; Rothermundt, Christian; Hundsberger, Thomas; Zumstein, Valentin; Plasswilm, Ludwig; Putora, Paul Martin
2017-08-16
The objective consensus methodology has recently been applied in consensus finding in several studies on medical decision-making among clinical experts or guidelines. The main advantages of this method are an automated analysis and comparison of treatment algorithms of the participating centers which can be performed anonymously. Based on the experience from completed consensus analyses, the main steps for the successful implementation of the objective consensus methodology were identified and discussed among the main investigators. The following steps for the successful collection and conversion of decision trees were identified and defined in detail: problem definition, population selection, draft input collection, tree conversion, criteria adaptation, problem re-evaluation, results distribution and refinement, tree finalisation, and analysis. This manuscript provides information on the main steps for successful collection of decision trees and summarizes important aspects at each point of the analysis.
Hierarchical planning for a surface mounting machine placement.
Zeng, You-jiao; Ma, Deng-ze; Jin, Ye; Yan, Jun-qi
2004-11-01
For a surface mounting machine (SMM) in a printed circuit board (PCB) assembly line, there are four problems: CAD data conversion, nozzle selection, feeder assignment and placement sequence determination. A hierarchical planning approach that addresses them to maximize the throughput rate of an SMM is presented here. To minimize set-up time, a CAD data conversion system was first applied that could automatically generate the data for machine placement from CAD design data files. Then an effective nozzle selection approach was implemented to minimize nozzle-changing time. Next, to minimize picking time, an algorithm for feeder assignment was used so that as many components as possible are picked simultaneously. Finally, in order to shorten pick-and-place time, a heuristic algorithm was used to determine the optimal component placement sequence according to the decided feeder positions. Experiments were conducted on a four-head SMM. The experimental results were used to analyse the assembly line performance.
A Quasiphysics Intelligent Model for a Long Range Fast Tool Servo
Liu, Qiang; Zhou, Xiaoqin; Lin, Jieqiong; Xu, Pengzi; Zhu, Zhiwei
2013-01-01
Accurately modeling the dynamic behaviors of a fast tool servo (FTS) is one of the key issues in the ultraprecision positioning of the cutting tool. Herein, a quasiphysics intelligent model (QPIM) integrating a linear physics model (LPM) and a radial basis function (RBF) based neural model (NM) is developed to accurately describe the dynamic behaviors of a voice coil motor (VCM) actuated long range fast tool servo (LFTS). To identify the parameters of the LPM, a novel Opposition-based Self-adaptive Replacement Differential Evolution (OSaRDE) algorithm is proposed, which has been shown to converge faster without compromising solution quality and to outperform the similar evolutionary algorithms considered. The modeling errors of the LPM and the QPIM are investigated by experiments. The modeling error of the LPM presents an obvious trend component of about ±1.15% of the full span range, verifying the efficiency of the proposed OSaRDE algorithm for system identification. As for the QPIM, the trend component in the residual error of the LPM is well suppressed, and the error of the QPIM remains at the noise level. All the results verify the efficiency and superiority of the proposed modeling and identification approaches. PMID:24163627
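The abstract does not describe OSaRDE in detail, so the following is only a sketch of the generic opposition-based idea often used to seed differential evolution, not the authors' algorithm. For each random candidate x within bounds [lo, hi], the opposite point lo + hi - x is also evaluated, and the fittest half of the combined set is kept. All names and the toy cost function are assumptions.

```python
import random

def opposite_point(x, bounds):
    """Reflect each coordinate across the centre of its search interval."""
    return [lo + hi - xi for xi, (lo, hi) in zip(x, bounds)]

def opposition_based_init(cost, bounds, pop_size, rng):
    """Draw a random population, add its opposites, keep the pop_size lowest-cost points."""
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    candidates = population + [opposite_point(x, bounds) for x in population]
    candidates.sort(key=cost)
    return candidates[:pop_size]

# Hypothetical stand-in for a parameter-identification error: squared distance from the origin.
sphere = lambda x: sum(v * v for v in x)
start = opposition_based_init(sphere, [(-5.0, 5.0), (-5.0, 5.0)], pop_size=6, rng=random.Random(0))
print(len(start))
```

The returned population would then be handed to an ordinary differential evolution loop (mutation, crossover, selection), which is omitted here.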
Healthcare workplace conversations on race and the perspectives of physicians of African descent.
Nunez-Smith, Marcella; Curry, Leslie A; Berg, David; Krumholz, Harlan M; Bradley, Elizabeth H
2008-09-01
Although experts recommend that healthcare organizations create forums for honest dialogue about race, there is little insight into the physician perspectives that may influence these conversations across the healthcare workforce. To identify the range of perspectives that might contribute to workplace silence on race and affect participation in race-related conversations within healthcare settings. In-person, in-depth, racially concordant qualitative interviews. Twenty-five physicians of African descent practicing in the 6 New England states. Line-by-line independent coding and group negotiated consensus to develop codes structure using constant comparative method. Five themes characterize perspectives of participating physicians of African descent that potentially influence race-related conversations at work: 1) Perceived race-related healthcare experiences shape how participating physicians view healthcare organizations and their professional identities prior to any formal medical training; 2) Protecting racial/ethnic minority patients from healthcare discrimination is a top priority for participating physicians; 3) Participating physicians often rely on external support systems for race-related issues, rather than support systems inside the organization; 4) Participating physicians perceive differences between their interpretations of potentially offensive race-related work experiences and their non-minority colleagues' interpretations of the same experiences; and 5) Participating physicians are uncomfortable voicing race-related concerns at work. Creating a healthcare work environment that successfully supports diversity is as important as recruiting diversity across the workforce. Developing constructive ways to discuss race and race relations among colleagues in the workplace is a key step towards creating a supportive environment for employees and patients from all backgrounds.
Healthcare Workplace Conversations on Race and the Perspectives of Physicians of African Descent
Curry, Leslie A.; Berg, David; Krumholz, Harlan M.; Bradley, Elizabeth H.
2008-01-01
Background Although experts recommend that healthcare organizations create forums for honest dialogue about race, there is little insight into the physician perspectives that may influence these conversations across the healthcare workforce. Objective To identify the range of perspectives that might contribute to workplace silence on race and affect participation in race-related conversations within healthcare settings. Design In-person, in-depth, racially concordant qualitative interviews. Participants Twenty-five physicians of African descent practicing in the 6 New England states. Approach Line-by-line independent coding and group negotiated consensus to develop codes structure using constant comparative method. Main Results Five themes characterize perspectives of participating physicians of African descent that potentially influence race-related conversations at work: 1) Perceived race-related healthcare experiences shape how participating physicians view healthcare organizations and their professional identities prior to any formal medical training; 2) Protecting racial/ethnic minority patients from healthcare discrimination is a top priority for participating physicians; 3) Participating physicians often rely on external support systems for race-related issues, rather than support systems inside the organization; 4) Participating physicians perceive differences between their interpretations of potentially offensive race-related work experiences and their non-minority colleagues’ interpretations of the same experiences; and 5) Participating physicians are uncomfortable voicing race-related concerns at work. Conclusions Creating a healthcare work environment that successfully supports diversity is as important as recruiting diversity across the workforce. Developing constructive ways to discuss race and race relations among colleagues in the workplace is a key step towards creating a supportive environment for employees and patients from all backgrounds. 
PMID:18618190
Normal voice processing after posterior superior temporal sulcus lesion.
Jiahui, Guo; Garrido, Lúcia; Liu, Ran R; Susilo, Tirta; Barton, Jason J S; Duchaine, Bradley
2017-10-01
The right posterior superior temporal sulcus (pSTS) shows a strong response to voices, but the cognitive processes generating this response are unclear. One possibility is that this activity reflects basic voice processing. However, several fMRI and magnetoencephalography findings suggest instead that pSTS serves as an integrative hub that combines voice and face information. Here we investigate whether right pSTS contributes to basic voice processing by testing Faith, a patient whose right pSTS was resected, with eight behavioral tasks assessing voice identity perception and recognition, voice sex perception, and voice expression perception. Faith performed normally on all the tasks. Her normal performance indicates right pSTS is not necessary for intact voice recognition and suggests that pSTS activations to voices reflect higher-level processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
A pneumatic Bionic Voice prosthesis-Pre-clinical trials of controlling the voice onset and offset.
Ahmadi, Farzaneh; Noorian, Farzad; Novakovic, Daniel; van Schaik, André
2018-01-01
Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech.
A pneumatic Bionic Voice prosthesis—Pre-clinical trials of controlling the voice onset and offset
Noorian, Farzad; Novakovic, Daniel; van Schaik, André
2018-01-01
Despite emergent progress in many fields of bionics, a functional Bionic Voice prosthesis for laryngectomy patients (larynx amputees) has not yet been achieved, leading to a lifetime of vocal disability for these patients. This study introduces a novel framework of Pneumatic Bionic Voice Prostheses as an electronic adaptation of the Pneumatic Artificial Larynx (PAL) device. The PAL is a non-invasive mechanical voice source, driven exclusively by respiration with an exceptionally high voice quality, comparable to the existing gold standard of Tracheoesophageal (TE) voice prosthesis. Following PAL design closely as the reference, Pneumatic Bionic Voice Prostheses seem to have a strong potential to substitute the existing gold standard by generating a similar voice quality while remaining non-invasive and non-surgical. This paper designs the first Pneumatic Bionic Voice prosthesis and evaluates its onset and offset control against the PAL device through pre-clinical trials on one laryngectomy patient. The evaluation on a database of more than five hours of continuous/isolated speech recordings shows a close match between the onset/offset control of the Pneumatic Bionic Voice and the PAL with an accuracy of 98.45 ±0.54%. When implemented in real-time, the Pneumatic Bionic Voice prosthesis controller has an average onset/offset delay of 10 milliseconds compared to the PAL. Hence it addresses a major disadvantage of previous electronic voice prostheses, including myoelectric Bionic Voice, in meeting the short time-frames of controlling the onset/offset of the voice in continuous speech. PMID:29466455
Martinelli, Eugenio; Mencattini, Arianna; Di Natale, Corrado
2016-01-01
Humans can communicate their emotions by modulating facial expressions or the tone of their voice. Although numerous applications exist that enable machines to read facial emotions and recognize the content of verbal messages, methods for speech emotion recognition are still in their infancy. Yet, fast and reliable applications for emotion recognition are the obvious advancement of present ‘intelligent personal assistants’, and may have countless applications in diagnostics, rehabilitation and research. Taking inspiration from the dynamics of human group decision-making, we devised a novel speech emotion recognition system that applies, for the first time, a semi-supervised prediction model based on consensus. Three tests were carried out to compare this algorithm with traditional approaches. Labeling performances relative to a public database of spontaneous speeches are reported. The novel system appears to be fast, robust and less computationally demanding than traditional methods, allowing for easier implementation in portable voice-analyzers (as used in rehabilitation, research, industry, etc.) and for applications in the research domain (such as real-time pairing of stimuli to participants’ emotional state, selective/differential data collection based on emotional content, etc.). PMID:27563724
NASA Astrophysics Data System (ADS)
Mohammed Anzar, Sharafudeen Thaha; Sathidevi, Puthumangalathu Savithri
2014-12-01
In this paper, we have considered the utility of multi-normalization and ancillary measures for the optimal score level fusion of fingerprint and voice biometrics. An efficient matching score preprocessing technique based on multi-normalization is employed for improving the performance of the multimodal system under various noise conditions. Ancillary measures derived from the feature space and the score space are used in addition to the matching score vectors for weighing the modalities based on their relative degradation. Reliability (dispersion) and separability (inter-/intra-class distance and d-prime statistics) measures under various noise conditions are estimated from the individual modalities during the training/validation stage. The "best integration weights" are then computed by algebraically combining these measures using the weighted sum rule. The computed integration weights are then optimized against the recognition accuracy using techniques such as grid search, genetic algorithm and particle swarm optimization. The experimental results show that the proposed biometric solution leads to considerable improvement in the recognition performance even under low signal-to-noise ratio (SNR) conditions and reduces the false acceptance rate (FAR) and false rejection rate (FRR), making the system useful for security as well as forensic applications.
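As a rough illustration of score-level fusion with a weighted sum rule, the sketch below normalizes matching scores, computes a d-prime separability measure per modality, and turns the two d-prime values into integration weights. The weighting formula (weights proportional to d-prime), the function names, and the sample scores are all assumptions made for illustration; the abstract does not give the exact algebraic combination the authors used.

```python
from statistics import mean, pstdev


def min_max_normalize(scores):
    """Rescale raw matching scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def d_prime(genuine, impostor):
    """Separability: distance between class means in pooled standard deviations."""
    pooled = ((pstdev(genuine) ** 2 + pstdev(impostor) ** 2) / 2) ** 0.5
    if pooled == 0:
        raise ValueError("scores have zero variance; d-prime is undefined")
    return abs(mean(genuine) - mean(impostor)) / pooled


def separability_weights(d_fingerprint, d_voice):
    """Hypothetical rule: weight each modality in proportion to its d-prime."""
    total = d_fingerprint + d_voice
    return d_fingerprint / total, d_voice / total


def fuse(fingerprint_score, voice_score, w_fingerprint, w_voice):
    """Weighted sum rule on two normalized scores."""
    return w_fingerprint * fingerprint_score + w_voice * voice_score


# Made-up validation scores: the voice channel is assumed to be the noisier one.
fp_genuine, fp_impostor = [0.90, 0.80, 0.85], [0.20, 0.30, 0.25]
voice_genuine, voice_impostor = [0.60, 0.70, 0.50], [0.40, 0.50, 0.30]

w_fp, w_voice = separability_weights(
    d_prime(fp_genuine, fp_impostor),
    d_prime(voice_genuine, voice_impostor),
)
fused = fuse(0.9, 0.4, w_fp, w_voice)
```

With these sample scores the fingerprint channel separates genuine from impostor scores far better, so it receives the larger weight. In the study the weights are additionally tuned against recognition accuracy (grid search, genetic algorithm, particle swarm optimization), which this sketch omits.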
Detecting Parkinson's disease from sustained phonation and speech signals.
Vaiciukynas, Evaldas; Verikas, Antanas; Gelzinis, Adas; Bacauskiene, Marija
2017-01-01
This study investigates signals from sustained phonation and text-dependent speech modalities for Parkinson's disease screening. Phonation corresponds to the vowel /a/ voicing task and speech to the pronunciation of a short sentence in Lithuanian language. Signals were recorded through two channels simultaneously, namely, acoustic cardioid (AC) and smart phone (SP) microphones. Additional modalities were obtained by splitting speech recording into voiced and unvoiced parts. Information in each modality is summarized by 18 well-known audio feature sets. Random forest (RF) is used as a machine learning algorithm, both for individual feature sets and for decision-level fusion. Detection performance is measured by the out-of-bag equal error rate (EER) and the cost of log-likelihood-ratio. Essentia audio feature set was the best using the AC speech modality and YAAFE audio feature set was the best using the SP unvoiced modality, achieving EER of 20.30% and 25.57%, respectively. Fusion of all feature sets and modalities resulted in EER of 19.27% for the AC and 23.00% for the SP channel. Non-linear projection of a RF-based proximity matrix into the 2D space enriched medical decision support by visualization.
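The equal error rate reported above is the operating point at which the false positive rate and the false negative rate coincide. A minimal sketch of how it can be estimated from classifier scores follows; the function name and the sample scores are assumptions for illustration, and the study itself computes the EER from out-of-bag random forest outputs rather than from a threshold sweep like this one.

```python
def equal_error_rate(positive_scores, negative_scores):
    """Sweep thresholds and return (eer, threshold) where the two error rates are closest.

    A case is predicted positive when its score is >= threshold.
    """
    thresholds = sorted(set(positive_scores) | set(negative_scores))
    best = None
    for t in thresholds:
        # False positives: negative cases scored at or above the threshold.
        fpr = sum(s >= t for s in negative_scores) / len(negative_scores)
        # False negatives: positive cases scored below the threshold.
        fnr = sum(s < t for s in positive_scores) / len(positive_scores)
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, (fpr + fnr) / 2, t)
    return best[1], best[2]


# Made-up scores: higher means "more likely Parkinson's disease".
patients = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(patients, controls)
```

On this toy data one patient and one control fall on the wrong side of the best threshold, so both error rates equal 25%.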
Multimodal Speech Capture System for Speech Rehabilitation and Learning.
Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam
2017-11-01
Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.
Fundamental frequency estimation of singing voice
NASA Astrophysics Data System (ADS)
de Cheveigné, Alain; Henrich, Nathalie
2002-05-01
A method of fundamental frequency (F0) estimation recently developed for speech [de Cheveigné and Kawahara, J. Acoust. Soc. Am. (to be published)] was applied to singing voice. An electroglottograph signal recorded together with the microphone provided a reference by which estimates could be validated. Using standard parameter settings as for speech, error rates were low despite the wide range of F0s (about 100 to 1600 Hz). Most "errors" were due to irregular vibration of the vocal folds, a sharp formant resonance that reduced the waveform to a single harmonic, or fast F0 changes such as in high-amplitude vibrato. Our database (18 singers from baritone to soprano) included examples of diphonic singing for which melody is carried by variations of the frequency of a narrow formant rather than F0. By varying a parameter (ratio of inharmonic to total power), the algorithm could be tuned to follow either frequency. Although the method has not been formally tested on a wide range of instruments, it seems appropriate for musical applications because it is accurate, accepts a wide range of F0s, and can be implemented with low latency for interactive applications. [Work supported by the Cognitique programme of the French Ministry of Research and Technology.]
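To make the idea of F0 estimation over a 100 to 1600 Hz search range concrete, here is a deliberately simple autocorrelation peak-picker. It is not the method evaluated in the abstract, whose details are not given here; the function name, the 8 kHz sample rate, and the synthetic 220 Hz test tone are assumptions chosen only to keep the example self-contained.

```python
import math


def estimate_f0(samples, sample_rate, f_min=100.0, f_max=1600.0):
    """Return the frequency whose lag maximizes the raw autocorrelation."""
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


sample_rate = 8000
tone = [math.sin(2 * math.pi * 220.0 * n / sample_rate) for n in range(1024)]
f0 = estimate_f0(tone, sample_rate)
```

Because the lag is restricted to whole samples, the estimate is quantized (a few hertz around 220 Hz at this sample rate). Practical estimators refine the peak by interpolation and add safeguards against octave errors, which this sketch leaves out.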
Hacki, T
1996-01-01
The Voice Range Profile (VRP) measurement offers a method for the investigation of voice modalities, i.e., speaking voice, shouting voice and singing voice, in their mutual pitch and intensity relations. The parameters F0 and SPL are evaluated by means of automatic pitch and SPL measurements from (1) sustained phonation /a:/ in the speaker's natural pitch and intensity range, (2) the continuous speaking voice beginning with Pianissimo up to Fortissimo, (3) the shouting voice. Vocal intensity is plotted vertically, vocal pitch horizontally. The displays of the vocal intensity versus fundamental frequency are defined as singing voice range profile (VRP), speaking VRP and shouting VRP. The VRPs are superimposed on the same plot. Their form, their shape and their position relative to each other are analysed. The physiological relationships between the VRPs of the different voice modalities are defined. The pathological relationships between the VRPs (i.e. reduction, shifting) give information about etiology and pathomechanism of voice disorders.
Ma, E P; Yiu, E M
2001-06-01
Traditional clinical voice evaluation focuses primarily on the severity of voice impairment, with little emphasis on the impact of voice disorders on the individual's quality of life. This study reports the development of a 28-item assessment tool that evaluates the perception of voice problem, activity limitation, and participation restriction using the International Classification of Impairments, Disabilities and Handicaps-2 Beta-1 concept (World Health Organization, 1997). The questionnaire was administered to 40 subjects with dysphonia and 40 control subjects with normal voices. Results showed that the dysphonic group reported significantly more severe voice problems, limitation in daily voice activities, and restricted participation in these activities than the control group. The study also showed that the perception of a voice problem by the dysphonic subjects correlated positively with the perception of limitation in voice activities and restricted participation. However, the self-perceived voice problem had little correlation with the degree of voice-quality impairment measured acoustically and perceptually by speech pathologists. The data also showed that the aggregate scores of activity limitation and participation restriction were positively correlated, and the extent of activity limitation and participation restriction was similar in all except the job area. These findings highlight the importance of identifying and quantifying the impact of dysphonia on the individual's quality of life in the clinical management of voice disorders.
Benefits for Voice Learning Caused by Concurrent Faces Develop over Time.
Zäske, Romi; Mühl, Constanze; Schweinberger, Stefan R
2015-01-01
Recognition of personally familiar voices benefits from the concurrent presentation of the corresponding speakers' faces. This effect of audiovisual integration is most pronounced for voices combined with dynamic articulating faces. However, it is unclear if learning unfamiliar voices also benefits from audiovisual face-voice integration or, alternatively, is hampered by attentional capture of faces, i.e., "face-overshadowing". In six study-test cycles we compared the recognition of newly-learned voices following unimodal voice learning vs. bimodal face-voice learning with either static (Exp. 1) or dynamic articulating faces (Exp. 2). Voice recognition accuracies significantly increased for bimodal learning across study-test cycles while remaining stable for unimodal learning, as reflected in numerical costs of bimodal relative to unimodal voice learning in the first two study-test cycles and benefits in the last two cycles. This was independent of whether faces were static images (Exp. 1) or dynamic videos (Exp. 2). In both experiments, slower reaction times to voices previously studied with faces compared to voices only may result from visual search for faces during memory retrieval. A general decrease of reaction times across study-test cycles suggests facilitated recognition with more speaker repetitions. Overall, our data suggest two simultaneous and opposing mechanisms during bimodal face-voice learning: while attentional capture of faces may initially impede voice learning, audiovisual integration may facilitate it thereafter.
Leino, Timo
2009-11-01
Voice quality has mainly been studied in trained speakers, singers, and dysphonic patients. Few studies have concerned ordinary untrained university students' voices. In light of earlier studies of professional voice users, it was hypothesized that good, poor, and intermediate voices would be distinguishable on the basis of long-term average spectrum characteristics. In the present study, voice quality of 50 Finnish vocally untrained male university students was studied perceptually and using long-term average spectrum analysis of text reading samples of one minute duration. Equivalent sound level (Leq) of text reading was also measured. According to the results, the good and ordinary voices differed from the poor ones in their relatively higher sound level in the frequency range of 1-3 kHz and a prominent peak at 3-4 kHz. Good voices, however, did not differ from the ordinary voices in terms of the characteristics of the long-term average spectrum (LTAS). The strength of the peak at 3-4 kHz and the voice-quality scores correlated weakly but significantly. Voice quality and alpha ratio (level difference above and below 1 kHz) correlated likewise. Leq was significantly higher in the students with good and ordinary voices than in those with poor voices. The connections between Leq, voice quality, and the formation of the peak at 3-4 kHz warrant further studies.
Interventions for preventing voice disorders in adults.
Ruotsalainen, J H; Sellman, J; Lehto, L; Jauhiainen, M; Verbeek, J H
2007-10-17
Poor voice quality due to a voice disorder can lead to a reduced quality of life. In occupations where voice use is substantial it can lead to periods of absence from work. The objective was to evaluate the effectiveness of interventions to prevent voice disorders in adults. We searched MEDLINE (PubMed, 1950 to 2006), EMBASE (1974 to 2006), CENTRAL (The Cochrane Library, Issue 2 2006), CINAHL (1983 to 2006), PsycINFO (1967 to 2006), Science Citation Index (1986 to 2006) and the Occupational Health databases OSH-ROM (to 2006). The date of the last search was 05/04/06. Eligible studies were randomised controlled clinical trials (RCTs) of interventions evaluating the effectiveness of treatments to prevent voice disorders in adults. For work-directed interventions, interrupted time series and prospective cohort studies were also eligible. Two authors independently extracted data and assessed trial quality. Meta-analysis was performed where appropriate. We identified two randomised controlled trials including a total of 53 participants in intervention groups and 43 controls. One study was conducted with teachers and the other with student teachers. Both trials were of poor quality. Interventions were grouped into 1) direct voice training, 2) indirect voice training and 3) direct and indirect voice training combined. 1) Direct voice training: One study did not find a significant decrease of the Voice Handicap Index for direct voice training compared to no intervention. 2) Indirect voice training: One study did not find a significant decrease of the Voice Handicap Index for indirect voice training when compared to no intervention. 3) Direct and indirect voice training combined: One study did not find a decrease of the Voice Handicap Index for direct and indirect voice training combined when compared to no intervention.
The same study did however find an improvement in maximum phonation time (Mean Difference -3.18 sec; 95 % CI -4.43 to -1.93) for direct and indirect voice training combined when compared to no intervention. No work-directed studies were found. None of the studies found evaluated the effectiveness of prevention in terms of sick leave or number of diagnosed voice disorders. We found no evidence that either direct or indirect voice training or the two combined are effective in improving self-reported vocal functioning when compared to no intervention. The current practice of giving training to at-risk populations for preventing the development of voice disorders is therefore not supported by definitive evidence of effectiveness. Larger and methodologically better trials are needed with outcome measures that better reflect the aims of interventions.
Rousseau, Bernard; Gutmann, Michelle L.; Mau, I-fan Theodore; Francis, David O.; Johnson, Jeffrey P.; Novaleski, Carolyn K.; Vinson, Kimberly N.; Garrett, C. Gaelyn
2015-01-01
Objective: This randomized trial investigated voice rest and supplemental text-to-speech communication versus voice rest alone on visual analog scale measures of communication effectiveness and magnitude of voice use. Study Design: Randomized clinical trial. Setting: Multicenter outpatient voice clinics. Subjects: Thirty-seven patients undergoing phonomicrosurgery. Methods: Patients undergoing phonomicrosurgery were randomized to voice rest and supplemental text-to-speech communication or voice rest alone. The primary outcome measure was the impact of voice rest on ability to communicate effectively over a seven-day period. Pre- and post-operative magnitude of voice use was also measured as an observational outcome. Results: Patients randomized to voice rest and supplemental text-to-speech communication reported higher median communication effectiveness on each post-operative day compared to those randomized to voice rest alone, with significantly higher median communication effectiveness on post-operative day 3 (p = 0.03) and 5 (p = 0.01). Magnitude of voice use did not differ on any pre-operative (p > 0.05) or post-operative day (p > 0.05), nor did patients significantly decrease voice use as the surgery date approached (p > 0.05). However, there was a significant reduction in median voice use pre- to post-operatively across patients (p < 0.001) with median voice use ranging from 0–3 throughout the post-operative week. Conclusion: Supplemental text-to-speech communication increased patient perceived communication effectiveness on post-operative days 3 and 5 over voice rest alone. With the prevalence of smartphones and the widespread use of text messaging, supplemental text-to-speech communication may provide an accessible and cost-effective communication option for patients on vocal restrictions. PMID:25605690
Cannito, Michael P; Chorna, Lesya B; Kahane, Joel C; Dworkin, James P
2014-05-01
This study evaluated the hypothesis that sentence production by speakers with adductor (AD) and abductor (AB) spasmodic dysphonia (SD) may be differentially influenced by consonant voicing and manner features, in comparison with healthy, matched, nondysphonic controls. This was a prospective, single blind study, using a between-groups, repeated measures design for the independent variables of perceived voice quality and sentence duration. Sixteen subjects with ADSD and 10 subjects with ABSD, as well as 26 matched healthy controls produced four short, simple sentences that were systematically loaded with voiced or voiceless consonants of either obstruent or continuant manner categories. Experienced voice clinicians, who were "blind" as to speakers' group affiliations, used visual analog scaling to judge the overall voice quality of each sentence. Acoustic sentence durations were also measured. Speakers with ABSD or ADSD demonstrated significantly poorer than normal voice quality on all sentences. Speakers with ABSD exhibited longer than normal duration for voiceless consonant sentences. Speakers with ADSD had poorer voice quality for voiced than for voiceless consonant sentences. Speakers with ABSD had longer durations for voiceless than for voiced consonant sentences. The two subtypes of SD exhibit differential performance on the basis of consonant voicing in short, simple sentences; however, each subgroup manifested voicing-related differences on a different variable (voice quality vs sentence duration). Findings suggest different underlying pathophysiological mechanisms for ABSD and ADSD. Findings also support inclusion of short, simple sentences containing voiced or voiceless consonants as part of the diagnostic protocol for SD, with measurement of sentence duration in addition to judgments of voice quality severity. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
The effectiveness of a voice treatment approach for teachers with self-reported voice problems.
Gillivan-Murphy, Patricia; Drinnan, Michael J; O'Dwyer, Tadhg P; Ridha, Hayder; Carding, Paul
2006-09-01
Teachers are considered the professional group most at risk of developing voice problems, but limited treatment effectiveness evidence exists. We studied prospectively the effectiveness of a 6-week combined treatment approach using vocal function exercises (VFEs) and vocal hygiene (VH) education with 20 teachers with self-reported voice problems. Twenty subjects were randomly assigned to a no-treatment control (n = 11) and a treatment group (n = 9). Fibreoptic endoscopic evaluation was carried out on all subjects before randomization. Two self-report voice outcome measures were used: the Voice-Related Quality of Life (V-RQOL) and the Voice Symptom Severity Scale (VoiSS). A Voice Care Knowledge Visual Analogue Scale (VAS), developed specifically for the study, was also used to evaluate change in selected voice knowledge areas. A Student unpaired t test revealed a statistically significant (P < 0.05) improvement in the treatment group as measured by the VoiSS. There was not a significant improvement in the treatment group as measured by the V-RQOL. The difference in voice care knowledge areas was also significant for the treatment group (P < 0.05). This study suggests that a voice treatment approach of VFEs and VH education improved self-reported voice symptoms and voice care knowledge in a group of teachers.
Bauer, Jay J; Mittal, Jay; Larson, Charles R; Hain, Timothy C
2006-04-01
The present study tested whether subjects respond to unanticipated short perturbations in voice loudness feedback with compensatory responses in voice amplitude. The roles of stimulus magnitude (±1, 3, or 6 dB SPL), stimulus direction (up vs down), and the ongoing voice amplitude level (normal vs soft) were compared across compensations. Subjects responded to perturbations in voice loudness feedback with a compensatory change in voice amplitude 76% of the time. Mean latency of amplitude compensation was 157 ms. Mean response magnitudes were smallest for 1-dB stimulus perturbations (0.75 dB) and greatest for 6-dB conditions (0.98 dB). However, expressed as gain, responses for 1-dB perturbations were largest and almost approached 1.0. Response magnitudes were larger for the soft voice amplitude condition compared to the normal voice amplitude condition. A mathematical model of the audio-vocal system captured the main features of the compensations. Previous research has demonstrated that subjects can respond to an unanticipated perturbation in voice pitch feedback with an automatic compensatory response in voice fundamental frequency. Data from the present study suggest that voice loudness feedback can be used in a similar manner to monitor and stabilize voice amplitude around a desired loudness level.
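The contrast between response magnitude and gain can be checked with a line of arithmetic, assuming gain is defined as response magnitude divided by stimulus magnitude (the abstract does not spell out the definition, so this is an assumption, as is the function name). Dividing the reported mean magnitudes gives only an approximation of the study's gains, which may have been averaged trial by trial.

```python
def compensation_gain(response_db, stimulus_db):
    """Gain as the ratio of the compensatory response to the perturbation, both in dB."""
    return response_db / stimulus_db


gain_1db = compensation_gain(0.75, 1.0)  # mean response to 1-dB perturbations
gain_6db = compensation_gain(0.98, 6.0)  # mean response to 6-dB perturbations
```

The larger perturbation draws a slightly larger absolute response but a much smaller proportional one: the ratio is 0.75 for the 1-dB condition and roughly 0.16 for the 6-dB condition.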
Relationship between Activity Noise, Voice Parameters, and Voice Symptoms among Female Teachers.
Pirilä, Sirpa; Pirilä, Paula; Ansamaa, Terhi; Yliherva, Anneli; Sonning, Samuel; Rantala, Leena
2017-01-01
Our interest was in how teachers' voices behave during the delivery of lessons in core subjects (e.g., mathematics, science, etc.). We sought to evaluate the relationship between voice sound pressure level (SPL), vocal fundamental frequency (F0), voice symptoms, activity noise, and differences therein during the first and the last lessons in core subjects of the day. The participants were 24 female elementary school teachers. Voice symptoms were evaluated by questionnaire. The data were recorded on 2 portable voice accumulators (VoxLog) from the first and last lessons of the day. The versions of accumulators differed by frequency weighting; therefore, the analysis and the results of noise and voice SPL were treated separately: unweighted (group 1) and A-weighted (group 2). Difference in voice SPL followed difference in activity noise. F0 increased between the first and last lessons. Correlations were found between differences in the noise and the voice symptoms of tiredness and dryness. Irritating mucus was associated with high F0 during the first lesson. An apparent increase in voice loading due to the activity noise was observed during lessons in core subjects. Collaboration between specialists in voice and acoustics and teachers and pupils is needed to reduce this voice loading. © 2017 S. Karger AG, Basel.
The effect of singing training on voice quality for people with quadriplegia.
Tamplin, Jeanette; Baker, Felicity A; Buttifant, Mary; Berlowitz, David J
2014-01-01
Despite anecdotal reports of voice impairment in quadriplegia, the exact nature of these impairments is not well described in the literature. This article details objective and subjective voice assessments for people with quadriplegia at baseline and after a respiratory-targeted singing intervention. Randomized controlled trial. Twenty-four participants with quadriplegia were randomly assigned to a 12-week program of either a singing intervention or active music therapy control. Recordings of singing and speech were made at baseline, 6 weeks, 12 weeks, and 6 months postintervention. These deidentified recordings were used to measure sound pressure levels and assess voice quality using the Multidimensional Voice Profile and the Perceptual Voice Profile. Baseline voice quality data indicated deviation from normality in the areas of breathiness, strain, and roughness. A greater percentage of intervention participants moved toward more normal voice quality in terms of jitter, shimmer, and noise-to-harmonic ratio; however, the improvements failed to achieve statistical significance. Subjective and objective assessments of voice quality indicate that quadriplegia may have a detrimental effect on voice quality; in particular, causing a perception of roughness and breathiness in the voice. The results of this study suggest that singing training may have a role in ameliorating these voice impairments. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
The phonatory deviation diagram: a novel objective measurement of vocal function.
Madazio, Glaucya; Leão, Sylvia; Behlau, Mara
2011-01-01
To identify the discriminative characteristics of the phonatory deviation diagram (PDD) in rough, breathy and tense voices. One hundred and ninety-six samples of normal and dysphonic voices from adults were submitted to perceptual auditory evaluation, focusing on the predominant vocal quality and the degree of deviation. Acoustic analysis was performed with the VoxMetria (CTS Informatica). Significant differences were observed between the dysphonic and normal groups (p < 0.001), and also between the breathy and rough samples (p = 0.044) and the breathy and tense samples (p < 0.001). All normal voices were positioned in the inferior left quadrant, 45% of the rough voices in the inferior right quadrant, 52.6% of the breathy voices in the superior right quadrant and 54.3% of the tense voices in the inferior left quadrant of the PDD. In the inferior left quadrant, 93.8% of voices with no deviation were located and 72.7% of voices with mild deviation; voices with moderate deviation were distributed in the inferior and superior right quadrants, the latter ones containing the most deviant voices and 80% of voices with severe deviation. The PDD was able to discriminate normal from dysphonic voices, and the distribution was related to the type and degree of voice alteration. Copyright © 2011 S. Karger AG, Basel.
Voice and choice in health care in England: understanding citizen responses to dissatisfaction.
Dowding, Keith; John, Peter
2011-01-01
Using data from a five-year online survey, the paper examines the effects of relative satisfaction with health services on individuals' voice-and-choice activity in the English public health care system. Voice is considered in three parts – individual voice (complaints), collective voice (voting) and participation (collective action). Exercising choice is seen in terms of complete exit (not using health care), internal exit (choosing another public service provider) and private exit (using private health care). The interaction of satisfaction and forms of voice and choice is analysed over time. Both voice and choice are correlated with dissatisfaction, with those who are unhappy with the NHS more likely to privately voice and to plan to take up private health care. Those unable to choose private provision are likely to use private voice. These factors are not affected by items associated with social capital – indeed, being more trusting leads to lower voice activity.
Occurrence Frequencies of Acoustic Patterns of Vocal Fry in American English Speakers.
Abdelli-Beruh, Nassima B; Drugman, Thomas; Red Owl, R H
2016-11-01
The goal of this study was to analyze the occurrence frequencies of three individual acoustic patterns (A, B, C) and of vocal fry overall (A + B + C) as a function of gender, word position in the sentence (Not Last Word vs. Last Word), and sentence length (number of words in a sentence). This is an experimental design. Twenty-five male and 29 female American English (AE) speakers read the Grandfather Passage. The recordings were processed by a Matlab toolbox designed for the analysis and detection of creaky segments, automatically identified using the Kane-Drugman algorithm. The experiment produced subsamples of outcomes, three that reflect a single, discrete acoustic pattern (A, B, or C) and the fourth that reflects the occurrence frequency counts of Vocal Fry Overall without regard to any specific pattern. Zero-truncated Poisson regression analyses were conducted with Gender and Word Position as predictors and Sentence Length as a covariate. The results of the present study showed that the occurrence frequencies of the three acoustic patterns and vocal fry overall (A + B + C) are greatest at the end of sentences but are unaffected by sentence length. The findings also reveal that AE female speakers exhibit Pattern C significantly more frequently than Pattern B, and the converse holds for AE male speakers. Future studies are needed to confirm such outcomes, assess the perceptual salience of these acoustic patterns, and determine the physiological correlates of these acoustic patterns. The findings have implications for the design of new excitation models of vocal fry. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
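A zero-truncated Poisson model suits counts that can never be zero, which is the kind of outcome the regression above is fitted to. The sketch below gives its probability mass function and mean for a single rate parameter; the function names and the rate value of 2.0 are assumptions for illustration, and the regression step (linking the rate to Gender, Word Position, and Sentence Length) is not shown.

```python
import math


def zero_truncated_poisson_pmf(k, lam):
    """P(K = k | K >= 1) for a Poisson variable with rate lam."""
    if k < 1:
        return 0.0
    poisson = (lam ** k) * math.exp(-lam) / math.factorial(k)
    return poisson / (1.0 - math.exp(-lam))  # renormalize after removing k = 0


def zero_truncated_poisson_mean(lam):
    """Expected count once zeros are excluded."""
    return lam / (1.0 - math.exp(-lam))


lam = 2.0  # hypothetical rate, not a value from the study
total = sum(zero_truncated_poisson_pmf(k, lam) for k in range(1, 51))
numeric_mean = sum(k * zero_truncated_poisson_pmf(k, lam) for k in range(1, 51))
```

Summing the probabilities over k = 1 to 50 recovers 1 to within floating-point error, and the mean exceeds the untruncated rate because the zero outcomes have been removed.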
Arteaga-Sierra, F R; Milián, C; Torres-Gómez, I; Torres-Cisneros, M; Moltó, G; Ferrando, A
2014-09-22
We present a numerical strategy to design fiber based dual pulse light sources exhibiting two predefined spectral peaks in the anomalous group velocity dispersion regime. The frequency conversion is based on the soliton fission and soliton self-frequency shift occurring during supercontinuum generation. The optimization process is carried out by a genetic algorithm that provides the optimum input pulse parameters: wavelength, temporal width and peak power. This algorithm is implemented in a Grid platform in order to take advantage of distributed computing. These results are useful for optical coherence tomography applications where bell-shaped pulses located in the second near-infrared window are needed.
Voice- and swallow-related quality of life in idiopathic Parkinson's disease.
van Hooren, Michel R A; Baijens, Laura W J; Vos, Rein; Pilz, Walmari; Kuijpers, Laura M F; Kremer, Bernd; Michou, Emilia
2016-02-01
This study explores whether changes in voice- and swallow-related QoL are associated with progression of idiopathic Parkinson's disease (IPD). Furthermore, it examines the relationship between patients' perception of both voice and swallowing disorders in IPD. Prospective clinical study, quality of life (QoL). One-hundred mentally competent IPD patients with voice and swallowing complaints were asked to answer four QoL questionnaires (Voice Handicap Index, MD Anderson Dysphagia Inventory, Visual Analog Scale [VAS] voice, and Dysphagia Severity Scale [DSS]). Differences in means for the QoL questionnaires and their subscales within Hoehn and Yahr stage groups were calculated using one-way analysis of variance. The relationship between voice- and swallow-related QoL questionnaires was determined with the Spearman correlation coefficient. Scores on both voice and swallow questionnaires suggest an overall decrease in QoL with progression of IPD. A plateau in QoL for VAS voice and the DSS was seen in the early Hoehn and Yahr stages. Finally, scores on voice-related QoL questionnaires were significantly correlated with swallow-related QoL outcomes. Voice- and swallow-related QoL decreases with progression of IPD. A significant association was found between voice- and swallow-related QoL questionnaires. Healthcare professionals can benefit from voice- and swallow-related QoL questionnaires in a multidimensional voice- or swallow-assessment protocol. The patient's perception of his/her voice and swallowing disorders and its impact on QoL in IPD should not be disregarded. Level of evidence: 2b. © 2015 The American Laryngological, Rhinological and Otological Society, Inc.
Abrams, Daniel A.; Chen, Tianwen; Odriozola, Paola; Cheng, Katherine M.; Baker, Amanda E.; Padmanabhan, Aarthi; Ryali, Srikanth; Kochalka, John; Feinstein, Carl; Menon, Vinod
2016-01-01
The human voice is a critical social cue, and listeners are extremely sensitive to the voices in their environment. One of the most salient voices in a child’s life is mother's voice: Infants discriminate their mother’s voice from the first days of life, and this stimulus is associated with guiding emotional and social function during development. Little is known regarding the functional circuits that are selectively engaged in children by biologically salient voices such as mother’s voice or whether this brain activity is related to children’s social communication abilities. We used functional MRI to measure brain activity in 24 healthy children (mean age, 10.2 y) while they attended to brief (<1 s) nonsense words produced by their biological mother and two female control voices and explored relationships between speech-evoked neural activity and social function. Compared to female control voices, mother’s voice elicited greater activity in primary auditory regions in the midbrain and cortex; voice-selective superior temporal sulcus (STS); the amygdala, which is crucial for processing of affect; nucleus accumbens and orbitofrontal cortex of the reward circuit; anterior insula and cingulate of the salience network; and a subregion of fusiform gyrus associated with face perception. The strength of brain connectivity between voice-selective STS and reward, affective, salience, memory, and face-processing regions during mother’s voice perception predicted social communication skills. Our findings provide a novel neurobiological template for investigation of typical social development as well as clinical disorders, such as autism, in which perception of biologically and socially salient voices may be impaired. PMID:27185915
Abrams, Daniel A; Chen, Tianwen; Odriozola, Paola; Cheng, Katherine M; Baker, Amanda E; Padmanabhan, Aarthi; Ryali, Srikanth; Kochalka, John; Feinstein, Carl; Menon, Vinod
2016-05-31
The human voice is a critical social cue, and listeners are extremely sensitive to the voices in their environment. One of the most salient voices in a child's life is mother's voice: Infants discriminate their mother's voice from the first days of life, and this stimulus is associated with guiding emotional and social function during development. Little is known regarding the functional circuits that are selectively engaged in children by biologically salient voices such as mother's voice or whether this brain activity is related to children's social communication abilities. We used functional MRI to measure brain activity in 24 healthy children (mean age, 10.2 y) while they attended to brief (<1 s) nonsense words produced by their biological mother and two female control voices and explored relationships between speech-evoked neural activity and social function. Compared to female control voices, mother's voice elicited greater activity in primary auditory regions in the midbrain and cortex; voice-selective superior temporal sulcus (STS); the amygdala, which is crucial for processing of affect; nucleus accumbens and orbitofrontal cortex of the reward circuit; anterior insula and cingulate of the salience network; and a subregion of fusiform gyrus associated with face perception. The strength of brain connectivity between voice-selective STS and reward, affective, salience, memory, and face-processing regions during mother's voice perception predicted social communication skills. Our findings provide a novel neurobiological template for investigation of typical social development as well as clinical disorders, such as autism, in which perception of biologically and socially salient voices may be impaired.
Perceptions of Voice Teachers Regarding Students' Vocal Behaviors During Singing and Speaking.
Beeman, Shellie A
2017-01-01
This study examined voice teachers' perceptions of their instruction of healthy singing and speaking voice techniques. An online, researcher-generated questionnaire based on the McClosky technique was administered to college/university voice teachers listed as members in the 2012-2013 College Music Society directory. A majority of participants believed there to be a relationship between the health of the singing voice and the health of the speaking voice. Participants' perception scores were the most positive for variable MBSi, the monitoring of students' vocal behaviors during singing. Perception scores for variable TVB, the teaching of healthy vocal behaviors, and variable MBSp, the monitoring of students' vocal behaviors while speaking, ranked second and third, respectively. Perception scores for variable TVB were primarily associated with participants' familiarity with voice rehabilitation techniques, gender, and familiarity with the McClosky technique. Perception scores for variable MBSi were primarily associated with participants' familiarity with voice rehabilitation techniques, gender, type of student taught, and instruction of a student with a voice disorder. Perception scores for variable MBSp were correlated with the greatest number of characteristics, including participants' familiarity with voice rehabilitation techniques, familiarity with the McClosky technique, type of student taught, years of teaching experience, and instruction of a student with a voice disorder. Voice teachers are purportedly working with injured voices and attempting to include vocal health in their instruction. Although a voice teacher is not obligated to pursue further rehabilitative training, the current study revealed a positive relationship between familiarity with specific rehabilitation techniques and vocal health. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Performer's attitudes toward seeking health care for voice issues: understanding the barriers.
Gilman, Marina; Merati, Albert L; Klein, Adam M; Hapner, Edie R; Johns, Michael M
2009-03-01
Contemporary commercial music (CCM) performers rely heavily on their voice, yet may not be aware of the importance of proactive voice care. This investigation intends to identify perceptions and barriers to seeking voice care among CCM artists. This cross-sectional observational study used a 10-item Likert-based response questionnaire to assess current perceptions regarding voice care in a population of randomly selected participants at a professional CCM conference. Subjects (n=78) were queried regarding their likelihood to seek medical care for minor medical problems and specifically problems with their voice. Additional questions investigated anxiety about seeking voice care from a physician specialist, speech language pathologist, or voice coach; apprehension regarding findings of laryngeal examination and laryngeal imaging procedures; and the effect of medical insurance on the likelihood of seeking medical care. Eighty-two percent of subjects reported that their voice was a critical part of their profession; 41% stated that they were not likely to seek medical care for problems with their voice; and only 19% were reluctant to seek care for general medical problems (P<0.001). Anxiety about seeking a clinician regarding their voice was not a deterrent. Most importantly, 39% of subjects do not seek medical attention for their voice problems due to medical insurance coverage. The CCM artists are less likely to seek medical care for voice problems compared with general medical problems. Availability of medical insurance may be a factor. Availability of affordable voice care and education about the importance of voice care is needed in this population of vocal performers.
Voice similarity in identical twins.
Van Gysel, W D; Vercammen, J; Debruyne, F
2001-01-01
If people are asked to visually discriminate between the two individuals of a monozygotic twin (MT) pair, they usually have difficulty. Does this problem also exist when listening to twin voices? Twenty female and 10 male MT voices were randomly assembled with one "strange" voice to get voice trios. The listeners (10 female students in Speech and Language Pathology) were asked to label the twins (voices 1-2, 1-3 or 2-3) in two conditions: two standard sentences read aloud and a 2.5-second midsection of a sustained /a/. The proportion of correctly labelled twins was 82% and 63% for female voices and 74% and 52% for male voices, for the sentences and the sustained /a/ respectively, both being significantly greater than chance (33%). The acoustic analysis revealed a high intra-twin correlation for the speaking fundamental frequency (SFF) of the sentences and the fundamental frequency (F0) of the sustained /a/. So the voice pitch could have been a useful characteristic in the perceptual identification of the twins. We conclude that there is a greater perceptual resemblance between the voices of identical twins than between voices without genetic relationship. The identification, however, is not perfect. The voice pitch possibly contributes to the correct twin identifications.
Niebudek-Bogusz, Ewa; Sliwińska-Kowalska, Mariola
2006-01-01
An assessment of the vocal system, as a part of the medical certification of occupational diseases, should be objective and reliable. Therefore, interest in the method of acoustic voice analysis enabling objective assessment of voice parameters is still growing. The aim of the present study was to evaluate the applicability of acoustic analysis with vocal loading test to the diagnostics of occupational voice disorders. The results of acoustic voice analysis were compared using IRIS software for phoniatrics, before and after a 30-min vocal loading test in 35 female teachers with diagnosed occupational voice disorders (group I) and in 31 female teachers with functional dysphonia (group II). In group I, vocal effort produced significant abnormalities in voice acoustic parameters, compared to group II. These included significantly increased mean fundamental frequency (Fo) value (by 11 Hz) and worsened jitter, shimmer and NHR parameters. Also, the percentage of subjects showing abnormalities in voice acoustic analysis was higher in this group. Conducting voice acoustic analysis before and after the vocal loading test makes it possible to objectively confirm irreversible voice impairments in persons with work-related pathologies of the larynx, which is essential for medical certification of occupational voice diseases.
von Lochow, Heike; Lyberg-Åhlander, Viveka; Sahlén, Birgitta; Kastberg, Tobias; Brännström, K Jonas
2018-04-01
This study explores the effect of voice quality and competing speaker/-s on children's performance in a passage comprehension task. Furthermore, it explores the interaction between passage comprehension and cognitive functioning. Forty-nine children (27 girls and 22 boys) with normal hearing (aged 7-12 years) participated. Passage comprehension was tested in six different listening conditions; a typical voice (non-dysphonic voice) in quiet, a typical voice with one competing speaker, a typical voice with four competing speakers, a dysphonic voice in quiet, a dysphonic voice with one competing speaker, and a dysphonic voice with four competing speakers. The children's working memory capacity and executive functioning were also assessed. The findings indicate no direct effect of voice quality on the children's performance, but a significant effect of background listening condition. Interaction effects were seen between voice quality, background listening condition, and executive functioning. The children's susceptibility to the effect of the dysphonic voice and the background listening conditions are related to the individual's executive functions. The findings have several implications for design of interventions in language learning environments such as classrooms.
Speaker's voice as a memory cue.
Campeanu, Sandra; Craik, Fergus I M; Alain, Claude
2015-02-01
Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggest that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. In terms of neuroimaging research modulations, characterization of an implicit memory effect reflective of voice congruency is currently lacking. Copyright © 2014 Elsevier B.V. All rights reserved.
Matching novel face and voice identity using static and dynamic facial images.
Smith, Harriet M J; Dunn, Andrew K; Baguley, Thom; Stacey, Paula C
2016-04-01
Research investigating whether faces and voices share common source identity information has offered contradictory results. Accurate face-voice matching is consistently above chance when the facial stimuli are dynamic, but not when the facial stimuli are static. We tested whether procedural differences might help to account for the previous inconsistencies. In Experiment 1, participants completed a sequential two-alternative forced choice matching task. They either heard a voice and then saw two faces or saw a face and then heard two voices. Face-voice matching was above chance when the facial stimuli were dynamic and articulating, but not when they were static. In Experiment 2, we tested whether matching was more accurate when faces and voices were presented simultaneously. The participants saw two face-voice combinations, presented one after the other. They had to decide which combination was the same identity. As in Experiment 1, only dynamic face-voice matching was above chance. In Experiment 3, participants heard a voice and then saw two static faces presented simultaneously. With this procedure, static face-voice matching was above chance. The overall results, analyzed using multilevel modeling, showed that voices and dynamic articulating faces, as well as voices and static faces, share concordant source identity information. It seems, therefore, that above-chance static face-voice matching is sensitive to the experimental procedure employed. In addition, the inconsistencies in previous research might depend on the specific stimulus sets used; our multilevel modeling analyses show that some people look and sound more similar than others.
[An across-scales analysis of the voice self-concept questionnaire (FESS)].
Nusseck, Manfred; Richter, Bernhard; Echternach, Matthias; Spahn, Claudia
2018-04-01
The questionnaire for the assessment of the voice self-concept (FESS) contains three sub-scales describing a person's relation with his or her own voice. The scales address the relationship with one's own voice, the awareness of the use of one's own voice, and the perception of the connection between voice and emotional changes. A comprehensive approach across the three scales supporting a simplified interpretation of the results was still missing. The FESS questionnaire was used in a sample of 536 German teachers. With a discrimination analysis, commonalities in the scale characteristics were investigated. For a comparative validation with voice health and psychological and physiological wellbeing, the Voice Handicap Index (VHI), the questionnaire for Work-related Behavior and Experience Patterns (AVEM), and the questionnaire for Health-related Quality of Life (SF-12) were additionally collected. The analysis provided four different groups of voice self-concept: group 1 with healthy values in the voice self-concept and wellbeing scales, group 2 with a low voice self-concept and mean wellbeing values, group 3 with a high awareness of the voice use and mean wellbeing values and group 4 with low values in all scales. The results show that a combined approach across all scales of the questionnaire for the assessment of the voice self-concept enables a more detailed interpretation of the characteristics in the voice self-concept. The groups presented here can be applied in practice to support medical diagnoses. © Georg Thieme Verlag KG Stuttgart · New York.
AdaBoost-based algorithm for network intrusion detection.
Hu, Weiming; Hu, Wei; Maybank, Steve
2008-04-01
Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
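The idea of boosting decision stumps over mixed feature types can be sketched as follows. This is a simplified illustration, not the authors' algorithm: categorical stumps here are one-value-versus-rest tests (an assumption), initial weights are uniform, and the adaptable initial weights and overfitting safeguard mentioned in the abstract are not reproduced. The toy connection records and feature names are invented.

```python
import math


def stump_predict(stump, x):
    """Return +1 (attack) or -1 (normal) for one sample."""
    kind, j, param, sign = stump
    if kind == "cont":
        inside = x[j] > param   # continuous feature: threshold test
    else:
        inside = x[j] in param  # categorical feature: set membership
    return sign if inside else -sign


def candidate_stumps(X, categorical):
    """Enumerate stumps directly on each feature type, with no conversion."""
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        if j in categorical:
            for v in values:
                for sign in (1, -1):
                    yield ("cat", j, frozenset([v]), sign)
        else:
            for lo, hi in zip(values, values[1:]):
                for sign in (1, -1):
                    yield ("cont", j, (lo + hi) / 2, sign)


def train_adaboost(X, y, categorical, rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # uniform initial weights (simplification)
    stumps = list(candidate_stumps(X, categorical))
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, float("inf")
        for s in stumps:
            err = sum(wi for wi, xi, yi in zip(w, X, y) if stump_predict(s, xi) != yi)
            if err < best_err:
                best, best_err = s, err
        if best is None or best_err >= 0.5:
            break  # no stump is better than chance on the current weights
        eps = max(best_err, 1e-10)
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, best))
        if best_err == 0:
            break  # a single stump already separates the data
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(best, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Invented records: (protocol, duration); label +1 = attack, -1 = normal.
X = [("tcp", 1.0), ("tcp", 9.0), ("icmp", 1.0),
     ("icmp", 9.0), ("udp", 1.0), ("udp", 9.0)]
y = [-1, 1, 1, 1, -1, 1]
model = train_adaboost(X, y, categorical={0}, rounds=2)
```

On this toy data the first round selects a threshold on the continuous feature and the second a test on the categorical one, so the weighted vote mixes both feature types without any forced conversion between them. Two rounds are not enough to fit every training record here; more rounds would be needed for that.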
Voice symptoms and voice-related quality of life in college students.
Merrill, Ray M; Tanner, Kristine; Merrill, Joseph G; McCord, Matthew D; Beardsley, Melissa M; Steele, Brittanie A
2013-08-01
The purpose of this study was to examine the prevalence of voice disorders in college students and their effect on the students as shown by quality-of-life indicators. A cross-sectional survey was completed by 545 college students in 2012. The survey included 10 questions from the Voice-Related Quality of Life (V-RQOL), selected voice symptoms, and quality-of-life indicators of functional health and well-being based on the Short Form 36-item Health Survey (SF-36). Twenty-nine percent of the college students (mean age, 22.7 years) reported a history of a voice disorder. Hoarseness was the most prevalent voice symptom, but was not correlated with V-RQOL scores. A wobbly or shaky voice, throat dryness, vocal fatigue, and vocal effort explained a significant amount of variance on the social-emotional and physical domains of the V-RQOL index (p < 0.05). Voice symptoms limited emotional and physical functioning as indicated by SF-36 scores. Voice disorders significantly influence psychosocial and physical functioning in college students. These findings have important implications for voice-care services in this population.
Connections between voice ergonomic risk factors in classrooms and teachers' voice production.
Rantala, Leena M; Hakala, Suvi; Holmqvist, Sofia; Sala, Eeva
2012-01-01
The aim of the study was to investigate if voice ergonomic risk factors in classrooms correlated with acoustic parameters of teachers' voice production. The voice ergonomic risk factors in the fields of working culture, working postures and indoor air quality were assessed in 40 classrooms using the Voice Ergonomic Assessment in Work Environment - Handbook and Checklist. Teachers (32 females, 8 males) from the above-mentioned classrooms recorded text readings before and after a working day. Fundamental frequency, sound pressure level (SPL) and the slope of the spectrum (alpha ratio) were analyzed. The higher the number of the risk factors in the classrooms, the higher SPL the teachers used and the more strained the males' voices (increased alpha ratio) were. The SPL was already higher before the working day in the teachers with higher risk than in those with lower risk. In the working environment with many voice ergonomic risk factors, speakers increase voice loudness and use more strained voice quality (males). A practical implication of the results is that voice ergonomic assessments are needed in schools. Copyright © 2013 S. Karger AG, Basel.
NASA Astrophysics Data System (ADS)
Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.
2018-03-01
A singer does not merely recite the lyrics of a song but also uses particular sound techniques to make it more beautiful. In singing technique, women tend to have more diverse voice registers than men. The human voice has many registers, but those used in singing include chest voice, head voice, falsetto, and vocal fry. A speech recognition system based on female voice registers in singing technique was built using Borland Delphi 7.0. Recognition is performed on recorded voice samples and also in real time. The voice input yields weighted energy values calculated using the Hankel transformation method and Macdonald functions. The results showed that the accuracy of the system depends on the accuracy of the sound technique used in training and testing. The average recognition rate for voice registers was 48.75 percent for recorded samples and 57 percent in real time.
Illumination-invariant hand gesture recognition
NASA Astrophysics Data System (ADS)
Mendoza-Morales, América I.; Miramontes-Jaramillo, Daniel; Kober, Vitaly
2015-09-01
In recent years, human-computer interaction (HCI) has received a lot of interest in industry and science because it provides new ways to interact with modern devices through voice, body, and facial/hand gestures. The application range of the HCI is from easy control of home appliances to entertainment. Hand gesture recognition is a particularly interesting problem because the shape and movement of hands usually are complex and flexible to be able to codify many different signs. In this work we propose a three step algorithm: first, detection of hands in the current frame is carried out; second, hand tracking across the video sequence is performed; finally, robust recognition of gestures across subsequent frames is made. Recognition rate highly depends on non-uniform illumination of the scene and occlusion of hands. In order to overcome these issues we use two Microsoft Kinect devices utilizing combined information from RGB and infrared sensors. The algorithm performance is tested in terms of recognition rate and processing time.
In Search of Voice: Theory and Methods in K-12 Student Voice Research in the US, 1990-2010
ERIC Educational Resources Information Center
Gonzalez, Taucia E.; Hernandez-Saca, David I.; Artiles, Alfredo J.
2017-01-01
Student voice research is a promising field of study that disrupts traditional student roles by reorganizing learning spaces that center youth voices. This review synthesizes student voice research by answering the following questions: (a) To what extent has student voice been studied at the K-12 levels in the US? (b) What are the conceptual…
Acoustic and Perceived Measurements Certifying Tango as Voice Treatment Method.
Tafiadis, Dionysios; Kosma, Evangelia I; Chronopoulos, Spyridon K; Papadopoulos, Aggelos; Toki, Eugenia I; Vassiliki, Siafaka; Ziavra, Nausica
2018-03-01
Voice disorders are affecting everyday life in many levels, and their prevalence has been studied extensively in certain and general populations. Notably, several factors have a cohesive influence on voice disorders and voice characteristics. Several studies report that health and environmental and psychological etiologies can serve as risk factors for voice disorders. Many diagnostic protocols, in the literature, evaluate voice and its parameters leading to direct or indirect treatment intervention. This study was designed to examine the effect of tango on adult acoustic voice parameters. Fifty-two adults (26 male and 26 female) were recruited and divided into four subgroups (male dancers, female dancers, male nondancers, and female nondancers). The participants were asked to answer two questionnaires (Voice Handicap Index and Voice Evaluation Form), and their voices were recorded before and after the tango dance session. Moreover, water consumption was investigated. The study's results indicated that the voices' acoustic characteristics were different between tango dancers and the control group. The beneficial results are far from prominent as they prove that tango dance can serve stand-alone as voice therapy without the need for hydration. Also, more research is imperative to be conducted on a longitudinal basis to obtain a more accurate result on the required time for the proposed therapy. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Acoustic echo cancellation for full-duplex voice transmission on fading channels
NASA Technical Reports Server (NTRS)
Park, Sangil; Messer, Dion D.
1990-01-01
This paper discusses the implementation of an adaptive acoustic echo canceler for a hands-free cellular phone operating on a fading channel. The adaptive lattice structure, which is particularly known for faster convergence relative to the conventional tapped-delay-line (TDL) structure, is used in the initialization stage. After convergence, the lattice coefficients are converted into the coefficients for the TDL structure which can accommodate a larger number of taps in real-time operation due to its computational simplicity. The conversion method of the TDL coefficients from the lattice coefficients is derived and the DSP56001 assembly code for the lattice and TDL structure is included, as well as simulation results and the schematic diagram for the hardware implementation.
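The lattice-to-TDL conversion mentioned in this abstract can be illustrated with the textbook step-up recursion, which maps the reflection coefficients of an all-zero lattice to the taps of the equivalent tapped-delay-line filter. This is a sketch under assumptions: it covers only the prediction-error lattice, it uses one common sign convention, and it does not reproduce the paper's own derivation (an echo canceler would typically also carry joint-process weights that need a further conversion step). The function names are illustrative.

```python
def lattice_to_tdl(reflection):
    """Step-up recursion: k_1..k_M -> direct-form taps a_0..a_M with a_0 = 1.

    Convention assumed: a_m[i] = a_{m-1}[i] + k_m * a_{m-1}[m - i].
    """
    a = [1.0]
    for k in reflection:
        prev = a + [0.0]
        a = [prev[i] + k * prev[len(prev) - 1 - i] for i in range(len(prev))]
    return a


def lattice_filter(reflection, x):
    """Run the all-zero lattice; returns the forward error of the last stage."""
    delayed_b = [0.0] * len(reflection)  # backward errors delayed by one sample
    out = []
    for sample in x:
        f = b = sample
        for m, k in enumerate(reflection):
            b_prev = delayed_b[m]  # backward error of this stage's input, one sample ago
            delayed_b[m] = b       # remember the current one for the next sample
            f, b = f + k * b_prev, k * f + b_prev
        out.append(f)
    return out


def tdl_filter(taps, x):
    """Plain FIR (tapped-delay-line) filter with zero initial state."""
    return [
        sum(t * x[n - i] for i, t in enumerate(taps) if n - i >= 0)
        for n in range(len(x))
    ]


k = [0.5, 0.2]
taps = lattice_to_tdl(k)
signal = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0]
lattice_out = lattice_filter(k, signal)
tdl_out = tdl_filter(taps, signal)
```

Both structures produce the same output for the same input, which is what allows the lattice to be used during the initial convergence phase and the cheaper tapped-delay-line form afterwards.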
Varieties of Voice-Hearing: Psychics and the Psychosis Continuum
Powers, Albert R.; Kelley, Megan S.; Corlett, Philip R.
2017-01-01
Hearing voices that are not present is a prominent symptom of serious mental illness. However, these experiences may be common in the non-help-seeking population, leading some to propose the existence of a continuum of psychosis from health to disease. Thus far, research on this continuum has focused on what is impaired in help-seeking groups. Here we focus on protective factors in non-help-seeking voice-hearers. We introduce a new study population: clairaudient psychics who receive daily auditory messages. We conducted phenomenological interviews with these subjects, as well as with patients diagnosed with a psychotic disorder who hear voices, people with a diagnosis of a psychotic disorder who do not hear voices, and matched control subjects (without voices or a diagnosis). We found the hallucinatory experiences of psychic voice-hearers to be very similar to those of patients who were diagnosed. We employed techniques from forensic psychiatry to conclude that the psychics were not malingering. Critically, we found that this sample of non-help-seeking voice hearers were able to control the onset and offset of their voices, that they were less distressed by their voice-hearing experiences and that, the first time they admitted to voice-hearing, the reception by others was much more likely to be positive. Patients had much more negative voice-hearing experiences, were more likely to receive a negative reaction when sharing their voices with others for the first time, and this was subsequently more disruptive to their social relationships. We predict that this sub-population of healthy voice-hearers may have much to teach us about the neurobiology, cognitive psychology and ultimately the treatment of voices that are distressing. PMID:28053132
Postlingual adult performance in noise with HiRes 120 and ClearVoice Low, Medium, and High.
Holden, Laura K; Brenner, Christine; Reeder, Ruth M; Firszt, Jill B
2013-11-01
The study's objectives were to evaluate speech recognition in multiple listening conditions using several noise types with HiRes 120 and ClearVoice (Low, Medium, High) and to determine which ClearVoice program was most beneficial for everyday use. Fifteen postlingual adults attended four sessions; speech recognition was assessed at sessions 1 and 3 with HiRes 120 and at sessions 2 and 4 with all ClearVoice programs. Test measures included sentences presented in restaurant noise (R-SPACE), in speech-spectrum noise, in four- and eight-talker babble, and connected discourse presented in 12-talker babble. Participants completed a questionnaire comparing ClearVoice programs. Significant group differences in performance between HiRes 120 and ClearVoice were present only in the R-SPACE; performance was better with ClearVoice High than HiRes 120. Among ClearVoice programs, no significant group differences were present for any measure. Individual results revealed most participants performed better in the R-SPACE with ClearVoice than HiRes 120. For other measures, significant individual differences between HiRes 120 and ClearVoice were not prevalent. Individual results among ClearVoice programs differed and overall preferences varied. Questionnaire data indicated increased understanding with High and Medium in certain environments. R-SPACE and questionnaire results indicated an advantage for ClearVoice High and Medium. Individual test and preference data showed mixed results between ClearVoice programs making global recommendations difficult; however, results suggest providing ClearVoice High and Medium and HiRes 120 as processor options for adults willing to change settings. For adults unwilling or unable to change settings, ClearVoice Medium is a practical choice for daily listening.
Ebersole, Barbara; Soni, Resha S; Moran, Kathleen; Lango, Miriam; Devarajan, Karthik; Jamal, Nausheen
2018-05-01
Examine the relationship among the severity of patient-perceived voice impairment, perceptual dysphonia severity, occupational voice demand, and voice therapy adherence. Identify clinical predictors of increased risk for therapy nonadherence. A retrospective cohort study of patients presenting with a chief complaint of persistent dysphonia at an interdisciplinary voice center was done. The Voice Handicap Index-10 (VHI-10) and the Voice-Related Quality of Life (V-RQOL) survey scores, clinician rating of dysphonia severity using the Grade score from the Grade, Roughness, Breathiness, Asthenia, and Strain scale, occupational voice demand, and patient demographics were tested for associations with therapy adherence, defined as completion of the treatment plan. Classification and Regression Tree (CART) analysis was performed to establish thresholds for nonadherence risk. Of 166 patients evaluated, 111 were recommended for voice therapy. The therapy nonadherence rate was 56%. Occupational voice demand category, VHI-10, and V-RQOL scores were the only factors significantly correlated with therapy adherence (P < 0.0001, P = 0.018, and P = 0.008, respectively). CART analysis found that patients with low or no occupational voice demand are significantly more likely to be nonadherent with therapy than those with high occupational voice demand (P < 0.001). Furthermore, a VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting therapy nonadherence (P < 0.011 and P < 0.004, respectively). Occupational voice demand and patient perception of impairment are significantly and independently correlated with therapy adherence. A VHI-10 score of ≤9 or a V-RQOL score of >40 is a significant cutoff point for predicting nonadherence risk. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Faham, Maryam; Jalilevand, Nahid; Torabinezhad, Farhad; Silverman, Erin Pearson; Ahmadi, Akram; Anaraki, Zahra Ghayoumi; Jafari, Narges
2017-07-01
Teachers are at high risk of developing voice problems because of the excessive vocal demands necessitated by their profession. Teachers' self-assessment of vocal complaints, combined with subjective and objective measures of voice, may enable better therapeutic decision-making. This investigation compared audio-perceptual assessment and acoustic variables in teachers with and without voice complaints. Ninety-nine teachers completed this cross-sectional study and were assigned to one of two groups: those "with voice complaint (VC)" and those "without voice complaint (W-VC)." Voice samples were collected during reading, counting, and vowel prolongation tasks. Teachers were also asked to document any voice symptoms they experienced. Voice samples were analyzed using Dr. Speech program (4th version; Tiger Ltd., USA), and labeled "normal" or "abnormal" according to the "grade" dimension "G" from GRBAS scale. Twenty-one teachers were assigned to the VC group based on self-assessment data. There were statistically significant differences between the two groups with regard to self-reported voice symptoms of hoarseness, breathiness, pitch breaks, and vocal fatigue (P < 0.05). Fourteen participants in the VC group and 40 from the W-VC group were determined to demonstrate "abnormal" vocal quality on perceptual assessment. Only harmonic-to-noise ratio was significantly higher for the W-VC group (ES = 0.55). Teachers with and without voice complaints differed in the incidence, but not type of voice symptoms. Teachers' voice complaints did not correspond to perceptual and acoustic measures. This suggests a potential unmet need for teachers to receive further education on voice disorders. Copyright © 2017 The Voice Foundation. All rights reserved.
The Influence of Sleep Disorders on Voice Quality.
Rocha, Bruna Rainho; Behlau, Mara
2017-09-19
To verify the influence of sleep quality on the voice. Descriptive and analytical cross-sectional study. Data were collected by an online or printed survey divided in three parts: (1) demographic data and vocal health aspects; (2) self-assessment of sleep and vocal quality, and the influence that sleep has on voice; and (3) sleep and voice self-assessment inventories-the Epworth Sleepiness Scale (ESS), the Pittsburgh Sleep Quality Index (PSQI), and the Voice Handicap Index reduced version (VHI-10). A total of 862 people were included (493 women, 369 men), with a mean age of 32 years old (maximum age of 79 and minimum age of 18 years old). The perception of the influence that sleep has on voice showed a difference (P < 0.050) between measures of sleep quality and vocal self-assessment. There were higher scores on the ESS, PSQI, and VHI-10 protocols if sleep and vocal self-assessment were poor. The results indicate that the greater the effect that sleep has on voice, the greater the perceived voice handicap. The aspects that influence a voice handicap are vocal self-assessment, ESS total score, and self-assessment of the influence that sleep has on voice. The absence of daytime sleepiness is a protective factor (odds ratio [OR] > 1) against perceived voice handicap; the presence of daytime sleepiness is a damaging factor (OR < 1). Sleep quality influences voice. Perceived poor sleep quality is related to perceived poor vocal quality. Individuals with a voice handicap observe a greater influence of sleep on voice than those without. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Maynes, Timothy D; Podsakoff, Philip M
2014-01-01
Scholarly interest in employee voice behavior has increased dramatically over the past 15 years. Although this research has produced valuable knowledge, it has focused almost exclusively on voice as a positively intended challenge to the status quo, even though some scholars have argued that it need not challenge the status quo or be well intentioned. Thus, in this paper, we create an expanded view of voice; one that extends beyond voice as a positively intended challenge to the status quo to include voice that supports how things are being done in organizations as well as voice that may not be well intentioned. We construct a framework based on this expanded view that identifies 4 different types of voice behavior (supportive, constructive, defensive, and destructive). We then develop and validate survey measures for each of these. Evidence from 5 studies across 4 samples provides strong support for our new measures in that (a) a 4-factor confirmatory factor analysis model fit the data significantly better than 1-, 2-, or 3-factor models; (b) the voice measures converged with and yet remained distinct from conceptually related comparison constructs; (c) personality predictors exhibited unique patterns of relationships with the different types of voice; (d) variations in actual voice behaviors had a direct causal impact on responses to the survey items; and (e) each type of voice significantly impacted important outcomes for voicing employees (e.g., likelihood of relying on a voicing employee's opinions and evaluations of a voicing employee's overall performance). Implications of our findings are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved
Uloza, Virgilijus; Padervinskis, Evaldas; Uloziene, Ingrida; Saferis, Viktoras; Verikas, Antanas
2015-09-01
The aim of the present study was to evaluate the reliability of the measurements of acoustic voice parameters obtained simultaneously using oral and contact (throat) microphones and to investigate utility of combined use of these microphones for voice categorization. Voice samples of sustained vowel /a/ obtained from 157 subjects (105 healthy and 52 pathological voices) were recorded in a soundproof booth simultaneously through two microphones: oral AKG Perception 220 microphone (AKG Acoustics, Vienna, Austria) and contact (throat) Triumph PC microphone (Clearer Communications, Inc, Burnaby, Canada) placed on the lamina of thyroid cartilage. Acoustic voice signal data were measured for fundamental frequency, percent of jitter and shimmer, normalized noise energy, signal-to-noise ratio, and harmonic-to-noise ratio using Dr. Speech software (Tiger Electronics, Seattle, WA). The correlations of acoustic voice parameters in vocal performance were statistically significant and strong (r = 0.71-1.0) for the entire functional measurements obtained for the two microphones. When classifying into healthy-pathological voice classes, the oral-shimmer revealed the correct classification rate (CCR) of 75.2% and the throat-jitter revealed CCR of 70.7%. However, combination of both throat and oral microphones allowed identifying a set of three voice parameters: throat-signal-to-noise ratio, oral-shimmer, and oral-normalized noise energy, which provided the CCR of 80.3%. The measurements of acoustic voice parameters using a combination of oral and throat microphones showed to be reliable in clinical settings and demonstrated high CCRs when distinguishing the healthy and pathological voice patient groups. Our study validates the suitability of the throat microphone signal for the task of automatic voice analysis for the purpose of voice screening. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Valadez, Victor; Ysunza, Antonio; Ocharan-Hernandez, Esther; Garrido-Bustamante, Norma; Sanchez-Valerio, Araceli; Pamplona, Ma C
2012-09-01
Vocal Nodules (VN) are a functional voice disorder associated with voice misuse and abuse in children. There are few reports addressing vocal parameters in children with VN, especially after a period of vocal rehabilitation. The purpose of this study is to describe measurements of vocal parameters including Fundamental Frequency (FF), Shimmer (S), and Jitter (J), videonasolaryngoscopy examination and clinical perceptual assessment, before and after voice therapy in children with VN. Voice therapy was provided using visual support through Speech-Viewer software. Twenty patients with VN were studied. An acoustical analysis of voice was performed and compared with data from subjects from a control group matched by age and gender. Also, clinical perceptual assessment of voice and videonasolaryngoscopy were performed to all patients with VN. After a period of voice therapy, provided with visual support using Speech Viewer-III (SV-III-IBM) software, new acoustical analyses, perceptual assessments and videonasolaryngoscopies were performed. Before the onset of voice therapy, there was a significant difference (p<0.05) in mean FF, S and J, between the patients with VN and subjects from the control group. After the voice therapy period, a significant improvement (p<0.05) was found in all acoustic voice parameters. Moreover, perceptual voice analysis demonstrated improvement in all cases. Finally, videonasolaryngoscopy demonstrated that vocal nodules were no longer discernible on the vocal folds in any of the cases. SV-III software seems to be a safe and reliable method for providing voice therapy in children with VN. Acoustic voice parameters, perceptual data and videonasolaryngoscopy were significantly improved after the speech therapy period was completed. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Caminero Cueva, Maria Jesús; Señaris González, Blanca; Llorente Pendás, José Luis; Gorriz Gil, Carmen; López Llames, Aurora; Alonso Pantiga, Ramón; Suárez Nieto, Carlos
2007-01-01
We analyzed the functional outcome and self-evaluation of the voice of patients with T1 glottic carcinoma treated with endoscopic laser surgery and radiotherapy. We performed an objective voice evaluation, as well as a physical, emotional and functional well being assessment of 19 patients treated with laser surgery and 18 patients treated with radiotherapy. Voice quality is affected both by surgery and radiotherapy. Voice parameters only show differences in the maximum phonation time between both treatments. Results in the Voice Handicap Index show that radiotherapy has less effect on patient voice quality perception. There is a reduced impact on the patient’s perception of voice quality after radiotherapy, despite there being no significant differences in vocal quality between radiotherapy and laser cordectomy. PMID:17999074
NASA Astrophysics Data System (ADS)
Kapalova, N.; Haumen, A.
2018-05-01
This paper addresses the structures and properties of a cryptographic information protection algorithm model based on NPNs and constructed on an SP-network. The main task of the research is to increase the cryptostrength of the algorithm. The transformation that improves the cryptographic strength of the algorithm is described in detail. The proposed model is based on an SP-network, chosen for the conversion properties of such networks. In the encryption process, transformations based on S-boxes and P-boxes are used. It is known that these transformations can withstand cryptanalysis. In addition, the proposed model uses transformations that satisfy the requirements of the "avalanche effect". As a result of this work, a computer program that implements an encryption algorithm model based on the SP-network has been developed.
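To make the S-box/P-box structure and the avalanche requirement concrete, here is a minimal sketch of a toy 16-bit SP-network. It is not the algorithm from the paper: the S-box and P-box tables, the round keys, and the function names `encrypt` and `avalanche` are all assumptions chosen for illustration, and the NPN-based transformation is not modelled.

```python
# Toy 16-bit substitution-permutation network (illustration only,
# NOT the cipher described in the abstract).

# Hypothetical 4-bit S-box: any bijection on 0..15 would serve for the demo.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

# Hypothetical P-box: input bit i moves to output position PBOX[i].
PBOX = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]


def substitute(state):
    """Apply the S-box to each of the four nibbles of a 16-bit state."""
    out = 0
    for nibble in range(4):
        shift = 4 * nibble
        out |= SBOX[(state >> shift) & 0xF] << shift
    return out


def permute(state):
    """Move every set bit to the position given by the P-box."""
    out = 0
    for i in range(16):
        if (state >> i) & 1:
            out |= 1 << PBOX[i]
    return out


def encrypt(block, round_keys):
    """Key mixing, substitution and permutation per round, then a final key XOR."""
    state = block
    for key in round_keys[:-1]:
        state = permute(substitute(state ^ key))
    return state ^ round_keys[-1]


def avalanche(block, round_keys):
    """Average fraction of output bits that flip when one input bit is flipped."""
    base = encrypt(block, round_keys)
    flipped = 0
    for bit in range(16):
        diff = base ^ encrypt(block ^ (1 << bit), round_keys)
        flipped += bin(diff).count("1")
    return flipped / (16 * 16)
```

A value of `avalanche(...)` near 0.5 is the usual informal target for the avalanche effect; a toy network this small with few rounds may fall well short of it.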
Efficient Boundary Extraction of BSP Solids Based on Clipping Operations.
Wang, Charlie C L; Manocha, Dinesh
2013-01-01
We present an efficient algorithm to extract the manifold surface that approximates the boundary of a solid represented by a Binary Space Partition (BSP) tree. Our polygonization algorithm repeatedly performs clipping operations on volumetric cells that correspond to a spatial convex partition and computes the boundary by traversing the connected cells. We use point-based representations along with finite-precision arithmetic to improve the efficiency and generate the B-rep approximation of a BSP solid. The core of our polygonization method is a novel clipping algorithm that uses a set of logical operations to make it resistant to degeneracies resulting from limited precision of floating-point arithmetic. The overall BSP to B-rep conversion algorithm can accurately generate boundaries with sharp and small features, and is faster than prior methods. At the end of this paper, we use this algorithm for a few geometric processing applications including Boolean operations, model repair, and mesh reconstruction.
ERIC Educational Resources Information Center
Tauberer, Joshua Ian
2010-01-01
The [voice] distinction between homorganic stops and fricatives is made by a number of acoustic correlates including voicing, segment duration, and preceding vowel duration. The present work looks at [voice] from a number of multidimensional perspectives. This dissertation's focus is a corpus study of the phonetic realization of [voice] in two…
Johnsrude, Ingrid S; Mackey, Allison; Hakyemez, Hélène; Alexander, Elizabeth; Trang, Heather P; Carlyon, Robert P
2013-10-01
People often have to listen to someone speak in the presence of competing voices. Much is known about the acoustic cues used to overcome this challenge, but almost nothing is known about the utility of cues derived from experience with particular voices--cues that may be particularly important for older people and others with impaired hearing. Here, we use a version of the coordinate-response-measure procedure to show that people can exploit knowledge of a highly familiar voice (their spouse's) not only to track it better in the presence of an interfering stranger's voice, but also, crucially, to ignore it so as to comprehend a stranger's voice more effectively. Although performance declines with increasing age when the target voice is novel, there is no decline when the target voice belongs to the listener's spouse. This finding indicates that older listeners can exploit their familiarity with a speaker's voice to mitigate the effects of sensory and cognitive decline.
Rinta, Tiija Elisabet; Welch, Graham F
2009-11-01
Traditionally, children's speaking and singing behaviors have been regarded as two separate sets of behaviors. Nevertheless, according to the voice-scientific view, all vocal functioning is interconnected due to the fact that we exploit the same voice and the same physiological mechanisms in generating all vocalization. The intention of the study was to investigate whether prepubertal children's speaking and singing behaviors are connected perceptually. Voice recordings were conducted with 60 10-year-old children. Each child performed a set of speaking and singing tasks in the voice experiments. Each voice sample was analyzed perceptually with a specially designed perceptual voice assessment protocol. The main finding was that the children's vocal functioning and voice quality in their speaking behavior correlated statistically significantly with those in their singing behavior. The findings imply that children's speaking and singing behaviors are perceptually connected through their vocal functioning and voice quality. Thus, it can be argued that children possess one voice that is used for generating their speaking and singing behaviors.
Broadband Gerchberg-Saxton algorithm for freeform diffractive spectral filter design.
Vorndran, Shelby; Russo, Juan M; Wu, Yuechen; Pelaez, Silvana Ayala; Kostuk, Raymond K
2015-11-30
A multi-wavelength expansion of the Gerchberg-Saxton (GS) algorithm is developed to design and optimize a surface relief Diffractive Optical Element (DOE). The DOE simultaneously diffracts distinct wavelength bands into separate target regions. A description of the algorithm is provided, and parameters that affect filter performance are examined. Performance is based on the spectral power collected within specified regions on a receiver plane. The modified GS algorithm is used to design spectrum splitting optics for CdSe and Si photovoltaic (PV) cells. The DOE has average optical efficiency of 87.5% over the spectral bands of interest (400-710 nm and 710-1100 nm). Simulated PV conversion efficiency is 37.7%, which is 29.3% higher than the efficiency of the better performing PV cell without spectrum splitting optics.
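For orientation, the classic single-wavelength Gerchberg-Saxton loop that the paper extends can be sketched as below. This is a 1-D, pure-Python illustration under stated assumptions: a phase-only element with unit amplitude, a plain discrete Fourier transform as the propagation model, and hypothetical names (`gerchberg_saxton`, `far_field_error`). The multi-wavelength extension and the receiver-plane regions from the abstract are not modelled.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); adequate for small demos."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def gerchberg_saxton(target_amp, iterations=50):
    """Return element-plane phases whose far field approximates target_amp."""
    n = len(target_amp)
    # Deterministic starting phase (an arbitrary choice for reproducibility).
    phase = [2 * math.pi * ((m * m) % n) / n for m in range(n)]
    for _ in range(iterations):
        # Element-plane constraint: unit amplitude, current phase.
        field = [cmath.exp(1j * p) for p in phase]
        far = dft(field)
        # Far-field constraint: keep the phase, impose the target amplitude.
        far = [a * cmath.exp(1j * cmath.phase(f)) for a, f in zip(target_amp, far)]
        back = dft(far, inverse=True)
        # Keep only the phase for the next pass.
        phase = [cmath.phase(b) for b in back]
    return phase


def far_field_error(phase, target_amp):
    """Sum of squared differences between achieved and target amplitudes."""
    far = dft([cmath.exp(1j * p) for p in phase])
    return sum((abs(f) - a) ** 2 for f, a in zip(far, target_amp))
```

The amplitude error of this loop does not increase from one iteration to the next, which is why comparing `far_field_error` after few and many iterations is a reasonable sanity check.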
Voice-enabled Knowledge Engine using Flood Ontology and Natural Language Processing
NASA Astrophysics Data System (ADS)
Sermet, M. Y.; Demir, I.; Krajewski, W. F.
2015-12-01
The Iowa Flood Information System (IFIS) is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to flood inundation maps, real-time flood conditions, flood forecasts, flood-related data, information and interactive visualizations for communities in Iowa. The IFIS is designed for use by the general public, often people with no domain knowledge and limited general science background. To improve communication with such an audience, we have introduced a voice-enabled knowledge engine on flood related issues in IFIS. Instead of navigating within many features and interfaces of the information system and web-based sources, the system provides dynamic computations based on a collection of built-in data, analysis, and methods. The IFIS Knowledge Engine connects to real-time stream gauges, in-house data sources, analysis and visualization tools to answer natural language questions. Our goal is the systematization of data and modeling results on flood related issues in Iowa, and to provide an interface for definitive answers to factual queries. The goal of the knowledge engine is to make all flood related knowledge in Iowa easily accessible to everyone, and to support voice-enabled natural language input. We aim to integrate and curate all flood related data, implement analytical and visualization tools, and make it possible to compute answers from questions. The IFIS explicitly implements analytical methods and models as algorithms, and curates all flood related data and resources so that all these resources are computable. The IFIS Knowledge Engine computes the answer by deriving it from its computational knowledge base. The knowledge engine processes the statement, accesses the data warehouse, runs complex database queries on the server side and returns outputs in various formats.
This presentation provides an overview of the IFIS Knowledge Engine, its unique information interface and functionality as an educational tool, and discusses the future plans for providing knowledge on flood related issues and resources. The IFIS Knowledge Engine provides an alternative access method to the comprehensive set of tools and data resources available in IFIS. The current implementation of the system accepts free-form input and supports voice recognition within browser and mobile applications.
Cottam, S; Paul, S N; Doughty, O J; Carpenter, L; Al-Mousawi, A; Karvounis, S; Done, D J
2011-09-01
Introduction. Hearing voices occurs in people without psychosis. Why hearing voices is such a key pathological feature of psychosis whilst remaining a manageable experience in nonpsychotic people is yet to be understood. We hypothesised that religious voice hearers would interpret voices in accordance with their beliefs and therefore experience less distress. Methods. Three voice hearing groups, which comprised: 20 mentally healthy Christians, 15 Christian patients with psychosis, and 14 nonreligious patients with psychosis. All completed (1) questionnaires with rating scales measuring the perceptual and emotional aspects of hallucinated voices, and (2) a semistructured interview to explore whether religious belief is used to make sense of the voice hearing experience. Results. The three groups had perceptually similar experiences when hearing the voices. Mentally healthy Christians appeared to assimilate the experience with their religious beliefs (schematic processing) resulting in positive interpretations. Christian patients tended not to assimilate the experience with their religious beliefs, frequently reporting nonreligious interpretations that were predominantly negative. Nearly all participants experienced voices as powerful, but mentally healthy Christians reported the power of voices positively. Conclusion. Religious belief appeared to have a profound, beneficial influence on the mentally healthy Christians' interpretation of hearing voices, but had little or no influence in the case of Christian patients.
Roy, Nelson; Merrill, Ray M; Thibeault, Susan; Gray, Steven D; Smith, Elaine M
2004-06-01
To examine the frequency and adverse effects of voice disorders on job performance and attendance in teachers and the general population, 2,401 participants from Iowa and Utah (n1 = 1,243 teachers and n2 = 1,279 nonteachers) were randomly selected and were interviewed by telephone using a voice disorder questionnaire. Teachers were significantly more likely than nonteachers to have experienced multiple voice symptoms and signs including hoarseness, discomfort, and increased effort while using their voice, tiring or experiencing a change in voice quality after short use, difficulty projecting their voice, trouble speaking or singing softly, and a loss of their singing range (all odds ratios [ORs] p <.05). Furthermore, teachers consistently attributed these voice symptoms to their occupation and were significantly more likely to indicate that their voice limited their ability to perform certain tasks at work, and had reduced activities or interactions as a result. Teachers, as compared with nonteachers, had missed more workdays over the preceding year because of voice problems and were more likely to consider changing occupations because of their voice (all comparisons p <.05). These findings strongly suggest that occupationally related voice dysfunction in teachers can have significant adverse effects on job performance, attendance, and future career choices.
Transmasculine People's Voice Function: A Review of the Currently Available Evidence.
Azul, David; Nygren, Ulrika; Södersten, Maria; Neuschaefer-Rube, Christiane
2017-03-01
This study aims to evaluate the currently available discursive and empirical data relating to those aspects of transmasculine people's vocal situations that are not primarily gender-related, to identify restrictions to voice function that have been observed in this population, and to make suggestions for future voice research and clinical practice. We conducted a comprehensive review of the voice literature. Publications were identified by searching six electronic databases and bibliographies of relevant articles. Twenty-two publications met inclusion criteria. Discourses and empirical data were analyzed for factors and practices that impact on voice function and for indications of voice function-related problems in transmasculine people. The quality of the evidence was appraised. The extent and quality of studies investigating transmasculine people's voice function was found to be limited. There was mixed evidence to suggest that transmasculine people might experience restrictions to a range of domains of voice function, including vocal power, vocal control/stability, glottal function, pitch range/variability, vocal endurance, and voice quality. More research into the different factors and practices affecting transmasculine people's voice function that takes account of a range of parameters of voice function and considers participants' self-evaluations is needed to establish how functional voice production can be best supported in this population. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Huang, Zhaohui; Huang, Xiemin
2018-04-01
This paper first introduces the trend toward integrating multi-channel interactions in automotive HMI (Human Machine Interface), starting from the complex information models faced by existing automotive HMI, and describes various interaction modes. By comparing voice interaction with touch screens, gestures and other interaction modes, the potential and feasibility of voice interaction in automotive HMI experience design are established. Then, the related theories of voice interaction, identification technologies, human cognitive models of voices and voice design methods are further explored. The research priority of this paper is then proposed, i.e. how to design voice interaction to create more humane task-oriented dialogue scenarios that enhance the interactive experience of automotive HMI. The specific scenarios in driving behaviors suitable for the use of voice interaction are studied and classified, and the usability principles and key elements for automotive HMI voice design are proposed according to the scenario features. Then, through a user participatory usability testing experiment, the dialogue processes of voice interaction in automotive HMI are defined. The logics and grammars in voice interaction are classified according to the experimental results, and the mental models in the interaction processes are analyzed. Finally, a voice interaction design method for creating humane task-oriented dialogue scenarios in the driving environment is proposed.
Voice responses to changes in pitch of voice or tone auditory feedback
NASA Astrophysics Data System (ADS)
Sivasankar, Mahalakshmi; Bauer, Jay J.; Babu, Tara; Larson, Charles R.
2005-02-01
The present study was undertaken to examine if a subject's voice F0 responded not only to perturbations in pitch of voice feedback but also to changes in pitch of a side tone presented congruent with voice feedback. Small magnitude brief duration perturbations in pitch of voice or tone auditory feedback were randomly introduced during sustained vowel phonations. Results demonstrated a higher rate and larger magnitude of voice F0 responses to changes in pitch of the voice compared with a triangular-shaped tone (experiment 1) or a pure tone (experiment 2). However, response latencies did not differ across voice or tone conditions. Data suggest that subjects responded to the change in F0 rather than harmonic frequencies of auditory feedback because voice F0 response prevalence, magnitude, or latency did not statistically differ across triangular-shaped tone or pure-tone feedback. Results indicate the audio-vocal system is sensitive to the change in pitch of a variety of sounds, which may represent a flexible system capable of adapting to changes in the subject's voice. However, lower prevalence and smaller responses to tone pitch-shifted signals suggest that the audio-vocal system may resist changes to the pitch of other environmental sounds when voice feedback is present.
Computer-automated dementia screening using a touch-tone telephone.
Mundt, J C; Ferber, K L; Rizzo, M; Greist, J H
2001-11-12
This study investigated the sensitivity and specificity of a computer-automated telephone system to evaluate cognitive impairment in elderly callers to identify signs of early dementia. The Clinical Dementia Rating Scale was used to assess 155 subjects aged 56 to 93 years (n = 74, 27, 42, and 12, with a Clinical Dementia Rating Scale score of 0, 0.5, 1, and 2, respectively). These subjects performed a battery of tests administered by an interactive voice response system using standard Touch-Tone telephones. Seventy-four collateral informants also completed an interactive voice response version of the Symptoms of Dementia Screener. Sixteen cognitively impaired subjects were unable to complete the telephone call. Performances on 6 of 8 tasks were significantly influenced by Clinical Dementia Rating Scale status. The mean (SD) call length was 12 minutes 27 seconds (2 minutes 32 seconds). A subsample (n = 116) was analyzed using machine-learning methods, producing a scoring algorithm that combined performances across 4 tasks. Results indicated a potential sensitivity of 82.0% and specificity of 85.5%. The scoring model generalized to a validation subsample (n = 39), producing 85.0% sensitivity and 78.9% specificity. The kappa agreement between predicted and actual group membership was 0.64 (P<.001). Of the 16 subjects unable to complete the call, 11 provided sufficient information to permit us to classify them as impaired. Standard scoring of the interactive voice response-administered Symptoms of Dementia Screener (completed by informants) produced a screening sensitivity of 63.5% and 100% specificity. A lower criterion found a 90.4% sensitivity, without lowering specificity. Computer-automated telephone screening for early dementia using either informant or direct assessment is feasible. Such systems could provide wide-scale, cost-effective screening, education, and referral services to patients and caregivers.
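The screening figures quoted above (sensitivity, specificity, kappa) all derive from a 2x2 table of predicted versus actual group membership. The sketch below shows the standard formulas; the function name `screening_metrics` and the example counts are assumptions, since the abstract does not report the underlying cell counts.

```python
def screening_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and Cohen's kappa from a 2x2 confusion table.

    tp/fn: impaired subjects flagged / missed by the screen.
    tn/fp: unimpaired subjects passed / wrongly flagged by the screen.
    """
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Observed agreement between predicted and actual group membership.
    observed = (tp + tn) / total
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa
```

With the hypothetical counts `tp=17, fn=3, tn=15, fp=4` (39 subjects), the function gives 85.0% sensitivity, 78.9% specificity and a kappa of about 0.64. These values are consistent with the validation figures in the abstract, but the counts themselves are an inference, not reported data.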
Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.
Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka
2008-07-01
In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing a voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during the intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated on the speech analysis software how a voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables among the intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved efficient to improve the voiceless/voiced contrast.
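The control rule stated at the end of the abstract (suspend the tone while intra-oral pressure exceeds 2.5 gf/cm2 and for the 35 milliseconds that follow) can be sketched as a simple gate over sampled pressure values. The function name `tone_enabled`, the fixed sampling period, and the sample-by-sample structure are assumptions for illustration; the abstract does not describe how the device implements the rule.

```python
def tone_enabled(pressure_samples, sample_period_ms, threshold=2.5, hold_ms=35.0):
    """Return one boolean per sample: True where the prosthetic tone should sound.

    pressure_samples: intra-oral pressure readings in gf/cm2.
    sample_period_ms: time between consecutive readings (assumed constant).
    """
    enabled = []
    hold_remaining = 0.0  # time left in the post-pressure suspension window
    for pressure in pressure_samples:
        if pressure > threshold:
            # Voiceless consonant detected: mute and restart the hold window.
            hold_remaining = hold_ms
            enabled.append(False)
        elif hold_remaining > 0:
            # Pressure has dropped, but the hold window is still running.
            enabled.append(False)
            hold_remaining -= sample_period_ms
        else:
            enabled.append(True)
    return enabled
```

For example, at a 5 ms sampling period the tone stays off for seven samples after the pressure falls back below the threshold.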
Schloneger, Matthew J; Hunter, Eric J
2017-01-01
The multiple social and performance demands placed on college/university singers could put their still-developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hertz) of voice use among such students. Nevertheless, no studies to date have incorporated the simultaneous acoustic voice quality measures into the acquisition of these measures to allow for direct comparison during the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and the estimated voice quality of college/university singing students (18-24 years old, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays measuring voice from an unprocessed accelerometer signal measured at the neck. From this signal, traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose were analyzed. Additional acoustic measures included perceived pitch, pitch strength, long-term average spectrum slope, alpha ratio, dB sound pressure level 1-3 kHz, and harmonic-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between nonsinging, solo singing, and choral singing. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
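As a rough guide to the dose metrics named above, the sketch below computes phonation percentage, time dose and cycle dose from per-frame voicing decisions and F0 estimates. The definitions used (time dose as accumulated voicing time, cycle dose as accumulated vocal fold oscillation cycles) are the commonly cited ones and are an assumption here, since the abstract gives no formulas. Distance dose is omitted because it also needs a vibration amplitude estimate. The function name `vocal_doses` and the frame format are hypothetical.

```python
def vocal_doses(frames, frame_s):
    """Compute simple vocal dose metrics.

    frames: list of (is_voiced, f0_hz) tuples, one per analysis frame.
    frame_s: duration of each frame in seconds (assumed constant).
    Returns (time_dose_s, cycle_dose, phonation_pct).
    """
    voiced_f0 = [f0 for is_voiced, f0 in frames if is_voiced]
    # Time dose: total time spent voicing.
    time_dose_s = len(voiced_f0) * frame_s
    # Cycle dose: number of oscillation cycles, i.e. F0 integrated over voiced time.
    cycle_dose = sum(f0 * frame_s for f0 in voiced_f0)
    # Phonation percentage: share of monitored time that was voiced.
    phonation_pct = 100.0 * len(voiced_f0) / len(frames)
    return time_dose_s, cycle_dose, phonation_pct
```

For instance, two voiced 50 ms frames at 200 Hz and 220 Hz out of four frames give a time dose of 0.1 s, a cycle dose of 21 cycles and a phonation percentage of 50%.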
Obligatory and facultative brain regions for voice-identity recognition
Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina
2018-01-01
Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names.
The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is only a facultative component of voice-identity recognition in situations where additional face-identity processing is required. PMID:29228111
Obligatory and facultative brain regions for voice-identity recognition.
Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina
2018-01-01
Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. 
The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is only a facultative component of voice-identity recognition in situations where additional face-identity processing is required. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain.
Voice to Voice: Developing In-Service Teachers' Personal, Collaborative, and Public Voices.
ERIC Educational Resources Information Center
Thurber, Frances; Zimmerman, Enid
1997-01-01
Describes a model for inservice education that begins with an interchange of teachers' voices with those of the students in an interactive dialog. The exchange allows them to develop their private voices through self-reflection and validation of their own experiences. (JOW)
Ping, Lichuan; Wang, Ningyuan; Tang, Guofang; Lu, Thomas; Yin, Li; Tu, Wenhe; Fu, Qian-Jie
2017-09-01
Because of limited spectral resolution, Mandarin-speaking cochlear implant (CI) users have difficulty perceiving fundamental frequency (F0) cues that are important to lexical tone recognition. To improve Mandarin tone recognition in CI users, we implemented and evaluated a novel real-time algorithm (C-tone) to enhance the amplitude contour, which is strongly correlated with the F0 contour. The C-tone algorithm was implemented in clinical processors and evaluated in eight users of the Nurotron NSP-60 CI system. Subjects were given 2 weeks of experience with C-tone. Recognition of Chinese tones, monosyllables, and disyllables in quiet was measured with and without the C-tone algorithm. Subjective quality ratings were also obtained for C-tone. After 2 weeks of experience with C-tone, there were small but significant improvements in recognition of lexical tones, monosyllables, and disyllables (P < 0.05 in all cases). Among lexical tones, the largest improvements were observed for Tone 3 (falling-rising) and the smallest for Tone 4 (falling). Improvements with C-tone were greater for disyllables than for monosyllables. Subjective quality ratings showed no strong preference for or against C-tone, except for perception of own voice, where C-tone was preferred. The real-time C-tone algorithm provided small but significant improvements for speech performance in quiet with no change in sound quality. Pre-processing algorithms to reduce noise and better real-time F0 extraction would improve the benefits of C-tone in complex listening environments. Chinese CI users' speech recognition in quiet can be significantly improved by modifying the amplitude contour to better resemble the F0 contour.
Voices on Voice: Perspectives, Definitions, Inquiry.
ERIC Educational Resources Information Center
Yancey, Kathleen Blake, Ed.
This collection of essays approaches "voice" as a means of expression that lives in the interactions of writers, readers, and language, and examines the conceptualizations of voice within the oral rhetorical and expressionist traditions, and the notion of voice as both a singular and plural phenomenon. An explanatory introduction by the…
Voice Therapy Practices and Techniques: A Survey of Voice Clinicians.
ERIC Educational Resources Information Center
Mueller, Peter B.; Larson, George W.
1992-01-01
Eighty-three voice disorder therapists' ratings of statements regarding voice therapy practices indicated that vocal nodules are the most frequent disorder treated; vocal abuse and hard glottal attack elimination, counseling, and relaxation were preferred treatment approaches; and voice therapy is more effective with adults than with children.…
Theran, Sally A
2009-09-01
The current study empirically examined predictors of level of voice (ethnicity, attachment, and gender role socialization) in a diverse sample of 108 14-year-old girls. Structural equation modeling results indicated that parental attachment predicted level of voice with authority figures, and gender role socialization predicted level of voice with authority figures and peers. Both masculinity and femininity were salient for higher levels of voice with authority figures whereas higher scores on masculinity contributed to higher levels of voice with peers. These findings suggest that, contrary to previous theoretical work, femininity itself is not a risk factor for low levels of voice. In addition, African-American girls had higher levels of voice with teachers and classmates than did Caucasian girls, and girls who were in a school with a greater concentration of ethnic minorities had higher levels of voice with peers than did girls at a school with fewer minority students.
Speech technology and cinema: can they learn from each other?
Pauletto, Sandra
2013-10-01
The voice is the most important sound of a film soundtrack. It represents a character and it carries language. There are different types of cinematic voices: dialogue, internal monologues, and voice-overs. Conventionally, two main characteristics differentiate these voices: lip synchronization and the voice's attributes that make it appropriate for the character (for example, a voice that sounds very close to the audience can be appropriate for a narrator, but not for an onscreen character). What happens, then, if a film character can only speak through an asynchronous machine that produces a 'robot-like' voice? This article discusses the sound-related work and experimentation done by the author for the short film Voice by Choice. It also attempts to discover whether speech technology design can learn from its cinematic representation, and if such uncommon film protagonists can contribute creatively to transform the conventions of cinematic voices.
Dimensionality in voice quality.
Bele, Irene Velsvik
2007-05-01
This study concerns speaking voice quality in a group of male teachers (n = 35) and male actors (n = 36); its purpose was to investigate normal and supranormal voices. The goal was to develop a method of valid perceptual evaluation for normal to supranormal and resonant voices. The voices (text reading at two loudness levels) were evaluated by 10 listeners for 15 vocal characteristics using VA scales. This investigation presents the results of an exploratory factor analysis of the vocal characteristics used in this method, reflecting four dimensions of major importance for normal and supranormal voices. Special emphasis is placed on the effects on voice quality of a change in loudness, as two loudness levels are studied. Furthermore, the vocal characteristics Sonority and Ringing voice quality receive special attention, as the essence of the term "resonant voice" was a basic issue throughout the doctoral dissertation in which this study was included.
When the face fits: recognition of celebrities from matching and mismatching faces and voices.
Stevenage, Sarah V; Neil, Greg J; Hamlin, Iain
2014-01-01
The results of two experiments are presented in which participants engaged in a face-recognition or a voice-recognition task. The stimuli were face-voice pairs in which the face and voice were co-presented and were either "matched" (same person), "related" (two highly associated people), or "mismatched" (two unrelated people). Analysis in both experiments confirmed that accuracy and confidence in face recognition were consistently high regardless of the identity of the accompanying voice. However, accuracy of voice recognition was increasingly affected as the relationship between voice and accompanying face declined. Moreover, when considering self-reported confidence in voice recognition, confidence remained high for correct responses despite the proportion of these responses declining across conditions. These results converged with existing evidence indicating the vulnerability of voice recognition as a relatively weak signaller of identity, and results are discussed in the context of a person-recognition framework.
Voice quality change in future professional voice users after 9 months of voice training.
Timmermans, Bernadette; De Bodt, Marc; Wuyts, Floris; Van de Heyning, Paul
2004-01-01
Sixty-eight students of a school for audiovisual communication participated in this study. Of these, 49 students received voice training for 9 months (the trained group) and 19 received no specific voice training (the untrained group). A multidimensional test battery containing the GRBAS scale, videolaryngostroboscopy, Maximum Phonation Time (MPT), jitter, lowest intensity (IL), highest frequency (FoH), Dysphonia Severity Index (DSI) and Voice Handicap Index (VHI) was applied before and after training to evaluate training outcome. The voice training is made up of technical workshops in small groups (five to eight subjects) and vocal coaching in the ateliers. In the technical workshops, basic skills are trained (posture, breathing technique, articulation and diction), and in the ateliers, the speech and language pathologist assists the subjects in the practice of their voice work. This study revealed a significant improvement over time in the objective measurements [Dysphonia Severity Index: from 2.3 to 4.5 (P<0.001)] and the self-evaluation [Voice Handicap Index: from 23 to 18.4 (P=0.016)] for the trained group only. This outcome favors the systematic introduction of voice training during the schooling of professional voice users.
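The Dysphonia Severity Index combines four of the measures listed in this abstract (MPT, FoH, IL, jitter) into a single weighted score. The sketch below uses the commonly cited regression weights for the DSI; the abstract itself does not state them, so the coefficients should be checked against the original DSI publication before any real use. The function name and the example input values are hypothetical.

```python
def dysphonia_severity_index(mpt_s, f0_high_hz, i_low_db, jitter_pct):
    """Weighted combination commonly cited for the DSI (assumed coefficients).

    mpt_s: maximum phonation time in seconds
    f0_high_hz: highest attainable frequency in Hz
    i_low_db: lowest attainable intensity in dB
    jitter_pct: jitter in percent
    """
    return (0.13 * mpt_s
            + 0.0053 * f0_high_hz
            - 0.26 * i_low_db
            - 1.18 * jitter_pct
            + 12.4)


# Usage with made-up measurements (not data from the study).
dsi = dysphonia_severity_index(mpt_s=20, f0_high_hz=800, i_low_db=55, jitter_pct=0.5)
```

Longer phonation time and a higher top frequency raise the score, while a louder softest phonation and more jitter lower it, which is consistent with the abstract reporting an increase in DSI as an improvement.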
Lin, Szu-Han Joanna; Johnson, Russell E
2015-09-01
One way that employees contribute to organizational effectiveness is by expressing voice. They may offer suggestions for how to improve the organization (promotive voice behavior), or express concerns to prevent harmful events from occurring (prohibitive voice behavior). Although promotive and prohibitive voices are thought to be distinct types of behavior, very little is known about their unique antecedents and consequences. In this study we draw on regulatory focus and ego depletion theories to derive a theoretical model that outlines a dynamic process of the antecedents and consequences of voice behavior. Results from 2 multiwave field studies revealed that promotion and prevention foci have unique ties to promotive and prohibitive voice, respectively. Promotive and prohibitive voice, in turn, were associated with decreases and increases, respectively, in depletion. Consistent with the dynamic nature of self-control, depletion was associated with reductions in employees' subsequent voice behavior, regardless of the type of voice (promotive or prohibitive). Results were consistent across 2 studies and remained even after controlling for other established antecedents of voice and alternative mediating mechanisms besides depletion. (c) 2015 APA, all rights reserved.
Petrovic-Lazic, Mirjana; Jovanovic, Nadica; Kulic, Milan; Babac, Snezana; Jurisic, Vladimir
2015-03-01
The aim of the study was to assess the effect of endolaryngeal phonomicrosurgery (EPM) and voice therapy in patients with vocal fold polyps using perceptual and acoustic analysis before and after both therapies. The acoustic tests and perceptual evaluation of voice were carried out on 41 female patients with vocal fold polyps before and after EPM and voice therapy. Both therapy strategies were performed. The acoustic parameters used were jitter percent (Jitt), pitch perturbation quotient (PPQ), shimmer percent (Shim), amplitude perturbation quotient (APQ), fundamental frequency variation (vF0), noise-to-harmonic ratio (NHR), and Voice Turbulence Index (VTI). For perceptual evaluation, the GRB scale was used. Results indicated higher values of the investigated parameters in the patient group than in the control group (P < 0.01). Good correlation between the perceptual hoarseness factors of the GRB scale and the objective acoustic voice parameters was observed. All analyzed acoustic parameters improved after phonomicrosurgery and voice therapy and tended to approach the values of the control group. For Jitt, Shim, vF0, VTI, and NHR, the differences were statistically significant. Perceptual voice evaluation revealed statistically significantly (P < 0.01) decreased ratings of G (grade), R (rough), and B (breathy) after surgery and voice therapy. Our data indicated that both acoustic and perceptual characteristics of voice in patients with vocal polyps significantly improved after phonomicrosurgical and voice treatment. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
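Jitter percent and shimmer percent are both cycle-to-cycle perturbation measures: the mean absolute difference between consecutive cycles, divided by the mean value, expressed as a percentage. The sketch below shows that general definition, assuming glottal periods (for jitter) or peak amplitudes (for shimmer) have already been extracted; the function name is hypothetical and the analysis software used in the study may differ in detail.

```python
def perturbation_percent(values):
    """Relative mean absolute cycle-to-cycle difference, in percent.

    Pass glottal period lengths to get jitter percent, or per-cycle peak
    amplitudes to get shimmer percent (cycle extraction is assumed done).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Usage: made-up period lengths in milliseconds.
jitter_pct = perturbation_percent([10.0, 11.0, 10.0, 11.0])
```

A perfectly periodic signal gives 0 percent; larger values indicate more irregular vibration, which is why these parameters fall after successful treatment.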
Tafiadis, Dionysios; Chronopoulos, Spyridon K; Siafaka, Vassiliki; Drosos, Konstantinos; Kosma, Evangelia I; Toki, Eugenia I; Ziavra, Nausica
2017-09-01
Student groups (e.g., teachers, speech language pathologists) are presumably at risk of developing a voice disorder due to misuse of their voice, which will affect their way of living. Multidisciplinary voice assessment of student populations is currently widespread, along with the use of self-reported questionnaires. This study compared the Voice Handicap Index domains and item scores between female students of speech and language therapy and of other health professions in Greece. We also examined the probability of speech language therapy students developing any vocal symptom. Two hundred female non-dysphonic students (aged 18-31) were recruited. Participants answered the Voice Evaluation Form and the Greek adaptation of the Voice Handicap Index. Significant differences were observed between the two groups (students of speech therapy and other health professions) in the Voice Handicap Index (total score, functional and physical domains), excluding the emotional domain. Furthermore, significant differences for specific Voice Handicap Index items, between subgroups, were observed. In conclusion, speech language therapy students had higher Voice Handicap Index scores, which probably could be an indicator for avoiding profession-related dysphonia at a later stage. Also, the Voice Handicap Index could, at first glance, serve as an assessment tool for the recognition of potential voice disorder development in students. In turn, the results could be used for indirect therapy approaches, such as providing methods for maintaining vocal health in different student populations. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Perceived control and voice handicap in patients with voice disorders.
Frazier, Patricia; Merians, Addie; Misono, Stephanie
2017-11-01
The purpose of the study was to replicate and extend previous research on the relation between perceived present control and voice handicap and to further examine the psychometric properties of a present control scale adapted for patients with voice disorders (Misono, Meredith, Peterson, & Frazier, 2016). Sample 1 consisted of 1,129 patients recruited from a voice disorder clinic who completed measures of perceived present control, distress, and voice handicap in the clinic. Sample 2 consisted of 62 patients from the same clinic who completed measures of present control, distress, voice handicap, and general control beliefs online at baseline and measures of present control and voice handicap again 3 weeks later (n = 59). With regard to the psychometric properties of the voice-adapted present control scale, alpha coefficients were above .80 and the 3-week test-retest reliability coefficient was .69. There was mixed support for the hypothesized 1-factor structure of the scale. In Sample 1, present control was more strongly associated with lower voice handicap than was distress and accounted for significant variance in voice handicap controlling for distress. In Sample 2, present control at baseline predicted later voice handicap, controlling for general control beliefs and distress. Present control appears to be a promising target for adjunctive interventions for patients with voice disorders. An evidence-based online present control intervention (Hintz, Frazier, & Meredith, 2015) is being adapted for this patient population. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Voice changes after thyroidectomy without recurrent laryngeal nerve injury.
Sinagra, Diego L; Montesinos, Manuel R; Tacchi, Verónica A; Moreno, Julio C; Falco, Jorge E; Mezzadri, Norberto A; Debonis, Daniel L; Curutchet, H Pablo
2004-10-01
Injury of the inferior laryngeal nerve is not the only cause of voice alteration after thyroidectomy; many patients notice minimal changes immediately after operation, without evidence of inferior laryngeal nerve damage. We hypothesized that there may be other causes for voice modification, such as injuries of the superior laryngeal nerve, prethyroid strap muscles, and cricothyroid muscles. We describe voice changes after total thyroidectomy, without inferior laryngeal nerve injury, using a computer program to objectively compare different patterns of voice. Forty-six consecutive patients who underwent total thyroidectomy were studied between March 1997 and December 1999. Acoustic voice analysis was performed preoperatively and at the second, fourth, and sixth postoperative months using a microphone adapted to a personal computer. Parameters measured were intensity of the voice (Shimmer) and fundamental frequency (Fo). No complications occurred during operation or in the postoperative period. Voice fatigue during phonation was the most common symptom after thyroidectomy. Forty patients (87%) stated that their voices had changed since the operation, and common complaints were voice alteration while speaking loudly, changes in voice pitch, and voice disorder while singing. Changes in the Fo and Shimmer values in smokers versus nonsmokers were similar (Fo overall, p = 0.56; Shimmer overall, p = 0.66), as were the same parameters in benign and malignant pathologies (Fo overall, p = 0.66; Shimmer overall, p = 0.67). Voice changes after uncomplicated thyroidectomy occur and can be objectively measured. This is important in the preoperative counseling of patients before thyroidectomy, for ethical and legal purposes.
Schloneger, Matthew; Hunter, Eric
2016-01-01
The multiple social and performance demands placed on college/university singers could put their still developing voices at risk. Previous ambulatory monitoring studies have analyzed the duration, intensity, and frequency (in Hz) of voice use among such students. Nevertheless, no studies to date have incorporated the simultaneous acoustic voice quality measures into the acquisition of these measures to allow for direct comparison during the same voicing period. Such data could provide greater insight into how young singers use their voices, as well as identify potential correlations between vocal dose and acoustic changes in voice quality. The purpose of this study was to assess the voice use and estimated voice quality of college/university singing students (18–24 y/o, N = 19). Ambulatory monitoring was conducted over three full, consecutive weekdays measuring voice from an unprocessed accelerometer signal measured at the neck. From this signal were analyzed traditional vocal dose metrics such as phonation percentage, dose time, cycle dose, and distance dose. Additional acoustic measures included perceived pitch, pitch strength, LTAS slope, alpha ratio, dB SPL 1–3 kHz, and harmonic-to-noise ratio. Major findings from more than 800 hours of recording indicated that among these students (a) higher vocal doses correlated significantly with greater voice intensity, more vocal clarity and less perturbation; and (b) there were significant differences in some acoustic voice quality metrics between non-singing, solo singing and choral singing. PMID:26897545
Kaneko, Mami; Hitomi, Takefumi; Takekawa, Takashi; Tsuji, Takuya; Kishimoto, Yo; Hirano, Shigeru
2017-09-26
Injury to the superior laryngeal nerve can result in dysphonia, and in particular, loss of vocal range. It can be an especially difficult problem to address with either voice therapy or surgical intervention. Some clinicians and scientists suggest that combining vocal exercises with adjunctive neuromuscular electrical stimulation may enhance the positive effects of voice therapy for superior laryngeal nerve paresis (SLNP). However, the effects of voice therapy without neuromuscular electrical stimulation are unknown. The purpose of this retrospective study was to demonstrate the clinical effectiveness of voice therapy for rehabilitating chronic SLNP dysphonia in two subjects, using interspike interval (ISI) variability of laryngeal motor units by laryngeal electromyography (LEMG). Both patients underwent LEMG and were diagnosed with having 70% recruitment of the cricothyroid muscle, and 70% recruitment of the cricothyroid and thyroarytenoid muscles, respectively. Both patients received voice therapy for 3 months. Grade, roughness, breathiness, asthenia, and strain (GRBAS) scale, stroboscopic examination, aerodynamic assessment, acoustic analysis, and Voice Handicap Index-10 were performed before and after voice therapy. Mean ISI variability during steady phonation was also assessed. After voice therapy, both patients showed improvement in vocal assessments by acoustic, aerodynamic, GRBAS, and Voice Handicap Index-10 analysis. LEMG indicated shortened ISIs in both cases. This study suggests that voice therapy for chronic SLNP dysphonia can be useful for improving SLNP and voice quality. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Voice Problems in New Zealand Teachers: A National Survey.
Leão, Sylvia H de S; Oates, Jennifer M; Purdy, Suzanne C; Scott, David; Morton, Randall P
2015-09-01
This study determined the prevalence and nature of voice problems in New Zealand (NZ) teachers using a national self-report questionnaire. Epidemiological cross-sectional survey. Participants were 1879 primary and secondary teachers (72.5% females). Three prevalence timeframes were estimated. Severity of voice problems, recovery time, days away from work, symptoms, health assistance, and voice education were also investigated. Prevalence of self-reported vocal problems was 33.2% during their teaching career, 24.7% over the teaching year, and 13.2% on the day of the survey. Primary teachers (P<0.001; odds ratio [OR]=1.74; confidence interval [CI]=1.33-2.40), females (P=0.008; OR=1.63; CI=1.13-2.37), and those aged 51-60 years (P=0.010; OR=1.45; CI=1.11-3.00) were more likely to report problems. Among teachers reporting voice problems during the year, 47% were moderate or severe; for 30%, voice recovery took more than 1 week. Approximately 28% stayed away from work 1-3 days owing to a vocal problem and 9% for more than 3 days. Women reported longer recovery times and more days away. Symptoms associated with voice problems (P<0.001) were voice quality alteration (OR=4.35; CI=3.40-5.57), vocal effort (OR=1.15; CI=0.96-1.37), voice breaks (OR=1.55; CI=1.30-1.84), voice projection difficulty (OR=1.25; CI=1.04-1.50), and throat discomfort (OR=1.22; CI=1.02-1.47). Of the teachers reporting voice problems, only 22.5% consulted a health practitioner. Only 38% of the teachers with chronic voice problems visited an otolaryngologist. Higher hours of voice training/education were associated with fewer self-reported voice problems. Voice problems are of concern for NZ teachers, as has been reported for teachers in other countries. There is still limited awareness among teachers about vocal health, potential risks, and specialized health services for voice problems. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
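For readers unfamiliar with the OR and CI notation in this abstract, the sketch below shows how an unadjusted odds ratio and its Wald 95% confidence interval are obtained from a 2x2 table. The counts are invented for illustration and are not from the survey, and the study may well have used adjusted estimates (for example from logistic regression) rather than this simple calculation.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 for an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Usage with hypothetical counts (not data from the study).
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
```

An interval that excludes 1 indicates an association at roughly the 5% level; an interval that includes 1 does not.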
Doarn, Charles R; Zacharias, Stephanie; Keck, Casey Stewart; Tabangin, Meredith; DeAlarcon, Alessandro; Kelchner, Lisa
2018-06-05
This article describes the design and implementation of a web-based portal developed to provide supported home practice between weekly voice therapy sessions delivered through telehealth to children with voice disorders. This in-between care consisted of supported home practice that was remotely monitored by speech-language pathologists (SLPs). A web-based voice therapy portal (VTP) was developed by an interdisciplinary team of SLPs (specialized in pediatric voice therapy), telehealth specialists, biomedical informaticians, and interface designers as a platform on which participants could complete voice therapy home practice. The VTP was subsequently field tested in a group of children with voice disorders who were participating in a larger telehealth study. Building the VTP for supported home practice for pediatric voice therapy was challenging, but successful. Key interactive features of the final site included 11 vocal hygiene questions, traditional voice therapy exercises grouped into levels, audio/visual voice therapy demonstrations, a store-and-retrieval system for voice samples, message/chat function, written guidelines for weekly therapy exercises, and questionnaires for parents to complete after each therapy session. Ten participants (9-14 years of age) diagnosed with a voice disorder were enrolled for eight weekly telehealth voice therapy sessions with follow-up in-between care provided using the VTP. The development and implementation of the VTP as a novel platform for the delivery of voice therapy home practice sessions were effective. We found that a versatile individual, who can work with all project staff (speak the language of both SLPs and information technologists), is essential to the development process. Once the website was established, participants and SLPs effectively utilized the web-based VTP. They found it feasible and useful for needed in-between care and reinforcement of therapeutic exercises.
Körner Gustafsson, Joakim; Södersten, Maria; Ternström, Sten; Schalling, Ellika
2018-02-15
This study examines the effects of an intensive voice treatment focusing on increasing voice intensity, LSVT LOUD® (Lee Silverman Voice Treatment), on voice use in daily life in a participant with Parkinson's disease, using a portable voice accumulator, the VoxLog. A secondary aim was to compare voice use between the participant and a matched healthy control. Participants were an individual with Parkinson's disease and his healthy monozygotic twin. Voice use was registered with the VoxLog during 9 weeks for the individual with Parkinson's disease and 2 weeks for the control. This included baseline registrations for both participants, 4 weeks during LSVT LOUD for the individual with Parkinson's disease and 1 week after treatment for both participants. For the participant with Parkinson's disease, follow-up registrations at 3, 6, and 12 months post-treatment were made. The individual with Parkinson's disease increased voice intensity during registrations in daily life by 4.1 dB post-treatment and by 1.4 dB at 1-year follow-up compared to before treatment. When monitored during laboratory recordings, an increase of 5.6 dB was seen post-treatment and 3.8 dB at 1-year follow-up. Changes in voice intensity were interpreted as a treatment effect as no significant correlations between changes in voice intensity and background noise were found for the individual with Parkinson's disease. The increase in voice intensity in a laboratory setting was comparable to findings previously reported following LSVT LOUD. The increase registered using ambulatory monitoring in daily life was lower but still reflected a clinically relevant change.
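To put the reported decibel gains in perspective, a level difference in dB can be converted to a linear ratio. The sketch below assumes the reported values are sound pressure level differences, so the pressure ratio is 10^(dB/20) and the power ratio is 10^(dB/10); the function names are chosen here for illustration.

```python
def db_to_pressure_ratio(delta_db):
    """Linear sound-pressure ratio for a level difference in dB."""
    return 10 ** (delta_db / 20.0)


def db_to_power_ratio(delta_db):
    """Linear power (intensity) ratio for a level difference in dB."""
    return 10 ** (delta_db / 10.0)


# The level gains reported in the abstract, converted to pressure ratios.
for gain_db in (4.1, 1.4, 5.6, 3.8):
    print(f"{gain_db} dB -> x{db_to_pressure_ratio(gain_db):.2f} sound pressure")
```

Under that assumption, the 4.1 dB gain in daily life corresponds to about 1.6 times the sound pressure, and the 1.4 dB gain at one year to about 1.17 times.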
Multiscale Mathematics for Biomass Conversion to Renewable Hydrogen
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plechac, Petr
2016-03-01
The overall objective of this project was to develop multiscale models for understanding and eventually designing complex processes for renewables. To the best of our knowledge, our work is the first attempt at modeling complex reacting systems, whose performance relies on underlying multiscale mathematics and developing rigorous mathematical techniques and computational algorithms to study such models. Our specific application lies at the heart of biofuels initiatives of DOE and entails modeling of catalytic systems, to enable economic, environmentally benign, and efficient conversion of biomass into either hydrogen or valuable chemicals.
Lebacq, Jean; Schoentgen, Jean; Cantarella, Giovanna; Bruss, Franz Thomas; Manfredi, Claudia; DeJonckere, Philippe
2017-09-01
Smartphone technology provides new opportunities for recording standardized voice samples of patients and transmitting the audio files to the voice laboratory. This drastically improves the achievement of baseline designs, used in research on efficiency of voice treatments. However, the basic requirement is the suitability of smartphones for recording and digitizing pathologic voices (mainly characterized by period perturbations and noise) without significant distortion. In a previous article, this was tested using realistic synthesized deviant voice samples (/a:/) with three precisely known levels of jitter and of noise in all combinations. High correlations were found between jitter and noise-to-harmonics ratio measured in (1) recordings via smartphones, (2) direct microphone recordings, and (3) sound files generated by the synthesizer. In the present work, similar experiments were performed (1) in the presence of increasing levels of ambient noise and (2) using synthetic deviant voice samples (/a:/) as well as synthetic voice material simulating a deviant short voiced utterance (/aiuaiuaiu/). Ambient noise levels up to 50 dBA are acceptable. However, signal processing occurs in some smartphones, and this significantly affects estimates of jitter and noise-to-harmonics ratio when formant changes are introduced in analogy with running speech. The conclusion is that voice material must provisionally be limited to a sustained /a/. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Griffiths, Sarah; Barnes, Rebecca; Britten, Nicky; Wilkinson, Ray
2011-01-01
Around 70% of people who develop Parkinson's disease (PD) experience speech and voice changes. Clinicians often find that when asked about their primary communication concerns, PD clients will talk about the difficulties they have 'getting into' conversations. This is an important area for clients and it has implications for quality of life and clinical management. To review the extant literature on PD and communication impairments in order to reveal key topic areas, the range of methodologies applied, and any gaps in knowledge relating to PD and social interaction and how these might be usefully addressed. A systematic search of a number of key databases and available grey literatures regarding PD and communication impairment was conducted (including motor speech changes, intelligibility, cognitive/language changes) to obtain a sense of key areas and methodologies applied. Research applying conversation analysis in the field of communication disability was also reviewed to illustrate the value of this methodology in uncovering common interactional difficulties, and in revealing the use of strategic collaborative competencies in naturally occurring conversation. In addition, available speech and language therapy assessment and intervention approaches to PD were examined with a view to their effectiveness in promoting individualized intervention planning and advice-giving for everyday interaction. A great deal has been written about the deficits underpinning communication changes in PD and the impact of communication disability on the self and others as measured in a clinical setting. Less is known about what happens for this client group in everyday conversations outside of the clinic. Current speech and language therapy assessments and interventions focus on the individual and are largely impairment based or focused on compensatory speaker-oriented techniques. 
A conversation analysis approach would complement basic research on what actually happens in everyday conversation for people with PD and their co-participants. The potential benefits of a conversation analysis approach to communication disability in PD include enabling a shift in clinical focus from individual impairment onto strategic collaborative competencies. This would have implications for client-centred intervention planning and the development of new and complementary clinical resources addressing participation. The impact would be new and improved support for those living with the condition as well as their families and carers. © 2011 Royal College of Speech & Language Therapists.
Crossing Cultures with Multi-Voiced Journals
ERIC Educational Resources Information Center
Styslinger, Mary E.; Whisenant, Alison
2004-01-01
In this article, the authors discuss the benefits of using multi-voiced journals as a teaching strategy in reading instruction. Multi-voiced journals, an adaptation of dual-voiced journals, encourage responses to reading in varied, cultured voices of characters. They are similar to reading journals in that they prod students to connect to the lives…
Parent Trigger Laws and the Promise of Parental Voice
ERIC Educational Resources Information Center
Smith, William C.; Rowland, Julie
2014-01-01
Parent trigger laws have gained momentum nationally under the premise that they will increase local authority by amplifying parental voice in the decision to turn around "failing" schools. Using Hirschman's exit, voice, and loyalty framework we create two conceptual models of voice and evaluate the promise of voice in California, home of…
14 CFR 23.1457 - Cockpit voice recorders.
Code of Federal Regulations, 2013 CFR
2013-01-01
... intelligibility. (c) Each cockpit voice recorder must be installed so that the part of the communication or audio... 14 Aeronautics and Space 1 2013-01-01 2013-01-01 false Cockpit voice recorders. 23.1457 Section 23... Equipment § 23.1457 Cockpit voice recorders. (a) Each cockpit voice recorder required by the operating rules...
14 CFR 23.1457 - Cockpit voice recorders.
Code of Federal Regulations, 2014 CFR
2014-01-01
... intelligibility. (c) Each cockpit voice recorder must be installed so that the part of the communication or audio... 14 Aeronautics and Space 1 2014-01-01 2014-01-01 false Cockpit voice recorders. 23.1457 Section 23... Equipment § 23.1457 Cockpit voice recorders. (a) Each cockpit voice recorder required by the operating rules...
14 CFR 23.1457 - Cockpit voice recorders.
Code of Federal Regulations, 2012 CFR
2012-01-01
... intelligibility. (c) Each cockpit voice recorder must be installed so that the part of the communication or audio... 14 Aeronautics and Space 1 2012-01-01 2012-01-01 false Cockpit voice recorders. 23.1457 Section 23... Equipment § 23.1457 Cockpit voice recorders. (a) Each cockpit voice recorder required by the operating rules...
Reported Voice Difficulties in Student Teachers: A Questionnaire Survey
ERIC Educational Resources Information Center
Fairfield, Carol; Richards, Brian
2007-01-01
As professional voice users, teachers are particularly at risk of abusing their voices and developing voice disorders during their career. In spite of this, attention paid to voice care in the initial training and further professional development of teachers is unevenly spread and insufficient. This article describes a questionnaire survey of 171…