Sample records for voice interactive systems

  1. 78 FR 71676 - Submission for Review: 3206-0201, Federal Employees Health Benefits (FEHB) Open Season Express...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-29

    ... (FEHB) Open Season Express Interactive Voice Response (IVR) System and Open Season Web site AGENCY: U.S... Benefits (FEHB) Open Season Express Interactive Voice Response (IVR) System and the Open Season Web site... Season Express Interactive Voice Response (IVR) System, and the Open Season Web site, Open Season Online...

  2. The Voice as Computer Interface: A Look at Tomorrow's Technologies.

    ERIC Educational Resources Information Center

    Lange, Holley R.

    1991-01-01

    Discussion of voice as the communications device for computer-human interaction focuses on voice recognition systems for use within a library environment. Voice technologies are described, including voice response and voice recognition; examples of voice systems in use in libraries are examined; and further possibilities, including use with…

  3. Design and realization of intelligent tourism service system based on voice interaction

    NASA Astrophysics Data System (ADS)

    Hu, Lei-di; Long, Yi; Qian, Cheng-yang; Zhang, Ling; Lv, Guo-nian

    2008-10-01

    Voice technology is one of the important means of improving the intelligence and user-friendliness of tourism service systems. Combining voice technology with application needs and system composition, the paper presents an overall framework for an intelligent tourism service system consisting of a presentation layer, a Web services layer, and a tourism application service layer. On this basis, the paper further elaborates the implementation of the system and its key technologies, including intelligent voice interaction, seamless integration of multiple data sources, location-aware guide services, and tourism safety control. Finally, based on the situation of Nanjing tourism, a prototype of the tourism service system is realized.

  4. Practical applications of interactive voice technologies: Some accomplishments and prospects

    NASA Technical Reports Server (NTRS)

    Grady, Michael W.; Hicklin, M. B.; Porter, J. E.

    1977-01-01

    A technology assessment of the application of computers and electronics to complex systems is presented. Three existing systems which utilize voice technology (speech recognition and speech generation) are described. Future directions in voice technology are also described.

  5. Voice Interactive Analysis System Study. Final Report, August 28, 1978 through March 23, 1979.

    ERIC Educational Resources Information Center

    Harry, D. P.; And Others

    The Voice Interactive Analysis System study continued research and development of the LISTEN real-time, minicomputer-based connected speech recognition system, within NAVTRAEQUIPCEN's program of developing automatic speech technology in support of training. An attempt was made to identify the most effective features detected by the TTI-500 model…

  6. Interface Anywhere: Development of a Voice and Gesture System for Spaceflight Operations

    NASA Technical Reports Server (NTRS)

    Thompson, Shelby; Haddock, Maxwell; Overland, David

    2013-01-01

    The Interface Anywhere Project was funded through the Innovation Charge Account (ICA) at NASA JSC in the Fall of 2012. The project was a collaboration between human factors and engineering to explore the possibility of designing an interface to control basic habitat operations through gesture and voice: (a) current interfaces require users to be physically near an input device in order to interact with the system; and (b) by using voice and gesture commands, the user is able to interact with the system anywhere within the work environment.

  7. 76 FR 72306 - Federal Housing Administration (FHA) Appraiser Roster: Appraiser Qualifications for Placement on...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-23

    ... Appraiser Roster regulations by replacing the obsolete references to the Credit Alert Interactive Voice Response System (CAIVRS) with references to its successor, the online-based Credit Alert Verification... propose the elimination of references to the Credit Alert Interactive Voice Response System (CAIVRS). On July...

  8. 76 FR 41441 - Federal Housing Administration (FHA) Appraiser Roster: Appraiser Qualifications for Placement on...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-14

    ... the FHA Appraiser Roster by replacing the obsolete references to the Credit Alert Interactive Voice Response System with references to its successor, the online-based Credit Alert Verification Reporting...'s Limited Denial of Participation list, or in HUD's Credit Alert Interactive Voice Response System...

  9. Scientific bases of human-machine communication by voice.

    PubMed Central

    Schafer, R W

    1995-01-01

    The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines. PMID:7479802

  10. Voice interactive electronic warning systems (VIEWS) - An applied approach to voice technology in the helicopter cockpit

    NASA Technical Reports Server (NTRS)

    Voorhees, J. W.; Bucher, N. M.

    1983-01-01

    The cockpit has been one of the most rapidly changing areas of new aircraft design over the past thirty years. In connection with these developments, a pilot can now be considered a decision maker/system manager as well as a vehicle controller. There is, however, a trend towards information overload in the cockpit, and information processing problems begin to occur for the rotorcraft pilot. One approach to overcoming these difficulties is based on the utilization of voice technology to improve the information transfer rate in the cockpit with respect to both input and output. Attention is given to the background of speech technology, the application of speech technology within the cockpit, voice interactive electronic warning system (VIEWS) simulation, and methodology. Information subsystems are considered, along with a dynamic simulation study and data collection.

  11. Discourse-voice regulatory strategies in the psychotherapeutic interaction: a state-space dynamics analysis.

    PubMed

    Tomicic, Alemka; Martínez, Claudio; Pérez, J Carola; Hollenstein, Tom; Angulo, Salvador; Gerstmann, Adam; Barroux, Isabelle; Krause, Mariane

    2015-01-01

    This study seeks to provide evidence of the dynamics associated with the configurations of discourse-voice regulatory strategies in patient-therapist interactions in relevant episodes within psychotherapeutic sessions. Its central assumption is that discourses manifest themselves differently in terms of their prosodic characteristics according to their regulatory functions in a system of interactions. The association between discourse and vocal quality in patients and therapists was analyzed in a sample of 153 relevant episodes taken from 164 sessions of five psychotherapies using the state space grid (SSG) method, a graphical tool based on the dynamic systems theory (DST). The results showed eight recurrent and stable discourse-voice regulatory strategies of the patients and three of the therapists. Also, four specific groups of these discourse-voice strategies were identified. The latter were interpreted as regulatory configurations, that is to say, as emergent self-organized groups of discourse-voice regulatory strategies constituting specific interactional systems. Both regulatory strategies and their configurations differed between two types of relevant episodes: Change Episodes and Rupture Episodes. As a whole, these results support the assumption that speaking and listening, as dimensions of the interaction that takes place during therapeutic conversation, occur at different levels. The study not only shows that these dimensions are dependent on each other, but also that they function as a complex and dynamic whole in therapeutic dialog, generating relational offers which allow the patient and the therapist to regulate each other and shape the psychotherapeutic process that characterizes each type of relevant episode.

  12. The interaction of tone with voicing and foot structure: evidence from Kera phonetics and phonology

    NASA Astrophysics Data System (ADS)

    Pearce, Mary Dorothy

    This thesis uses acoustic measurements as a basis for the phonological analysis of the interaction of tone with voicing and foot structure in Kera (a Chadic language). In both tone spreading and vowel harmony, the iambic foot acts as a domain for spreading. Further evidence for the foot comes from measurements of duration, intensity, and vowel quality. Kera is unusual in combining a tone system with a partially independent metrical system based on iambs. In words containing more than one foot, the foot is the tone-bearing unit (TBU), but in shorter words, the TBU is the syllable. In perception and production experiments, results show that Kera speakers, unlike English and French speakers, use fundamental frequency as the principal cue to the voicing contrast; voice onset time (VOT) has only a minor role. Historically, tones probably developed from voicing through a process of tonogenesis, but synchronically, the feature voice is no longer contrastive and VOT is used in an enhancing role. Some linguists have claimed that Kera is a key example for their controversial theory of long-distance voicing spread, but as voice is not part of Kera phonology, this thesis gives counter-evidence to the voice-spreading claim. An important finding from the experiments is that the phonological grammars differ between village women, men moving to town, and town men; these differences are attributed to French contact. The interaction between Kera tone and voicing and contact with French has produced changes from a two-way voicing contrast, through a three-way tonal contrast, to a two-way voicing contrast plus another contrast with short VOT. These diachronic and synchronic tone/voicing facts are analysed using laryngeal features and Optimality Theory. This thesis provides a body of new data, detailed acoustic measurements, and an analysis incorporating current theoretical issues in phonology, which make it of interest to Africanists and theoreticians alike.

  13. A study on the application of voice interaction in automotive human machine interface experience design

    NASA Astrophysics Data System (ADS)

    Huang, Zhaohui; Huang, Xiemin

    2018-04-01

    This paper first introduces the trend toward integrating multi-channel interactions in automotive HMI (Human Machine Interface), starting from the complex information models faced by existing automotive HMI, and describes various interaction modes. By comparing voice interaction with touch screens, gestures, and other interaction modes, the potential and feasibility of voice interaction in automotive HMI experience design are established. The related theories of voice interaction, recognition technologies, human cognitive models of voice, and voice design methods are then explored, and the research priority of this paper is proposed: how to design voice interaction that creates more humane task-oriented dialogue scenarios to enhance the interactive experience of automotive HMI. The specific driving scenarios suitable for the use of voice interaction are studied and classified, and usability principles and key elements for automotive HMI voice design are proposed according to the scenario features. Then, through a user-participatory usability-testing experiment, the dialogue processes of voice interaction in automotive HMI are defined. The logic and grammar of voice interaction are classified according to the experimental results, and the mental models in the interaction processes are analyzed. Finally, a voice interaction design method for creating humane task-oriented dialogue scenarios in the driving environment is proposed.

  14. Study on intelligent processing system of man-machine interactive garment frame model

    NASA Astrophysics Data System (ADS)

    Chen, Shuwang; Yin, Xiaowei; Chang, Ruijiang; Pan, Peiyun; Wang, Xuedi; Shi, Shuze; Wei, Zhongqian

    2018-05-01

    A man-machine interactive garment frame model intelligent processing system is studied in this paper. The system consists of several sensor devices, a voice processing module, mechanical moving parts, and a centralized data acquisition device. The sensor devices collect information on environmental changes caused by a body approaching the clothes frame model; the data acquisition device gathers the information sensed by the sensor devices; the voice processing module performs speaker-independent speech recognition to achieve human-machine interaction; and the mechanical moving parts make corresponding mechanical responses to the information processed by the data acquisition device. The sensor devices have a one-way connection to the data acquisition device, the data acquisition device has a two-way connection to the voice processing module, and the data acquisition device has a one-way connection to the mechanical moving parts. The intelligent processing system can judge whether it needs to interact with the customer, realizing man-machine interaction in place of the current rigid frame model.

  15. Micro-Based Speech Recognition: Instructional Innovation for Handicapped Learners.

    ERIC Educational Resources Information Center

    Horn, Carin E.; Scott, Brian L.

    A new voice-based learning system (VBLS), which allows a handicapped user to interact with a microcomputer by voice commands, is described. Speech or voice recognition is the computerized process of identifying a spoken word or phrase, including those resulting from speech impediments. This new technology is helpful to the severely physically…

  16. Eye-movements and Voice as Interface Modalities to Computer Systems

    NASA Astrophysics Data System (ADS)

    Farid, Mohsen M.; Murtagh, Fionn D.

    2003-03-01

    We investigate the visual and vocal modalities of interaction with computer systems. We focus our attention on the integration of visual and vocal interface as possible replacement and/or additional modalities to enhance human-computer interaction. We present a new framework for employing eye gaze as a modality of interface. While voice commands, as means of interaction with computers, have been around for a number of years, integration of both the vocal interface and the visual interface, in terms of detecting user's eye movements through an eye-tracking device, is novel and promises to open the horizons for new applications where a hand-mouse interface provides little or no apparent support to the task to be accomplished. We present an array of applications to illustrate the new framework and eye-voice integration.

  17. Discourse-voice regulatory strategies in the psychotherapeutic interaction: a state-space dynamics analysis

    PubMed Central

    Tomicic, Alemka; Martínez, Claudio; Pérez, J. Carola; Hollenstein, Tom; Angulo, Salvador; Gerstmann, Adam; Barroux, Isabelle; Krause, Mariane

    2015-01-01

    This study seeks to provide evidence of the dynamics associated with the configurations of discourse-voice regulatory strategies in patient–therapist interactions in relevant episodes within psychotherapeutic sessions. Its central assumption is that discourses manifest themselves differently in terms of their prosodic characteristics according to their regulatory functions in a system of interactions. The association between discourse and vocal quality in patients and therapists was analyzed in a sample of 153 relevant episodes taken from 164 sessions of five psychotherapies using the state space grid (SSG) method, a graphical tool based on the dynamic systems theory (DST). The results showed eight recurrent and stable discourse-voice regulatory strategies of the patients and three of the therapists. Also, four specific groups of these discourse-voice strategies were identified. The latter were interpreted as regulatory configurations, that is to say, as emergent self-organized groups of discourse-voice regulatory strategies constituting specific interactional systems. Both regulatory strategies and their configurations differed between two types of relevant episodes: Change Episodes and Rupture Episodes. As a whole, these results support the assumption that speaking and listening, as dimensions of the interaction that takes place during therapeutic conversation, occur at different levels. The study not only shows that these dimensions are dependent on each other, but also that they function as a complex and dynamic whole in therapeutic dialog, generating relational offers which allow the patient and the therapist to regulate each other and shape the psychotherapeutic process that characterizes each type of relevant episode. PMID:25932014

  18. Interactive Voice/Web Response System in clinical research

    PubMed Central

    Ruikar, Vrishabhsagar

    2016-01-01

    Emerging technologies in the computer and telecommunication industries have eased access to computers through the telephone. An Interactive Voice/Web Response System (IxRS) is a user-friendly system for end users, with complex, tailored programs at its backend. The backend programs are specially tailored for easy understanding by users. The clinical research industry has experienced a revolution in data-capture methodologies over time: different systems have evolved alongside emerging technologies and tools over the past couple of decades, for example, Electronic Data Capture, IxRS, and electronic patient-reported outcomes. PMID:26952178

  19. Interactive Voice/Web Response System in clinical research.

    PubMed

    Ruikar, Vrishabhsagar

    2016-01-01

    Emerging technologies in the computer and telecommunication industries have eased access to computers through the telephone. An Interactive Voice/Web Response System (IxRS) is a user-friendly system for end users, with complex, tailored programs at its backend. The backend programs are specially tailored for easy understanding by users. The clinical research industry has experienced a revolution in data-capture methodologies over time: different systems have evolved alongside emerging technologies and tools over the past couple of decades, for example, Electronic Data Capture, IxRS, and electronic patient-reported outcomes.

  20. Teaching and Learning Foreign Languages via System of "Voice over Internet Protocol" and Language Interactions Case Study: Skype

    ERIC Educational Resources Information Center

    Wahid, Wazira Ali Abdul; Ahmed, Eqbal Sulaiman; Wahid, Muntaha Ali Abdul

    2015-01-01

    This article presents a research study of online interaction in English teaching, especially conversation, utilizing VoIP (Voice over Internet Protocol) and a cosmopolitan online theme. Data were obtained through interviews. Simplifiers indicate how oral tasks need to be planned in order to facilitate engagement models conducive to…

  21. Sequoyah Foreign Language Translation System - Business Case Analysis

    DTIC Science & Technology

    2007-12-01

    ... Interactive Natural Dialogue System (S-MINDS) ... Voice Response Translator (VRT) ... Figure 8. U.S. Marine Military Policeman Demonstrating VRT (From: Ref. U.S... www.languagerealm.com/Files/usmc_mt_test_2004.pdf) ... The VRT is an S2S human language translation device that uses

  22. Increasing the Interaction with Distant Learners on an Interactive Telecommunications System.

    ERIC Educational Resources Information Center

    Schlenker, Jon

    1994-01-01

    Suggests a variety of ways to increase interaction with distance learners on an interactive telecommunications system, based on experiences at the University of Maine at Augusta. Highlights include establishing the proper environment; telephone systems; voice mail; fax; electronic mail; computer conferencing; postal mail; printed materials; and…

  23. The interaction of criminal procedure and outcome.

    PubMed

    Laxminarayan, Malini; Pemberton, Antony

    2014-01-01

    Procedural quality is an important aspect of crime victims' experiences in criminal proceedings and consists of different dimensions, two of which are procedural justice (voice) and interpersonal justice (respectful treatment). Social psychological research has suggested that the effects of both voice and respectful treatment on individuals' reactions are moderated by the outcomes of justice procedures. To add to this research, we extend this assertion to the criminal justice context, examining the interaction between assessments of procedural quality and outcome favorability with victims' trust in the legal system and self-esteem. Hierarchical regression analyses reveal that voice, respectful treatment, and outcome favorability are predictive of trust in the legal system and self-esteem. Further investigation reveals that being treated with respect is only related to trust in the legal system when outcome favorability is high. Copyright © 2014 Elsevier Ltd. All rights reserved.

  24. Neurobiological correlates of emotional intelligence in voice and face perception networks

    PubMed Central

    Karle, Kathrin N; Ethofer, Thomas; Jacob, Heike; Brück, Carolin; Erb, Michael; Lotze, Martin; Nizielski, Sophia; Schütz, Astrid; Wildgruber, Dirk; Kreifelts, Benjamin

    2018-01-01

    Facial expressions and voice modulations are among the most important communicational signals to convey emotional information. The ability to correctly interpret this information is highly relevant for successful social interaction and represents an integral component of emotional competencies that have been conceptualized under the term emotional intelligence. Here, we investigated the relationship of emotional intelligence as measured with the Salovey-Caruso-Emotional-Intelligence-Test (MSCEIT) with cerebral voice and face processing using functional and structural magnetic resonance imaging. MSCEIT scores were positively correlated with increased voice-sensitivity and gray matter volume of the insula accompanied by voice-sensitivity enhanced connectivity between the insula and the temporal voice area, indicating generally increased salience of voices. Conversely, in the face processing system, higher MSCEIT scores were associated with decreased face-sensitivity and gray matter volume of the fusiform face area. Taken together, these findings point to an alteration in the balance of cerebral voice and face processing systems in the form of an attenuated face-vs-voice bias as one potential factor underpinning emotional intelligence. PMID:29365199

  25. Neurobiological correlates of emotional intelligence in voice and face perception networks.

    PubMed

    Karle, Kathrin N; Ethofer, Thomas; Jacob, Heike; Brück, Carolin; Erb, Michael; Lotze, Martin; Nizielski, Sophia; Schütz, Astrid; Wildgruber, Dirk; Kreifelts, Benjamin

    2018-02-01

    Facial expressions and voice modulations are among the most important communicational signals to convey emotional information. The ability to correctly interpret this information is highly relevant for successful social interaction and represents an integral component of emotional competencies that have been conceptualized under the term emotional intelligence. Here, we investigated the relationship of emotional intelligence as measured with the Salovey-Caruso-Emotional-Intelligence-Test (MSCEIT) with cerebral voice and face processing using functional and structural magnetic resonance imaging. MSCEIT scores were positively correlated with increased voice-sensitivity and gray matter volume of the insula accompanied by voice-sensitivity enhanced connectivity between the insula and the temporal voice area, indicating generally increased salience of voices. Conversely, in the face processing system, higher MSCEIT scores were associated with decreased face-sensitivity and gray matter volume of the fusiform face area. Taken together, these findings point to an alteration in the balance of cerebral voice and face processing systems in the form of an attenuated face-vs-voice bias as one potential factor underpinning emotional intelligence.

  26. Voice and choice in health care in England: understanding citizen responses to dissatisfaction.

    PubMed

    Dowding, Keith; John, Peter

    2011-01-01

    Using data from a five-year online survey, the paper examines the effects of relative satisfaction with health services on individuals' voice-and-choice activity in the English public health care system. Voice is considered in three parts: individual voice (complaints), collective voice (voting), and participation (collective action). Exercising choice is seen in terms of complete exit (not using health care), internal exit (choosing another public service provider), and private exit (using private health care). The interaction of satisfaction and forms of voice and choice is analysed over time. Both voice and choice are correlated with dissatisfaction, with those who are unhappy with the NHS more likely to voice privately and to plan to take up private health care. Those unable to choose private provision are likely to use private voice. These factors are not affected by items associated with social capital; indeed, being more trusting leads to lower voice activity.

  27. Design and Implementation of an Interactive Website for Pediatric Voice Therapy-The Concept of In-Between Care: A Telehealth Model.

    PubMed

    Doarn, Charles R; Zacharias, Stephanie; Keck, Casey Stewart; Tabangin, Meredith; DeAlarcon, Alessandro; Kelchner, Lisa

    2018-06-05

    This article describes the design and implementation of a web-based portal developed to provide supported home practice between weekly voice therapy sessions delivered through telehealth to children with voice disorders. This in-between care consisted of supported home practice that was remotely monitored by speech-language pathologists (SLPs). A web-based voice therapy portal (VTP) was developed as a platform so participants could complete voice therapy home practice by an interdisciplinary team of SLPs (specialized in pediatric voice therapy), telehealth specialists, biomedical informaticians, and interface designers. The VTP was subsequently field tested in a group of children with voice disorders, participating in a larger telehealth study. Building the VTP for supported home practice for pediatric voice therapy was challenging, but successful. Key interactive features of the final site included 11 vocal hygiene questions, traditional voice therapy exercises grouped into levels, audio/visual voice therapy demonstrations, a store-and-retrieval system for voice samples, message/chat function, written guidelines for weekly therapy exercises, and questionnaires for parents to complete after each therapy session. Ten participants (9-14 years of age) diagnosed with a voice disorder were enrolled for eight weekly telehealth voice therapy sessions with follow-up in-between care provided using the VTP. The development and implementation of the VTP as a novel platform for the delivery of voice therapy home practice sessions were effective. We found that a versatile individual, who can work with all project staff (speak the language of both SLPs and information technologists), is essential to the development process. Once the website was established, participants and SLPs effectively utilized the web-based VTP. They found it feasible and useful for needed in-between care and reinforcement of therapeutic exercises.

  28. 78 FR 30896 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-23

    ..., Associated Form and OMB Number: Interactive Customer Evaluation (ICE)/Enterprise Voice of the Customer (EVoC...)/ Enterprise Voice of the Customer (EVoC) System automates and minimizes the use of the current manual paper... service provider on the quality of their experience and their satisfaction level. This is a management...

  29. Interactive Augmentation of Voice Quality and Reduction of Breath Airflow in the Soprano Voice.

    PubMed

    Rothenberg, Martin; Schutte, Harm K

    2016-11-01

    In 1985, at a conference sponsored by the National Institutes of Health, Martin Rothenberg first described a form of nonlinear source-tract acoustic interaction that some sopranos, singing in their high range, can use to reduce total airflow, allowing a note to be held longer while simultaneously enriching the quality of the voice, without straining the voice. (M. Rothenberg, "Source-Tract Acoustic Interaction in the Soprano Voice and Implications for Vocal Efficiency," Fourth International Conference on Vocal Fold Physiology, New Haven, Connecticut, June 3-6, 1985.) In this paper, we describe additional evidence for this type of nonlinear source-tract interaction in some soprano singing and describe an analogous interaction phenomenon in communication engineering. We also present some implications for voice research and pedagogy. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  30. Multi-modal assessment of on-road demand of voice and manual phone calling and voice navigation entry across two embedded vehicle systems.

    PubMed

    Mehler, Bruce; Kidd, David; Reimer, Bryan; Reagan, Ian; Dobres, Jonathan; McCartt, Anne

    2016-03-01

    One purpose of integrating voice interfaces into embedded vehicle systems is to reduce drivers' visual and manual distractions with 'infotainment' technologies. However, there is scant research on actual benefits in production vehicles or how different interface designs affect attentional demands. Driving performance, visual engagement, and indices of workload (heart rate, skin conductance, subjective ratings) were assessed in 80 drivers randomly assigned to drive a 2013 Chevrolet Equinox or Volvo XC60. The Chevrolet MyLink system allowed completing tasks with one voice command, while the Volvo Sensus required multiple commands to navigate the menu structure. When calling a phone contact, both voice systems reduced visual demand relative to the visual-manual interfaces, with reductions for drivers in the Equinox being greater. The Equinox 'one-shot' voice command showed advantages during contact calling but had significantly higher error rates than Sensus during destination address entry. For both secondary tasks, neither voice interface entirely eliminated visual demand. Practitioner Summary: The findings reinforce the observation that most, if not all, automotive auditory-vocal interfaces are multi-modal interfaces in which the full range of potential demands (auditory, vocal, visual, manipulative, cognitive, tactile, etc.) need to be considered in developing optimal implementations and evaluating drivers' interaction with the systems. Social Media: In-vehicle voice-interfaces can reduce visual demand but do not eliminate it and all types of demand need to be taken into account in a comprehensive evaluation.

  12. The role of voice input for human-machine communication.

    PubMed Central

    Cohen, P R; Oviatt, S L

    1995-01-01

    Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent real-time speech recognition, and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words, and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology. PMID:7479803

  13. Using Continuous Voice Recognition Technology as an Input Medium to the Naval Warfare Interactive Simulation System (NWISS).

    DTIC Science & Technology

    1984-06-01

    [Abstract garbled in the DTIC scan; the recoverable table of contents indicates the thesis covers continuous voice recognition, the VERBEX 3000 Speech Application Development System (SPADS), and the Naval Warfare Interactive Simulation System (NWISS).]

  14. Designing of Intelligent Multilingual Patient Reported Outcome System (IMPROS)

    PubMed Central

    Pourasghar, Faramarz; Partovi, Yeganeh

    2015-01-01

    Background: In patient self-reporting of outcomes, patients themselves record disease symptoms outside medical centers and report them to medical staff at specified intervals. One self-reporting method uses interactive voice response (IVR), in which pre-designed questions are played as recorded voice tracks and the caller answers by pressing keys on the phone's keypad. Aim: The present research describes the design framework of such a system, based on IVR technology, designed and administered for the first time in Iran. Methods: The interactive voice response system comprised two main parts, hardware and software. The hardware section includes one or more digital phone lines, a modem card with voice-playback capability, and a PC. The IVR software, in turn, acts as an intelligent control center, recording call information and controlling incoming data. Results: The system's main features are that it can run on common PCs with simple, inexpensive modems, collects responses quickly, and is appropriate for patients with low literacy. The system is applicable to monitoring chronic diseases, cancer, and psychological disorders, and can be suitable for the care of elders and children who require long-term care. Other features include user-friendliness, reduced direct and indirect costs of disease treatment, and a high level of security for access to patients' profiles. Conclusions: The intelligent multilingual patient reported outcome system (IMPROS) gives patients the opportunity to participate more actively during treatment and improves the mutual interaction between patient and medical staff. Moreover, it increases the quality of medical services, in addition to empowering patients and their caregivers. PMID:26635441
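    The question-playback/keypad-answer loop described in the Methods section can be sketched in a few lines. The prompt-playing and digit-reading functions below are hypothetical stand-ins for the modem's voice-playback and DTMF-detection calls; the paper does not publish its code, so everything here is an illustrative assumption.

```python
# Illustrative sketch of an IVR questionnaire loop: pre-recorded question
# prompts are played and the caller answers by pressing keypad digits (DTMF).
# play_prompt() and read_digit() abstract the telephony layer.

def run_questionnaire(questions, play_prompt, read_digit, valid="12345"):
    """Play each question and collect one keypad digit per answer.

    questions is a list of (question_id, audio_file) pairs. An invalid key
    press causes the question to be repeated.
    """
    answers = {}
    for qid, audio_file in questions:
        while True:
            play_prompt(audio_file)
            digit = read_digit()
            if digit in valid:
                answers[qid] = digit
                break  # answer accepted; move on to the next question
    return answers
```

    In a real deployment, `play_prompt` would drive the modem card's voice playback and `read_digit` would block on DTMF tone detection on the phone line.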

  15. Forms of Mediation: The Case of Interpreter-Mediated Interactions in Medical Systems

    ERIC Educational Resources Information Center

    Baraldi, Claudio

    2009-01-01

    This paper analyses the forms of mediation in interlinguistic interactions performed in Italian healthcare services and in contexts of migration. The literature encourages dialogic transformative mediation, empowering participants' voices and changing cultural presuppositions in social systems. It may be doubtful, however, whether mediation can…

  16. The Army word recognition system

    NASA Technical Reports Server (NTRS)

    Hadden, David R.; Haratz, David

    1977-01-01

    The application of speech recognition technology in the Army command and control area is presented. The problems associated with this program are described, as well as its relevance in terms of the man/machine interactions, voice inflexions, and the amount of training needed to interact with and utilize the automated system.

  17. Administration of Neuropsychological Tests Using Interactive Voice Response Technology in the Elderly: Validation and Limitations

    PubMed Central

    Miller, Delyana Ivanova; Talbot, Vincent; Gagnon, Michèle; Messier, Claude

    2013-01-01

    Interactive voice response (IVR) systems are computer programs that interact with people to provide a range of services, from business to health care. We examined the ability of an IVR system to administer and score a verbal fluency task (fruits) and the digit span forward and backward in 158 community-dwelling people aged 65 to 92 years (full-scale IQ of 68-134). Only six participants could not complete all tasks, mostly due to early technical problems in the study. Participants were also administered Wechsler Intelligence Scale fourth edition (WAIS-IV) and Wechsler Memory Scale fourth edition subtests. The IVR system correctly recognized 90% of the fruits in the verbal fluency task and 93-95% of the number sequences in the digit span. The IVR system typically underestimated the performance of participants because of voice recognition errors. In the digit span, these errors led to the erroneous discontinuation of the test; however, the correlation between IVR scoring and clinical scoring was still high (93-95%). The correlation between the IVR verbal fluency and the WAIS-IV Similarities subtest was 0.31. The correlation between the IVR digit span forward and backward and the in-person administration was 0.46. We discuss how valid and useful IVR systems are for neuropsychological testing in the elderly. PMID:23950755

  18. Age Differences in Voice Evaluation: From Auditory-Perceptual Evaluation to Social Interactions

    ERIC Educational Resources Information Center

    Lortie, Catherine L.; Deschamps, Isabelle; Guitton, Matthieu J.; Tremblay, Pascale

    2018-01-01

    Purpose: The factors that influence the evaluation of voice in adulthood, as well as the consequences of such evaluation on social interactions, are not well understood. Here, we examined the effect of listeners' age and the effect of talker age, sex, and smoking status on the auditory-perceptual evaluation of voice, voice-related psychosocial…

  19. The use of an automated interactive voice response system to manage medication identification calls to a poison center.

    PubMed

    Krenzelok, Edward P; Mrvos, Rita

    2009-05-01

    In 2007, medication identification requests (MIRs) accounted for 26.2% of all calls to U.S. poison centers. MIRs are documented with minimal information, but they still require an inordinate amount of work by specialists in poison information (SPI). An analysis was undertaken to identify options to reduce the impact of MIRs on both human and financial resources. All MIRs (2003-2007) to a certified regional poison information center were analyzed to determine call patterns and staffing. The data were used to justify an efficient and cost-effective solution. MIRs represented 42.3% of the 2007 call volume. Optimal staffing would require hiring an additional four full-time-equivalent SPIs. An interactive voice response (IVR) system was developed to respond to the MIRs. The IVR was used to develop the Medication Identification System, which allowed the diversion of up to 50% of the MIRs, enhancing surge capacity and allowing specialists to address the more emergent poison exposure calls. This technology is an entirely voice-activated response call management system that collects zip code, age, gender and drug data and stores all responses as .csv files for reporting purposes. The query bank includes the 200 most common MIRs, and the system features text-to-voice synthesis that allows easy modification of the drug identification menu. Callers always have the option of engaging an SPI at any time during the IVR call flow. The IVR is an efficient and effective alternative that creates better staff utilization.
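    The .csv reporting the abstract mentions (one row of zip code, age, gender and drug data per call) amounts to a simple append-only log. The field names below are assumptions for illustration; the paper does not specify its file layout.

```python
# Minimal sketch of appending one IVR call record per row to a .csv report
# file, writing a header row only when the file is first created.
import csv
import os

FIELDS = ["zip_code", "age", "gender", "drug"]

def log_call(path, record):
    """Append a single call record (a dict) to the CSV report at `path`."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        # Missing fields are written as empty strings.
        writer.writerow({k: record.get(k, "") for k in FIELDS})
```

    Keeping the log append-only means the IVR never rewrites past records, which suits the reporting use the abstract describes.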

  20. Central nervous system control of the laryngeal muscles in humans

    PubMed Central

    Ludlow, Christy L.

    2005-01-01

    Laryngeal muscle control may vary for different functions such as: voice for speech communication, emotional expression during laughter and cry, breathing, swallowing, and cough. This review discusses the control of the human laryngeal muscles for some of these different functions. Sensori-motor aspects of laryngeal control have been studied by eliciting various laryngeal reflexes. The role of audition in learning and monitoring ongoing voice production for speech is well known; while the role of somatosensory feedback is less well understood. Reflexive control systems involving central pattern generators may contribute to swallowing, breathing and cough with greater cortical control during volitional tasks such as voice production for speech. Volitional control is much less well understood for each of these functions and likely involves the integration of cortical and subcortical circuits. The new frontier is the study of the central control of the laryngeal musculature for voice, swallowing and breathing and how volitional and reflexive control systems may interact in humans. PMID:15927543

  1. Research on realization scheme of interactive voice response (IVR) system

    NASA Astrophysics Data System (ADS)

    Jin, Xin; Zhu, Guangxi

    2003-12-01

    In this paper, a novel interactive voice response (IVR) system is proposed that differs markedly from traditional designs. Using software operation and network control, the IVR system depends only on software in the server hosting the system and on hardware in user-side network terminals such as gateways (GW), personal gateways (PG), and PCs. The system transmits audio to the network terminals over the Internet using the Real-time Transport Protocol (RTP) and controls the call flow with a finite state machine (FSM) driven by H.245 messages sent from the user side and by system control factors. Compared with other existing schemes, this IVR system offers several advantages: it greatly reduces system cost, fully utilizes existing network resources, and enhances flexibility. The system can be deployed on any service server anywhere on the Internet and is even suitable for wireless applications based on packet-switched communication. The IVR system has been implemented and has passed system testing.
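    The FSM-based call control described above can be sketched as a transition table keyed on (state, event) pairs. The state and event names here are illustrative assumptions standing in for the H.245 messages and system control factors the abstract mentions, not details from the paper.

```python
# Hypothetical sketch of finite-state-machine (FSM) call control for an IVR
# server: states advance on user-side events; unknown events are ignored.
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    PLAYING_PROMPT = auto()
    COLLECTING_DIGITS = auto()
    DISCONNECTED = auto()

# (current state, event) -> next state
TRANSITIONS = {
    (State.IDLE, "call_setup"): State.PLAYING_PROMPT,
    (State.PLAYING_PROMPT, "prompt_done"): State.COLLECTING_DIGITS,
    (State.COLLECTING_DIGITS, "digits_done"): State.PLAYING_PROMPT,
    (State.PLAYING_PROMPT, "hangup"): State.DISCONNECTED,
    (State.COLLECTING_DIGITS, "hangup"): State.DISCONNECTED,
}

class IvrFsm:
    def __init__(self):
        self.state = State.IDLE

    def handle(self, event: str) -> State:
        """Advance the FSM; events with no defined transition leave the
        state unchanged."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

    In the server described by the abstract, each event would be raised by an incoming signalling message, and the prompt-playing states would stream audio to the terminal over RTP.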

  2. Translational Systems Biology and Voice Pathophysiology

    PubMed Central

    Li, Nicole Y. K.; Abbott, Katherine Verdolini; Rosen, Clark; An, Gary; Hebda, Patricia A.; Vodovotz, Yoram

    2011-01-01

    Objectives/Hypothesis: Personalized medicine has been called upon to tailor healthcare to an individual's needs. Evidence-based medicine (EBM) has advocated using randomized clinical trials with large populations to evaluate treatment effects. However, due to large variations across patients, the results are likely not to apply to an individual patient. We suggest that a complementary, systems biology approach using computational modeling may help tackle biological complexity in order to improve ultimate patient care. The purpose of the article is: 1) to review the pros and cons of EBM, and 2) to discuss the alternative systems biology method and present its utility in clinical voice research. Study Design: Tutorial. Methods: Literature review and discussion. Results: We propose that translational systems biology can address many of the limitations of EBM pertinent to voice and other health care domains, and thus complement current health research models. In particular, recent work using mathematical modeling suggests that systems biology has the ability to quantify the highly complex biologic processes underlying voice pathophysiology. Recent data support the premise that this approach can be applied specifically in the case of phonotrauma and surgically induced vocal fold trauma, and may have particular power to address personalized medicine. Conclusions: We propose that evidence around vocal health and disease be expanded beyond a population-based method to consider more fully issues of complexity and systems interactions, especially in implementing personalized medicine in voice care and beyond. PMID:20025041

  3. A theoretical study of F0-F1 interaction with application to resonant speaking and singing voice.

    PubMed

    Titze, Ingo R

    2004-09-01

    An interactive source-filter system, consisting of a three-mass body-cover model of the vocal folds and a wave reflection model of the vocal tract, was used to test the dependence of vocal fold vibration on the vocal tract. The degree of interaction is governed by the epilarynx tube, which raises the vocal tract impedance to match the impedance of the glottis. The key component of the impedance is inertive reactance. Whenever there is inertive reactance, the vocal tract assists the vocal folds in vibration. The amplitude of vibration and the glottal flow can more than double, and the oral radiated power can increase up to 10 dB. As F0 approaches F1, the first formant frequency, the interactive source-filter system loses its advantage (because inertive reactance changes to compliant reactance) and the noninteractive system produces greater vocal output. Thus, from a voice training and control standpoint, there may be reasons to operate the system in either interactive or noninteractive mode. The harmonics 2F0 and 3F0 can also benefit from being positioned slightly below F1.
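    The inertive-to-compliant crossover at F1 described in the abstract can be stated compactly for the simplest case. For a lossless uniform tube closed at the glottis and open at the lips, a standard textbook approximation (not taken from this paper, which uses a full wave reflection model) gives the input reactance seen by the glottis:

```latex
% X(f): input reactance of a uniform tube of length L, closed at the
% glottis end and open at the lips; c = speed of sound, \rho = air
% density, A = cross-sectional area.
\[
  X(f) \;=\; \frac{\rho c}{A}\,\tan\!\left(\frac{2\pi f L}{c}\right),
  \qquad
  X(f) > 0 \ \text{(inertive)} \ \text{for} \ f < F_1 = \frac{c}{4L}.
\]
```

    Below the quarter-wave resonance F1 the reactance is positive (inertive) and the tract assists vocal fold vibration; just above F1 the tangent changes sign and the reactance becomes compliant, which is the loss of advantage the abstract describes as F0 crosses F1.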

  4. The GuideView System for Interactive, Structured, Multi-modal Delivery of Clinical Guidelines

    NASA Technical Reports Server (NTRS)

    Iyengar, Sriram; Florez-Arango, Jose; Garcia, Carlos Andres

    2009-01-01

    GuideView is a computerized clinical guideline system which delivers clinical guidelines in an easy-to-understand and easy-to-use package. It may potentially enhance the quality of medical care or allow non-medical personnel to provide acceptable levels of care in situations where physicians or nurses may not be available. Such a system can be very valuable during space flight missions when a physician is not readily available, or perhaps the designated medical personnel are unable to provide care. Complex clinical guidelines are broken into simple steps. At each step clinical information is presented in multiple modes, including voice, audio, text, pictures, and video. Users can respond via mouse clicks or via voice navigation. GuideView can also interact with medical sensors using wireless or wired connections. The system's interface is illustrated and the results of a usability study are presented.

  5. English Voicing in Dimensional Theory*

    PubMed Central

    Iverson, Gregory K.; Ahn, Sang-Cheol

    2007-01-01

    Assuming a framework of privative features, this paper interprets two apparently disparate phenomena in English phonology as structurally related: the lexically specific voicing of fricatives in plural nouns like wives or thieves and the prosodically governed “flapping” of medial /t/ (and /d/) in North American varieties, which we claim is itself not a rule per se, but rather a consequence of the laryngeal weakening of fortis /t/ in interaction with speech-rate determined segmental abbreviation. Taking as our point of departure the Dimensional Theory of laryngeal representation developed by Avery & Idsardi (2001), along with their assumption that English marks voiceless obstruents but not voiced ones (Iverson & Salmons 1995), we find that an unexpected connection between fricative voicing and coronal flapping emerges from the interplay of familiar phonemic and phonetic factors in the phonological system. PMID:18496590

  6. Voice to Voice: Developing In-Service Teachers' Personal, Collaborative, and Public Voices.

    ERIC Educational Resources Information Center

    Thurber, Frances; Zimmerman, Enid

    1997-01-01

    Describes a model for inservice education that begins with an interchange of teachers' voices with those of the students in an interactive dialog. The exchange allows them to develop their private voices through self-reflection and validation of their own experiences. (JOW)

  7. A Multimodal Emotion Detection System during Human-Robot Interaction

    PubMed Central

    Alonso-Martín, Fernando; Malfaz, María; Sequeira, João; Gorostiza, Javier F.; Salichs, Miguel A.

    2013-01-01

    In this paper, a multimodal user-emotion detection system for social robots is presented. This system is intended to be used during human–robot interaction, and it is integrated as part of the overall interaction system of the robot: the Robotics Dialog System (RDS). Two modes are used to detect emotions: voice and face expression analysis. In order to analyze the voice of the user, a new component has been developed: Gender and Emotion Voice Analysis (GEVA), which is written in the ChucK language. For emotion detection in facial expressions, the system Gender and Emotion Facial Analysis (GEFA) has also been developed. The latter system integrates two third-party solutions: Sophisticated High-speed Object Recognition Engine (SHORE) and Computer Expression Recognition Toolbox (CERT). Once these new components (GEVA and GEFA) give their results, a decision rule is applied in order to combine the information given by both of them. The result of this rule, the detected emotion, is integrated into the dialog system through communicative acts. Hence, each communicative act gives, among other things, the detected emotion of the user to the RDS so it can adapt its strategy in order to achieve a greater degree of satisfaction during the human–robot dialog. Each of the new components, GEVA and GEFA, can also be used individually. Moreover, they are integrated with the robotic control platform ROS (Robot Operating System). Several experiments with real users were performed to determine the accuracy of each component and to set the final decision rule. The results obtained from applying this decision rule in these experiments show a high success rate in automatic user emotion recognition, improving the results given by the two information channels (audio and visual) separately. PMID:24240598
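    The abstract mentions a decision rule that combines the voice channel (GEVA) and the face channel (GEFA) but does not publish it; the confidence-weighted vote below is only one plausible form such a rule could take, offered as an illustrative assumption.

```python
# Hypothetical fusion rule for two emotion channels. Each channel reports
# (emotion_label, confidence in [0, 1]). If the channels agree, the shared
# label wins; otherwise the more confident channel wins.

def fuse_emotion(voice, face):
    """Combine voice- and face-channel estimates into one detected emotion."""
    v_label, v_conf = voice
    f_label, f_conf = face
    if v_label == f_label:
        return v_label
    # Disagreement: defer to the channel that is more certain.
    return v_label if v_conf >= f_conf else f_label
```

    A rule of this shape explains the abstract's finding that fusion can outperform either channel alone: agreement reinforces a label, while disagreements are resolved by per-channel confidence rather than by always trusting one modality.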

  8. The "VoiceForum" Platform for Spoken Interaction

    ERIC Educational Resources Information Center

    Fynn, John; Wigham, Chiara R.

    2011-01-01

    Showcased in the courseware exhibition, "VoiceForum" is a web-based software platform for asynchronous learner interaction in threaded discussions using voice and text. A dedicated space is provided for the tutor who can give feedback on a posted message and dialogue with the participants at a separate level from the main interactional…

  9. Voices on Voice: Perspectives, Definitions, Inquiry.

    ERIC Educational Resources Information Center

    Yancey, Kathleen Blake, Ed.

    This collection of essays approaches "voice" as a means of expression that lives in the interactions of writers, readers, and language, and examines the conceptualizations of voice within the oral rhetorical and expressionist traditions, and the notion of voice as both a singular and plural phenomenon. An explanatory introduction by the…

  10. The effect of voice quality and competing speakers in a passage comprehension task: performance in relation to cognitive functioning in children with normal hearing.

    PubMed

    von Lochow, Heike; Lyberg-Åhlander, Viveka; Sahlén, Birgitta; Kastberg, Tobias; Brännström, K Jonas

    2018-04-01

    This study explores the effect of voice quality and of competing speakers on children's performance in a passage comprehension task. Furthermore, it explores the interaction between passage comprehension and cognitive functioning. Forty-nine children (27 girls and 22 boys) with normal hearing (aged 7-12 years) participated. Passage comprehension was tested in six different listening conditions: a typical (non-dysphonic) voice in quiet, a typical voice with one competing speaker, a typical voice with four competing speakers, a dysphonic voice in quiet, a dysphonic voice with one competing speaker, and a dysphonic voice with four competing speakers. The children's working memory capacity and executive functioning were also assessed. The findings indicate no direct effect of voice quality on the children's performance, but a significant effect of background listening condition. Interaction effects were seen between voice quality, background listening condition, and executive functioning. The children's susceptibility to the effect of the dysphonic voice and the background listening conditions is related to the individual's executive functions. The findings have several implications for the design of interventions in language learning environments such as classrooms.

  11. Virtual interface environment

    NASA Technical Reports Server (NTRS)

    Fisher, Scott S.

    1986-01-01

    A head-mounted, wide-angle, stereoscopic display system controlled by operator position, voice and gesture has been developed for use as a multipurpose interface environment. The system provides a multisensory, interactive display environment in which a user can virtually explore a 360-degree synthesized or remotely sensed environment and can viscerally interact with its components. Primary applications of the system are in telerobotics, management of large-scale integrated information systems, and human factors research. System configuration, application scenarios, and research directions are described.

  12. Crossmodal interactions during non-linguistic auditory processing in cochlear-implanted deaf patients.

    PubMed

    Barone, Pascal; Chambaudie, Laure; Strelnikov, Kuzma; Fraysse, Bernard; Marx, Mathieu; Belin, Pascal; Deguine, Olivier

    2016-10-01

    Due to signal distortion, speech comprehension in cochlear-implanted (CI) patients relies strongly on visual information, a compensatory strategy supported by important cortical crossmodal reorganisations. Though crossmodal interactions are evident for speech processing, it is unclear whether a visual influence is observed in CI patients during non-linguistic visual-auditory processing, such as face-voice interactions, which are important in social communication. We analyse and compare visual-auditory interactions in CI patients and normal-hearing subjects (NHS) at equivalent auditory performance levels. Proficient CI patients and NHS performed a voice-gender categorisation in the visual-auditory modality from a morphing-generated voice continuum between male and female speakers, while ignoring the presentation of a male or female visual face. Our data show that during the face-voice interaction, CI deaf patients are strongly influenced by visual information when performing an auditory gender categorisation task, in spite of maximum recovery of auditory speech. No such effect is observed in NHS, even in situations of CI simulation. Our hypothesis is that the functional crossmodal reorganisation that occurs in deafness could influence nonverbal processing, such as face-voice interaction; this is important for patient internal supramodal representation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Interactions between voice clinics and singing teachers: a report on the British Voice Association questionnaire to voice clinics in the UK.

    PubMed

    Davies, J; Anderson, S; Huchison, L; Stewart, G

    2007-01-01

    Singers with vocal problems are among patients who present at multidisciplinary voice clinics led by Ear Nose and Throat consultants and laryngologists or speech and language therapists. However, the development and care of the singing voice are also important responsibilities of singing teachers. We report here on the current extent and nature of interactions between voice clinics and singing teachers, based on data from a recent survey undertaken on behalf of the British Voice Association. A questionnaire was sent to all 103 voice clinics at National Health Service (NHS) hospitals in the UK. Responses were received and analysed from 42 currently active clinics. Eight (19%) clinics reported having a singing teacher as an active member of the team. They were all satisfied with the singing teacher's knowledge and expertise, which had been acquired by several different means. Of 32 clinics without a singing teacher regularly associated with the team, funding and difficulty of finding an appropriate singing voice expert (81% and 50%, respectively) were among the main reasons for their absence. There was an expressed requirement for more interaction between voice clinics and singing teachers, and 86% replied that they would find it useful to have a list of singing teachers in their area. On the matter of gaining expertise and training, 74% of the clinics replying would enable singing teachers to observe clinic sessions for experience and 21% were willing to assist in training them for clinic-associated work.

  14. Effects of the Interaction of Caffeine and Water on Voice Performance: A Pilot Study

    ERIC Educational Resources Information Center

    Franca, Maria Claudia; Simpson, Kenneth O.

    2013-01-01

    The objective of this "pilot" investigation was to study the effects of the interaction of caffeine and water intake on voice as evidenced by acoustic and aerodynamic measures, to determine whether ingestion of 200 mg of caffeine and various levels of water intake have an impact on voice. The participants were 48 females ranging in age…

  15. The Lincoln Training System: A Summary Report.

    ERIC Educational Resources Information Center

    Butman, Robert C.; Frick, Frederick C.

    The current status of the Lincoln Training System (LTS) is reported. This document describes LTS as a computer supported microfiche system which: 1) provides random access to voice quality audio and to graphics; 2) supports student-controlled interactive processes; and 3) functions in a variety of environments. The report offers a detailed…

  16. Virtual workstation - A multimodal, stereoscopic display environment

    NASA Astrophysics Data System (ADS)

    Fisher, S. S.; McGreevy, M.; Humphries, J.; Robinett, W.

    1987-01-01

    A head-mounted, wide-angle, stereoscopic display system controlled by operator position, voice and gesture has been developed for use in a multipurpose interface environment. The system provides a multisensory, interactive display environment in which a user can virtually explore a 360-degree synthesized or remotely sensed environment and can viscerally interact with its components. Primary applications of the system are in telerobotics, management of large-scale integrated information systems, and human factors research. System configuration, application scenarios, and research directions are described.

  17. Dialogism and Carnival in Virginia Woolf's "To the Lighthouse": A Bakhtinian Reading

    ERIC Educational Resources Information Center

    Faizi, Hamed; Taghizadeh, Ali

    2015-01-01

    Mikhail Bakhtin's dialogism in a novel promises the creation of a domain of interactive context for different voices which results in a polyphonic discourse. Instead of trying to suppress each other, the voices of the novel interact with the other voices in a way that none of them tries to silence the others, and each one has the opportunity to…

  18. Interference effects of vocalization on dual task performance

    NASA Astrophysics Data System (ADS)

    Owens, J. M.; Goodman, L. S.; Pianka, M. J.

    1984-09-01

    Voice command and control systems have been proposed as a potential means of off-loading the typically overburdened visual information processing system. However, prior to introducing novel human-machine interfacing technologies in high workload environments, consideration must be given to the integration of the new technologies within existing task structures to ensure that no new sources of workload or interference are systematically introduced. This study examined the use of voice interactive systems technology in the joint performance of two cognitive information processing tasks requiring continuous memory and choice reaction, wherein a basis for intertask interference might be expected. Stimuli for the continuous memory task were presented aurally, and either voice or keyboard responding was required in the choice reaction task. Performance was significantly degraded in each task when voice responding was required in the choice reaction time task. Performance degradation was evident in higher error scores for both the choice reaction and continuous memory tasks. Performance decrements observed under conditions of high intertask stimulus similarity were not statistically significant. The results signal the need to consider further the task requirements for verbal short-term memory when applying speech technology in multitask environments.

  19. Three input concepts for flight crew interaction with information presented on a large-screen electronic cockpit display

    NASA Technical Reports Server (NTRS)

    Jones, Denise R.

    1990-01-01

    A piloted simulation study was conducted comparing three different input methods for interfacing to a large-screen, multiwindow, whole-flight-deck display for management of transport aircraft systems. The thumball concept utilized a miniature trackball embedded in a conventional side-arm controller. The touch screen concept provided data entry through a capacitive touch screen. The voice concept utilized a speech recognition system with input through a head-worn microphone. No single input concept emerged as the most desirable method of interacting with the display. Subjective results, however, indicate that the voice concept was the most preferred method of data entry and had the most potential for future applications. The objective results indicate that, overall, the touch screen concept was the most effective input method. There were also significant differences in the time required to perform specific tasks across the input concepts employed, with each concept providing better performance on particular tasks. These results suggest that a system combining all three input concepts might provide the most effective method of interaction.

  20. A Dynamic Dialog System Using Semantic Web Technologies

    ERIC Educational Resources Information Center

    Ababneh, Mohammad

    2014-01-01

    A dialog system or a conversational agent provides a means for a human to interact with a computer system. Dialog systems use text, voice and other means to carry out conversations with humans in order to achieve some objective. Most dialog systems are created with specific objectives in mind and consist of preprogrammed conversations. The primary…
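A preprogrammed dialog of the kind described above can be pictured as a finite-state machine: each state carries a prompt and a transition table keyed on the user's reply. The sketch below is a hypothetical illustration (the states, prompts, and the `step` helper are invented, not taken from the dissertation):

```python
# A tiny preprogrammed dialog as a finite-state machine.
# Each state has a prompt and a map from user input to the next state;
# "*" is a wildcard transition. All names here are illustrative.

DIALOG = {
    "greet":   {"prompt": "Hello! Do you want the weather? (yes/no)",
                "next": {"yes": "weather", "no": "goodbye"}},
    "weather": {"prompt": "Which city?",
                "next": {"*": "goodbye"}},
    "goodbye": {"prompt": "Goodbye!", "next": {}},
}

def step(state: str, user_input: str) -> str:
    """Return the next dialog state for a user utterance."""
    nxt = DIALOG[state]["next"]
    # Fall back to the wildcard transition, or stay put if none exists.
    return nxt.get(user_input, nxt.get("*", state))
```

Voice or text front-ends only change how `user_input` is produced; the conversational skeleton stays the same, which is exactly why most such systems feel scripted.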

  1. Individual versus Interactive Task-Based Performance through Voice-Based Computer-Mediated Communication

    ERIC Educational Resources Information Center

    Granena, Gisela

    2016-01-01

    Interaction is a necessary condition for second language (L2) learning (Long, 1980, 1996). Research in computer-mediated communication has shown that interaction opportunities make learners pay attention to form in a variety of ways that promote L2 learning. This research has mostly investigated text-based rather than voice-based interaction. The…

  2. Digital Systems Validation Handbook. Volume 2. Chapter 19. Pilot - Vehicle Interface

    DTIC Science & Technology

    1993-11-01

    checklists, and other status messages. Voice interactive systems are defined as "the interface between a cooperative human and a machine, which involves the…" …Pilot-Vehicle Interface; 5.6.1 Crew Interaction and the Cockpit; 5.6.2 Crew Resource Management and Safety; 5.6.3 Pilot and Crew Training… …systems was a "stand-alone" component performing its intended function. Systems and their cockpit interfaces were added as technological advances were…

  3. A self-teaching image processing and voice-recognition-based, intelligent and interactive system to educate visually impaired children

    NASA Astrophysics Data System (ADS)

    Iqbal, Asim; Farooq, Umar; Mahmood, Hassan; Asad, Muhammad Usman; Khan, Akrama; Atiq, Hafiz Muhammad

    2010-02-01

    A self-teaching, image-processing and voice-recognition-based system is developed to educate visually impaired children, chiefly in their primary education. The system comprises a computer, a vision camera, an ear speaker and a microphone. The camera, attached to the computer, is mounted on the ceiling opposite the desk on which the book is placed (at the required angle). Sample images and voices, in the form of instructions and commands for English and Urdu alphabets, numeric digits, operators and shapes, are stored in the database in advance. A blind child first reads an embossed character (object) with the fingers, then speaks the answer (the name of the character, shape, etc.) into the microphone. On receiving the child's voice command through the microphone, an image is taken by the camera and processed by a MATLAB® program developed with the Image Acquisition and Image Processing toolboxes, which generates a response or the required set of instructions for the child via the ear speaker, resulting in the self-education of a visually impaired child. The speech recognition program, which records and processes the child's commands, is also developed in MATLAB® with the Data Acquisition and Signal Processing toolboxes.
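The capture-match-respond loop described above can be sketched with a trivial nearest-template matcher standing in for the MATLAB® toolboxes. Everything here is a hypothetical illustration: the `templates` data, the flattened-pixel representation, and the `match_character` helper are invented, and real recognition would use proper image features:

```python
# Hedged sketch: classify a captured character image by comparing it
# against stored sample templates, as the system above does after a
# voice command triggers image capture. Images are toy flattened
# binary pixel lists; the matcher is a simple sum-of-squares distance.

def match_character(image: list[int], templates: dict[str, list[int]]) -> str:
    """Return the label of the stored template closest to the image."""
    def dist(a: list[int], b: list[int]) -> int:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(templates, key=lambda label: dist(image, templates[label]))

# Toy database of pre-stored sample "images" (illustrative only).
templates = {"A": [0, 1, 1, 0], "B": [1, 1, 0, 1]}
```

The matched label would then be checked against the child's spoken answer and the verdict synthesized to the ear speaker.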

  4. Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study

    PubMed Central

    Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L

    2017-01-01

    Background: Computer-delivered interventions have been shown to be effective in reducing alcohol consumption in heavy drinking college students. However, these computer-delivered interventions rely on mouse, keyboard, or touchscreen responses for interactions between the users and the computer-delivered intervention. The principles of motivational interviewing suggest that in-person interventions may be effective, in part, because they encourage individuals to think through and speak aloud their motivations for changing a health behavior, which current computer-delivered interventions do not allow. Objective: The objective of this study was to take the initial steps toward development of a voice-based computer-delivered intervention that can ask open-ended questions and respond appropriately to users’ verbal responses, more closely mirroring a human-delivered motivational intervention. Methods: We developed (1) a voice-based computer-delivered intervention that was run by a human controller and that allowed participants to speak their responses to scripted prompts delivered by speech generation software and (2) a text-based computer-delivered intervention that relied on the mouse, keyboard, and computer screen for all interactions. We randomized 60 heavy drinking college students to interact with the voice-based computer-delivered intervention and 30 to interact with the text-based computer-delivered intervention and compared their ratings of the systems as well as their motivation to change drinking and their drinking behavior at 1-month follow-up. Results: Participants reported that the voice-based computer-delivered intervention engaged positively with them in the session and delivered content in a manner consistent with motivational interviewing principles.
At 1-month follow-up, participants in the voice-based computer-delivered intervention condition reported significant decreases in quantity, frequency, and problems associated with drinking, and increased perceived importance of changing drinking behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to the voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incidence rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking. Conclusions: Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of the voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form. PMID:28659259
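The incidence rate ratio (IRR) reported above compares event rates between two groups. A minimal sketch of the arithmetic, using a Wald confidence interval on the log scale; the counts below are hypothetical (the study's estimate of 0.60 came from a regression model, not from this simple two-by-two calculation):

```python
import math

def incidence_rate_ratio(events_a: int, time_a: float,
                         events_b: int, time_b: float):
    """Rate ratio of group A vs group B with a 95% Wald CI.

    The standard error of log(IRR) for raw counts is
    sqrt(1/events_a + 1/events_b).
    """
    irr = (events_a / time_a) / (events_b / time_b)
    se = math.sqrt(1 / events_a + 1 / events_b)
    lo = math.exp(math.log(irr) - 1.96 * se)
    hi = math.exp(math.log(irr) + 1.96 * se)
    return irr, lo, hi

# Hypothetical example: 30 problems per 100 person-months vs 50 per 100.
irr, lo, hi = incidence_rate_ratio(30, 100, 50, 100)
```

An IRR below 1 with a CI excluding 1, as in the study, indicates fewer events in the first group.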

  5. 31 CFR 901.4 - Reporting debts.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... and Urban Development's Credit Alert Interactive Voice Response System (CAIVRS). For information about the CAIVRS program, agencies should contact the Director of Information Resources Management Policy and Management Division, Office of Information Technology, Department of Housing and Urban Development...

  6. 31 CFR 901.4 - Reporting debts.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... and Urban Development's Credit Alert Interactive Voice Response System (CAIVRS). For information about the CAIVRS program, agencies should contact the Director of Information Resources Management Policy and Management Division, Office of Information Technology, Department of Housing and Urban Development...

  7. Virtual interface environment workstations

    NASA Technical Reports Server (NTRS)

    Fisher, S. S.; Wenzel, E. M.; Coler, C.; Mcgreevy, M. W.

    1988-01-01

    A head-mounted, wide-angle, stereoscopic display system controlled by operator position, voice and gesture has been developed at NASA's Ames Research Center for use as a multipurpose interface environment. This Virtual Interface Environment Workstation (VIEW) system provides a multisensory, interactive display environment in which a user can virtually explore a 360-degree synthesized or remotely sensed environment and can viscerally interact with its components. Primary applications of the system are in telerobotics, management of large-scale integrated information systems, and human factors research. System configuration, research scenarios, and research directions are described.

  8. AdaRTE: adaptable dialogue architecture and runtime engine. A new architecture for health-care dialogue systems.

    PubMed

    Rojas-Barahona, L M; Giorgino, T

    2007-01-01

    Spoken dialogue systems have been increasingly employed to provide ubiquitous automated access via telephone to information and services for the non-Internet-connected public, and in the health care context they have been applied successfully. Nevertheless, speech-based technology is not easy to implement because it requires a considerable development investment. The advent of VoiceXML for voice applications reduced the proliferation of incompatible dialogue interpreters, but introduced new complexity. In response to these issues, we designed an architecture for dialogue representation and interpretation, AdaRTE, which allows developers to lay out dialogue interactions through a high-level formalism offering both declarative and procedural features. AdaRTE's aim is to provide a foundation for deploying complex and adaptable dialogues while allowing experimentation with, and incremental adoption of, innovative speech technologies. It provides the dynamic behavior of Augmented Transition Networks and enables the generation of different backend formats such as VoiceXML. It is especially targeted at the health care context, where a framework for easy dialogue deployment could lower the barrier to a more widespread adoption of dialogue systems.
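The backend generation AdaRTE performs can be sketched as rendering a high-level dialog node into a VoiceXML `<field>`. The node schema and the `node_to_vxml` helper below are invented for illustration; AdaRTE's actual formalism (Augmented Transition Networks with declarative and procedural features) is far richer than this flat mapping:

```python
# Hedged sketch: render one abstract dialog node as a VoiceXML field.
# <field>, <prompt>, and <option> are standard VoiceXML 2.0 elements;
# the input node structure is our own invention.

def node_to_vxml(name: str, prompt: str, options: list[str]) -> str:
    """Return a VoiceXML <field> for a prompt with enumerated options."""
    opts = "".join(f"<option>{o}</option>" for o in options)
    return (f'<field name="{name}">'
            f"<prompt>{prompt}</prompt>{opts}</field>")

vxml = node_to_vxml("dose", "Did you take your medication today?",
                    ["yes", "no"])
```

Keeping the dialog description abstract and generating the backend format late is what lets one source dialog target VoiceXML today and a different interpreter tomorrow.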

  9. Sounds of Education: Teacher Role and Use of Voice in Interactions with Young Children

    ERIC Educational Resources Information Center

    Koch, Anette Boye

    2017-01-01

    Voice is a basic tool in communication between adults. However, in early educational settings, adult professionals use their voices in different paralinguistic ways when they communicate with children. A teacher's use of voice is important because it serves to communicate attitudes and emotions in ways that are often ignored in early childhood…

  10. 17 Ways to Say Yes: Toward Nuanced Tone of Voice in AAC and Speech Technology

    PubMed Central

    Pullin, Graham; Hennig, Shannon

    2015-01-01

    Abstract People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, however, the issue of tone of voice remains all but absent from AAC research and development. In this paper, we describe three interdisciplinary projects, past, present and future: The critical design collection Six Speaking Chairs has provoked deeper discussion and inspired a social model of tone of voice; the speculative concept Speech Hedge illustrates challenges and opportunities in designing more expressive user interfaces; the pilot project Tonetable could enable participatory research and seed a research network around tone of voice. We speculate that more radical interactions might expand frontiers of AAC and disrupt speech technology as a whole. PMID:25965913

  11. Contributions of speech science to the technology of man-machine voice interactions

    NASA Technical Reports Server (NTRS)

    Lea, Wayne A.

    1977-01-01

    Research in speech understanding was reviewed. Plans which include prosodics research, phonological rules for speech understanding systems, and continued interdisciplinary phonetics research are discussed. Improved acoustic phonetic analysis capabilities in speech recognizers are suggested.

  12. Double Voicing and Personhood in Collaborative Life Writing about Autism: the Transformative Narrative of Carly's Voice.

    PubMed

    Orlando, Monica

    2018-06-01

    Collaborative memoirs by co-writers with and without autism can enable the productive interaction of the voices of the writers in ways that can empower rather than exploit the disabled subject. Carly's Voice, co-written by Arthur Fleischmann and his autistic daughter Carly, demonstrates the capacity for such life narratives to facilitate the relational interaction between writers in the negotiation of understandings of disability. Though the text begins by focusing on the limitations of life with autism, it develops into a collaboration which helps both writers move toward new ways of understanding disability and their own and one another's life stories.

  13. To Hybrid or Not to Hybrid, that Is the Question! Incorporating VoiceThread Technology into a Traditional Communication Course

    ERIC Educational Resources Information Center

    Pecot-Hebert, Lisa

    2012-01-01

    A hybrid course, which combines the face-to-face interactions of a traditional course with the flexibility of an online course, provides an alternative option for educating students in a new media environment. While educators often interact with their students through various electronic learning management systems that are set up within the…

  14. Virtual Integrated Planning and Execution Resource System (VIPERS): The High Ground of 2025

    DTIC Science & Technology

    1996-04-01

    …earth to one meter will allow modeling of enemy actions to a degree only dreamed of before. For example, before starting an air campaign, an… …that is facilitated by the system. Interaction may take the form of the written word, voice, video conferencing, or mental telepathy. Control speaks…

  15. PERCEPTUAL SYSTEMS IN READING--THE PREDICTION OF A TEMPORAL EYE-VOICE SPAN CONSTANT. PAPER.

    ERIC Educational Resources Information Center

    GEYER, JOHN JACOB

    A study was conducted to delineate how perception occurs during oral reading. From an analysis of classical and modern research, a heuristic model was constructed which delineated the directly interacting systems postulated as functioning during oral reading. The model as outlined was differentiated logically into three major processing…

  16. Nurses using futuristic technology in today's healthcare setting.

    PubMed

    Wolf, Debra M; Kapadia, Amar; Kintzel, Jessie; Anton, Bonnie B

    2009-01-01

    Human-computer interaction (HCI) here equates to nurses using voice-assisted technology within a clinical setting to document patient care in real time, retrieve patient information from care plans, and complete routine tasks. This is a reality already utilized by clinicians in acute and long-term care settings. Voice-assisted documentation provides hands- and eyes-free, accurate documentation while enabling effective communication and task management. The speech technology increases the accuracy of documentation while interfacing directly with the electronic health record (EHR). Using technology consisting of a lightweight headset and a fist-sized wireless computer, verbal responses to easy-to-follow cues are converted into a database, allowing staff to obtain individualized care status reports on demand. To further assist staff in their daily process, this innovative technology also allows them to send and receive pages as needed. This paper discusses how this leading-edge, award-winning technology is being integrated within the United States. Collaborative efforts between clinicians and analysts are discussed, reflecting the interactive design and build functionality. Features such as the system's voice responses and directed cues are shared, along with how easily data can be documented, viewed and retrieved. Outcome data are presented on how the technology impacted the organization's quality outcomes, financial reimbursement, and employee satisfaction.

  17. Mobile phone-based interactive voice response as a tool for improving access to healthcare in remote areas in Ghana - an evaluation of user experiences.

    PubMed

    Brinkel, J; May, J; Krumkamp, R; Lamshöft, M; Kreuels, B; Owusu-Dabo, E; Mohammed, A; Bonacic Marinovic, A; Dako-Gyeke, P; Krämer, A; Fobil, J N

    2017-05-01

    To investigate and determine the factors that enhanced, or constituted barriers to, the acceptance of an mHealth system piloted in the Asante-Akim North District of Ghana to support the healthcare of children. Four semi-structured focus group discussions were conducted with a total of 37 mothers. Participants were selected from a study population of mothers who subscribed to a pilot mHealth system which used interactive voice response (IVR) for its operations. Data were evaluated using qualitative content analysis methods. In addition, a short quantitative questionnaire assessed the system's usability with the System Usability Scale (SUS). Results revealed 10 categories of factors that facilitated user acceptance of the IVR system, including quality-of-care experience, health education and the empowerment of women. The eight categories of factors identified as barriers to user acceptance included the lack of human interaction, lack of updates and training on the electronic advice provided, and lack of social integration of the system into the community. The usability of the system (SUS median: 79.3; range: 65-97.5) was rated acceptable. The principles of the tested mHealth system could be of interest during infectious disease outbreaks, such as Ebola or Lassa fever, when there might be a special need for disease-specific health information within populations. © 2017 John Wiley & Sons Ltd.
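The SUS figure reported above (median 79.3 on a 0-100 scale) follows the standard System Usability Scale scoring rule: each of the ten items is rated 1-5, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5. A minimal sketch:

```python
# Standard SUS scoring (Brooke's ten-item questionnaire).
# responses: ten Likert ratings from 1 to 5, item 1 first.

def sus_score(responses: list[int]) -> float:
    """Return the 0-100 SUS score for one respondent."""
    assert len(responses) == 10, "SUS has exactly ten items"
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i=0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5
```

By convention, scores around 68 are average usability, so the median of 79.3 reported above lands in the "acceptable" band.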

  18. Mechanics of human voice production and control

    PubMed Central

    Zhang, Zhaoyan

    2016-01-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed. PMID:27794319

  19. Mechanics of human voice production and control.

    PubMed

    Zhang, Zhaoyan

    2016-10-01

    As the primary means of communication, voice plays an important role in daily life. Voice also conveys personal information such as social status, personal traits, and the emotional state of the speaker. Mechanically, voice production involves complex fluid-structure interaction within the glottis and its control by laryngeal muscle activation. An important goal of voice research is to establish a causal theory linking voice physiology and biomechanics to how speakers use and control voice to communicate meaning and personal information. Establishing such a causal theory has important implications for clinical voice management, voice training, and many speech technology applications. This paper provides a review of voice physiology and biomechanics, the physics of vocal fold vibration and sound production, and laryngeal muscular control of the fundamental frequency of voice, vocal intensity, and voice quality. Current efforts to develop mechanical and computational models of voice production are also critically reviewed. Finally, issues and future challenges in developing a causal theory of voice production and perception are discussed.

  20. Prototype Software for Future Spaceflight Tested at Mars Desert Research Station

    NASA Technical Reports Server (NTRS)

    Clancey, William J.; Sierhuis, Maarten; Alena, Rick; Dowding, John; Garry, Brent; Scott, Mike; Tompkins, Paul; vanHoof, Ron; Verma, Vandi

    2006-01-01

    NASA scientists in MDRS Crew 49 (April 23-May 7, 2006) field tested and significantly extended a prototype monitoring and advising system that integrates power system telemetry with a voice commanding interface. A distributed, wireless network of functionally specialized agents interacted with the crew to provide alerts (e.g., impending shut-down of the inverter due to low battery voltage), access and interpret historical data, and display troubleshooting procedures. In practical application over two weeks, the system generated speech over loudspeakers and headsets to alert the crew about the need to investigate power system problems. The prototype system adapts the Brahms/Mobile Agents toolkit to receive data from the OneMeter (Brand Electronics) electric metering system deployed by Crew 47. A computer on the upper deck was connected to loudspeakers; four others were paired with wireless (Bluetooth) headsets that enabled crew members to interact with their personal agents from anywhere in the hab. Voice commands and inquiries included: 1. What is the {battery | generator} {volts | amps | volts and amps}? 2. What is the status of the {generator | inverter | battery | solar panel}? 3. What is the hab{itat} {power usage | volts | voltage | amps | volts and amps}? 4. What was the average hab{itat} {amps | volts | voltage} since <#> {AM | PM}? 5. When did the {generator | batteries} change status? 6. Tell {me | everyone} when{ever} the generator goes offline. 7. Tell {me | everyone} when the hab{itat} {amps | volts | voltage} {exceeds | drops below} <#>. 8. {Send | Take | Record} {a} voice note {for | to}…
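The `{a | b}` alternation notation used in the command list above compactly describes a family of concrete utterances a grammar-based recognizer would accept. A hedged sketch of the expansion (the logic is ours, not the Brahms/Mobile Agents implementation, and it ignores the optional-word subtleties of the real grammar):

```python
import itertools
import re

def expand(template: str) -> list[str]:
    """Expand every '{x | y | ...}' group into all concrete commands."""
    # Alternatives inside each brace group, stripped of padding spaces.
    groups = [[alt.strip() for alt in g.split("|")]
              for g in re.findall(r"\{([^{}]*)\}", template)]
    # Replace each group with a format placeholder, then fill in every
    # combination and collapse the leftover double spaces.
    skeleton = re.sub(r"\{[^{}]*\}", "{}", template)
    return [re.sub(r"\s+", " ", skeleton.format(*combo)).strip()
            for combo in itertools.product(*groups)]
```

For example, `expand("What is the {battery | generator} {volts | amps}?")` yields the four concrete questions a crew member could speak.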

  1. The professional voice.

    PubMed

    Benninger, M S

    2011-02-01

    The human voice is not only the key to human communication but also serves as the primary musical instrument. Many professions rely on the voice, but the most noticeable and visible are singers. Care of the performing voice requires a thorough understanding of the interaction between the anatomy and physiology of voice production, along with an awareness of the interrelationships between vocalisation, acoustic science and non-vocal components of performance. This review gives an overview of the care and prevention of professional voice disorders by describing the unique and integrated anatomy and physiology of singing, the roles of development and training, and the importance of the voice care team.

  2. Initial Progress Toward Development of a Voice-Based Computer-Delivered Motivational Intervention for Heavy Drinking College Students: An Experimental Study.

    PubMed

    Kahler, Christopher W; Lechner, William J; MacGlashan, James; Wray, Tyler B; Littman, Michael L

    2017-06-28

    Computer-delivered interventions have been shown to be effective in reducing alcohol consumption in heavy drinking college students. However, these computer-delivered interventions rely on mouse, keyboard, or touchscreen responses for interactions between the users and the computer-delivered intervention. The principles of motivational interviewing suggest that in-person interventions may be effective, in part, because they encourage individuals to think through and speak aloud their motivations for changing a health behavior, which current computer-delivered interventions do not allow. The objective of this study was to take the initial steps toward development of a voice-based computer-delivered intervention that can ask open-ended questions and respond appropriately to users' verbal responses, more closely mirroring a human-delivered motivational intervention. We developed (1) a voice-based computer-delivered intervention that was run by a human controller and that allowed participants to speak their responses to scripted prompts delivered by speech generation software and (2) a text-based computer-delivered intervention that relied on the mouse, keyboard, and computer screen for all interactions. We randomized 60 heavy drinking college students to interact with the voice-based computer-delivered intervention and 30 to interact with the text-based computer-delivered intervention and compared their ratings of the systems as well as their motivation to change drinking and their drinking behavior at 1-month follow-up. Participants reported that the voice-based computer-delivered intervention engaged positively with them in the session and delivered content in a manner consistent with motivational interviewing principles. 
At 1-month follow-up, participants in the voice-based computer-delivered intervention condition reported significant decreases in quantity, frequency, and problems associated with drinking, and increased perceived importance of changing drinking behaviors. In comparison to the text-based computer-delivered intervention condition, those assigned to the voice-based computer-delivered intervention reported significantly fewer alcohol-related problems at the 1-month follow-up (incidence rate ratio 0.60, 95% CI 0.44-0.83, P=.002). The conditions did not differ significantly on perceived importance of changing drinking or on measures of drinking quantity and frequency of heavy drinking. Results indicate that it is feasible to construct a series of open-ended questions and a bank of responses and follow-up prompts that can be used in a future fully automated voice-based computer-delivered intervention that may mirror more closely human-delivered motivational interventions to reduce drinking. Such efforts will require using advanced speech recognition capabilities and machine-learning approaches to train a program to mirror the decisions made by human controllers in the voice-based computer-delivered intervention used in this study. In addition, future studies should examine enhancements that can increase the perceived warmth and empathy of the voice-based computer-delivered intervention, possibly through greater personalization, improvements in the speech generation software, and embodying the computer-delivered intervention in a physical form. ©Christopher W Kahler, William J Lechner, James MacGlashan, Tyler B Wray, Michael L Littman. Originally published in JMIR Mental Health (http://mental.jmir.org), 28.06.2017.

  3. A meta-analysis of in-vehicle and nomadic voice-recognition system interaction and driving performance.

    PubMed

    Simmons, Sarah M; Caird, Jeff K; Steel, Piers

    2017-09-01

    Driver distraction is a growing and pervasive issue that requires multiple solutions. Voice-recognition (V-R) systems may decrease the visual-manual (V-M) demands of a wide range of in-vehicle system and smartphone interactions. However, the degree that V-R systems integrated into vehicles or available in mobile phone applications affect driver distraction is incompletely understood. A comprehensive meta-analysis of experimental studies was conducted to address this knowledge gap. To meet study inclusion criteria, drivers had to interact with a V-R system while driving and doing everyday V-R tasks such as dialing, initiating a call, texting, emailing, destination entry or music selection. Coded dependent variables included detection, reaction time, lateral position, speed and headway. Comparisons of V-R systems with baseline driving and/or a V-M condition were also coded. Of 817 identified citations, 43 studies involving 2000 drivers and 183 effect sizes (r) were analyzed in the meta-analysis. Compared to baseline, driving while interacting with a V-R system is associated with increases in reaction time and lane positioning, and decreases in detection. When V-M systems were compared to V-R systems, drivers had slightly better performance with the latter system on reaction time, lane positioning and headway. Although V-R systems have some driving performance advantages over V-M systems, they have a distraction cost relative to driving without any system at all. The pattern of results indicates that V-R systems impose moderate distraction costs on driving. In addition, drivers minimally engage in compensatory performance adjustments such as reducing speed and increasing headway while using V-R systems. Implications of the results for theory, design guidelines and future research are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
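Pooling correlation effect sizes (r) across studies, as in the meta-analysis above, is commonly done by Fisher z-transforming each r, weighting by sample size, averaging, and back-transforming. The sketch below shows that standard textbook procedure with hypothetical studies; the paper's actual weighting and moderator analyses may differ:

```python
import math

def pooled_r(studies: list[tuple[float, int]]) -> float:
    """Weighted mean correlation from (r, n) pairs.

    Each r is converted to Fisher z = atanh(r), weighted by n - 3
    (the inverse variance of z), averaged, and converted back.
    """
    num = sum((n - 3) * math.atanh(r) for r, n in studies)
    den = sum(n - 3 for _, n in studies)
    return math.tanh(num / den)

# Hypothetical studies: (effect size r, sample size n).
overall = pooled_r([(0.2, 53), (0.4, 53), (0.3, 103)])
```

The pooled estimate always lies within the range of the contributing effects, with larger studies pulling it harder.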

  4. Multipath for Agricultural and Rural Information Services in China

    NASA Astrophysics Data System (ADS)

    Ge, Ningning; Zang, Zhiyuan; Gao, Lingwang; Shi, Qiang; Li, Jie; Xing, Chunlin; Shen, Zuorui

    The internet cannot yet provide adequate information services for farmers in rural regions of China, because those farmers can hardly access it. However, the wide coverage of the mobile signal, telephone line, and television networks offers a way to solve the problem. An integrated pest management (IPM) platform for northern fruit trees was developed on this integration technology, combining the internet, the mobile and fixed-line telephone networks, and the television network to provide IPM information services to farmers in rural regions in e-mail, telephone-voice, short message, voice mail, videoconference or other formats, delivered to users' telephones, cell phones, personal computers, personal digital assistants (PDAs), televisions, etc. The architecture and the functions of the system are introduced in the paper. The system can manage field monitoring data on agricultural pests; handle enquiries, providing the necessary information to farmers who access the system's interactive voice response (IVR) with experts online or offline; and issue early warnings about fruit tree pests when the monitoring data warrant it, in a variety of ways including SMS, fax, voice and intersystem e-mail. The system provides a platform and a new pattern for agricultural technology extension with a high coverage rate in rural regions, and it can help solve the 'last kilometer' problem of agricultural information service in China. The effectiveness of the system has been verified.
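The multi-channel dispatch described above (one pest warning routed to each subscriber over SMS, IVR voice call, fax or e-mail) can be sketched as a handler registry keyed by channel. The channel names, handler registry, and subscriber format below are hypothetical illustrations, not the platform's actual API:

```python
# Hedged sketch: route one warning to each subscriber's preferred
# channel. Handlers here just format delivery records; a real system
# would call SMS gateways, IVR dialers, etc.

HANDLERS = {
    "sms":   lambda to, msg: f"SMS to {to}: {msg}",
    "voice": lambda to, msg: f"IVR call to {to}: {msg}",
    "email": lambda to, msg: f"E-mail to {to}: {msg}",
}

def dispatch(warning: str, subscribers: list[tuple[str, str]]) -> list[str]:
    """subscribers: (address, channel) pairs; returns delivery records.

    Subscribers on unknown channels are silently skipped.
    """
    return [HANDLERS[channel](addr, warning)
            for addr, channel in subscribers
            if channel in HANDLERS]
```

Separating the warning content from the delivery channel is what lets the same early-warning analysis reach a television, a PDA, and a fixed-line telephone alike.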

  5. Integrating cues of social interest and voice pitch in men's preferences for women's voices.

    PubMed

    Jones, Benedict C; Feinberg, David R; Debruine, Lisa M; Little, Anthony C; Vukovic, Jovana

    2008-04-23

    Most previous studies of vocal attractiveness have focused on preferences for physical characteristics of voices such as pitch. Here we examine the content of vocalizations in interaction with such physical traits, finding that vocal cues of social interest modulate the strength of men's preferences for raised pitch in women's voices. Men showed stronger preferences for raised pitch when judging the voices of women who appeared interested in the listener than when judging the voices of women who appeared relatively disinterested in the listener. These findings show that voice preferences are not determined solely by physical properties of voices and that men integrate information about voice pitch and the degree of social interest expressed by women when forming voice preferences. Women's preferences for raised pitch in women's voices were not modulated by cues of social interest, suggesting that the integration of cues of social interest and voice pitch when men judge the attractiveness of women's voices may reflect adaptations that promote efficient allocation of men's mating effort.

  6. Measuring the intuitive response of users when faced with different interactive paradigms to control a gastroenterology CAD system.

    PubMed

    Abrantes, D; Gomes, P; Pereira, D; Coimbra, M

    2016-08-01

    The gastroenterology specialty could benefit from the introduction of Computer Assisted Decision (CAD) systems, since gastric cancer is a serious concern in which an accurate and early diagnosis usually leads to a good prognosis. Still, the way doctors interact with these systems matters greatly, because it will often determine whether they are embraced or rejected: any gains in productivity frequently hinge on how comfortable doctors are with them. Other interaction paradigms, such as voice and motion control, are important because typical inputs such as keyboard and mouse are sometimes not the best choice for certain clinical scenarios. In order to ascertain how a doctor could control a hypothetical CAD system during a gastroenterology exam, we measured the natural response of users when faced with three different task requests, using three types of interaction paradigms: voice, gesture and endoscope. Results matched expectations, with gesture control being the most intuitive to use and the endoscope the least. All the technologies are mature enough to cope with the response concepts the participants gave us. However, when the scenario context is taken into account, better natural-response scores may not always be the best choice for implementation. Simplification or reduction of tasks, along with a well thought-out interface, or even mixing more task-oriented paradigms for particular requests, could allow for better system control with fewer inconveniences for the user.

  7. It doesn't matter what you say: FMRI correlates of voice learning and recognition independent of speech content.

    PubMed

    Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal

    2017-09-01

    Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. 
This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity, independent of speech content. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. The smartphone and the driver's cognitive workload: A comparison of Apple, Google, and Microsoft's intelligent personal assistants.

    PubMed

    Strayer, David L; Cooper, Joel M; Turrill, Jonna; Coleman, James R; Hopman, Rachel J

    2017-06-01

    The goal of this research was to examine the impact of voice-based interactions using 3 different intelligent personal assistants (Apple's Siri, Google's Google Now for Android phones, and Microsoft's Cortana) on the cognitive workload of the driver. In 2 experiments using an instrumented vehicle on suburban roadways, we measured the cognitive workload of drivers when they used the voice-based features of each smartphone to place a call, select music, or send text messages. Cognitive workload was derived from primary-task performance through video analysis, secondary-task performance using the Detection Response Task (DRT), and subjective mental workload. We found that workload was significantly higher than that measured in the single-task drive. There were also systematic differences between the smartphones: The Google system placed lower cognitive demands on the driver than the Apple and Microsoft systems, which did not differ. Video analysis revealed that the difference in mental workload between the smartphones was associated with the number of system errors, the time to complete an action, and the complexity and intuitiveness of the devices. Finally, surprisingly high levels of cognitive workload were observed when drivers were interacting with the devices: "on-task" workload measures did not systematically differ from those associated with a mentally demanding Operation Span (OSPAN) task. The analysis also found residual costs associated with using each of the smartphones that took a significant time to dissipate. The data suggest that caution is warranted in the use of smartphone voice-based technology in the vehicle because of the high levels of cognitive workload associated with these interactions. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
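The Detection Response Task used above as the secondary-task workload measure reduces to two summary statistics per condition: the hit rate and the mean reaction time of detected stimuli. A minimal scoring sketch; the trial data and the 100-2500 ms valid-response window are illustrative assumptions following common DRT practice, not this study's exact protocol:

```python
def score_drt(trials, min_rt=0.100, max_rt=2.500):
    """Score Detection Response Task trials.

    trials: list of response times in seconds (None = missed stimulus).
    Responses inside [min_rt, max_rt] count as hits; others as misses.
    Returns (hit_rate, mean_rt_of_hits).
    """
    hits = [rt for rt in trials if rt is not None and min_rt <= rt <= max_rt]
    hit_rate = len(hits) / len(trials)
    mean_rt = sum(hits) / len(hits) if hits else float("nan")
    return hit_rate, mean_rt

# Illustrative trial data (seconds); None marks a missed stimulus,
# and 3.10 s falls outside the valid-response window
trials = [0.45, 0.60, None, 0.52, 3.10, 0.48]
hit_rate, mean_rt = score_drt(trials)
```

Higher cognitive workload shows up as longer mean RT and a lower hit rate, which is how the DRT separates the assistants in the study above.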

  9. Computer-automated dementia screening using a touch-tone telephone.

    PubMed

    Mundt, J C; Ferber, K L; Rizzo, M; Greist, J H

    2001-11-12

    This study investigated the sensitivity and specificity of a computer-automated telephone system to evaluate cognitive impairment in elderly callers to identify signs of early dementia. The Clinical Dementia Rating Scale was used to assess 155 subjects aged 56 to 93 years (n = 74, 27, 42, and 12, with a Clinical Dementia Rating Scale score of 0, 0.5, 1, and 2, respectively). These subjects performed a battery of tests administered by an interactive voice response system using standard Touch-Tone telephones. Seventy-four collateral informants also completed an interactive voice response version of the Symptoms of Dementia Screener. Sixteen cognitively impaired subjects were unable to complete the telephone call. Performances on 6 of 8 tasks were significantly influenced by Clinical Dementia Rating Scale status. The mean (SD) call length was 12 minutes 27 seconds (2 minutes 32 seconds). A subsample (n = 116) was analyzed using machine-learning methods, producing a scoring algorithm that combined performances across 4 tasks. Results indicated a potential sensitivity of 82.0% and specificity of 85.5%. The scoring model generalized to a validation subsample (n = 39), producing 85.0% sensitivity and 78.9% specificity. The kappa agreement between predicted and actual group membership was 0.64 (P<.001). Of the 16 subjects unable to complete the call, 11 provided sufficient information to permit us to classify them as impaired. Standard scoring of the interactive voice response-administered Symptoms of Dementia Screener (completed by informants) produced a screening sensitivity of 63.5% and 100% specificity. A lower criterion found a 90.4% sensitivity, without lowering specificity. Computer-automated telephone screening for early dementia using either informant or direct assessment is feasible. Such systems could provide wide-scale, cost-effective screening, education, and referral services to patients and caregivers.
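The screening statistics reported above (sensitivity, specificity, and the kappa agreement between predicted and actual group membership) can all be derived from a single 2x2 confusion matrix. A minimal sketch with invented counts, not the study's data:

```python
def screening_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # impaired subjects correctly flagged
    specificity = tn / (tn + fp)   # unimpaired subjects correctly cleared
    p_obs = (tp + tn) / n          # observed agreement
    # expected agreement by chance, computed from the marginal totals
    p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return sensitivity, specificity, kappa

# Invented counts: tp/fn = actual impaired, fp/tn = actual unimpaired
sens, spec, kappa = screening_metrics(tp=18, fn=4, fp=5, tn=33)
```

Kappa corrects raw agreement for the agreement expected by chance alone, which is why it is the appropriate statistic for comparing predicted and actual group membership.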

  10. Toward a Trustworthy Voice: Increasing the Effectiveness of Automated Outreach Calls to Promote Colorectal Cancer Screening among African Americans

    PubMed Central

    Albright, Karen; Richardson, Terri; Kempe, Karin L; Wallace, Kristin

    2014-01-01

    Introduction: Colorectal cancer screening rates are lower among African-American members of Kaiser Permanente Colorado (KPCO) than among members of other races and ethnicities. This study evaluated use of a linguistically congruent voice in interactive voice response outreach calls about colorectal cancer screening as a strategy to increase call completion and response. Methods: After an initial discussion group to assess cultural acceptability of the project, 6 focus groups were conducted with 33 KPCO African-American members. Participants heard and discussed recordings of 5 female voices reading the same segment of the standard-practice colorectal cancer message using interactive voice response. The linguistic palette included the voices of a white woman, a lightly accented Latina, and 3 African-American women. Results: Participants strongly preferred the African-American voices, particularly two voices. Participants considered these voices the most trustworthy and reported that they would be the most effective at increasing motivation to complete an automated call. Participants supported the use of African-American voices when designing outgoing automated calls for African Americans because the sense of familiarity engendered trust among listeners. Participants also indicated that effective automated messages should provide immediate clarity of purpose; explain why the issue is relevant to African Americans; avoid sounding scripted; emphasize that the call is for the listener’s benefit only; sound personable, warm, and positive; and not create fear among listeners. Discussion: Establishing linguistic congruence between African Americans and the voices used in automated calls designed to reach them may increase the effectiveness of outreach efforts. PMID:24867548

  11. Auditory and visual modulation of temporal lobe neurons in voice-sensitive and association cortices.

    PubMed

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K; Petkov, Christopher I

    2014-02-12

    Effective interactions between conspecific individuals can depend upon the receiver forming a coherent multisensory representation of communication signals, such as merging voice and face content. Neuroimaging studies have identified face- or voice-sensitive areas (Belin et al., 2000; Petkov et al., 2008; Tsao et al., 2008), some of which have been proposed as candidate regions for face and voice integration (von Kriegstein et al., 2005). However, it was unclear how multisensory influences occur at the neuronal level within voice- or face-sensitive regions, especially compared with classically defined multisensory regions in temporal association cortex (Stein and Stanford, 2008). Here, we characterize auditory (voice) and visual (face) influences on neuronal responses in a right-hemisphere voice-sensitive region in the anterior supratemporal plane (STP) of Rhesus macaques. These results were compared with those in the neighboring superior temporal sulcus (STS). Within the STP, our results show auditory sensitivity to several vocal features, which was not evident in STS units. We also newly identify a functionally distinct neuronal subpopulation in the STP that appears to carry the area's sensitivity to voice identity related features. Audiovisual interactions were prominent in both the STP and STS. However, visual influences modulated the responses of STS neurons with greater specificity and were more often associated with congruent voice-face stimulus pairings than STP neurons. Together, the results reveal the neuronal processes subserving voice-sensitive fMRI activity patterns in primates, generate hypotheses for testing in the visual modality, and clarify the position of voice-sensitive areas within the unisensory and multisensory processing hierarchies.

  12. Auditory and Visual Modulation of Temporal Lobe Neurons in Voice-Sensitive and Association Cortices

    PubMed Central

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K.

    2014-01-01

    Effective interactions between conspecific individuals can depend upon the receiver forming a coherent multisensory representation of communication signals, such as merging voice and face content. Neuroimaging studies have identified face- or voice-sensitive areas (Belin et al., 2000; Petkov et al., 2008; Tsao et al., 2008), some of which have been proposed as candidate regions for face and voice integration (von Kriegstein et al., 2005). However, it was unclear how multisensory influences occur at the neuronal level within voice- or face-sensitive regions, especially compared with classically defined multisensory regions in temporal association cortex (Stein and Stanford, 2008). Here, we characterize auditory (voice) and visual (face) influences on neuronal responses in a right-hemisphere voice-sensitive region in the anterior supratemporal plane (STP) of Rhesus macaques. These results were compared with those in the neighboring superior temporal sulcus (STS). Within the STP, our results show auditory sensitivity to several vocal features, which was not evident in STS units. We also newly identify a functionally distinct neuronal subpopulation in the STP that appears to carry the area's sensitivity to voice identity related features. Audiovisual interactions were prominent in both the STP and STS. However, visual influences modulated the responses of STS neurons with greater specificity and were more often associated with congruent voice-face stimulus pairings than STP neurons. Together, the results reveal the neuronal processes subserving voice-sensitive fMRI activity patterns in primates, generate hypotheses for testing in the visual modality, and clarify the position of voice-sensitive areas within the unisensory and multisensory processing hierarchies. PMID:24523543

  13. Engaged Voices--Dialogic Interaction and the Construction of Shared Social Meanings

    ERIC Educational Resources Information Center

    Cruddas, Leora

    2007-01-01

    The notion of "pupil voice" reproduces the binary distinction between adult and child, pupil and teacher and therefore serves to reinforce "conventional" constructions of childhood. The concept of "voice" invokes an essentialist construction of self that is singular, coherent, consistent and rational. It is arguably…

  14. MIT-NASA/KSC space life science experiments - A telescience testbed

    NASA Technical Reports Server (NTRS)

    Oman, Charles M.; Lichtenberg, Byron K.; Fiser, Richard L.; Vordermark, Deborah S.

    1990-01-01

    Experiments performed at MIT to better define Space Station information system telescience requirements for effective remote coaching of astronauts by principal investigators (PIs) on the ground are described. The experiments were conducted via satellite video, data, and voice links to surrogate crewmembers working in a laboratory at NASA's Kennedy Space Center. Teams of two PIs and two crewmembers performed two different space life sciences experiments. During 19 three-hour interactive sessions, a variety of test conditions were explored. Since bit rate limits are necessarily imposed on Space Station video experiments, surveillance video was varied down to 50 kb/s, and the effectiveness of PI-controlled frame rate, resolution, grey scale, and color decimation was investigated. It is concluded that remote coaching by voice works and that dedicated crew-PI voice loops would be of great value on the Space Station.

  15. Robotic air vehicle. Blending artificial intelligence with conventional software

    NASA Technical Reports Server (NTRS)

    Mcnulty, Christa; Graham, Joyce; Roewer, Paul

    1987-01-01

    The Robotic Air Vehicle (RAV) system is described. The program's objectives were to design, implement, and demonstrate cooperating expert systems for piloting robotic air vehicles. The development of this system merges conventional programming used in passive navigation with Artificial Intelligence techniques such as voice recognition, spatial reasoning, and expert systems. The individual components of the RAV system are discussed as well as their interactions with each other and how they operate as a system.

  16. Telemedicine to promote patient safety: Use of phone-based interactive voice response system (IVRS) to reduce adverse safety events in predialysis CKD

    PubMed Central

    Weiner, Shoshana; Fink, Jeffery C.

    2017-01-01

    Chronic kidney disease (CKD) patients have several features conferring upon them a high risk of adverse safety events, which are defined as incidents with unintended harm related to processes of care or medications. These characteristics include impaired renal function, polypharmacy, and frequent health system encounters. The consequences of such events in CKD can include new or prolonged hospitalization, accelerated renal function loss, acute kidney injury, end-stage renal disease and death. Health information technology administered via telemedicine presents opportunities for CKD patients to remotely communicate safety-related findings to providers for the purpose of improving their care. However, many CKD patients have limitations which hinder their use of telemedicine and access to the broad capabilities of health information technology. In this review we summarize previous assessments of the pre-dialysis CKD population's proficiency in using telemedicine modalities and describe the use of an interactive voice-response system (IVRS) to gauge the safety phenotype of the CKD patient. We discuss the potential for expanded IVRS use in CKD to address the safety threats inherent to this population. PMID:28224940

  17. Interactive Communication: A Few Research Answers for a Technological Explosion.

    ERIC Educational Resources Information Center

    Chapanis, Alphonse

    The techniques, procedures, and principal findings of 15 different experiments in a research program on interactive communication are summarized in this paper. Among the principal findings reported are that: problems are solved faster in communication modes that have a voice channel than in those that do not have a voice channel, modes of…

  18. Collaborative Scaffolding in Online Task-Based Voice Interactions between Advanced Learners

    ERIC Educational Resources Information Center

    Kenning, Marie-Madeleine

    2010-01-01

    This paper reports some of the findings of a distinctive innovative use of audio-conferencing involving a population (campus-based advanced learners) and a type of application (task-based language learning) that have received little attention to date: the use of Wimba Voice Tools to provide additional opportunities for spoken interactions between…

  19. 47 CFR 25.259 - Time sharing between NOAA meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. 25.259 Section... systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. (a) The space stations of a non-voice, non-geostationary Mobile-Satellite Service (NVNG MSS) system time-sharing downlink...

  20. 47 CFR 25.259 - Time sharing between NOAA meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. 25.259 Section... systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. (a) The space stations of a non-voice, non-geostationary Mobile-Satellite Service (NVNG MSS) system time-sharing downlink...

  1. 75 FR 30845 - Request Voucher for Grant Payment and Line of Credit Control System (LOCCS) Voice Response System...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-02

    ... request vouchers for distribution of grant funds using the automated Voice Response System (VRS). An... Payment and Line of Credit Control System (LOCCS) Voice Response System Access Authorization AGENCY... subject proposal. Payment request vouchers for distribution of grant funds using the automated Voice...

  2. Empowering Student Voice through Interactive Design and Digital Making

    ERIC Educational Resources Information Center

    Kim, Yanghee; Searle, Kristin

    2017-01-01

    Over the last two decades online technology and digital media have provided space for students to participate and express their voices. This paper further explores how new digital technologies, such as humanoid robots and wearable electronics, can be used to offer additional spaces where students' voices are heard. In these spaces, young students…

  3. Gender in Voice Perception in Autism

    ERIC Educational Resources Information Center

    Groen, Wouter B.; van Orsouw, Linda; Zwiers, Marcel; Swinkels, Sophie; van der Gaag, Rutger Jan; Buitelaar, Jan K.

    2008-01-01

    Deficits in the perception of social stimuli may contribute to the characteristic impairments in social interaction in high functioning autism (HFA). Although the cortical processing of voice is abnormal in HFA, it is unclear whether this gives rise to impairments in the perception of voice gender. About 20 children with HFA and 20 matched…

  4. The value of visualizing tone of voice.

    PubMed

    Pullin, Graham; Cook, Andrew

    2013-10-01

    Whilst most of us have an innate feeling for tone of voice, it is an elusive quality that even phoneticians struggle to describe with sufficient subtlety. For people who cannot speak themselves, this can have particularly profound repercussions. Augmentative communication often involves text-to-speech, a technology that supports only a basic choice of prosody based on punctuation. Given how inherently difficult it is to talk about more nuanced tone of voice, there is a risk that its absence from current devices goes unremarked and unchallenged. Looking ahead optimistically to more expressive communication aids, their design will need to involve more subtle interactions with tone of voice, interactions that the people using them can understand and engage with. Interaction design can play a role in making tone of voice visible, tangible, and accessible. Two projects that have already catalysed interdisciplinary debate in this area, Six Speaking Chairs and Speech Hedge, are introduced together with responses. A broader role for design is advocated, as a means of opening up speech technology research to a wider range of disciplinary perspectives, and also to the contributions and influence of people who use it in their everyday lives.

  5. Measures of voiced frication for automatic classification

    NASA Astrophysics Data System (ADS)

    Jackson, Philip J. B.; Jesus, Luis M. T.; Shadle, Christine H.; Pincas, Jonathan

    2004-05-01

    As an approach to understanding the characteristics of the acoustic sources in voiced fricatives, it seems apt to draw on knowledge of vowels and voiceless fricatives, which have been relatively well studied. However, the presence of both phonation and frication in these mixed-source sounds offers the possibility of mutual interaction effects, with variations across place of articulation. This paper examines the acoustic and articulatory consequences of these interactions and explores automatic techniques for finding parametric and statistical descriptions of these phenomena. A reliable and consistent set of such acoustic cues could be used for phonetic classification or speech recognition. Following work on devoicing of European Portuguese voiced fricatives [Jesus and Shadle, in Mamede et al. (eds.) (Springer-Verlag, Berlin, 2003), pp. 1-8]. and the modulating effect of voicing on frication [Jackson and Shadle, J. Acoust. Soc. Am. 108, 1421-1434 (2000)], the present study focuses on three types of information: (i) sequences and durations of acoustic events in VC transitions, (ii) temporal, spectral and modulation measures from the periodic and aperiodic components of the acoustic signal, and (iii) voicing activity derived from simultaneous EGG data. Analysis of interactions observed in British/American English and European Portuguese speech corpora will be compared, and the principal findings discussed.

  6. 47 CFR 25.260 - Time sharing between DoD meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. 25.260... systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. (a) The space stations of a non-voice, non-geostationary Mobile-Satellite Service (NVNG MSS) system time-sharing downlink...

  7. 47 CFR 25.260 - Time sharing between DoD meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. 25.260... systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in...

  8. 47 CFR 25.260 - Time sharing between DoD meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. 25.260... systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. (a) The space stations of a non-voice, non-geostationary Mobile-Satellite Service (NVNG MSS) system time-sharing downlink...

  9. 47 CFR 25.260 - Time sharing between DoD meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. 25.260... systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in...

  10. 47 CFR 25.259 - Time sharing between NOAA meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. 25.259 Section... systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in the 137-138...

  11. 47 CFR 25.260 - Time sharing between DoD meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. 25.260... systems and non-voice, non-geostationary satellite systems in the 400.15-401 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in...

  12. 47 CFR 25.259 - Time sharing between NOAA meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. 25.259 Section... systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in the 137-138...

  13. 47 CFR 25.259 - Time sharing between NOAA meteorological satellite systems and non-voice, non-geostationary...

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... satellite systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. 25.259 Section... systems and non-voice, non-geostationary satellite systems in the 137-138 MHz band. (a) A non-voice, non-geostationary mobile-satellite service system licensee (“NVNG licensee”) time-sharing spectrum in the 137-138...

  14. Effects of a Voice Output Communication Aid on Interactions between Support Personnel and an Individual with Multiple Disabilities.

    ERIC Educational Resources Information Center

    Schepis, Maureen M.; Reid, Dennis H.

    1995-01-01

    A young adult with multiple disabilities (profound mental retardation, spastic quadriplegia, and visual impairment) was provided with a voice output communication aid (VOCA) which allowed communication through synthesized speech. Both educational and residential staff members interacted with the individual more frequently when she had access to…

  15. Feasibility of automated speech sample collection with stuttering children using interactive voice response (IVR) technology.

    PubMed

    Vogel, Adam P; Block, Susan; Kefalianos, Elaina; Onslow, Mark; Eadie, Patricia; Barth, Ben; Conway, Laura; Mundt, James C; Reilly, Sheena

    2015-04-01

    To investigate the feasibility of adopting automated interactive voice response (IVR) technology for remotely capturing standardized speech samples from stuttering children. Participants were ten 6-year-old stuttering children. Their parents called a toll-free number from their homes and were prompted to elicit speech from their children using a standard protocol involving conversation, picture description and games. The automated IVR system was implemented using an off-the-shelf telephony software program and delivered by a standard desktop computer. The software infrastructure uses voice over internet protocol. Speech samples were automatically recorded during the calls. Video recordings were simultaneously acquired in the home at the time of the call to evaluate the fidelity of the telephone-collected samples. Key outcome measures included syllables spoken, percentage of syllables stuttered and an overall rating of stuttering severity on a 10-point scale. Data revealed a high level of relative reliability, in terms of intra-class correlation between the video- and telephone-acquired samples, on all outcome measures during the conversation task. Findings were less consistent for speech samples during picture description and games. Results suggest that IVR technology can be used successfully to automate remote capture of child speech samples.
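The relative reliability between telephone- and video-acquired samples reported above is typically quantified with an intra-class correlation such as ICC(2,1), the two-way random-effects, absolute-agreement, single-measure coefficient. A minimal sketch; the per-child syllable counts are invented for illustration, not the study's data:

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of per-subject rows, one column per measurement method.
    """
    n = len(ratings)     # subjects
    k = len(ratings[0])  # measurement methods (here: video, telephone)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # mean squares from the two-way ANOVA decomposition
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)   # subjects
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)   # methods
    sse = (sum((x - grand) ** 2 for row in ratings for x in row)
           - k * sum((m - grand) ** 2 for m in row_means)
           - n * sum((m - grand) ** 2 for m in col_means))
    mse = sse / ((n - 1) * (k - 1))                                # residual

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Invented example: stuttered-syllable counts per child (video, telephone)
counts = [[10, 11], [12, 12], [8, 9], [15, 14], [9, 10]]
icc = icc_2_1(counts)
```

Because ICC(2,1) penalizes systematic differences between the two methods (the MSC term), a high value indicates that the telephone samples agree with the video samples in absolute terms, not merely that they rank children consistently.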

  16. The Interaction of Eye-Voice Span with Syntactic Chunking and Predictability in Right- and Left-Embedded Sentences.

    ERIC Educational Resources Information Center

    Balajthy, Ernest P., Jr.

    Sixty tenth graders participated in this study of relationships between eye/voice span, phrase and clause boundaries, reading ability, and sentence structure. Results indicated that sentences apparently are "chunked" into surface constituents during processing. Better tenth grade readers had longer eye/voice spans than did poorer readers and…

  17. Strategies for the Production of Spanish Stop Consonants by Native Speakers of English.

    ERIC Educational Resources Information Center

    Zampini, Mary L.

    A study examined patterns in production of Spanish voiced and voiceless stop consonants by native English speakers, focusing on the interaction between two acoustic cues of stops: voice closure interval and voice onset time (VOT). The study investigated whether learners acquire the appropriate phonetic categories with regard to these stops and if…

  18. Relating to the Speaker behind the Voice: What Is Changing?

    PubMed Central

    Deamer, Felicity; Hayward, Mark

    2018-01-01

    We introduce therapeutic techniques that encourage voice hearers to view their voices as coming from intentional agents whose behavior may be dependent on how the voice hearer relates to and interacts with them. We suggest that this approach is effective because the communicative aspect of voice hearing might fruitfully be seen as explanatorily primitive, meaning that the agentive aspect, the auditory properties, and the intended meaning (interpretation) are all necessary parts of the experience, which contribute to the impact the experience has on the voice hearer. We examine the experiences of a patient who received Relating Therapy, and explore the kinds of changes that can result from this therapeutic approach. PMID:29422879

  19. Pointing and Voicing in Deictic Expressions.

    ERIC Educational Resources Information Center

    Levelt, Willem J. M.; And Others

    1985-01-01

    Describes a study of how the interdependence of speech and gesture is realized in the course of motor planning and execution. Do the two systems operate interactively or do they operate in a ballistic or independent fashion? Four experiments showed that, for deictic expressions, the ballistic view is very nearly correct. (SED)

  20. Bilingual Computerized Speech Recognition Screening for Depression Symptoms

    ERIC Educational Resources Information Center

    Gonzalez, Gerardo; Carter, Colby; Blanes, Erika

    2007-01-01

    The Voice-Interactive Depression Assessment System (VIDAS) is a computerized speech recognition application for screening depression based on the Center for Epidemiological Studies--Depression scale in English and Spanish. Study 1 included 50 English and 47 Spanish speakers. Study 2 involved 108 English and 109 Spanish speakers. Participants…

  1. Bigdata Oriented Multimedia Mobile Health Applications.

    PubMed

    Lv, Zhihan; Chirivella, Javier; Gagliardo, Pablo

    2016-05-01

    In this paper, two mHealth applications are introduced that can be employed as terminals of a big-data-based health service to collect information for electronic medical records (EMRs). The first is a hybrid system for improving the user experience in the hyperbaric oxygen chamber through 3D stereoscopic virtual reality glasses and immersive perception; several HMDs have been tested and compared. The second application is a voice-interactive serious game, a likely solution for providing an assistive rehabilitation tool for therapists. Recordings of patients' voices can be analysed to evaluate long-term rehabilitation results and, further, to predict the rehabilitation process.

  2. Learned face-voice pairings facilitate visual search.

    PubMed

    Zweig, L Jacob; Suzuki, Satoru; Grabowecky, Marcia

    2015-04-01

    Voices provide a rich source of information that is important for identifying individuals and for social interaction. During search for a face in a crowd, voices often accompany visual information, and they facilitate localization of the sought-after individual. However, it is unclear whether this facilitation occurs primarily because the voice cues the location of the face or because it also increases the salience of the associated face. Here we demonstrate that a voice that provides no location information nonetheless facilitates visual search for an associated face. We trained novel face-voice associations and verified learning using a two-alternative forced choice task in which participants had to correctly match a presented voice to the associated face. Following training, participants searched for a previously learned target face among other faces while hearing one of the following sounds (localized at the center of the display): a congruent learned voice, an incongruent but familiar voice, an unlearned and unfamiliar voice, or a time-reversed voice. Only the congruent learned voice speeded visual search for the associated face. This result suggests that voices facilitate the visual detection of associated faces, potentially by increasing their visual salience, and that the underlying crossmodal associations can be established through brief training.

  3. Selective attention modulates early human evoked potentials during emotional face-voice processing.

    PubMed

    Ho, Hao Tam; Schröger, Erich; Kotz, Sonja A

    2015-04-01

    Recent findings on multisensory integration suggest that selective attention influences cross-sensory interactions from an early processing stage. Yet, in the field of emotional face-voice integration, the hypothesis prevails that facial and vocal emotional information interacts preattentively. Using ERPs, we investigated the influence of selective attention on the perception of congruent versus incongruent combinations of neutral and angry facial and vocal expressions. Attention was manipulated via four tasks that directed participants to (i) the facial expression, (ii) the vocal expression, (iii) the emotional congruence between the face and the voice, and (iv) the synchrony between lip movement and speech onset. Our results revealed early interactions between facial and vocal emotional expressions, manifested as modulations of the auditory N1 and P2 amplitude by incongruent emotional face-voice combinations. Although audiovisual emotional interactions within the N1 time window were affected by the attentional manipulations, interactions within the P2 modulation showed no such attentional influence. Thus, we propose that the N1 and P2 are functionally dissociated in terms of emotional face-voice processing and discuss evidence in support of the notion that the N1 is associated with cross-sensory prediction, whereas the P2 relates to the derivation of an emotional percept. Essentially, our findings put the integration of facial and vocal emotional expressions into a new perspective-one that regards the integration process as a composite of multiple, possibly independent subprocesses, some of which are susceptible to attentional modulation, whereas others may be influenced by additional factors.

  4. Use of speech generating devices can improve perception of qualifications for skilled, verbal, and interactive jobs.

    PubMed

    Stern, Steven E; Chobany, Chelsea M; Beam, Alexander A; Hoover, Brittany N; Hull, Thomas T; Linsenbigler, Melissa; Makdad-Light, Courtney; Rubright, Courtney N

    2017-01-01

    We have previously demonstrated that when speech generating devices (SGDs) are used as assistive technologies, they are preferred over the users' natural voices. We sought to examine whether using SGDs would affect listeners' perceptions of the hirability of people with complex communication needs. In a series of three experiments, participants rated videotaped actors, one using an SGD and the other using their natural, mildly dysarthric voice, on (a) a measurement of perceptions of speaker credibility, strength, and informedness and (b) measurements of hirability for jobs coded in terms of skill, verbal ability, and interactivity. Experiment 1 examined hirability for jobs varying in skill and verbal ability. Experiment 2 was a replication that examined hirability for jobs varying in interactivity. Experiment 3 examined jobs in terms of skill and specific mode of interaction (face-to-face, telephone, computer-mediated). Actors were rated more favorably when using an SGD than their own voices. Actors using an SGD were also rated more favorably for highly skilled and highly verbal jobs. This preference for SGDs over a mildly dysarthric voice was also found for jobs entailing computer-mediated communication, particularly skillful jobs.

  5. Low is large: spatial location and pitch interact in voice-based body size estimation.

    PubMed

    Pisanski, Katarzyna; Isenstein, Sari G E; Montano, Kelyn J; O'Connor, Jillian J M; Feinberg, David R

    2017-05-01

    The binding of incongruent cues poses a challenge for multimodal perception. Indeed, although taller objects emit sounds from higher elevations, low-pitched sounds are perceptually mapped both to large size and to low elevation. In the present study, we examined how these incongruent vertical spatial cues (up is more) and pitch cues (low is large) to size interact, and whether similar biases influence size perception along the horizontal axis. In Experiment 1, we measured listeners' voice-based judgments of human body size using pitch-manipulated voices projected from a high versus a low, and a right versus a left, spatial location. Listeners associated low spatial locations with largeness for lowered-pitch but not for raised-pitch voices, demonstrating that pitch overrode vertical-elevation cues. Listeners associated rightward spatial locations with largeness, regardless of voice pitch. In Experiment 2, listeners performed the task while sitting or standing, allowing us to examine self-referential cues to elevation in size estimation. Listeners associated vertically low and rightward spatial cues with largeness more for lowered- than for raised-pitch voices. These correspondences were robust to sex (of both the voice and the listener) and head elevation (standing or sitting); however, horizontal correspondences were amplified when participants stood. Moreover, when participants were standing, their judgments of how much larger men's voices sounded than women's increased when the voices were projected from the low speaker. Our results provide novel evidence for a multidimensional spatial mapping of pitch that is generalizable to human voices and that affects performance in an indirect, ecologically relevant spatial task (body size estimation). These findings suggest that crossmodal pitch correspondences evoke both low-level and higher-level cognitive processes.

  6. Bilingual Voicing: A Study of Code-Switching in the Reported Speech of Finnish Immigrants in Estonia

    ERIC Educational Resources Information Center

    Frick, Maria; Riionheimo, Helka

    2013-01-01

    Through a conversation analytic investigation of Finnish-Estonian bilingual (direct) reported speech (i.e., voicing) by Finns who live in Estonia, this study shows how code-switching is used as a double contextualization device. The code-switched voicings are shaped by the on-going interactional situation, serving its needs by opening up a context…

  7. Understanding the mechanisms of familiar voice-identity recognition in the human brain.

    PubMed

    Maguinness, Corrina; Roswandowitz, Claudia; von Kriegstein, Katharina

    2018-03-31

    Humans have a remarkable skill for voice-identity recognition: most of us can remember many voices that surround us as 'unique'. In this review, we explore the computational and neural mechanisms which may support our ability to represent and recognise a unique voice-identity. We examine the functional architecture of voice-sensitive regions in the superior temporal gyrus/sulcus, and bring together findings on how these regions may interact with each other, and additional face-sensitive regions, to support voice-identity processing. We also contrast findings from studies on neurotypicals and clinical populations which have examined the processing of familiar and unfamiliar voices. Taken together, the findings suggest that representations of familiar and unfamiliar voices might dissociate in the human brain. Such an observation does not fit well with current models for voice-identity processing, which by-and-large assume a common sequential analysis of the incoming voice signal, regardless of voice familiarity. We provide a revised audio-visual integrative model of voice-identity processing which brings together traditional and prototype models of identity processing. This revised model includes a mechanism of how voice-identity representations are established and provides a novel framework for understanding and examining the potential differences in familiar and unfamiliar voice processing in the human brain. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Deficits in voice and multisensory processing in patients with Prader-Willi syndrome.

    PubMed

    Salles, Juliette; Strelnikov, Kuzma; Carine, Mantoulan; Denise, Thuilleaux; Laurier, Virginie; Molinas, Catherine; Tauber, Maïthé; Barone, Pascal

    2016-05-01

    Prader-Willi syndrome (PWS) is a rare neurodevelopmental genetic disorder characterized by variable expression of endocrine, cognitive and behavioral problems, among which are a true obsession with food and a deficit of satiety that leads to hyperphagia and severe obesity. Neuropsychological studies have reported that patients with PWS display altered social interactions, with a specific weakness in interpreting social information and responding to it, a symptom close to that observed in autism spectrum disorders (ASD). Based on the hypothesis that atypical multisensory integration, such as face and voice interactions, contributes to social impairment in PWS, we investigated the ability of patients with PWS to process communication signals, including the human voice. Patients recruited from the national reference center for PWS performed a simple detection task with stimuli presented in unimodal or bimodal conditions, as well as a voice discrimination task. Compared to typically developing (TD) control individuals, patients with PWS present a specific deficit in discriminating human voices from environmental sounds. Further, patients with PWS show a much lower multisensory benefit, with no violation of the race model, indicating that multisensory information does not converge and interact prior to the initiation of the behavioral response. All the deficits observed were stronger in the subgroup of patients with uniparental disomy, a population known to be more prone to ASD. Altogether, our study suggests that the deficits in social behavior observed in PWS derive at least partly from an impairment in deciphering the social information carried by voice signals, face signals, and the combination of both. In addition, our work is in agreement with brain imaging studies revealing an alteration in PWS of the "social brain network", including the STS region involved in processing human voices.
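    The "race model" invoked in this abstract refers to Miller's (1982) race-model inequality: under separate-channel (race) processing, the cumulative RT distribution for bimodal stimuli can never exceed the sum of the two unimodal CDFs. A minimal sketch of the test with simulated reaction times (a generic illustration, not the study's analysis code):

    ```python
    import numpy as np

    def violates_race_model(rt_av, rt_a, rt_v, probs=np.arange(0.05, 1.0, 0.05)):
        """Miller's race-model inequality: P(RT<=t | AV) <= P(RT<=t | A) + P(RT<=t | V).

        Checks the bound on a grid of pooled-RT quantiles; returns True if the
        bimodal CDF exceeds it anywhere (evidence of multisensory coactivation).
        """
        grid = np.quantile(np.concatenate([rt_av, rt_a, rt_v]), probs)
        ecdf = lambda rts, t: np.mean(rts <= t)
        return any(
            ecdf(rt_av, t) > min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t)) + 1e-9
            for t in grid
        )

    # Simulated reaction times (ms):
    rng = np.random.default_rng(0)
    rt_a = rng.normal(500, 50, 1000)     # auditory-only
    rt_v = rng.normal(520, 50, 1000)     # visual-only
    rt_fast = rng.normal(420, 40, 1000)  # bimodal, faster than any race prediction
    rt_slow = rng.normal(600, 50, 1000)  # bimodal but slow: no violation possible
    ```

    A bimodal condition markedly faster than both unimodal conditions violates the inequality at early quantiles; the absence of a violation in PWS patients is what suggests their sensory channels do not interact before response initiation.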

  9. Voice and endocrinology

    PubMed Central

    Hari Kumar, K. V. S.; Garg, Anurag; Ajai Chandra, N. S.; Singh, S. P.; Datta, Rakesh

    2016-01-01

    Voice is one of the advanced features of natural evolution that differentiates human beings from other primates. The human voice is capable of conveying thoughts as spoken words along with subtle emotion in the tone. This extraordinary capacity of the voice to express multiple emotions is the gift of God to human beings and helps in effective interpersonal communication. Voice generation involves close interaction between cerebral signals and the peripheral apparatus consisting of the larynx, vocal cords, and trachea. The human voice is susceptible to hormonal changes throughout life, from puberty until senescence. Thyroid, gonadal and growth hormones have a tremendous impact on the structure and function of the vocal apparatus. Alteration of the voice is observed even in physiological states such as puberty and menstruation. Astute clinical observers can detect these changes in the voice and refer patients for endocrine evaluation. In this review, we discuss hormonal influences on the voice apparatus in both normal states and endocrine disorders. PMID:27730065

  10. An innovative multimodal virtual platform for communication with devices in a natural way

    NASA Astrophysics Data System (ADS)

    Kinkar, Chhayarani R.; Golash, Richa; Upadhyay, Akhilesh R.

    2012-03-01

    As technology advances, people are increasingly interested in communicating with machines and computers naturally. This makes devices more compact and portable by avoiding remotes, keyboards, and similar peripherals, and also helps users live in an environment freer from electromagnetic waves. This trend has made recognition of natural modalities in human-computer interaction a most appealing and promising research field. At the same time, it has been observed that using a single mode of interaction limits the full utilization of commands as well as data flow. In this paper, a multimodal platform is proposed in which, out of many natural modalities such as eye gaze, speech, voice, and face, human gestures are combined with human voice so as to minimize the mean squared error. This loosens the strict environment needed for accurate and robust interaction when using a single mode. Gestures complement speech: gestures are ideal for direct object manipulation, while natural language is suited to descriptive tasks. Human-computer interaction basically requires two broad stages, recognition and interpretation. Recognition and interpretation of natural modalities in complex binary instructions is a tough task, as it integrates the real world with a virtual environment. The main idea of the paper is to develop an efficient model for fusing data coming from heterogeneous sensors, a camera and a microphone. Our analysis shows that efficiency increases when heterogeneous data (image and voice) are combined at the feature level using artificial intelligence. The long-term goal of this work is to design a robust system for users who are physically impaired or have little technical knowledge.

  11. Vibrant Student Voices: Exploring Effects of the Use of Clickers in Large College Courses

    ERIC Educational Resources Information Center

    Hoekstra, Angel

    2008-01-01

    Teachers have begun using student response systems (SRSs) in an effort to enhance the learning process in higher education courses. Research providing detailed information about how interactive technologies affect students as they learn is crucial for professors who seek to improve teaching quality, attendance rates and student learning. This…

  12. Understanding and Developing Interactive Voice Response Systems to Support Online Engagement of Older Adults

    ERIC Educational Resources Information Center

    Brewer, Robin Nicole

    2017-01-01

    Increasingly, people are engaging online and can participate in activities like searching for information, communicating with family and friends, and self-expression. However, some populations, such as older adults, face barriers to online participation, like device cost, access, and learnability, which prevent them from reaping the benefits of…

  13. Using interactive voice response to improve disease management and compliance with acute coronary syndrome best practice guidelines: A randomized controlled trial.

    PubMed

    Sherrard, Heather; Duchesne, Lloyd; Wells, George; Kearns, Sharon Ann; Struthers, Christine

    2015-01-01

    There is evidence from large clinical trials that compliance with standardized best practice guidelines (BPGs) improves survival of acute coronary syndrome (ACS) patients. However, their application is often suboptimal. In this study, the researchers evaluated whether the use of an interactive voice response (IVR) follow-up system improved ACS BPG compliance. This was a single-centre randomized controlled trial (RCT) of 1,608 patients (IVR = 803; usual care = 805). The IVR group received five automated calls over 12 months. The primary composite outcome was increased medication compliance and decreased adverse events. A significant improvement of 60% in the IVR group for the primary composite outcome was found (RR 1.60, 95% CI: 1.29 to 2.00, p < 0.001). There was significant improvement in medication compliance (p < 0.001) and a decrease in unplanned medical visits (p = 0.023). At one year, the majority of patients (85%) responded positively to using the system again. Follow-up by IVR produced positive outcomes in ACS patients.
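    The effect estimate reported above (RR 1.60, 95% CI 1.29 to 2.00) follows the standard relative-risk calculation with a Wald interval on the log scale. A sketch with hypothetical event counts (the abstract reports only the arm sizes, 803 and 805, not the raw counts):

    ```python
    import math

    def relative_risk(events_tx, n_tx, events_ctl, n_ctl, z=1.96):
        """Relative risk with a Wald 95% CI computed on the log scale."""
        rr = (events_tx / n_tx) / (events_ctl / n_ctl)
        # Standard error of log(RR) for two independent binomial proportions
        se = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctl - 1 / n_ctl)
        return (rr,
                math.exp(math.log(rr) - z * se),
                math.exp(math.log(rr) + z * se))

    # Hypothetical counts, chosen only to mirror the trial's arm sizes:
    rr, lo, hi = relative_risk(240, 803, 150, 805)
    print(f"RR = {rr:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
    ```

    The interval is asymmetric around the point estimate because it is computed on the log scale and then exponentiated, which is why published CIs for risk ratios (such as 1.29 to 2.00 around 1.60) are not centered on the RR.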

  14. Computational Modeling of Fluid–Structure–Acoustics Interaction during Voice Production

    PubMed Central

    Jiang, Weili; Zheng, Xudong; Xue, Qian

    2017-01-01

    The paper presents a three-dimensional, first-principles-based fluid–structure–acoustics interaction computer model of voice production that employs realistic human laryngeal and vocal-tract geometries. Self-sustained vibrations, the important convergent–divergent vibration pattern of the vocal folds, and entrainment of the two dominant vibratory modes were captured. Voice-quality-associated parameters, including the frequency, open quotient, skewness quotient, and flow rate of the glottal flow waveform, were found to be well within normal physiological ranges. The analogy between the vocal tract and a quarter-wave resonator was demonstrated. The acoustically perturbed flux and pressure inside the glottis were found to be of the same order as their incompressible counterparts, suggesting strong source–filter interactions during voice production. Such a high-fidelity computational model will be useful for investigating a variety of pathological conditions that involve complex vibrations, such as vocal fold paralysis, vocal nodules, and vocal polyps. The model is also an important step toward a patient-specific surgical planning tool that can serve as a no-risk trial-and-error platform for procedures such as injection of biomaterials and thyroplastic medialization. PMID:28243588

  15. Negotiating Voice Construction between Writers and Readers in College Writing: A Case Study of an L2 Writer

    ERIC Educational Resources Information Center

    Jwa, Soomin

    2018-01-01

    Voice is co-constructed, a result of the "text-mediated interaction between the writer and the reader." The present study, using the context of U.S. college writing, explores the complicated process by which an L2 novice writer--one who has a growing awareness of, yet peripheral access to, discourse practices--constructs a voice. Through…

  16. Single-channel voice-response-system program documentation volume I : system description

    DOT National Transportation Integrated Search

    1977-01-01

    This report documents the design and implementation of a Voice Response System (VRS) using Adaptive Differential Pulse Code Modulation (ADPCM) voice coding. Implemented on a Digital Equipment Corporation PDP-11/20, this VRS supports a single ...

  17. Electronic Delivery System: Presentation Features.

    DTIC Science & Technology

    1981-04-01

    [OCR-damaged briefing excerpt; recoverable points:] The functionality of the presentation, not its replicative nature, is what counts. Realism (contd.): a sequence of ... Interaction mechanisms: when a pointing device (e.g., a mouse) is used for inputting responses, it can be very efficient. Touch panels are a natural input mechanism. Voice input is used where hands or eyes are busy (e.g., for maintenance aiding) and is a natural means of communication.

  18. Voice emotion perception and production in cochlear implant users.

    PubMed

    Jiam, N T; Caldwell, M; Deroche, M L; Chatterjee, M; Limb, C J

    2017-09-01

    Voice emotion is a fundamental component of human social interaction and social development. Unfortunately, cochlear implant users are often forced to interface with highly degraded prosodic cues as a result of device constraints in extraction, processing, and transmission. As such, individuals with cochlear implants frequently demonstrate significant difficulty in recognizing voice emotions in comparison to their normal hearing counterparts. Cochlear implant-mediated perception and production of voice emotion is an important but relatively understudied area of research. However, a rich understanding of the voice emotion auditory processing offers opportunities to improve upon CI biomedical design and to develop training programs benefiting CI performance. In this review, we will address the issues, current literature, and future directions for improved voice emotion processing in cochlear implant users.

  19. ``The perceptual bases of speaker identity'' revisited

    NASA Astrophysics Data System (ADS)

    Voiers, William D.

    2003-10-01

    A series of experiments begun 40 years ago [W. D. Voiers, J. Acoust. Soc. Am. 36, 1065-1073 (1964)] was concerned with identifying the perceived voice traits (PVTs) on which human recognition of voices depends. It culminated in the development of a voice taxonomy based on 20 PVTs and a set of highly reliable rating scales for classifying voices with respect to those PVTs. The development of a perceptual voice taxonomy was motivated by the need for a practical method of evaluating speaker recognizability in voice communication systems. The Diagnostic Speaker Recognition Test (DSRT) evaluates the effects of systems on speaker recognizability as reflected in changes in the inter-listener reliability of voice ratings on the 20 PVTs. The DSRT thus provides a qualitative, as well as quantitative, evaluation of the effects of a system on speaker recognizability. A fringe benefit of this project is PVT rating data for a sample of 680 voices. [Work partially supported by USAFRL.]

  20. Voice responses to changes in pitch of voice or tone auditory feedback

    NASA Astrophysics Data System (ADS)

    Sivasankar, Mahalakshmi; Bauer, Jay J.; Babu, Tara; Larson, Charles R.

    2005-02-01

    The present study was undertaken to examine whether a subject's voice F0 responded not only to perturbations in the pitch of voice feedback but also to changes in the pitch of a side tone presented congruently with voice feedback. Small-magnitude, brief-duration perturbations in the pitch of voice or tone auditory feedback were randomly introduced during sustained vowel phonations. Results demonstrated a higher rate and larger magnitude of voice F0 responses to changes in the pitch of the voice compared with a triangular-shaped tone (experiment 1) or a pure tone (experiment 2). However, response latencies did not differ across voice or tone conditions. Data suggest that subjects responded to the change in F0 rather than to harmonic frequencies of auditory feedback, because voice F0 response prevalence, magnitude, and latency did not statistically differ across triangular-shaped-tone or pure-tone feedback. Results indicate the audio-vocal system is sensitive to changes in the pitch of a variety of sounds, which may represent a flexible system capable of adapting to changes in the subject's voice. However, lower prevalence and smaller responses to tone pitch-shifted signals suggest that the audio-vocal system may resist changes to the pitch of other environmental sounds when voice feedback is present.

  1. Interactive voice technology: Variations in the vocal utterances of speakers performing a stress-inducing task

    NASA Astrophysics Data System (ADS)

    Mosko, J. D.; Stevens, K. N.; Griffin, G. R.

    1983-08-01

    Acoustical analyses were conducted of words produced by four speakers in a motion stress-inducing situation. The aim of the analyses was to document the kinds of changes that occur in the vocal utterances of speakers who are exposed to motion stress and to comment on the implications of these results for the design and development of voice interactive systems. The speakers differed markedly in the types and magnitudes of the changes that occurred in their speech. For some speakers, the stress-inducing experimental condition caused an increase in fundamental frequency, changes in the pattern of vocal fold vibration, shifts in vowel production and changes in the relative amplitudes of sounds containing turbulence noise. All speakers showed greater variability in the experimental condition than in a more relaxed control situation. The variability was manifested in the acoustical characteristics of individual phonetic elements, particularly in unstressed syllables. The kinds of changes and variability observed serve to emphasize the limitations of speech recognition systems based on template matching of patterns that are stored in the system during a training phase. There is a need for a better understanding of these phonetic modifications and for developing ways of incorporating knowledge about these changes within a speech recognition system.

  2. Phonological experience modulates voice discrimination: Evidence from functional brain networks analysis.

    PubMed

    Hu, Xueping; Wang, Xiangpeng; Gu, Yan; Luo, Pei; Yin, Shouhang; Wang, Lijun; Fu, Chao; Qiao, Lei; Du, Yi; Chen, Antao

    2017-10-01

    Numerous behavioral studies have found a modulation effect of phonological experience on voice discrimination. However, the neural substrates underpinning this phenomenon are poorly understood. Here we manipulated language familiarity to test the hypothesis that phonological experience affects voice discrimination via mediating the engagement of multiple perceptual and cognitive resources. The results showed that during voice discrimination, the activation of several prefrontal regions was modulated by language familiarity. More importantly, the same effect was observed concerning the functional connectivity from the fronto-parietal network to the voice-identity network (VIN), and from the default mode network to the VIN. Our findings indicate that phonological experience could bias the recruitment of cognitive control and information retrieval/comparison processes during voice discrimination. Therefore, the study unravels the neural substrates subserving the modulation effect of phonological experience on voice discrimination, and provides new insights into studying voice discrimination from the perspective of network interactions.

  3. A robotic voice simulator and the interactive training for hearing-impaired people.

    PubMed

    Sawada, Hideyuki; Kitani, Mitsuki; Hayashi, Yasumori

    2008-01-01

    A talking and singing robot that adaptively learns vocalization skills by means of an auditory-feedback learning algorithm is being developed. The robot consists of motor-controlled vocal organs, such as vocal cords, a vocal tract and a nasal cavity, that generate a natural voice imitating human vocalization. In this study, the robot is applied to a speech-articulation training system for the hearing-impaired, because the robot can reproduce their vocalizations and show them how to improve them to produce clear speech. The paper briefly introduces the mechanical construction of the robot and how it autonomously acquires vocalization skills through auditory-feedback learning by listening to human speech. The training system is then described, together with an evaluation of the speech training by hearing-impaired people.

  4. Voice disorders in teachers and the general population: effects on work performance, attendance, and future career choices.

    PubMed

    Roy, Nelson; Merrill, Ray M; Thibeault, Susan; Gray, Steven D; Smith, Elaine M

    2004-06-01

    To examine the frequency and adverse effects of voice disorders on job performance and attendance in teachers and the general population, 2,401 participants from Iowa and Utah (n1 = 1,243 teachers and n2 = 1,279 nonteachers) were randomly selected and were interviewed by telephone using a voice disorder questionnaire. Teachers were significantly more likely than nonteachers to have experienced multiple voice symptoms and signs including hoarseness, discomfort, and increased effort while using their voice, tiring or experiencing a change in voice quality after short use, difficulty projecting their voice, trouble speaking or singing softly, and a loss of their singing range (all odds ratios [ORs] p <.05). Furthermore, teachers consistently attributed these voice symptoms to their occupation and were significantly more likely to indicate that their voice limited their ability to perform certain tasks at work, and had reduced activities or interactions as a result. Teachers, as compared with nonteachers, had missed more workdays over the preceding year because of voice problems and were more likely to consider changing occupations because of their voice (all comparisons p <.05). These findings strongly suggest that occupationally related voice dysfunction in teachers can have significant adverse effects on job performance, attendance, and future career choices.

  5. Inadequate vocal hygiene habits associated with the presence of self-reported voice symptoms in telemarketers.

    PubMed

    Fuentes-López, Eduardo; Fuente, Adrian; Contreras, Karem V

    2017-12-18

    The aim of this study is to determine possible associations between vocal hygiene habits and self-reported vocal symptoms in telemarketers. A cross-sectional study that included 79 operators from call centres in Chile was carried out. Their vocal hygiene habits and self-reported symptoms were investigated using a validated and reliable questionnaire created for the purposes of this study. Forty-five percent of telemarketers reported having one or more vocal symptoms. Among them, 16.46% reported that their voices tense up when talking and 10.13% needed to clear their throat to make their voices clearer. Five percent mentioned that they always talk without taking a break and 40.51% reported using their voices in noisy environments. The number of working hours per day and inadequate vocal hygiene habits were associated with the presence of self-reported symptoms. Additionally, an interaction between the use of the voice in noisy environments and not taking breaks during the day was observed. Finally, the frequency of inadequate vocal hygiene habits was associated with the number of symptoms reported. Using the voice in noisy environments and talking without taking breaks were both associated with the presence of specific vocal symptoms. This study provides some evidence about the interaction between these two inadequate vocal hygiene habits that potentiates vocal symptoms.

  6. Shielding voices: The modulation of binding processes between voice features and response features by task representations.

    PubMed

    Bogon, Johanna; Eisenbarth, Hedwig; Landgraf, Steffen; Dreisbach, Gesine

    2017-09-01

    Vocal events offer not only semantic-linguistic content but also information about the identity and the emotional-motivational state of the speaker. Furthermore, most vocal events have implications for our actions and therefore include action-related features. But the relevance and irrelevance of vocal features varies from task to task. The present study investigates binding processes for perceptual and action-related features of spoken words and their modulation by the task representation of the listener. Participants reacted with two response keys to eight different words spoken by a male or a female voice (Experiment 1) or spoken by an angry or neutral male voice (Experiment 2). There were two instruction conditions: half of participants learned eight stimulus-response mappings by rote (SR), and half of participants applied a binary task rule (TR). In both experiments, SR instructed participants showed clear evidence for binding processes between voice and response features indicated by an interaction between the irrelevant voice feature and the response. By contrast, as indicated by a three-way interaction with instruction, no such binding was found in the TR instructed group. These results are suggestive of binding and shielding as two adaptive mechanisms that ensure successful communication and action in a dynamic social environment.
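
The two instruction conditions can be sketched as follows; the word lists, categories, and key names are hypothetical illustrations, not the study's actual German stimuli.

```python
# SR condition: eight stimulus-response mappings learned by rote.
SR_MAPPING = {
    "dog": "left", "cat": "left", "cow": "left", "pig": "left",
    "oak": "right", "elm": "right", "ash": "right", "fir": "right",
}

# TR condition: the same responses derived from one binary task rule
# (here: animal vs. tree) instead of eight memorized pairs.
ANIMALS = {"dog", "cat", "cow", "pig"}

def respond_tr(word):
    """Apply the binary rule rather than recalling a rote mapping."""
    return "left" if word in ANIMALS else "right"

# Both conditions yield identical overt responses; only the task
# representation differs, which is what modulates feature binding.
assert all(SR_MAPPING[w] == respond_tr(w) for w in SR_MAPPING)
```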

  7. Functional selectivity for face processing in the temporal voice area of early deaf individuals

    PubMed Central

    van Ackeren, Markus J.; Rabini, Giuseppe; Zonca, Joshua; Foa, Valentina; Baruffaldi, Francesca; Rezk, Mohamed; Pavani, Francesco; Rossion, Bruno; Collignon, Olivier

    2017-01-01

    Brain systems supporting face and voice processing both contribute to the extraction of important information for social interaction (e.g., person identity). How does the brain reorganize when one of these channels is absent? Here, we explore this question by combining behavioral and multimodal neuroimaging measures (magneto-encephalography and functional imaging) in a group of early deaf humans. We show enhanced selective neural response for faces and for individual face coding in a specific region of the auditory cortex that is typically specialized for voice perception in hearing individuals. In this region, selectivity to face signals emerges early in the visual processing hierarchy, shortly after typical face-selective responses in the ventral visual pathway. Functional and effective connectivity analyses suggest reorganization in long-range connections from early visual areas to the face-selective temporal area in individuals with early and profound deafness. Altogether, these observations demonstrate that regions that typically specialize for voice processing in the hearing brain preferentially reorganize for face processing in born-deaf people. Our results support the idea that cross-modal plasticity in the case of early sensory deprivation relates to the original functional specialization of the reorganized brain regions. PMID:28652333

  8. Analysis of Measured and Simulated Supraglottal Acoustic Waves.

    PubMed

    Fraile, Rubén; Evdokimova, Vera V; Evgrafova, Karina V; Godino-Llorente, Juan I; Skrelin, Pavel A

    2016-09-01

    To date, although much attention has been paid to the estimation and modeling of the voice source (ie, the glottal airflow volume velocity), the measurement and characterization of the supraglottal pressure wave have been much less studied. Some previous results have unveiled that the supraglottal pressure wave has some spectral resonances similar to those of the voice pressure wave. This makes the supraglottal wave partially intelligible. Although the explanation for such effect seems to be clearly related to the reflected pressure wave traveling upstream along the vocal tract, the influence that nonlinear source-filter interaction has on it is not as clear. This article provides an insight into this issue by comparing the acoustic analyses of measured and simulated supraglottal and voice waves. Simulations have been performed using a high-dimensional discrete vocal fold model. Results of such comparative analysis indicate that spectral resonances in the supraglottal wave are mainly caused by the regressive pressure wave that travels upstream along the vocal tract and not by source-tract interaction. On the contrary and according to simulation results, source-tract interaction has a role in the loss of intelligibility that happens in the supraglottal wave with respect to the voice wave. This loss of intelligibility mainly corresponds to spectral differences for frequencies above 1500 Hz. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Voice, (inter-)subjectivity, and real time recurrent interaction

    PubMed Central

    Cummins, Fred

    2014-01-01

    Received approaches to a unified phenomenon called “language” are firmly committed to a Cartesian view of distinct unobservable minds. Questioning this commitment leads us to recognize that the boundaries conventionally separating the linguistic from the non-linguistic can appear arbitrary, omitting much that is regularly present during vocal communication. The thesis is put forward that uttering, or voicing, is a much older phenomenon than the formal structures studied by the linguist, and that the voice has found elaborations and codifications in other domains too, such as in systems of ritual and rite. Voice, it is suggested, necessarily gives rise to a temporally bound subjectivity, whether it is in inner speech (Descartes' “cogito”), in conversation, or in the synchronized utterances of collective speech found in prayer, protest, and sports arenas world wide. The notion of a fleeting subjective pole tied to dynamically entwined participants who exert reciprocal influence upon each other in real time provides an insightful way to understand notions of common ground, or socially shared cognition. It suggests that the remarkable capacity to construct a shared world that is so characteristic of Homo sapiens may be grounded in this ability to become dynamically entangled as seen, e.g., in the centrality of joint attention in human interaction. Empirical evidence of dynamic entanglement in joint speaking is found in behavioral and neuroimaging studies. A convergent theoretical vocabulary is now available in the concept of participatory sense-making, leading to the development of a rich scientific agenda liberated from a stifling metaphysics that obscures, rather than illuminates, the means by which we come to inhabit a shared world. PMID:25101028

  10. Voice Response Systems Technology.

    ERIC Educational Resources Information Center

    Gerald, Jeanette

    1984-01-01

    Examines two methods of generating synthetic speech in voice response systems, which allow computers to communicate in human terms (speech), using human interface devices (ears): phoneme and reconstructed voice systems. Considerations prior to implementation, current and potential applications, glossary, directory, and introduction to Input Output…

  11. Pilot study on the feasibility of a computerized speech recognition charting system.

    PubMed

    Feldman, C A; Stevens, D

    1990-08-01

    The objective of this study was to determine the feasibility of developing and using a voice recognition computerized charting system to record dental clinical examination data. More specifically, the study was designed to analyze the time and error differential between the traditional examiner/recorder method (ASSISTANT) and the computerized voice recognition method (VOICE). DMFS examinations were performed twice on 20 patients using the traditional ASSISTANT and the VOICE charting system. A statistically significant difference was found when comparing the mean ASSISTANT time of 2.69 min to the VOICE time of 3.72 min (P < 0.001). No statistically significant difference was found when comparing the mean ASSISTANT recording errors of 0.1 to VOICE recording errors of 0.6 (P = 0.059). Ninety percent of the patients indicated they felt comfortable with the dentist talking to a computer, and only 5% of the sample indicated they opposed VOICE. Results from this pilot study indicate that a charting system utilizing voice recognition technology could be considered a viable alternative to traditional examiner/recorder methods of clinical charting.
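
The time-differential comparison above can be illustrated with a paired analysis; the per-patient times below are hypothetical, chosen only so the group gap echoes the reported ASSISTANT (2.69 min) vs. VOICE (3.72 min) means.

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical per-patient exam times (minutes) for 5 of the 20 patients.
assistant_times = [2.5, 2.8, 2.6, 2.7, 2.9]   # examiner/recorder method
voice_times     = [3.6, 3.9, 3.5, 3.8, 3.8]   # voice recognition method

# Paired differences, since each patient was examined with both methods.
diffs = [v - a for v, a in zip(voice_times, assistant_times)]
mean_diff = mean(diffs)

# Paired t statistic: mean difference over its standard error.
t_stat = mean_diff / (stdev(diffs) / sqrt(len(diffs)))
print(f"VOICE slower by {mean_diff:.2f} min on average (t = {t_stat:.1f})")
```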

  12. Voice hearing within the context of hearers' social worlds: an interpretative phenomenological analysis.

    PubMed

    Mawson, Amy; Berry, Katherine; Murray, Craig; Hayward, Mark

    2011-09-01

    Research has found relational qualities of power and intimacy to exist within hearer-voice interactions. The present study aimed to provide a deeper understanding of the interpersonal context of voice hearing by exploring participants' relationships with their voices and other people in their lives. This research was designed in consultation with service users and employed a qualitative, phenomenological, and idiographic design using semi-structured interviews. Ten participants, recruited via mental health services, and who reported hearing voices in the previous week, completed the interviews. These were transcribed verbatim and analysed using interpretative phenomenological analysis. Five themes resulted from the analysis. Theme 1: 'person and voice' demonstrated that participants' voices often reflected the identity, but not always the quality of social acquaintances. Theme 2: 'voices changing and confirming relationship with the self' explored the impact of voice hearing in producing an inferior sense-of-self in comparison to others. Theme 3: 'a battle for control' centred on issues of control and a dilemma of independence within voice relationships. Theme 4: 'friendships facilitating the ability to cope' and theme 5: 'voices creating distance in social relationships' explored experiences of social relationships within the context of voice hearing, and highlighted the impact of social isolation for voice hearers. The study demonstrated the potential role of qualitative research in developing theories of voice hearing. It extended previous research by highlighting the interface between voices and the social world of the hearer, including reciprocal influences of social relationships on voices and coping. Improving voice hearers' sense-of-self may be a key factor in reducing the distress caused by voices. ©2010 The British Psychological Society.

  13. IBM techexplorer and MathML: Interactive Multimodal Scientific Documents

    NASA Astrophysics Data System (ADS)

    Diaz, Angel

    2001-06-01

    The World Wide Web provides a standard publishing platform for disseminating scientific and technical articles, books, journals, courseware, or even homework on the internet; the transition from paper to the web, however, has brought new opportunities for creating interactive content. Students, scientists, and engineers are now faced with the task of rendering the 2D presentational structure of mathematics, harnessing the wealth of scientific and technical software, and creating truly accessible scientific portals across international boundaries and markets. The recent emergence of World Wide Web Consortium (W3C) standards such as the Mathematical Markup Language (MathML), the Extensible Stylesheet Language (XSL), and Aural CSS (ACSS) provides a foundation whereby mathematics can be displayed, enlivened, computed, and audio formatted. With interoperability ensured by standards, software applications can be easily brought together to create extensible and interactive scientific content. In this presentation we will provide an overview of the IBM techexplorer Hypermedia Browser, a web browser plug-in and ActiveX control aimed at bringing interactive mathematics to the masses across platforms and applications. We will demonstrate "live" mathematics, where documents that contain MathML expressions can be edited and computed right inside your favorite web browser. This demonstration will be generalized as we show how MathML can be used to enliven even PowerPoint presentations. Finally, we will close the loop by demonstrating a novel approach to spoken mathematics based on MathML, DOM, XSL, ACSS, techexplorer, and IBM ViaVoice. By making use of techexplorer as the glue that binds the rendered content to the web browser, the back-end computation software, the Java applets that augment the exposition, and voice-rendering systems such as ViaVoice, authors can indeed create truly extensible and interactive scientific content.
    For more information see: [http://www.software.ibm.com/techexplorer] [http://www.alphaworks.ibm.com] [http://www.w3.org]
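
A minimal example of the presentation MathML that browsers and plug-ins like techexplorer render; the expression shown (the quadratic formula) is illustrative, not taken from the presentation.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <!-- x = (-b ± sqrt(b² - 4ac)) / 2a in presentation markup -->
  <mi>x</mi><mo>=</mo>
  <mfrac>
    <mrow>
      <mo>-</mo><mi>b</mi><mo>&#xB1;</mo>
      <msqrt>
        <msup><mi>b</mi><mn>2</mn></msup>
        <mo>-</mo><mn>4</mn><mi>a</mi><mi>c</mi>
      </msqrt>
    </mrow>
    <mrow><mn>2</mn><mi>a</mi></mrow>
  </mfrac>
</math>
```

An aural stylesheet (ACSS) can then map the same tree to spoken output, which is the pipeline the abstract describes for techexplorer plus ViaVoice.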

  14. Voice-on-Target: A New Approach to Tactical Networking and Unmanned Systems Control via the Voice Interface to the SA Environment

    DTIC Science & Technology

    2009-06-01

    Blackberry handheld) device. After each voice command activation, the medic provided voice comments to be recorded in Observer Notepad over Voice...vial (up-right corner of picture) upon voice activation from the medic’s Blackberry handheld. The NPS UAS which was controlled by voice commands...Voice Portal using a standard Blackberry handheld with a head set. The results demonstrated sufficient accuracy for controlling the tactical sensor

  15. The persuasiveness of synthetic speech versus human speech.

    PubMed

    Stern, S E; Mullennix, J W; Dyson, C; Wilson, S J

    1999-12-01

    Is computer-synthesized speech as persuasive as the human voice when presenting an argument? After completing an attitude pretest, 193 participants were randomly assigned to listen to a persuasive appeal under three conditions: a high-quality synthesized speech system (DECtalk Express), a low-quality synthesized speech system (Monologue), and a tape recording of a human voice. Following the appeal, participants completed a posttest attitude survey and a series of questionnaires designed to assess perceptions of speech qualities, perceptions of the speaker, and perceptions of the message. The human voice was generally perceived more favorably than the computer-synthesized voice, and the speaker was perceived more favorably when the voice was a human voice than when it was computer synthesized. There was, however, no evidence that computerized speech, as compared with the human voice, affected persuasion or perceptions of the message. Actual or potential applications of this research include issues that should be considered when designing synthetic speech systems.

  16. Driving While Interacting With Google Glass: Investigating the Combined Effect of Head-Up Display and Hands-Free Input on Driving Safety and Multitask Performance.

    PubMed

    Tippey, Kathryn G; Sivaraj, Elayaraj; Ferris, Thomas K

    2017-06-01

    This study evaluated the individual and combined effects of voice (vs. manual) input and head-up (vs. head-down) display in a driving and device interaction task. Advances in wearable technology offer new possibilities for in-vehicle interaction but also present new challenges for managing driver attention and regulating device usage in vehicles. This research investigated how driving performance is affected by interface characteristics of devices used for concurrent secondary tasks. A positive impact on driving performance was expected when devices included voice-to-text functionality (reducing demand for visual and manual resources) and a head-up display (HUD) (supporting greater visibility of the driving environment). Driver behavior and performance was compared in a texting-while-driving task set during a driving simulation. The texting task was completed with and without voice-to-text using a smartphone and with voice-to-text using Google Glass's HUD. Driving task performance degraded with the addition of the secondary texting task. However, voice-to-text input supported relatively better performance in both driving and texting tasks compared to using manual entry. HUD functionality further improved driving performance compared to conditions using a smartphone and often was not significantly worse than performance without the texting task. This study suggests that despite the performance costs of texting-while-driving, voice input methods improve performance over manual entry, and head-up displays may further extend those performance benefits. This study can inform designers and potential users of wearable technologies as well as policymakers tasked with regulating the use of these technologies while driving.

  17. ATC/pilot voice communications: A survey of the literature

    NASA Astrophysics Data System (ADS)

    Prinzo, O. Veronika; Britton, Thomas W.

    1993-11-01

    The first radio-equipped control tower in the United States opened at the Cleveland Municipal Airport in 1930. From that time to the present, voice radio communications have played a primary role in air safety. Verbal communications in air traffic control (ATC) operations have been frequently cited as causal factors in operational errors and pilot deviations in the FAA Operational Error and Deviation System, the NASA Aviation Safety Reporting System (ASRS), and reports derived from government sponsored research projects. Collectively, the data provided by these programs indicate that communications constitute a significant problem for pilots and controllers. Although the communications problem was well known the research literature was fragmented, making it difficult to appreciate the various types of verbal communications problems that existed and their unique influence on the quality of ATC/pilot communications. This is a survey of the voice radio communications literature. The 43 reports in the review represent survey data, field studies, laboratory studies, narrative reports, and reviews. The survey topics pertain to communications taxonomies, acoustical correlates and cognitive/psycholinguistic perspectives. Communications taxonomies were used to identify the frequency and types of information that constitute routine communications, as well as those communications involved in operational errors, pilot deviations, and other safety-related events. Acoustical correlate methodologies identified some qualities of a speaker's voice, such as loudness, pitch, and speech rate, which might be used potentially to monitor stress, mental workload, and other forms of psychological or physiological factors that affect performance. Cognitive/psycho-linguistic research offered an information processing perspective for understanding how pilots' and controllers' memory and language comprehension processes affect their ability to communicate effectively with one another. 
This analysis of the ATC/pilot voice radio communications literature was performed to provide an organized summary for the systematic study of interactive communications between controllers and pilots. Recommendations are given for new research initiatives, communications-based instructional materials, and human factors applications for new communications systems.

  18. Telehealth: voice therapy using telecommunications technology.

    PubMed

    Mashima, Pauline A; Birkmire-Peters, Deborah P; Syms, Mark J; Holtel, Michael R; Burgess, Lawrence P A; Peters, Leslie J

    2003-11-01

    Telehealth offers the potential to meet the needs of underserved populations in remote regions. The purpose of this study was a proof-of-concept to determine whether voice therapy can be delivered effectively remotely. Treatment outcomes were evaluated for a vocal rehabilitation protocol delivered under 2 conditions: with the patient and clinician interacting within the same room (conventional group) and with the patient and clinician in separate rooms, interacting in real time via a hard-wired video camera and monitor (video teleconference group). Seventy-two patients with voice disorders served as participants. Based on evaluation by otolaryngologists, 31 participants were diagnosed with vocal nodules, 29 were diagnosed with edema, 9 were diagnosed with unilateral vocal fold paralysis, and 3 presented with vocal hyperfunction with no laryngeal pathology. Fifty-one participants (71%) completed the vocal rehabilitation protocol. Outcome measures included perceptual judgments of voice quality, acoustic analyses of voice, patient satisfaction ratings, and fiber-optic laryngoscopy. There were no differences in outcome measures between the conventional group and the remote video teleconference group. Participants in both groups showed positive changes on all outcome measures after completing the vocal rehabilitation protocol. Reasons for participants discontinuing therapy prematurely provided support for the telehealth model of service delivery.

  19. Sounding the 'citizen-patient': the politics of voice at the Hospice Des Quinze-vingts in post-revolutionary Paris.

    PubMed

    Sykes, Ingrid

    2011-10-01

    This essay explores new models of the citizen-patient by attending to the post-Revolutionary blind 'voice'. Voice, in both a literal and figurative sense, was central to the way in which members of the Hospice des Quinze-Vingts, an institution for the blind and partially sighted, interacted with those in the community. Musical voices had been used by members to collect alms and to project the particular spiritual principle of their institution since its foundation in the thirteenth century. At the time of the Revolution, the Quinze-Vingts voice was understood by some political authorities as an exemplary call of humanity. Yet many others perceived it as deeply threatening. After 1800, productive dialogue between those in political control and Quinze-Vingts blind members broke down. Authorities attempted to silence the voice of members through the control of blind musicians and institutional management. The Quinze-Vingts blind continued to reassert their voices until around 1850, providing a powerful form of resistance to political control. The blind 'voice' ultimately recognised the right of the citizen-patient to dialogue with their political carers.

  20. A Voice-Based E-Examination Framework for Visually Impaired Students in Open and Distance Learning

    ERIC Educational Resources Information Center

    Azeta, Ambrose A.; Inam, Itorobong A.; Daramola, Olawande

    2018-01-01

    Voice-based systems allow users access to information on the internet over a voice interface. Prior studies on Open and Distance Learning (ODL) e-examination systems that make use of a voice interface do not sufficiently exhibit an intelligent form of assessment, which diminishes the rigor of examination. The objective of this paper is to improve on…

  1. An Analysis of Content Delivery Systems Using Speaking Voice, Speaking with Repetition Voice, Chanting Voice, and Singing Voice.

    ERIC Educational Resources Information Center

    Foster, Karen R.; Kersh, Mildred E.; Masztal, Nancy B.

    This study investigated the way kindergarten classroom teachers delivered information to students to see if it affected the amount of information students could remember about the solar system. The study also examined whether this difference would be related to the degree of musical aptitude possessed by each student. The students were pretested…

  2. Design of digital voice storage and playback system

    NASA Astrophysics Data System (ADS)

    Tang, Chao

    2018-03-01

    Based on the STC89C52 chip, this paper presents a minimal single-chip microcomputer system that implements the logic control of a digital speech storage and playback system. Compared with a traditional tape recording system, the proposed system offers small size and low power consumption, and it effectively overcomes the limitations of traditional voice recording systems in electronic information processing.
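
The storage/playback control logic such a system implements can be sketched as follows; the buffer capacity and sample handling are illustrative assumptions, not the STC89C52 firmware.

```python
# Minimal sketch of fixed-capacity record/playback control logic.

class VoiceRecorder:
    def __init__(self, capacity=8000):
        self.capacity = capacity      # fixed storage, like on-chip memory
        self.buffer = []

    def record(self, samples):
        """Append samples until storage is full; excess is dropped."""
        free = self.capacity - len(self.buffer)
        self.buffer.extend(samples[:free])

    def play(self):
        """Return stored samples in recording order."""
        return list(self.buffer)

    def erase(self):
        self.buffer.clear()

rec = VoiceRecorder(capacity=4)
rec.record([10, 20, 30])
rec.record([40, 50])              # 50 is dropped: storage is full
assert rec.play() == [10, 20, 30, 40]
```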

  3. A voice region in the monkey brain.

    PubMed

    Petkov, Christopher I; Kayser, Christoph; Steudel, Thomas; Whittingstall, Kevin; Augath, Mark; Logothetis, Nikos K

    2008-03-01

    For vocal animals, recognizing species-specific vocalizations is important for survival and social interactions. In humans, a voice region has been identified that is sensitive to human voices and vocalizations. As this region also strongly responds to speech, it is unclear whether it is tightly associated with linguistic processing and is thus unique to humans. Using functional magnetic resonance imaging of macaque monkeys (Old World primates, Macaca mulatta), we discovered a high-level auditory region that prefers species-specific vocalizations over other vocalizations and sounds. This region showed sensitivity not only to the 'voice' of the species, but also to the vocal identity of conspecific individuals. The monkey voice region is located on the superior-temporal plane and belongs to an anterior auditory 'what' pathway. These results establish functional relationships with the human voice region and support the notion that, for different primate species, the anterior temporal regions of the brain are adapted for recognizing communication signals from conspecifics.

  4. Giving Voice to Emotion: Voice Analysis Technology Uncovering Mental States is Playing a Growing Role in Medicine, Business, and Law Enforcement.

    PubMed

    Allen, Summer

    2016-01-01

    It's tough to imagine anything more frustrating than interacting with a call center. Generally, people don't reach out to call centers when they're happy; they're usually trying to get help with a problem or gearing up to do battle over a billing error. Add in an automatic phone tree, and you have a recipe for annoyance. But what if that robotic voice offering you a smorgasbord of numbered choices could tell that you were frustrated and then funnel you to an actual human being? This type of voice analysis technology exists, and it's just one example of the many ways that computers can use your voice to extract information about your mental and emotional state, including information you may not think of as being accessible through your voice alone.

  5. Recruitment and Retention Challenges in a Technology-Based Study with Older Adults Discharged from a Geriatric Rehabilitation Unit.

    PubMed

    McCloskey, Rose; Jarrett, Pamela; Stewart, Connie; Keeping-Burke, Lisa

    2015-01-01

    Technology has the potential to offer support to older adults after being discharged from geriatric rehabilitation. This article highlights recruitment and retention challenges in a study examining an interactive voice response telephone system designed to monitor and support older adults and their informal caregivers following discharge from a geriatric rehabilitation unit. A prospective longitudinal study was planned to examine the feasibility of an interactive voice telephone system in facilitating the transition from rehabilitation to home for older adults and their family caregivers. Patient participants were required to make daily calls into the system. Using standardized instruments, data was to be collected at baseline and during home visits. Older adults and their caregivers may not be willing to learn how to use new technology at the time of hospital discharge. Poor recruitment and retention rates prevented analysis of findings. The importance of recruitment and retention in any study should never be underestimated. Target users of any intervention need to be included in both the design of the intervention and the study examining its benefit. Identifying the issues associated with introducing technology with a group of older rehabilitation patients should assist others who are interested in exploring the role of technology in facilitating hospital discharge. © 2014 Association of Rehabilitation Nurses.

  6. Assessment of an interactive voice response system for identifying falls in a statewide sample of older adults.

    PubMed

    Albert, Steven M; King, Jennifer; Keene, Robert M

    2015-02-01

    Interactive voice response (IVR) systems offer great advantages for data collection in large, geographically dispersed samples involving frequent contact. We assessed the quality of IVR data collected from older respondents participating in a statewide falls prevention program evaluation in Pennsylvania in 2010-12. Participants (n=1834) were followed up monthly for up to 10 months to compare respondents who completed all, some, or no assessments in the IVR system. Validity was assessed by examining IVR-reported falls incidence relative to baseline in-person self-report and performance assessment of balance. While a third of the sample switched from IVR to in-person calls over follow-up, IVR interviews were successfully used to complete 68.1% of completed monthly assessments (10,511/15,430). Switching to in-person interviews was not associated with measures of participant function or cognition. Both self-reported (p<.0001) and performance assessment of balance (p=.05) at baseline were related to falls incidence. IVR is a productive modality for falls research among older adults. Future research should establish what level of initial personal research contact is optimal for boosting IVR completion rates and what research domains are most appropriate for this kind of contact. Copyright © 2014 Elsevier Inc. All rights reserved.
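
The IVR completion share reported above follows directly from the counts in the abstract:

```python
# 10,511 of 15,430 completed monthly assessments were done via IVR.
completed_by_ivr = 10_511
completed_total = 15_430

ivr_share = 100 * completed_by_ivr / completed_total
print(f"IVR share of completed assessments: {ivr_share:.1f}%")  # 68.1%
```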

  7. Interactive video audio system: communication server for INDECT portal

    NASA Astrophysics Data System (ADS)

    Mikulec, Martin; Voznak, Miroslav; Safarik, Jakub; Partila, Pavol; Rozhon, Jan; Mehic, Miralem

    2014-05-01

    The paper presents the IVAS system developed within the EU FP7 INDECT project. The INDECT project, part of the Seventh Framework Programme of the European Union, aims at developing tools for enhancing the security of citizens and protecting the confidentiality of recorded and stored information. We participated in the development of the INDECT portal and the Interactive Video Audio System (IVAS). IVAS provides a communication gateway between police officers working in the dispatching centre and police officers in the field. The officers in the dispatching centre can obtain information about all online police officers in the field, command them via text messages, voice, or video calls, and manage multimedia files from CCTV cameras or other sources that may be of interest to officers in the field. The police officers in the field are equipped with smartphones or tablets. Besides common communication, they can view pictures or videos sent by the commander and respond to a command via text or multimedia messages captured on their devices. Our IVAS system is unique because we are developing it according to special requirements from the Police of the Czech Republic. The IVAS communication system is designed to use modern Voice over Internet Protocol (VoIP) services. The whole solution is based on open-source software, including the Linux and Android operating systems. The technical details of our solution are presented in the paper.

  8. Andreas Vesalius' 500th Anniversary: Initial Integral Understanding of Voice Production.

    PubMed

    Brinkman, Romy J; Hage, J Joris

    2017-01-01

    Voice production relies on the integrated functioning of a three-part system: respiration, phonation and resonance, and articulation. To commemorate the 500th anniversary of the great anatomist Andreas Vesalius (1515-1564), we report on his understanding of this integral system. The text of Vesalius' masterpiece De Humani Corporis Fabrica Libri Septem and an eyewitness report of the public dissection of three corpses by Vesalius in Bologna, Italy, in 1540, were searched for references to the voice-producing anatomical structures and their function. We clustered these separate, traced passages for the first time. We found that Vesalius recognized the importance for voice production of many details of the respiratory system, the voice box, and various structures of resonance and articulation. He stressed that voice production was a cerebral function and extensively recorded the innervation of the voice-producing organs by the cranial nerves. Vesalius was the first to publicly record the concept of voice production as an integrated and cerebrally directed function of respiration, phonation and resonance, and articulation. In doing so nearly 500 years ago, he laid a firm basis for the understanding of the physiology of voice production and speech and its management as we know it today. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Human-computer interaction for alert warning and attention allocation systems of the multimodal watchstation

    NASA Astrophysics Data System (ADS)

    Obermayer, Richard W.; Nugent, William A.

    2000-11-01

    The SPAWAR Systems Center San Diego is currently developing an advanced Multi-Modal Watchstation (MMWS); design concepts and software from this effort are intended for transition to future United States Navy surface combatants. The MMWS features multiple flat panel displays and several modes of user interaction, including voice input and output, natural language recognition, 3D audio, stylus and gestural inputs. In 1999, an extensive literature review was conducted on basic and applied research concerned with alerting and warning systems. After summarizing that literature, a human-computer interaction (HCI) designer's guide was prepared to support the design of an attention allocation subsystem (AAS) for the MMWS. The resultant HCI guidelines are being applied in the design of a fully interactive AAS prototype. An overview of key findings from the literature review, a proposed design methodology with illustrative examples, and an assessment of progress made in implementing the HCI designer's guide are presented.

  10. The distress of voice-hearing: the use of simulation for awareness, understanding and communication skill development in undergraduate nursing education.

    PubMed

    Orr, Fiona; Kellehear, Kevin; Armari, Elizabeth; Pearson, Arana; Holmes, Douglas

    2013-11-01

    Role-play scenarios are frequently used with undergraduate nursing students enrolled in mental health nursing subjects to simulate the experience of voice-hearing. However, role-play has limitations and typically does not involve those who hear voices. This collaborative project between mental health consumers who hear voices and nursing academics aimed to develop and assess simulated voice-hearing as an alternative learning tool that could provide a deeper understanding of the impact of voice-hearing, whilst enabling students to consider the communication skills required when interacting with voice-hearers. Simulated sounds and voices recorded by consumers on mp3 players were given to eighty final year nursing students undertaking a mental health elective. Students participated in various activities whilst listening to the simulations. Seventy-six (95%) students completed a written evaluation following the simulation, which assessed the benefits of the simulation and its implications for clinical practice. An analysis of the students' responses by an external evaluator indicated that there were three major learning outcomes: developing an understanding of voice-hearing, increasing students' awareness of its impact on functioning, and consideration of the communication skills necessary to engage with consumers who hear voices. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Effects of an Automated Telephone Support System on Caregiver Burden and Anxiety: Findings from the REACH for TLC Intervention Study

    ERIC Educational Resources Information Center

    Mahoney, Diane Feeney; Tarlow, Barbara J.; Jones, Richard N.

    2003-01-01

    Purpose: We determine the main outcome effects of a 12-month computer-mediated automated interactive voice response (IVR) intervention designed to assist family caregivers managing persons with disruptive behaviors related to Alzheimer's disease (AD). Design and Methods: We conducted a randomized controlled study of 100 caregivers, 51 in the usual…

  12. Utilization of Internet Protocol-Based Voice Systems in Remote Payload Operations

    NASA Technical Reports Server (NTRS)

    Best, Susan; Nichols, Kelvin; Bradford, Robert

    2003-01-01

    This viewgraph presentation provides an overview of a proposed voice communication system for use in remote payload operations performed on the International Space Station. The system, Internet Voice Distribution System (IVoDS), would make use of existing Internet protocols, and offer a number of advantages over the system currently in use. Topics covered include: system description and operation, system software and hardware, system architecture, project status, and technology transfer applications.

  13. The voices of seduction: cross-gender effects in processing of erotic prosody

    PubMed Central

    Ethofer, Thomas; Wiethoff, Sarah; Anders, Silke; Kreifelts, Benjamin; Grodd, Wolfgang

    2007-01-01

    Gender-specific differences in cognitive functions have been widely discussed. For social cognition, such as the perception of emotion conveyed by non-verbal cues, a female advantage is generally assumed. In the present study, however, we revealed a cross-gender interaction, with increased responses to the voice of the opposite sex in both male and female subjects. This effect was confined to an erotic tone of speech in behavioural data and in haemodynamic responses within voice-sensitive brain areas (right middle superior temporal gyrus). The observed response pattern thus indicates a particular sensitivity to emotional voices that have a high behavioural relevance for the listener. PMID:18985138

  14. Adaptable dialog architecture and runtime engine (AdaRTE): a framework for rapid prototyping of health dialog systems.

    PubMed

    Rojas-Barahona, L M; Giorgino, T

    2009-04-01

    Spoken dialog systems have been increasingly employed to provide ubiquitous telephone access to information and services for the non-Internet-connected public. They have been successfully applied in the health care context; however, speech technology requires a considerable development investment. The advent of VoiceXML reduced the proliferation of incompatible dialog formalisms, at the expense of adding even more complexity. This paper introduces a novel architecture for dialogue representation and interpretation, AdaRTE, which allows developers to lay out dialog interactions through a high-level formalism offering both declarative and procedural features. AdaRTE's aim is to provide a ground for deploying complex and adaptable dialogs whilst allowing experimentation with, and incremental adoption of, innovative speech technologies. It enhances augmented transition networks with dynamic behavior and drives multiple back-end realizers, including VoiceXML. It is especially targeted at the health care context because of that domain's scale and the need to reduce the barriers to widespread adoption of dialog systems.
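    AdaRTE's actual formalism and VoiceXML back-end are not detailed in the abstract; purely as a hypothetical illustration of the augmented-transition-network idea it builds on (dialog states connected by guarded transitions), a minimal sketch in Python:

```python
# Hypothetical sketch of an augmented-transition-network dialog:
# each state holds a prompt and a list of (guard, next_state) arcs.
# State names, prompts, and guards below are invented for illustration
# and are not AdaRTE's node types.

def run_dialog(transitions, start, answers):
    """Walk the network: at each state, record the prompt/answer pair
    and follow the first transition whose guard accepts the answer."""
    state, log = start, []
    for answer in answers:
        prompt, arcs = transitions[state]
        log.append((prompt, answer))
        for guard, target in arcs:
            if guard(answer):
                state = target
                break
        if state == "end":
            break
    return state, log

transitions = {
    "ask_pain": ("Are you in pain today?",
                 [(lambda a: a == "yes", "ask_scale"),
                  (lambda a: a == "no", "end")]),
    "ask_scale": ("Rate your pain from 0 to 10.",
                  [(lambda a: a.isdigit(), "end")]),
}

final, log = run_dialog(transitions, "ask_pain", ["yes", "7"])
assert final == "end" and len(log) == 2
```

A real health-dialog engine would add the dynamic behavior the abstract mentions (conditions and actions attached to arcs) and render each prompt through a back-end such as VoiceXML.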

  15. A General Purpose Connections type CTI Server Based on SIP Protocol and Its Implementation

    NASA Astrophysics Data System (ADS)

    Watanabe, Toru; Koizumi, Hisao

    In this paper, we propose a general-purpose connections-type CTI (Computer Telephony Integration) server that provides various CTI services, such as voice logging, in which the CTI server communicates with an IP-PBX using SIP (Session Initiation Protocol) and accumulates the voice packets of external-line telephone calls flowing between extension IP telephones and a VoIP gateway connected to outside-line networks. The CTI server realizes CTI services such as voice logging, telephone conferencing, and IVR (interactive voice response) by accumulating and processing the sampled voice packets. Furthermore, the CTI server incorporates a web-server function that can provide services such as a Web telephone directory via a Web browser to PCs, cellular telephones, or smartphones in mobile environments.

  16. Internet-Based System for Voice Communication With the ISS

    NASA Technical Reports Server (NTRS)

    Chamberlain, James; Myers, Gerry; Clem, David; Speir, Terri

    2005-01-01

    The Internet Voice Distribution System (IVoDS) is a voice-communication system that comprises mainly computer hardware and software. The IVoDS was developed to supplement and eventually replace the Enhanced Voice Distribution System (EVoDS), which, heretofore, has constituted the terrestrial subsystem of a system for voice communications among crewmembers of the International Space Station (ISS), workers at the Payloads Operations Center at Marshall Space Flight Center, principal investigators at diverse locations who are responsible for specific payloads, and others. The IVoDS utilizes a communication infrastructure of NASA and NASA-related intranets in addition to, as its name suggests, the Internet. Whereas the EVoDS utilizes traditional circuit-switched telephony, the IVoDS is a packet-data system that utilizes a voice over Internet protocol (VOIP). Relative to the EVoDS, the IVoDS offers advantages of greater flexibility and lower cost for expansion and reconfiguration. The IVoDS is an extended version of a commercial Internet-based voice conferencing system that enables each user to participate in only one conference at a time. In the IVoDS, a user can receive audio from as many as eight conferences simultaneously while sending audio to one of them. The IVoDS also incorporates administrative controls, beyond those of the commercial system, that provide greater security and control of the capabilities and authorizations for talking and listening afforded to each user.

  17. Spanish-Speaking Patients’ Engagement in Interactive Voice Response (IVR) Chronic Disease Self-Management Support Calls: Analyses of Data from Three Countries

    PubMed Central

    Piette, John D.; Marinec, Nicolle; Gallegos-Cabriales, Esther C.; Gutierrez-Valverde, Juana Mercedes; Rodriguez-Saldaña, Joel; Mendoz-Alevares, Milton; Silveira, Maria J.

    2013-01-01

    We used data from Interactive Voice Response (IVR) self-management support studies in Honduras, Mexico, and the United States (US) to determine whether IVR calls to Spanish-speaking patients with chronic illnesses are a feasible strategy for improving monitoring and education between face-to-face visits. 268 patients with diabetes or hypertension participated in 6–12 weeks of weekly IVR follow-up. IVR calls emanated from US servers with connections via Voice over IP. More than half (54%) of patients enrolled with an informal caregiver who received automated feedback based on the patient’s assessments, and clinical staff received urgent alerts. Participants had on average 6.1 years of education, and 73% were women. After 2,443 person-weeks of follow-up, patients completed 1,494 IVR assessments. Call completion rates were higher in the US (75%) than in Honduras (59%) or Mexico (61%; p<0.001). Patients participating with an informal caregiver were more likely to complete calls (adjusted odds ratio [AOR]: 1.53; 95% confidence interval [CI]: 1.04, 2.25) while patients reporting fair or poor health at enrollment were less likely (AOR: 0.59; 95% CI: 0.38, 0.92). Satisfaction rates were high, with 98% of patients reporting that the system was easy to use, and 86% reporting that the calls helped them a great deal in managing their health problems. In summary, IVR self-management support is feasible among Spanish-speaking patients with chronic disease, including those living in less-developed countries. Voice over IP can be used to deliver IVR disease management services internationally; involving informal caregivers may increase patient engagement. PMID:23532005
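    The adjusted odds ratios and confidence intervals above come from a multivariable model whose covariates are not listed here; as a hedged reminder of what the unadjusted version of such an estimate looks like, a crude odds ratio with a standard Wald confidence interval can be computed from a 2x2 table:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude (unadjusted) odds ratio from a 2x2 table, where a/b are
    outcome-yes/outcome-no counts in the exposed group and c/d the same
    in the unexposed group, with a 95% Wald confidence interval computed
    on the log-odds scale. The AORs in the abstract additionally adjust
    for covariates via logistic regression, which this sketch does not."""
    or_ = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# A balanced table gives OR = 1 with a CI straddling 1:
or_, lo, hi = odds_ratio_ci(10, 10, 10, 10)
assert or_ == 1.0 and lo < 1.0 < hi
```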

  18. NWR (National Weather Service) voice synthesis project, phase 1

    NASA Astrophysics Data System (ADS)

    Sampson, G. W.

    1986-01-01

    The purpose of the NOAA Weather Radio (NWR) Voice Synthesis Project is to demonstrate current voice synthesis technology. Phase 1 of the project, presented here, provides complete automation of an hourly surface aviation observation for broadcast over NWR. After examining the products currently available on the market, it was decided that synthetic voice technology could not provide the high-quality speech required for broadcast over NWR. The system presented therefore uses phrase-concatenation technology to achieve a very high-quality, versatile voice synthesis system.
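    Phrase concatenation, in contrast to phoneme-level synthesis, assembles a broadcast from pre-recorded spoken clips. A minimal sketch of the idea, with entirely hypothetical clip names and observation fields (the NWR system's actual vocabulary is not given in the abstract):

```python
# Hypothetical phrase-concatenation playlist builder: observation
# fields are mapped to pre-recorded clips that a player would then
# output back to back. All file names here are invented.

PHRASES = {
    "sky_clear": "clips/sky_clear.wav",
    "temp": "clips/temperature.wav",
    "deg_72": "clips/72_degrees.wav",
    "wind_calm": "clips/wind_calm.wav",
}

def build_playlist(observation):
    """Map observation fields to an ordered list of phrase clips."""
    playlist = ["clips/station_id.wav"]          # fixed station intro
    if observation.get("sky") == "clear":
        playlist.append(PHRASES["sky_clear"])
    playlist += [PHRASES["temp"], PHRASES[f"deg_{observation['temp_f']}"]]
    if observation.get("wind_kt", 0) == 0:
        playlist.append(PHRASES["wind_calm"])
    return playlist

obs = {"sky": "clear", "temp_f": 72, "wind_kt": 0}
assert build_playlist(obs) == [
    "clips/station_id.wav", "clips/sky_clear.wav",
    "clips/temperature.wav", "clips/72_degrees.wav",
    "clips/wind_calm.wav"]
```

Because every clip is a full recorded phrase, the output quality is that of the original speaker, at the cost of a fixed vocabulary.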

  19. Identification and human condition analysis based on the human voice analysis

    NASA Astrophysics Data System (ADS)

    Mieshkov, Oleksandr Yu.; Novikov, Oleksandr O.; Novikov, Vsevolod O.; Fainzilberg, Leonid S.; Kotyra, Andrzej; Smailova, Saule; Kozbekova, Ainur; Imanbek, Baglan

    2017-08-01

    The paper presents a two-stage biotechnical system for human condition analysis based on the analysis of the human voice signal. The voice signal is first pre-processed and its time-domain characteristics are determined. At the first stage, the system identifies the person in the database on the basis of the extracted characteristics. At the second stage, a model of the human voice is built from the real voice signals after clustering the whole database.

  20. STS-41 Voice Command System Flight Experiment Report

    NASA Technical Reports Server (NTRS)

    Salazar, George A.

    1981-01-01

    This report presents the results of the Voice Command System (VCS) flight experiment on the five-day STS-41 mission. Two mission specialists, Bill Shepherd and Bruce Melnick, used the speaker-dependent system to evaluate the operational effectiveness of using voice to control a spacecraft system. In addition, data were gathered to analyze the effects of microgravity on speech recognition performance.

  1. Accessibility of Mobile Devices for Visually Impaired Users: An Evaluation of the Screen-Reader VoiceOver.

    PubMed

    Smaradottir, Berglind; Håland, Jarle; Martinez, Santiago

    2017-01-01

    A mobile device's touchscreen allows users to interact with the user interface through a choreography of hand gestures. A screen reader on a mobile device is designed to support the interaction of visually disabled users while using gestures. This paper presents an evaluation of VoiceOver, the screen reader in Apple Inc. products. The evaluation was part of the research project "Visually impaired users touching the screen - a user evaluation of assistive technology".

  2. VoiceThread as a Peer Review and Dissemination Tool for Undergraduate Research

    NASA Astrophysics Data System (ADS)

    Guertin, L. A.

    2012-12-01

    VoiceThread has been utilized in an undergraduate research methods course for peer review and final research project dissemination. VoiceThread (http://www.voicethread.com) can be considered a social media tool, as it is a web-based technology with the capacity to enable interactive dialogue. VoiceThread is an application that allows a user to place a media collection online containing images, audio, videos, documents, and/or presentations in an interface that facilitates asynchronous communication. Participants in a VoiceThread can be passive viewers of the online content or engaged commenters via text, audio, video, with slide annotations via a doodle tool. The VoiceThread, which runs across browsers and operating systems, can be public or private for viewing and commenting and can be embedded into any website. Although few university students are aware of the VoiceThread platform (only 10% of the students surveyed by Ng (2012)), the 2009 K-12 edition of The Horizon Report (Johnson et al., 2009) lists VoiceThread as a tool to watch because of the opportunities it provides as a collaborative learning environment. In Fall 2011, eleven students enrolled in an undergraduate research methods course at Penn State Brandywine each conducted their own small-scale research project. Upon conclusion of the projects, students were required to create a poster summarizing their work for peer review. To facilitate the peer review process outside of class, each student-created PowerPoint file was placed in a VoiceThread with private access to only the class members and instructor. Each student was assigned to peer review five different student posters (i.e., VoiceThread images) with the audio and doodle tools to comment on formatting, clarity of content, etc. After the peer reviews were complete, the students were allowed to edit their PowerPoint poster files for a new VoiceThread. 
In the new VoiceThread, students were required to video record themselves describing their research and taking the viewer through their poster in the VoiceThread. This new VoiceThread with their final presentations was open for public viewing but not public commenting. A formal assessment was not conducted on the student impact of using VoiceThread for peer review and final research presentations. From an instructional standpoint, requiring students to use audio for the peer review commenting seemed to result in lengthier and more detailed reviews, connected with specific poster features when the doodle tool was utilized. By recording themselves as a "talking head" for the final product, students were required to be comfortable and confident with presenting their research, similar to what would be expected at a conference presentation. VoiceThread is currently being tested in general education Earth science courses at Penn State Brandywine as a dissemination tool for classroom-based inquiry projects and recruitment tool for Earth & Mineral Science majors.

  3. Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    NASA Astrophysics Data System (ADS)

    White, R. W.; Parks, D. L.

    1985-07-01

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept.

  4. Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    NASA Technical Reports Server (NTRS)

    White, R. W.; Parks, D. L.

    1985-01-01

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. At first, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified and each proposed application was rated on a number of criteria in order to achieve an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of being beneficial to flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot selectable and many were rated as being potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those made in a recent NASA study of a 1995 transport concept.

  5. Implementing Artificial Intelligence Behaviors in a Virtual World

    NASA Technical Reports Server (NTRS)

    Krisler, Brian; Thome, Michael

    2012-01-01

    In this paper, we present a look at the current state of the art in human-computer interface technologies, including intelligent interactive agents, natural speech interaction, and gesture-based interfaces. We describe our use of these technologies to implement a cost-effective, immersive experience in a public region in Second Life. We provision our artificial agent as a German Shepherd Dog avatar with an external rules engine controlling its behavior and movement. To interact with the avatar, we implemented a natural language and gesture system allowing human avatars to use speech and physical gestures rather than interacting via a keyboard and mouse. The result is a system that allows multiple humans to interact naturally with AI avatars, playing games such as fetch with a flying disk and even practicing obedience exercises using voice and gesture: a natural-seeming day in the park.

  6. A long distance voice transmission system based on the white light LED

    NASA Astrophysics Data System (ADS)

    Tian, Chunyu; Wei, Chang; Wang, Yulian; Wang, Dachi; Yu, Benli; Xu, Feng

    2017-10-01

    A long-distance voice transmission system based on visible light communication technology (VLCT) is proposed in the paper. The proposed system comprises a transmitter, a receiver, and single-chip-microcomputer voice signal processing. In the compact-sized LED transmitter, we use on-off keying with non-return-to-zero coding (OOK-NRZ) to easily realize high-speed modulation, thereby reducing system complexity. A voice transmission system with low noise and a wide modulation band is achieved through the design of a high-efficiency receiving optical path and the use of filters to reduce noise from the surrounding light. To improve the speed of signal processing, we use a single-chip microcomputer to code and decode the voice signal, and a serial peripheral interface (SPI) is adopted to transmit the voice signal data accurately. Test results show that the transmission distance of the system is more than 100 meters, with a maximum data rate of 1.5 Mbit/s and an SNR of 30 dB. This system has many advantages, such as simple construction, low cost, and strong practicality, and it therefore has extensive application prospects in fields such as emergency communication and indoor wireless communication.
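    OOK-NRZ, as named above, has a simple digital form: a 1-bit holds the LED on for a full bit period and a 0-bit holds it off, with no return to a rest level between bits. A sketch of the modulation and a threshold demodulator (the paper's actual sampling rates and thresholds are not given, so the parameters here are illustrative):

```python
# Illustrative OOK-NRZ modulator/demodulator; samples_per_bit and the
# 0.5 threshold are assumptions, not values from the paper.

def ook_nrz_modulate(bits, samples_per_bit=4, high=1.0, low=0.0):
    """On-off keying with non-return-to-zero coding: a 1-bit drives the
    LED on for the whole bit period, a 0-bit keeps it off."""
    signal = []
    for b in bits:
        signal.extend([high if b else low] * samples_per_bit)
    return signal

def ook_nrz_demodulate(signal, samples_per_bit=4, threshold=0.5):
    """Average each bit period and compare against a threshold."""
    bits = []
    for i in range(0, len(signal), samples_per_bit):
        chunk = signal[i:i + samples_per_bit]
        bits.append(1 if sum(chunk) / len(chunk) > threshold else 0)
    return bits

bits = [1, 0, 1, 1, 0]
assert ook_nrz_demodulate(ook_nrz_modulate(bits)) == bits
```

Averaging over the bit period gives the demodulator some tolerance to the additive noise that the receiver's optical filters do not remove.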

  7. Effects of the Voice over Internet Protocol on Perturbation Analysis of Normal and Pathological Phonation

    PubMed Central

    Zhu, Yanmei; Witt, Rachel E.; MacCallum, Julia K.; Jiang, Jack J.

    2010-01-01

    Objective: In this study, a Voice over Internet Protocol (VoIP) communication based on the G.729 protocol was simulated to determine the effects of this system on acoustic perturbation parameters of normal and pathological voice signals. Patients and Methods: Fifty recordings of normal voice and 48 recordings of pathological voice affected by laryngeal paralysis were transmitted through a VoIP communication system. The acoustic analysis programs CSpeech and MDVP were used to determine the percent jitter and percent shimmer of the voice samples before and after VoIP transmission. The effects of three frequently used audio compression protocols (MP3, WMA, and FLAC) on the perturbation measures were also studied. Results: It was found that VoIP transmission disrupts the waveform and increases the percent jitter and percent shimmer of voice samples. However, after VoIP transmission, significant discrimination between normal voices and pathological voices affected by laryngeal paralysis was still possible. The lossless compression method FLAC does not exert any influence on the perturbation measures, whereas the lossy compression methods MP3 and WMA increase percent jitter and percent shimmer values. Conclusion: This study validates the feasibility of these transmission and compression protocols in developing remote voice signal data collection and assessment systems. PMID:20588051
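    Percent jitter and percent shimmer have standard local definitions: the mean absolute cycle-to-cycle variation of the pitch period (jitter) or of the cycle peak amplitude (shimmer), expressed as a percentage of the mean. The sketch below illustrates those textbook definitions only; CSpeech and MDVP apply their own pitch extraction and smoothing on top of them:

```python
# Textbook local jitter/shimmer, not the CSpeech or MDVP implementations.

def percent_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def percent_shimmer(amplitudes):
    """The same measure applied to cycle peak amplitudes."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))

# A perfectly periodic voice has zero jitter; cycle-to-cycle jumps
# (e.g. the waveform disruption caused by lossy codecs) raise it.
assert percent_jitter([5.0, 5.0, 5.0, 5.0]) == 0.0
```

This also makes the FLAC result above intuitive: a lossless codec reproduces the periods and amplitudes exactly, so both measures are unchanged.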

  8. Brain systems mediating voice identity processing in blind humans.

    PubMed

    Hölig, Cordula; Föcker, Julia; Best, Anna; Röder, Brigitte; Büchel, Christian

    2014-09-01

    Blind people rely more on vocal cues when they recognize a person's identity than sighted people. Indeed, a number of studies have reported better voice recognition skills in blind than in sighted adults. The present functional magnetic resonance imaging study investigated changes in the functional organization of neural systems involved in voice identity processing following congenital blindness. A group of congenitally blind individuals and matched sighted control participants were tested in a priming paradigm, in which two voice stimuli (S1, S2) were subsequently presented. The prime (S1) and the target (S2) were either from the same speaker (person-congruent voices) or from two different speakers (person-incongruent voices). Participants had to classify the S2 as either an old or a young person. Person-incongruent voices (S2) compared with person-congruent voices elicited an increased activation in the right anterior fusiform gyrus in congenitally blind individuals but not in matched sighted control participants. In contrast, only matched sighted controls showed a higher activation in response to person-incongruent compared with person-congruent voices (S2) in the right posterior superior temporal sulcus. These results provide evidence for crossmodal plastic changes of the person identification system in the brain after visual deprivation. Copyright © 2014 Wiley Periodicals, Inc.

  9. Speaker's voice as a memory cue.

    PubMed

    Campeanu, Sandra; Craik, Fergus I M; Alain, Claude

    2015-02-01

    Speaker's voice occupies a central role as the cornerstone of auditory social interaction. Here, we review the evidence suggesting that speaker's voice constitutes an integral context cue in auditory memory. Investigation into the nature of voice representation as a memory cue is essential to understanding auditory memory and the neural correlates which underlie it. Evidence from behavioral and electrophysiological studies suggest that while specific voice reinstatement (i.e., same speaker) often appears to facilitate word memory even without attention to voice at study, the presence of a partial benefit of similar voices between study and test is less clear. In terms of explicit memory experiments utilizing unfamiliar voices, encoding methods appear to play a pivotal role. Voice congruency effects have been found when voice is specifically attended at study (i.e., when relatively shallow, perceptual encoding takes place). These behavioral findings coincide with neural indices of memory performance such as the parietal old/new recollection effect and the late right frontal effect. The former distinguishes between correctly identified old words and correctly identified new words, and reflects voice congruency only when voice is attended at study. Characterization of the latter likely depends upon voice memory, rather than word memory. There is also evidence to suggest that voice effects can be found in implicit memory paradigms. However, the presence of voice effects appears to depend greatly on the task employed. Using a word identification task, perceptual similarity between study and test conditions is, like for explicit memory tests, crucial. In addition, the type of noise employed appears to have a differential effect. While voice effects have been observed when white noise is used at both study and test, using multi-talker babble does not confer the same results. 
In terms of neuroimaging research, a characterization of an implicit memory effect reflective of voice congruency is currently lacking. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Loud and angry: sound intensity modulates amygdala activation to angry voices in social anxiety disorder

    PubMed Central

    Simon, Doerte; Becker, Michael; Mothes-Lasch, Martin; Miltner, Wolfgang H.R.

    2017-01-01

    Angry expressions of both voices and faces represent disorder-relevant stimuli in social anxiety disorder (SAD). Although individuals with SAD show greater amygdala activation to angry faces, previous work has failed to find comparable effects for angry voices. Here, we investigated whether voice sound-intensity, a modulator of a voice’s threat-relevance, affects brain responses to angry prosody in SAD. We used event-related functional magnetic resonance imaging to explore brain responses to voices varying in sound intensity and emotional prosody in SAD patients and healthy controls (HCs). Angry and neutral voices were presented either with normal or high sound amplitude, while participants had to decide upon the speaker’s gender. Loud vs normal voices induced greater insula activation, and angry vs neutral prosody greater orbitofrontal cortex activation in SAD as compared with HC subjects. Importantly, an interaction of sound intensity, prosody and group was found in the insula and the amygdala. In particular, the amygdala showed greater activation to loud angry voices in SAD as compared with HC subjects. This finding demonstrates a modulating role of voice sound-intensity on amygdalar hyperresponsivity to angry prosody in SAD and suggests that abnormal processing of interpersonal threat signals in amygdala extends beyond facial expressions in SAD. PMID:27651541

  11. Whispering - The hidden side of auditory communication.

    PubMed

    Frühholz, Sascha; Trost, Wiebke; Grandjean, Didier

    2016-11-15

    Whispering is a unique expression mode that is specific to auditory communication. Individuals switch their vocalization mode to whispering especially when affected by inner emotions in certain social contexts, such as in intimate relationships or intimidating social interactions. Although this context-dependent whispering is adaptive, whispered voices are acoustically far less rich than phonated voices and thus impose higher demands on listeners' hearing and neural auditory decoding when recognizing their socio-affective value. The neural dynamics underlying this recognition, especially from whispered voices, are largely unknown. Here we show that whispered voices in humans are considerably impoverished as quantified by an entropy measure of spectral acoustic information, and this missing information requires large-scale neural compensation in terms of auditory and cognitive processing. Notably, recognizing the socio-affective information from voices was slightly more difficult from whispered voices, probably based on missing tonal information. While phonated voices elicited extended activity in auditory regions for decoding of relevant tonal and time information and the valence of voices, whispered voices elicited activity in a complex auditory-frontal brain network. Our data suggest that a large-scale multidirectional brain network compensates for the impoverished sound quality of socially meaningful environmental signals to support their accurate recognition and valence attribution. Copyright © 2016 Elsevier Inc. All rights reserved.
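
    The entry above quantifies the acoustic impoverishment of whispered voices with "an entropy measure of spectral acoustic information." The paper's exact formulation is not given here; the following is a minimal sketch of one standard definition (Shannon entropy of the normalized power spectrum), illustrating why a broadband, noise-like signal scores higher than a narrowband tone:

```python
import numpy as np

def spectral_entropy(signal):
    """Shannon entropy (bits) of the signal's normalized power spectrum."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    p = power / power.sum()          # treat the spectrum as a probability mass
    p = p[p > 0]                     # drop zero bins before taking logs
    return float(-(p * np.log2(p)).sum())

fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 220 * t)                     # narrowband, "phonated-like"
noise = np.random.default_rng(0).standard_normal(fs)   # broadband, "whisper-like"
```

    A pure tone concentrates its power in a single frequency bin (entropy near zero), whereas white noise spreads power across all bins (entropy near the maximum, log2 of the bin count).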

  12. Loud and angry: sound intensity modulates amygdala activation to angry voices in social anxiety disorder.

    PubMed

    Simon, Doerte; Becker, Michael; Mothes-Lasch, Martin; Miltner, Wolfgang H R; Straube, Thomas

    2017-03-01

    Angry expressions of both voices and faces represent disorder-relevant stimuli in social anxiety disorder (SAD). Although individuals with SAD show greater amygdala activation to angry faces, previous work has failed to find comparable effects for angry voices. Here, we investigated whether voice sound-intensity, a modulator of a voice's threat-relevance, affects brain responses to angry prosody in SAD. We used event-related functional magnetic resonance imaging to explore brain responses to voices varying in sound intensity and emotional prosody in SAD patients and healthy controls (HCs). Angry and neutral voices were presented either with normal or high sound amplitude, while participants had to decide upon the speaker's gender. Loud vs normal voices induced greater insula activation, and angry vs neutral prosody greater orbitofrontal cortex activation in SAD as compared with HC subjects. Importantly, an interaction of sound intensity, prosody and group was found in the insula and the amygdala. In particular, the amygdala showed greater activation to loud angry voices in SAD as compared with HC subjects. This finding demonstrates a modulating role of voice sound-intensity on amygdalar hyperresponsivity to angry prosody in SAD and suggests that abnormal processing of interpersonal threat signals in amygdala extends beyond facial expressions in SAD. © The Author (2016). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  13. Voice discrimination in four primates.

    PubMed

    Candiotti, Agnès; Zuberbühler, Klaus; Lemasson, Alban

    2013-10-01

    One accepted function of vocalisations is to convey information about the signaller, such as its age-sex class, motivation, or relationship with the recipient. Yet, in natural habitats individuals not only interact with conspecifics but also with members of other species. This is well documented for African forest monkeys, which form semi-permanent mixed-species groups that can persist for decades. Although members of such groups interact with each other on a daily basis, both physically and vocally, it is currently unknown whether they can discriminate familiar and unfamiliar voices of heterospecific group members. We addressed this question with playback experiments on three monkey species known to form polyspecific associations in the wild: red-capped mangabeys, Campbell's monkeys and Guereza colobus monkeys. We tested subjects' ability to discriminate contact calls of familiar and unfamiliar female De Brazza monkeys. When pooling all species, subjects looked more often towards the speaker when hearing contact calls of unfamiliar than familiar callers. When testing De Brazza monkeys with their own calls, we found the same effect, with the longest gaze durations after hearing unfamiliar voices. This suggests that primates can discriminate not only between familiar and unfamiliar voices of conspecifics, but also between familiar and unfamiliar voices of heterospecifics living within close proximity. Copyright © 2013 Elsevier B.V. All rights reserved.

  14. Measuring positive and negative affect in the voiced sounds of African elephants (Loxodonta africana).

    PubMed

    Soltis, Joseph; Blowers, Tracy E; Savage, Anne

    2011-02-01

    As in other mammals, there is evidence that the African elephant voice reflects affect intensity, but it is less clear if positive and negative affective states are differentially reflected in the voice. An acoustic comparison was made between African elephant "rumble" vocalizations produced in negative social contexts (dominance interactions), neutral social contexts (minimal social activity), and positive social contexts (affiliative interactions) by four adult females housed at Disney's Animal Kingdom®. Rumbles produced in the negative social context exhibited higher and more variable fundamental frequencies (F0) and amplitudes, longer durations, increased voice roughness, and higher first formant locations (F1), compared to the neutral social context. Rumbles produced in the positive social context exhibited similar shifts in most variables (F0 variation, amplitude, amplitude variation, duration, and F1), but the magnitude of response was generally less than that observed in the negative context. Voice roughness and F0 observed in the positive social context remained similar to that observed in the neutral context. These results are most consistent with the vocal expression of affect intensity, in which the negative social context elicited higher intensity levels than the positive context, but differential vocal expression of positive and negative affect cannot be ruled out.
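
    Acoustic measures like the fundamental frequency (F0) reported above are commonly extracted with an autocorrelation pitch tracker. The study's own analysis pipeline is not described here; the sketch below runs one standard estimator on a synthetic 20 Hz tone, with the 10–50 Hz search band an assumption chosen to suit infrasonic rumbles:

```python
import numpy as np

def estimate_f0(signal, fs, fmin=10.0, fmax=50.0):
    """Estimate F0 as the autocorrelation peak within an assumed rumble band."""
    sig = signal - signal.mean()
    ac = np.correlate(sig, sig, mode="full")[len(sig) - 1:]   # lags >= 0
    lo, hi = int(fs / fmax), int(fs / fmin)                   # lag search window
    lag = lo + int(np.argmax(ac[lo:hi]))                      # strongest periodicity
    return fs / lag

fs = 8000
t = np.arange(fs) / fs
rumble = np.sin(2 * np.pi * 20 * t)    # synthetic 20 Hz "rumble"
f0 = estimate_f0(rumble, fs)
```

    Real rumble recordings would also need windowing and voicing detection; this shows only the core lag-search step.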

  15. Apollo experience report: Voice communications techniques and performance

    NASA Technical Reports Server (NTRS)

    Dabbs, J. H.; Schmidt, O. L.

    1972-01-01

    The primary performance requirement of the spaceborne Apollo voice communications system is percent word intelligibility, which is related to other link/channel parameters. The effect of percent word intelligibility on voice channel design and a description of the verification procedures are included. Performance problems encountered during development and testing, and the techniques used to solve them, are also discussed. Voice communications performance requirements should be comprehensive and easily verified; the total system must be considered in component design, and the necessity of voice processing and its associated effects on noise, distortion, and cross talk should be examined carefully.

  16. Exploring expressivity and emotion with artificial voice and speech technologies.

    PubMed

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.
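
    The entry above describes software linking a TTS synthesizer with an ASR engine to produce a chatbot that can speak and listen. That software is not available here, so the sketch below shows only the generic speak/listen glue, with stub callables standing in for real ASR/TTS SDKs (the class name and hooks are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SpeakListenLoop:
    """Glue layer: an ASR front end feeds a reply policy that a TTS back end renders."""
    recognize: Callable[[str], str]     # hypothetical ASR hook: utterance -> text
    synthesize: Callable[[str], None]   # hypothetical TTS hook: reply text -> audio out
    history: list = field(default_factory=list)

    def respond(self, utterance):
        heard = self.recognize(utterance)
        reply = f"You said: {heard}"    # placeholder dialogue policy, not the paper's
        self.synthesize(reply)          # a real system would render speech here
        self.history.append((heard, reply))
        return reply

# Stub engines stand in for real ASR/TTS SDKs, which need audio hardware.
bot = SpeakListenLoop(recognize=str.lower, synthesize=lambda text: None)
reply = bot.respond("HELLO")
```

    Swapping the stubs for real engine wrappers would leave the loop unchanged, which is the point of isolating the glue layer.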

  17. The pattern of educator voice in clinical counseling in an educational hospital in Shiraz, Iran: a conversation analysis

    PubMed Central

    Kalateh Sadati, Ahmad; Bagheri Lankarani, Kamran

    2017-01-01

    Doctor-patient interaction (DPI) includes different voices, of which the educator voice is of considerable importance. Physicians employ this voice to educate patients and their caregivers by providing them with information in order to change the patients’ behavior and improve their health status. The subject has not yet been fully understood, and therefore the present study was conducted to explore the pattern of educator voice. For this purpose, conversation analysis (CA) of 33 recorded clinical consultations was performed in outpatient educational clinics in Shiraz, Iran between April 2014 and September 2014. In this qualitative study, all utterances, repetitions, lexical forms, chuckles and speech particles were considered and interpreted as social actions. Interpretations were based on inductive data-driven analysis with the aim to find recurring patterns of educator voice. The results showed educator voice to have two general features: descriptive and prescriptive. However, the pattern of educator voice comprised characteristics such as superficiality, marginalization of patients, one-dimensional approach, ignoring a healthy lifestyle, and robotic nature. The findings of this study clearly demonstrated a deficiency in the educator voice and inadequacy in patient-centered dialogue. In this setting, the educator voice was related to a distortion of DPI through the physicians’ dominance, leading them to ignore their professional obligation to educate patients. Therefore, policies in this regard should take more account of enriching the educator voice through training medical students and faculty members in communication skills. PMID:29296258

  18. The pattern of educator voice in clinical counseling in an educational hospital in Shiraz, Iran: a conversation analysis.

    PubMed

    Kalateh Sadati, Ahmad; Bagheri Lankarani, Kamran

    2017-01-01

    Doctor-patient interaction (DPI) includes different voices, of which the educator voice is of considerable importance. Physicians employ this voice to educate patients and their caregivers by providing them with information in order to change the patients' behavior and improve their health status. The subject has not yet been fully understood, and therefore the present study was conducted to explore the pattern of educator voice. For this purpose, conversation analysis (CA) of 33 recorded clinical consultations was performed in outpatient educational clinics in Shiraz, Iran between April 2014 and September 2014. In this qualitative study, all utterances, repetitions, lexical forms, chuckles and speech particles were considered and interpreted as social actions. Interpretations were based on inductive data-driven analysis with the aim to find recurring patterns of educator voice. The results showed educator voice to have two general features: descriptive and prescriptive. However, the pattern of educator voice comprised characteristics such as superficiality, marginalization of patients, one-dimensional approach, ignoring a healthy lifestyle, and robotic nature. The findings of this study clearly demonstrated a deficiency in the educator voice and inadequacy in patient-centered dialogue. In this setting, the educator voice was related to a distortion of DPI through the physicians' dominance, leading them to ignore their professional obligation to educate patients. Therefore, policies in this regard should take more account of enriching the educator voice through training medical students and faculty members in communication skills.

  19. Matching Speaking to Singing Voices and the Influence of Content.

    PubMed

    Peynircioğlu, Zehra F; Rabinovitz, Brian E; Repice, Juliana

    2017-03-01

    We tested whether speaking voices of unfamiliar people could be matched to their singing voices, and, if so, whether the content of the utterances would influence this matching performance. Our hypothesis was that enough acoustic features would remain the same between speaking and singing voices such that their identification as belonging to the same or different individuals would be possible even upon a single hearing. We also hypothesized that the contents of the utterances would influence this identification process such that voices uttering words would be easier to match than those uttering vowels. We used a within-participant design with blocked stimuli that were counterbalanced using a Latin square design. In one block, mode (speaking vs singing) was manipulated while content was held constant; in another block, content (word vs syllable) was manipulated while mode was held constant, and in the control block, both mode and content were held constant. Participants indicated whether the voices in any given pair of utterances belonged to the same person or to different people. Cross-mode matching was above chance level, although mode-congruent performance was better. Further, only speaking voices were easier to match when uttering words. We can identify speaking and singing voices as the same or different even on just a single hearing. However, content interacts with mode such that words benefit matching of speaking voices but not of singing voices. Results are discussed within an attentional framework. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  20. Infusing Technology into Customer Relationships: Balancing High-Tech and High-Touch

    NASA Astrophysics Data System (ADS)

    Salomann, Harald; Kolbe, Lutz; Brenner, Walter

    In today's business environment, self-service is becoming increasingly important. To promote their self-service activities, banks have created online-only products and airlines offer exclusive discounts to passengers who book online. Practical applications of self-service technologies demonstrate the approach's potential: Amtrak introduced an IVR (Interactive Voice Response) system that yielded cost savings of 13m, and Royal Mail installed an IVR system that cut its customer service costs by 25% (Economist 2004).
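
    IVR systems such as those cited route callers through keypad (DTMF) menus. The Amtrak and Royal Mail menus themselves are unknown, so the sketch below models a purely hypothetical menu as a lookup table keyed by the digits pressed so far:

```python
# Hypothetical menu tree keyed by the digits pressed so far ("" = call start).
MENU = {
    "": "Press 1 for bookings or 2 for refunds.",
    "1": "Bookings: press 1 for a new booking or 2 for changes.",
    "1-1": "Connecting you to a booking agent.",
    "1-2": "Connecting you to the changes desk.",
    "2": "Refunds: an agent will assist you shortly.",
}

def ivr_step(state, digit):
    """Advance the call by one DTMF digit; unknown input repeats the current prompt."""
    nxt = f"{state}-{digit}" if state else digit
    if nxt in MENU:
        return nxt, MENU[nxt]
    return state, "Sorry, that is not an option. " + MENU[state]

state, prompt = "", MENU[""]
state, prompt = ivr_step(state, "1")   # caller presses 1 -> bookings submenu
state, prompt = ivr_step(state, "1")   # presses 1 again -> routed to an agent
```

    Production IVR platforms express the same tree in a dialogue description language (e.g., VoiceXML) rather than application code, but the state transitions are the same.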

  1. White House Communications Agency (WHCA) Presidential Voice Communications Rack Mount System Mechanical Drawing Package

    DTIC Science & Technology

    2015-12-01

    White House Communications Agency (WHCA) Presidential Voice Communications Rack Mount System Mechanical Drawing Package, by Steven P Callaway. Approved for public release; distribution unlimited. Dates covered: 04/2013.

  2. Central Nervous System Control of Voice and Swallowing

    PubMed Central

    Ludlow, Christy L.

    2015-01-01

    This review of the central nervous control systems for voice and swallowing suggests that the traditional concept of a separation between cortical/limbic and brain stem control should be refined toward a more integrative view. For voice production, a separation of the non-human vocalization system from the human learned voice production system has been posited based primarily on studies of non-human primates. However, recent human studies of emotionally based vocalizations and volitional voice production have shown more integration between these two systems than previously proposed. Recent human studies have shown that both reflexive vocalization and learned voice production not involving speech involve a common integrative system. On the other hand, recent studies of non-human primates have provided evidence of some cortical activity during vocalization and cortical changes with training during vocal behavior. For swallowing, evidence from the macaque and functional brain imaging in humans indicates that the control of the pharyngeal phase of swallowing is not primarily under brain stem mechanisms as previously proposed. Studies suggest that the initiation and patterning of the pharyngeal phase of swallowing are also under active cortical control for both spontaneous and volitional swallowing in awake humans and non-human primates. PMID:26241238

  3. The Provision of Feedback Types to EFL Learners in Synchronous Voice Computer Mediated Communication

    ERIC Educational Resources Information Center

    Ko, Chao-Jung

    2015-01-01

    This study examined the relationship between Synchronous Voice Computer Mediated Communication (SVCMC) interaction and the use of feedback types, especially pronunciation feedback types, in distance tutoring contexts. The participants, divided into two groups (explicit and recast), were twelve beginning/low-intermediate level English as a Foreign…

  4. What Does Class Origin and Education Mean for the Capabilities of Agency and Voice?

    ERIC Educational Resources Information Center

    Nordlander, Erica; Strandh, Mattias; Brännlund, Annica

    2015-01-01

    This article investigates the relationship between class origin, educational attainment, and the capabilities of agency and voice. The main objectives are to investigate how class origin and educational attainment interact and to consider whether higher education reduces any structural inequalities in the social aspects of life. A longitudinal…

  5. Female Middle School Principals' Voices: Implications for School Leadership Preparation

    ERIC Educational Resources Information Center

    Jones, Cathy; Ovando, Martha; High, Cynthia

    2009-01-01

    This study was an attempt to add the voices of women to the discourse of school leadership. It focused on the nature of the middle school leadership experiences of three female middle school principals, their social interactions based on gender role expectations and their own leadership perspectives. Findings suggest that middle school leadership…

  6. Obligatory and facultative brain regions for voice-identity recognition

    PubMed Central

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Abstract Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. 
The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is only a facultative component of voice-identity recognition in situations where additional face-identity processing is required. PMID:29228111

  7. Obligatory and facultative brain regions for voice-identity recognition.

    PubMed

    Roswandowitz, Claudia; Kappes, Claudia; Obrig, Hellmuth; von Kriegstein, Katharina

    2018-01-01

    Recognizing the identity of others by their voice is an important skill for social interactions. To date, it remains controversial which parts of the brain are critical structures for this skill. Based on neuroimaging findings, standard models of person-identity recognition suggest that the right temporal lobe is the hub for voice-identity recognition. Neuropsychological case studies, however, reported selective deficits of voice-identity recognition in patients predominantly with right inferior parietal lobe lesions. Here, our aim was to work towards resolving the discrepancy between neuroimaging studies and neuropsychological case studies to find out which brain structures are critical for voice-identity recognition in humans. We performed a voxel-based lesion-behaviour mapping study in a cohort of patients (n = 58) with unilateral focal brain lesions. The study included a comprehensive behavioural test battery on voice-identity recognition of newly learned (voice-name, voice-face association learning) and familiar voices (famous voice recognition) as well as visual (face-identity recognition) and acoustic control tests (vocal-pitch and vocal-timbre discrimination). The study also comprised clinically established tests (neuropsychological assessment, audiometry) and high-resolution structural brain images. The three key findings were: (i) a strong association between voice-identity recognition performance and right posterior/mid temporal and right inferior parietal lobe lesions; (ii) a selective association between right posterior/mid temporal lobe lesions and voice-identity recognition performance when face-identity recognition performance was factored out; and (iii) an association of right inferior parietal lobe lesions with tasks requiring the association between voices and faces but not voices and names. 
The results imply that the right posterior/mid temporal lobe is an obligatory structure for voice-identity recognition, while the inferior parietal lobe is only a facultative component of voice-identity recognition in situations where additional face-identity processing is required. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain.

  8. Non-verbal emotion communication training induces specific changes in brain function and structure

    PubMed Central

    Kreifelts, Benjamin; Jacob, Heike; Brück, Carolin; Erb, Michael; Ethofer, Thomas; Wildgruber, Dirk

    2013-01-01

    The perception of emotional cues from voice and face is essential for social interaction. However, this process is altered in various psychiatric conditions along with impaired social functioning. Emotion communication trainings have been demonstrated to improve social interaction in healthy individuals and to reduce emotional communication deficits in psychiatric patients. Here, we investigated the impact of a non-verbal emotion communication training (NECT) on cerebral activation and brain structure in a controlled and combined functional magnetic resonance imaging (fMRI) and voxel-based morphometry study. NECT-specific reductions in brain activity occurred in a distributed set of brain regions including face and voice processing regions as well as emotion processing- and motor-related regions presumably reflecting training-induced familiarization with the evaluation of face/voice stimuli. Training-induced changes in non-verbal emotion sensitivity at the behavioral level and the respective cerebral activation patterns were correlated in the face-selective cortical areas in the posterior superior temporal sulcus and fusiform gyrus for valence ratings and in the temporal pole, lateral prefrontal cortex and midbrain/thalamus for the response times. A NECT-induced increase in gray matter (GM) volume was observed in the fusiform face area. Thus, NECT induces both functional and structural plasticity in the face processing system as well as functional plasticity in the emotion perception and evaluation system. We propose that functional alterations are presumably related to changes in sensory tuning in the decoding of emotional expressions. Taken together, these findings highlight that the present experimental design may serve as a valuable tool to investigate the altered behavioral and neuronal processing of emotional cues in psychiatric disorders as well as the impact of therapeutic interventions on brain function and structure. PMID:24146641

  9. Non-verbal emotion communication training induces specific changes in brain function and structure.

    PubMed

    Kreifelts, Benjamin; Jacob, Heike; Brück, Carolin; Erb, Michael; Ethofer, Thomas; Wildgruber, Dirk

    2013-01-01

    The perception of emotional cues from voice and face is essential for social interaction. However, this process is altered in various psychiatric conditions along with impaired social functioning. Emotion communication trainings have been demonstrated to improve social interaction in healthy individuals and to reduce emotional communication deficits in psychiatric patients. Here, we investigated the impact of a non-verbal emotion communication training (NECT) on cerebral activation and brain structure in a controlled and combined functional magnetic resonance imaging (fMRI) and voxel-based morphometry study. NECT-specific reductions in brain activity occurred in a distributed set of brain regions including face and voice processing regions as well as emotion processing- and motor-related regions presumably reflecting training-induced familiarization with the evaluation of face/voice stimuli. Training-induced changes in non-verbal emotion sensitivity at the behavioral level and the respective cerebral activation patterns were correlated in the face-selective cortical areas in the posterior superior temporal sulcus and fusiform gyrus for valence ratings and in the temporal pole, lateral prefrontal cortex and midbrain/thalamus for the response times. A NECT-induced increase in gray matter (GM) volume was observed in the fusiform face area. Thus, NECT induces both functional and structural plasticity in the face processing system as well as functional plasticity in the emotion perception and evaluation system. We propose that functional alterations are presumably related to changes in sensory tuning in the decoding of emotional expressions. Taken together, these findings highlight that the present experimental design may serve as a valuable tool to investigate the altered behavioral and neuronal processing of emotional cues in psychiatric disorders as well as the impact of therapeutic interventions on brain function and structure.

  10. Literature review of voice recognition and generation technology for Army helicopter applications

    NASA Astrophysics Data System (ADS)

    Christ, K. A.

    1984-08-01

    This report is a literature review on the topics of voice recognition and generation. Areas covered are: manual versus vocal data input, vocabulary, stress and workload, noise, protective masks, feedback, and voice warning systems. Results of the studies presented in this report indicate that voice data entry has less of an impact on a pilot's flight performance, during low-level flying and other difficult missions, than manual data entry. However, the stress resulting from such missions may cause the pilot's voice to change, reducing the recognition accuracy of the system. The noise present in helicopter cockpits also causes the recognition accuracy to decrease. Noise-cancelling devices are being developed and improved upon to increase the recognition performance in noisy environments. Future research in the fields of voice recognition and generation should be conducted in the areas of stress and workload, vocabulary, and the types of voice generation best suited for the helicopter cockpit. Also, specific tasks should be studied to determine whether voice recognition and generation can be effectively applied.

  11. Fluid-acoustic interactions and their impact on pathological voiced speech

    NASA Astrophysics Data System (ADS)

    Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.; Plesniak, Michael W.

    2011-11-01

    Voiced speech is produced by vibration of the vocal fold structures. Vocal fold dynamics arise from aerodynamic pressure loadings, tissue properties, and acoustic modulation of the driving pressures. Recent speech science advancements have produced a physiologically realistic fluid flow solver (BLEAP) capable of prescribing asymmetric intraglottal flow attachment that can be easily assimilated into reduced-order models of speech. The BLEAP flow solver is extended to incorporate acoustic loading and sound propagation in the vocal tract by implementing a wave reflection analog approach for sound propagation based on the governing BLEAP equations. This enhanced physiological description of the physics of voiced speech is implemented in a two-mass model of speech. The impact of fluid-acoustic interactions on vocal fold dynamics is elucidated for both normal and pathological speech through linear and nonlinear analysis techniques. Supported by NSF Grant CBET-1036280.
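
    Two-mass models like the one referenced above reduce each vocal fold to a pair of coupled mass-spring-damper oscillators driven by intraglottal pressure. As a hedged sketch, the code below integrates only the free (undriven) structural core of such a model; it omits the BLEAP aerodynamic loading, collision forces, and acoustic coupling entirely, and its parameter values are illustrative rather than physiological:

```python
# Structural core of a two-mass vocal fold model: two mass-spring-damper
# oscillators joined by a coupling spring kc. No aerodynamic driving, so the
# free response simply rings down; parameter values are illustrative only.
m1 = m2 = 1e-4          # masses (kg)
k1 = k2 = 40.0          # individual spring stiffnesses (N/m)
kc = 25.0               # coupling spring between the two masses (N/m)
c1 = c2 = 2e-3          # viscous damping coefficients (N*s/m)
x1, v1, x2, v2 = 1e-3, 0.0, 0.0, 0.0   # displace the lower mass, release from rest
dt = 1e-5               # time step (s), semi-implicit Euler integration
for _ in range(20000):  # 0.2 s of free response
    f1 = -k1 * x1 - c1 * v1 - kc * (x1 - x2)
    f2 = -k2 * x2 - c2 * v2 - kc * (x2 - x1)
    v1 += dt * f1 / m1
    v2 += dt * f2 / m2
    x1 += dt * v1       # update positions with the already-updated velocities
    x2 += dt * v2
```

    Self-sustained oscillation, which is what distinguishes phonation from a ring-down, only appears once an asymmetric aerodynamic driving term (here supplied by BLEAP) is added to f1 and f2.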

  12. Impact of a voice recognition system on report cycle time and radiologist reading time

    NASA Astrophysics Data System (ADS)

    Melson, David L.; Brophy, Robert; Blaine, G. James; Jost, R. Gilbert; Brink, Gary S.

    1998-07-01

    Because of its exciting potential to improve clinical service, as well as reduce costs, a voice recognition system for radiological dictation was recently installed at our institution. This system will be clinically successful if it dramatically reduces radiology report turnaround time without substantially increasing radiologist dictation and editing time. This report summarizes an observer study currently under way in which radiologist reporting times using the traditional transcription system and the voice recognition system are compared. Four radiologists are observed interpreting portable intensive care unit (ICU) chest examinations at a workstation in the chest reading area. Data are recorded with the radiologists using the transcription system and using the voice recognition system. The measurements distinguish between time spent performing clerical tasks and time spent actually dictating the report. Editing time and the number of corrections made are recorded. Additionally, statistics are gathered to assess the voice recognition system's impact on the report cycle time (the time from report dictation to availability of an edited and finalized report) and on the length of reports.

  13. Mobile Health Devices as Tools for Worldwide Cardiovascular Risk Reduction and Disease Management.

    PubMed

    Piette, John D; List, Justin; Rana, Gurpreet K; Townsend, Whitney; Striplin, Dana; Heisler, Michele

    2015-11-24

    We examined evidence on whether mobile health (mHealth) tools, including interactive voice response calls, short message service (text messaging), and smartphones, can improve lifestyle behaviors and management related to cardiovascular diseases throughout the world. We conducted a state-of-the-art review and literature synthesis of peer-reviewed and gray literature published since 2004. The review prioritized randomized trials and studies focused on cardiovascular diseases and risk factors, but included other reports when they represented the best available evidence. The search emphasized reports on the potential benefits of mHealth interventions implemented in low- and middle-income countries. Interactive voice response and short message service interventions can improve cardiovascular preventive care in developed countries by addressing risk factors including weight, smoking, and physical activity. Interactive voice response and short message service-based interventions for cardiovascular disease management also have shown benefits with respect to hypertension management, hospital readmissions, and diabetic glycemic control. Multimodal interventions including Web-based communication with clinicians and mHealth-enabled clinical monitoring with feedback also have shown benefits. The evidence regarding the potential benefits of interventions using smartphones and social media is still developing. Studies of mHealth interventions have been conducted in >30 low- and middle-income countries, and evidence to date suggests that programs are feasible and may improve medication adherence and disease outcomes. Emerging evidence suggests that mHealth interventions may improve cardiovascular-related lifestyle behaviors and disease management. Next-generation mHealth programs developed worldwide should be based on evidence-based behavioral theories and incorporate advances in artificial intelligence for adapting systems automatically to patients' unique and changing needs. © 2015 American Heart Association, Inc.

  14. Designing interaction, voice, and inclusion in AAC research.

    PubMed

    Pullin, Graham; Treviranus, Jutta; Patel, Rupal; Higginbotham, Jeff

    2017-09-01

    The ISAAC 2016 Research Symposium included a Design Stream that examined timely issues across augmentative and alternative communication (AAC), framed in terms of designing interaction, designing voice, and designing inclusion. Each is a complex term with multiple meanings; together they represent challenging yet important frontiers of AAC research. The Design Stream was conceived by the four authors, researchers who have been exploring AAC and disability-related design throughout their careers, brought together by a shared conviction that designing for communication implies more than ensuring access to words and utterances. Each of these presenters came to AAC from a different background: interaction design, inclusive design, speech science, and social science. The resulting discussion among 24 symposium participants included controversies about the role of technology, tensions about independence and interdependence, and a provocation about taste. The paper concludes by proposing new directions for AAC research: (a) new interdisciplinary research could combine scientific and design research methods, as distant yet complementary as microanalysis and interaction design, (b) new research tools could seed accessible and engaging contextual research into voice within a social model of disability, and (c) new open research networks could support inclusive, international and interdisciplinary research.

  15. Depressed mothers' infants are less responsive to faces and voices.

    PubMed

    Field, Tiffany; Diego, Miguel; Hernandez-Reif, Maria

    2009-06-01

    A review of our recent research suggests that infants of depressed mothers appeared to be less responsive to faces and voices as early as the neonatal period. At that time they have shown less orienting to the live face/voice stimulus of the Brazelton scale examiner and to their own and other infants' cry sounds. This lesser responsiveness has been attributed to higher arousal, less attentiveness and less "empathy." Their delayed heart rate decelerations to instrumental and vocal music sounds have also been ascribed to their delayed attention and/or slower processing. Later at 3-6 months they showed less negative responding to their mothers' non-contingent and still-face behavior, suggesting that they were more accustomed to this behavior in their mothers. The less responsive behavior of the depressed mothers was further compounded by their comorbid mood states of anger and anxiety and their difficult interaction styles including withdrawn or intrusive interaction styles and their later authoritarian parenting style. Pregnancy massage was effectively used to reduce prenatal depression and facilitate more optimal neonatal behavior. Interaction coaching was used during the postnatal period to help these dyads with their interactions and ultimately facilitate the infants' development.

  16. Infants of Depressed Mothers Are Less Responsive To Faces and Voices: A Review

    PubMed Central

    Field, Tiffany; Diego, Miguel; Hernandez-Reif, Maria

    2009-01-01

    A review of our recent research suggests that infants of depressed mothers appeared to be less responsive to faces and voices as early as the neonatal period. At that time they have shown less orienting to the live face/voice stimulus of the Brazelton scale examiner and to their own and other infants’ cry sounds. This lesser responsiveness has been attributed to higher arousal, less attentiveness and less “empathy.” Their delayed heart rate decelerations to instrumental and vocal music sounds have also been ascribed to their delayed attention and/or slower processing. Later at 3–6 months they showed less negative responding to their mothers’ non-contingent and still-face behavior, suggesting that they were more accustomed to this behavior in their mothers. The less responsive behavior of the depressed mothers was further compounded by their comorbid mood states of anger and anxiety and their difficult interaction styles including withdrawn or intrusive interaction styles and their later authoritarian parenting style. Pregnancy massage was effectively used to reduce prenatal depression and facilitate more optimal neonatal behavior. Interaction coaching was used during the postnatal period to help these dyads with their interactions and ultimately facilitate the infants’ development. PMID:19439359

  17. Time-of-day effects on voice range profile performance in young, vocally untrained adult females.

    PubMed

    van Mersbergen, M R; Verdolini, K; Titze, I R

    1999-12-01

    Time-of-day effects on voice range profile performance were investigated in 20 vocally healthy, untrained women between the ages of 18 and 35 years. Each subject produced two complete voice range profiles: one in the morning and one in the evening, about 36 hours apart. The order of morning and evening trials was counterbalanced across subjects. Dependent variables were (1) average minimum and average maximum intensity, (2) voice range profile area, and (3) center of gravity (median semitone pitch and median intensity). The results failed to reveal any clear evidence of time-of-day effects on voice range profile performance for any of the dependent variables. However, a reliable interaction of time-of-day and trial order was obtained for average minimum intensity. Investigation of other subject populations, in particular trained vocalists or those with laryngeal lesions, is required before the results can be generalized.

  18. 14 CFR 25.1457 - Cockpit voice recorders.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 14 Aeronautics and Space 1 2014-01-01 2014-01-01 false Cockpit voice recorders. 25.1457 Section 25... recorders. (a) Each cockpit voice recorder required by the operating rules of this chapter must be approved... interphone system. (4) Voice or audio signals identifying navigation or approach aids introduced into a...

  19. 14 CFR 25.1457 - Cockpit voice recorders.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 1 2013-01-01 2013-01-01 false Cockpit voice recorders. 25.1457 Section 25... recorders. (a) Each cockpit voice recorder required by the operating rules of this chapter must be approved... interphone system. (4) Voice or audio signals identifying navigation or approach aids introduced into a...

  20. 14 CFR 29.1457 - Cockpit voice recorders.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 1 2012-01-01 2012-01-01 false Cockpit voice recorders. 29.1457 Section 29... recorders. (a) Each cockpit voice recorder required by the operating rules of this chapter must be approved... interphone system. (4) Voice or audio signals identifying navigation or approach aids introduced into a...

  1. 14 CFR 29.1457 - Cockpit voice recorders.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 1 2013-01-01 2013-01-01 false Cockpit voice recorders. 29.1457 Section 29... recorders. (a) Each cockpit voice recorder required by the operating rules of this chapter must be approved... interphone system. (4) Voice or audio signals identifying navigation or approach aids introduced into a...

  2. 14 CFR 25.1457 - Cockpit voice recorders.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 1 2012-01-01 2012-01-01 false Cockpit voice recorders. 25.1457 Section 25... recorders. (a) Each cockpit voice recorder required by the operating rules of this chapter must be approved... interphone system. (4) Voice or audio signals identifying navigation or approach aids introduced into a...

  3. Pilot vehicle interface on the advanced fighter technology integration F-16

    NASA Technical Reports Server (NTRS)

    Dana, W. H.; Smith, W. B.; Howard, J. D.

    1986-01-01

    This paper focuses on the work load aspects of the pilot vehicle interface in regard to the new technologies tested during AMAS Phase II. Subjects discussed in this paper include: a wide field-of-view head-up display; automated maneuvering attack system/sensor tracker system; master modes that configure flight controls and mission avionics; a modified helmet mounted sight; improved multifunction display capability; a voice interactive command system; ride qualities during automated weapon delivery; a color moving map; an advanced digital map display; and a g-induced loss-of-consciousness and spatial disorientation autorecovery system.

  4. Amygdala and auditory cortex exhibit distinct sensitivity to relevant acoustic features of auditory emotions.

    PubMed

    Pannese, Alessia; Grandjean, Didier; Frühholz, Sascha

    2016-12-01

    Discriminating between auditory signals of different affective value is critical to successful social interaction. It is commonly held that acoustic decoding of such signals occurs in the auditory system, whereas affective decoding occurs in the amygdala. However, given that the amygdala receives direct subcortical projections that bypass the auditory cortex, it is possible that some acoustic decoding occurs in the amygdala as well, when the acoustic features are relevant for affective discrimination. We tested this hypothesis by combining functional neuroimaging with the neurophysiological phenomena of repetition suppression (RS) and repetition enhancement (RE) in human listeners. Our results show that both amygdala and auditory cortex responded differentially to physical voice features, suggesting that the amygdala and auditory cortex decode the affective quality of the voice not only by processing the emotional content from previously processed acoustic features, but also by processing the acoustic features themselves, when these are relevant to the identification of the voice's affective value. Specifically, we found that the auditory cortex is sensitive to spectral high-frequency voice cues when discriminating vocal anger from vocal fear and joy, whereas the amygdala is sensitive to vocal pitch when discriminating between negative vocal emotions (i.e., anger and fear). Vocal pitch is an instantaneously recognized voice feature, which is potentially transferred to the amygdala by direct subcortical projections. These results together provide evidence that, besides the auditory cortex, the amygdala too processes acoustic information, when this is relevant to the discrimination of auditory emotions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Interactive telemedicine solution based on a secure mHealth application.

    PubMed

    Eldeib, Ayman M

    2014-01-01

    In dynamic healthcare environments, caregivers and patients are constantly moving. To increase healthcare quality when necessary, caregivers need the ability to reach each other and securely access medical information and services from wherever they happen to be. This paper presents an Interactive Telemedicine Solution (ITS) to facilitate and automate communication within a healthcare facility via Voice over Internet Protocol (VOIP), regular mobile phones, and Wi-Fi connectivity. Our system has the capability to securely exchange and provide healthcare information and services across geographic barriers through a 3G/4G wireless communication network. Our system assumes the availability of an Electronic Health Record (EHR) system locally in the healthcare organization and/or on a cloud network, such as a nation-wide EHR system. This paper demonstrates the potential of our system to provide an effective and secure remote healthcare solution.

  6. Motorcycle Start-stop System based on Intelligent Biometric Voice Recognition

    NASA Astrophysics Data System (ADS)

    Winda, A.; E Byan, W. R.; Sofyan; Armansyah; Zariantin, D. L.; Josep, B. G.

    2017-03-01

    The mechanical key currently used on motorcycles is prone to burglary, theft, and misplacement. Intelligent biometric voice recognition is proposed as an alternative to replace this mechanism. The proposed system decides whether the voice belongs to the user and whether the word uttered is ‘On’ or ‘Off’; the decision is sent to an Arduino in order to start or stop the engine. The recorded voice is processed to extract features that are later used as input to the proposed system. The Mel-Frequency Cepstral Coefficient (MFCC) technique is adopted for feature extraction, and the extracted features are used as input to an SVM-based identifier. Experimental results confirm the effectiveness of the proposed intelligent voice recognition and word recognition system: the proposed method produces good training and testing accuracy, 99.31% and 99.43%, respectively. Moreover, the proposed system shows a false rejection rate (FRR) of 0.18% and a false acceptance rate (FAR) of 17.58%. For intelligent word recognition, the training and testing accuracy are 100% and 96.3%, respectively.
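    The pipeline the abstract describes (MFCC features feeding a speaker identifier) can be sketched as follows. The filterbank sizes, the synthetic sine-wave "voices", and the nearest-centroid decision rule standing in for the SVM are all illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    def mfcc(signal, sr=16000, n_fft=512, n_mels=26, n_ceps=13):
        # Frame the signal into 25 ms windows with a 10 ms hop, then window
        frame_len, hop = int(0.025 * sr), int(0.010 * sr)
        n_frames = 1 + (len(signal) - frame_len) // hop
        frames = np.stack([signal[i * hop:i * hop + frame_len]
                           for i in range(n_frames)]) * np.hamming(frame_len)
        # Power spectrum of each frame
        power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
        # Triangular mel filterbank, equally spaced on the mel scale
        hz2mel = lambda f: 2595 * np.log10(1 + f / 700)
        mel2hz = lambda m: 700 * (10 ** (m / 2595) - 1)
        bins = np.floor((n_fft + 1) * mel2hz(
            np.linspace(hz2mel(0), hz2mel(sr / 2), n_mels + 2)) / sr).astype(int)
        fbank = np.zeros((n_mels, n_fft // 2 + 1))
        for m in range(1, n_mels + 1):
            lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
            fbank[m - 1, lo:c] = (np.arange(lo, c) - lo) / max(c - lo, 1)
            fbank[m - 1, c:hi] = (hi - np.arange(c, hi)) / max(hi - c, 1)
        # Log mel energies, then DCT-II to decorrelate into cepstral coefficients
        logmel = np.log(power @ fbank.T + 1e-10)
        n = np.arange(n_mels)
        dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
        return logmel @ dct.T

    sr = 16000
    t = np.arange(sr) / sr                      # one second of audio
    enrolled = np.sin(2 * np.pi * 120 * t)      # stand-in for the enrolled user
    impostor = np.sin(2 * np.pi * 240 * t)      # stand-in for another speaker
    probe = np.sin(2 * np.pi * 125 * t)         # utterance to verify

    feat = {name: mfcc(sig).mean(axis=0)        # utterance-level MFCC vector
            for name, sig in [("enrolled", enrolled),
                              ("impostor", impostor), ("probe", probe)]}
    # Nearest-centroid decision as a stand-in for the paper's SVM identifier
    accept = (np.linalg.norm(feat["probe"] - feat["enrolled"])
              < np.linalg.norm(feat["probe"] - feat["impostor"]))
    print("engine command accepted" if accept else "rejected")
    ```

    In the actual system the decision would additionally gate on whether the recognized word is ‘On’ or ‘Off’ before signaling the Arduino.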

  7. Network Speech Systems Technology Program

    NASA Astrophysics Data System (ADS)

    Weinstein, C. J.

    1980-09-01

    This report documents work performed during FY 1980 on the DCA-sponsored Network Speech Systems Technology Program. The areas of work reported are: (1) communication systems studies in Demand-Assignment Multiple Access (DAMA), voice/data integration, and adaptive routing, in support of the evolving Defense Communications System (DCS) and Defense Switched Network (DSN); (2) a satellite/terrestrial integration design study including the functional design of voice and data interfaces to interconnect terrestrial and satellite network subsystems; and (3) voice-conferencing efforts dealing with support of the Secure Voice and Graphics Conferencing (SVGC) Test and Evaluation Program. Progress in definition and planning of experiments for the Experimental Integrated Switched Network (EISN) is detailed separately in an FY 80 Experiment Plan Supplement.

  8. How well does voice interaction work in space?

    NASA Technical Reports Server (NTRS)

    Morris, Randy B.; Whitmore, Mihriban; Adam, Susan C.

    1993-01-01

    The methods and results of an evaluation of the Voice Navigator software package are discussed. The first, or ground, phase of the study consisted of creating ("training") computer voice files of specific commands by repeating each of six commands eight times. The files were then tested for recognition accuracy by the software aboard the microgravity aircraft. During the second phase, both voice training and testing were performed in microgravity. In-flight training was done because of problems encountered in phase one that were believed to be caused by ambient noise levels. Both quantitative and qualitative data were collected. Only one of the commands was found to offer consistently high recognition rates across subjects during the second phase.

  9. Comparisons of voice onset time for trained male singers and male nonsingers during speaking and singing.

    PubMed

    McCrea, Christopher R; Morris, Richard J

    2005-09-01

    This study was designed to examine the temporal acoustic differences between male trained singers and nonsingers during speaking and singing across voiced and voiceless English stop consonants. Recordings were made of 5 trained singers and 5 nonsingers, and acoustically analyzed for voice onset time (VOT). A mixed analysis of variance showed that the male trained singers had significantly longer mean VOT than did the nonsingers during voiceless stop production. Sung productions of voiceless stops had significantly longer mean VOTs than did the spoken productions. No significant differences were observed for the voiced stops, nor were any interactions observed. These results indicated that vocal training and phonatory task have a significant influence on VOT.

  10. The Effect of Hydration on the Voice Quality of Future Professional Vocal Performers.

    PubMed

    van Wyk, Liezl; Cloete, Mariaan; Hattingh, Danel; van der Linde, Jeannie; Geertsema, Salome

    2017-01-01

    The application of systemic hydration as an instrument for optimal voice quality has been a common practice among professional voice users for years. Although the physiological action has been determined, the benefits for acoustic and perceptual characteristics are relatively unknown. The present study aimed to determine whether systemic hydration has beneficial outcomes on the voice quality of future professional voice users. A within-subject, pretest-posttest design was applied to obtain quantitative results for female singing students between 18 and 32 years of age without a history of voice pathology. Acoustic and perceptual data were collected before and after a 2-hour singing rehearsal. The difference between the hypohydrated condition (control) and the hydrated condition (experimental), and the relationship between adequate hydration and acoustic and perceptual parameters of voice, were then investigated. A statistically significant (P = 0.041) increase in jitter values was obtained for the hypohydrated condition. Increased maximum phonation time (MPT /z/) and a higher maximum frequency under hydration indicated further statistically significant changes in voice quality (P = 0.028 and P = 0.015, respectively). Systemic hydration has positive outcomes for perceptual and acoustic parameters of voice quality in future professional singers. The singer's ability to sustain notes longer and reach higher frequencies may reflect well in performances. Any positive change in voice quality may benefit the singer's occupational success and subsequently their social, emotional, and vocational well-being. More research evidence is needed to determine the parameters for implementing adequate hydration in vocal hygiene programs. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
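    Jitter, the acoustic measure reported above, quantifies cycle-to-cycle variation in glottal period. A minimal sketch of the common "local jitter" definition, using hypothetical period sequences (illustrative numbers, not the study's data):

    ```python
    import numpy as np

    def jitter_local(periods):
        """Local jitter (%): mean absolute difference between consecutive
        glottal cycle periods, divided by the mean period."""
        periods = np.asarray(periods, dtype=float)
        return 100 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    # Hypothetical glottal period sequences (seconds), not the study's measurements
    hypohydrated = [0.0050, 0.0052, 0.0049, 0.0053, 0.0050]
    hydrated = [0.0050, 0.0050, 0.0051, 0.0050, 0.0050]
    print(round(jitter_local(hypohydrated), 2), round(jitter_local(hydrated), 2))
    ```

    The larger period-to-period swings in the hypohydrated sequence yield the higher jitter value, mirroring the direction of the study's finding.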

  11. Processing of voices in deafness rehabilitation by auditory brainstem implant.

    PubMed

    Coez, Arnaud; Zilbovicius, Monica; Ferrary, Evelyne; Bouccara, Didier; Mosnier, Isabelle; Ambert-Dahan, Emmanuèle; Kalamarides, Michel; Bizaguet, Eric; Syrota, André; Samson, Yves; Sterkers, Olivier

    2009-10-01

    The superior temporal sulcus (STS) is specifically involved in processing the human voice. Profound acquired deafness from post-meningitis ossified cochlea and from bilateral vestibular schwannoma in neurofibromatosis type 2 patients are two indications for auditory brainstem implantation (ABI). To objectively measure cortical voice processing in a group of ABI patients, we studied activation of the human temporal voice areas (TVA) by H(2)(15)O PET in a group of implanted deaf adults (n=7) with more than two years of auditory brainstem implant experience and an average intelligibility score of 17%+/-17 [mean+/-SD]. Relative cerebral blood flow (rCBF) was measured in three conditions: during silence, during passive listening to human voice stimuli, and during passive listening to non-voice stimuli. Compared to silence, the activations induced by voice and non-voice stimuli were bilaterally located in the superior temporal regions. However, compared to non-voice stimuli, the voice stimuli did not induce specific supplementary activation of the TVA along the STS. Comparison of the ABI group with a normal-hearing control group (n=7) showed that TVA activations were significantly enhanced in the control group. ABI allowed the transmission of sound stimuli to temporal brain regions but failed to transmit the specific cues of the human voice to the TVA. Moreover, during the silent condition, visual brain regions showed higher rCBF in the ABI group, whereas temporal brain regions showed higher rCBF in the control group. ABI patients had consequently developed enhanced visual strategies to keep interacting with their environment.

  12. Man-machine interface requirements - advanced technology

    NASA Technical Reports Server (NTRS)

    Remington, R. W.; Wiener, E. L.

    1984-01-01

    Research issues and areas are identified where increased understanding of the human operator and the interaction between the operator and the avionics could lead to improvements in the performance of current and proposed helicopters. Both current and advanced helicopter systems and avionics are considered. Areas critical to man-machine interface requirements include: (1) artificial intelligence; (2) visual displays; (3) voice technology; (4) cockpit integration; and (5) pilot work loads and performance.

  13. Whose Voice Is It Anyway? Hushing and Hearing "Voices" in Speech and Language Therapy Interactions with People with Chronic Schizophrenia

    ERIC Educational Resources Information Center

    Walsh, Irene P.

    2008-01-01

    Background: Some people with schizophrenia are considered to have communication difficulties because of concomitant language impairment and/or because of suppressed or "unusual" communication skills due to the often-chronic nature and manifestation of the illness process. Conversations with a person with schizophrenia pose many pragmatic…

  14. Evaluation of a voice recognition system for the MOTAS pseudo pilot station function

    NASA Technical Reports Server (NTRS)

    Houck, J. A.

    1982-01-01

    The Langley Research Center has undertaken a technology development activity to provide a capability, the Mission Oriented Terminal Area Simulation (MOTAS), in which terminal area and aircraft systems studies can be performed. An experiment was conducted to evaluate state-of-the-art voice recognition technology, specifically the Threshold 600 voice recognition system, as an aircraft control input device for the MOTAS pseudo pilot station function. The results of the experiment, using ten subjects, showed a recognition error of 3.67 percent for a 48-word vocabulary tested against a programmed vocabulary of 103 words. After the ten subjects retrained the Threshold 600 system on the words that were misrecognized or rejected, the recognition error decreased to 1.96 percent. The rejection rates in both cases were less than 0.70 percent. Based on these results, voice recognition technology, and specifically the Threshold 600 voice recognition system, was chosen to fulfill this MOTAS function.
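    The reported percentages are simple proportions of recognition events over a session. A sketch with hypothetical tallies, chosen so the arithmetic reproduces the 3.67-percent figure (the counts themselves are not from the report):

    ```python
    # Hypothetical session tallies; only the resulting percentages match the record
    attempts = 600                     # total spoken command inputs
    misrecognized = 22                 # matched to the wrong vocabulary word
    rejected = 4                       # no vocabulary word matched at all

    recognition_error = 100 * misrecognized / attempts
    rejection_rate = 100 * rejected / attempts
    print(f"error {recognition_error:.2f}%, rejection {rejection_rate:.2f}%")
    ```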

  15. Monitoring daily affective symptoms and memory function using interactive voice response in outpatients receiving electroconvulsive therapy.

    PubMed

    Fazzino, Tera L; Rabinowitz, Terry; Althoff, Robert R; Helzer, John E

    2013-12-01

    Recently, there has been a gradual shift from inpatient-only electroconvulsive therapy (ECT) toward outpatient administration. Potential advantages include convenience and reduced cost, but providers do not have the same opportunity to monitor treatment response and adverse effects as they do with inpatients. This can obviate some of the potential advantages of outpatient ECT, such as tailoring treatment intervals to clinical response; scheduling is typically algorithmic rather than empirically based. Daily monitoring through automated telephone interactive voice response (IVR) is a potential solution to this quandary. To test the feasibility of clinical monitoring via IVR, we recruited 26 patients (69% female; mean age, 51 years) receiving outpatient ECT to make daily IVR reports of affective symptoms and subjective memory for 60 days. The IVR also administered a daily word recognition task to test objective memory. Every seventh day, a longer weekly IVR interview included questions about suicidal ideation. Overall daily call compliance was high (mean, 80%). Most participants (96%) did not consider the calls to be time-consuming. Longitudinal regression analysis using generalized estimating equations revealed that participants' objective memory functioning significantly improved during the study (P < 0.05). Of 123 weekly IVR interviews, 41 reports (33%) in 14 patients endorsed suicidal ideation during the previous week. IVR monitoring of outpatient ECT can provide more detailed clinical information than standard outpatient ECT assessment, offering providers a comprehensive, longitudinal picture of patient treatment response and adverse effects as a basis for treatment scheduling and ongoing clinical management.
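    A daily IVR assessment of this kind is essentially a scripted prompt-and-validate call flow. A minimal sketch follows; the prompts, the 1-9 keypad scale, and the field names are hypothetical, not the study's instrument.

    ```python
    # Minimal sketch of a daily IVR symptom check (hypothetical instrument).
    PROMPTS = [
        ("mood", "After the tone, rate today's mood from 1 (worst) to 9 (best)."),
        ("memory", "Rate today's memory from 1 (worst) to 9 (best)."),
    ]

    def run_daily_call(keypad_replies):
        """Simulate one automated call: play each prompt, validate the reply."""
        report = {}
        for field, prompt in PROMPTS:
            digit = keypad_replies[field]   # in production: the captured DTMF digit
            if not 1 <= digit <= 9:
                raise ValueError(f"invalid keypad entry for {field!r}: {digit}")
            report[field] = digit
        return report

    print(run_daily_call({"mood": 3, "memory": 7}))
    ```

    In a deployed system the validated report would be written to a longitudinal store, with out-of-range or missed entries triggering a re-prompt or a compliance flag.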

  16. Developmental trends in the interaction between auditory and linguistic processing.

    PubMed

    Jerger, S; Pirozzolo, F; Jerger, J; Elizondo, R; Desai, S; Wright, E; Reynosa, R

    1993-09-01

    The developmental course of multidimensional speech processing was examined in 80 children between 3 and 6 years of age and in 60 adults between 20 and 86 years of age. Processing interactions were assessed with a speeded classification task (Garner, 1974a), which required the subjects to attend selectively to the voice dimension while ignoring the linguistic dimension, and vice versa. The children and adults exhibited both similarities and differences in the patterns of processing dependencies. For all ages, performance on each dimension was slower in the presence of variation in the irrelevant dimension, and irrelevant variation in the voice dimension disrupted performance more than irrelevant variation in the linguistic dimension. Trends in the degree of interference, on the other hand, showed significant differences between dimensions as a function of age. Whereas the degree of interference when the voice dimension was relevant did not show significant age-related change, the degree of interference when the word dimension was relevant declined significantly with age in both a linear and a quadratic manner. A major age-related change in the relation between the dimensions was that word processing, relative to voice-gender processing, required significantly more time in the children than in the adults. Overall, the developmental course of multidimensional speech processing showed more pronounced change when the linguistic dimension, rather than the voice dimension, was relevant.
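    In the Garner speeded classification paradigm, interference is the slowing of reaction time when the irrelevant dimension varies (orthogonal block) relative to when it is held constant (control block). A sketch with hypothetical reaction times, not the study's data:

    ```python
    import numpy as np

    # Hypothetical reaction times (ms) for one listener classifying voice gender
    # while the word dimension is held constant (control) or varies irrelevantly
    # (orthogonal).  Numbers are illustrative only.
    rt_control = np.array([640.0, 655.0, 630.0, 648.0, 660.0])
    rt_orthogonal = np.array([690.0, 705.0, 684.0, 712.0, 698.0])

    # Garner interference: the slowing caused by irrelevant variation
    interference = rt_orthogonal.mean() - rt_control.mean()
    print(f"interference = {interference:.1f} ms")
    ```

    The study's developmental comparison amounts to computing this difference per subject for each relevant dimension and testing how it changes with age.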

  17. Short-Term Effect of Two Semi-Occluded Vocal Tract Training Programs on the Vocal Quality of Future Occupational Voice Users: "Resonant Voice Training Using Nasal Consonants" Versus "Straw Phonation".

    PubMed

    Meerschman, Iris; Van Lierde, Kristiane; Peeters, Karen; Meersman, Eline; Claeys, Sofie; D'haeseleer, Evelien

    2017-09-18

    The purpose of this study was to determine the short-term effect of 2 semi-occluded vocal tract training programs, "resonant voice training using nasal consonants" versus "straw phonation," on the vocal quality of vocally healthy future occupational voice users. A multigroup pretest-posttest randomized control group design was used. Thirty healthy speech-language pathology students with a mean age of 19 years (range: 17-22 years) were randomly assigned into a resonant voice training group (practicing resonant exercises across 6 weeks, n = 10), a straw phonation group (practicing straw phonation across 6 weeks, n = 10), or a control group (receiving no voice training, n = 10). A voice assessment protocol consisting of both subjective (questionnaire, participant's self-report, auditory-perceptual evaluation) and objective (maximum performance task, aerodynamic assessment, voice range profile, acoustic analysis, acoustic voice quality index, dysphonia severity index) measurements and determinations was used to evaluate the participants' voice pre- and posttraining. Groups were compared over time using linear mixed models and generalized linear mixed models. Within-group effects of time were determined using post hoc pairwise comparisons. No significant time × group interactions were found for any of the outcome measures, indicating no differences in evolution over time among the 3 groups. Within-group effects of time showed a significant improvement in dysphonia severity index in the resonant voice training group, and a significant improvement in the intensity range in the straw phonation group. Results suggest that the semi-occluded vocal tract training programs using resonant voice training and straw phonation may have a positive impact on the vocal quality and vocal capacities of future occupational voice users. The resonant voice training caused an improved dysphonia severity index, and the straw phonation training caused an expansion of the intensity range in this population.

  18. Quantitative evaluation of the voice range profile in patients with voice disorder.

    PubMed

    Ikeda, Y; Masuda, T; Manako, H; Yamashita, H; Yamamoto, T; Komiyama, S

    1999-01-01

    In 1953, Calvet first displayed the fundamental frequency (pitch) and sound pressure level (intensity) of the voice on a two-dimensional plane, creating the voice range profile. This profile has been used clinically to evaluate various vocal disorders, although such evaluations have to date been subjective, without quantitative assessment. In the present study, a quantitative system was developed to evaluate the voice range profile using a personal computer. The area of the voice range profile was defined as the voice volume. This volume was analyzed in 137 males and 175 females who were treated for various dysphonias at Kyushu University between 1984 and 1990; ten normal subjects served as controls. The voice volume in cases with voice disorders was significantly decreased irrespective of disease and sex. Furthermore, cases with better improvement after treatment showed a tendency for the voice volume to increase. These findings indicate that the voice volume is a useful clinical measure for evaluating vocal function in cases with voice disorders.
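    The "voice volume" above is the area enclosed by the profile on the pitch-intensity plane. A sketch of that area computation with hypothetical minimum and maximum intensity contours (illustrative numbers, not measured data):

    ```python
    import numpy as np

    # Hypothetical voice range profile: at each semitone of the pitch range,
    # the softest and loudest sustainable intensity (dB SPL).
    semitones = np.arange(30, 60)                  # 30-semitone pitch range
    i_min = 55 + 0.2 * (semitones - 30)            # lower (softest) contour
    i_max = 75 + 0.5 * (semitones - 30)            # upper (loudest) contour

    # "Voice volume": area between the contours on the pitch-intensity plane,
    # accumulated with the trapezoid rule (unit spacing of one semitone)
    width = i_max - i_min
    voice_volume = (width[:-1] + width[1:]).sum() / 2
    print(voice_volume)                            # semitone*dB units
    ```

    A shrunken profile, as in dysphonia, narrows the gap between the contours (or the pitch range itself) and reduces this area accordingly.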

  19. Effects of emotional and perceptual-motor stress on a voice recognition system's accuracy: An applied investigation

    NASA Astrophysics Data System (ADS)

    Poock, G. K.; Martin, B. J.

    1984-02-01

    This was an applied investigation examining the ability of a speech recognition system to recognize speakers' inputs when the speakers were under different stress levels. Subjects were asked to speak to a voice recognition system under three conditions: (1) normal office environment, (2) emotional stress, and (3) perceptual-motor stress. Results indicate a definite relationship between voice recognition system performance and the type of low stress reference patterns used to achieve recognition.

  20. Remote voice training: A case study on space shuttle applications, appendix C

    NASA Technical Reports Server (NTRS)

    Mollakarimi, Cindy; Hamid, Tamin

    1990-01-01

    The Tile Automation System includes applications of automation and robotics technology to all aspects of the Shuttle tile processing and inspection system. An integrated set of rapid prototyping testbeds was developed which include speech recognition and synthesis, laser imaging systems, distributed Ada programming environments, distributed relational data base architectures, distributed computer network architectures, multi-media workbenches, and human factors considerations. Remote voice training in the Tile Automation System is discussed. The user is prompted over a headset by synthesized speech for the training sequences. The voice recognition units and the voice output units are remote from the user and are connected by Ethernet to the main computer system. A supervisory channel is used to monitor the training sequences. Discussions include the training approaches as well as the human factors problems and solutions for this system utilizing remote training techniques.

  1. Speaking in Character: Voice Communication in Virtual Worlds

    NASA Astrophysics Data System (ADS)

    Wadley, Greg; Gibbs, Martin R.

    This chapter summarizes 5 years of research on the implications of introducing voice communication systems to virtual worlds. Voice introduces both benefits and problems for players of fast-paced team games, from better coordination of groups and greater social presence of fellow players on the positive side, to negative features such as channel congestion, transmission of noise, and an unwillingness by some to use voice with strangers online. Similarly, in non-game worlds like Second Life, issues related to identity and impression management play important roles, as voice may build greater trust that is especially important for business users, yet it erodes the anonymity and ability to conceal social attributes like gender that are important for other users. A very different mixture of problems and opportunities exists when users conduct several simultaneous conversations in multiple text and voice channels. Technical difficulties still exist with current systems, including the challenge of debugging and harmonizing all the participants' voice setups. Different groups use virtual worlds for very different purposes, so a single modality may not suit all.

  2. Building VoiceXML-Based Applications

    DTIC Science & Technology

    2002-01-01

basketball games. The Busline systems were primarily developed using an early implementation of VoiceXML; the NBA Update Line was developed using VoiceXML...traveling in and out of Pittsburgh's university neighborhood. The second project is the NBA Update Line, which provides callers with real-time information about the NBA ... NBA UPDATE LINE: The target user of this system is a fairly knowledgeable basketball fan; the system must therefore be able to provide detailed

  3. DTO-675: Voice Control of the Closed Circuit Television System

    NASA Technical Reports Server (NTRS)

    Salazar, George; Gaston, Darilyn M.; Haynes, Dena S.

    1996-01-01

    This report presents the results of the Detail Test Object (DTO)-675 "Voice Control of the Closed Circuit Television (CCTV)" system. The DTO is a follow-on flight of the Voice Command System (VCS) that flew as a secondary payload on STS-41. Several design changes were made to the VCS for the STS-78 mission. This report discusses those design changes, the data collected during the mission, recognition problems encountered, and findings.

  4. Voice stress analysis and evaluation

    NASA Astrophysics Data System (ADS)

    Haddad, Darren M.; Ratley, Roy J.

    2001-02-01

Voice Stress Analysis (VSA) systems are marketed as computer-based systems capable of measuring stress in a person's voice as an indicator of deception. They are advertised as being less expensive, easier to use, less invasive in use, and less constrained in their operation than polygraph technology. The National Institute of Justice has asked the Air Force Research Laboratory for assistance in evaluating voice stress analysis technology. Law enforcement officials have also been asking questions about this technology. If VSA technology proves to be effective, its value for military and law enforcement applications is tremendous.

  5. Multi-Agent Flight Simulation with Robust Situation Generation

    NASA Technical Reports Server (NTRS)

    Johnson, Eric N.; Hansman, R. John, Jr.

    1994-01-01

    A robust situation generation architecture has been developed that generates multi-agent situations for human subjects. An implementation of this architecture was developed to support flight simulation tests of air transport cockpit systems. This system maneuvers pseudo-aircraft relative to the human subject's aircraft, generating specific situations for the subject to respond to. These pseudo-aircraft maneuver within reasonable performance constraints, interact in a realistic manner, and make pre-recorded voice radio communications. Use of this system minimizes the need for human experimenters to control the pseudo-agents and provides consistent interactions between the subject and the pseudo-agents. The achieved robustness of this system to typical variations in the subject's flight path was explored. It was found to successfully generate specific situations within the performance limitations of the subject-aircraft, pseudo-aircraft, and the script used.

  6. Academic voice: On feminism, presence, and objectivity in writing.

    PubMed

    Mitchell, Kim M

    2017-10-01

Academic voice is an oft-discussed, yet variably defined concept, and confusion exists over its meaning, evaluation, and interpretation. This paper will explore perspectives on academic voice and counterarguments to the positivist origins of objectivity in academic writing. While many epistemological and methodological perspectives exist, the feminist literature on voice is explored here as the contrary position. From the feminist perspective, voice is a socially constructed concept that cannot be separated from the experiences, emotions, and identity of the writer and, thus, constitutes a reflection of an author's way of knowing. A case study of how author presence can enhance meaning in text is included. Subjective experience is imperative to a practice involving human interaction. Nursing practice, our intimate involvement in patients' lives, and the nature of our research are not value free. A view is presented that a visible presence of an author in academic writing is relevant to the nursing discipline. The continued valuing of an objective, colorless academic voice has consequences for student writers and the faculty who teach them. Thus, a strategically used multivoiced writing style is warranted. © 2017 John Wiley & Sons Ltd.

  7. Intra- and Inter-database Study for Arabic, English, and German Databases: Do Conventional Speech Features Detect Voice Pathology?

    PubMed

    Ali, Zulfiqar; Alsulaiman, Mansour; Muhammad, Ghulam; Elamvazuthi, Irraivan; Al-Nasheri, Ahmed; Mesallam, Tamer A; Farahat, Mohamed; Malki, Khalid H

    2017-05-01

A large population around the world has voice complications. Various approaches for subjective and objective evaluation have been suggested in the literature. The subjective approach strongly depends on the experience and area of expertise of a clinician, and human error cannot be neglected. On the other hand, the objective or automatic approach is noninvasive. Automated systems can provide complementary information that may be helpful for a clinician in the early screening of a voice disorder. At the same time, automatic systems can be deployed in remote areas where a general practitioner can use them and may refer the patient to a specialist to avoid complications that may be life threatening. Many automatic systems for disorder detection have been developed by applying different types of conventional speech features such as the linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCCs). This study aims to ascertain whether conventional speech features detect voice pathology reliably, and whether they can be correlated with voice quality. To investigate this, an automatic detection system based on MFCCs was developed, and three different voice disorder databases were used in this study. The experimental results suggest that the accuracy of the MFCC-based system varies from database to database. The detection rate for the intra-database experiments ranges from 72% to 95%, and that for the inter-database experiments from 47% to 82%. The results indicate that conventional speech features are not correlated with voice quality, and hence are not reliable in pathology detection. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
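MFCCs, the feature on which the detection system above is based, start from a triangular filterbank spaced evenly on the mel scale. The abstract gives no configuration, so the sketch below is the standard textbook construction; all parameter values are illustrative defaults, not the study's settings:

```python
import math

# Mel-scale conversions and a triangular mel filterbank: the first stage of
# MFCC extraction. Parameter values are illustrative, not from the study.
def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # filter center frequencies are spaced evenly on the mel scale
    lo_mel, hi_mel = hz_to_mel(0.0), hz_to_mel(sr / 2.0)
    mels = [lo_mel + i * (hi_mel - lo_mel) / (n_filters + 1)
            for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sr) for m in mels]
    fb = [[0.0] * (n_fft // 2 + 1) for _ in range(n_filters)]
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge of the triangle
            fb[i - 1][k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            fb[i - 1][k] = (right - k) / (right - center)
    return fb
```

Multiplying a frame's power spectrum by each filterbank row, taking logs, and applying a DCT then yields the MFCC vector used by such detection systems.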

  8. Performance, Accuracy, Data Delivery, and Feedback Methods in Order Selection: A Comparison of Voice, Handheld, and Paper Technologies

    ERIC Educational Resources Information Center

    Ludwig, Timothy D.; Goomas, David T.

    2007-01-01

    Field study was conducted in auto-parts after-market distribution centers where selectors used handheld computers to receive instructions and feedback about their product selection process. A wireless voice-interaction technology was then implemented in a multiple baseline fashion across three departments of a warehouse (N = 14) and was associated…

  9. Listening to Voices at the Educational Frontline: New Administrators' Experiences of the Transition from Teacher to Vice-Principal

    ERIC Educational Resources Information Center

    Armstrong, Denise E.

    2015-01-01

    This qualitative study explored the transition from teaching to administration through the voices of four novice vice-principals. An integrative approach was used to capture the interaction between new vice-principals, their external contexts, and the resulting leadership outcomes. The data revealed that in spite of these new administrators'…

  10. Voice/Data Integration in Mobile Radio Networks: Overview and Future Research Directions

    DTIC Science & Technology

    1989-09-30

degradation in interactive speech when delays are less than about 300 ms (Gold 1977; Gitman and Frank, 1978). When delays are larger (between 300 ms and 1.5...222-267. Gitman, I. and H. Frank (1978), "Economic Analysis of Integrated Voice and Data Networks: A Case Study," Proc. IEEE 66, 1549-1570. Glynn, P.W

  11. On Pitch Lowering Not Linked to Voicing: Nguni and Shona Group Depressors

    ERIC Educational Resources Information Center

    Downing, Laura J.

    2009-01-01

    This paper tests how well two theories of tone-segment interactions account for the lowering effect of so-called depressor consonants on tone in languages of the Shona and Nguni groups of Southern Bantu. I show that single source theories, which propose that pitch lowering is inextricably linked to consonant voicing, as they are reflexes of the…

  12. Elements of Collaborative Discussion and Shared Problem Solving in a Voice-Enhanced Multiplayer Game

    ERIC Educational Resources Information Center

    Bluemink, Johanna; Jarvela, Sanna

    2011-01-01

    This study focuses on investigating the nature of small-group collaborative interaction in a voice-enhanced multiplayer game called "eScape". The aim was to analyse the elements of groups' collaborative discussion and to explore the nature of the players' shared problem solving activity during the solution critical moments in the game. The data…

  13. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2002-01-01

    Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

  14. Performance of wavelet analysis and neural networks for pathological voices identification

    NASA Astrophysics Data System (ADS)

    Salhi, Lotfi; Talbi, Mourad; Abid, Sabeur; Cherif, Adnane

    2011-09-01

Within the medical environment, diverse techniques exist to assess the state of a patient's voice. The inspection technique is inconvenient for a number of reasons, such as its high cost, the duration of the inspection, and above all, the fact that it is an invasive technique. This study focuses on a robust, rapid and accurate system for the automatic identification of pathological voices. This system employs a non-invasive, inexpensive and fully automated method based on a hybrid approach: wavelet transform analysis and a neural network classifier. First, we present the results obtained in our previous study using classic feature parameters. These results allow visual identification of pathological voices. Second, quantified parameters derived from the wavelet analysis are proposed to characterise the speech sample. In addition, a system of multilayer neural networks (MNNs) has been developed which carries out the automatic detection of pathological voices. The developed method was evaluated using a voice database composed of recorded voice samples (continuous speech) from normophonic or dysphonic speakers. The dysphonic speakers were patients of the National Hospital 'RABTA' of Tunis, Tunisia and a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for discrimination of normal and pathological voices using the proposed parameters and neural network classifier. We also compared the average classification rate based on the MNN, Gaussian mixture model, and support vector machines.
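The hybrid approach above pairs wavelet-derived features with a neural-network classifier. The wavelet family and the exact feature set are not specified in the abstract, so the sketch below only illustrates the general idea with a Haar decomposition, using per-subband energies as the feature vector; everything beyond that is an assumption:

```python
# One level of the (unnormalized) Haar wavelet transform: pairwise averages
# form the approximation band, pairwise half-differences the detail band.
def haar_step(x):
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    return approx, detail

# Feature vector: energy of each detail band plus the final approximation
# band, which could then be fed to a neural-network classifier.
def subband_energies(x, levels=3):
    feats = []
    for _ in range(levels):
        x, detail = haar_step(x)
        feats.append(sum(v * v for v in detail))
    feats.append(sum(v * v for v in x))
    return feats
```

A steady signal concentrates its energy in the approximation band, while irregular, rapidly varying signals (as in many dysphonic voices) push energy into the detail bands, which is what makes such features discriminative.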

  15. Automated conversation system before pediatric primary care visits: a randomized trial.

    PubMed

    Adams, William G; Phillips, Barrett D; Bacic, Janine D; Walsh, Kathleen E; Shanahan, Christopher W; Paasche-Orlow, Michael K

    2014-09-01

    Interactive voice response systems integrated with electronic health records have the potential to improve primary care by engaging parents outside clinical settings via spoken language. The objective of this study was to determine whether use of an interactive voice response system, the Personal Health Partner (PHP), before routine health care maintenance visits could improve the quality of primary care visits and be well accepted by parents and clinicians. English-speaking parents of children aged 4 months to 11 years called PHP before routine visits and were randomly assigned to groups by the system at the time of the call. Parents' spoken responses were used to provide tailored counseling and support goal setting for the upcoming visit. Data were transferred to the electronic health records for review during visits. The study occurred in an urban hospital-based pediatric primary care center. Participants were called after the visit to assess (1) comprehensiveness of screening and counseling, (2) assessment of medications and their management, and (3) parent and clinician satisfaction. PHP was able to identify and counsel in multiple areas. A total of 9.7% of parents responded to the mailed invitation. Intervention parents were more likely to report discussing important issues such as depression (42.6% vs 25.4%; P < .01) and prescription medication use (85.7% vs 72.6%; P = .04) and to report being better prepared for visits. One hundred percent of clinicians reported that PHP improved the quality of their care. Systems like PHP have the potential to improve clinical screening, counseling, and medication management. Copyright © 2014 by the American Academy of Pediatrics.

  16. [The comparative assessment of the vocal function in the professional voice users and non-occupational voice users in the late adulthood].

    PubMed

    Pavlikhin, O G; Romanenko, S G; Krasnikova, D I; Lesogorova, E V; Yakovlev, V S

The objective of the present study was to evaluate the clinical and functional condition of the voice apparatus in elderly patients and to elaborate recommendations for the prevention of disturbances of the vocal function in professional voice users. This comprehensive study involved 95 patients, including active professional voice users (n=48) and 45 non-occupational voice users, aged from 61 to 82 years, with employment histories varying from 32 to 51 years. The study was designed to obtain the voice characteristics by means of subjective auditory assessment, microlaryngoscopy, video laryngostroboscopy, determination of maximum phonation time (MPT), and computer-assisted acoustic analysis of the voice with the use of the MDVP Kay Pentax system. The level of anxiety of the patients was estimated based on the results of the HADS questionnaire. It is concluded that the majority of the disturbances of the vocal function in professional voice users are functional in nature, and that the method of neuro-muscular electrophonopedic stimulation (NMEPS) of the laryngeal muscles is the method of choice for the diagnostics of the vocal function of voice users in late adulthood. It is recommended that the professional vocal load for such subjects should not exceed 12-14 hours per week. Rational psychotherapy must constitute an important component of the system of measures intended to support the working capacity of the voice users belonging to this age group.

  17. QM/PSK Voice/Data Modem

    DOT National Transportation Integrated Search

    1976-03-01

Two Quadrature Modulation/Phase Shift Keyed (QM/PSK) Voice/Data Modem systems have been developed as part of the satellite communications hardware for advanced air traffic control systems. These systems consist of a modulator and demodulator unit whi...

  18. Investigation of air transportation technology at Princeton University, 1985

    NASA Technical Reports Server (NTRS)

    Stengel, Robert F.

    1987-01-01

    The program proceeded along five avenues during 1985. Guidance and control strategies for penetration of microbursts and wind shear, application of artificial intelligence in flight control and air traffic control systems, the use of voice recognition in the cockpit, the effects of control saturation on closed-loop stability and response of open-loop unstable aircraft, and computer aided control system design are among the topics briefly considered. Areas of investigation relate to guidance and control of commercial transports as well as general aviation aircraft. Interaction between the flight crew and automatic systems is the subject of principal concern.

  19. Space Age Training

    NASA Technical Reports Server (NTRS)

    1996-01-01

Teledyne Brown developed a computer-based interactive multimedia training system for use with the Crystal Growth Furnace in the U.S. Microgravity Laboratory-2 mission on the Space Shuttle. Teledyne Brown commercialized the system and customized it for PPG Industries Aircraft Products. The system challenges learners with role-playing scenarios and software-driven simulations engaging all the senses using text, video, animation, voice, sounds and music. The transfer of this technology to commercial industrial process training has resulted in significant improvements in effectiveness, standardization, and quality control, as well as cost reductions over the usual classroom and on-the-job training approaches.

  20. Infant face interest is associated with voice information and maternal psychological health.

    PubMed

    Taylor, Gemma; Slade, Pauline; Herbert, Jane S

    2014-11-01

Early infant interest in their mother's face is driven by an experience-based face processing system, and is associated with maternal psychological health, even within a non-clinical community sample. The present study examined the role of the voice in eliciting infants' interest in mother and stranger faces and in the association between infant face interest and maternal psychological health. Infants aged 3.5 months were shown photographs of their mother's and a stranger's face paired with an audio recording of their mother's and a stranger's voice that was either matched (e.g., mother's face and voice) or mismatched (e.g., mother's face and stranger's voice). Infants spent more time attending to the stranger's matched face and voice than to the mother's matched face and voice and the mismatched faces and voices. Thus, infants demonstrated an earlier preference for a stranger's face when given voice information than when the face is presented alone. In the present sample, maternal psychological health varied, with 56.7% of mothers reporting mild mood symptoms (depression, anxiety or stress response to childbirth). Infants of mothers reporting mild maternal mood symptoms looked longer at the faces and voices compared to infants of mothers who did not. In sum, infants' experience-based face processing system is sensitive to their mothers' psychological health and the multimodal nature of faces. Copyright © 2014 Elsevier Inc. All rights reserved.

  1. Voice loops as coordination aids in space shuttle mission control.

    PubMed

    Patterson, E S; Watts-Perotti, J; Woods, D D

    1999-01-01

    Voice loops, an auditory groupware technology, are essential coordination support tools for experienced practitioners in domains such as air traffic management, aircraft carrier operations and space shuttle mission control. They support synchronous communication on multiple channels among groups of people who are spatially distributed. In this paper, we suggest reasons for why the voice loop system is a successful medium for supporting coordination in space shuttle mission control based on over 130 hours of direct observation. Voice loops allow practitioners to listen in on relevant communications without disrupting their own activities or the activities of others. In addition, the voice loop system is structured around the mission control organization, and therefore directly supports the demands of the domain. By understanding how voice loops meet the particular demands of the mission control environment, insight can be gained for the design of groupware tools to support cooperative activity in other event-driven domains.

  2. A survey of the state-of-the-art and focused research in range systems, task 1

    NASA Technical Reports Server (NTRS)

    Omura, J. K.

    1986-01-01

    This final report presents the latest research activity in voice compression. We have designed a non-real time simulation system that is implemented around the IBM-PC where the IBM-PC is used as a speech work station for data acquisition and analysis of voice samples. A real-time implementation is also proposed. This real-time Voice Compression Board (VCB) is built around the Texas Instruments TMS-3220. The voice compression algorithm investigated here was described in an earlier report titled, Low Cost Voice Compression for Mobile Digital Radios, by the author. We will assume the reader is familiar with the voice compression algorithm discussed in this report. The VCB compresses speech waveforms at data rates ranging from 4.8 K bps to 16 K bps. This board interfaces to the IBM-PC 8-bit bus, and plugs into a single expansion slot on the mother board.

  3. Voice loops as coordination aids in space shuttle mission control

    NASA Technical Reports Server (NTRS)

    Patterson, E. S.; Watts-Perotti, J.; Woods, D. D.

    1999-01-01

    Voice loops, an auditory groupware technology, are essential coordination support tools for experienced practitioners in domains such as air traffic management, aircraft carrier operations and space shuttle mission control. They support synchronous communication on multiple channels among groups of people who are spatially distributed. In this paper, we suggest reasons for why the voice loop system is a successful medium for supporting coordination in space shuttle mission control based on over 130 hours of direct observation. Voice loops allow practitioners to listen in on relevant communications without disrupting their own activities or the activities of others. In addition, the voice loop system is structured around the mission control organization, and therefore directly supports the demands of the domain. By understanding how voice loops meet the particular demands of the mission control environment, insight can be gained for the design of groupware tools to support cooperative activity in other event-driven domains.

  4. Interactions between observer and stimuli fertility status: Endocrine and perceptual responses to intrasexual vocal fertility cues.

    PubMed

    Ostrander, Grant M; Pipitone, R Nathan; Shoup-Knox, Melanie L

    2018-02-01

Both men and women find female voices more attractive at higher-fertility times in the menstrual cycle, suggesting the voice is a cue to fertility and/or hormonal status. Preference for fertile females' voices provides males with an obvious reproductive advantage; however, the advantage for female listeners is less clear. One possibility is that attention to the fertility status of potential rivals may enable women to enhance their own reproductive strategies through intrasexual competition. If so, the response to hearing high-fertility voices should include hormonal changes that promote competitive behavior. Furthermore, attention and response to such cues should vary as a function of the observer's own fertility, which influences her ability to compete for mates. The current study monitored variation in cortisol and testosterone levels in response to evaluating the attractiveness of the voices of other women. All 33 participants completed this task once during ovulation and again during the luteal phase. The voice stimuli were recorded from naturally cycling women at both high and low fertility, and from women using hormonal birth control. We found that listeners rated high-fertility voices as more attractive than low-fertility voices, with the effect being stronger when listeners were ovulating. Testosterone was elevated following voice ratings, suggesting threat detection or the anticipation of competition, but no stress response was found. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Twenty-Channel Voice Response System

    DOT National Transportation Integrated Search

    1981-06-01

    This report documents the design and implementation of a Voice Response System, which provides Direct-User Access to the FAA's aviation-weather data base. This system supports 20 independent audio channels, and as of this report, speaks three weather...

  6. Conceptual Sound System Design for Clifford Odets' "GOLDEN BOY"

    NASA Astrophysics Data System (ADS)

    Yang, Yen Chun

There are two different aspects to the process of sound design, "Arts" and "Science". In my opinion, the sound design should engage both aspects strongly and in interaction with each other. I started the process of designing the sound for GOLDEN BOY by building the city soundscape of New York City in 1937. The scenic design for this piece is in the round, putting the audience all around the stage; this gave me a great opportunity to use surround and spatialization techniques to transform the space into a different sonic world. My spatialization design is composed of two subsystems -- one is the four (4) speakers of the center cluster diffusing towards the four (4) sections of the audience, and the other is the four (4) speakers on the four (4) corners of the theatre. The outside ring provides rich sound source localization and the inside ring provides more support for control of the spatialization details. In my design four (4) lavalier microphones are hung under the center iron cage from the four (4) corners of the stage. Each microphone is ten (10) feet above the stage. The signal from each microphone is sent to the two (2) center speakers in the cluster diagonally opposite the microphone. With appropriate level adjustment of the microphones, the audience will not notice the amplification of the voices; however, through my spatialization system, the presence and location of the voices of all actors are clearly preserved for the whole audience. With such vocal reinforcement provided by the microphones, I no longer need to worry about the dialogue on stage being overwhelmed by the underscoring. A successful sound system design should not only provide a functional system, but also take responsibility for bringing the actors' voices to the audience and engaging the audience with the world that we create on stage.
By designing a system which reinforces the actors' voices while at the same time providing control over the localization and movement of sound effects, I was able not only to make the text present and clear for the audience, but also to support the storyline strongly through my composed music, environmental soundscapes, and underscoring.

  7. Human voice quality measurement in noisy environments.

    PubMed

    Ueng, Shyh-Kuang; Luo, Cheng-Ming; Tsai, Tsung-Yu; Yeh, Hsuan-Chen

    2015-01-01

Computerized acoustic voice measurement is essential for the diagnosis of vocal pathologies. Previous studies showed that ambient noises have significant influences on the accuracy of voice quality assessment. This paper presents a voice quality assessment system that can accurately measure qualities of voice signals, even though the input voice data are contaminated by low-frequency noises. The ambient noises in our living rooms and laboratories are collected and the frequencies of these noises are analyzed. Based on the analysis, a filter is designed to reduce the noise level of the input voice signal. Then, improved numerical algorithms are employed to extract voice parameters from the voice signal to reveal the health of the voice signal. Compared with MDVP and Praat, the proposed method outperforms these two widely used programs in measuring fundamental frequency and harmonic-to-noise ratio, and its performance in computing jitter and shimmer is comparable to theirs. The proposed voice quality assessment method is resistant to low-frequency noises and can measure human voice quality in environments filled with noise from air-conditioners, ceiling fans, and the cooling fans of computers.
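Two of the parameters the system above computes, jitter and shimmer, are simple cycle-to-cycle perturbation statistics. The paper's "improved numerical algorithms" are not described in the abstract, so the following is only a minimal sketch of the standard local variants, assuming the glottal cycle periods and per-cycle peak amplitudes have already been extracted from the signal:

```python
# Local jitter and shimmer: cycle-to-cycle perturbation of period and peak
# amplitude. Assumes glottal cycle periods (seconds) and per-cycle peak
# amplitudes were extracted upstream; names are illustrative.
def local_jitter(periods):
    # mean absolute difference between consecutive periods,
    # normalized by the mean period
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    # the same statistic applied to per-cycle peak amplitudes
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```

A perfectly periodic voice yields zero jitter and shimmer; values rising with irregular phonation are what make these parameters diagnostically useful, and why noise contaminating the period estimates degrades them.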

  8. Effects of Voice Coding and Speech Rate on a Synthetic Speech Display in a Telephone Information System

    DTIC Science & Technology

    1988-05-01

Figure 2. Original limited-capacity channel model (From Broadbent, 1958). Figure 3. Experimental...unlimited variety of human voices for digital recording sources. Synthesis by Analysis: analysis-synthesis methods electronically model the human voice

  9. Writing with Voice: An Investigation of the Use of a Voice Recognition System as a Writing Aid for a Man with Aphasia

    ERIC Educational Resources Information Center

    Bruce, Carolyn; Edmundson, Anne; Coleman, Michael

    2003-01-01

    Background: People with aphasia may experience difficulties that prevent them from demonstrating in writing what they know and can produce orally. Voice recognition systems that allow the user to speak into a microphone and see their words appear on a computer screen have the potential to assist written communication. Aim: This study investigated…

  10. Perceptual fluency and judgments of vocal aesthetics and stereotypicality.

    PubMed

    Babel, Molly; McGuire, Grant

    2015-05-01

    Research has shown that processing dynamics on the perceiver's end determine aesthetic pleasure. Specifically, typical objects, which are processed more fluently, are perceived as more attractive. We extend this notion of perceptual fluency to judgments of vocal aesthetics. Vocal attractiveness has traditionally been examined with respect to sexual dimorphism and the apparent size of a talker, as reconstructed from the acoustic signal, despite evidence that gender-specific speech patterns are learned social behaviors. In this study, we report on a series of three experiments using 60 voices (30 females) to compare the relationship between judgments of vocal attractiveness, stereotypicality, and gender categorization fluency. Our results indicate that attractiveness and stereotypicality are highly correlated for female and male voices. Stereotypicality and categorization fluency were also correlated for male voices, but not female voices. Crucially, stereotypicality and categorization fluency interacted to predict attractiveness, suggesting the role of perceptual fluency is present, but nuanced, in judgments of human voices. © 2014 Cognitive Science Society, Inc.

  11. Human vocal attractiveness as signaled by body size projection.

    PubMed

    Xu, Yi; Lee, Albert; Wu, Wing-Li; Liu, Xuan; Birkholz, Peter

    2013-01-01

    Voice, as a secondary sexual characteristic, is known to affect the perceived attractiveness of human individuals. But the underlying mechanism of vocal attractiveness has remained unclear. Here, we presented human listeners with acoustically altered natural sentences and fully synthetic sentences with systematically manipulated pitch, formants and voice quality based on a principle of body size projection reported for animal calls and emotional human vocal expressions. The results show that male listeners preferred a female voice that signals a small body size, with relatively high pitch, wide formant dispersion and breathy voice, while female listeners preferred a male voice that signals a large body size with low pitch and narrow formant dispersion. Interestingly, however, male vocal attractiveness was also enhanced by breathiness, which presumably softened the aggressiveness associated with a large body size. These results, together with the additional finding that the same vocal dimensions also affect emotion judgment, indicate that humans still employ a vocal interaction strategy used in animal calls despite the development of complex language.

  12. Digital signal processing algorithms for automatic voice recognition

    NASA Technical Reports Server (NTRS)

    Botros, Nazeih M.

    1987-01-01

Current digital signal analysis algorithms implemented in automatic voice recognition are investigated. Automatic voice recognition is the capability of a computer to recognize and interact with verbal commands. The focus is on the digital-signal, rather than the linguistic, analysis of the speech signal. Several digital signal processing algorithms are available for voice recognition, among them Linear Predictive Coding (LPC), short-time Fourier analysis, and cepstrum analysis. Of these, LPC is the most widely used: it has a short execution time and does not require large memory storage. However, it has several limitations due to the assumptions used in its derivation. The other two algorithms are frequency-domain algorithms with fewer assumptions, but they are not widely implemented or investigated. With recent advances in digital technology, namely signal processors, these two frequency-domain algorithms may be investigated for implementation in voice recognition. This research is concerned with real-time, microprocessor-based recognition algorithms.
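    The LPC approach the abstract favors can be summarized in a few lines: fit an all-pole model by the autocorrelation method and solve it with the Levinson-Durbin recursion. A minimal pure-Python sketch (the AR(2) demo signal and model order are illustrative, not from the report):

    ```python
    import random

    def lpc(x, order):
        """Linear Predictive Coding via the autocorrelation method.
        Returns a[1..order] such that x[n] ~= -sum(a[k] * x[n-k])."""
        r = [sum(x[i] * x[i + k] for i in range(len(x) - k)) for k in range(order + 1)]
        a = [1.0] + [0.0] * order
        err = r[0]
        for i in range(1, order + 1):             # Levinson-Durbin recursion
            acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
            k = -acc / err                        # reflection coefficient
            prev = a[:]
            for j in range(1, i + 1):
                a[j] = prev[j] + k * prev[i - j]
            err *= 1.0 - k * k                    # residual prediction error
        return a[1:]

    # Demo: recover the coefficients of a known all-pole (AR) process.
    rng = random.Random(0)
    x = [0.0, 0.0]
    for _ in range(20000):
        x.append(0.75 * x[-1] - 0.5 * x[-2] + rng.gauss(0.0, 1.0))
    a = lpc(x, 2)   # close to [-0.75, 0.5]
    ```

    The short inner loops explain the abstract's point about execution time and memory: for order p, the recursion costs O(p^2) operations and stores only p + 1 autocorrelation lags.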

  13. Establishing the "Fit" between the Patient and the Therapy: The Role of Patient Gender in Selecting Psychological Therapy for Distressing Voices.

    PubMed

    Hayward, Mark; Slater, Luke; Berry, Katherine; Perona-Garcelán, Salvador

    2016-01-01

    The experience of hearing distressing voices has recently attracted much attention in the literature on psychological therapies. A new "wave" of therapies is considering voice hearing experiences within a relational framework. However, such therapies may have limited impact if they do not precisely target key psychological variables within the voice hearing experience and/or ensure there is a "fit" between the profile of the hearer and the therapy (the so-called "What works for whom" debate). Gender is one aspect of both the voice and the hearer (and the interaction between the two) that may be influential when selecting an appropriate therapy, and is an issue that has thus far received little attention within the literature. The existing literature suggests that some differences in voice hearing experience are evident between the genders. Furthermore, studies exploring interpersonal relating in men and women more generally suggest differences within intimate relationships in terms of distancing and emotionality. The current study utilized data from four published studies to explore the extent to which these gender differences in social relating may extend to relating within the voice hearing experience. The findings suggest a role for gender as a variable that can be considered when identifying an appropriate psychological therapy for a given hearer.

  14. ‘Inner voices’: the cerebral representation of emotional voice cues described in literary texts

    PubMed Central

    Kreifelts, Benjamin; Gößling-Arnold, Christina; Wertheimer, Jürgen; Wildgruber, Dirk

    2014-01-01

While non-verbal affective voice cues are generally recognized as a crucial behavioral guide in any day-to-day conversation, their role as a powerful source of information may extend well beyond close-up personal interactions to other modes of communication, such as written discourse or literature. Building on the assumption that similarities between the different ‘modes’ of voice cues may not be limited to their functional role but may also include the cerebral mechanisms engaged in the decoding process, the present functional magnetic resonance imaging study aimed at exploring brain responses associated with processing emotional voice signals described in literary texts. Emphasis was placed on evaluating ‘voice’-sensitive as well as task- and emotion-related modulations of brain activation frequently associated with the decoding of acoustic vocal cues. The findings suggest that several similarities emerge with respect to the perception of acoustic voice signals: results identify the superior temporal, lateral and medial frontal cortex as well as the posterior cingulate cortex and cerebellum as contributing to the decoding process, with similarities to acoustic voice perception reflected in a ‘voice’-cue preference of temporal voice areas as well as an emotion-related modulation of the medial frontal cortex and a task-modulated response of the lateral frontal cortex. PMID:24396008

  15. A simulation study of the effects of communication delay on air traffic control

    DOT National Transportation Integrated Search

    1990-09-01

This study was conducted to examine the impacts of voice communications delays characteristic of Voice Switching and Control System (VSCS) and satellite communications systems on air traffic system performance, controller stress and workload, a...

  16. "You Know Doctor, I Need to Tell You Something": A Discourse Analytical Study of Patients' Voices in the Medical Consultation

    ERIC Educational Resources Information Center

    Cordella, Marisa

    2004-01-01

    Most studies in the area of doctor-patient communication focus on the talk that doctors perform during the consultation, leaving under-researched the discourse developed by patients. This article deconstructs and identifies the functions and forms of the voices (i.e. specific forms of talk) that Chilean patients employ in their interactions with…

  17. Do What I Say! Voice Recognition Makes Major Advances.

    ERIC Educational Resources Information Center

    Ruley, C. Dorsey

    1994-01-01

    Explains voice recognition technology applications in the workplace, schools, and libraries. Highlights include a voice-controlled work station using the DragonDictate system that can be used with dyslexic students, converting text to speech, and converting speech to text. (LRW)

  18. Data equivalency of an interactive voice response system for home assessment of back pain and function.

    PubMed

    Shaw, William S; Verma, Santosh K

    2007-01-01

Interactive voice response (IVR) systems that collect survey data using automated, push-button telephone responses may be useful for monitoring patients' pain and function at home; however, their equivalency to other data collection methods has not been studied. The aim was to assess the data equivalency of IVR measurement of pain and function against live telephone interviewing. In a prospective cohort study, 547 working adults (66% male) with acute back pain were recruited at an initial outpatient visit and completed telephone assessments one month later to track outcomes of pain, function, treatment helpfulness and return to work. An IVR system was introduced partway through the study (after the first 227 participants) to reduce the staff time needed to contact participants by telephone during nonworking hours. Of 368 participants who were subsequently recruited and offered the IVR option, 131 (36%) used IVR, 189 (51%) were contacted by a telephone interviewer after no IVR attempt was made within five days, and 48 (13%) were lost to follow-up. Those with lower income were more likely to use IVR. Analysis of outcome measures showed that IVR respondents reported comparatively lower levels of function and less effective treatment, but not after controlling for differences due to the delay in reaching non-IVR users by telephone (mean: 35.4 versus 29.2 days). The results provided no evidence of information or selection bias associated with IVR use; however, IVR must be supplemented with other data collection options to maintain high response rates.
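    The push-button data collection such systems perform reduces to a keyed menu tree walked one DTMF press at a time. A toy sketch of the dispatch logic — all prompts, keys, and state names here are hypothetical, not the system described in the study:

    ```python
    # Toy IVR menu: map DTMF key presses to survey prompts and answers.
    # Every prompt, key, and scale value below is hypothetical.

    MENU = {
        "root": {
            "prompt": "Press 1 to rate your pain, 2 to rate your function.",
            "1": "pain", "2": "function",
        },
        "pain": {"prompt": "On a scale of 0-9, press a key to rate your pain."},
        "function": {"prompt": "On a scale of 0-9, press a key to rate your function."},
    }

    def ivr_session(keypresses):
        """Walk the menu with a sequence of key presses; return collected answers."""
        answers = {}
        state = "root"
        for key in keypresses:
            if state == "root":
                state = MENU[state].get(key, "root")  # invalid key: repeat menu
            else:
                answers[state] = int(key)             # record the 0-9 rating
                state = "root"                        # back to the main menu
        return answers

    print(ivr_session(["1", "7", "2", "4"]))  # {'pain': 7, 'function': 4}
    ```

    Because every answer is a single key press against a fixed scale, responses are machine-readable with no transcription step, which is what makes equivalency to interviewer-collected data the relevant question.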

  19. The Impacts of the Voice Change, Grade Level, and Experience on the Singing Self-Efficacy of Emerging Adolescent Males

    ERIC Educational Resources Information Center

    Fisher, Ryan A.

    2014-01-01

    The purposes of the study are to describe characteristics of the voice change in sixth-, seventh-, and eighth-grade choir students using Cooksey's voice-change classification system and to determine if the singing self-efficacy of adolescent males is affected by the voice change, grade level, and experience. Participants (N = 80) consisted of…

  20. A mobile communication system providing integrated voice/data services over power limited satellite channels

    NASA Astrophysics Data System (ADS)

    Bose, Sanjay K.; Gordon, J. J.

    The modeling and analysis of a system providing integrated voice/data services to mobile terminals over a power-limited satellite channel are discussed. The mobiles use slotted Aloha random access to send requests for channel assignments to a central station. For successful requests, the actual transmission of voice/data within a call is done using the channel assigned for this purpose by the central station. The satellite channel is assumed to be power limited. Taking into account the known burstiness of voice sources (which use a voice-activated switch), the central station overassigns channels so that the average total power is below the power limit of the satellite transponder. The performance of this model is analyzed. Certain simple, static control strategies for improving performance are also proposed.
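    The over-assignment strategy rests on simple binomial arithmetic: with a voice-activity factor well below 1, the probability that enough assigned calls talk simultaneously to exceed the transponder's power limit can be kept small even when more calls than power-limited channels are admitted. A sketch of that calculation (the activity factor and channel counts are illustrative, not from the paper):

    ```python
    from math import comb

    def overload_probability(n_calls, p_active, channel_limit):
        """P(more than channel_limit of n_calls are simultaneously active),
        each call active independently with probability p_active
        (the voice-activity factor of a voice-activated switch)."""
        return sum(
            comb(n_calls, k) * p_active**k * (1 - p_active)**(n_calls - k)
            for k in range(channel_limit + 1, n_calls + 1)
        )

    # With a 0.4 activity factor, 30 admitted calls sharing 20
    # power-limited channels overload only rarely; squeeze the same
    # 30 calls onto 12 channels and overload becomes frequent.
    print(overload_probability(30, 0.4, 20))
    print(overload_probability(30, 0.4, 12))
    ```

    The static control strategies the abstract mentions amount to picking an admission count that keeps this tail probability below a quality-of-service target.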

  1. Voice and gesture-based 3D multimedia presentation tool

    NASA Astrophysics Data System (ADS)

    Fukutake, Hiromichi; Akazawa, Yoshiaki; Okada, Yoshihiro

    2007-09-01

This paper proposes a 3D multimedia presentation tool that allows the user to interact intuitively through voice and gesture input alone, without a standard keyboard or mouse. The authors developed the system as a presentation tool for rooms equipped with a large screen, such as an exhibition room in a museum, because in such an environment voice commands and gesture pointing are preferable to a keyboard or mouse. The system was developed using IntelligentBox, a component-based 3D graphics software development system. IntelligentBox already provides various types of 3D visible, reactive functional components called boxes, e.g., a voice input component and various multimedia handling components. It also provides a dynamic data linkage mechanism called slot-connection that allows the user to develop 3D graphics applications by combining existing boxes through direct manipulation on a computer screen. Using IntelligentBox, the proposed 3D multimedia presentation tool was likewise built by combining components through direct manipulation on a computer screen. The authors had previously proposed a 3D multimedia presentation tool using a stage metaphor and its voice input interface; here the system is extended to accept gesture input in addition to voice commands. This paper explains the details of the proposed tool and especially describes its component-based voice and gesture input interfaces.

  2. An efficient protocol for providing integrated voice/data services to mobiles over power-limited satellite channels

    NASA Astrophysics Data System (ADS)

    Bose, Sanjay K.

    1991-02-01

    Various mobile satellite communication systems are being developed for providing integrated voice/data services over a shared satellite transponder which is power-limited in nature. A common strategy is to use slotted ALOHA request channels to request channel assignments for voice/data calls from a network management station. To maximize efficiency in a system with a power-limited satellite transponder, it is proposed that the bursty nature of voice sources be exploited by the NMS to 'over-assign' channels. This may cause problems of inefficiency and potential instability, as well as a degradation in the quality of service. Augmenting this with the introduction of simple state-dependent control procedures provides systems which exhibit more desirable operational features.

  3. Reliability of human-supervised formant-trajectory measurement for forensic voice comparison.

    PubMed

    Zhang, Cuiling; Morrison, Geoffrey Stewart; Ochoa, Felipe; Enzinger, Ewald

    2013-01-01

    Acoustic-phonetic approaches to forensic voice comparison often include human-supervised measurement of vowel formants, but the reliability of such measurements is a matter of concern. This study assesses the within- and between-supervisor variability of three sets of formant-trajectory measurements made by each of four human supervisors. It also assesses the validity and reliability of forensic-voice-comparison systems based on these measurements. Each supervisor's formant-trajectory system was fused with a baseline mel-frequency cepstral-coefficient system, and performance was assessed relative to the baseline system. Substantial improvements in validity were found for all supervisors' systems, but some supervisors' systems were more reliable than others.

  4. An investigation of users' attitudes, requirements and willingness to use mobile phone-based interactive voice response systems for seeking healthcare in Ghana: a qualitative study.

    PubMed

    Brinkel, J; Dako-Gyeke, P; Krämer, A; May, J; Fobil, J N

    2017-03-01

In implementing mobile health interventions, user requirements and willingness to use are among the most crucial determinants of success and have only rarely been examined in sub-Saharan Africa. This study aimed to specify the requirements of caregivers of children in order to use a symptom-based interactive voice response (IVR) system for seeking healthcare. This included (i) the investigation of attitudes towards mobile phone use and user experiences and (ii) the assessment of facilitators of and challenges to use of the IVR system. This is a population-based cross-sectional study. Four qualitative focus group discussions were conducted in peri-urban and rural towns in Shai Osudoku and Ga West district, as well as in the Tema and Accra Metropolitan Assemblies. Participants included male and female caregivers of at least one child between 0 and 10 years of age. A qualitative content analysis was conducted for data analysis. Participants showed a positive attitude towards the use of mobile phones for seeking healthcare. While no previous experience in using IVR for health information was reported, the majority of participants stated that it offers a huge advantage for improving health performance. Barriers to IVR use included concerns about costs, lack of familiarity with the technology, social barriers such as lack of human interaction, and infrastructural challenges. Participants recommended establishing a toll-free number and providing training prior to IVR system use. This study suggests that caregivers in the socio-economic environment of Ghana are interested in and willing to use mobile phone-based IVR to receive health information for child healthcare. The identified user needs should be considered by health programme implementers and policy makers to help facilitate the development and implementation of IVR systems for healthcare seeking. Copyright © 2016 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

  5. Auditory verbal hallucinations: Social, but how?

    PubMed Central

    Alderson-Day, Ben; Fernyhough, Charles

    2017-01-01

    Summary Auditory verbal hallucinations (AVH) are experiences of hearing voices in the absence of an external speaker. Standard explanatory models propose that AVH arise from misattributed verbal cognitions (i.e. inner speech), but provide little account of how heard voices often have a distinct persona and agency. Here we review the argument that AVH have important social and agent-like properties and consider how different neurocognitive approaches to AVH can account for these elements, focusing on inner speech, memory, and predictive processing. We then evaluate the possible role of separate social-cognitive processes in the development of AVH, before outlining three ways in which speech and language processes already involve socially important information, such as cues to interact with others. We propose that when these are taken into account, the social characteristics of AVH can be explained without an appeal to separate social-cognitive systems. PMID:29238264

  6. DLMS Voice Data Entry.

    DTIC Science & Technology

    1980-06-01

(Abstract text garbled in the source scan.) Recoverable fragments describe the DLMS Voice Recognition System (VRS): a block diagram (Fig. 1) showing a speech preprocessor (a TTI model 8040) and a minicomputer, together with a Data General 6026 magnetic tape unit, a display, an equipment cabinet, and a flexible disk, plus a flowchart of default behavior (Fig. 2).

  7. A unified coding strategy for processing faces and voices

    PubMed Central

    Yovel, Galit; Belin, Pascal

    2013-01-01

    Both faces and voices are rich in socially-relevant information, which humans are remarkably adept at extracting, including a person's identity, age, gender, affective state, personality, etc. Here, we review accumulating evidence from behavioral, neuropsychological, electrophysiological, and neuroimaging studies which suggest that the cognitive and neural processing mechanisms engaged by perceiving faces or voices are highly similar, despite the very different nature of their sensory input. The similarity between the two mechanisms likely facilitates the multi-modal integration of facial and vocal information during everyday social interactions. These findings emphasize a parsimonious principle of cerebral organization, where similar computational problems in different modalities are solved using similar solutions. PMID:23664703

  8. The recognition of female voice based on voice registers in singing techniques in real-time using hankel transform method and macdonald function

    NASA Astrophysics Data System (ADS)

    Meiyanti, R.; Subandi, A.; Fuqara, N.; Budiman, M. A.; Siahaan, A. P. U.

    2018-03-01

A singer doesn't just recite the lyrics of a song, but also uses particular vocal techniques to make it more beautiful. In singing technique, female singers have a more diverse set of voice registers than male singers. The human voice has many registers; those used while singing include chest voice, head voice, falsetto, and vocal fry. This research on speech recognition based on female voice registers in singing technique was built using Borland Delphi 7.0. The speech recognition process is performed on recorded voice samples and also in real time. Voice input yields weight energy values calculated using the Hankel transform method and Macdonald functions. The results showed that the accuracy of the system depends on the accuracy of the vocal technique trained and tested; the average recognition rate for voice registers reached 48.75 percent on recordings and 57 percent in real time.

  9. Emotional and Interactional Prosody across Animal Communication Systems: A Comparative Approach to the Emergence of Language

    PubMed Central

    Filippi, Piera

    2016-01-01

    Across a wide range of animal taxa, prosodic modulation of the voice can express emotional information and is used to coordinate vocal interactions between multiple individuals. Within a comparative approach to animal communication systems, I hypothesize that the ability for emotional and interactional prosody (EIP) paved the way for the evolution of linguistic prosody – and perhaps also of music, continuing to play a vital role in the acquisition of language. In support of this hypothesis, I review three research fields: (i) empirical studies on the adaptive value of EIP in non-human primates, mammals, songbirds, anurans, and insects; (ii) the beneficial effects of EIP in scaffolding language learning and social development in human infants; (iii) the cognitive relationship between linguistic prosody and the ability for music, which has often been identified as the evolutionary precursor of language. PMID:27733835

  10. Ethnographic interviews to elicit patients' reactions to an intelligent interactive telephone health behavior advisor system.

    PubMed

    Kaplan, B; Farzanfar, R; Friedman, R H

    1999-01-01

    Information technology is being used to collect data directly from patients and to provide educational information to them. Concern over patient reactions to this use of information technology is especially important in light of the debate over whether computers dehumanize patients. This study reports reactions that patient users expressed in ethnographic interviews about using a computer-based telecommunications system. The interviews were conducted as part of a larger evaluation of Telephone-Linked Care (TLC)-HealthCall, an intelligent interactive telephone advisor, that advised individuals about how to improve their health through changes in diet or exercise. Interview findings suggest that people formed personal relationships with the TLC system. These relationships ranged from feeling guilty about their diet or exercise behavior to feeling love for the voice. The findings raise system design and user interface issues as well as research and ethical questions.

  12. Voice Response System Statistics Program : Operational Handbook.

    DOT National Transportation Integrated Search

    1980-06-01

    This report documents the Voice Response System (VRS) Statistics Program developed for the preflight weather briefing VRS. It describes the VRS statistical report format and contents, the software program structure, and the program operation.

  13. Generation of surgical pathology report using a 5,000-word speech recognizer.

    PubMed

    Tischler, A S; Martin, M R

    1989-10-01

    Pressures to decrease both turnaround time and operating costs simultaneously have placed conflicting demands on traditional forms of medical transcription. The new technology of voice recognition extends the promise of enabling the pathologist or other medical professional to dictate a correct report and have it printed and/or transmitted to a database immediately. The usefulness of voice recognition systems depends on several factors, including ease of use, reliability, speed, and accuracy. These in turn depend on the general underlying design of the systems and inclusion in the systems of a specific knowledge base appropriate for each application. Development of a good knowledge base requires close collaboration between a domain expert and a knowledge engineer with expertise in voice recognition. The authors have recently completed a knowledge base for surgical pathology using the Kurzweil VoiceReport 5,000-word system.

  14. Emotional self-other voice processing in schizophrenia and its relationship with hallucinations: ERP evidence.

    PubMed

    Pinheiro, Ana P; Rezaii, Neguine; Rauber, Andréia; Nestor, Paul G; Spencer, Kevin M; Niznikiewicz, Margaret

    2017-09-01

    Abnormalities in self-other voice processing have been observed in schizophrenia, and may underlie the experience of hallucinations. More recent studies demonstrated that these impairments are enhanced for speech stimuli with negative content. Nonetheless, few studies probed the temporal dynamics of self versus nonself speech processing in schizophrenia and, particularly, the impact of semantic valence on self-other voice discrimination. In the current study, we examined these questions, and additionally probed whether impairments in these processes are associated with the experience of hallucinations. Fifteen schizophrenia patients and 16 healthy controls listened to 420 prerecorded adjectives differing in voice identity (self-generated [SGS] versus nonself speech [NSS]) and semantic valence (neutral, positive, and negative), while EEG data were recorded. The N1, P2, and late positive potential (LPP) ERP components were analyzed. ERP results revealed group differences in the interaction between voice identity and valence in the P2 and LPP components. Specifically, LPP amplitude was reduced in patients compared with healthy subjects for SGS and NSS with negative content. Further, auditory hallucinations severity was significantly predicted by LPP amplitude: the higher the SAPS "voices conversing" score, the larger the difference in LPP amplitude between negative and positive NSS. The absence of group differences in the N1 suggests that self-other voice processing abnormalities in schizophrenia are not primarily driven by disrupted sensory processing of voice acoustic information. The association between LPP amplitude and hallucination severity suggests that auditory hallucinations are associated with enhanced sustained attention to negative cues conveyed by a nonself voice. © 2017 Society for Psychophysiological Research.

  15. Voice - How humans communicate?

    PubMed

    Tiwari, Manjul; Tiwari, Maneesha

    2012-01-01

Voices are important for humans. They are the medium through which we do much of our communicating with the outside world: our ideas, of course, and also our emotions and our personality. The voice is the very emblem of the speaker, indelibly woven into the fabric of speech. In this sense, each of our utterances of spoken language carries not only its own message but also, through accent, tone of voice and habitual voice quality, an audible declaration of our membership of particular social and regional groups, of our individual physical and psychological identity, and of our momentary mood. Voices are also one of the media through which we (successfully, most of the time) recognize other humans who are important to us: members of our family, media personalities, our friends, and enemies. Although evidence from DNA analysis is potentially vastly more eloquent in its power than evidence from voices, DNA cannot talk. It cannot be recorded planning, carrying out or confessing to a crime. It cannot be so apparently directly incriminating. As will quickly become evident, voices are extremely complex things, and some of the inherent limitations of the forensic-phonetic method are in part a consequence of the interaction between their complexity and the real world in which they are used. It is one of the aims of this article to explain how this comes about. The subject still has unsolved questions, and there is no simple way to present all the information necessary to understand how voices can, or cannot, be related to their owners.

  16. Sounding the ‘Citizen–Patient’: The Politics of Voice at the Hospice des Quinze-Vingts in Post-Revolutionary Paris

    PubMed Central

    Sykes, Ingrid

    2011-01-01

    This essay explores new models of the citizen–patient by attending to the post-Revolutionary blind ‘voice’. Voice, in both a literal and figurative sense, was central to the way in which members of the Hospice des Quinze-Vingts, an institution for the blind and partially sighted, interacted with those in the community. Musical voices had been used by members to collect alms and to project the particular spiritual principle of their institution since its foundation in the thirteenth century. At the time of the Revolution, the Quinze-Vingts voice was understood by some political authorities as an exemplary call of humanity. Yet many others perceived it as deeply threatening. After 1800, productive dialogue between those in political control and Quinze-Vingts blind members broke down. Authorities attempted to silence the voice of members through the control of blind musicians and institutional management. The Quinze-Vingts blind continued to reassert their voices until around 1850, providing a powerful form of resistance to political control. The blind ‘voice’ ultimately recognised the right of the citizen–patient to dialogue with their political carers. PMID:22025797

  17. Cultural brokerage: Creating linkages between voices of lifeworld and medicine in cross-cultural clinical settings.

    PubMed

    Lo, Ming-Cheng Miriam

    2010-09-01

    Culturally competent healthcare has emerged as a policy solution to racial and ethnic health disparities in the United States. Current research indicates that patient-centered care is a central component of culturally competent healthcare, and a rich literature exists on how to elicit patients' lifeworld voices through open-ended questions, sensitive communication skills, and power-sharing interaction styles. But it remains largely unclear how doctors create linkages between cultures of medicine and lifeworld as two sets of incongruent meaning systems. Without such linkages, a doctor lacks the cultural tool to incorporate her patient's assumptions or frameworks into the voice of medicine, rendering it difficult to (at least partially) expand and transform the latter from within. This study explores how doctors perform this bridging work, conceptualized as cultural brokerage, on the job. Cultural brokerage entails mutual inclusion of different sets of schemas or frameworks with which people organize their meanings and information. Based on 24 in-depth interviews with primary care physicians in Northern California, this study inductively documents four empirical mechanisms of cultural brokerage: 'translating between health systems', 'bridging divergent images of medicine', 'establishing long-term relationships', and 'working with patients' relational networks'. Furthermore, the study argues that cultural brokerage must be understood as concrete 'cultural labor', which involves specific tasks and requires time and resources. I argue that the performance of cultural brokerage work is embedded in the institutional contexts of the clinic and therefore faces two macro-level constraints: the cultural ideology and the political economy of the American healthcare system.

  18. Using voice to create hospital progress notes: Description of a mobile application and supporting system integrated with a commercial electronic health record.

    PubMed

    Payne, Thomas H; Alonso, W David; Markiel, J Andrew; Lybarger, Kevin; White, Andrew A

    2018-01-01

    We describe the development and design of a smartphone app-based system to create inpatient progress notes using voice, commercial automatic speech recognition software, with text processing to recognize spoken voice commands and format the note, and integration with a commercial EHR. This new system fits hospital rounding workflow and was used to support a randomized clinical trial testing whether use of voice to create notes improves timeliness of note availability, note quality, and physician satisfaction with the note creation process. The system was used to create 709 notes which were placed in the corresponding patient's EHR record. The median time from pressing the Send button to appearance of the formatted note in the Inbox was 8.8 min. It was generally very reliable, accepted by physician users, and secure. This approach provides an alternative to use of keyboard and templates to create progress notes and may appeal to physicians who prefer voice to typing. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Telephony-based voice pathology assessment using automated speech analysis.

    PubMed

    Moran, Rosalyn J; Reilly, Richard B; de Chazal, Philip; Lacy, Peter D

    2006-03-01

    A system for remotely detecting vocal fold pathologies using telephone-quality speech is presented. The system uses a linear classifier, processing measurements of pitch perturbation, amplitude perturbation and harmonic-to-noise ratio derived from digitized speech recordings. Voice recordings from the Disordered Voice Database Model 4337 system were used to develop and validate the system. Results show that while a sustained phonation, recorded in a controlled environment, can be classified as normal or pathologic with accuracy of 89.1%, telephone-quality speech can be classified as normal or pathologic with an accuracy of 74.2%, using the same scheme. Amplitude perturbation features prove most robust for telephone-quality speech. The pathologic recordings were then subcategorized into four groups, comprising normal, neuromuscular pathologic, physical pathologic and mixed (neuromuscular with physical) pathologic. A separate classifier was developed for classifying the normal group from each pathologic subcategory. Results show that neuromuscular disorders could be detected remotely with an accuracy of 87%, physical abnormalities with an accuracy of 78% and mixed pathology voice with an accuracy of 61%. This study highlights the real possibility for remote detection and diagnosis of voice pathology.
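
    The perturbation measures named above have simple cycle-by-cycle definitions. The sketch below is illustrative only: the classifier weights are invented placeholders (a real system, like the one described, would fit a linear classifier on labelled recordings), but the jitter/shimmer computations follow the standard "mean absolute difference of consecutive cycles, relative to the mean" form.

```python
import numpy as np

def jitter_local(periods):
    """Mean absolute difference of consecutive pitch periods,
    relative to the mean period (in percent)."""
    periods = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def shimmer_local(amplitudes):
    """The same perturbation measure applied to cycle peak amplitudes."""
    amps = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(amps))) / np.mean(amps)

def linear_classify(features, weights, bias):
    """Linear discriminant: positive score -> 'pathologic'."""
    return "pathologic" if np.dot(features, weights) + bias > 0 else "normal"

# Synthetic pitch-period tracks: a steady voice vs. a heavily perturbed one.
steady = 0.008 + 0.00001 * np.random.default_rng(0).standard_normal(50)
perturbed = 0.008 + 0.0008 * np.random.default_rng(1).standard_normal(50)
j_steady, j_pert = jitter_local(steady), jitter_local(perturbed)

# Illustrative (invented) weights; not fitted to any real data.
w, b = np.array([1.0]), -2.0
print(linear_classify(np.array([j_steady]), w, b))  # low jitter
print(linear_classify(np.array([j_pert]), w, b))    # high jitter

# Shimmer works identically on a synthetic amplitude track.
amps = 1.0 + 0.05 * np.random.default_rng(2).standard_normal(50)
print(round(shimmer_local(amps), 2))
```

    The study's finding that amplitude perturbation (shimmer) survives telephone bandwidth best follows the same pattern: the feature definitions are identical, only the input signal chain differs.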

  20. [Applicability of voice acoustic analysis with vocal loading test to diagnostics of occupational voice diseases].

    PubMed

    Niebudek-Bogusz, Ewa; Sliwińska-Kowalska, Mariola

    2006-01-01

    An assessment of the vocal system, as a part of the medical certification of occupational diseases, should be objective and reliable. Therefore, interest in the method of acoustic voice analysis enabling objective assessment of voice parameters is still growing. The aim of the present study was to evaluate the applicability of acoustic analysis with vocal loading test to the diagnostics of occupational voice disorders. The results of acoustic voice analysis were compared using IRIS software for phoniatrics, before and after a 30-min vocal loading test in 35 female teachers with diagnosed occupational voice disorders (group I) and in 31 female teachers with functional dysphonia (group II). In group I, vocal effort produced significant abnormalities in voice acoustic parameters, compared to group II. These included significantly increased mean fundamental frequency (Fo) value (by 11 Hz) and worsened jitter, shimmer and NHR parameters. Also, the percentage of subjects showing abnormalities in voice acoustic analysis was higher in this group. Conducting voice acoustic analysis before and after the vocal loading test makes it possible to objectively confirm irreversible voice impairments in persons with work-related pathologies of the larynx, which is essential for medical certification of occupational voice diseases.

  1. Acquiring Complex Focus-Marking: Finnish 4- to 5-Year-Olds Use Prosody and Word Order in Interaction

    PubMed Central

    Arnhold, Anja; Chen, Aoju; Järvikivi, Juhani

    2016-01-01

    Using a language game to elicit short sentences in various information structural conditions, we found that Finnish 4- to 5-year-olds already exhibit a characteristic interaction between prosody and word order in marking information structure. Providing insights into the acquisition of this complex system of interactions, the production data showed interesting parallels to adult speakers of Finnish on the one hand and to children acquiring other languages on the other hand. Analyzing a total of 571 sentences produced by 16 children, we found that children rarely adjusted input word order, but did systematically avoid marked OVS order in contrastive object focus condition. Focus condition also significantly affected four prosodic parameters, f0, duration, pauses and voice quality. Differing slightly from effects displayed in adult Finnish speech, the children produced larger f0 ranges for words in contrastive focus and smaller ones for unfocused words, varied only the duration of object constituents to be longer in focus and shorter in unfocused condition, inserted more pauses before and after focused constituents and systematically modified their use of non-modal voice quality only in utterances with narrow focus. Crucially, these effects were modulated by word order. In contrast to comparable data from children acquiring Germanic languages, the present findings reflect the more central role of word order and of interactions between word order and prosody in marking information structure in Finnish. Thus, the study highlights the role of the target language in determining linguistic development. PMID:27990130

  2. Using voice input and audio feedback to enhance the reality of a virtual experience

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miner, N.E.

    1994-04-01

    Virtual Reality (VR) is a rapidly emerging technology which allows participants to experience a virtual environment through stimulation of the participant's senses. Intuitive and natural interactions with the virtual world help to create a realistic experience. Typically, a participant is immersed in a virtual environment through the use of a 3-D viewer. Realistic, computer-generated environment models and accurate tracking of a participant's view are important factors for adding realism to a virtual experience. Stimulating a participant's sense of sound and providing a natural form of communication for interacting with the virtual world are equally important. This paper discusses the advantages and importance of incorporating voice recognition and audio feedback capabilities into a virtual world experience. Various approaches and levels of complexity are discussed. Examples of the use of voice and sound are presented through the description of a research application developed in the VR laboratory at Sandia National Laboratories.

  3. Effects of mucosal loading on vocal fold vibration.

    PubMed

    Tao, Chao; Jiang, Jack J

    2009-06-01

    A chain model was proposed in this study to examine the effects of mucosal loading on vocal fold vibration. Mucosal loading was defined as the loading caused by the interaction between the vocal folds and the surrounding tissue. In the proposed model, the vocal folds and the surrounding tissue were represented by a series of oscillators connected by a coupling spring. The lumped masses, springs, and dampers of the oscillators modeled the tissue properties of mass, stiffness, and viscosity, respectively. The coupling spring exemplified the tissue interactions. By numerically solving this chain model, the effects of mucosal loading on the phonation threshold pressure, phonation instability pressure, and energy distribution in a voice production system were studied. It was found that when mucosal loading is small, phonation threshold pressure increases with the damping constant R(r), the mass constant R(m), and the coupling constant R(mu) of mucosal loading but decreases with the stiffness constant R(k). Phonation instability pressure is also related to mucosal loading. It was found that phonation instability pressure increases with the coupling constant R(mu) but decreases with the stiffness constant R(k) of mucosal loading. Therefore, it was concluded that mucosal loading directly affects voice production.
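
    The chain model described above lends itself to a compact numerical sketch. The snippet below is illustrative only: all masses, stiffnesses, damping and coupling constants are invented placeholders in arbitrary units, not the paper's values. It integrates a short chain of coupled mass-spring-damper oscillators, with oscillator 0 standing in for the vocal fold and the rest for the surrounding tissue, and reports how kinetic energy distributes along the chain.

```python
import numpy as np

# Illustrative chain of N lumped oscillators connected by coupling springs.
# Index 0: vocal fold cover; indices 1..N-1: surrounding tissue (mucosal
# loading). All parameter values are invented, in arbitrary units.
N = 5
m = np.full(N, 0.1)     # lumped masses
k = np.full(N, 80.0)    # local stiffness of each oscillator
r = np.full(N, 0.02)    # local damping of each oscillator
k_mu = 40.0             # coupling-spring stiffness between neighbours

def simulate(t_end=0.2, dt=1e-5, f_drive=1.0):
    """Semi-implicit Euler integration of the chain, with a sinusoidal
    force on oscillator 0 standing in for the driving glottal pressure."""
    x = np.zeros(N)     # displacements
    v = np.zeros(N)     # velocities
    for i in range(int(t_end / dt)):
        f = -k * x - r * v                       # local restoring + damping
        f[:-1] += k_mu * (x[1:] - x[:-1])        # pull from right neighbour
        f[1:] += k_mu * (x[:-1] - x[1:])         # pull from left neighbour
        f[0] += f_drive * np.sin(2 * np.pi * 100 * i * dt)  # driving force
        v += dt * f / m
        x += dt * v
    return x, v

x, v = simulate()
ke = 0.5 * m * v**2     # kinetic energy per oscillator at the final step
print(ke)               # energy leaking from the "fold" into the tissue
```

    In this toy setting, varying the coupling constant `k_mu` or the tissue stiffness `k` changes how much driving energy stays in oscillator 0, which is the qualitative mechanism behind the paper's phonation threshold and instability pressure results.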

  4. Effects of mucosal loading on vocal fold vibration

    NASA Astrophysics Data System (ADS)

    Tao, Chao; Jiang, Jack J.

    2009-06-01

    A chain model was proposed in this study to examine the effects of mucosal loading on vocal fold vibration. Mucosal loading was defined as the loading caused by the interaction between the vocal folds and the surrounding tissue. In the proposed model, the vocal folds and the surrounding tissue were represented by a series of oscillators connected by a coupling spring. The lumped masses, springs, and dampers of the oscillators modeled the tissue properties of mass, stiffness, and viscosity, respectively. The coupling spring exemplified the tissue interactions. By numerically solving this chain model, the effects of mucosal loading on the phonation threshold pressure, phonation instability pressure, and energy distribution in a voice production system were studied. It was found that when mucosal loading is small, phonation threshold pressure increases with the damping constant Rr, the mass constant Rm, and the coupling constant Rμ of mucosal loading but decreases with the stiffness constant Rk. Phonation instability pressure is also related to mucosal loading. It was found that phonation instability pressure increases with the coupling constant Rμ but decreases with the stiffness constant Rk of mucosal loading. Therefore, it was concluded that mucosal loading directly affects voice production.

  5. [Signs and symptoms of autonomic dysfunction in dysphonic individuals].

    PubMed

    Park, Kelly; Behlau, Mara

    2011-01-01

    To verify the occurrence of signs and symptoms of autonomic nervous system dysfunction in individuals with behavioral dysphonia, and to compare it with the results obtained by individuals without vocal complaints. Participants were 128 adult individuals aged between 14 and 74 years, divided into two groups: behavioral dysphonia (61 subjects) and without vocal complaints (67 subjects). The Protocol of Autonomic Dysfunction was administered, containing 46 questions: 22 related to the autonomic nervous system with no direct relationship to voice, 16 related to both the autonomic nervous system and voice, six non-relevant questions, and two reliability questions. There was a higher occurrence of reported neurovegetative signs in the group with behavioral dysphonia on questions related to voice, such as frequent throat clearing, frequent need to swallow, fatigability when speaking, and sore throat. On questions not directly related to voice, dysphonic individuals presented a greater occurrence of three out of 22 symptoms: gas, tinnitus and aerophagia. Both groups presented similar results on questions non-relevant to the autonomic nervous system. The reliability questions needed reformulation. Individuals with behavioral dysphonia present a higher occurrence of neurovegetative signs and symptoms, particularly those with a direct relationship to voice, indicating greater lability of the autonomic nervous system in these subjects.

  6. Evaluating iPhone recordings for acoustic voice assessment.

    PubMed

    Lin, Emily; Hornibrook, Jeremy; Ormond, Tika

    2012-01-01

    This study examined the viability of using iPhone recordings for acoustic measurements of voice quality. Acoustic measures were compared between voice signals simultaneously recorded from 11 normal speakers (6 females and 5 males) through an iPhone (model A1303, Apple, USA) and a comparison recording system. Comparisons were also conducted between the pre- and post-operative voices recorded from 10 voice patients (4 females and 6 males) through the iPhone. Participants were aged between 27 and 79 years. Measures from iPhone and comparison signals were found to be highly correlated. Findings of the effects of vowel type on the selected measures were consistent between the two recording systems and congruent with previous findings. Analysis of the patient data revealed that a selection of acoustic measures, such as vowel space area and voice perturbation measures, consistently demonstrated a positive change following phonosurgery. The present findings indicated that the iPhone device tested was useful for tracking voice changes for clinical management. Preliminary findings regarding factors such as gender and type of pathology suggest that intra-subject, instead of norm-referenced, comparisons of acoustic measures would be more useful in monitoring the progression of a voice disorder or tracking the treatment effect. Copyright © 2012 S. Karger AG, Basel.
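
    The validation step reported above, comparing acoustic measures taken through two recording chains, reduces to a Pearson correlation over paired measurements. The values below are invented for illustration, not the study's data.

```python
import numpy as np

# Hypothetical paired measurements of one acoustic measure (e.g. jitter)
# from the same utterances through two recording systems.
iphone = np.array([0.42, 0.55, 0.31, 0.48, 0.60, 0.38, 0.52, 0.45])
reference = np.array([0.40, 0.57, 0.30, 0.50, 0.58, 0.36, 0.54, 0.44])

# Pearson correlation: values near 1 mean the two systems agree.
r = np.corrcoef(iphone, reference)[0, 1]
print(f"Pearson r = {r:.3f}")
```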

  7. Prototype app for voice therapy: a peer review.

    PubMed

    Lavaissiéri, Paula; Melo, Paulo Eduardo Damasceno

    2017-03-09

    Voice speech therapy promotes changes in patients' voice-related habits and rehabilitation. Speech-language therapists use a host of materials ranging from pictures to electronic resources and computer tools as aids in this process. Mobile technology is attractive, interactive and a nearly constant feature in the daily routine of a large part of the population, and has a growing application in healthcare. To develop a prototype application for voice therapy, submit it to peer assessment, and improve the initial prototype based on these assessments. A prototype of the Q-Voz application was developed based on Apple's Human Interface Guidelines. The prototype was analyzed by seven speech therapists who work in the voice area. Improvements to the product were made based on these assessments. All features of the application were considered satisfactory by most evaluators. All evaluators found the application very useful; evaluators reported that patients would find it easier to make changes in voice behavior with the application than without it. The evaluators stated they would use this application with their patients with dysphonia in the process of rehabilitation, and that the application offers useful tools for voice self-management. Based on the suggestions provided, six improvements were made to the prototype. The prototype Q-Voz application was developed, evaluated by seven judges, and subsequently improved. All evaluators stated they would use the application with their patients undergoing rehabilitation, indicating that the Q-Voz application for mobile devices can be considered an auxiliary tool for voice speech therapy.

  8. Exploring the anatomical encoding of voice with a mathematical model of the vocal system.

    PubMed

    Assaneo, M Florencia; Sitt, Jacobo; Varoquaux, Gael; Sigman, Mariano; Cohen, Laurent; Trevisan, Marcos A

    2016-11-01

    The faculty of language depends on the interplay between the production and perception of speech sounds. A relevant open question is whether the dimensions that organize voice perception in the brain are acoustical or depend on properties of the vocal system that produced it. One of the main empirical difficulties in answering this question is to generate sounds that vary along a continuum according to the anatomical properties of the vocal apparatus that produced them. Here we use a mathematical model that offers the unique possibility of synthesizing vocal sounds by controlling a small set of anatomically based parameters. In a first stage the quality of the synthetic voice was evaluated. Using specific time traces for sub-glottal pressure and tension of the vocal folds, the synthetic voices generated perceptual responses, which are indistinguishable from those of real speech. The synthesizer was then used to investigate how the auditory cortex responds to the perception of voice depending on the anatomy of the vocal apparatus. Our fMRI results show that sounds are perceived as human vocalizations when produced by a vocal system that follows a simple relationship between the size of the vocal folds and the vocal tract. We found that these anatomical parameters encode the perceptual vocal identity (male, female, child) and show that the brain areas that respond to human speech also encode vocal identity. On the basis of these results, we propose that this low-dimensional model of the vocal system is capable of generating realistic voices and represents a novel tool to explore voice perception with a precise control of the anatomical variables that generate speech. Furthermore, the model provides an explanation of how auditory cortices encode voices in terms of the anatomical parameters of the vocal system. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. An Exploration of the Interaction between Global Education Policy Orthodoxies and National Education Practices in Cambodia, Illuminated through the Voices of Local Teacher Educators

    ERIC Educational Resources Information Center

    Courtney, Jane

    2017-01-01

    This research is based on a multi-disciplinary and multi-levelled analysis of evidence to present the case that education reform needs to be contextualised far more widely than is currently practised. It focuses on the voices of Cambodian local teacher trainers through interviews over a five-year period. Interview data is triangulated against…

  10. Vocal responses to unanticipated perturbations in voice loudness feedback: an automatic mechanism for stabilizing voice amplitude.

    PubMed

    Bauer, Jay J; Mittal, Jay; Larson, Charles R; Hain, Timothy C

    2006-04-01

    The present study tested whether subjects respond to unanticipated short perturbations in voice loudness feedback with compensatory responses in voice amplitude. The roles of stimulus magnitude (1, 3, or 6 dB SPL), stimulus direction (up vs. down), and ongoing voice amplitude level (normal vs. soft) were compared across compensations. Subjects responded to perturbations in voice loudness feedback with a compensatory change in voice amplitude 76% of the time. Mean latency of amplitude compensation was 157 ms. Mean response magnitudes were smallest for 1-dB stimulus perturbations (0.75 dB) and greatest for 6-dB conditions (0.98 dB). However, expressed as gain, responses for 1-dB perturbations were largest, approaching 1.0. Response magnitudes were larger for the soft voice amplitude condition than for the normal voice amplitude condition. A mathematical model of the audio-vocal system captured the main features of the compensations. Previous research has demonstrated that subjects can respond to an unanticipated perturbation in voice pitch feedback with an automatic compensatory response in voice fundamental frequency. Data from the present study suggest that voice loudness feedback can be used in a similar manner to monitor and stabilize voice amplitude around a desired loudness level.
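
    The gain figures quoted above follow directly from the reported means: gain is compensatory response magnitude divided by stimulus magnitude, which is why the small 1-dB perturbations show the largest gain.

```python
# Worked example using the mean response magnitudes from the abstract.
def compensation_gain(response_db, stimulus_db):
    """Gain of the audio-vocal compensation: response / |stimulus|."""
    return response_db / abs(stimulus_db)

print(compensation_gain(0.75, 1.0))  # 1-dB perturbation -> gain 0.75
print(compensation_gain(0.98, 6.0))  # 6-dB perturbation -> gain ~0.16
```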

  11. Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

    PubMed

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers' voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker's face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.

  12. Assured Information Flow Capping Architecture.

    DTIC Science & Technology

    1985-05-01

    Air Control System Deployment, ESD-TR-71-371, AD 733 584, Electronic Systems Division, AFSC, Hanscom Air Force Base, MA, November 1971. 3. I. Gitman and...H. Frank, "Economic Analysis of Integrated Voice and Data Networks: A Case Study," Proceedings of the IEEE, November 1978. 4. H. Frank and I. Gitman ... Gitman , "Study Shows Packet Switching Best for Voice Traffic, Too," Data Communications, March 1979. ___ "Economic Analysis of Integrated Voice and

  13. Task-Oriented, Naturally Elicited Speech (TONE) Database for the Force Requirements Expert System, Hawaii (FRESH)

    DTIC Science & Technology

    1988-09-01

    Group Subgroup Command and control; Computational linguistics; expert system voice recognition; man-machine interface; U.S. Government 19 Abstract...simulates the characteristics of FRESH on a smaller scale. This study assisted NOSC in developing a voice-recognition, man-machine interface that could be used with TONE and upgraded at a later date

  14. The Johns Hopkins Medical Institutions' Premise Distribution Plan

    PubMed Central

    Barta, Wendy; Buckholtz, Howard; Johnston, Mark; Lenhard, Raymond; Tolchin, Stephen; Vienne, Donald

    1987-01-01

    A Premise Distribution Plan is being developed to address the growing voice and data communications needs at Johns Hopkins Medical Institutions. More specifically, the use of a rapidly expanding Ethernet computer network and a new Integrated Services Digital Network (ISDN) Digital Centrex system must be planned to provide easy, reliable and cost-effective data and voice communications services. Existing Premise Distribution Systems are compared along with voice and data technologies which would use them.

  15. Voice disorders and mental health in teachers: a cross-sectional nationwide study.

    PubMed

    Nerrière, Eléna; Vercambre, Marie-Noël; Gilbert, Fabien; Kovess-Masféty, Viviane

    2009-10-02

    Teachers, as professional voice users, are at particular risk of voice disorders. Among contributing factors, stress and psychological tension could play a role but epidemiological data on this problem are scarce. The aim of this study was to evaluate prevalence and cofactors of voice disorders among teachers in the French National Education system, with particular attention paid to the association between voice complaint and psychological status. The source data come from an epidemiological postal survey on physical and mental health conducted in a sample of 20,099 adults (in activity or retired) selected at random from the health plan records of the national education system. Overall response rate was 53%. Of the 10,288 respondents, 3,940 were teachers in activity currently giving classes to students. In the sample of those with complete data (n = 3,646), variables associated with voice disorders were investigated using logistic regression models. Studied variables referred to demographic characteristics, socio-professional environment, psychological distress, mental health disorders (DSM-IV), and sick leave. One in two female teachers reported voice disorders (50.0%) compared to one in four males (26.0%). Those who reported voice disorders presented higher level of psychological distress. Sex- and age-adjusted odds ratios [95% confidence interval] were respectively 1.8 [1.5-2.2] for major depressive episode, 1.7 [1.3-2.2] for general anxiety disorder, and 1.6 [1.2-2.2] for phobia. A significant association between voice disorders and sick leave was also demonstrated (1.5 [1.3-1.7]). Voice disorders were frequent among French teachers. Associations with psychiatric disorders suggest that a situation may exist which is more complex than simple mechanical failure. Further longitudinal research is needed to clarify the comorbidity between voice and psychological disorders.
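
    The association measure used above is the odds ratio with a 95% confidence interval. The study reports sex- and age-adjusted ORs from logistic regression; the sketch below shows only the simpler unadjusted analogue, computed from an invented 2x2 table (the counts are hypothetical, not the study's data).

```python
import math

def odds_ratio_ci(a, b, c, d):
    """Unadjusted odds ratio with 95% CI from a 2x2 table.
    a, b: exposed with / without the outcome;
    c, d: unexposed with / without the outcome."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, lo, hi

# Hypothetical counts: voice disorder vs. major depressive episode.
or_, lo, hi = odds_ratio_ci(120, 80, 400, 480)
print(f"OR = {or_:.2f} [{lo:.2f}-{hi:.2f}]")
```

    A confidence interval whose lower bound exceeds 1, as in the abstract's 1.8 [1.5-2.2], is what marks the association as statistically significant.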

  16. Voice-enabled Knowledge Engine using Flood Ontology and Natural Language Processing

    NASA Astrophysics Data System (ADS)

    Sermet, M. Y.; Demir, I.; Krajewski, W. F.

    2015-12-01

    The Iowa Flood Information System (IFIS) is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to flood inundation maps, real-time flood conditions, flood forecasts, flood-related data, information and interactive visualizations for communities in Iowa. The IFIS is designed for use by the general public, often people with no domain knowledge and limited general science background. To improve effective communication with such an audience, we have introduced a voice-enabled knowledge engine on flood-related issues in IFIS. Instead of requiring users to navigate the many features and interfaces of the information system and web-based sources, the system provides dynamic computations based on a collection of built-in data, analysis, and methods. The IFIS Knowledge Engine connects to real-time stream gauges, in-house data sources, and analysis and visualization tools to answer natural language questions. Our goal is the systematization of data and modeling results on flood-related issues in Iowa, and to provide an interface for definitive answers to factual queries. The goal of the knowledge engine is to make all flood-related knowledge in Iowa easily accessible to everyone, and to support voice-enabled natural language input. We aim to integrate and curate all flood-related data, implement analytical and visualization tools, and make it possible to compute answers from questions. The IFIS explicitly implements analytical methods and models, as algorithms, and curates all flood-related data and resources so that all these resources are computable. The IFIS Knowledge Engine computes the answer by deriving it from its computational knowledge base. The knowledge engine processes the statement, accesses the data warehouse, runs complex database queries on the server side, and returns outputs in various formats.
    This presentation provides an overview of the IFIS Knowledge Engine, its unique information interface and functionality as an educational tool, and discusses future plans for providing knowledge on flood-related issues and resources. The IFIS Knowledge Engine provides an alternative access method to the comprehensive set of tools and data resources available in IFIS. The current implementation of the system accepts free-form input and offers voice recognition capabilities within browser and mobile applications.

  17. Voice Messaging.

    ERIC Educational Resources Information Center

    Davis, Barbara D.; Tisdale, Judy Jones; Krapels, Roberta H.

    2001-01-01

    Surveys corporate use of voice message systems by interviewing employees in four different companies. Finds that all four companies viewed their voicemail systems as a supplement to personal contact (not a replacement) and provided training, but had no formal method to assess customer satisfaction with their system. Suggests business communication…

  18. Impact of voice- and knowledge-enabled clinical reporting--US example.

    PubMed

    Bushko, Renata G; Havlicek, Penny L; Deppert, Edward; Epner, Stephen

    2002-01-01

    This study shows qualitative and quantitative estimates of the national and clinic-level impact of utilizing voice- and knowledge-enabled clinical reporting systems. Using a common-sense estimation methodology, we show that the delivery of health care can experience a dramatic improvement in four areas as a result of the broad use of voice- and knowledge-enabled clinical reporting: (1) process quality, as measured by cost savings; (2) organizational quality, as measured by compliance; (3) clinical quality, as measured by clinical outcomes; and (4) service quality, as measured by patient satisfaction. If only 15 percent of US physicians replaced transcription with modern voice-based clinical reporting methodology, about half a billion dollars could be saved. $6.7 billion could be saved annually if all medical reporting currently transcribed was handled with voice- and knowledge-enabled dictation and reporting systems.

  19. Multi-modal demands of a smartphone used to place calls and enter addresses during highway driving relative to two embedded systems.

    PubMed

    Reimer, Bryan; Mehler, Bruce; Reagan, Ian; Kidd, David; Dobres, Jonathan

    2016-12-01

    There is limited research on trade-offs in demand between manual and voice interfaces of embedded and portable technologies. Mehler et al. identified differences in driving performance, visual engagement and workload between two contrasting embedded vehicle system designs (Chevrolet MyLink and Volvo Sensus). The current study extends this work by comparing these embedded systems with a smartphone (Samsung Galaxy S4). None of the voice interfaces eliminated visual demand. Relative to placing calls manually, both embedded voice interfaces resulted in less eyes-off-road time than the smartphone. Errors were most frequent when calling contacts using the smartphone. The smartphone and MyLink allowed addresses to be entered using compound voice commands resulting in shorter eyes-off-road time compared with the menu-based Sensus but with many more errors. Driving performance and physiological measures indicated increased demand when performing secondary tasks relative to 'just driving', but were not significantly different between the smartphone and embedded systems. Practitioner Summary: The findings show that embedded system and portable device voice interfaces place fewer visual demands on the driver than manual interfaces, but they also underscore how differences in system designs can significantly affect not only the demands placed on drivers, but also the successful completion of tasks.

  20. Implicit prosody mining based on the human eye image capture technology

    NASA Astrophysics Data System (ADS)

    Gao, Pei-pei; Liu, Feng

    2013-08-01

    The technology of eye tracking has become one of the main methods of analyzing recognition issues in human-computer interaction, and human eye image capture is the key problem in eye tracking. Based on further research, a new human-computer interaction method is introduced to enrich the forms of speech synthesis. We propose a method of Implicit Prosody mining based on human eye image capture technology: parameters are extracted from images of the human eyes during reading to control and drive prosody generation in speech synthesis, establishing a prosodic model with high simulation accuracy. The duration model is a key issue for prosody generation. For the duration model, this paper puts forward a new idea for obtaining the gaze duration of the eyes during reading based on eye image capture technology, and for synchronously controlling this duration and the pronunciation duration in speech synthesis. The movement of the human eyes during reading is a comprehensive, multi-factor interactive process involving fixations, saccades, and regressions. Therefore, how to extract the appropriate information from images of the human eyes must be considered, and the gaze regularities of the eyes must be obtained as references for modeling. Based on an analysis of three current eye-movement control models and the characteristics of Implicit Prosody reading, the relative independence between the text speech-processing system and the eye-movement control system is discussed. It is shown that, under the same text-familiarity condition, the gaze duration of the eyes during reading and the internal voice pronunciation duration are synchronous. An eye gaze duration model based on the Chinese-language-level prosodic structure is presented to replace previous methods of machine learning and probability forecasting, to obtain readers' real internal reading rhythm, and to synthesize voice with personalized rhythm.
    This research enriches human-computer interactive forms and has practical significance and application prospects for disabled-assisted speech interaction. Experiments show that Implicit Prosody mining based on human eye image capture technology gives the synthesized speech more flexible expression.

  1. National Voice Response System (VRS) Implementation Plan Alternatives Study

    DOT National Transportation Integrated Search

    1979-07-01

    This study examines the alternatives available to implement a national Voice Response System (VRS) for automated preflight weather briefings and flight plan filing. Four major hardware configurations are discussed. A computerized analysis model was d...

  2. A qualitative method for analysing multivoicedness

    PubMed Central

    Aveling, Emma-Louise; Gillespie, Alex; Cornish, Flora

    2015-01-01

    ‘Multivoicedness’ and the ‘multivoiced Self’ have become important theoretical concepts guiding research. Drawing on the tradition of dialogism, the Self is conceptualised as being constituted by a multiplicity of dynamic, interacting voices. Despite the growth in literature and empirical research, there remains a paucity of established methodological tools for analysing the multivoiced Self using qualitative data. In this article, we set out a systematic, practical ‘how-to’ guide for analysing multivoicedness. Using theoretically derived tools, our three-step method comprises: identifying the voices of I-positions within the Self’s talk (or text), identifying the voices of ‘inner-Others’, and examining the dialogue and relationships between the different voices. We elaborate each step and illustrate our method using examples from a published paper in which data were analysed using this method. We conclude by offering more general principles for the use of the method and discussing potential applications. PMID:26664292

  3. Voice Recognition Software Accuracy with Second Language Speakers of English.

    ERIC Educational Resources Information Center

    Coniam, D.

    1999-01-01

    Explores the potential of the use of voice-recognition technology with second-language speakers of English. Involves the analysis of the output produced by a small group of very competent second-language subjects reading a text into the voice recognition software Dragon Systems "Dragon NaturallySpeaking." (Author/VWL)

  4. WES (Waterways Experiment Station) Communications Plan for Voice and Data

    DTIC Science & Technology

    1989-01-01

    modem on a leased line, and two wideband HDLC 56K connections not used on the Honeywell. 30. Honeywell DPS-8 configuration, as of October 1987, is as...based voice system to support additional asynchronous dial-up modem traffic. In June 1987, Dr. N. Radhakhrishnan of the WES Information Technology...voice system (PBX) and very low-speed data communications by the laboratories using 1,200/2,400-baud asynchronous modems over analog phone lines, and

  5. Emotion and attention interactions in social cognition: brain regions involved in processing anger prosody.

    PubMed

    Sander, David; Grandjean, Didier; Pourtois, Gilles; Schwartz, Sophie; Seghier, Mohamed L; Scherer, Klaus R; Vuilleumier, Patrik

    2005-12-01

    Multiple levels of processing are thought to be involved in the appraisal of emotionally relevant events, with some processes being engaged relatively independently of attention, whereas other processes may depend on attention and current task goals or context. We conducted an event-related fMRI experiment to examine how processing angry voice prosody, an affectively and socially salient signal, is modulated by voluntary attention. To manipulate attention orthogonally to emotional prosody, we used a dichotic listening paradigm in which meaningless utterances, pronounced with either angry or neutral prosody, were presented simultaneously to both ears on each trial. In two successive blocks, participants selectively attended to either the left or right ear and performed a gender-decision on the voice heard on the target side. Our results revealed a functional dissociation between different brain areas. Whereas the right amygdala and bilateral superior temporal sulcus responded to anger prosody irrespective of whether it was heard from a to-be-attended or to-be-ignored voice, the orbitofrontal cortex and the cuneus in medial occipital cortex showed greater activation to the same emotional stimuli when the angry voice was to-be-attended rather than to-be-ignored. Furthermore, regression analyses revealed a strong correlation between orbitofrontal regions and sensitivity on a behavioral inhibition scale measuring proneness to anxiety reactions. Our results underscore the importance of emotion and attention interactions in social cognition by demonstrating that multiple levels of processing are involved in the appraisal of emotionally relevant cues in voices, and by showing a modulation of some emotional responses by both the current task-demands and individual differences.

  6. Temporal signatures of processing voiceness and emotion in sound

    PubMed Central

    Gunter, Thomas C.

    2017-01-01

    This study explored the temporal course of vocal and emotional sound processing. Participants detected rare repetitions in a stimulus stream comprising neutral and surprised non-verbal exclamations and spectrally rotated control sounds. Spectral rotation preserved some acoustic and emotional properties of the vocal originals. Event-related potentials elicited to unrepeated sounds revealed effects of voiceness and emotion. Relative to non-vocal sounds, vocal sounds elicited a larger centro-parietally distributed N1. This effect was followed by greater positivity to vocal relative to non-vocal sounds beginning with the P2 and extending throughout the recording epoch (N4, late positive potential) with larger amplitudes in female than in male listeners. Emotion effects overlapped with the voiceness effects but were smaller and differed topographically. Voiceness and emotion interacted only for the late positive potential, which was greater for vocal-emotional as compared with all other sounds. Taken together, these results point to a multi-stage process in which voiceness and emotionality are represented independently before being integrated in a manner that biases responses to stimuli with socio-emotional relevance. PMID:28338796

  7. Temporal signatures of processing voiceness and emotion in sound.

    PubMed

    Schirmer, Annett; Gunter, Thomas C

    2017-06-01

    This study explored the temporal course of vocal and emotional sound processing. Participants detected rare repetitions in a stimulus stream comprising neutral and surprised non-verbal exclamations and spectrally rotated control sounds. Spectral rotation preserved some acoustic and emotional properties of the vocal originals. Event-related potentials elicited to unrepeated sounds revealed effects of voiceness and emotion. Relative to non-vocal sounds, vocal sounds elicited a larger centro-parietally distributed N1. This effect was followed by greater positivity to vocal relative to non-vocal sounds beginning with the P2 and extending throughout the recording epoch (N4, late positive potential) with larger amplitudes in female than in male listeners. Emotion effects overlapped with the voiceness effects but were smaller and differed topographically. Voiceness and emotion interacted only for the late positive potential, which was greater for vocal-emotional as compared with all other sounds. Taken together, these results point to a multi-stage process in which voiceness and emotionality are represented independently before being integrated in a manner that biases responses to stimuli with socio-emotional relevance. © The Author (2017). Published by Oxford University Press.

  8. The development of emotion perception in face and voice during infancy.

    PubMed

    Grossmann, Tobias

    2010-01-01

    Interacting with others by reading their emotional expressions is an essential social skill in humans. How this ability develops during infancy and what brain processes underpin infants' perception of emotion in different modalities are the questions dealt with in this paper. Literature review. The first part provides a systematic review of behavioral findings on infants' developing emotion-reading abilities. The second part presents a set of new electrophysiological studies that provide insights into the brain processes underlying infants' developing abilities. Throughout, evidence from unimodal (face or voice) and multimodal (face and voice) processing of emotion is considered. The implications of the reviewed findings for our understanding of developmental models of emotion processing are discussed. The reviewed infant data suggest that (a) early in development, emotion enhances the sensory processing of faces and voices, (b) infants' ability to allocate increased attentional resources to negative emotional information develops earlier in the vocal domain than in the facial domain, and (c) at least by the age of 7 months, infants reliably match and recognize emotional information across face and voice.

  9. Explaining the high voice superiority effect in polyphonic music: evidence from cortical evoked potentials and peripheral auditory models.

    PubMed

    Trainor, Laurel J; Marie, Céline; Bruce, Ian C; Bidelman, Gavin M

    2014-02-01

    Natural auditory environments contain multiple simultaneously-sounding objects and the auditory system must parse the incoming complex sound wave they collectively create into parts that represent each of these individual objects. Music often similarly requires processing of more than one voice or stream at the same time, and behavioral studies demonstrate that human listeners show a systematic perceptual bias in processing the highest voice in multi-voiced music. Here, we review studies utilizing event-related brain potentials (ERPs), which support the notions that (1) separate memory traces are formed for two simultaneous voices (even without conscious awareness) in auditory cortex and (2) adults show more robust encoding (i.e., larger ERP responses) to deviant pitches in the higher than in the lower voice, indicating better encoding of the former. Furthermore, infants also show this high-voice superiority effect, suggesting that the perceptual dominance observed across studies might result from neurophysiological characteristics of the peripheral auditory system. Although musically untrained adults show smaller responses in general than musically trained adults, both groups similarly show a more robust cortical representation of the higher than of the lower voice. Finally, years of experience playing a bass-range instrument reduces but does not reverse the high voice superiority effect, indicating that although it can be modified, it is not highly neuroplastic. Results of new modeling experiments examined the possibility that characteristics of middle-ear filtering and cochlear dynamics (e.g., suppression) reflected in auditory nerve firing patterns might account for the higher-voice superiority effect. Simulations show that both place and temporal AN coding schemes well-predict a high-voice superiority across a wide range of interval spacings and registers. 
Collectively, we infer an innate, peripheral origin for the higher-voice superiority observed in human ERP and psychophysical music listening studies. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. 33 CFR 157.136 - Two-way voice communications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... OIL IN BULK Crude Oil Washing (COW) System on Tank Vessels Design, Equipment, and Installation § 157.136 Two-way voice communications. Each tank vessel having a COW system under § 157.10(e), § 157.10a(a...

  11. Off the Shelf Cloud Robotics for the Smart Home: Empowering a Wireless Robot through Cloud Computing.

    PubMed

    Ramírez De La Pinta, Javier; Maestre Torreblanca, José María; Jurado, Isabel; Reyes De Cozar, Sergio

    2017-03-06

    In this paper, we explore the possibilities offered by the integration of home automation systems and service robots. In particular, we examine how advanced computationally expensive services can be provided by using a cloud computing approach to overcome the limitations of the hardware available at the user's home. To this end, we integrate two wireless low-cost, off-the-shelf systems in this work, namely, the service robot Rovio and the home automation system Z-wave. Cloud computing is used to enhance the capabilities of these systems so that advanced sensing and interaction services based on image processing and voice recognition can be offered.

  12. Off the Shelf Cloud Robotics for the Smart Home: Empowering a Wireless Robot through Cloud Computing

    PubMed Central

    Ramírez De La Pinta, Javier; Maestre Torreblanca, José María; Jurado, Isabel; Reyes De Cozar, Sergio

    2017-01-01

    In this paper, we explore the possibilities offered by the integration of home automation systems and service robots. In particular, we examine how advanced computationally expensive services can be provided by using a cloud computing approach to overcome the limitations of the hardware available at the user’s home. To this end, we integrate two wireless low-cost, off-the-shelf systems in this work, namely, the service robot Rovio and the home automation system Z-wave. Cloud computing is used to enhance the capabilities of these systems so that advanced sensing and interaction services based on image processing and voice recognition can be offered. PMID:28272305

  13. Voice control of the space shuttle video system

    NASA Technical Reports Server (NTRS)

    Bejczy, A. K.; Dotson, R. S.; Brown, J. W.; Lewis, J. L.

    1981-01-01

    A pilot voice control system developed at the Jet Propulsion Laboratory (JPL) to test and evaluate the feasibility of controlling the shuttle TV cameras and monitors by voice commands utilizes a commercially available discrete-word speech recognizer which can be trained to the individual utterances of each operator. Successful ground tests were conducted using a simulated full-scale space shuttle manipulator. The test configuration involved berthing, maneuvering, and deploying a simulated science payload in the shuttle bay. The handling task typically required 15 to 20 minutes and 60 to 80 commands to 4 TV cameras and 2 TV monitors. The best test runs show 96 to 100 percent voice recognition accuracy.

  14. A new VOX technique for reducing noise in voice communication systems. [voice operated keying

    NASA Technical Reports Server (NTRS)

    Morris, C. F.; Morgan, W. C.; Shack, P. E.

    1974-01-01

    A VOX technique for reducing noise in voice communication systems is described which is based on the separation of voice signals into contiguous frequency-band components with the aid of an adaptive VOX in each band. It is shown that this processing scheme can effectively reduce both wideband and narrowband quasi-periodic noise since the threshold levels readjust themselves to suppress noise that exceeds speech components in each band. Results are reported for tests of the adaptive VOX, and it is noted that improvements can still be made in such areas as the elimination of noise pulses, phoneme reproduction at high-noise levels, and the elimination of distortion introduced by phase delay.
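
    The band-wise adaptive thresholding described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the class name, the margin, and the smoothing parameters are invented for the example. Each band tracks its own noise floor and keys the channel only when that band's short-term energy rises well above the adapted floor, so thresholds readjust themselves to suppress noise in each band independently.

```python
# Illustrative sketch of a multi-band adaptive VOX (voice-operated keying) gate.
# Parameter values and the noise-floor update rule are hypothetical.

class AdaptiveBandVox:
    def __init__(self, n_bands, margin=3.0, alpha=0.9):
        self.margin = margin        # gate opens when energy > margin * floor
        self.alpha = alpha          # smoothing factor for the noise floor
        self.floors = [None] * n_bands

    def process(self, band_energies):
        """Return per-band open/closed decisions and adapt the noise floors."""
        gates = []
        for i, e in enumerate(band_energies):
            if self.floors[i] is None:
                self.floors[i] = e  # first frame is assumed to be noise
            is_open = e > self.margin * self.floors[i]
            if not is_open:
                # Adapt the floor only during (presumed) noise, so speech
                # itself does not raise the threshold.
                self.floors[i] = (self.alpha * self.floors[i]
                                  + (1 - self.alpha) * e)
            gates.append(is_open)
        return gates
```

    Feeding steady low-level noise keeps every band closed while the floors settle; a burst of speech energy in one band then opens only that band's gate.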

  15. Giving Voice to Neurologically Diverse High School Students: A Case Study Exploration of Interactions, Relationships, and Realizations through a Collaborative Drama/Life Skills Performance

    ERIC Educational Resources Information Center

    Hare, Jill L.

    2013-01-01

    This is a case study about giving voice to a neurologically diverse (ND) community of high school students in a life skills classroom. It is about their lived experiences while involved in a collaborative drama production with a regular education drama class. The research of this study was driven by the assumptions, beliefs, and philosophy based…

  16. On the opposing views of the self–nonself discrimination by the immune system

    PubMed Central

    Cohn, Melvin

    2010-01-01

    Today’s generally accepted view of the self–nonself discrimination was voiced by Miller1 in 2004 in a thought-provoking essay. In spite of its popularity, this position has its limitations, which are analyzed here with a view toward establishing an interactive discussion that hopefully will culminate in agreed upon decisive experiments. The inadequacies of Miller’s view of the self–nonself discrimination and their resolution under the associative recognition of antigen model are analyzed. PMID:19048020

  17. Indigenous women's voices: marginalization and health.

    PubMed

    Dodgson, Joan E; Struthers, Roxanne

    2005-10-01

    Marginalization may affect health care delivery. Ways in which indigenous women experienced marginalization were examined. Data from 57 indigenous women (18 to 65 years) were analyzed for themes. Three themes emerged: historical trauma as lived marginalization, biculturalism experienced as marginalization, and interacting within a complex health care system. Experienced marginalization reflected participants' unique perspective and were congruent with previous research. It is necessary for health care providers to assess the detrimental impact of marginalization on the health status of individuals and/or communities.

  18. Aircraft L-Band Balloon - Simulated Satellite Experiments Volume I: Experiment Description and Voice and Data Modem Test Results

    DOT National Transportation Integrated Search

    1975-10-01

    This report details the result of an experiment performed by the Transportation Systems Center of the Department of Transportation to evaluate candidate voice and data modulation systems for use in an L-Band Air Traffic Control System. The experiment...

  19. Dragon Stream Cipher for Secure Blackbox Cockpit Voice Recorder

    NASA Astrophysics Data System (ADS)

    Akmal, Fadira; Michrandi Nasution, Surya; Azmi, Fairuz

    2017-11-01

    An aircraft blackbox is a device used to record all aircraft information; it consists of the Flight Data Recorder (FDR) and the Cockpit Voice Recorder (CVR). The Cockpit Voice Recorder contains conversations in the aircraft during the flight. Investigations of aircraft crashes usually take a long time because it is difficult to find the aircraft blackbox, so the blackbox should have the ability to send information to other places. The aircraft blackbox must also have a data security system, since data security is a very important part of the information exchange process. The system in this research performs encryption and decryption of Cockpit Voice Recorder data for authorized parties using the Dragon stream cipher algorithm. The tests performed measure the time of data encryption and decryption and the avalanche effect. Results in this paper show encryption and decryption times of 0.85 seconds and 1.84 seconds for 30 seconds of Cockpit Voice Recorder data, with an avalanche effect of 48.67%.
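
    The avalanche-effect test reported above can be illustrated with a short sketch. The Dragon stream cipher itself is not available in the Python standard library, so SHA-256 stands in as the bit-mixing function here; the helper names and the sample message are invented for the example. A strong cipher or mixing function should flip roughly half of the output bits when a single input bit changes, which is what the measured 48.67% approximates.

```python
import hashlib

def bit_diff_fraction(a: bytes, b: bytes) -> float:
    """Fraction of differing bits between two equal-length byte strings."""
    assert len(a) == len(b)
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return diff / (8 * len(a))

def avalanche(message: bytes, bit_index: int) -> float:
    """Flip one input bit and measure the fraction of output bits that change.
    SHA-256 stands in for the Dragon keystream generator in this sketch."""
    flipped = bytearray(message)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    h1 = hashlib.sha256(message).digest()
    h2 = hashlib.sha256(bytes(flipped)).digest()
    return bit_diff_fraction(h1, h2)
```

    For a well-mixed function the returned fraction clusters tightly around 0.5.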

  20. The Acoustic Correlates of Breathy Voice: A Study of Source-Vowel Interaction.

    NASA Astrophysics Data System (ADS)

    Lin, Yeong-Fen Emily

    This thesis is the result of an investigation of the source-vowel interaction from the point of view of perception. Major objectives include the identification of the acoustic correlates of breathy voice and the disclosure of the interdependent relationship between the perception of vowel identity and breathiness. Two experiments were conducted to achieve these objectives. In the first experiment, voice samples from one control group and seven patient groups were compared. The control group consisted of five female and five male adults. The ten normals were recruited to perform a sustained vowel phonation task with constant pitch and loudness. The voice samples of seventy patients were retrieved from a hospital data base, with vowels extracted from sentences repeated by patients at their habitual pitch and loudness. The seven patient groups were divided, based on a unique combination of patients' measures on mean flow rate and glottal resistance. Eighteen acoustic variables were treated with a three-way (Gender x Group x Vowel) ANOVA. Parameters showing a significant female-male difference as well as group differences, especially those between the presumed breathy group and the other groups, were identified as relevant to the distinction of breathy voice. As a result, F1-F3 amplitude difference and slope were found to be most effective in distinguishing breathy voice. Other acoustic correlates of breathy voice included F1 bandwidth, RMS-H1 amplitude difference, and F1-F2 amplitude difference and slope. In the second experiment, a formant synthesizer was used to generate vowel stimuli with varying spectral tilt and F1 bandwidth. Thirteen native speakers of American English made dissimilarity judgements on paired stimuli in terms of vowel identity and breathiness. Listeners' perceptual vowel spaces were found to be affected by changes in the acoustic correlates of breathy voice.
The threshold of detecting a change of vocal quality in the breathiness domain was also found to be vowel-dependent.

  1. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms

    PubMed Central

    Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer. PMID:29230254

  2. A Cross-Lingual Mobile Medical Communication System Prototype for Foreigners and Subjects with Speech, Hearing, and Mental Disabilities Based on Pictograms.

    PubMed

    Wołk, Krzysztof; Wołk, Agnieszka; Glinkowski, Wojciech

    2017-01-01

    People with speech, hearing, or mental impairment require special communication assistance, especially for medical purposes. Automatic solutions for speech recognition and voice synthesis from text are poor fits for communication in the medical domain because they are dependent on error-prone statistical models. Systems dependent on manual text input are insufficient. Recently introduced systems for automatic sign language recognition are dependent on statistical models as well as on image and gesture quality. Such systems remain in early development and are based mostly on minimal hand gestures unsuitable for medical purposes. Furthermore, solutions that rely on the Internet cannot be used after disasters that require humanitarian aid. We propose a high-speed, intuitive, Internet-free, voice-free, and text-free tool suited for emergency medical communication. Our solution is a pictogram-based application that provides easy communication for individuals who have speech or hearing impairment or mental health issues that impair communication, as well as foreigners who do not speak the local language. It provides support and clarification in communication by using intuitive icons and interactive symbols that are easy to use on a mobile device. Such pictogram-based communication can be quite effective and ultimately make people's lives happier, easier, and safer.

  3. Design and development of an interactive medical teleconsultation system over the World Wide Web.

    PubMed

    Bai, J; Zhang, Y; Dai, B

    1998-06-01

    The objective of the medical teleconsultation system presented in this paper is to demonstrate the use of the World Wide Web (WWW) for telemedicine and interactive medical information exchange. The system, which is developed based on Java, could provide several basic Java tools to fulfill the requirements of medical applications, including a file manager, data tool, bulletin board, and digital audio tool. The digital audio tool uses point-to-point structure to enable two physicians to communicate directly through voice. The others use multipoint structure. The file manager manages the medical images stored in the WWW information server, which come from a hospital database. The data tool supports cooperative operations on the medical data between the participating physicians. The bulletin board enables the users to discuss special cases by writing text on the board, send their personal or group diagnostic reports on the cases, and reorganize the reports and store them in its report file for later use. The system provides a hardware-independent platform for physicians to interact with one another as well as to access medical information over the WWW.

  4. An automatic speech recognition system with speaker-independent identification support

    NASA Astrophysics Data System (ADS)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as an ARMv6 low-cost SoC, Raspberry PI. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.
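
    A minimal sketch of the voice-command glue such a stack might use, assuming the recognizer (e.g., CMU Sphinx) has already produced a text hypothesis: the decoded string is normalized and matched against a small fixed command table that maps utterances to hardware actions. The command phrases and GPIO action strings below are invented for the example and are not part of the Sphinx API.

```python
# Hypothetical command table for a low-cost voice-automation stack.
# In a real deployment these strings would be tied to the recognizer's
# grammar and to actual GPIO handlers on the board.
COMMANDS = {
    "lights on": "gpio:17:high",
    "lights off": "gpio:17:low",
    "fan on": "gpio:27:high",
    "fan off": "gpio:27:low",
}

def dispatch(hypothesis):
    """Normalize a decoded hypothesis and map it to a hardware action.
    Returns None for utterances outside the command grammar."""
    text = " ".join(hypothesis.lower().split())
    return COMMANDS.get(text)
```

    Restricting the recognizer to a small fixed grammar like this is what makes offline decoding on a low-cost SoC practical.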

  5. Presidential, But Not Prime Minister, Candidates With Lower Pitched Voices Stand a Better Chance of Winning the Election in Conservative Countries.

    PubMed

    Banai, Benjamin; Laustsen, Lasse; Banai, Irena Pavela; Bovan, Kosta

    2018-01-01

    Previous studies have shown that voters rely on sexually dimorphic traits that signal masculinity and dominance when they choose political leaders. For example, voters exert strong preferences for candidates with lower pitched voices because these candidates are perceived as stronger and more competent. Moreover, experimental studies demonstrate that conservative voters, more than liberals, prefer political candidates with traits that signal dominance, probably because conservatives are more likely to perceive the world as a threatening place and to be more attentive to dangerous and threatening contexts. In light of these findings, this study investigates whether country-level ideology influences the relationship between candidate voice pitch and electoral outcomes of real elections. Specifically, we collected voice pitch data for presidential and prime minister candidates, aggregate national ideology for the countries in which the candidates were nominated, and measures of electoral outcomes for 69 elections held across the world. In line with previous studies, we found that candidates with lower pitched voices received more votes and had greater likelihood of winning the elections. Furthermore, regression analysis revealed an interaction between candidate voice pitch, national ideology, and election type (presidential or parliamentary). That is, having a lower pitched voice was a particularly valuable asset for presidential candidates in conservative and right-leaning countries (in comparison to presidential candidates in liberal and left-leaning countries and parliamentary elections). We discuss the practical implications of these findings, and how they relate to existing research on candidates' voices, voting preferences, and democratic elections in general.

  6. Absolute Pitch: Effects of Timbre on Note-Naming Ability

    PubMed Central

    Vanzella, Patrícia; Schellenberg, E. Glenn

    2010-01-01

    Background Absolute pitch (AP) is the ability to identify or produce isolated musical tones. It is evident primarily among individuals who started music lessons in early childhood. Because AP requires memory for specific pitches as well as learned associations with verbal labels (i.e., note names), it represents a unique opportunity to study interactions in memory between linguistic and nonlinguistic information. One untested hypothesis is that the pitch of voices may be difficult for AP possessors to identify. A musician's first instrument may also affect performance and extend the sensitive period for acquiring accurate AP. Methods/Principal Findings A large sample of AP possessors was recruited on-line. Participants were required to identity test tones presented in four different timbres: piano, pure tone, natural (sung) voice, and synthesized voice. Note-naming accuracy was better for non-vocal (piano and pure tones) than for vocal (natural and synthesized voices) test tones. This difference could not be attributed solely to vibrato (pitch variation), which was more pronounced in the natural voice than in the synthesized voice. Although starting music lessons by age 7 was associated with enhanced note-naming accuracy, equivalent abilities were evident among listeners who started music lessons on piano at a later age. Conclusions/Significance Because the human voice is inextricably linked to language and meaning, it may be processed automatically by voice-specific mechanisms that interfere with note naming among AP possessors. Lessons on piano or other fixed-pitch instruments appear to enhance AP abilities and to extend the sensitive period for exposure to music in order to develop accurate AP. PMID:21085598

  7. Voice Changes in Real Speaking Situations During a Day, With and Without Vocal Loading: Assessing Call Center Operators.

    PubMed

    Ben-David, Boaz M; Icht, Michal

    2016-03-01

    Occupational-related vocal load is an increasing global problem with adverse personal and economic implications. We examined voice changes in real speaking situations during a single day, with and without vocal loading, aiming to identify an objective acoustic index for vocal load over a day. Call center operators (CCOs, n = 27) and age- and gender-matched students (n = 25) were recorded at the beginning and at the end of a day, with (CCOs) and without (students) vocal load. Speaking and reading voice samples were analyzed for fundamental frequency (F0), sound pressure level (SPL), and their variance (F0 coefficient of variation [F0 CV], SPL CV). The impact of lifestyle habits on voice changes was also estimated. The main findings revealed an interaction, with F0 rise at the end of the day for the students but not for the CCOs. We suggest that F0 rise is a typical phenomenon of a day of normal vocal use, whereas vocal loading interferes with this mechanism. In addition, different lifestyle profiles of CCOs and controls were observed, as the CCOs reported higher incidence of dehydrating behaviors (e.g., smoking, caffeine). Yet, this profile was not linked with voice changes. In sum, we suggest that F0 rise over a day can potentially serve as an index for typical voice use; its absence may point to consequent voice symptoms and complaints. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
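    The two acoustic measures this study tracks, fundamental frequency (F0) and sound pressure level (SPL), can be approximated from a short voiced frame with standard signal-processing steps. The sketch below is illustrative only and is not the analysis software used in the study; the function names and the autocorrelation peak-picking method are assumptions, and the SPL value is relative to an arbitrary reference rather than a calibrated dB SPL.

```python
import math

def estimate_f0(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate F0 of one voiced frame by picking the autocorrelation
    peak within a plausible pitch range (fmin..fmax Hz)."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove DC offset
    lag_min = int(sample_rate / fmax)        # shortest period considered
    lag_max = min(int(sample_rate / fmin), n - 1)
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return sample_rate / best_lag

def level_db(frame, ref=1.0):
    """Relative sound level from frame RMS, in dB re an arbitrary
    reference (a calibrated microphone is needed for true dB SPL)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20 * math.log10(max(rms, 1e-12) / ref)
```

Comparing per-frame F0 values from morning versus evening recordings, as the study does, then reduces to averaging these estimates over many voiced frames and computing their coefficient of variation.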

  8. Risk and protective factors for spasmodic dysphonia: a case-control investigation.

    PubMed

    Tanner, Kristine; Roy, Nelson; Merrill, Ray M; Kimber, Kamille; Sauder, Cara; Houtz, Daniel R; Doman, Darrin; Smith, Marshall E

    2011-01-01

    Spasmodic dysphonia (SD) is a chronic, incurable, and often disabling voice disorder of unknown pathogenesis. The purpose of this study was to identify possible endogenous and exogenous risk and protective factors uniquely associated with SD. Prospective, exploratory, case-control investigation. One hundred fifty patients with SD and 150 medical controls (MCs) were interviewed regarding their personal and family histories, environmental exposures, illnesses, injuries, voice use patterns, and general health using a previously vetted and validated epidemiologic questionnaire. Odds ratios and multiple logistic regression analyses (α<0.15) identified several factors that significantly increased the likelihood of having SD. These factors included (1) a personal history of mumps, blepharospasm, tremor, intense occupational and avocational voice use, and a family history of voice disorders; (2) an immediate family history of meningitis, tremor, tics, cancer, and compulsive behaviors; and (3) an extended family history of tremor and cancer. SD is likely multifactorial in etiology, involving both genetic and environmental factors. Viral infections/exposures, along with intense voice use, may trigger the onset of SD in genetically predisposed individuals. Future studies should examine the interaction among genetic and environmental factors to determine the pathogenesis of SD. Copyright © 2011 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  9. Listener Perception of Respiratory-Induced Voice Tremor

    ERIC Educational Resources Information Center

    Farinella, Kimberly A.; Hixon, Thomas J.; Hoit, Jeannette D.; Story, Brad H.; Jones, Patricia A.

    2006-01-01

    Purpose: The purpose of this study was to determine the relation of respiratory oscillation to the perception of voice tremor. Method: Forced oscillation of the respiratory system was used to simulate variations in alveolar pressure such as are characteristic of voice tremor of respiratory origin. Five healthy men served as speakers, and 6…

  10. Speech Motor Development during Acquisition of the Voicing Contrast

    ERIC Educational Resources Information Center

    Grigos, Maria I.; Saxman, John H.; Gordon, Andrew M.

    2005-01-01

    Lip and jaw movements were studied longitudinally in 19-month-old children as they acquired the voicing contrast for /p/ and /b/. A movement tracking system obtained lip and jaw kinematics as participants produced the target utterances /papa/ and /baba/. Laryngeal adjustments were also tracked through acoustically recorded voice onset time (VOT)…

  11. Wireless infrared communications for space and terrestrial applications

    NASA Technical Reports Server (NTRS)

    Crimmins, James W.

    1993-01-01

    Voice and data communications via wireless (and fiberless) optical means have been commonplace for many years. However, continuous advances in optoelectronics and microelectronics have resulted in significant advances in wireless optical communications over the last decade. Wilton has specialized in diffuse infrared voice and data communications since 1979. In 1986, NASA Johnson Space Center invited Wilton to apply its wireless telecommunications and factory floor technology to astronaut voice communications aboard the shuttle. In September 1988 a special infrared voice communications system flew aboard a 'Discovery' Shuttle mission as a flight experiment. Since then the technology has been further developed, resulting in a general-purpose 2 Mbps wireless voice/data LAN which has been tested for a variety of applications including use aboard Spacelab. Funds for Wilton's wireless IR development were provided in part by NASA's Technology Utilization Office and by the NASA Small Business Innovative Research Program. As a consequence, Wilton's commercial product capability has been significantly enhanced to include diffuse infrared wireless LANs as well as wireless infrared telecommunication systems for voice and data.

  12. Voice Controlled Wheelchair

    NASA Technical Reports Server (NTRS)

    1977-01-01

    Michael Condon, a quadriplegic from Pasadena, California, demonstrates the NASA-developed voice-controlled wheelchair and its manipulator, which can pick up packages, open doors, turn a TV knob, and perform a variety of other functions. A possible boon to paralyzed and other severely handicapped persons, the chair-manipulator system responds to 35 one-word voice commands, such as "go," "stop," "up," "down," "right," "left," "forward," "backward." The heart of the system is a voice-command analyzer which utilizes a minicomputer. Commands are taught to the computer by the patient repeating them a number of times; thereafter the analyzer recognizes commands only in the patient's particular speech pattern. The computer translates commands into electrical signals which activate appropriate motors and cause the desired motion of chair or manipulator. Based on teleoperator and robot technology for space-related programs, the voice-controlled system was developed by Jet Propulsion Laboratory under the joint sponsorship of NASA and the Veterans Administration. The wheelchair-manipulator has been tested at Rancho Los Amigos Hospital, Downey, California, and is being evaluated at the VA Prosthetics Center in New York City.

  13. Reading affect in the face and voice: neural correlates of interpreting communicative intent in children and adolescents with autism spectrum disorders.

    PubMed

    Wang, A Ting; Lee, Susan S; Sigman, Marian; Dapretto, Mirella

    2007-06-01

    Understanding a speaker's communicative intent in everyday interactions is likely to draw on cues such as facial expression and tone of voice. Prior research has shown that individuals with autism spectrum disorders (ASD) show reduced activity in brain regions that respond selectively to the face and voice. However, there is also evidence that activity in key regions can be increased if task demands allow for explicit processing of emotion. To examine the neural circuitry underlying impairments in interpreting communicative intentions in ASD using irony comprehension as a test case, and to determine whether explicit instructions to attend to facial expression and tone of voice will elicit more normative patterns of brain activity. Eighteen boys with ASD (aged 7-17 years, full-scale IQ >70) and 18 typically developing (TD) boys underwent functional magnetic resonance imaging at the Ahmanson-Lovelace Brain Mapping Center, University of California, Los Angeles. Blood oxygenation level-dependent brain activity during the presentation of short scenarios involving irony. Behavioral performance (accuracy and response time) was also recorded. Reduced activity in the medial prefrontal cortex and right superior temporal gyrus was observed in children with ASD relative to TD children during the perception of potentially ironic vs control scenarios. Importantly, a significant group x condition interaction in the medial prefrontal cortex showed that activity was modulated by explicit instructions to attend to facial expression and tone of voice only in the ASD group. Finally, medial prefrontal cortex activity was inversely related to symptom severity in children with ASD such that children with greater social impairment showed less activity in this region. 
Explicit instructions to attend to facial expression and tone of voice can elicit increased activity in the medial prefrontal cortex, part of a network important for understanding the intentions of others, in children with ASD. These findings suggest a strategy for future intervention research.

  14. A new voice rating tool for clinical practice.

    PubMed

    Gould, James; Waugh, Jessica; Carding, Paul; Drinnan, Michael

    2012-07-01

    Perceptual rating of voice quality is a key component in the comprehensive assessment of voice, but there are practical difficulties in making reliable measurements. We have developed the Newcastle Audio Ranking (NeAR) test, a new referential system for the rating of voice parameters. In this article, we present our first results using NeAR. We asked five experts and 11 naive raters to assess 15 male and 15 female voices using the NeAR test. We assessed: validity with respect to the GRBAS scale; interrater reliability; sensitivity to subtle voice differences; and the performance of expert versus naive raters. Agreement with GRBAS was uniformly excellent (r=0.87), as was interrater agreement (intraclass correlation coefficient=0.86). Considering each GRBAS grade of voice separately, there was still good interrater agreement in NeAR, implying it has good sensitivity to subtle changes. All these results were equally true for expert and naive raters. The NeAR test is a promising new tool in the assessment of voice disorders. Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  15. Speech versus manual control of camera functions during a telerobotic task

    NASA Technical Reports Server (NTRS)

    Bierschwale, John M.; Sampaio, Carlos E.; Stuart, Mark A.; Smith, Randy L.

    1989-01-01

    Voice input for control of camera functions was investigated in this study. Objectives were to (1) assess the feasibility of a voice-commanded camera control system, and (2) identify factors that differ between voice and manual control of camera functions. Subjects participated in a remote manipulation task that required extensive camera-aided viewing. Each subject was exposed to two conditions, voice and manual input, with a counterbalanced administration order. Voice input was found to be significantly slower than manual input for this task. However, in terms of remote manipulator performance errors and subject preference, there was no difference between modalities. Voice control of continuous camera functions is not recommended. It is believed that the use of voice input for discrete functions, such as multiplexing or camera switching, could aid performance. Hybrid mixes of voice and manual input may provide the best use of both modalities. This report contributes to a better understanding of the issues that affect the design of an efficient human/telerobot interface.

  16. Chair alarm for patient fall prevention based on gesture recognition and interactivity.

    PubMed

    Knight, Heather; Lee, Jae-Kyu; Ma, Hongshen

    2008-01-01

    The Gesture Recognition Interactive Technology (GRiT) Chair Alarm aims to prevent patient falls from chairs and wheelchairs by recognizing the gesture of a patient attempting to stand. Patient falls are one of the greatest causes of injury in hospitals. Current chair and bed exit alarm systems are inadequate because of insufficient notification, high false-alarm rate, and long trigger delays. The GRiT chair alarm uses an array of capacitive proximity sensors and pressure sensors to create a map of the patient's sitting position, which is then processed using gesture recognition algorithms to determine when a patient is attempting to stand and to alarm the care providers. This system also uses a range of voice and light feedback to encourage the patient to remain seated and/or to make use of the system's integrated nurse-call function. This system can be seamlessly integrated into existing hospital WiFi networks to send notifications and approximate patient location through existing nurse call systems.
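    The abstract describes a pipeline from a seat pressure map to a stand-attempt decision. As a toy illustration of that idea only (the actual GRiT system uses capacitive proximity sensing and learned gesture-recognition algorithms, not this threshold rule; all names and thresholds below are hypothetical), one could flag an attempt when total seat pressure drops sharply while the centre of pressure shifts toward the front edge:

```python
def stand_attempt(prev_map, curr_map, shift_thresh=0.15, drop_thresh=0.25):
    """Toy rule for detecting a stand-up attempt from two successive
    seat pressure maps, each a list of rows (front row first)."""
    def total(m):
        return sum(sum(row) for row in m)

    def front_bias(m):
        # 1.0 = all pressure on the front row, ~0 = all on the back row
        t = total(m)
        if t == 0:
            return 1.0                       # empty seat counts as "forward"
        weighted = sum(sum(row) * (len(m) - i) for i, row in enumerate(m))
        return weighted / (t * len(m))

    drop = (total(prev_map) - total(curr_map)) / max(total(prev_map), 1e-9)
    shift = front_bias(curr_map) - front_bias(prev_map)
    return drop > drop_thresh and shift > shift_thresh
```

A rule this simple would suffer exactly the false-alarm problem the abstract criticizes in existing exit alarms, which is the motivation for the richer sensor array and gesture-recognition approach.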

  17. Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging “periodicity-tagged” segregation of competing speech in rooms

    PubMed Central

    Sayles, Mark; Stasiak, Arkadiusz; Winter, Ian M.

    2015-01-01

    The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into “auditory objects.” Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus (VCN) of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation; specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. 
These results offer neurophysiological insights to perceptual organization of complex acoustic scenes under realistically challenging listening conditions. PMID:25628545

  18. Experiments on Analysing Voice Production: Excised (Human, Animal) and In Vivo (Animal) Approaches

    PubMed Central

    Döllinger, Michael; Kobler, James; Berry, David A.; Mehta, Daryush D.; Luegmair, Georg; Bohr, Christopher

    2015-01-01

    Experiments on human and on animal excised specimens as well as in vivo animal preparations are so far the most realistic approaches to simulate the in vivo process of human phonation. These experiments do not have the disadvantage of limited space within the neck and enable studies of the actual organ necessary for phonation, i.e., the larynx. The studies additionally allow the analysis of flow, vocal fold dynamics, and resulting acoustics in relation to well-defined laryngeal alterations. Purpose of Review This paper provides an overview of the applications and usefulness of excised (human/animal) specimen and in vivo animal experiments in voice research. These experiments have enabled visualization and analysis of dehydration effects, vocal fold scarring, bifurcation and chaotic vibrations, three-dimensional vibrations, aerodynamic effects, and mucosal wave propagation along the medial surface. Quantitative data will be shown to give an overview of measured laryngeal parameter values. As yet, a full understanding of all existing interactions in voice production has not been achieved, and thus, where possible, we try to indicate areas needing further study. Recent Findings A further motivation behind this review is to highlight recent findings and technologies related to the study of vocal fold dynamics and its applications. For example, studies of interactions between vocal tract airflow and generation of acoustics have recently shown that airflow superior to the glottis is governed by not only vocal fold dynamics but also by subglottal and supraglottal structures. In addition, promising new methods to investigate kinematics and dynamics have been reported recently, including dynamic optical coherence tomography, X-ray stroboscopy and three-dimensional reconstruction with laser projection systems. Finally, we touch on the relevance of vocal fold dynamics to clinical laryngology and to clinically-oriented research. PMID:26581597

  19. Multi-modal demands of a smartphone used to place calls and enter addresses during highway driving relative to two embedded systems

    PubMed Central

    Reimer, Bryan; Mehler, Bruce; Reagan, Ian; Kidd, David; Dobres, Jonathan

    2016-01-01

    Abstract There is limited research on trade-offs in demand between manual and voice interfaces of embedded and portable technologies. Mehler et al. identified differences in driving performance, visual engagement and workload between two contrasting embedded vehicle system designs (Chevrolet MyLink and Volvo Sensus). The current study extends this work by comparing these embedded systems with a smartphone (Samsung Galaxy S4). None of the voice interfaces eliminated visual demand. Relative to placing calls manually, both embedded voice interfaces resulted in less eyes-off-road time than the smartphone. Errors were most frequent when calling contacts using the smartphone. The smartphone and MyLink allowed addresses to be entered using compound voice commands resulting in shorter eyes-off-road time compared with the menu-based Sensus but with many more errors. Driving performance and physiological measures indicated increased demand when performing secondary tasks relative to ‘just driving’, but were not significantly different between the smartphone and embedded systems. Practitioner Summary: The findings show that embedded system and portable device voice interfaces place fewer visual demands on the driver than manual interfaces, but they also underscore how differences in system designs can significantly affect not only the demands placed on drivers, but also the successful completion of tasks. PMID:27110964

  20. Controller/Computer Interface with an Air-Ground Data Link

    DOT National Transportation Integrated Search

    1976-06-01

    This report describes the results of an experiment for evaluating the controller/computer interface in an ARTS III/M&S system modified for use with a simulated digital data link and a voice link utilizing a computer-generated voice system. A modified...

  1. Voices used by nurses when communicating with patients and relatives in a department of medicine for older people-An ethnographic study.

    PubMed

    Johnsson, Anette; Boman, Åse; Wagman, Petra; Pennbrant, Sandra

    2018-04-01

    To describe how nurses communicate with older patients and their relatives in a department of medicine for older people in western Sweden. Communication is an essential tool for nurses when working with older patients and their relatives, but often patients and relatives experience shortcomings in the communication exchanges. They may not receive information or are not treated in a professional way. Good communication can facilitate the development of a positive meeting and improve the patient's health outcome. An ethnographic design informed by the sociocultural perspective was applied. Forty participatory observations were conducted and analysed during the period October 2015-September 2016. The observations covered 135 hours of nurse-patient-relative interaction. Field notes were taken, and 40 informal field conversations with nurses and 40 with patients and relatives were carried out. Semistructured follow-up interviews were conducted with five nurses. The results showed that nurses communicate with four different voices: a medical voice described as being incomplete, task-oriented and with a disease perspective; a nursing voice described as being confirmatory, process-oriented and with a holistic perspective; a pedagogical voice described as being contextualised, comprehension-oriented and with a learning perspective; and a power voice described as being distancing and excluding. The voices can be seen as context-dependent communication approaches. When nurses switch between the voices, this indicates a shift in the orientation or situation. The results indicate that if nurses successfully combine the voices, while limiting the use of the power voice, the communication exchanges can become a more positive experience for all parties involved and a good nurse-patient-relative communication exchange can be achieved. 
Working for improved communication between nurses, patients and relatives is crucial for establishing a positive nurse-patient-relative relationship, which is a basis for improving patient care and healthcare outcomes. © 2018 John Wiley & Sons Ltd.

  2. A TDM link with channel coding and digital voice.

    NASA Technical Reports Server (NTRS)

    Jones, M. W.; Tu, K.; Harton, P. L.

    1972-01-01

    The features of a TDM (time-division multiplexed) link model are described. A PCM telemetry sequence was coded for error correction and multiplexed with a digitized voice channel. An all-digital implementation of a variable-slope delta modulation algorithm was used to digitize the voice channel. The results of extensive testing are reported. The measured coding gain and the system performance over a Gaussian channel are compared with theoretical predictions and computer simulations. Word intelligibility scores are reported as a measure of voice channel performance.
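    The variable-slope delta modulation algorithm mentioned above encodes voice as one bit per sample and adapts its step size to the signal slope: the step grows while consecutive output bits agree (slope overload) and decays otherwise. A minimal sketch of that idea, assuming a simple three-bit agreement rule and illustrative step limits rather than the parameters of the 1972 implementation:

```python
def cvsd_encode(samples, min_step=1.0, max_step=64.0):
    """Continuously variable slope delta encoder: 1 bit per sample,
    1 if the input is above the running estimate, else 0."""
    bits, estimate, step = [], 0.0, min_step
    history = []                              # last few output bits
    for x in samples:
        bit = 1 if x > estimate else 0
        bits.append(bit)
        history = (history + [bit])[-3:]
        if len(history) == 3 and len(set(history)) == 1:
            step = min(step * 2, max_step)    # three equal bits: steepen
        else:
            step = max(step / 2, min_step)    # otherwise relax the slope
        estimate += step if bit else -step
    return bits

def cvsd_decode(bits, min_step=1.0, max_step=64.0):
    """Mirror of the encoder: rebuilds the estimate from the bit stream
    using the identical step-adaptation rule."""
    out, estimate, step = [], 0.0, min_step
    history = []
    for bit in bits:
        history = (history + [bit])[-3:]
        if len(history) == 3 and len(set(history)) == 1:
            step = min(step * 2, max_step)
        else:
            step = max(step / 2, min_step)
        estimate += step if bit else -step
        out.append(estimate)
    return out
```

Because decoder and encoder share the adaptation rule, no step-size side information needs to be multiplexed into the TDM frame; only the 1-bit-per-sample stream is transmitted alongside the coded telemetry channel.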

  3. 78 FR 31972 - Notice of Proposed Information Collection for Public Comment; Request Voucher for Grant Payment...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-28

    ... request vouchers for distribution of grant funds using the automated Voice Response System (VRS). An... Information Collection for Public Comment; Request Voucher for Grant Payment and Line of Credit Control System (LOCCS) Voice Response System Access AGENCY: Office of the Chief Financial Officer, HUD. ACTION: Notice...

  4. Knowledge Discovery, Integration and Communication for Extreme Weather and Flood Resilience Using Artificial Intelligence: Flood AI Alpha

    NASA Astrophysics Data System (ADS)

    Demir, I.; Sermet, M. Y.

    2016-12-01

    Nobody is immune from extreme events or natural hazards that can lead to large-scale consequences for the nation and public. One of the solutions to reduce the impacts of extreme events is to invest in improving resilience with the ability to better prepare, plan, recover, and adapt to disasters. The National Research Council (NRC) report discusses the topic of how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data, information, and the gaps and knowledge challenges that need to be addressed, and suggests that every individual access risk and vulnerability information to make their communities more resilient. This abstract presents our project on developing a resilience framework for flooding to improve societal preparedness, with the following objectives: (a) develop a generalized ontology for extreme events with primary focus on flooding; (b) develop a knowledge engine with voice recognition, artificial intelligence, natural language processing, and inference engine. The knowledge engine will utilize the flood ontology and concepts to connect user input to relevant knowledge discovery outputs on flooding; (c) develop a data acquisition and processing framework from existing environmental observations, forecast models, and social networks. The system will utilize the framework, capabilities and user base of the Iowa Flood Information System (IFIS) to populate and test the system; (d) develop a communication framework to support user interaction and delivery of information to users. The interaction and delivery channels will include voice and text input via web-based system (e.g. IFIS), agent-based bots (e.g. Microsoft Skype, Facebook Messenger), smartphone and augmented reality applications (e.g. smart assistant), and automated web workflows (e.g. IFTTT, CloudWork) to open flood knowledge discovery to thousands of community-extensible web workflows.

  5. Collecting Self-Reported Data on Dating Abuse Perpetration From a Sample of Primarily Black and Hispanic, Urban-Residing, Young Adults: A Comparison of Timeline Followback Interview and Interactive Voice Response Methods.

    PubMed

    Rothman, Emily F; Heeren, Timothy; Winter, Michael; Dorfman, David; Baughman, Allyson; Stuart, Gregory

    2016-12-01

    Dating abuse is a prevalent and consequential public health problem. However, relatively few studies have compared methods of collecting self-report data on dating abuse perpetration. This study compares two data collection methods-(a) the Timeline Followback (TLFB) retrospective reporting method, which makes use of a written calendar to prompt respondents' recall, and (b) an interactive voice response (IVR) system, which is a prospective telephone-based database system that necessitates respondents calling in and entering data using their telephone keypads. We collected 84 days of data on young adult dating abuse perpetration using IVR from a total of 60 respondents. Of these respondents, 41 (68%) completed a TLFB retrospective report pertaining to the same 84-day period after that time period had ended. A greater number of more severe dating abuse perpetration events were reported via the IVR system. Participants who reported any dating abuse perpetration were more likely to report more frequent abuse perpetration via the IVR than the TLFB (i.e., may have minimized the number of times they perpetrated dating abuse on the TLFB). The TLFB method did not result in a tapering off of reported events past the first week as it has in prior studies, but the IVR method did result in a tapering off of reported events after approximately the sixth week. We conclude that using an IVR system for self-reports of dating abuse perpetration may not have substantial advantages over using a TLFB method, but researchers' choice of mode may vary by research question, resources, sample, and setting.

  6. Construction site Voice Operated Information System (VOIS) test

    NASA Astrophysics Data System (ADS)

    Lawrence, Debbie J.; Hettchen, William

    1991-01-01

    The Voice Activated Information System (VAIS), developed by USACERL, allows inspectors to verbally log on-site inspection reports on a hand held tape recorder. The tape is later processed by the VAIS, which enters the information into the system's database and produces a written report. The Voice Operated Information System (VOIS), developed by USACERL and Automated Sciences Group, through a USACERL cooperative research and development agreement (CRDA), is an improved voice recognition system based on the concepts and function of the VAIS. To determine the applicability of the VOIS to Corps of Engineers construction projects, Technology Transfer Test Bed (T3B) funds were provided to the Corps of Engineers National Security Agency (NSA) Area Office (Fort Meade) to procure and implement the VOIS, and to train personnel in its use. This report summarizes the NSA application of the VOIS to quality assurance inspection of radio frequency shielding and to progress payment logs, and concludes that the VOIS is an easily implemented system that can offer improvements when applied to repetitive inspection procedures. Use of VOIS can save time during inspection, improve documentation storage, and provide flexible retrieval of stored information.

  7. Response time effects of alerting tone and semantic context for synthesized voice cockpit warnings

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.; Williams, D. H.

    1980-01-01

    Some handbooks and human factors design guides have recommended that a voice warning should be preceded by a tone to attract attention to the warning. As far as can be determined from a search of the literature, no experimental evidence supporting this exists. A fixed-base simulator flown by airline pilots was used to test the hypothesis that the total 'system-time' to respond to a synthesized voice cockpit warning would be longer when the message was preceded by a tone because the voice itself was expected to perform both the alerting and the information transfer functions. The simulation included realistic ATC radio voice communications, synthesized engine noise, cockpit conversation, and realistic flight routes. The effect of a tone before a voice warning was to lengthen response time; that is, responses were slower with an alerting tone. Lengthening the voice warning with another word, however, did not increase response time.

  8. Pediatric normative data for the KayPENTAX phonatory aerodynamic system model 6600.

    PubMed

    Weinrich, Barbara; Brehm, Susan Baker; Knudsen, Courtney; McBride, Stephanie; Hughes, Michael

    2013-01-01

    The objectives of this study were to (1) establish a preliminary pediatric normative database for the KayPENTAX Phonatory Aerodynamic System (PAS) Model 6600 (KayPENTAX Corp, Montvale, NJ) and (2) identify whether the data obtained were age- and/or gender-dependent. Prospective data collection across groups. A sample of 60 children (30 females and 30 males) with normal voices was divided into three age groups (6.0-9.11, 10.0-13.11, 14.0-17.11 years) with equal distribution of males and females within each group. Five PAS protocols (vital capacity, maximum sustained phonation, comfortable sustained phonation, variation in sound pressure level, voicing efficiency) were used to collect 45 phonatory aerodynamic measures. Measurements for the 45 PAS parameters examined revealed 13 parameters to have a difference that was statistically significant by age and/or gender. There was a significant age×gender interaction for mean pitch in the four protocols that reported this measure. Males in the oldest group had significantly lower mean pitch values than the middle and young groups. Statistically significant main effect differences were noted for seven parameters across three age groups (expiratory volume, expiratory airflow duration, phonation time, pitch range (in 2 protocols), aerodynamic resistance, acoustic ohms). Significant main effect differences for genders (males > females) were found for expiratory volume and peak expiratory airflow. The age- and gender-related differences found for some parameters within each of the five protocols are important for the interpretation of data obtained from PAS. These results could be explained by developmental changes that occur in the male and female respiratory and laryngeal systems. Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved.

  9. Functional Connectivity between Face-Movement and Speech-Intelligibility Areas during Auditory-Only Speech Perception

    PubMed Central

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (<2 min/speaker). This was followed by an auditory-only speech recognition task and a control task (voice recognition) involving the learned speakers’ voices in the MRI scanner. As hypothesized, we found that, during speech recognition, familiarity with the speaker’s face increased the functional connectivity between the face-movement sensitive posterior superior temporal sulcus (STS) and an anterior STS region that supports auditory speech intelligibility. There was no difference between normal participants and prosopagnosics. This was expected because previous findings have shown that both groups use the face-movement sensitive STS to optimize auditory-only speech comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas. PMID:24466026

  10. Emotional voices in context: A neurobiological model of multimodal affective information processing

    NASA Astrophysics Data System (ADS)

    Brück, Carolin; Kreifelts, Benjamin; Wildgruber, Dirk

    2011-12-01

    Just as eyes are often considered a gateway to the soul, the human voice offers a window through which we gain access to our fellow human beings' minds - their attitudes, intentions and feelings. Whether in talking or singing, crying or laughing, sighing or screaming, the sheer sound of a voice communicates a wealth of information that, in turn, may serve the observant listener as a valuable guidepost in social interaction. But how do human beings extract information from the tone of a voice? In an attempt to answer this question, the present article reviews empirical evidence detailing the cerebral processes that underlie our ability to decode emotional information from vocal signals. The review will focus primarily on two prominent classes of vocal emotion cues: laughter and speech prosody (i.e. the tone of voice while speaking). Following a brief introduction, behavioral as well as neuroimaging data will be summarized that allow us to outline the cerebral mechanisms associated with the decoding of emotional voice cues, as well as the influence of various context variables (e.g. co-occurring facial and verbal emotional signals, attention focus, person-specific parameters such as gender and personality) on the respective processes. Building on the presented evidence, a cerebral network model will be introduced that proposes a differential contribution of various cortical and subcortical brain structures to the processing of emotional voice signals both in isolation and in the context of accompanying (facial and verbal) emotional cues.

  11. Emotional voices in context: a neurobiological model of multimodal affective information processing.

    PubMed

    Brück, Carolin; Kreifelts, Benjamin; Wildgruber, Dirk

    2011-12-01

    Just as eyes are often considered a gateway to the soul, the human voice offers a window through which we gain access to our fellow human beings' minds - their attitudes, intentions and feelings. Whether in talking or singing, crying or laughing, sighing or screaming, the sheer sound of a voice communicates a wealth of information that, in turn, may serve the observant listener as a valuable guidepost in social interaction. But how do human beings extract information from the tone of a voice? In an attempt to answer this question, the present article reviews empirical evidence detailing the cerebral processes that underlie our ability to decode emotional information from vocal signals. The review will focus primarily on two prominent classes of vocal emotion cues: laughter and speech prosody (i.e. the tone of voice while speaking). Following a brief introduction, behavioral as well as neuroimaging data will be summarized that allow us to outline the cerebral mechanisms associated with the decoding of emotional voice cues, as well as the influence of various context variables (e.g. co-occurring facial and verbal emotional signals, attention focus, person-specific parameters such as gender and personality) on the respective processes. Building on the presented evidence, a cerebral network model will be introduced that proposes a differential contribution of various cortical and subcortical brain structures to the processing of emotional voice signals both in isolation and in the context of accompanying (facial and verbal) emotional cues. Copyright © 2011 Elsevier B.V. All rights reserved.

  12. The 2010 Desert Rats Science Operations Test: Outcomes and Lessons Learned

    NASA Technical Reports Server (NTRS)

    Eppler, D. B.

    2011-01-01

    The Desert RATS 2010 Team tested a variety of science operations management techniques, applying experience gained during the manned Apollo missions and the robotic Mars missions. This test assessed integrated science operations management of human planetary exploration using real-time, tactical science operations to oversee daily crew science activities, and a night-shift strategic science operations team to conduct strategic-level assessment of science data and daily traverse results. In addition, an attempt was made to collect numerical metric data on the outcome of the science operations to assist test evaluation. The two most important outcomes were 1) the production of a significant (almost overwhelming) volume of data during daily traverse operations with two rovers, advanced imaging systems and well-trained, scientifically proficient crew members, and 2) the degree to which the tactical team's interaction with the surface crew enhanced science return. This interaction depended on continuous real-time voice and data communications, and the quality of science return from any human planetary exploration mission will depend strongly on the aggregate interaction between a well-trained surface crew and a dedicated science operations support team using voice and imaging data from a planet's surface. In addition, the scientific insight developed by both the science operations team and the crews could not be measured in simple numerical quantities, and its value will be missed by a purely metric-based evaluation of test outcome. In particular, failure to recognize the critical importance of this type of qualitative interaction may result in mission architecture choices that reduce science return.

  13. Did you or I say pretty, rude or brief? An ERP study of the effects of speaker's identity on emotional word processing.

    PubMed

    Pinheiro, Ana P; Rezaii, Neguine; Nestor, Paul G; Rauber, Andréia; Spencer, Kevin M; Niznikiewicz, Margaret

    2016-02-01

    During speech comprehension, multiple cues need to be integrated at a millisecond speed, including semantic information, as well as voice identity and affect cues. A processing advantage has been demonstrated for self-related stimuli when compared with non-self stimuli, and for emotional relative to neutral stimuli. However, very few studies investigated self-other speech discrimination and, in particular, how emotional valence and voice identity interactively modulate speech processing. In the present study we probed how the processing of words' semantic valence is modulated by speaker's identity (self vs. non-self voice). Sixteen healthy subjects listened to 420 prerecorded adjectives differing in voice identity (self vs. non-self) and semantic valence (neutral, positive and negative), while electroencephalographic data were recorded. Participants were instructed to decide whether the speech they heard was their own (self-speech condition), someone else's (non-self speech), or if they were unsure. The ERP results demonstrated interactive effects of speaker's identity and emotional valence on both early (N1, P2) and late (Late Positive Potential - LPP) processing stages: compared with non-self speech, self-speech with neutral valence elicited more negative N1 amplitude, self-speech with positive valence elicited more positive P2 amplitude, and self-speech with both positive and negative valence elicited more positive LPP. ERP differences between self and non-self speech occurred in spite of similar accuracy in the recognition of both types of stimuli. Together, these findings suggest that emotion and speaker's identity interact during speech processing, in line with observations of partially dependent processing of speech and speaker information. Copyright © 2016. Published by Elsevier Inc.

  14. Listening to Young Children's Voices: The Evaluation of a Coding System

    ERIC Educational Resources Information Center

    Tertoolen, Anja; Geldens, Jeannette; van Oers, Bert; Popeijus, Herman

    2015-01-01

    Listening to young children's voices is an issue with increasing relevance for many researchers in the field of early childhood research. At the same time, teachers and researchers are faced with challenges to provide children with possibilities to express their notions, and to find ways of comprehending children's voices. In our research we aim…

  15. The "Parental Voice": How the Infant-Toddler (Zero to Three Years) Education System Should Deal with Parents

    ERIC Educational Resources Information Center

    Plotnik, Ronit

    2013-01-01

    Parenthood is a concrete experience that develops while having a psychological existence in its background. It is heard in two voices simultaneously: the overt, concrete one versus the covert, psychological one. It moves between four intersecting axes, which together create the "Parental Voice" model. Axis 1--Parenthood between fantasy…

  16. The Relationship between Student Voice and Perceptions of Motivation, Attachment, Achievement and School Climate in Davidson and Rutherford Counties

    ERIC Educational Resources Information Center

    Matthews, Sharon Elizabeth

    2010-01-01

    This study investigated the extent to which there were statistically significant relationships between school administrators' systemic implementation of student voice work and student perceptions (i.e. achievement, motivation, attachment and school climate) and PLAN performance. Student voice was defined as students being equal partners in school…

  17. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C [Livermore, CA; Holzrichter, John F [Berkeley, CA; Ng, Lawrence C [Danville, CA

    2006-08-08

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  18. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2004-03-23

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  19. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-02-14

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  20. System And Method For Characterizing Voiced Excitations Of Speech And Acoustic Signals, Removing Acoustic Noise From Speech, And Synthesizing Speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-04-25

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  1. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix

    PubMed Central

    Muhammad, Ghulam; Alhamid, Mohammed F.; Hossain, M. Shamim; Almogren, Ahmad S.; Vasilakos, Athanasios V.

    2017-01-01

    A large proportion of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals and electroglottography (EGG) signals. Co-occurrence matrices in different directions and neighborhoods from the spectrograms of these signals were obtained. Several features such as energy, entropy, contrast, and homogeneity from these matrices were calculated and fed into a Gaussian mixture model-based classifier. Experiments were performed with a publicly available database, namely, the Saarbrucken voice database. The results demonstrate the feasibility of the proposed system in light of its high accuracy and speed. The proposed system can be extended to assess other disabilities in an ELE. PMID:28146069

  2. Enhanced Living by Assessing Voice Pathology Using a Co-Occurrence Matrix.

    PubMed

    Muhammad, Ghulam; Alhamid, Mohammed F; Hossain, M Shamim; Almogren, Ahmad S; Vasilakos, Athanasios V

    2017-01-29

    A large proportion of the population around the world suffers from various disabilities. Disabilities affect not only children but also adults of different professions. Smart technology can assist the disabled population and lead to a comfortable life in an enhanced living environment (ELE). In this paper, we propose an effective voice pathology assessment system that works in a smart home framework. The proposed system takes input from various sensors, and processes the acquired voice signals and electroglottography (EGG) signals. Co-occurrence matrices in different directions and neighborhoods from the spectrograms of these signals were obtained. Several features such as energy, entropy, contrast, and homogeneity from these matrices were calculated and fed into a Gaussian mixture model-based classifier. Experiments were performed with a publicly available database, namely, the Saarbrucken voice database. The results demonstrate the feasibility of the proposed system in light of its high accuracy and speed. The proposed system can be extended to assess other disabilities in an ELE.
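    The feature pipeline described in the two records above (quantize a spectrogram into grey levels, count co-occurring level pairs at a given offset, then derive energy, entropy, contrast, and homogeneity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the quantization depth, offset direction, and feature formulas used here are common textbook choices assumed for the sketch.

    ```python
    import numpy as np

    def cooccurrence_features(img, levels=8, offset=(0, 1)):
        """Grey-level co-occurrence matrix (GLCM) features for one direction.

        img: 2-D array (e.g. a spectrogram), quantized to `levels` grey levels.
        offset: (row, col) displacement defining the neighbor direction.
        Returns [energy, entropy, contrast, homogeneity].
        """
        # Quantize the input into discrete grey levels.
        lo, hi = img.min(), img.max()
        q = np.floor((img - lo) / (hi - lo + 1e-12) * levels).astype(int)
        q = np.clip(q, 0, levels - 1)

        # Count co-occurring grey-level pairs at the given offset.
        dr, dc = offset
        rows, cols = q.shape
        glcm = np.zeros((levels, levels))
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    glcm[q[r, c], q[r2, c2]] += 1
        p = glcm / glcm.sum()  # normalize to a joint probability matrix

        i, j = np.indices(p.shape)
        energy = np.sum(p ** 2)
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        contrast = np.sum(p * (i - j) ** 2)
        homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
        return np.array([energy, entropy, contrast, homogeneity])
    ```

    Feature vectors computed this way over several directions and neighborhoods could then be used to fit one Gaussian mixture model per class (healthy vs. pathological), with classification by maximum likelihood, which is the general scheme the abstract names.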

  3. Tracking and data relay satellite system configuration and tradeoff study. Volume 1: TDRS system summary, part 1

    NASA Technical Reports Server (NTRS)

    1972-01-01

    A Tracking and Data Relay Satellite System (TDRSS) concept for service of low and medium data rate user spacecraft has been defined. The TDRS system uses two geosynchronous dual spin satellites compatible with Delta 2914 to provide command, tracking, and telemetry service between multiple low earth orbiting users and a centrally located ground station. The low data rate user service capability via each TDRS is as follows: (1) forward link at UHF: voice to one user, commands to 20 users (sequential), range and range rate service, and (2) return link at VHF: voice from one user, data from 20 users (simultaneous), range and range rate return signals. The medium data rate user service via each TDRS is as follows: (1) forward link at S band: voice or command and tracking signals to one user, and (2) return link at S band: voice, data and tracking signals from one user, plus an "order wire" for high priority service requests (implemented with an earth coverage antenna).

  4. Talking to your car can drive you to distraction.

    PubMed

    Strayer, David L; Cooper, Joel M; Turrill, Jonna; Coleman, James R; Hopman, Rachel J

    2016-01-01

    This research examined the impact of in-vehicle information system (IVIS) interactions on the driver's cognitive workload; 257 subjects participated in a weeklong evaluation of the IVIS interaction in one of ten different model-year 2015 automobiles. After an initial assessment of the cognitive workload associated with using the IVIS, participants took the vehicle home for 5 days and practiced using the system. At the end of the 5 days of practice, participants returned and the workload of these IVIS interactions was reassessed. The cognitive workload was found to be moderate to high, averaging 3.34 on a 5-point scale and ranged from 2.37 to 4.57. The workload was associated with the intuitiveness and complexity of the system and the time it took participants to complete the interaction. The workload experienced by older drivers was significantly greater than that experienced by younger drivers performing the same operations. Practice did not eliminate the interference from IVIS interactions. In fact, IVIS interactions that were difficult on the first day were still relatively difficult to perform after a week of practice. Finally, there were long-lasting residual costs after the IVIS interactions had terminated. The higher levels of workload should serve as a caution that these voice-based interactions can be cognitively demanding and ought not to be used indiscriminately while operating a motor vehicle.

  5. Acoustic passaggio pedagogy for the male voice.

    PubMed

    Bozeman, Kenneth Wood

    2013-07-01

    Awareness of interactions between the lower harmonics of the voice source and the first formant of the vocal tract, and of the passive vowel modifications that accompany them, can assist in working out a smooth transition through the passaggio of the male voice. A stable vocal tract length establishes the general location of all formants, including the higher formants that form the singer's formant cluster. Untrained males instinctively shorten the tube to preserve the strong F1/H2 acoustic coupling of voce aperta, resulting in 'yell' timbre. If tube length and shape are kept stable during pitch ascent, the yell can be avoided by allowing the second harmonic to rise above the first formant, creating the balanced timbre of voce chiusa.

  6. Phonetic Encoding of Coda Voicing Contrast under Different Focus Conditions in L1 vs. L2 English.

    PubMed

    Choi, Jiyoun; Kim, Sahayng; Cho, Taehong

    2016-01-01

    This study investigated how coda voicing contrast in English would be phonetically encoded in the temporal vs. spectral dimension of the preceding vowel (in vowel duration vs. F1/F2) by Korean L2 speakers of English, and how their L2 phonetic encoding pattern would be compared to that of native English speakers. Crucially, these questions were explored by taking into account the phonetics-prosody interface, testing effects of prominence by comparing target segments in three focus conditions (phonological focus, lexical focus, and no focus). Results showed that Korean speakers utilized the temporal dimension (vowel duration) to encode coda voicing contrast, but failed to use the spectral dimension (F1/F2), reflecting their native language experience-i.e., with a more sparsely populated vowel space in Korean, they are less sensitive to small changes in the spectral dimension, and hence fine-grained spectral cues in English are not readily accessible. Results also showed that along the temporal dimension, both the L1 and L2 speakers hyperarticulated coda voicing contrast under prominence (when phonologically or lexically focused), but hypoarticulated it in the non-prominent condition. This indicates that low-level phonetic realization and high-order information structure interact in a communicatively efficient way, regardless of the speakers' native language background. The Korean speakers, however, used the temporal phonetic space differently from the way the native speakers did, especially showing less reduction in the no focus condition. This was also attributable to their native language experience-i.e., the Korean speakers' use of temporal dimension is constrained in a way that is not detrimental to the preservation of coda voicing contrast, given that they failed to add additional cues along the spectral dimension. 
The results imply that the L2 phonetic system can be more fully illuminated through an investigation of the phonetics-prosody interface in connection with the L2 speakers' native language experience.

  7. Phonetic Encoding of Coda Voicing Contrast under Different Focus Conditions in L1 vs. L2 English

    PubMed Central

    Choi, Jiyoun; Kim, Sahayng; Cho, Taehong

    2016-01-01

    This study investigated how coda voicing contrast in English would be phonetically encoded in the temporal vs. spectral dimension of the preceding vowel (in vowel duration vs. F1/F2) by Korean L2 speakers of English, and how their L2 phonetic encoding pattern would be compared to that of native English speakers. Crucially, these questions were explored by taking into account the phonetics-prosody interface, testing effects of prominence by comparing target segments in three focus conditions (phonological focus, lexical focus, and no focus). Results showed that Korean speakers utilized the temporal dimension (vowel duration) to encode coda voicing contrast, but failed to use the spectral dimension (F1/F2), reflecting their native language experience—i.e., with a more sparsely populated vowel space in Korean, they are less sensitive to small changes in the spectral dimension, and hence fine-grained spectral cues in English are not readily accessible. Results also showed that along the temporal dimension, both the L1 and L2 speakers hyperarticulated coda voicing contrast under prominence (when phonologically or lexically focused), but hypoarticulated it in the non-prominent condition. This indicates that low-level phonetic realization and high-order information structure interact in a communicatively efficient way, regardless of the speakers’ native language background. The Korean speakers, however, used the temporal phonetic space differently from the way the native speakers did, especially showing less reduction in the no focus condition. This was also attributable to their native language experience—i.e., the Korean speakers’ use of temporal dimension is constrained in a way that is not detrimental to the preservation of coda voicing contrast, given that they failed to add additional cues along the spectral dimension. 
The results imply that the L2 phonetic system can be more fully illuminated through an investigation of the phonetics-prosody interface in connection with the L2 speakers’ native language experience. PMID:27242571

  8. Multimodal interaction for human-robot teams

    NASA Astrophysics Data System (ADS)

    Burke, Dustin; Schurr, Nathan; Ayers, Jeanine; Rousseau, Jeff; Fertitta, John; Carlin, Alan; Dumond, Danielle

    2013-05-01

    Unmanned ground vehicles have the potential for supporting small dismounted teams in mapping facilities, maintaining security in cleared buildings, and extending the team's reconnaissance and persistent surveillance capability. In order for such autonomous systems to integrate with the team, we must move beyond current interaction methods using heads-down teleoperation which require intensive human attention and affect the human operator's ability to maintain local situational awareness and ensure their own safety. This paper focuses on the design, development and demonstration of a multimodal interaction system that incorporates naturalistic human gestures, voice commands, and a tablet interface. By providing multiple, partially redundant interaction modes, our system degrades gracefully in complex environments and enables the human operator to robustly select the most suitable interaction method given the situational demands. For instance, the human can silently use arm and hand gestures for commanding a team of robots when it is important to maintain stealth. The tablet interface provides an overhead situational map allowing waypoint-based navigation for multiple ground robots in beyond-line-of-sight conditions. Using lightweight, wearable motion sensing hardware either worn comfortably beneath the operator's clothing or integrated within their uniform, our non-vision-based approach enables an accurate, continuous gesture recognition capability without line-of-sight constraints. To reduce the training necessary to operate the system, we designed the interactions around familiar arm and hand gestures.

  9. 47 CFR 90.353 - LMS operations in the 902-928 MHz band.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... band. (b) LMS systems are authorized to transmit status and instructional messages, either voice or non-voice, so long as they are related to the location or monitoring functions of the system. (c) LMS... subparts B and C of this part. (d) Multilateration LMS systems will be authorized on a primary basis within...

  10. 47 CFR 90.353 - LMS operations in the 902-928 MHz band.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... band. (b) LMS systems are authorized to transmit status and instructional messages, either voice or non-voice, so long as they are related to the location or monitoring functions of the system. (c) LMS... subparts B and C of this part. (d) Multilateration LMS systems will be authorized on a primary basis within...

  11. 47 CFR 90.353 - LMS operations in the 902-928 MHz band.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... band. (b) LMS systems are authorized to transmit status and instructional messages, either voice or non-voice, so long as they are related to the location or monitoring functions of the system. (c) LMS... subparts B and C of this part. (d) Multilateration LMS systems will be authorized on a primary basis within...

  12. 47 CFR 90.353 - LMS operations in the 902-928 MHz band.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... band. (b) LMS systems are authorized to transmit status and instructional messages, either voice or non-voice, so long as they are related to the location or monitoring functions of the system. (c) LMS... subparts B and C of this part. (d) Multilateration LMS systems will be authorized on a primary basis within...

  13. 47 CFR 90.353 - LMS operations in the 902-928 MHz band.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... band. (b) LMS systems are authorized to transmit status and instructional messages, either voice or non-voice, so long as they are related to the location or monitoring functions of the system. (c) LMS... subparts B and C of this part. (d) Multilateration LMS systems will be authorized on a primary basis within...

  14. [Research on Control System of an Exoskeleton Upper-limb Rehabilitation Robot].

    PubMed

    Wang, Lulu; Hu, Xin; Hu, Jie; Fang, Youfang; He, Rongrong; Yu, Hongliu

    2016-12-01

    In order to help patients with upper-limb dysfunction undergo rehabilitation training, this paper proposed an upper-limb exoskeleton rehabilitation robot with four degrees of freedom (DOF) and realized two control schemes, i.e., voice control and electromyography control. The hardware and software design of the voice control system was completed based on RSC-4128 chips, which realized speaker-dependent speech recognition. Besides, this study adopted self-made surface electromyogram (sEMG) signal extraction electrodes to collect sEMG signals and realized pattern recognition by processing the sEMG signals, extracting time-domain features, and applying a fixed-threshold algorithm. In addition, the pulse-width modulation (PWM) algorithm was used to realize the speed adjustment of the system. Voice control and electromyography control experiments were then carried out, and the results showed that the mean recognition rates of the voice control and electromyography control reached 93.1% and 90.9%, respectively. The results proved the feasibility of the control system. This study is expected to lay a theoretical foundation for the further improvement of the control system of the upper-limb rehabilitation robot.
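    As a rough illustration of the kind of processing the record above describes (time-domain sEMG features with a fixed-threshold decision, plus a PWM duty-cycle mapping for speed adjustment), here is a minimal sketch. The feature set, threshold value, and speed scaling are assumptions for the example, not details taken from the paper.

    ```python
    import numpy as np

    def emg_features(window):
        """Common time-domain sEMG features over one analysis window."""
        mav = np.mean(np.abs(window))                       # mean absolute value
        rms = np.sqrt(np.mean(window ** 2))                 # root mean square
        zc = int(np.sum(np.diff(np.sign(window)) != 0))     # zero crossings
        return mav, rms, zc

    def detect_contraction(window, threshold=0.1):
        """Fixed-threshold detector: contraction if MAV exceeds the threshold.

        The 0.1 threshold is an arbitrary placeholder; in practice it would be
        calibrated per user from resting-baseline recordings.
        """
        mav, _, _ = emg_features(window)
        return mav > threshold

    def pwm_duty(speed, max_speed=1.0):
        """Map a commanded speed to a PWM duty cycle in [0, 1]."""
        return float(np.clip(speed / max_speed, 0.0, 1.0))
    ```

    A control loop would slide `emg_features` over successive windows of the electrode signal, trigger motion when `detect_contraction` fires, and drive the joint motors at the duty cycle returned by `pwm_duty`.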

  15. A flight investigation of simulated data-link communications during single-pilot IFR flight. Volume 2: Flight evaluations

    NASA Technical Reports Server (NTRS)

    Parker, J. F., Jr.; Duffy, J. W.

    1982-01-01

    Key problems in single pilot instrument flight operations are in the management of flight data and the processing of cockpit information during conditions of heavy workload. A flight data console was developed to allow simulation of a digital data link to replace the current voice communications system used in air traffic control. This is a human factors evaluation of a data link communications system to determine how such a system might reduce cockpit workload, improve flight proficiency, and be accepted by general aviation pilots. The need for a voice channel as backup to a digital link is examined. The evaluations cover both airport terminal area operations and full mission instrument flight. Results show that general aviation pilots operate well with a digital data link communications system. The findings indicate that a data link system for pilot/ATC communications, with a backup voice channel, is well accepted by general aviation pilots and is considered to be safer and more efficient, and to impose less workload, than the current voice system.

  16. Multisensory perception of the six basic emotions is modulated by attentional instruction and unattended modality

    PubMed Central

    Takagi, Sachiko; Hiramatsu, Saori; Tabei, Ken-ichi; Tanaka, Akihiro

    2015-01-01

    Previous studies have shown that the perceptions of facial and vocal affective expressions interact with each other. Facial expressions usually dominate vocal expressions when we perceive the emotions of face–voice stimuli. In most of these studies, participants were instructed to pay attention to the face or voice. Few studies have compared the perceived emotions with and without specific instructions regarding the modality to which attention should be directed. Moreover, these studies used face–voice combinations expressing two opposing emotions, which limits the generalizability of the findings. The purpose of this study is to examine whether emotion perception is modulated by instructions to pay attention to the face or voice, using the six basic emotions. We also examine the modality dominance between the face and voice for each emotion category. Before the experiment, we recorded faces and voices expressing the six basic emotions and orthogonally combined these faces and voices. Consequently, the emotional valence of the visual and auditory information was either congruent or incongruent. The experiment comprised unisensory and multisensory sessions. The multisensory session was divided into three blocks according to whether an instruction was given to pay attention to a given modality (face attention, voice attention, and no instruction). Participants judged whether the speaker expressed happiness, sadness, anger, fear, disgust, or surprise. Our results revealed that instructions to pay attention to one modality and congruency of the emotions between modalities modulated the modality dominance, and that the modality dominance differed across emotion categories. In particular, the modality dominance for anger changed according to the instruction. Analyses also revealed that the modality dominance suggested by the congruency effect can be explained in terms of the facilitation effect and the interference effect. PMID:25698945

  17. Multimodal approaches for emotion recognition: a survey

    NASA Astrophysics Data System (ADS)

    Sebe, Nicu; Cohen, Ira; Gevers, Theo; Huang, Thomas S.

    2004-12-01

    Recent technological advances have enabled human users to interact with computers in ways previously unimaginable. Beyond the confines of the keyboard and mouse, new modalities for human-computer interaction such as voice, gesture, and force-feedback are emerging. Despite important advances, one necessary ingredient for natural interaction is still missing: emotions. Emotions play an important role in human-to-human communication and interaction, allowing people to express themselves beyond the verbal domain. The ability to understand human emotions is desirable for the computer in several applications. This paper explores new ways of human-computer interaction that enable the computer to be more aware of the user's emotional and attentional expressions. We present the basic research in the field and the recent advances in emotion recognition from facial, voice, and physiological signals, where the different modalities are treated independently. We then describe the challenging problem of multimodal emotion recognition and we advocate the use of probabilistic graphical models when fusing the different modalities. We also discuss the difficult issues of obtaining reliable affective data, obtaining ground truth for emotion recognition, and the use of unlabeled data.
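    The survey advocates probabilistic graphical models for fusing modalities. One of the simplest instances of that family is naive Bayes fusion, which combines per-modality posteriors under a conditional-independence assumption. The sketch below is a generic illustration of that idea, not the survey's specific model; all probability values are invented.

```python
# Hedged sketch: fuse P(emotion | face) and P(emotion | voice) into a joint
# posterior, assuming the modalities are conditionally independent given
# the emotion (naive Bayes fusion).

def fuse(posteriors_per_modality, prior):
    """Combine per-modality posteriors into a normalized joint posterior."""
    scores = {}
    for e in prior:
        score = prior[e]
        for post in posteriors_per_modality:
            score *= post[e] / prior[e]  # per-modality likelihood ratio
        scores[e] = score
    total = sum(scores.values())
    return {e: s / total for e, s in scores.items()}

prior = {"happy": 0.5, "angry": 0.5}
face = {"happy": 0.7, "angry": 0.3}   # face classifier output (invented)
voice = {"happy": 0.6, "angry": 0.4}  # voice classifier output (invented)
fused = fuse([face, voice], prior)    # agreement sharpens the posterior
```

    Because both modalities lean toward "happy", the fused posterior is more confident than either modality alone, which is the basic appeal of probabilistic fusion.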

  19. Current trends in small vocabulary speech recognition for equipment control

    NASA Astrophysics Data System (ADS)

    Doukas, Nikolaos; Bardis, Nikolaos G.

    2017-09-01

    Speech recognition systems allow human-machine communication to acquire an intuitive nature that approaches the simplicity of inter-human communication. Small vocabulary speech recognition is a subset of the overall speech recognition problem, where only a small number of words need to be recognized. Speaker-independent small vocabulary recognition can find significant applications in field equipment used by military personnel. Such equipment may typically be controlled by a small number of commands that need to be given quickly and accurately, under conditions where delicate manual operations are difficult to achieve. This type of application could hence benefit significantly from the use of robust voice-operated control components, as they would facilitate interaction with their users and render it much more reliable in times of crisis. This paper presents current challenges involved in attaining efficient and robust small vocabulary speech recognition. These challenges concern feature selection, classification techniques, speaker diversity, and noise effects. A state machine approach is presented that facilitates the voice guidance of different equipment in a variety of situations.
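    The state machine approach the paper names can be illustrated generically: recognized commands drive transitions in a table, and anything the recognizer mishears falls through without changing state. The states and commands below are invented for illustration, not taken from the paper.

```python
# Illustrative sketch: a table-driven state machine for small-vocabulary
# voice control of equipment. State and command names are hypothetical.

TRANSITIONS = {
    ("idle", "power on"): "ready",
    ("ready", "start"): "running",
    ("running", "stop"): "ready",
    ("ready", "power off"): "idle",
}

def step(state, command):
    """Advance the machine; unrecognized or out-of-context commands leave
    the state unchanged, making the controller robust to misrecognitions."""
    return TRANSITIONS.get((state, command), state)

state = "idle"
for cmd in ["power on", "start", "bananas", "stop"]:  # "bananas" = noise
    state = step(state, cmd)
print(state)  # ready
```

    Restricting which commands are valid in each state also shrinks the active vocabulary at any moment, which is one practical way such systems improve recognition robustness.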

  20. Expedition Memory: Towards Agent-based Web Services for Creating and Using Mars Exploration Data.

    NASA Technical Reports Server (NTRS)

    Clancey, William J.; Sierhuis, Maarten; Briggs, Geoff; Sims, Mike

    2005-01-01

    Explorers ranging over kilometers of rugged, sometimes "feature-less" terrain for over a year could be overwhelmed by tracking and sharing what they have done and learned. An automated system based on the existing Mobile Agents design [1] and Mars Exploration Rover experience [2] could serve as an "expedition memory" that would be indexed by voice as well as a web interface, linking people, places, activities, records (voice notes, photographs, samples), and a descriptive scientific ontology. This database would be accessible during EVAs by astronauts, annotated by the remote science team, linked to EVA plans, and would allow cross-indexing between sites and expeditions. We consider the basic problem, our philosophical approach, technical methods, and uses of the expedition memory for facilitating long-term collaboration between Mars crews and Earth support teams. We emphasize that a "memory" does not mean a database per se, but an interactive service that combines different resources, and ultimately could be like a helpful librarian.

  1. Effects of Voice Harmonic Complexity on ERP Responses to Pitch-Shifted Auditory Feedback

    PubMed Central

    Behroozmand, Roozbeh; Korzyukov, Oleg; Larson, Charles R.

    2011-01-01

    Objective The present study investigated the neural mechanisms of voice pitch control for different levels of harmonic complexity in the auditory feedback. Methods Event-related potentials (ERPs) were recorded in response to +200 cents pitch perturbations in the auditory feedback of self-produced natural human vocalizations, complex and pure tone stimuli during active vocalization and passive listening conditions. Results During active vocal production, ERP amplitudes were largest in response to pitch shifts in the natural voice, moderately large for non-voice complex stimuli and smallest for the pure tones. However, during passive listening, neural responses were equally large for pitch shifts in voice and non-voice complex stimuli but still larger than that for pure tones. Conclusions These findings suggest that pitch change detection is facilitated for spectrally rich sounds such as natural human voice and non-voice complex stimuli compared with pure tones. Vocalization-induced increase in neural responses for voice feedback suggests that sensory processing of naturally-produced complex sounds such as human voice is enhanced by means of motor-driven mechanisms (e.g. efference copies) during vocal production. Significance This enhancement may enable the audio-vocal system to more effectively detect and correct for vocal errors in the feedback of natural human vocalizations to maintain an intended vocal output for speaking. PMID:21719346

  2. A 4.8 kbps code-excited linear predictive coder

    NASA Technical Reports Server (NTRS)

    Tremain, Thomas E.; Campbell, Joseph P., Jr.; Welch, Vanoy C.

    1988-01-01

    A secure voice system, STU-3, capable of providing end-to-end secure voice communications was developed in 1984. The terminal for the new system will be built around the standard LPC-10 voice processor algorithm. While the performance of the present STU-3 processor is considered to be good, its response to nonspeech sounds such as whistles, coughs, and impulse-like noises may not be completely acceptable. Speech in noisy environments also causes problems for the LPC-10 voice algorithm. In addition, there is always a demand for something better. It is hoped that LPC-10's 2.4 kbps voice performance will be complemented with a very high quality speech coder operating at a higher data rate. This new coder is one of a number of candidate algorithms being considered for an upgraded version of the STU-3 in late 1989. The paper considers the problems of designing a code-excited linear predictive (CELP) coder that provides very high quality speech at a 4.8 kbps data rate and can be implemented on today's hardware.
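    The core of any CELP coder is analysis-by-synthesis: for each frame, search a codebook of excitation vectors for the one whose output through the LPC synthesis filter best matches the target speech. The toy sketch below illustrates that search loop only; the codebook, filter order, and coefficient values are invented, and a real 4.8 kbps coder adds gains, an adaptive (pitch) codebook, and perceptual weighting.

```python
# Hedged sketch of CELP's analysis-by-synthesis codebook search
# (not the STU-3 implementation). All numeric values are toy examples.

def synthesize(excitation, lpc):
    """All-pole LPC synthesis: y[n] = x[n] + sum_k a[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        y = x + sum(a * out[n - 1 - k]
                    for k, a in enumerate(lpc) if n - 1 - k >= 0)
        out.append(y)
    return out

def best_codeword(target, codebook, lpc):
    """Exhaustive search: index minimizing squared synthesis error."""
    def err(cw):
        y = synthesize(cw, lpc)
        return sum((t - v) ** 2 for t, v in zip(target, y))
    return min(range(len(codebook)), key=lambda i: err(codebook[i]))

codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, -1, 1, -1]]  # toy excitations
lpc = [0.5]                                              # toy 1st-order filter
target = synthesize(codebook[2], lpc)                    # frame to encode
print(best_codeword(target, codebook, lpc))  # 2
```

    The decoder needs only the winning index (plus gain and LPC parameters), which is how CELP reaches low bit rates while keeping quality well above plain LPC-10.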

  3. Adductor spasmodic dysphonia: Relationships between acoustic indices and perceptual judgments

    NASA Astrophysics Data System (ADS)

    Cannito, Michael P.; Sapienza, Christine M.; Woodson, Gayle; Murry, Thomas

    2003-04-01

    This study investigated relationships between acoustical indices of spasmodic dysphonia and perceptual scaling judgments of voice attributes made by expert listeners. Audio-recordings of The Rainbow Passage were obtained from thirty-one speakers with spasmodic dysphonia before and after a BOTOX injection of the vocal folds. Six temporal acoustic measures were obtained across 15 words excerpted from each reading sample, including both frequency of occurrence and percent time for (1) aperiodic phonation, (2) phonation breaks, and (3) fundamental frequency shifts. Visual analog scaling judgments were also obtained from six voice experts using an interactive computer interface to quantify four voice attributes (i.e., overall quality, roughness, brokenness, breathiness) in a carefully psychoacoustically controlled environment, using the same reading passages as stimuli. Number and percent aperiodicity and phonation breaks correlated significantly with perceived overall voice quality, roughness, and brokenness before and after the BOTOX injection. Breathiness was correlated with aperiodicity only prior to injection, while roughness also correlated with frequency shifts following injection. Factor analysis reduced perceived attributes to two principal components: glottal squeezing and breathiness. The acoustic measures demonstrated a strong regression relationship with perceived glottal squeezing, but no regression relationship with breathiness was observed. Implications for the analysis of pathologic voices are discussed.

  4. Comparing the demands of destination entry using Google Glass and the Samsung Galaxy S4 during simulated driving.

    PubMed

    Beckers, Niek; Schreiner, Sam; Bertrand, Pierre; Mehler, Bruce; Reimer, Bryan

    2017-01-01

    The relative impact of using a Google Glass based voice interface to enter a destination address, compared to voice and touch-entry methods using a handheld Samsung Galaxy S4 smartphone, was assessed in a driving simulator. Voice entry (Google Glass and Samsung) had lower subjective workload ratings, lower standard deviation of lateral lane position, shorter task durations, faster remote Detection Response Task (DRT) reaction times, lower DRT miss rates, and resulted in less time glancing off-road than the primary visual-manual interaction with the Samsung Touch interface. Comparing the two voice entry methods, using Google Glass took less time, while glance metrics and reaction times to the DRT events that were responded to were similar. In contrast, the DRT miss rate was higher for Google Glass, suggesting that drivers may be under increased distraction levels but for a shorter period of time; whether one or the other equates to an overall safer driving experience is an open question. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Numerical simulation of self-sustained oscillation of a voice-producing element based on Navier-Stokes equations and the finite element method.

    PubMed

    de Vries, Martinus P; Hamburg, Marc C; Schutte, Harm K; Verkerke, Gijsbertus J; Veldman, Arthur E P

    2003-04-01

    Surgical removal of the larynx results in radically reduced production of voice and speech. To improve voice quality, a voice-producing element (VPE) is developed based on the lip principle, so called after the lips of a musician playing a brass instrument. To optimize the VPE, a numerical model is developed. In this model, the finite element method is used to describe the mechanical behavior of the VPE. The flow is described by the two-dimensional incompressible Navier-Stokes equations. The interaction between the VPE and the airflow is modeled by placing the grid of the VPE model in the grid of the aerodynamical model and requiring continuity of forces and velocities. By applying increasing pressure to the numerical model, pulses comparable to glottal volume velocity waveforms are obtained. By varying geometric parameters, their influence can be determined. To validate the numerical model, an in vitro test with a prototype of the VPE is performed. Experimental and numerical results show acceptable agreement.

  6. Reproducibility of Automated Voice Range Profiles, a Systematic Literature Review.

    PubMed

    Printz, Trine; Rosenberg, Tine; Godballe, Christian; Dyrvig, Anne-Kirstine; Grøntved, Ågot Møller

    2018-05-01

    Reliable voice range profiles are of great importance when measuring effects and side effects from surgery affecting voice capacity. Automated recording systems are increasingly used, but the reproducibility of results is uncertain. Our objective was to identify and review the existing literature on test-retest accuracy of the automated voice range profile assessment. Systematic review. PubMed, Scopus, Cochrane Library, ComDisDome, Embase, and CINAHL (EBSCO). We conducted a systematic literature search of six databases from 1983 to 2016. The following keywords were used: phonetogram, voice range profile, and acoustic voice analysis. Inclusion criteria were automated recording procedure, healthy voices, and no intervention between test and retest. Test-retest values concerning fundamental frequency and voice intensity were reviewed. Of 483 abstracts, 231 full-text articles were read, resulting in six articles included in the final results. The studies found high reliability, but data are few and heterogeneous. The reviewed articles generally reported high reliability of the voice range profile, and thus clinical usefulness, but uncertainty remains because of low sample sizes and different procedures for selecting, collecting, and analyzing data. More data are needed, and clinical conclusions must be drawn with caution. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Towards a Metalanguage Adequate to Linguistic Achievement in Post-Structuralism and English: Reflections on Voicing in the Writing of Secondary Students

    ERIC Educational Resources Information Center

    Macken-Horarik, Mary; Morgan, Wendy

    2011-01-01

    This paper considers the development of voicing in the writing of secondary English students influenced by post-structuralist approaches to literature. It investigates students' growing capacity not only to voice their own responses to literature but also to relate these to a range of theoretical discourses. Drawing on systemic functional…

  8. Design of Phoneme MIDI Codes Using the MIDI Encoding Tool “Auto-F” and Realizing Voice Synthesizing Functions Based on Musical Sounds

    NASA Astrophysics Data System (ADS)

    Modegi, Toshio

    Using our previously developed audio-to-MIDI code converter tool “Auto-F”, we can create MIDI data from given vocal acoustic signals, enabling playback of voice-like signals on a standard MIDI synthesizer. Applying this tool, we are constructing a MIDI database consisting of simple harmonic-structured MIDI codes previously converted from a set of 71 recorded Japanese male and female syllable signals. We are also developing a novel voice synthesizing system based on harmonically synthesizing musical sounds, which can generate MIDI data and play back voice signals on a MIDI synthesizer from plain Japanese (kana) text, by referring to the syllable MIDI code database. In this paper, we propose an improved MIDI converter tool that can produce temporally higher-resolution MIDI codes. We then propose an algorithm for separating a set of 20 consonant and vowel phoneme MIDI codes from the 71 converted syllable MIDI codes in order to construct a voice synthesizing system. Finally, we present the results of 4-syllable word listening tests evaluating voice synthesis quality for the separated phoneme MIDI codes against their original syllable MIDI codes.
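    Any audio-to-MIDI converter must at some point quantize measured frequencies onto MIDI note numbers. The sketch below shows the standard equal-temperament mapping (MIDI note 69 = A4 = 440 Hz), offered as generic background rather than as “Auto-F”'s actual algorithm:

```python
# Generic frequency-to-MIDI quantization (not the tool's specific code).
import math

def freq_to_midi(freq_hz, a4_hz=440.0):
    """MIDI note = 69 + 12*log2(f/440); returns (note, offset in cents)."""
    semitones = 69 + 12 * math.log2(freq_hz / a4_hz)
    note = round(semitones)
    cents = 100 * (semitones - note)
    return note, cents

print(freq_to_midi(440.0))  # (69, 0.0)
```

    The cents offset tells the converter how far the measured partial lies from the nearest chromatic pitch, which a tool could either discard or encode as a pitch-bend event.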

  9. Utility and accuracy of perceptual voice and speech distinctions in the diagnosis of Parkinson's disease, PSP and MSA-P.

    PubMed

    Miller, Nick; Nath, Uma; Noble, Emma; Burn, David

    2017-06-01

    To determine if perceptual speech measures distinguish people with Parkinson's disease (PD), multiple system atrophy with predominant parkinsonism (MSA-P) and progressive supranuclear palsy (PSP). Speech-language therapists blind to patient characteristics employed clinical rating scales to evaluate speech/voice in 24 people with clinically diagnosed PD, 17 with PSP and 9 with MSA-P, matched for disease duration (mean 4.9 years, standard deviation 2.2). No consistent intergroup differences appeared on specific speech/voice variables. People with PD were significantly less impaired on overall speech/voice severity. Analyses by severity suggested further investigation around laryngeal, resonance and fluency changes may characterize individual groups. MSA-P and PSP compared with PD were distinguished by severity of speech/voice deterioration, but individual speech/voice parameters failed to consistently differentiate groups.

  10. Familiar Person Recognition: Is Autonoetic Consciousness More Likely to Accompany Face Recognition Than Voice Recognition?

    NASA Astrophysics Data System (ADS)

    Barsics, Catherine; Brédart, Serge

    2010-11-01

    Autonoetic consciousness is a fundamental property of human memory, enabling us to experience mental time travel, to recollect past events with a feeling of self-involvement, and to project ourselves into the future. Autonoetic consciousness is a characteristic of episodic memory. By contrast, awareness of the past associated with a mere feeling of familiarity or knowing relies on noetic consciousness, depending on semantic memory integrity. The present research aimed to evaluate whether conscious recollection of episodic memories is more likely to occur following the recognition of a familiar face than following the recognition of a familiar voice. Recall of semantic (biographical) information was also assessed. Previous studies that investigated the recall of biographical information following person recognition used faces and voices of famous people as stimuli. In this study, the participants were presented with personally familiar people's voices and faces, thus avoiding the presence of identity cues in the spoken extracts and allowing stricter control of exposure frequency for both types of stimuli (voices and faces). The rate of retrieved episodic memories associated with autonoetic awareness was significantly higher for familiar faces than for familiar voices, even though the level of overall recognition was similar for both stimulus domains. The same pattern was observed for semantic information retrieval. These results and their implications for current Interactive Activation and Competition person recognition models are discussed.

  11. Natural asynchronies in audiovisual communication signals regulate neuronal multisensory interactions in voice-sensitive cortex.

    PubMed

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K; Petkov, Christopher I

    2015-01-06

    When social animals communicate, the onset of informative content in one modality varies considerably relative to the other, such as when visual orofacial movements precede a vocalization. These naturally occurring asynchronies do not disrupt intelligibility or perceptual coherence. However, they occur on time scales where they likely affect integrative neuronal activity in ways that have remained unclear, especially for hierarchically downstream regions in which neurons exhibit temporally imprecise but highly selective responses to communication signals. To address this, we exploited naturally occurring face- and voice-onset asynchronies in primate vocalizations. Using these as stimuli we recorded cortical oscillations and neuronal spiking responses from functional MRI (fMRI)-localized voice-sensitive cortex in the anterior temporal lobe of macaques. We show that the onset of the visual face stimulus resets the phase of low-frequency oscillations, and that the face-voice asynchrony affects the prominence of two key types of neuronal multisensory responses: enhancement or suppression. Our findings show a three-way association between temporal delays in audiovisual communication signals, phase-resetting of ongoing oscillations, and the sign of multisensory responses. The results reveal how natural onset asynchronies in cross-sensory inputs regulate network oscillations and neuronal excitability in the voice-sensitive cortex of macaques, a suggested animal model for human voice areas. These findings also advance predictions on the impact of multisensory input on neuronal processes in face areas and other brain regions.

  12. Satellite voice broadcast system study. Volume 1: Executive summary

    NASA Technical Reports Server (NTRS)

    Horstein, M.

    1985-01-01

    The feasibility of providing Voice of America (VOA) broadcasts by satellite relay was investigated. Satellite voice broadcast systems are described for three different frequency bands: HF, VHF, and L-band. Geostationary satellite configurations are considered for these frequency bands. A system of subsynchronous, circular satellites with an orbit period of 8 hours was developed for the HF band. The VHF broadcasts are provided by a system of Molniya satellites. The satellite designs are limited in size and weight to the capability of the STS/Centaur launch vehicle combination. At L-band, only four geostationary satellites are needed to meet the requirements of the complete broadcast schedule. These satellites are comparable in size and weight to current satellites designed for the direct broadcast of video program material.

  13. Micro-video display with ocular tracking and interactive voice control

    NASA Technical Reports Server (NTRS)

    Miller, James E.

    1993-01-01

    In certain space-restricted environments, many of the benefits resulting from computer technology have been foregone because of the size, weight, inconvenience, and lack of mobility associated with existing computer interface devices. Accordingly, an effort to develop a highly miniaturized and 'wearable' computer display and control interface device, referred to as the Sensory Integrated Data Interface (SIDI), is underway. The system incorporates a micro-video display that provides data display and ocular tracking on a lightweight headset. Software commands are implemented by conjunctive eye movement and voice commands of the operator. In this initial prototyping effort, various 'off-the-shelf' components have been integrated into a desktop computer with a customized menu-tree software application to demonstrate feasibility and conceptual capabilities. When fully developed as a customized system, the interface device will allow mobile, 'hands-free' operation of portable computer equipment. It will thus allow integration of information technology applications into those restrictive environments, both military and industrial, that have not yet taken advantage of the computer revolution. This effort is Phase 1 of Small Business Innovative Research (SBIR) Topic number N90-331 sponsored by the Naval Undersea Warfare Center Division, Newport. The prime contractor is Foster-Miller, Inc. of Waltham, MA.

  14. Inside-in, alternative paradigms for sound spatialization

    NASA Astrophysics Data System (ADS)

    Bahn, Curtis; Moore, Stephan

    2003-04-01

    Arrays of widely spaced mono-directional loudspeakers (P.A.-style stereo configurations or "outside-in" surround-sound systems) have long provided the dominant paradigms for electronic sound diffusion. So prevalent are these models that alternatives have largely been ignored, and electronic sound, regardless of musical aesthetic, has come to be inseparably associated with single-channel speakers or headphones. We recognize the value of these familiar paradigms, but believe that electronic sound can and should have many alternative, idiosyncratic voices. Through the design and construction of unique sound diffusion structures, one can reinvent the nature of electronic sound; when allied with new sensor technologies, these structures offer alternative modes of interaction with techniques of sonic computation. This paper describes several recent applications of spherical speakers (multichannel, outward-radiating geodesic speaker arrays) and Sensor-Speaker-Arrays (SenSAs: combinations of various sensor devices with outward-radiating multi-channel speaker arrays). This presentation introduces the development of four generations of spherical speakers (over a hundred individual speakers of various configurations) and their use in many different musical situations including live performance, recording, and sound installation. We describe the design and construction of these systems and, more generally, the new "voices" they give to electronic sound.

  15. Voice over Internet Protocol (VoIP) Technology as a Global Learning Tool: Information Systems Success and Control Belief Perspectives

    ERIC Educational Resources Information Center

    Chen, Charlie C.; Vannoy, Sandra

    2013-01-01

    Voice over Internet Protocol (VoIP)-enabled online learning service providers are struggling with high attrition rates and low customer loyalty despite VoIP's high degree of system fit for online global learning applications. Effective solutions to this prevalent problem rely on an understanding of system quality, information quality, and…

  16. a Study of Multiplexing Schemes for Voice and Data.

    NASA Astrophysics Data System (ADS)

    Sriram, Kotikalapudi

    Voice traffic variations are characterized by on/off transitions of voice calls and talkspurt/silence transitions of speakers in conversations. A speaker is known to be silent for more than half the time during a telephone conversation. In this dissertation, we study schemes which exploit speaker silences for efficient utilization of the transmission capacity in integrated voice/data multiplexing and in digital speech interpolation. We study two voice/data multiplexing schemes. In each scheme, any time slots momentarily unutilized by the voice traffic are made available to data. In the first scheme, the multiplexer does not use speech activity detectors (SAD), and hence the voice traffic variations are due to call on/off only. In the second scheme, the multiplexer detects speaker silences using SAD and transmits voice only during talkspurts. The multiplexer with SAD performs digital speech interpolation (DSI) as well as dynamic channel allocation to voice and data. The performance of the two schemes is evaluated using discrete-time modeling and analysis. The data delay performance for the case of English speech is compared with that for the case of Japanese speech. A closed-form expression for the mean data message delay is derived for the single-channel, single-talker case. In a DSI system, occasional speech losses occur whenever the number of speakers in simultaneous talkspurt exceeds the number of TDM voice channels. In a buffered DSI system, speech loss is further reduced at the cost of delay. We propose a novel fixed-delay buffered DSI scheme. In this scheme, speech fill-in/hangover is not required because there are no variable delays; hence, all silences that naturally occur in speech are fully utilized, and a substantial improvement in DSI performance is made possible. The scheme is modeled and analyzed in discrete time. Its performance is evaluated in terms of the probability of speech clipping, packet rejection ratio, DSI advantage, and delay.
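    The speech-loss condition described above (more speakers in simultaneous talkspurt than TDM channels) has a standard binomial "freeze-out" formulation, sketched below. This is the generic textbook model, not the dissertation's own analysis, and the parameter values (24 speakers, 16 channels, talkspurt probability 0.4) are illustrative.

```python
# Hedged sketch: binomial freeze-out model for DSI overload. With M
# speakers each independently in talkspurt with probability p, clipping
# occurs when more than N TDM channels are demanded at once.
from math import comb

def clip_probability(M, N, p=0.4):
    """P(more than N of M speakers are simultaneously in talkspurt)."""
    return sum(comb(M, k) * p**k * (1 - p)**(M - k)
               for k in range(N + 1, M + 1))

# 24 speakers sharing 16 channels: because speakers are silent more than
# half the time (p < 0.5), simultaneous overload is rare.
print(round(clip_probability(24, 16), 6))
```

    The DSI advantage comes from exactly this gap: the channel count N can be well below the speaker count M while keeping the overload probability small.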

  17. 47 CFR 22.1007 - Channels for offshore radiotelephone systems.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... emergency auto alarm and voice transmission pertaining to emergency conditions only. Central Subscriber 488... fixed, surface and/or airborne mobile) as indicated, for emergency auto alarm and voice transmission...

  18. 47 CFR 22.1007 - Channels for offshore radiotelephone systems.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... emergency auto alarm and voice transmission pertaining to emergency conditions only. Central Subscriber 488... fixed, surface and/or airborne mobile) as indicated, for emergency auto alarm and voice transmission...

  19. 47 CFR 22.1007 - Channels for offshore radiotelephone systems.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... emergency auto alarm and voice transmission pertaining to emergency conditions only. Central Subscriber 488... fixed, surface and/or airborne mobile) as indicated, for emergency auto alarm and voice transmission...

  20. Using the Web to Market Your Schools.

    ERIC Educational Resources Information Center

    Carr, Nora

    2001-01-01

    With careful planning and a strategic focus, today's technology can greatly enhance a district's marketing efforts. Websites can offer features such as interactive school assignment (based on home address), ability to check student progress, education portals (24-hour news channels), one-to-one communication, and interactive voice responses. (MLH)

  1. A system for analysis and classification of voice communications

    NASA Technical Reports Server (NTRS)

    Older, H. J.; Jenney, L. L.; Garland, L.

    1973-01-01

    A method for analysis and classification of verbal communications typically associated with manned space missions or simulations was developed. The study was carried out in two phases. Phase 1 was devoted to identification of crew tasks and activities which require voice communication for accomplishment or reporting. Phase 2 entailed development of a message classification system and a preliminary test of its feasibility. The classification system permits voice communications to be analyzed to three progressively more specific levels of detail and to be described in terms of message content, purpose, and the participants in the information exchange. A coding technique was devised to allow messages to be recorded by an eight-digit number.

  2. Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

    DTIC Science & Technology

    2017-12-04

    frequency scale (male voice; normal voice effort)... Fig. 2 Diagram of a speech communication system (Letowski...languages. Consonants contain mostly high-frequency (above 1500 Hz) speech energy, but this energy is relatively small in comparison to that of the whole...voices (Letowski et al. 1993). Since the mid-frequency spectral region contains mostly vowel energy while consonants are high-frequency sounds, an

  3. An Investigation of Multidimensional Voice Program Parameters in Three Different Databases for Voice Pathology Detection and Classification.

    PubMed

    Al-Nasheri, Ahmed; Muhammad, Ghulam; Alsulaiman, Mansour; Ali, Zulfiqar; Mesallam, Tamer A; Farahat, Mohamed; Malki, Khalid H; Bencherif, Mohamed A

    2017-01-01

    Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes. Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples. The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
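
    The Fisher discrimination ratio used above to rank MDVP parameters can be sketched as follows (a minimal illustration of the standard two-class Fisher ratio; function and variable names are ours, not the paper's):

```python
import numpy as np

def fisher_ratio(normal, pathological):
    """Fisher discrimination ratio of one feature between two classes:
    squared difference of the class means over the sum of the class
    variances. Larger values indicate better class separation."""
    m1, m2 = np.mean(normal), np.mean(pathological)
    v1, v2 = np.var(normal), np.var(pathological)
    return (m1 - m2) ** 2 / (v1 + v2)

def rank_features(X_normal, X_path):
    """Rank feature columns by descending Fisher ratio."""
    scores = [fisher_ratio(X_normal[:, j], X_path[:, j])
              for j in range(X_normal.shape[1])]
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order, scores
```

    A well-separated feature (distinct means, small variances) scores high; a feature whose normal and pathological distributions overlap scores near zero.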

  4. A hybrid voice/data modulation for the VHF aeronautical channels

    NASA Technical Reports Server (NTRS)

    Akos, Dennis M.

    1993-01-01

    A method of improving the spectral efficiency of the existing Very High Frequency (VHF) Amplitude Modulation (AM) voice communication channels is proposed. The technique is to phase modulate the existing voice amplitude modulated carrier with digital data. This allows the transmission of digital information over an existing AM voice channel with no change to the existing AM signal format. There is no modification to the existing AM receiver to demodulate the voice signal and an additional receiver module can be added for processing of the digital data. The existing VHF AM transmitter requires only a slight modification for the addition of the digital data signal. The past work in the area is summarized and presented together with an improved system design and the proposed implementation.
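
    A minimal sketch of the hybrid modulation idea, assuming a baseband voice signal m(t) with |m(t)| < 1 and binary data mapped to +/- phase shifts (all parameter names and values are illustrative, not taken from the paper):

```python
import numpy as np

def hybrid_am_pm(voice, data_bits, fc, fs, beta=0.5):
    """Phase-modulate digital data onto an existing AM voice carrier:
    s(t) = (1 + m(t)) * cos(2*pi*fc*t + beta*d(t)).
    The AM envelope (1 + m(t)) is untouched by the phase term, so a
    legacy envelope detector still recovers the voice; a coherent
    receiver can additionally track the +/-beta phase shifts."""
    t = np.arange(len(voice)) / fs
    # map bits to +/-1 and hold each bit over equal-length chips
    chips = np.repeat(2 * np.asarray(data_bits) - 1,
                      int(np.ceil(len(voice) / len(data_bits))))[:len(voice)]
    return (1 + voice) * np.cos(2 * np.pi * fc * t + beta * chips)
```

    Because the data ride entirely in the carrier phase, the existing AM signal format and receivers are unaffected, which is the compatibility property the abstract emphasizes.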

  5. Effects of voice harmonic complexity on ERP responses to pitch-shifted auditory feedback.

    PubMed

    Behroozmand, Roozbeh; Korzyukov, Oleg; Larson, Charles R

    2011-12-01

    The present study investigated the neural mechanisms of voice pitch control for different levels of harmonic complexity in the auditory feedback. Event-related potentials (ERPs) were recorded in response to +200 cents pitch perturbations in the auditory feedback of self-produced natural human vocalizations, complex and pure tone stimuli during active vocalization and passive listening conditions. During active vocal production, ERP amplitudes were largest in response to pitch shifts in the natural voice, moderately large for non-voice complex stimuli and smallest for the pure tones. However, during passive listening, neural responses were equally large for pitch shifts in voice and non-voice complex stimuli but still larger than that for pure tones. These findings suggest that pitch change detection is facilitated for spectrally rich sounds such as natural human voice and non-voice complex stimuli compared with pure tones. Vocalization-induced increase in neural responses for voice feedback suggests that sensory processing of naturally produced complex sounds such as human voice is enhanced by means of motor-driven mechanisms (e.g. efference copies) during vocal production. This enhancement may enable the audio-vocal system to more effectively detect and correct for vocal errors in the feedback of natural human vocalizations to maintain an intended vocal output for speaking. Copyright © 2011 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  6. Modeling and Analysis of Hybrid Cellular/WLAN Systems with Integrated Service-Based Vertical Handoff Schemes

    NASA Astrophysics Data System (ADS)

    Xia, Weiwei; Shen, Lianfeng

    We propose two vertical handoff schemes for cellular network and wireless local area network (WLAN) integration: integrated service-based handoff (ISH) and integrated service-based handoff with queue capabilities (ISHQ). Compared with existing handoff schemes in integrated cellular/WLAN networks, the proposed schemes consider a more comprehensive set of system characteristics such as different features of voice and data services, dynamic information about the admitted calls, user mobility and vertical handoffs in two directions. The code division multiple access (CDMA) cellular network and IEEE 802.11e WLAN are taken into account in the proposed schemes. We model the integrated networks by using multi-dimensional Markov chains and the major performance measures are derived for voice and data services. The important system parameters such as thresholds to prioritize handoff voice calls and queue sizes are optimized. Numerical results demonstrate that the proposed ISHQ scheme can maximize the utilization of overall bandwidth resources with the best quality of service (QoS) provisioning for voice and data services.
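
    The flavor of this kind of Markov-chain modeling can be seen in a much simpler one-dimensional example: a single cell with guard channels that prioritize handoff calls (a textbook birth-death sketch under Poisson arrivals and exponential holding times, not the paper's multi-dimensional cellular/WLAN model):

```python
def blocking_probabilities(C, g, lam_new, lam_ho, mu):
    """Steady-state blocking in a single-cell birth-death chain with
    g guard channels reserved for handoffs: new calls are admitted
    only while fewer than C-g channels are busy, handoff calls while
    fewer than C are busy. Returns (P_block_new, P_block_handoff)."""
    pi = [1.0]  # unnormalized state probabilities, state = busy channels
    for n in range(1, C + 1):
        arrival = lam_new + lam_ho if n - 1 < C - g else lam_ho
        pi.append(pi[-1] * arrival / (n * mu))
    total = sum(pi)
    pi = [p / total for p in pi]
    p_block_new = sum(pi[C - g:])  # new call blocked when >= C-g busy
    p_block_ho = pi[C]             # handoff blocked only when all C busy
    return p_block_new, p_block_ho
```

    With g = 0 the two probabilities coincide (the Erlang-B case); reserving guard channels trades higher new-call blocking for lower handoff dropping, the same prioritization the paper's threshold parameters optimize.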

  7. Adaptive Suppression of Noise in Voice Communications

    NASA Technical Reports Server (NTRS)

    Kozel, David; DeVault, James A.; Birr, Richard B.

    2003-01-01

    A subsystem for the adaptive suppression of noise in a voice communication system effects a high level of reduction of noise that enters the system through microphones. The subsystem includes a digital signal processor (DSP) plus circuitry that implements voice-recognition and spectral- manipulation techniques. The development of the adaptive noise-suppression subsystem was prompted by the following considerations: During processing of the space shuttle at Kennedy Space Center, voice communications among test team members have been significantly impaired in several instances because some test participants have had to communicate from locations with high ambient noise levels. Ear protection for the personnel involved is commercially available and is used in such situations. However, commercially available noise-canceling microphones do not provide sufficient reduction of noise that enters through microphones and thus becomes transmitted on outbound communication links.
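
    A common spectral-manipulation technique for this kind of microphone noise reduction is magnitude spectral subtraction; the sketch below is a generic single-frame illustration, not the NASA subsystem's actual algorithm:

```python
import numpy as np

def spectral_subtraction(frame, noise_mag, floor=0.02):
    """One-frame magnitude spectral subtraction: subtract an average
    noise magnitude spectrum (estimated during non-speech intervals),
    clamp to a small spectral floor to limit musical noise, and
    resynthesize using the noisy phase."""
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
```

    In practice this is applied frame by frame with overlap-add, and the noise spectrum is updated whenever a voice-activity detector reports silence.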

  8. Evaluation of Different Speech and Touch Interfaces to In-Vehicle Music Retrieval Systems

    PubMed Central

    Garay-Vega, L.; Pradhan, A. K.; Weinberg, G.; Schmidt-Nielsen, B.; Harsham, B.; Shen, Y.; Divekar, G.; Romoser, M.; Knodler, M.; Fisher, D. L.

    2010-01-01

    In-vehicle music retrieval systems are becoming more and more popular. Previous studies have shown that they pose a real hazard to drivers when the interface is a tactile one which requires multiple entries and a combination of manual control and visual feedback. Voice interfaces exist as an alternative. Such interfaces can require either multiple or single conversational turns. In this study, each of 17 participants between the ages of 18 and 30 years old was asked to use three different music-retrieval systems (one with a multiple entry touch interface, the iPod™, one with a multiple turn voice interface, interface B, and one with a single turn voice interface, interface C) while driving through a virtual world. Measures of secondary task performance, eye behavior, vehicle control, and workload were recorded. When compared with the touch interface, the voice interfaces reduced the total time drivers spent with their eyes off the forward roadway, especially in prolonged glances, as well as both the total number of glances away from the forward roadway and the perceived workload. Furthermore, when compared with driving without a secondary task, both voice interfaces did not significantly impact hazard anticipation, the frequency of long glances away from the forward roadway, or vehicle control. The multiple turn voice interface (B) significantly increased both the time it took drivers to complete the task and the workload. The implications for interface design and safety are discussed. PMID:20380920

  9. Audio-vocal system regulation in children with autism spectrum disorders.

    PubMed

    Russo, Nicole; Larson, Charles; Kraus, Nina

    2008-06-01

    Do children with autism spectrum disorders (ASD) respond similarly to perturbations in auditory feedback as typically developing (TD) children? Presentation of pitch-shifted voice auditory feedback to vocalizing participants reveals a close coupling between the processing of auditory feedback and vocal motor control. This paradigm was used to test the hypothesis that abnormalities in the audio-vocal system would negatively impact ASD compensatory responses to perturbed auditory feedback. Voice fundamental frequency (F0) was measured while children produced an /a/ sound into a microphone. The voice signal was fed back to the subjects in real time through headphones. During production, the feedback was pitch shifted (-100 cents, 200 ms) at random intervals for 80 trials. Averaged voice F0 responses to pitch-shifted stimuli were calculated and correlated with both mental and language abilities as tested via standardized tests. A subset of children with ASD produced larger responses to perturbed auditory feedback than TD children, while the other children with ASD produced significantly lower response magnitudes. Furthermore, robust relationships between language ability, response magnitude and time of peak magnitude were identified. Because auditory feedback helps to stabilize voice F0 (a major acoustic cue of prosody) and individuals with ASD have problems with prosody, this study identified potential mechanisms of dysfunction in the audio-vocal system for voice pitch regulation in some children with ASD. Objectively quantifying this deficit may inform both the assessment of a subgroup of ASD children with prosody deficits, as well as remediation strategies that incorporate pitch training.
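
    The pitch perturbations in this paradigm are specified in cents; the standard conversion from cents to a frequency ratio is, for reference:

```python
def shift_f0(f0_hz, cents):
    """Apply a pitch shift expressed in cents to a fundamental
    frequency (100 cents = 1 semitone; ratio = 2**(cents/1200))."""
    return f0_hz * 2 ** (cents / 1200)
```

    The study's -100 cent shift therefore lowers a 200 Hz voice by one semitone, to roughly 189 Hz.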

  10. SPACEWAY: Providing affordable and versatile communication solutions

    NASA Astrophysics Data System (ADS)

    Fitzpatrick, E. J.

    1995-08-01

    By the end of this decade, Hughes' SPACEWAY network will provide the first interactive 'bandwidth on demand' communication services for a variety of applications. High quality digital voice, interactive video, global access to multimedia databases, and transborder workgroup computing will make SPACEWAY an essential component of the computer-based workplace of the 21st century. With relatively few satellites to construct, insure, and launch -- plus extensive use of cost-effective, tightly focused spot beams on the world's most populated areas -- the high capacity SPACEWAY system can pass its significant cost savings onto its customers. The SPACEWAY network is different from other proposed global networks in that its geostationary orbit location makes it a truly market driven system: each satellite will make available extensive telecom services to hundreds of millions of people within the continuous view of that satellite, providing immediate capacity within a specific region of the world.

  11. Wearable Technology to Garner the Perspective of Dementia Family Caregivers

    PubMed Central

    Matthews, Judith T.; Campbell, Grace B.; Hunsaker, Amanda E.; Klinger, Julie; Mecca, Laurel Person; Hu, Lu; Hostein, Sally; Lingler, Jennifer H.

    2015-01-01

    Family caregivers of persons with dementia typically have limited opportunity during brief clinical encounters to describe the dementia-related behaviors and interactions that they find difficult to handle. Lack of objective data depicting the nature, intensity, and impact of these manifestations of the underlying disease further constrains the extent to which strategies recommended by nurses or other health care providers can be tailored to the situation. We describe a prototype wearable camera system used to gather image and voice data from the caregiver’s perspective in a pilot feasibility intervention study conducted with 18 caregiving dyads. Several scenarios are presented that incorporate salient events (i.e., behaviors or interactions deemed difficult by the caregiver or identified as concerning by our team during screening) identified in the resulting video. We anticipate that future wearable camera systems and software will automate screening for salient events, providing new tools for assessment and intervention by nurses. PMID:26468655

  12. SPACEWAY: Providing affordable and versatile communication solutions

    NASA Technical Reports Server (NTRS)

    Fitzpatrick, E. J.

    1995-01-01

    By the end of this decade, Hughes' SPACEWAY network will provide the first interactive 'bandwidth on demand' communication services for a variety of applications. High quality digital voice, interactive video, global access to multimedia databases, and transborder workgroup computing will make SPACEWAY an essential component of the computer-based workplace of the 21st century. With relatively few satellites to construct, insure, and launch -- plus extensive use of cost-effective, tightly focused spot beams on the world's most populated areas -- the high capacity SPACEWAY system can pass its significant cost savings onto its customers. The SPACEWAY network is different from other proposed global networks in that its geostationary orbit location makes it a truly market driven system: each satellite will make available extensive telecom services to hundreds of millions of people within the continuous view of that satellite, providing immediate capacity within a specific region of the world.

  13. Using Ambulatory Voice Monitoring to Investigate Common Voice Disorders: Research Update

    PubMed Central

    Mehta, Daryush D.; Van Stan, Jarrad H.; Zañartu, Matías; Ghassemi, Marzyeh; Guttag, John V.; Espinoza, Víctor M.; Cortés, Juan P.; Cheyne, Harold A.; Hillman, Robert E.

    2015-01-01

    Many common voice disorders are chronic or recurring conditions that are likely to result from inefficient and/or abusive patterns of vocal behavior, referred to as vocal hyperfunction. The clinical management of hyperfunctional voice disorders would be greatly enhanced by the ability to monitor and quantify detrimental vocal behaviors during an individual’s activities of daily life. This paper provides an update on ongoing work that uses a miniature accelerometer on the neck surface below the larynx to collect a large set of ambulatory data on patients with hyperfunctional voice disorders (before and after treatment) and matched-control subjects. Three types of analysis approaches are being employed in an effort to identify the best set of measures for differentiating among hyperfunctional and normal patterns of vocal behavior: (1) ambulatory measures of voice use that include vocal dose and voice quality correlates, (2) aerodynamic measures based on glottal airflow estimates extracted from the accelerometer signal using subject-specific vocal system models, and (3) classification based on machine learning and pattern recognition approaches that have been used successfully in analyzing long-term recordings of other physiological signals. Preliminary results demonstrate the potential for ambulatory voice monitoring to improve the diagnosis and treatment of common hyperfunctional voice disorders. PMID:26528472

  14. Crossmodal plasticity in the fusiform gyrus of late blind individuals during voice recognition.

    PubMed

    Hölig, Cordula; Föcker, Julia; Best, Anna; Röder, Brigitte; Büchel, Christian

    2014-12-01

    Blind individuals are trained in identifying other people through voices. In congenitally blind adults the anterior fusiform gyrus has been shown to be active during voice recognition. Such crossmodal changes have been associated with a superiority of blind adults in voice perception. The key question of the present functional magnetic resonance imaging (fMRI) study was whether visual deprivation that occurs in adulthood is followed by similar adaptive changes of the voice identification system. Late blind individuals and matched sighted participants were tested in a priming paradigm, in which two voice stimuli were subsequently presented. The prime (S1) and the target (S2) were either from the same speaker (person-congruent voices) or from two different speakers (person-incongruent voices). Participants had to classify the S2 as either coming from an old or a young person. Only in late blind but not in matched sighted controls, the activation in the anterior fusiform gyrus was modulated by voice identity: late blind volunteers showed an increase of the BOLD signal in response to person-incongruent compared with person-congruent trials. These results suggest that the fusiform gyrus adapts to input of a new modality even in the mature brain and thus demonstrate an adult type of crossmodal plasticity. Copyright © 2014 Elsevier Inc. All rights reserved.

  15. Factors associated with voice therapy outcomes in the treatment of presbyphonia.

    PubMed

    Mau, Ted; Jacobson, Barbara H; Garrett, C Gaelyn

    2010-06-01

    Age, vocal fold atrophy, glottic closure pattern, and the burden of medical problems are associated with voice therapy outcomes for presbyphonia. Retrospective. Records of patients seen over a 3-year period at a voice center were screened. Inclusion criteria consisted of age over 55 years, primary complaint of hoarseness, presence of vocal fold atrophy on examination, and absence of laryngeal or neurological pathology. Videostroboscopic examinations on initial presentation were reviewed. Voice therapy outcomes were assessed with the American Speech-Language-Hearing Association National Outcomes Measurement System scale. Statistical analysis was performed with Spearman rank correlation and chi-square tests. Sixty-seven patients were included in the study. Of the patients, 85% demonstrated improvement with voice therapy. The most common type of glottic closure consisted of a slit gap. Neither gender nor age had an effect on voice therapy outcomes. Larger glottic gaps on initial stroboscopy examination and more pronounced vocal fold atrophy were weakly correlated with less improvement from voice therapy. A weak correlation was also found between the number of chronic medical conditions and poorer outcomes from voice therapy. The degree of clinician-determined improvement in vocal function from voice therapy is independent of patient age but is influenced by the degree of vocal fold atrophy, glottic closure pattern, and the patient's burden of medical problems.

  16. Overview of the Anik C satellites and services

    NASA Astrophysics Data System (ADS)

    Smart, F. H.

    An overview of the important technical characteristics of the Anik C series of Canadian communications satellites is presented. The system was launched as part of the Telesat Communications payload of the Space Shuttle in 1982. Among the services the system will in the near future provide are: a 27 MHz channel bandwidth television service for pay-TV distribution in Canada; two TV channels for hockey broadcasts and a transportable TV system; a heavy-route voice telephone service for five major Canadian cities; and a telephone system for business voice and data communications. Services anticipated for Anik-C satellites later in the decade include a Single Channel Per Carrier (SCPC) voice and data communications system for British Columbia and the Maritime Provinces, and a direct-to-home broadcast service to be sold to television markets in the United States.

  17. Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair

    NASA Astrophysics Data System (ADS)

    Sasou, Akira; Kojima, Hiroaki

    2009-12-01

    Conventional voice-driven wheelchairs usually employ headset microphones, which can achieve sufficient recognition accuracy even in the presence of surrounding noise. However, such interfaces require users to wear a sensor such as a headset microphone, which can be an impediment, especially for the hand disabled. Conversely, it is well known that speech recognition accuracy degrades drastically when the microphone is placed far from the user. In this paper, we develop a noise robust speech recognition system for a voice-driven wheelchair that achieves almost the same recognition accuracy as a headset microphone without requiring the user to wear sensors. We verified the effectiveness of our system in experiments in different environments, confirming this level of accuracy.

  18. THROUGH-THE EARTH (TTE) SYSTEM AND THE IN-MINE POWER LINE (IMPL) SYSTEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zvi H. Meiksin

    Work has progressed on both subsystems: the through-the-earth (TTE) system and the In-Mine Power Line (IMPL) system. After the Lab prototype of the IMPL system was perfected to function satisfactorily, the thrust of the work focused on building a first production prototype that can be installed and tested inside a mine. To obtain multi-channel voice communication through the TTE system, effort has proceeded to compress voice messages and make the format compatible with the power-line interface protocol.

  19. Noise Source Visualization Using a Digital Voice Recorder and Low-Cost Sensors

    PubMed Central

    Cho, Yong Thung

    2018-01-01

    Accurate sound visualization of noise sources is required for optimal noise control. Typically, noise measurement systems require microphones, an analog-digital converter, cables, a data acquisition system, etc., which may not be affordable for potential users. Also, many such systems are not highly portable and may not be convenient for travel. Handheld personal electronic devices such as smartphones and digital voice recorders with relatively lower costs and higher performance have become widely available recently. Even though such devices are highly portable, directly implementing them for noise measurement may lead to erroneous results since such equipment was originally designed for voice recording. In this study, external microphones were connected to a digital voice recorder to conduct measurements and the input received was processed for noise visualization. In this way, a low cost, compact sound visualization system was designed and introduced to visualize two actual noise sources for verification with different characteristics: an enclosed loud speaker and a small air compressor. Reasonable accuracy of noise visualization for these two sources was shown over a relatively wide frequency range. This very affordable and compact sound visualization system can be used for many actual noise visualization applications in addition to educational purposes. PMID:29614038

  20. Multipath/RFI/modulation study for DRSS-RFI problem: Voice coding and intelligibility testing for a satellite-based air traffic control system

    NASA Technical Reports Server (NTRS)

    Birch, J. N.; Getzin, N.

    1971-01-01

    Analog and digital voice coding techniques for application to an L-band satellite-based air traffic control (ATC) system for over-ocean deployment are examined. In addition to performance, the techniques are compared on the basis of cost, size, weight, power consumption, availability, reliability, and multiplexing features. Candidate systems are chosen on the bases of minimum required RF bandwidth and received carrier-to-noise density ratios. A detailed survey of automated and nonautomated intelligibility testing methods and devices is presented and comparisons given. Subjective evaluation of speech systems by preference tests is considered. Conclusions and recommendations are developed regarding the selection of the voice system, and likewise for the appropriate use of intelligibility tests, speech quality measurements, and preference tests within the framework of the proposed ATC system.

  1. Full Duplex, Spread Spectrum Radio System

    NASA Technical Reports Server (NTRS)

    Harvey, Bruce A.

    2000-01-01

    The goal of this project was to support the development of a full duplex, spread spectrum voice communications system. The assembly and testing of a prototype system consisting of a Harris PRISM spread spectrum radio, a TMS320C54x signal processing development board and a Zilog Z80180 microprocessor was underway at the start of this project. The efforts under this project were the development of multiple access schemes, analysis of full duplex voice feedback delays, and the development and analysis of forward error correction (FEC) algorithms. The multiple access analysis involved the selection between code division multiple access (CDMA), frequency division multiple access (FDMA) and time division multiple access (TDMA). Full duplex voice feedback analysis involved the analysis of packet size and delays associated with full loop voice feedback for confirmation of radio system performance. FEC analysis included studies of the performance under the expected burst error scenario with the relatively short packet lengths, and analysis of implementation in the TMS320C54x digital signal processor. When the capabilities and the limitations of the components used were considered, the multiple access scheme chosen was a combination TDMA/FDMA scheme that will provide up to eight users on each of three separate frequencies. Packets to and from each user will consist of 16 samples at a rate of 8,000 samples per second for a total of 2 ms of voice information. The resulting voice feedback delay will therefore be 4 - 6 ms. The most practical FEC algorithm for implementation was a convolutional code with a Viterbi decoder. Interleaving of the bits of each packet will be required to offset the effects of burst errors.
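
    The packet timing arithmetic in the abstract can be captured directly (a trivial check of the stated figures, with the abstract's values as defaults):

```python
def packet_duration_ms(samples_per_packet=16, sample_rate=8000):
    """Duration of voice carried by one packet, in milliseconds."""
    return 1000 * samples_per_packet / sample_rate

def system_capacity(users_per_freq=8, n_freqs=3):
    """Total users supported by the combined TDMA/FDMA scheme."""
    return users_per_freq * n_freqs
```

    Sixteen samples at 8,000 samples per second is indeed 2 ms of voice per packet, and eight TDMA users on each of three frequencies gives 24 users, consistent with the stated 4 - 6 ms loop delay once round-trip packetization is included.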

  2. Mobile Communication Devices, Ambient Noise, and Acoustic Voice Measures.

    PubMed

    Maryn, Youri; Ysenbaert, Femke; Zarowski, Andrzej; Vanspauwen, Robby

    2017-03-01

    The ability to move with mobile communication devices (MCDs; i.e., smartphones and tablet computers) may induce differences in microphone-to-mouth positioning and use in noise-packed environments, and thus influence reliability of acoustic voice measurements. This study investigated differences in various acoustic voice measures between six recording equipments in backgrounds with low and increasing noise levels. One chain of continuous speech and sustained vowel from 50 subjects with voice disorders (all separated by silence intervals) was radiated and re-recorded in an anechoic chamber with five MCDs and one high-quality recording system. These recordings were acquired in one condition without ambient noise and in four conditions with increased ambient noise. A total of 10 acoustic voice markers were obtained in the program Praat. Differences between MCDs and noise conditions were assessed with Friedman repeated-measures test and post hoc Wilcoxon signed-rank tests, both for related samples, after Bonferroni correction. (1) Except median fundamental frequency and seven nonsignificant differences, MCD samples have significantly higher acoustic markers than clinical reference samples in minimal environmental noise. (2) Except median fundamental frequency, jitter local, and jitter RAP, all acoustic measures on samples recorded with the reference system experienced significant influence from room noise levels. Fundamental frequency is resistant to recording system, environmental noise, and their combination. All other measures, however, were impacted by both recording system and noise condition, and especially by their combination, often already in the reference/baseline condition without added ambient noise. Caution is therefore warranted regarding implementation of MCDs as clinical recording tools, particularly when applied for treatment outcomes assessments. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
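
    Jitter (local), one of the Praat markers reported above, is conventionally defined as the mean absolute difference between consecutive pitch periods divided by the mean period; a minimal sketch of that definition (assuming the pitch periods have already been extracted):

```python
import numpy as np

def jitter_local(periods):
    """Jitter (local): mean absolute difference between consecutive
    glottal periods divided by the mean period (often reported as a
    percentage). Expects a sequence of period durations in seconds."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)
```

    A perfectly periodic voice yields zero jitter; cycle-to-cycle irregularity, which noise and recording-chain artifacts can inflate, raises it.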

  3. Real-time interactive speech technology at Threshold Technology, Incorporated

    NASA Technical Reports Server (NTRS)

    Herscher, Marvin B.

    1977-01-01

    Basic real-time isolated-word recognition techniques are reviewed. Industrial applications of voice technology are described in chronological order of their development. Future research efforts are also discussed.

  4. How do you say 'hello'? Personality impressions from brief novel voices.

    PubMed

    McAleer, Phil; Todorov, Alexander; Belin, Pascal

    2014-01-01

    On hearing a novel voice, listeners readily form personality impressions of that speaker. Accurate or not, these impressions are known to affect subsequent interactions; yet the underlying psychological and acoustical bases remain poorly understood. Furthermore, hitherto studies have focussed on extended speech as opposed to analysing the instantaneous impressions we obtain from first experience. In this paper, through a mass online rating experiment, 320 participants rated 64 sub-second vocal utterances of the word 'hello' on one of 10 personality traits. We show that: (1) personality judgements of brief utterances from unfamiliar speakers are consistent across listeners; (2) a two-dimensional 'social voice space' with axes mapping Valence (Trust, Likeability) and Dominance, each driven by differing combinations of vocal acoustics, adequately summarises ratings in both male and female voices; and (3) a positive combination of Valence and Dominance results in increased perceived male vocal Attractiveness, whereas perceived female vocal Attractiveness is largely controlled by increasing Valence. Results are discussed in relation to the rapid evaluation of personality and, in turn, the intent of others, as being driven by survival mechanisms via approach or avoidance behaviours. These findings provide empirical bases for predicting personality impressions from acoustical analyses of short utterances and for generating desired personality impressions in artificial voices.
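    The two-dimensional "social voice space" summarizing many trait ratings can be illustrated with a speculative dimensionality-reduction sketch (the authors' actual modeling may differ). Here, ratings for 64 voices on 10 traits are simulated from two latent axes, so a principal component decomposition should recover roughly two dominant components.

```python
# Speculative PCA-via-SVD sketch of recovering a low-dimensional trait
# space from voice ratings. All data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(3)
valence = rng.normal(0, 1, 64)      # latent axis 1
dominance = rng.normal(0, 1, 64)    # latent axis 2
loadings = rng.normal(0, 1, (2, 10))

# 64 voices x 10 traits, built from the two axes plus small noise.
ratings = (np.outer(valence, loadings[0])
           + np.outer(dominance, loadings[1])
           + rng.normal(0, 0.1, (64, 10)))

centered = ratings - ratings.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
var_explained = s**2 / np.sum(s**2)
print("variance explained by first two components:",
      round(var_explained[:2].sum(), 3))
```

    With two genuine latent axes, the first two components absorb nearly all of the rating variance, which is the sense in which a two-dimensional space "adequately summarises" the data.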

  5. Instrumental and perceptual evaluations of two related singers.

    PubMed

    Buder, Eugene H; Wolf, Teresa

    2003-06-01

    The primary goal of this study was to characterize a performer's singing and speaking voice. One woman was not admitted to a premier choral group, but her sister, who was comparable in physical characteristics and background, was admitted and provided a valuable control subject. The perceptual judgment of a vocal coach who conducted the group's auditions was decisive in discriminating these 2 singers. The singer not admitted to the group described a history of voice pathology, lacked a functional head register, and spoke with a voice characterized by hoarseness. Multiple listener judgments and acoustic and aerodynamic evaluations of both singers provided a more systematic basis for determining: 1) the phonatory basis for this judgment; 2) whether similar judgments would be made by groups of vocal coaches and speech-language pathologists; and 3) whether the type of tasks (e.g., sung vs. spoken) would influence these judgments. Statistically significant differences were observed between the ratings of vocal health provided by two different groups of listeners. Significant interactions were also observed as a function of the types of voice samples heard by these listeners. Instrumental analyses provided evidence that, in comparison to her sister, the rejected singer had a compromised vocal range, glottal insufficiencies as assessed aerodynamically and electroglottographically, and impaired acoustic quality, especially in her speaking voice.

  6. Middle Years Science Teachers Voice Their First Experiences with Interactive Whiteboard Technology

    ERIC Educational Resources Information Center

    Gadbois, Shannon A.; Haverstock, Nicole

    2012-01-01

    Among new technologies, interactive whiteboards (IWBs) particularly seem to engage students and offer entertainment value that may make them highly beneficial for learning. This study examined 10 Grade 6 teachers' initial experiences and uses of IWBs for teaching science. Through interviews, classroom visits, and field notes, the outcomes…

  7. Frontal brain activation in premature infants' response to auditory stimuli in neonatal intensive care unit.

    PubMed

    Saito, Yuri; Fukuhara, Rie; Aoyama, Shiori; Toshima, Tamotsu

    2009-07-01

    Focusing on the very few contacts with the mother's voice that NICU infants have in the womb as well as after birth, we examined whether they can discriminate between their mothers' utterances and those of female nurses in terms of the emotional bonding that is facilitated by prosodic utterances. Twenty-six premature infants were included in this study, and their cerebral blood flows were measured by near-infrared spectroscopy. They were exposed to auditory stimuli in the form of utterances made by their mothers and female nurses. A two (stimulus: mother and nurse) x two (recording site: right frontal area and left frontal area) analysis of variance (ANOVA) was conducted on these relative oxy-Hb values. The ANOVA showed a significant interaction between stimulus and recording site. The mother's and the nurse's voices elicited similar activation in the left frontal area but different reactions in the right frontal area. We presume that the nurse's voice might become associated with pain and stress for premature infants. Our results showed that the premature infants reacted differently to the different voice stimuli. Therefore, we presume that both mothers' and nurses' voices represent positive stimuli for premature infants, because both activate the frontal brain. Accordingly, we cannot explain our results only in terms of a state-dependent marker of infantile individual differences, but must also address the stressful trigger of nurses' voices for NICU infants.

  8. Show and Tell: Video Modeling and Instruction Without Feedback Improves Performance but Is Not Sufficient for Retention of a Complex Voice Motor Skill.

    PubMed

    Look, Clarisse; McCabe, Patricia; Heard, Robert; Madill, Catherine J

    2018-02-02

    Modeling and instruction are frequent components of both traditional and technology-assisted voice therapy. This study investigated the value of video modeling and instruction in the early acquisition and short-term retention of a complex voice task without external feedback. Thirty participants were randomized to two conditions and trained to produce a vocal siren over 40 trials. One group received a model and verbal instructions, the other group received a model only. Sirens were analyzed for phonation time, vocal intensity, cepstral peak prominence, peak-to-peak time, and root-mean-square error at five time points. The model and instruction group showed significant improvement on more outcome measures than the model-only group. There was an interaction effect for vocal intensity, which showed that instructions facilitated greater improvement when they were first introduced. However, neither group reproduced the model's siren performance across all parameters or retained the skill 1 day later. Providing verbal instruction with a model appears more beneficial than providing a model only in the prepractice phase of acquiring a complex voice skill. Improved performance was observed; however, the higher level of performance was not retained after 40 trials in both conditions. Other prepractice variables may need to be considered. Findings have implications for traditional and technology-assisted voice therapy. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  9. Natural asynchronies in audiovisual communication signals regulate neuronal multisensory interactions in voice-sensitive cortex

    PubMed Central

    Perrodin, Catherine; Kayser, Christoph; Logothetis, Nikos K.; Petkov, Christopher I.

    2015-01-01

    When social animals communicate, the onset of informative content in one modality varies considerably relative to the other, such as when visual orofacial movements precede a vocalization. These naturally occurring asynchronies do not disrupt intelligibility or perceptual coherence. However, they occur on time scales where they likely affect integrative neuronal activity in ways that have remained unclear, especially for hierarchically downstream regions in which neurons exhibit temporally imprecise but highly selective responses to communication signals. To address this, we exploited naturally occurring face- and voice-onset asynchronies in primate vocalizations. Using these as stimuli we recorded cortical oscillations and neuronal spiking responses from functional MRI (fMRI)-localized voice-sensitive cortex in the anterior temporal lobe of macaques. We show that the onset of the visual face stimulus resets the phase of low-frequency oscillations, and that the face–voice asynchrony affects the prominence of two key types of neuronal multisensory responses: enhancement or suppression. Our findings show a three-way association between temporal delays in audiovisual communication signals, phase-resetting of ongoing oscillations, and the sign of multisensory responses. The results reveal how natural onset asynchronies in cross-sensory inputs regulate network oscillations and neuronal excitability in the voice-sensitive cortex of macaques, a suggested animal model for human voice areas. These findings also advance predictions on the impact of multisensory input on neuronal processes in face areas and other brain regions. PMID:25535356

  10. Improving Higher Education Practice through Student Evaluation Systems: Is the Student Voice Being Heard?

    ERIC Educational Resources Information Center

    Blair, Erik; Valdez Noel, Keisha

    2014-01-01

    Many higher education institutions use student evaluation systems as a way of highlighting course and lecturer strengths and areas for improvement. Globally, the student voice has been increasing in volume, and capitalising on student feedback has been proposed as a means to benefit teacher professional development. This paper examines the student…

  11. 78 FR 63488 - 60-Day Notice of Proposed Information Collection: Grant Drawdown Payment Request/LOCCS/VRS Voice...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-24

    ... system. The information collected on the payment voucher will also be used as an internal control measure... LOCCS/VRS voice activated system. The information collected on the form serves also as an internal control measure to ensure the lawful and appropriate disbursement of Federal funds. DATES: Comments Due...

  12. Systems concept for speech technology application in general aviation

    NASA Technical Reports Server (NTRS)

    North, R. A.; Bergeron, H.

    1984-01-01

    The application potential of voice recognition and synthesis circuits for general aviation, single-pilot IFR (SPIFR) situations is examined. The viewpoint of the pilot was central to workload analyses and assessment of the effectiveness of the voice systems. A twin-engine, high-performance general aviation aircraft on a cross-country fixed route was employed as the study model. No actual control movements were considered, and other possible functions were scored by three IFR-rated instructors. The voice systems were concluded to be helpful in alleviating visual and manual SPIFR workloads during take-off, approach, and landing, particularly for data retrieval and entry tasks. Voice synthesis was an aid in alerting a pilot to in-flight problems. It is expected that usable systems will be available within 5 years.

  13. Neural effects of environmental advertising: An fMRI analysis of voice age and temporal framing.

    PubMed

    Casado-Aranda, Luis-Alberto; Martínez-Fiestas, Myriam; Sánchez-Fernández, Juan

    2018-01-15

    Ecological information offered to society through advertising enhances awareness of environmental issues, encourages the development of sustainable attitudes and intentions, and can even alter behavior. This paper, by means of functional Magnetic Resonance Imaging (fMRI) and self-reports, explores the underlying mechanisms of processing ecological messages. The study specifically examines brain and behavioral responses to persuasive ecological messages that differ in temporal framing and in the age of the voice pronouncing them. The findings reveal that attitudes are more positive toward future-framed messages presented by young voices. The whole-brain analysis reveals that future-framed (FF) ecological messages trigger activation in brain areas related to imagery, prospective memories, and episodic events, thus reflecting the involvement of past behaviors in future ecological actions. Past-framed (PF) messages, in turn, elicit brain activations within the episodic system. Young voices (YV), in addition to triggering stronger activation in areas involved with the processing of high-timbre, high-pitched, and high-intensity voices, are perceived as more emotional and motivational than old voices (OV), as evidenced by activations in the anterior cingulate cortex and amygdala. Messages expressed by older voices, in turn, elicit stronger activation in areas formerly linked to low-pitched voices and voice gender perception. Interestingly, a link is identified between neural and self-report responses, indicating that certain brain activations in response to future-framed messages and young voices predicted more positive attitudes toward future-framed and young-voice advertisements, respectively. The results of this study provide invaluable insight into the unconscious origin of attitudes toward environmental messages and indicate which voice and temporal frame of a message generate the greatest subconscious value. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Utility of an Interactive Voice Response System to Assess Antiretroviral Pharmacotherapy Adherence Among Substance Users Living with HIV/AIDS in the Rural South

    PubMed Central

    Simpson, Cathy A.; Huang, Jin; Roth, David L.; Stewart, Katharine E.

    2013-01-01

    Promoting HIV medication adherence is basic to HIV/AIDS clinical care and to reducing transmission risk, and it requires sound assessment of adherence and of risk behaviors, such as substance use, that may interfere with adherence. The present study evaluated the utility of a telephone-based Interactive Voice Response self-monitoring (IVR SM) system to prospectively assess daily HIV medication adherence and its correlates among rural substance users living with HIV/AIDS. Community-dwelling patients (27 men, 17 women) recruited from a non-profit HIV medical clinic in rural Alabama reported daily medication adherence, substance use, and sexual practices for up to 10 weeks. Daily IVR reports of adherence were compared with short-term IVR-based recall reports over 4- and 7-day intervals. Daily IVR reports were positively correlated with both recall measures over matched intervals. However, 7-day recall yielded higher adherence claims than the more contemporaneous daily IVR and 4-day recall measures, suggestive of a social desirability bias over the longer reporting period. Nearly one-third of participants (32%) reported adherence rates below the optimal rate of 95% (range = 0–100%). Higher IVR-reported daily medication adherence was associated with lower baseline substance use, shorter duration of HIV/AIDS medical care, and higher IVR utilization. IVR SM appears to be a useful telehealth tool for monitoring medication adherence and identifying patients with suboptimal adherence between clinic visits, and it can help address geographic barriers to care among disadvantaged, rural adults living with HIV/AIDS. PMID:23651105

  15. A real-time compliance mapping system using standard endoscopic surgical forceps.

    PubMed

    Fakhry, Morkos; Bello, Fernando; Hanna, George B

    2009-04-01

    In endoscopic surgery, the use of long surgical instruments through access ports diminishes tactile feedback and degrades the surgeon's ability to identify hidden tissue abnormalities. To overcome this constraint, we developed a real-time compliance mapping system that is composed of: 1) a standard surgical instrument with a high-precision sensor configuration design; 2) real-time objective interpretation of the output signals for tissue identification; and 3) a novel human-computer interaction technique using interactive voice and handle-force monitoring techniques to suit the operating theater working environment. The system was calibrated and used in clinical practice in four routine endoscopic human procedures. In a laboratory-based experiment comparing the tissue discriminatory power of the system with that of surgeons' hands, the system was three times more sensitive and 10% less specific. The data acquisition precision was tested using principal component analysis (R²X = 0.975, Q²(cum) = 0.808) and partial least squares discriminant analysis (R²X = 0.903, R²Y = 0.729, Q²(cum) = 0.572).

  16. Voice Over Internet Protocol (VoIP) in a Control Center Environment

    NASA Technical Reports Server (NTRS)

    Pirani, Joseph; Calvelage, Steven

    2010-01-01

    The technology of transmitting voice over data networks has been available for over 10 years. Mass-market VoIP services for consumers to make and receive standard telephone calls over broadband Internet networks have grown in the last 5 years. While operational costs are lower with VoIP implementations than with time division multiplexing (TDM) based voice switches, is it still advantageous to convert a mission control center's voice system to this newer technology? Marshall Space Flight Center (MSFC) Huntsville Operations Support Center (HOSC) has converted its mission voice services to a commercial product that utilizes VoIP technology. Results from this testing, design, and installation have shown unique considerations that must be addressed before operational use. There are many factors to consider for a control center voice design. Technology advantages and disadvantages were investigated as they relate to cost. There were integration concerns that could lead to complex failure scenarios, but also the prospect of simpler integration with the mission infrastructure. MSFC HOSC will benefit from this voice conversion with lower product replacement cost, lower operations cost, and a more integrated mission services environment.

  17. Successful mLearning Pilot in Senegal: Delivering Family Planning Refresher Training Using Interactive Voice Response and SMS

    PubMed Central

    Diedhiou, Abdoulaye; Gilroy, Kate E; Cox, Carie Muntifering; Duncan, Luke; Koumtingue, Djimadoum; Pacqué-Margolis, Sara; Fort, Alfredo; Settle, Dykki; Bailey, Rebecca

    2015-01-01

    Background: In-service training of health workers plays a pivotal role in improving service quality. However, it is often expensive and requires providers to leave their posts. We developed and assessed a prototype mLearning system that used interactive voice response (IVR) and text messaging on simple mobile phones to provide in-service training without interrupting health services. IVR allows trainees to respond to audio recordings using their telephone keypad. Methods: In 2013, the CapacityPlus project tested the mobile delivery of an 8-week refresher training course on management of contraceptive side effects and misconceptions to 20 public-sector nurses and midwives working in Mékhé and Tivaouane districts in the Thiès region of Senegal. The course used a spaced-education approach in which questions and detailed explanations are spaced and repeated over time. We assessed the feasibility through the system's administrative data, examined participants' experiences using an endline survey, and employed a pre- and post-test survey to assess changes in provider knowledge. Results: All participants completed the course within 9 weeks. The majority of participant prompts to interact with the mobile course were made outside normal working hours (median time, 5:16 pm); average call duration was about 13 minutes. Participants reported positive experiences: 60% liked the ability to determine the pace of the course and 55% liked the convenience. The largest criticism (35% of participants) was poor network reception, and 30% reported dropped IVR calls. Most (90%) participants thought they learned the same or more compared with a conventional course. Knowledge of contraceptive side effects increased significantly, from an average of 12.6/20 questions correct before training to 16.0/20 after, and remained significantly higher 10 months after the end of training than at baseline, at 14.8/20, without any further reinforcement. 
Conclusions: The mLearning system proved appropriate, feasible, and acceptable to trainees, and it was associated with sustained knowledge gains. IVR mLearning has potential to improve quality of care without disrupting routine service delivery. Monitoring and evaluation of larger-scale implementation could provide evidence of system effectiveness at scale. PMID:26085026

  18. Relationship Between Voice and Motor Disabilities of Parkinson's Disease.

    PubMed

    Majdinasab, Fatemeh; Karkheiran, Siamak; Soltani, Majid; Moradi, Negin; Shahidi, Gholamali

    2016-11-01

    To evaluate the voice of Iranian patients with Parkinson's disease (PD) and find any relationship between motor disabilities and acoustic voice parameters as speech motor components. We evaluated 27 Farsi-speaking PD patients and 21 age- and sex-matched healthy persons as controls. Motor performance was assessed by the Unified Parkinson's Disease Rating Scale part III and the Hoehn and Yahr rating scale in the "on" state. Acoustic voice evaluation, including fundamental frequency (f0), standard deviation of f0, minimum of f0, maximum of f0, shimmer, jitter, and harmonics-to-noise ratio, was done using the Praat software via /a/ prolongation. No difference was seen between the voice of the patients and the voice of the controls. f0 and its variation had a significant correlation with the duration of the disease, but did not have any relationship with the Unified Parkinson's Disease Rating Scale part III. Only a limited relationship was observed between voice and motor disabilities. Tremor is a main feature of PD that affects the motor and phonation systems. Females had an older age at onset, more prolonged disease, and more severe motor disabilities (not statistically significant), but phonation disorders were more frequent in males and showed a stronger relationship with the severity of motor disabilities. Voice is affected by PD earlier than many other motor components and is more sensitive to disease progression. Tremor is the PD feature with the greatest impact on voice. PD has more effect on the voice of male than female patients. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
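    The perturbation measures named in this abstract have simple definitional forms. The following is an illustrative computation of local jitter (period perturbation) and local shimmer (amplitude perturbation) on synthetic values; it is a sketch of the standard definitions, not Praat's exact algorithm.

```python
# Definitional sketch of jitter (local) and shimmer (local).
# Input values are synthetic, chosen to resemble an /a/ prolongation.
import numpy as np

def jitter_local(periods):
    """Mean absolute difference of consecutive periods / mean period."""
    p = np.asarray(periods, float)
    return np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_local(amps):
    """Mean absolute difference of consecutive amplitudes / mean amplitude."""
    a = np.asarray(amps, float)
    return np.mean(np.abs(np.diff(a))) / np.mean(a)

# A perfectly periodic phonation would give 0 jitter; cycle-to-cycle
# perturbations raise both measures.
periods = [0.008, 0.0081, 0.0079, 0.008, 0.0082]   # seconds (f0 ~ 125 Hz)
amps = [0.50, 0.52, 0.49, 0.51, 0.50]
print(f"jitter={jitter_local(periods):.4f}, shimmer={shimmer_local(amps):.4f}")
```

    Both are ratios, so they are reported as percentages in most clinical software; multiply by 100 to match that convention.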

  19. Making Choices, Taking Chances, Facing Challenges, Managing Change: The Implementation of a Voice/Video/Data Network at the Alliance Library System.

    ERIC Educational Resources Information Center

    Wilford, Valerie J.; Logan, Lee; Bell, Lori; Cloyes, Kay

    The Alliance Library System (ALS) is one of 12 regional library systems in Illinois, providing a full spectrum of support services for 300 member libraries of all types (public, school, academic, and special) located in west central Illinois. This paper describes the process by which ALS implemented a voice/video/data network connecting their four…

  20. Calibration of Clinical Audio Recording and Analysis Systems for Sound Intensity Measurement.

    PubMed

    Maryn, Youri; Zarowski, Andrzej

    2015-11-01

    Sound intensity is an important acoustic feature of voice/speech signals. Yet recordings are performed with different microphone, amplifier, and computer configurations, and it is therefore crucial to calibrate sound intensity measures of clinical audio recording and analysis systems on the basis of the output of a sound-level meter. This study was designed to evaluate the feasibility, validity, and accuracy of calibration methods, including audiometric speech noise signals and human voice signals under typical speech conditions. Calibration consisted of 3 comparisons between data from 29 measurement microphone-and-computer systems and data from the sound-level meter: signal-specific comparison with audiometric speech noise at 5 levels, signal-specific comparison with natural voice at 3 levels, and cross-signal comparison with natural voice at 3 levels. Intensity measures from the recording systems were then linearly converted into calibrated data on the basis of these comparisons, and the validity and accuracy of the calibrated sound intensity were investigated. Very strong correlations and quasi-similarity were found between calibrated data and sound-level meter data across calibration methods and recording systems. Calibration of clinical sound intensity measures according to this method is feasible, valid, accurate, and representative for a heterogeneous set of microphones and data acquisition systems in real-life circumstances with distinct noise contexts.
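    The linear conversion step described here (mapping a recording system's uncalibrated intensity readings onto sound-level-meter values) amounts to fitting a straight line through paired measurements. The sketch below uses invented dB values purely for illustration.

```python
# Minimal sketch of linear intensity calibration against a sound-level
# meter. The paired dB readings are invented for illustration.
import numpy as np

system_db = np.array([52.0, 58.0, 64.0, 70.0, 76.0])  # recording-system output
meter_db = np.array([60.0, 66.1, 71.9, 78.0, 84.1])   # sound-level meter

# Least-squares line through the paired readings.
slope, intercept = np.polyfit(system_db, meter_db, 1)

def calibrate(x):
    """Convert an uncalibrated reading to a calibrated dB estimate."""
    return slope * x + intercept

print("calibrated:", round(calibrate(60.0), 1))
```

    Once fitted, the same slope and intercept convert any future reading from that system, which is what makes intensity measures comparable across heterogeneous recording setups.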

  1. Self-contained miniature electronics transceiver provides voice communication in hazardous environment

    NASA Technical Reports Server (NTRS)

    Cribb, H. E.

    1970-01-01

    Two-way wireless voice communications system is automatic, provides freedom of movement, allows for complete awareness of the environment, and does not present any additional hazards such as activation of electromagnetic sensitive devices.

  2. Visual Confirmation of Voice Takeoff Clearance (VICON) Alternative Study

    DOT National Transportation Integrated Search

    1980-05-01

    This report presents the results of a program undertaken to study potential alternatives to the VICON (Visual Confirmation of Voice Takeoff Clearance) System which has undergone operational field tests at Bradley International Airport, Windsor Locks,...

  3. New perspective on psychosocial distress in patients with dysphonia: The moderating role of perceived control

    PubMed Central

    Meredith, Liza; Peterson, Carol B.; Frazier, Patricia A.

    2015-01-01

    Objectives Although an association between psychosocial distress (depression, anxiety, somatization, and perceived stress) and voice disorders has been observed, little is known about the relationship between distress and patient-reported voice handicap. Further, the psychological mechanisms underlying this relationship are poorly understood. Perceived control plays an important role in distress associated with other medical disorders. The objectives of this study were to 1) characterize the relationship between distress and patient-reported voice handicap and 2) examine the role of perceived control in this relationship. Study Design Cross-sectional study in a tertiary care academic voice clinic. Methods Distress, perceived stress, voice handicap, and perceived control were measured using established assessment scales. Association was measured with Pearson's correlation coefficient; moderation was assessed using multiple hierarchical regression. Results A total of 533 patients enrolled, of whom 34% met criteria for clinically significant distress (i.e., depression, anxiety, and/or somatization). A weak association (r=0.13, p=0.003) was observed between the severity of psychosocial distress and vocal handicap. Present perceived control was inversely associated with distress (r=−0.41, p<0.0001), stress (r=−0.30, p<0.0001), and voice handicap (r=−0.30, p<0.0001). The relationship between voice handicap and psychosocial distress was moderated by perceived control (b for interaction term −0.15, p<0.001); greater vocal handicap was associated with greater distress in patients with low perceived control. Conclusions The severity of distress and vocal handicap were positively related, and the relation between them was moderated by perceived control. Vocal handicap was more related to distress among those with low perceived control; targeting this potential mechanism may facilitate new approaches to improved care. PMID:25795347
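    Moderation of the kind reported here is tested by adding a product (interaction) term to a multiple regression. The sketch below simulates data in which the handicap-distress slope weakens as perceived control rises, so the fitted interaction coefficient comes out negative, mirroring the sign of the reported b = −0.15. All numbers are simulated, not the study's data.

```python
# Sketch of a moderation test via an interaction term in least-squares
# regression. Data are simulated; coefficients are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n = 533
handicap = rng.normal(0, 1, n)
control = rng.normal(0, 1, n)

# Distress depends on handicap more strongly when perceived control is low.
distress = (0.13 * handicap - 0.41 * control
            - 0.15 * handicap * control
            + rng.normal(0, 1, n))

# Design matrix: intercept, main effects, and the interaction term.
X = np.column_stack([np.ones(n), handicap, control, handicap * control])
coef, *_ = np.linalg.lstsq(X, distress, rcond=None)
print("interaction coefficient:", round(coef[3], 2))
```

    A significant nonzero interaction coefficient is the regression evidence for moderation; its sign says in which direction the moderator changes the predictor's slope.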

  4. Development of the child's voice: premutation, mutation.

    PubMed

    Hacki, T; Heitmüller, S

    1999-10-05

    Voice range profile (VRP) measurement was used to evaluate the vocal capabilities of 180 children aged between 4 and 12 years without voice pathology. There were 10 boys and 10 girls in each age group. Using an automatic VRP measurement system, F0 and SPL dB (lin) were determined and displayed two-dimensionally in real time. The speaking voice, the shouting voice and the singing voice were investigated. The results show that vocal capabilities grow with advancing age, but not continuously. The lowering of the habitual pitch of the speaking voice as well as of the entire speaking pitch range occurs for girls between the ages of 7 and 8, for boys between 8 and 9. A temporary restriction of the minimum vocal intensity of the speaking voice (the ability to speak softly) as well as of the singing voice occurs for girls and for boys at the age of 7-8. A decrease of the maximum speech intensity is found for girls at the age of between 7 and 8, for boys between 8 and 9. A lowering of the pitch as well as of the intensity of the shouting voice occurs for both sexes from the age of 10. In contrast to earlier general opinion we note for girls a stage of premutation (between the age of 7 and 8) with essentially the same changes seen among boys, but 1 year earlier. The beginning of the mutation can be fixed at the age of 10-11 years.

  5. Processing voiceless vowels in Japanese: Effects of language-specific phonological knowledge

    NASA Astrophysics Data System (ADS)

    Ogasawara, Naomi

    2005-04-01

    There has been little research on processing allophonic variation in the field of psycholinguistics. This study focuses on processing the voiced/voiceless allophonic alternation of high vowels in Japanese. Three perception experiments were conducted to explore how listeners parse out vowels with the voicing alternation from other segments in the speech stream and how the different voicing statuses of the vowel affect listeners' word recognition process. The results from the three experiments show that listeners use phonological knowledge of their native language for phoneme processing and for word recognition. However, interactions of the phonological and acoustic effects are observed to be different in each process. The facilitatory phonological effect and the inhibitory acoustic effect cancel out one another in phoneme processing; while in word recognition, the facilitatory phonological effect overrides the inhibitory acoustic effect.

  6. Plastic reorganization of neural systems for perception of others in the congenitally blind.

    PubMed

    Fairhall, S L; Porter, K B; Bellucci, C; Mazzetti, M; Cipolli, C; Gobbini, M I

    2017-09-01

    Recent evidence suggests that the function of the core system for face perception might extend beyond visual face-perception to a broader role in person perception. To critically test the broader role of core face-system in person perception, we examined the role of the core system during the perception of others in 7 congenitally blind individuals and 15 sighted subjects by measuring their neural responses using fMRI while they listened to voices and performed identity and emotion recognition tasks. We hypothesised that in people who have had no visual experience of faces, core face-system areas may assume a role in the perception of others via voices. Results showed that emotions conveyed by voices can be decoded in homologues of the core face system only in the blind. Moreover, there was a specific enhancement of response to verbal as compared to non-verbal stimuli in bilateral fusiform face areas and the right posterior superior temporal sulcus showing that the core system also assumes some language-related functions in the blind. These results indicate that, in individuals with no history of visual experience, areas of the core system for face perception may assume a role in aspects of voice perception that are relevant to social cognition and perception of others' emotions. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  7. Speech-based Class Attendance

    NASA Astrophysics Data System (ADS)

    Faizel Amri, Umar; Nur Wahidah Nik Hashim, Nik; Hazrin Hany Mohamad Hanif, Noor

    2017-11-01

    In the department of engineering, students are required to fulfil at least 80 percent of class attendance. The conventional method requires each student to sign his or her initials on an attendance sheet. However, this method is prone to cheating, since a student can sign for an absent classmate. We developed our hypothesis according to a verse in the Holy Qur’an (95:4), “We have created men in the best of mould”. Based on this verse, we believe that the psychological characteristics of each human being are unique, and thus their speech characteristics should also be unique. In this paper we present the development of a speech-biometric-based attendance system. The system requires the user's voice to be enrolled as training data, which is saved in the system to register the user. Subsequent recordings of the user's voice serve as test data and are verified against the stored training data. The system uses PSD (Power Spectral Density) and Transition Parameters as the feature-extraction methods for the voices. Euclidean and Mahalanobis distances are used to verify the user's voice. For this research, ten subjects, five female and five male, were tested to assess the performance of the system. The system's performance in terms of recognition rate was found to be 60% correct identification of individuals.
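
    The verification step described above, spectral features compared against enrolled data with Euclidean or Mahalanobis distance, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the band-averaged FFT power used as a stand-in for the PSD features, the threshold, and the function names are all assumptions.

```python
import numpy as np

def psd_features(signal, n_bands=8):
    """Crude PSD-style feature vector: mean FFT power in equal-width
    frequency bands. A stand-in for the paper's PSD features, whose
    exact definition the abstract does not give."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    return np.array([band.mean() for band in np.array_split(power, n_bands)])

def euclidean(x, mean):
    return float(np.linalg.norm(x - mean))

def mahalanobis(x, mean, cov):
    """Mahalanobis distance; reduces to Euclidean when cov is the identity."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def verify(test_vec, enrolled_vecs, threshold):
    """Accept the claimed identity if the test feature vector lies within
    `threshold` (Euclidean) of the centroid of the enrolled vectors."""
    centroid = np.asarray(enrolled_vecs).mean(axis=0)
    return euclidean(test_vec, centroid) <= threshold
```

    A voice matching the enrolled spectrum yields a near-zero distance and is accepted; a voice with energy in a different band is rejected.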

  8. What can vortices tell us about vocal fold vibration and voice production.

    PubMed

    Khosla, Sid; Murugappan, Shanmugam; Gutmark, Ephraim

    2008-06-01

    Much clinical research on laryngeal airflow has assumed that airflow is unidirectional. This review summarizes the additional knowledge about vocal fold vibration and voice production that can be obtained by studying rotational motion, or vortices, in laryngeal airflow. Recent work suggests two types of vortices that may strongly contribute to voice quality. The first kind forms just above the vocal folds during glottal closing and results from flow separation in the glottis; these flow separation vortices contribute significantly to rapid closing of the glottis and, hence, to producing loudness and high-frequency harmonics in the acoustic spectrum. The second is a group of highly three-dimensional and coherent supraglottal vortices, which can produce sound by interacting with structures in the vocal tract. Work is also described suggesting that certain laryngeal pathologies, such as asymmetric vocal fold tension, significantly modify both types of vortices, with adverse effects on sound production: a decreased rate of glottal closure, increased broadband noise, and a decreased signal-to-noise ratio. Recent research supports the hypothesis that glottal airflow contains vortical structures that contribute significantly to voice quality.

  9. A Spot Reminder System for the Visually Impaired Based on a Smartphone Camera

    PubMed Central

    Takizawa, Hotaka; Orita, Kazunori; Aoyagi, Mayumi; Ezaki, Nobuo; Mizuno, Shinji

    2017-01-01

    The present paper proposes a smartphone-camera-based system to assist visually impaired users in recalling their memories related to important locations, called spots, that they visited. The memories are recorded as voice memos, which can be played back when the users return to the spots. Spot-to-spot correspondence is determined by image matching based on the scale invariant feature transform. The main contribution of the proposed system is to allow visually impaired users to associate arbitrary voice memos with arbitrary spots. The users do not need any special devices or systems except smartphones and do not need to remember the spots where the voice memos were recorded. In addition, the proposed system can identify spots in environments that are inaccessible to the global positioning system. The proposed system has been evaluated by two experiments: image matching tests and a user study. The experimental results suggested the effectiveness of the system to help visually impaired individuals, including blind individuals, recall information about regularly-visited spots. PMID:28165403
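
    The spot-to-spot correspondence above rests on SIFT descriptor matching. As a hedged sketch (the system's actual matcher and thresholds are not given in the abstract), the following implements nearest-neighbour descriptor matching with Lowe's ratio test over plain NumPy arrays; in a real system the descriptors would come from a SIFT extractor such as OpenCV's, and `min_matches` is an assumed decision threshold.

```python
import numpy as np

def match_descriptors(query, candidate, ratio=0.75):
    """Match each query descriptor to its nearest candidate descriptor,
    keeping it only if it passes Lowe's ratio test (nearest distance
    clearly smaller than second-nearest). Returns (query_idx, cand_idx) pairs."""
    matches = []
    for i, q in enumerate(query):
        dists = np.linalg.norm(candidate - q, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

def same_spot(query, candidate, min_matches=10, ratio=0.75):
    """Declare two images to show the same spot if enough descriptors match."""
    return len(match_descriptors(query, candidate, ratio)) >= min_matches
```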

  11. A flight investigation of simulated data link communications during single-pilot IFR flight

    NASA Technical Reports Server (NTRS)

    Parker, J. F.; Duffy, J. W.; Christensen, D. G.

    1983-01-01

    A Flight Data Console (FDC) was developed to allow simulation of a digital communications link to replace the current voice communication system used in air traffic control (ATC). The voice system requires manipulation of radio equipment, read-back of clearances, and mental storage of critical information items, all contributing to high workload, particularly during single-pilot operations. This was an inflight study to determine how a digital communications system might reduce cockpit workload, improve flight proficiency, and be accepted by general aviation pilots. Results show that instrument flight, including approach and landing, can be accomplished quite effectively using a digital data link system for ATC communications. All pilots expressed a need for a back-up voice channel. When included, this channel was used sparingly and principally to confirm any item of information about which there might be uncertainty.

  12. Modulation of voice related to tremor and vibrato

    NASA Astrophysics Data System (ADS)

    Lester, Rosemary Anne

    Modulation of voice is a result of physiologic oscillation within one or more components of the vocal system including the breathing apparatus (i.e., pressure supply), the larynx (i.e. sound source), and the vocal tract (i.e., sound filter). These oscillations may be caused by pathological tremor associated with neurological disorders like essential tremor or by volitional production of vibrato in singers. Because the acoustical characteristics of voice modulation specific to each component of the vocal system and the effect of these characteristics on perception are not well-understood, it is difficult to assess individuals with vocal tremor and to determine the most effective interventions for reducing the perceptual severity of the disorder. The purpose of the present studies was to determine how the acoustical characteristics associated with laryngeal-based vocal tremor affect the perception of the magnitude of voice modulation, and to determine if adjustments could be made to the voice source and vocal tract filter to alter the acoustic output and reduce the perception of modulation. This research was carried out using both a computational model of speech production and trained singers producing vibrato to simulate laryngeal-based vocal tremor with different voice source characteristics (i.e., vocal fold length and degree of vocal fold adduction) and different vocal tract filter characteristics (i.e., vowel shapes). It was expected that, by making adjustments to the voice source and vocal tract filter that reduce the amplitude of the higher harmonics, the perception of magnitude of voice modulation would be reduced. The results of this study revealed that listeners' perception of the magnitude of modulation of voice was affected by the degree of vocal fold adduction and the vocal tract shape with the computational model, but only by the vocal quality (corresponding to the degree of vocal fold adduction) with the female singer. 
Based on regression analyses, listeners' judgments were predicted by modulation information in both low and high frequency bands. The findings from these studies indicate that production of a breathy vocal quality might be a useful compensatory strategy for reducing the perceptual severity of modulation of voice for individuals with tremor affecting the larynx.

  13. New Perspective on Psychosocial Distress in Patients with Dysphonia: The Moderating Role of Perceived Control.

    PubMed

    Misono, Stephanie; Meredith, Liza; Peterson, Carol B; Frazier, Patricia A

    2016-03-01

    Although an association between psychosocial distress (depression, anxiety, somatization, and perceived stress) and voice disorders has been observed, little is known about the relationship between distress and patient-reported voice handicap. Furthermore, the psychological mechanisms underlying this relationship are poorly understood. Perceived control plays an important role in distress associated with other medical disorders. The objectives of this study were to (1) characterize the relationship between distress and patient-reported voice handicap and (2) examine the role of perceived control in this relationship. This is a cross-sectional study in a tertiary care academic voice clinic. Distress, perceived stress, voice handicap, and perceived control were measured using established assessment scales. Association was measured with Pearson correlation coefficients; moderation was assessed using multiple hierarchical regression. A total of 533 patients enrolled. Thirty-four percent of the patients met criteria for clinically significant distress (ie, depression, anxiety, and/or somatization). A weak association (r = 0.13; P = 0.003) was observed between severity of psychosocial distress and vocal handicap. Present perceived control was inversely associated with distress (r = -0.41; P < 0.0001), stress (r = -0.30; P < 0.0001), and voice handicap (r = -0.30; P < 0.0001). The relationship between voice handicap and psychosocial distress was moderated by perceived control (b for interaction term, -0.15; P < 0.001); greater vocal handicap was associated with greater distress in patients with low perceived control. Severity of distress and vocal handicap were positively related, and the relation between them was moderated by perceived control. Vocal handicap was more related to distress among those with low perceived control; targeting this potential mechanism may facilitate new approaches for improved care. Copyright © 2016 The Voice Foundation. 
Published by Elsevier Inc. All rights reserved.
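
    The moderation finding above was tested by adding an interaction term in hierarchical regression. The core of such an analysis can be sketched as a plain OLS fit on synthetic data; this is an illustration, not the authors' analysis, and the variable names are hypothetical.

```python
import numpy as np

def fit_moderation(x, moderator, y):
    """OLS fit of y ~ x + moderator + x*moderator.
    The coefficient on the product term is the moderation (interaction) effect."""
    X = np.column_stack([np.ones_like(x), x, moderator, x * moderator])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"intercept": beta[0], "x": beta[1],
            "moderator": beta[2], "interaction": beta[3]}
```

    A nonzero interaction coefficient means the slope of y on x depends on the level of the moderator, which is exactly the pattern reported: vocal handicap relates more strongly to distress when perceived control is low.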

  14. The role of the medial temporal limbic system in processing emotions in voice and music.

    PubMed

    Frühholz, Sascha; Trost, Wiebke; Grandjean, Didier

    2014-12-01

    Subcortical brain structures of the limbic system, such as the amygdala, are thought to decode the emotional value of sensory information. Recent neuroimaging studies, as well as lesion studies in patients, have shown that the amygdala is sensitive to emotions in voice and music. Similarly, the hippocampus, another part of the temporal limbic system (TLS), is responsive to vocal and musical emotions, but its specific roles in emotional processing from music and especially from voices have been largely neglected. Here we review recent research on vocal and musical emotions, and outline commonalities and differences in the neural processing of emotions in the TLS in terms of emotional valence, emotional intensity and arousal, as well as in terms of acoustic and structural features of voices and music. We summarize the findings in a neural framework including several subcortical and cortical functional pathways between the auditory system and the TLS. This framework proposes that some vocal expressions might already receive a fast emotional evaluation via a subcortical pathway to the amygdala, whereas cortical pathways to the TLS are thought to be equally used for vocal and musical emotions. While the amygdala might be specifically involved in a coarse decoding of the emotional value of voices and music, the hippocampus might process more complex vocal and musical emotions, and might have an important role especially for the decoding of musical emotions by providing memory-based and contextual associations. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Organizational uncertainty and stress among teachers in Hong Kong: work characteristics and organizational justice.

    PubMed

    Hassard, Juliet; Teoh, Kevin; Cox, Tom

    2017-10-01

    A growing literature now exists examining the relationship between organizational justice and employees' experience of stress. Despite the growth in this field of enquiry, gaps in knowledge remain, in particular concerning the contribution of perceptions of justice to employees' stress within an organizational context of uncertainty and change, and in relation to the new and emerging concept of procedural-voice justice. The aim of the current study was to examine the main, interaction, and additive effects of work characteristics and organizational justice perceptions on employees' experience of stress (as measured by their feelings of helplessness and perceived coping) during an acknowledged period of organizational uncertainty. Questionnaires were distributed among teachers in seven public primary schools in Hong Kong that were under threat of closure (n = 212). Work characteristics were measured using the demand-control-support model. Hierarchical regression analyses showed that perceptions of job demands and procedural-voice justice predicted both teachers' feelings of helplessness and their perceived coping ability. Furthermore, teachers' perceived coping was predicted by job control and by a significant interaction between procedural-voice justice and distributive justice. The addition of organizational justice variables accounted for unique variance, but only in relation to the measure of perceived coping. The study concludes that, in addition to 'traditional' work characteristics, health promotion strategies should also address perceptions of organizational justice during times of organizational uncertainty and, in particular, the value and importance of enhancing employees' perceived 'voice' in influencing and shaping justice-related decisions. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. Utilization of Internet Protocol-Based Voice Systems in Remote Payload Operations

    NASA Technical Reports Server (NTRS)

    Chamberlain, jim; Bradford, Bob; Best, Susan; Nichols, Kelvin

    2002-01-01

    Due to limited crew availability to support science and the large number of experiments to be operated simultaneously, telescience is key to a successful International Space Station (ISS) science program. Crew, operations personnel at NASA centers, and researchers at universities and companies around the world must work closely together to perform scientific experiments on-board ISS. The deployment of reliable high-speed Internet Protocol (IP)-based networks promises to greatly enhance telescience capabilities. These networks are now being used to cost-effectively extend the reach of remote mission support systems. They reduce the need for dedicated leased lines and travel while improving distributed workgroup collaboration capabilities. NASA has initiated use of Voice over Internet Protocol (VoIP) to supplement the existing mission voice communications system used by researchers at their remote sites. The Internet Voice Distribution System (IVoDS) connects remote researchers to mission support "loops" or conferences via NASA networks and Internet 2. Researchers use IVoDS software on personal computers to talk with operations personnel at NASA centers. IVoDS also has the capability, if authorized, to allow researchers to communicate with the ISS crew during experiment operations. IVoDS was developed by Marshall Space Flight Center with contractors & Technology, First Virtual Communications, Lockheed-Martin, and VoIP Group. IVoDS is currently undergoing field-testing, with full deployment for up to 50 simultaneous users expected in 2002. Research is being performed in parallel with IVoDS deployment for a next-generation system to qualitatively enhance communications among ISS operations personnel. In addition to the current voice capability, video and data/application-sharing capabilities are being investigated. IVoDS technology is also being considered for mission support systems for programs such as the Space Launch Initiative and Homeland Defense.

  17. Exploring interpersonal behavior and team sensemaking during health information technology implementation.

    PubMed

    Kitzmiller, Rebecca R; McDaniel, Reuben R; Johnson, Constance M; Lind, E Allan; Anderson, Ruth A

    2013-01-01

    We examine how interpersonal behavior and social interaction influence team sensemaking and subsequent team actions during a hospital-based health information technology (HIT) implementation project. Over the course of 18 months, we directly observed the interpersonal interactions of HIT implementation teams using a sensemaking lens. We identified three voice-promoting strategies enacted by team leaders that fostered team member voice and sensemaking: communicating a vision, connecting goals to team member values, and seeking team member input. However, infrequent leader expressions of anger quickly undermined team sensemaking, halting dialog essential to problem solving. By seeking team member opinions, team leaders overcame the negative effects of anger. Leaders must enact voice-promoting behaviors and use them throughout a team's engagement. Further, training teams to use conflict to achieve greater innovation may improve the sensemaking essential to project risk mitigation. Health care work processes are complex; teams implementing improvements must be prepared to deal with the conflicting, contentious issues that will arise during change. Therefore, team conflict training may be essential to sustaining sensemaking. Future research should seek to identify team interactions that foster sensemaking, especially when topics are difficult or unwelcome, and then determine the association between staff sensemaking and HIT implementation outcomes. We are among the first to focus on project teams tasked with HIT implementation. This research extends our understanding of how leaders' behaviors might facilitate or impede speaking up within project teams in health care settings.

  18. Fluid-Structure Interactions as Flow Propagates Tangentially Over a Flexible Plate with Application to Voiced Speech Production

    NASA Astrophysics Data System (ADS)

    Westervelt, Andrea; Erath, Byron

    2013-11-01

    Voiced speech is produced by fluid-structure interactions that drive vocal fold motion. Viscous flow features influence the pressure in the gap between the vocal folds (i.e. glottis), thereby altering vocal fold dynamics and the sound that is produced. During the closing phases of the phonatory cycle, vortices form as a result of flow separation as air passes through the divergent glottis. It is hypothesized that the reduced pressure within a vortex core will alter the pressure distribution along the vocal fold surface, thereby aiding in vocal fold closure. The objective of this study is to determine the impact of intraglottal vortices on the fluid-structure interactions of voiced speech by investigating how the dynamics of a flexible plate are influenced by a vortex ring passing tangentially over it. A flexible plate, which models the medial vocal fold surface, is placed in a water-filled tank and positioned parallel to the exit of a vortex generator. The physical parameters of plate stiffness and vortex circulation are scaled with physiological values. As vortices propagate over the plate, particle image velocimetry measurements are captured to analyze the energy exchange between the fluid and flexible plate. The investigations are performed over a range of vortex formation numbers, and lateral displacements of the plate from the centerline of the vortex trajectory. Observations show plate oscillations with displacements directly correlated with the vortex core location.
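
    The study above varies the vortex formation number. For a piston-driven vortex-ring generator this is the dimensionless stroke ratio L/D; a minimal helper (parameter names assumed, not taken from the abstract) makes the definition concrete:

```python
def formation_number(piston_speed, pulse_duration, nozzle_diameter):
    """Stroke ratio L/D for a piston-driven vortex-ring generator:
    stroke length L = piston_speed * pulse_duration, normalized by the
    nozzle (exit) diameter D. Rings are known to pinch off near L/D ~ 4."""
    return piston_speed * pulse_duration / nozzle_diameter
```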

  19. The Federal Telecommunications System 2000, a Military Perspective

    DTIC Science & Technology

    1988-01-01

    This report, in light of fiscal realities, proposes FTS2000 as the answer to the federal government's telecommunications problems. This system will offer voice, data, and video services across a transparent, nationwide network.

  20. Occupational voice demands and their impact on the call-centre industry.

    PubMed

    Hazlett, D E; Duffy, O M; Moorhead, S A

    2009-04-20

    Within the last decade there has been growth in the call-centre industry in the UK, with a growing awareness of the voice as an important tool for successful communication. In a business that relies on a healthy, effective voice as the primary professional communication tool, occupational voice problems such as occupational dysphonia may threaten workers' ability to work and their occupational health and safety. While previous studies of telephone call-agents have reported a range of voice symptoms and functional vocal health problems, there have been no studies investigating the use and impact of vocal performance in the communication industry within the UK. This study aims to address a significant gap in the evidence base of occupational health and safety research. The objectives of the study are: 1. to investigate the work context and vocal communication demands of call-agents; 2. to evaluate call-agents' vocal health, awareness, and performance; and 3. to identify key risks and training needs for employees and employers within call-centres. This is an occupational epidemiological study, which plans to recruit call-centres throughout the UK and Ireland. Data collection will consist of three components: 1. interviews with managers from each participating call-centre to assess their communication and training needs; 2. an online biopsychosocial questionnaire to investigate the work environment and vocal demands of call-agents; and 3. voice acoustic measurements of a random sample of participants using the Multi-Dimensional Voice Program (MDVP). Qualitative content analysis of the interviews will identify underlying themes and issues. A multivariate analysis approach using Structural Equation Modelling (SEM) will be adopted to develop voice measurement models and determine the construct validity of potential factors contributing to occupational dysphonia. Quantitative data will be analysed using SPSS version 15. Ethical approval for this study was granted by the School of Communication, University of Ulster. The results from this study will provide the missing element of voice-based evidence by appraising the interactional dimensions of vocal health and communicative performance. This information will be used to inform training for call-agents and to contribute to workplace health policies, in order to enhance vocal health.

  1. The Effect of Hydration on Voice Quality in Adults: A Systematic Review.

    PubMed

    Alves, Maxine; Krüger, Esedra; Pillay, Bhavani; van Lierde, Kristiane; van der Linde, Jeannie

    2017-11-06

    We aimed to critically appraise scientific, peer-reviewed articles published in the past 10 years on the effects of hydration on voice quality in adults. This is a systematic review. Five databases were searched using the key words "vocal fold hydration", "voice quality", "vocal fold dehydration", and "hygienic voice therapy". The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. The included studies were scored based on the American Speech-Language-Hearing Association's levels of evidence and quality indicators, as well as the Cochrane Collaboration's risk of bias tool. Systemic dehydration as a result of fasting and not ingesting fluids significantly and negatively affected noise-to-harmonics ratio (NHR), shimmer, jitter, frequency, and the s/z ratio. Water ingestion led to significant improvements in shimmer, jitter, frequency, and maximum phonation time. Caffeine intake does not appear to negatively affect voice production. Laryngeal desiccation challenges by oral breathing led to surface dehydration, which negatively affected jitter, shimmer, NHR, phonation threshold pressure, and perceived phonatory effort. Steam inhalation significantly improved NHR, shimmer, and jitter. Of the nebulized substances, only isotonic solution decreased phonation threshold pressure, giving some indication of a potential positive effect of nebulization. Treatment in high-humidity environments proved effective, and adaptation of low-humidity environments should be encouraged. Recent literature regarding vocal hydration constitutes high-quality evidence. Systemic hydration is the easiest and most cost-effective way to improve voice quality. Recent evidence therefore supports the inclusion of hydration in a vocal hygiene program. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
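
    Jitter and shimmer, the measures that recur throughout the studies above, are standard perturbation measures. As an illustrative sketch, their "local" variants can be computed from extracted cycle periods and peak amplitudes (extracting those cycles from audio is the hard part and is not shown):

```python
import numpy as np

def local_jitter(periods):
    """Local jitter (%): mean absolute difference between consecutive
    cycle periods, relative to the mean period."""
    p = np.asarray(periods, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / p.mean()

def local_shimmer(amplitudes):
    """Local shimmer (%): the same definition applied to cycle peak amplitudes."""
    a = np.asarray(amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(a))) / a.mean()
```

    A perfectly periodic voice has zero jitter; dehydration-related irregularity shows up as larger cycle-to-cycle differences and hence higher values.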

  2. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback.

    PubMed

    Behroozmand, Roozbeh; Larson, Charles R

    2011-06-06

    The motor-driven predictions about expected sensory feedback (efference copies) have been proposed to play an important role in recognition of sensory consequences of self-produced motor actions. In the auditory system, this effect was suggested to result in suppression of sensory neural responses to self-produced voices that are predicted by the efference copies during vocal production in comparison with passive listening to the playback of the identical self-vocalizations. In the present study, event-related potentials (ERPs) were recorded in response to upward pitch shift stimuli (PSS) with five different magnitudes (0, +50, +100, +200 and +400 cents) at voice onset during active vocal production and passive listening to the playback. Results indicated that the suppression of the N1 component during vocal production was largest for unaltered voice feedback (PSS: 0 cents), became smaller as the magnitude of PSS increased to 200 cents, and was almost completely eliminated in response to 400 cents stimuli. Findings of the present study suggest that the brain utilizes the motor predictions (efference copies) to determine the source of incoming stimuli and maximally suppresses the auditory responses to unaltered feedback of self-vocalizations. The reduction of suppression for 50, 100 and 200 cents and its elimination for 400 cents pitch-shifted voice auditory feedback support the idea that motor-driven suppression of voice feedback leads to distinctly different sensory neural processing of self vs. non-self vocalizations. This characteristic may enable the audio-vocal system to more effectively detect and correct for unexpected errors in the feedback of self-produced voice pitch compared with externally-generated sounds.
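
    The pitch-shift magnitudes above are expressed in cents, i.e., hundredths of a semitone. Converting a shift in cents to a frequency ratio is a one-liner worth spelling out (the function names are ours, for illustration):

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch shift in cents:
    100 cents = one semitone, 1200 cents = one octave."""
    return 2.0 ** (cents / 1200.0)

def shifted_f0(f0_hz, cents):
    """Apply a pitch shift (in cents) to a fundamental frequency in Hz."""
    return f0_hz * cents_to_ratio(cents)
```

    So the +400-cent condition, where suppression was almost eliminated, corresponds to feedback shifted up by a factor of 2^(1/3), about 26% above the produced pitch.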

  3. Ad spending: maintaining market share.

    PubMed

    Jones, J P

    1990-01-01

    Accuracy in manufacturers' advertising budgeting is hampered by reliance on the case rate system, which ties budgets to sales. A better measure is a brand's market share compared with its share of voice (the brand's share of the total value of the main media exposure in that product category). New brands are often "investing" in the market: speaking in a louder voice than their market shares would justify. Popular brands are often "profit taking"--keeping their voices low but enjoying a disproportionately large market share. The interrelationship between market share and share of voice, with either "investing" or "profit taking" the desired result, is not usually considered when determining ad budgets. But as advertisers realize how market share can respond to advertising pressure through switches in the share of voice, this method of market testing should gain in importance.
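
    The investing/profit-taking distinction above reduces to comparing a brand's share of voice with its market share. A minimal sketch (the tolerance threshold and names are assumptions, not from the article):

```python
def share_of_voice(brand_ad_spend, category_ad_spend):
    """Brand's share of the total media spend in its product category."""
    return brand_ad_spend / category_ad_spend

def spending_posture(sov, market_share, tolerance=0.01):
    """Classify a brand: 'investing' if its voice exceeds its market share,
    'profit taking' if it under-spends relative to its share, else 'balanced'."""
    if sov > market_share + tolerance:
        return "investing"
    if sov < market_share - tolerance:
        return "profit taking"
    return "balanced"
```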

  4. Effects of audio compression in automatic detection of voice pathologies.

    PubMed

    Sáenz-Lechón, Nicolás; Osma-Ruiz, Víctor; Godino-Llorente, Juan I; Blanco-Velasco, Manuel; Cruz-Roldán, Fernando; Arias-Londoño, Julián D

    2008-12-01

    This paper investigates the performance of an automatic system for voice pathology detection when the voice samples have been compressed in MP3 format at different bit rates (160, 96, 64, 48, 24, and 8 kb/s). The detectors employ cepstral and noise measurements, along with their derivatives, to characterize the voice signals. Classification is performed using Gaussian mixture models and support vector machines. The results of the different proposed detectors are compared by means of detector error tradeoff (DET) and receiver operating characteristic (ROC) curves, concluding that there are no significant differences in detector performance when the bit rate of the compressed data is above 64 kb/s. This has useful applications in telemedicine, reducing the storage space of voice recordings or allowing them to be transmitted over narrow-band communication channels.
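
    Comparing detectors by ROC curve, as above, is often summarized by the area under the curve. A self-contained AUC via the rank-sum identity is a useful sketch of how such comparisons are scored (illustrative only; the paper also uses DET curves):

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    AUC = P(score of a random positive > score of a random negative),
    with ties counted as half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

    Running a detector on the same test set at two bit rates and comparing the resulting AUCs gives a single-number view of the degradation caused by compression.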

  5. Fostering Students' Science Inquiry through App Affordances of Multimodality, Collaboration, Interactivity, and Connectivity

    ERIC Educational Resources Information Center

    Beach, Richard; O'Brien, David

    2015-01-01

    This study examined 6th graders' use of the VoiceThread app as part of a science inquiry project on photosynthesis and carbon dioxide emissions in terms of their ability to engage in causal reasoning and their use of the affordances of multimodality, collaboration, interactivity, and connectivity. Students employed multimodal production using…

  6. Application of AI techniques to a voice-actuated computer system for reconstructing and displaying magnetic resonance imaging data

    NASA Astrophysics Data System (ADS)

    Sherley, Patrick L.; Pujol, Alfonso, Jr.; Meadow, John S.

    1990-07-01

    To provide a means of rendering complex computer architectures, languages, and input/output modalities transparent to experienced and inexperienced users, research is being conducted to develop a voice-driven/voice-response computer graphics imaging system. The system will be used for reconstructing and displaying computed tomography and magnetic resonance imaging scan data. In conjunction with this study, an artificial intelligence (AI) control strategy was developed to interface the voice components and support software to the computer graphics functions implemented on the Sun Microsystems 4/280 color graphics workstation. Based on generated text and converted renditions of verbal utterances by the user, the AI control strategy determines the user's intent and develops and validates a plan. The program type and parameters within the plan are used as input to the graphics system for reconstructing and displaying medical image data corresponding to that perceived intent. If the plan is not valid, the control strategy queries the user for additional information. The control strategy operates in a conversational mode and vocally provides system status reports. A detailed examination of the various AI techniques is presented, with major emphasis placed on their specific roles within the total control strategy structure.

  7. Interactive Voice Technology: Variations in the Vocal Utterances of Speakers Performing a Stress-Inducing Task,

    DTIC Science & Technology

    1983-08-16

    [Garbled report excerpt; recoverable fragments: a spectrogram figure caption ("Figure 10. Spectrograms of two versions of the word BOGEY"), report front matter (reviewed by Ashton Graybiel, M.D.; approved and released by Captain W. M. Houk, MC, USN, Commanding Officer, 16 August 1983), and a closing note on incorporating knowledge about stress-induced vocal changes into speech recognition systems.]

  8. Differential neural contributions to native- and foreign-language talker identification

    PubMed Central

    Perrachione, Tyler K.; Pierrehumbert, Janet B.; Wong, Patrick C.M.

    2009-01-01

    Humans are remarkably adept at identifying individuals by the sound of their voice, a behavior supported by the nervous system’s ability to integrate information from voice and speech perception. Talker-identification abilities are significantly impaired when listeners are unfamiliar with the language being spoken. Recent behavioral studies describing the language-familiarity effect implicate functionally integrated neural systems for speech and voice perception, yet specific neuroscientific evidence demonstrating the basis for such integration has not yet been shown. Listeners in the present study learned to identify voices speaking a familiar (native) or unfamiliar (foreign) language. The talker-identification performance of neural circuitry in each cerebral hemisphere was assessed using dichotic listening. To determine the relative contribution of circuitry in each hemisphere to ecological (binaural) talker identification abilities, we compared the predictive capacity of dichotic performance on binaural performance across languages. We found listeners’ right-ear (left hemisphere) performance to be a better predictor of overall accuracy in their native language than a foreign one. The enhanced predictive capacity of the classically language-dominant left-hemisphere on overall talker-identification accuracy demonstrates functionally integrated neural systems for speech and voice perception during natural talker identification. PMID:19968445
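
    The hemisphere comparison in this record reduces to asking which dichotic ear score better predicts binaural (overall) talker-identification accuracy. A minimal sketch of that comparison using a Pearson correlation; all listener scores below are hypothetical illustrations, not data from the study:

    ```python
    import statistics

    def pearson_r(xs, ys):
        # Pearson correlation between two equal-length score lists
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    # Hypothetical per-listener proportions correct
    right_ear_native  = [0.82, 0.75, 0.90, 0.68, 0.85]
    binaural_native   = [0.88, 0.80, 0.93, 0.72, 0.90]
    right_ear_foreign = [0.60, 0.55, 0.70, 0.58, 0.62]
    binaural_foreign  = [0.65, 0.72, 0.60, 0.75, 0.59]

    r_native = pearson_r(right_ear_native, binaural_native)
    r_foreign = pearson_r(right_ear_foreign, binaural_foreign)
    print(f"native r={r_native:.2f}, foreign r={r_foreign:.2f}")
    ```

    A higher right-ear correlation in the native language than the foreign one would mirror the pattern the abstract reports for left-hemisphere circuitry.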

  9. The effect of deep brain stimulation on the speech motor system.

    PubMed

    Mücke, Doris; Becker, Johannes; Barbe, Michael T; Meister, Ingo; Liebhart, Lena; Roettger, Timo B; Dembek, Till; Timmermann, Lars; Grice, Martine

    2014-08-01

    Chronic deep brain stimulation of the nucleus ventralis intermedius is an effective treatment for individuals with medication-resistant essential tremor. However, these individuals report that stimulation has a deleterious effect on their speech. The present study investigates one important factor leading to these effects: the coordination of oral and glottal articulation. Sixteen native-speaking German adults with essential tremor, between 26 and 86 years old, with and without chronic deep brain stimulation of the nucleus ventralis intermedius and 12 healthy, age-matched subjects were recorded performing a fast syllable repetition task (/papapa/, /tatata/, /kakaka/). Syllable duration and voicing-to-syllable ratio as well as parameters related directly to consonant production, voicing during constriction, and frication during constriction were measured. Voicing during constriction was greater in subjects with essential tremor than in controls, indicating a perseveration of voicing into the voiceless consonant. Stimulation led to fewer voiceless intervals (voicing-to-syllable ratio), indicating a reduced degree of glottal abduction during the entire syllable cycle. Stimulation also induced incomplete oral closures (frication during constriction), indicating imprecise oral articulation. The detrimental effect of stimulation on the speech motor system can be quantified using acoustic measures at the subsyllabic level.
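
    The acoustic measures named in this record are interval-overlap computations over annotated voicing, syllable, and constriction spans. A toy sketch with hypothetical annotation times (not measurements from the study):

    ```python
    # Intervals are (start, end) times in seconds.

    def overlap(a, b):
        """Duration of temporal overlap between two intervals."""
        return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

    def voicing_to_syllable_ratio(voiced_intervals, syllable):
        """Fraction of the syllable cycle during which the glottis is voiced."""
        dur = syllable[1] - syllable[0]
        return sum(overlap(v, syllable) for v in voiced_intervals) / dur

    def voicing_during_constriction(voiced_intervals, constriction):
        """Voiced time inside the oral constriction (ideally ~0 for /p, t, k/)."""
        return sum(overlap(v, constriction) for v in voiced_intervals)

    syllable = (0.00, 0.20)      # one hypothetical /pa/ cycle
    constriction = (0.00, 0.06)  # oral closure for /p/
    voiced = [(0.02, 0.18)]      # voicing persists into the closure

    print(voicing_to_syllable_ratio(voiced, syllable))       # ~0.8
    print(voicing_during_constriction(voiced, constriction))  # ~0.04 s
    ```

    On this reading, the stimulation effects reported above correspond to a higher voicing-to-syllable ratio and nonzero voicing during constriction.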

  10. Mapping Phonetic Features for Voice-Driven Sound Synthesis

    NASA Astrophysics Data System (ADS)

    Janer, Jordi; Maestre, Esteban

    In applications where the human voice controls the synthesis of musical instrument sounds, phonetics convey musical information that might be related to the sound of the imitated musical instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but remain constant for a single subject and instrument. We propose a user-adapted system, where mappings from voice features to synthesis parameters depend on how subjects sing musical articulations, i.e., note-to-note transitions. The system consists of two components: first, a voice signal segmentation module that automatically determines note-to-note transitions; second, a classifier that determines the type of musical articulation for each transition based on a set of phonetic features. To validate our hypothesis, we ran an experiment in which subjects imitated real instrument recordings with their voice. Performance recordings consisted of short phrases of saxophone and violin performed in three grades of musical articulation, labeled staccato, normal, and legato. The results of a supervised training classifier (user-dependent) are compared to a classifier based on heuristic rules (user-independent). Finally, from these results we show how to control articulation in a sample-concatenation synthesizer by selecting the most appropriate samples.
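
    The user-independent baseline described here is a set of heuristic rules over phonetic features. A toy sketch of one such rule, using a hypothetical voiced-fraction feature and thresholds of our own choosing (not the paper's actual rules or features):

    ```python
    # Hypothetical rule-based articulation classifier: a transition sung with
    # mostly continuous voicing is treated as legato, mostly silent as staccato.

    def classify_articulation(voiced_fraction):
        """Map the voiced fraction of a note-to-note transition to a label."""
        if voiced_fraction > 0.8:
            return "legato"
        if voiced_fraction < 0.3:
            return "staccato"
        return "normal"

    print([classify_articulation(f) for f in (0.95, 0.5, 0.1)])
    # → ['legato', 'normal', 'staccato']
    ```

    A user-dependent classifier would instead learn such decision boundaries from each subject's labeled imitations.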

  11. The effects of voice and manual control mode on dual task performance

    NASA Technical Reports Server (NTRS)

    Wickens, C. D.; Zenyuh, J.; Culp, V.; Marshak, W.

    1986-01-01

    Two fundamental principles of human performance, compatibility and resource competition, are combined with two structural dichotomies in the human information processing system, manual versus voice output and left versus right cerebral hemisphere, in order to predict the optimum combination of voice and manual control with either hand for time-sharing performance of a discrete and a continuous task. Eight right-handed male subjects performed a discrete first-order tracking task, time-shared with an auditorily presented Sternberg Memory Search Task. Each task could be controlled by voice, or by the left or right hand, in all possible combinations except for a dual voice mode. When performance was analyzed in terms of a dual-task decrement from single-task control conditions, the following variables influenced time-sharing efficiency in diminishing order of magnitude: (1) the modality of control (discrete manual control of tracking was superior to discrete voice control of tracking, and the converse was true for the memory search task); (2) response competition (performance was degraded when both tasks were responded to manually); (3) hemispheric competition (performance was degraded whenever both tasks were controlled by the left hemisphere, i.e., voice or right-handed control). The results confirm the value of predictive models in voice control implementation.
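
    The dual-task decrement used in this analysis is simply performance under time-sharing expressed relative to the single-task baseline. A minimal sketch with hypothetical tracking-error scores (none of these numbers come from the study):

    ```python
    # Hypothetical RMS tracking errors (lower = better performance).

    def decrement(dual_score, single_score):
        """Performance cost of time-sharing, relative to single-task baseline."""
        return (dual_score - single_score) / single_score

    single_task = {"manual_tracking": 10.0, "voice_tracking": 12.0}
    dual_task   = {"manual_tracking": 12.5, "voice_tracking": 18.0}

    for mode in single_task:
        d = decrement(dual_task[mode], single_task[mode])
        print(f"{mode}: {d:.0%} decrement")
    ```

    With these illustrative numbers, manual tracking degrades by 25% under time-sharing and voice tracking by 50%, the direction of the modality effect the abstract reports for the tracking task.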

  12. A cyber-physical management system for delivering and monitoring surgical instruments in the OR.

    PubMed

    Li, Yu-Ting; Jacob, Mithun; Akingba, George; Wachs, Juan P

    2013-08-01

    The standard practice in the operating room (OR) is having a surgical technician deliver surgical instruments to the surgeon quickly and inexpensively, as required. This human-in-the-loop system may result in mistakes (e.g., missing information, ambiguity of instructions, and delays). Errors can be reduced or eliminated by integrating information technology (IT) and cybernetics into the OR. Automatic acquisition, processing, and interpretation of gestures and voice allow interaction with these new systems without disturbing the normal flow of surgery. This article describes the development of a cyber-physical management system (CPS), including a robotic scrub nurse, to support surgeons by passing surgical instruments during surgery as required and recording counts of surgical instruments into a personal health record (PHR). The robot responds to hand signals and voice messages detected through sophisticated computer vision and data mining techniques. The CPS was tested during a mock surgery in the OR. The in situ experiment showed that the robot recognized hand gestures reliably (with an accuracy of 97%), retrieved instruments to within 25 mm, and kept the average total delivery time under 3 s. This online health tool allows the exchange of clinical and surgical information among electronic medical record-based and PHR-based applications in different hospitals, regardless of the style viewer. The CPS has the potential to be adopted in the OR to handle surgical instruments and track them in a safe and accurate manner, releasing the human scrub technician from these tasks.

  13. Effects of vocal training and phonatory task on voice onset time.

    PubMed

    McCrea, Christopher R; Morris, Richard J

    2007-01-01

    The purpose of this study was to examine the temporal-acoustic differences between trained singers and nonsingers during speech and singing tasks. Thirty male participants were separated into two groups of 15 according to level of vocal training (i.e., trained or untrained). The participants spoke and sang carrier phrases containing English voiced and voiceless bilabial stops, and voice onset time (VOT) was measured for the stop consonant productions. Mixed analyses of variance revealed a significant main effect between speech and singing for /p/ and /b/, with VOT durations longer during speech than singing for /p/, and the opposite true for /b/. Furthermore, a significant phonatory task by vocal training interaction was observed for /p/ productions. The results indicated that the type of phonatory task influences VOT and that these influences are most obvious in trained singers, secondary to the articulatory and phonatory adjustments learned during vocal training.
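
    VOT itself is the interval from the stop release burst to the onset of voicing. A minimal sketch of the computation, with hypothetical event times (not measurements from this study):

    ```python
    # VOT = voicing onset time − stop release (burst) time.
    # Negative VOT (prevoicing) means voicing begins before the release.

    def voice_onset_time(burst_t, voicing_t):
        """VOT in milliseconds, given event times in seconds."""
        return (voicing_t - burst_t) * 1000.0

    # Hypothetical measurements for one speaker (times in seconds)
    spoken_p = voice_onset_time(burst_t=1.200, voicing_t=1.265)  # long-lag /p/
    sung_p   = voice_onset_time(burst_t=2.400, voicing_t=2.440)  # shorter when sung
    spoken_b = voice_onset_time(burst_t=3.100, voicing_t=3.105)  # short-lag /b/

    print(spoken_p, sung_p, spoken_b)  # ~65.0, ~40.0, ~5.0 ms
    ```

    The illustrative values follow the direction of the reported /p/ effect (longer VOT in speech than in singing).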

  14. Irregular vocal fold dynamics incited by asymmetric fluid loading in a model of recurrent laryngeal nerve paralysis

    NASA Astrophysics Data System (ADS)

    Sommer, David; Erath, Byron D.; Zanartu, Matias; Peterson, Sean D.

    2011-11-01

    Voiced speech is produced by dynamic fluid-structure interactions in the larynx. Traditionally, reduced order models of speech have relied upon simplified inviscid flow solvers to prescribe the fluid loadings that drive vocal fold motion, neglecting viscous flow effects that occur naturally in voiced speech. Viscous phenomena, such as skewing of the intraglottal jet, have the most pronounced effect on voiced speech in cases of vocal fold paralysis where one vocal fold loses some, or all, muscular control. The impact of asymmetric intraglottal flow in pathological speech is captured in a reduced order two-mass model of speech by coupling a boundary-layer estimation of the asymmetric pressures with asymmetric tissue parameters that are representative of recurrent laryngeal nerve paralysis. Nonlinear analysis identifies the emergence of irregular and chaotic vocal fold dynamics at values representative of pathological speech conditions.
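
    The asymmetric parameterization described here can be illustrated, far short of the actual two-mass model with flow coupling, by two elastically coupled damped oscillators in which an asymmetry factor Q detunes one fold's stiffness. All constants below are hypothetical, chosen only to keep the toy integration stable:

    ```python
    # Toy sketch (not the authors' model): one oscillator per vocal fold,
    # forward-Euler integration, asymmetry factor Q scaling the paralyzed side.

    def simulate(Q=1.0, steps=5000, dt=1e-4):
        k, c, m = 1.0e4, 2.0, 1.0   # stiffness, damping, mass (hypothetical)
        couple = 500.0              # simple elastic coupling between folds
        xl = xr = 0.001             # displacements
        vl = vr = 0.0               # velocities
        out = []
        for _ in range(steps):
            # paralyzed (left) fold: stiffness scaled by asymmetry factor Q
            al = (-Q * k * xl - c * vl + couple * (xr - xl)) / m
            ar = (-k * xr - c * vr + couple * (xl - xr)) / m
            vl += al * dt
            vr += ar * dt
            xl += vl * dt
            xr += vr * dt
            out.append((xl, xr))
        return out

    traj = simulate(Q=0.6)  # Q < 1 mimics reduced muscular tension on one side
    ```

    In full two-mass models the asymmetry factor typically scales mass as well as stiffness, and oscillation is sustained by the fluid loading rather than by an elastic coupling; this sketch only shows where such a parameter enters the equations of motion.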

  15. Differences in botulinum toxin dosing between patients with adductor spasmodic dysphonia and essential voice tremor.

    PubMed

    Orbelo, Diana M; Duffy, Joseph R; Hughes Borst, Becky J; Ekbom, Dale; Maragos, Nicolas E

    2014-01-01

    To explore possible dose differences in average botulinum toxin (BTX) given to patients with adductor spasmodic dysphonia (ADSD) compared with patients with essential voice tremor (EVT). A retrospective study compared the average BTX dose injected in equal doses to the thyroarytenoid (TA) muscles of 51 patients with ADSD with 52 patients with EVT. Those with ADSD received significantly higher total doses (6.80 ± 2.79 units) compared with those with EVT (5.02 ± 1.65 units). Dose at time of first injection, age at time of first injection, gender, year of first injection, and average time between injections were included in multivariate analysis but did not interact with total average dose findings. Patients with ADSD may need relatively higher doses of BTX injections to bilateral TA muscles compared with patients with EVT. Copyright © 2014 The Voice Foundation. Published by Mosby, Inc. All rights reserved.
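
    As a quick plausibility check on the reported group difference, a Welch's t statistic can be computed directly from the abstract's summary statistics. The study's own analysis was multivariate, so this is only an illustration consistent with the reported means and SDs:

    ```python
    import math

    def welch_t(m1, s1, n1, m2, s2, n2):
        """Welch's t statistic from per-group mean, SD, and sample size."""
        return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

    # Group summaries from the abstract:
    # ADSD 6.80 ± 2.79 units (n = 51), EVT 5.02 ± 1.65 units (n = 52)
    t = welch_t(6.80, 2.79, 51, 5.02, 1.65, 52)
    print(round(t, 2))  # ≈ 3.93
    ```

    A t near 3.9 with roughly 80 effective degrees of freedom is consistent with the significantly higher ADSD doses the abstract reports.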

  16. In vitro experimental investigation of voice production

    PubMed Central

    Horáček, Jaromír; Brücker, Christoph; Becker, Stefan

    2012-01-01

    The process of human phonation involves a complex interaction between the physical domains of structural dynamics, fluid flow, and acoustic sound production and radiation. Given the high degree of nonlinearity of these processes, even small anatomical or physiological disturbances can significantly affect the voice signal. In the worst cases, patients can lose their voice and hence the normal mode of speech communication. To improve medical therapies and surgical techniques it is very important to understand better the physics of the human phonation process. Due to the limited experimental access to the human larynx, alternative strategies, including artificial vocal folds, have been developed. The following review gives an overview of experimental investigations of artificial vocal folds within the last 30 years. The models are sorted into three groups: static models, externally driven models, and self-oscillating models. The focus is on the different models of the human vocal folds and on the ways in which they have been applied. PMID:23181007

  17. The effect of voice communications latency in high density, communications-intensive airspace.

    DOT National Transportation Integrated Search

    2003-01-01

    The Federal Aviation Administration (FAA) Next Generation Air-Ground Communications program plans to replace aging analog radio equipment with the Very High Frequency Digital Link Mode 3 (VDL3) system. VDL3 will implement both digital voice and data ...

  18. An adaptive narrow band frequency modulation voice communication system

    NASA Technical Reports Server (NTRS)

    Wishna, S.

    1972-01-01

    A narrow band frequency modulation communication system is described which provides for the reception of good quality voice at low carrier-to-noise ratios. The high level of performance is obtained by designing a limiter and phase lock loop combination as a demodulator, so that the bandwidth of the phase lock loop decreases as the carrier level decreases. The system was built for the position location and aircraft communication equipment experiment of the ATS 6 program.
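
    The record's key idea, narrowing the demodulator loop bandwidth as the carrier level drops, can be sketched independently of the analog limiter/phase-lock-loop hardware. A toy mapping with hypothetical constants (not the ATS 6 equipment's values):

    ```python
    # Adaptive loop bandwidth: wide when the carrier is strong (good tracking),
    # narrow when it is weak (better noise rejection), with a floor so the
    # loop never loses lock entirely.

    def loop_bandwidth(carrier_level, bw_max=500.0, bw_min=25.0, ref_level=1.0):
        """Loop bandwidth in Hz, shrinking proportionally with carrier level."""
        bw = bw_max * min(carrier_level / ref_level, 1.0)
        return max(bw, bw_min)

    for level in (1.0, 0.5, 0.1, 0.01):
        print(level, loop_bandwidth(level))
    ```

    In a real PLL this bandwidth would set the loop-filter gain; the proportional law and the specific limits here are illustrative assumptions.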

  19. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Voice Call Analysis

    DTIC Science & Technology

    2015-09-01

    [Table-of-contents fragments: 4. Voice Packet Flow: SIP, Session Description Protocol (SDP), and RTP; 5. Voice Data Analysis; 6. Call Analysis; 7. Call Metrics.] The analysis processing is designed for a general VoIP system architecture based on Session Initiation Protocol (SIP) for negotiating call sessions and ... employs Skinny Client Control Protocol for network communication between the phone and the local CallManager (e.g., for each dialed digit), SIP ...
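
    A call-metrics stage of the kind this report describes typically estimates packet loss from RTP sequence numbers. A minimal sketch that tolerates 16-bit wraparound (illustrative only, not the report's pipeline):

    ```python
    # RTP sequence numbers are 16-bit counters that wrap at 65536; loss is
    # estimated as the gap between packets expected and packets received.

    def rtp_loss(seqs):
        """Fraction of RTP packets missing, given received sequence numbers."""
        if len(seqs) < 2:
            return 0.0
        unwrapped = [seqs[0]]
        for s in seqs[1:]:
            prev = unwrapped[-1]
            delta = (s - prev) % 65536  # unwrap the 16-bit counter
            unwrapped.append(prev + delta)
        expected = unwrapped[-1] - unwrapped[0] + 1
        return 1.0 - len(set(unwrapped)) / expected

    # 65534, 65535, 0, 1 wraps cleanly; then two packets are lost before 4
    print(rtp_loss([65534, 65535, 0, 1, 4]))  # 2 lost of 7 expected
    ```

    RFC 3550's receiver reports define loss this way (expected minus received); reordering and duplicate handling are omitted here for brevity.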

  20. Accuracy and Speed of Response to Different Voice Types in a Cockpit Voice Warning System

    DTIC Science & Technology

    1983-09-01

    ... military aircraft. Different levels of engine background noise, signal-to-noise ratio of the warning message, and precursor delivery formats were used. ... flight deck signals, the Society of Automotive Engineers stated that a unique, attention-getting sound (such as a chime, etc.) together with voice ... aircraft wherein there is no flight engineer position" (cited in Thorburn, 1971, p. 3). The AFIAS letter cited several incidents in which the VWS had ...
