Note: This page contains sample records for the topic speech from Science.gov.
While these samples are representative of the content of Science.gov,
they are neither comprehensive nor the most current set.
We encourage you to perform a real-time search of Science.gov
to obtain the most current and comprehensive results.
Last update: August 15, 2014.
1

Speech, Speech!  

ERIC Educational Resources Information Center

Discussion focuses on the nature of computer-generated speech and voice synthesis today. State-of-the-art devices for home computers are called text-to-speech (TTS) systems. Details about the operation and use of TTS synthesizers are provided, and the time saving in programming over previous methods is emphasized. (MP)

McComb, Gordon

1982-01-01

2

Speech Synthesis  

NASA Astrophysics Data System (ADS)

Text-to-speech (TTS) synthesis is the art of designing talking machines. It is often seen by engineers as an easy task, compared to speech recognition. It is true, indeed, that it is easier to create a bad, first-trial text-to-speech system than to design a rudimentary speech recognizer.

Dutoit, Thierry; Bozkurt, Baris

3

Sparkling Speeches  

NSDL National Science Digital Library

Sparkling is the word! In this lesson, students will transform an exciting student-created expository essay into an engaging, high-quality speech using resources from the classroom and the school media center. Students will listen to a remarkable Martin Luther King speech via YouTube, confer with classmates on speech construction, and use a variety of easy-to-access materials (included with this lesson) while constructing their speeches. The lesson allows for in-depth trials and experiments with expository writing and speech writing. In one exciting option, students may use a "Speech Forum" to safely practice their unique speeches in front of a small, non-assessing audience of fellow students. The final assessments for this memorable lesson are a written exam and a student speech demonstrating a complete exploration and comprehension of introductions, main ideas with supporting details, and an engaging conclusion.

2012-12-14

4

Speech Aids  

NASA Technical Reports Server (NTRS)

Designed to assist deaf and hearing-impaired persons in achieving better speech, Resnick Worldwide Inc.'s device provides a visual means of cuing the deaf as a speech-improvement measure. This is done by electronically processing the subjects' sounds and comparing them with optimum values, which are displayed for comparison.

1987-01-01

5

Speech Problems  

MedlinePLUS

... However, the difference is that cluttering is a language disorder, while stuttering is a speech disorder. A person ... examine how and when you do so. Speech-language pathologists may evaluate ... in fluency disorders may use computerized analysis. By gathering as much ...

6

Symbolic Speech  

ERIC Educational Resources Information Center

The concept of symbolic speech emanates from the 1967 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in different strata from verbal communication. (LBH)

Podgor, Ellen S.

1976-01-01

7

Speech Communication.  

ERIC Educational Resources Information Center

The communications approach to teaching speech to high school students views speech as the study of the communication process in order to develop an awareness of and a sensitivity to the variables that affect human interaction. In using this approach the student is encouraged to try out as many types of messages using as many techniques and…

Anderson, Betty

8

Speech coding  

SciTech Connect

Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal being corrupted by noise, cross-talk, and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. Digital transmission, on the other hand, is relatively immune to noise, cross-talk, and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely on the basis of a binary decision. The end-to-end performance of a digital link thus becomes essentially independent of the length and operating frequency bands of the link, so from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech also became extremely important from a service-provision point of view. Modern requirements have introduced the need for robust, flexible, and secure services that can carry a multitude of signal types (such as voice, data, and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized from the received codes.
A more generic term, often used interchangeably with speech coding, is voice coding: the coding techniques are equally applicable to any voice signal, whether or not it carries intelligible information, as the term speech implies. Other commonly used terms are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or the storage requirements. In this document the terms speech and voice are used interchangeably.
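
The waveform-coding approach described above can be illustrated with mu-law companding, the logarithmic compression used in G.711 telephony. This is a minimal pure-Python sketch of the companding curve, not the document's own algorithm:

```python
import math

MU = 255  # mu-law parameter used in the North American G.711 standard

def mu_law_encode(x: float) -> float:
    """Compress a sample in [-1, 1]: fine resolution near zero, coarse near the peaks."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_decode(y: float) -> float:
    """Invert the companding to recover the original sample."""
    return math.copysign((math.pow(1.0 + MU, abs(y)) - 1.0) / MU, y)

# Companding itself is lossless; the bit savings come from quantizing
# the companded value (e.g., to 8 bits) instead of the raw waveform.
sample = 0.25
roundtrip = mu_law_decode(mu_law_encode(sample))
```

In practice the encoder output is quantized to 8 bits per sample, halving the rate of 16-bit linear PCM while keeping quantization noise roughly proportional to signal level.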

Ravishankar, C., Hughes Network Systems, Germantown, MD

1998-05-08

9

Speech Research.  

National Technical Information Service (NTIS)

Partial Contents: Speech Perception, Children's Memory for Sentences and Word Strings in Relation to Reading Ability, Effects of Vocalic Formant Transitions and Vowel Quality on the English (s)-(s) Boundary, Influence of Vocalic Context on Perception of t...

A. S. Abramson T. Baer F. Bell-Berti C. Best G. J. Borden

1979-01-01

10

Speech and Language Impairments  

MedlinePLUS

... 11] Back to top Development of Speech and Language Skills in Childhood Speech and language skills develop ... story. Back to top Characteristics of Speech or Language Impairments The characteristics of speech or language impairments ...

11

Great American Speeches  

NSDL National Science Digital Library

Watch the video presentations of each of these speeches: the Gettysburg Address; Martin Luther King's "I Have a Dream"; Mario Savio's "Freedom of Speech" speech; and FDR's "New Worker Plan" speech. For manuscripts, audio, and video of many other modern and past speeches, follow the link below: American Speech Bank ...

Olsen, Ms.

2006-11-14

12

Speech communications in noise  

NASA Technical Reports Server (NTRS)

The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.
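
The articulation index mentioned above combines per-band signal-to-noise ratios into a single intelligibility score. The following is a simplified sketch of that idea; the band centers and importance weights are illustrative placeholders, not the standardized values:

```python
# Simplified articulation-index-style score: clip each band's SNR to the
# 0-30 dB range that matters for intelligibility, weight by the band's
# assumed importance to speech, and normalize to [0, 1].

BANDS = [
    # (center_Hz, importance_weight) -- hypothetical weights summing to 1.0
    (500, 0.2), (1000, 0.3), (2000, 0.3), (4000, 0.2),
]

def articulation_index(snr_db_per_band):
    """snr_db_per_band: dict mapping band center (Hz) -> SNR in dB."""
    ai = 0.0
    for center, weight in BANDS:
        snr = max(0.0, min(30.0, snr_db_per_band[center]))
        ai += weight * (snr / 30.0)
    return ai  # 0.0 (speech fully masked) .. 1.0 (speech fully audible)

clear = articulation_index({500: 30, 1000: 30, 2000: 30, 4000: 30})
noisy = articulation_index({500: -5, 1000: 0, 2000: 0, 4000: 0})
```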

1984-01-01

13

Speech warnings: a review  

Microsoft Academic Search

This article reviews the use and design of speech warnings in terms of ergonomic considerations. Firstly, it considers the benefits of using the auditory channel and technological approaches to producing artificial speech. Secondly, the characteristics of human and machine-generated speech are reviewed, the latter focusing on naturalness, intelligibility, rate of presentation, emotional content, and quality. Thirdly, non-speech and speech warnings,

J. M. Noyes; E. Hellier; J. Edworthy

2006-01-01

14

Speech and Voice Problems  

MedlinePLUS

... Text Larger Text Print In this article Overview Speech disorders are fairly common in MS. Speech patterns are ... it difficult to speak and be understood. Medically, speech disorders are called dysarthrias . One pattern that is commonly ...

15

Speech and Language Disorders  

MedlinePLUS

... This information in Spanish ( en español ) Speech and language disorders More information on speech and language disorders ... Return to top More information on Speech and language disorders Explore other publications and websites Aphasia - This ...

16

Speech research  

NASA Astrophysics Data System (ADS)

Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.

1992-06-01

17

Interactions between speech coders and disordered speech  

Microsoft Academic Search

We examined the impact of standard speech coders currently used in modern communication systems, on the quality of speech from persons with common speech and voice disorders. Four standardized coders, viz. G. 728 LD-CELP, GSM 6.10 RPE-LTP, FS1016 CELP, FS1015 LPC and the recently proposed US Federal Standard 2400 bps MELP were evaluated with speech samples collected from 30 disordered

Vijay Parsa; D. G. Jamieson

2003-01-01

18

Handheld Speech to Speech Translation System  

Microsoft Academic Search

Recent advances in the processing capabilities of handheld devices (PDAs or mobile phones) have provided the opportunity to enable speech recognition systems, and even end-to-end speech translation systems, on these devices. However, two-way free-form speech-to-speech translation (as opposed to fixed-phrase translation) is a highly complex task, and a large amount of computation is involved to achieve reliable translation performance.

Yuqing Gao; Bowen Zhou; Weizhong Zhu; Wei Zhang

19

Models of speech synthesis.  

PubMed

The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community. PMID:7479805

Carlson, R

1995-10-24

20

SpeechSkimmer: Interactively Skimming Recorded Speech  

Microsoft Academic Search

Skimming or browsing audio recordings is much more difficult than visually scanning a document because of the temporal nature of audio. By exploiting properties of spontaneous speech, it is possible to automatically select and present salient audio segments in a time-efficient manner. Techniques for segmenting recordings and a prototype user interface for skimming speech are described. The system developed incorporates time-compressed speech and pause removal to reduce the time needed to listen to speech recordings. This paper presents a multi-level approach to auditory skimming, along with user interface techniques for interacting with the audio and providing feedback. Several time-compression algorithms and an adaptive speech detection technique are also summarized. KEYWORDS: Speech skimming, browsing, speech
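
The pause-removal step that the abstract describes can be sketched as a simple energy gate over fixed-length frames. The frame length and threshold below are arbitrary illustrative values, not those of the SpeechSkimmer system:

```python
# Toy pause removal: split the signal into fixed-length frames and drop
# any frame whose mean absolute amplitude falls below a silence threshold.

def remove_pauses(samples, frame_len=160, threshold=0.02):
    """Return the sample list with low-energy (silent) frames removed."""
    kept = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(abs(s) for s in frame) / len(frame)
        if energy >= threshold:
            kept.extend(frame)
    return kept

# 320 samples of "speech", a 320-sample pause, then 160 more of "speech".
speech = [0.5] * 320 + [0.0] * 320 + [0.5] * 160
shortened = remove_pauses(speech)  # the silent middle frames are dropped
```

A real system would use an adaptive threshold (as the paper's speech detector does) rather than a fixed one, and would cross-fade at frame boundaries to avoid clicks.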

B. Arons

1994-01-01

21

Towards Universal Speech Recognition  

Microsoft Academic Search

The increasing interest in multilingual applications like speech-to-speech translation systems is accompanied by the need for speech recognition front-ends in many languages that can also handle multiple input languages at the same time. In this paper we describe a universal speech recognition system that fulfills such needs. It is trained by sharing speech and text data across languages and thus

Zhirong Wang; Umut Topkara; Tanja Schultz; Alex Waibel

2002-01-01

22

Speech research directions  

SciTech Connect

This paper presents an overview of the current activities in speech research. The authors discuss the state of the art in speech coding, text-to-speech synthesis, speech recognition, and speaker recognition. In the speech coding area, current algorithms perform well at bit rates down to 9.6 kb/s, and the research is directed at bringing the rate for high-quality speech coding down to 2.4 kb/s. In text-to-speech synthesis, what we currently are able to produce is very intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility to the synthetic speech that these systems produce. Finally, today's systems for speech and speaker recognition provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, etc.

Atal, B.S.; Rabiner, L.R.

1986-09-01

23

Cochlear implant speech recognition with speech maskers  

Microsoft Academic Search

Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners,

Ginger S. Stickney; Fan-Gang Zeng; Ruth Litovsky; Peter Assmann

2004-01-01

24

Delayed Speech or Language Development  

MedlinePLUS

... your child is right on schedule. Normal Speech & Language Development It's important to discuss early speech and ... for example). Continue The Difference Between Speech and Language Speech and language are often confused, but there ...

25

Linguistic Resources for Speech Parsing.  

National Technical Information Service (NTIS)

We report on the success of a two-pass approach to annotating metadata, speech effects and syntactic structure in English conversational speech: separately annotating transcribed speech for structural metadata, or structural events, (fillers, speech repai...

A. Bies S. Strassel H. Lee K. Maeda S. Kulick

2006-01-01

26

Speech Compression and Synthesis.  

National Technical Information Service (NTIS)

This document reports on work towards a very low rate phonetic vocoder, text to speech, and multirate speech compression. This work included improvement of the phonetic synthesis algorithms and continued gathering of the diphone templates data base for ph...

M. Berouti J. Klovstad J. Makhoul R. Schwartz J. Sorensen

1979-01-01

27

Trainable Videorealistic Speech Animation.  

National Technical Information Service (NTIS)

We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a ...

T. Ezzat, G. Geiger, T. Poggio

2006-01-01

28

Apraxia of Speech  

MedlinePLUS

... inflections of speech that are used to help express meaning. Children with developmental apraxia of speech generally ... than they are able to use language to express themselves. Some children with the disorder may also ...

29

Speech disorders - children  

MedlinePLUS

... person has problems creating or forming the speech sounds needed to communicate with others. Three common speech ... are disorders in which a person repeats a sound, word, or phrase. Stuttering may be the most ...

30

Speech-to-Speech Relay Service  

MedlinePLUS

... are specifically trained in understanding a variety of speech disorders, which enables them to repeat what the caller says in a manner that makes the caller’s words clear and understandable to the called ... Often people with speech disabilities cannot communicate by telephone because the parties ...

31

Flexible speech translation systems  

Microsoft Academic Search

Speech translation research has made significant progress over the years with many high-visibility efforts showing that translation of spontaneously spoken speech from and to di- verse languages is possible and applicable in a variety of domains. As language and domains continue to expand, practical concerns such as portability and reconfigurability of speech come into play: system maintenance becomes a key

Tanja Schultz; Alan W. Black; Stephan Vogel; Monika Woszczyna

2006-01-01

32

Machine Translation from Speech  

NASA Astrophysics Data System (ADS)

This chapter describes approaches for translation from speech. Translation from speech presents two new issues. First, of course, we must recognize the speech in the source language. Although speech recognition has improved considerably over the last three decades, it is still far from being a solved problem. In the best of conditions, when the speech is high quality, carefully enunciated, and on common topics (such as speech read by a trained news broadcaster), the word error rate is typically on the order of 5%. Humans can typically transcribe speech like this with less than 1% disagreement between annotators, so even this best number is still far worse than human performance. However, the task gets much harder when anything changes from this ideal condition. Error rates rise when the topic is somewhat unusual, when the speakers are not reading so that their speech is more spontaneous, when the speakers have an accent or are speaking a dialect, or when there is any acoustic degradation, such as noise or reverberation. In these cases, the word error rate can increase significantly to 20%, 30%, or higher. Accordingly, most of this chapter discusses techniques for improving speech recognition accuracy, while one section discusses techniques for integrating speech recognition with translation.
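
The word error rates quoted above are conventionally computed as the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and the recognizer's hypothesis, divided by the reference length. A minimal sketch:

```python
# Word error rate via Levenshtein distance over word sequences.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
# one substitution over six reference words
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why a 20-30% figure already indicates substantial degradation.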

Schwartz, Richard; Olive, Joseph; McCary, John; Christianson, Caitlin

33

Electronic commerce and free speech  

Microsoft Academic Search

For commercial purveyors of digital speech, information and entertainment, the biggest threat posed by the Internet isn't the threat of piracy, but the threat posed by free speech -- speech that doesn't cost any money. Free speech has the potential to squeeze out expensive speech. A glut of high quality free stuff has the potential to run companies in the

Jessica Litman

1999-01-01

34

RAPID DEVELOPMENT OF SPEECH-TO-SPEECH TRANSLATION SYSTEMS  

Microsoft Academic Search

This paper describes the building of the basic components, particularly speech recognition and synthesis, of a speech-to-speech translation system. This work is described within the framework of the

Alan W Black; Ralf D. Brown; Robert Frederking; Kevin Lenzo; John Moody; Alexander Rudnicky; Rita Singh; Eric Steinbrecher

2002-01-01

35

Analyzing a Famous Speech  

NSDL National Science Digital Library

After gaining skill through analyzing a historic and contemporary speech as a class, students will select a famous speech from a list compiled from several resources and write an essay that identifies and explains the rhetorical strategies that the author deliberately chose while crafting the text to make an effective argument. Their analysis will consider questions such as: What makes the speech an argument?, How did the author's rhetoric evoke a response from the audience?, and Why are the words still venerated today?

Noel, Melissa W.

2012-08-01

36

Polyphase speech recognition  

Microsoft Academic Search

We propose a model for speech recognition that consists of multiple semi-synchronized recognizers operating on a polyphase decomposition of standard speech features. Specifically, we consider multiple out-of-phase downsampled speech features as separate streams which are modeled separately at the lowest level and are then integrated at the higher level (words) during first-pass decoding. Our model lessens

Hui Lin; Jeff Bilmes

2008-01-01

37

Advances in speech processing  

NASA Astrophysics Data System (ADS)

The field of speech processing is undergoing rapid growth in both performance and applications, fueled by advances in microelectronics, computation, and algorithm design. The use of voice for civil and military communications is discussed, considering advantages and disadvantages including the effects of environmental factors such as acoustic and electrical noise, interference, and propagation. The structure of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) concept are briefly reviewed to show how they meet present and future requirements. The paper then deals with the fundamental subject of speech coding and compression. Recent advances in techniques and algorithms for speech coding now permit high-quality voice reproduction at remarkably low bit rates. The subject of speech synthesis is treated next, where the principal objective is to produce natural-quality synthetic speech from unrestricted text input. Speech recognition, where the ultimate objective is to produce a machine that would understand conversational speech with unrestricted vocabulary from essentially any talker, is then discussed. Algorithms for speech recognition can be characterized broadly as pattern-recognition approaches and acoustic-phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern-recognition paradigms, and it is for this reason that the paper is concerned primarily with this technique.

Ince, A. Nejat

1992-10-01

38

Hearing or speech impairment - resources  

MedlinePLUS

Resources - hearing or speech impairment ... The following organizations are good resources for information on hearing impairment or speech impairment: American Speech-Language-Hearing Association - www.asha.org National Dissemination Center for Children ...

39

Reviews: The High School Basic Speech Text  

ERIC Educational Resources Information Center

Critical evaluations of "The Art of Speaking," "Building Better Speech," "The New American Speech," "Public Speaking: The Essentials," "Speak Up!," "Speech: A High School Course," "The Speech Arts," "Speech for All," "Speech for Today," "Speech in Action," and "Speech in American Society." (RD)

Klopf, Donald W.

1970-01-01

40

Speech Sound Disorders: Articulation and Phonological Processes  

MedlinePLUS

Speech Sound Disorders: Articulation and Phonological Processes What are speech sound disorders ? Can adults have speech sound disorders ? What ... individuals with speech sound disorders ? What are speech sound disorders? Most children make some mistakes as they ...

41

Cochlear implant speech recognition with speech maskers.  

PubMed

Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced. PMID:15376674
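
The target-to-masker ratios used in such experiments set the relative power of the target speech and the competing signal. A generic sketch of mixing two signals at a given TMR (this is an illustration of the measure, not the authors' code):

```python
import math

# Mix a target with a masker at a chosen target-to-masker ratio (TMR):
# scale the masker so that 10*log10(P_target / P_masker) equals tmr_db.

def mix_at_tmr(target, masker, tmr_db):
    p_t = sum(s * s for s in target) / len(target)  # mean target power
    p_m = sum(s * s for s in masker) / len(masker)  # mean masker power
    # Gain applied to the masker so the power ratio matches tmr_db.
    gain = math.sqrt(p_t / (p_m * 10 ** (tmr_db / 10)))
    return [t + gain * m for t, m in zip(target, masker)]

# At 0 dB TMR, target and masker contribute equal power.
mixed = mix_at_tmr([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0], 0.0)
```

Lowering tmr_db raises the masker gain, which is how the experiment sweeps from easy (high TMR) to hard (0 dB and below) listening conditions.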

Stickney, Ginger S; Zeng, Fan-Gang; Litovsky, Ruth; Assmann, Peter

2004-08-01

42

Cochlear implant speech recognition with speech maskers  

NASA Astrophysics Data System (ADS)

Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners, target-masker combinations were processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli, a normal-hearing control group maintained high levels of intelligibility down to target-to-masker ratios as low as 0 dB and showed a release from masking, producing better performance with single-talker maskers than with steady-state noise. In contrast, no masking release was observed in either implant or normal-hearing subjects listening through an implant simulation. The performance of the simulation and implant groups did not improve when the single-talker masker was a different talker compared to the same talker as the target speech, as was found in the normal-hearing control. These results are interpreted as evidence for a significant role of informational masking and modulation interference in cochlear implant speech recognition with fluctuating maskers. This informational masking may originate from increased target-masker similarity when spectral resolution is reduced.

Stickney, Ginger S.; Zeng, Fan-Gang; Litovsky, Ruth; Assmann, Peter

2004-08-01

43

Research in speech communication.  

PubMed Central

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.

Flanagan, J

1995-01-01

44

Sampled Speech Compression System.  

National Technical Information Service (NTIS)

A sampled speech compression system, for two-dimensional processing of speech or other types of audio signal, comprises transmit/encode apparatus and receive/decode apparatus. The transmit/encode apparatus comprises means, adapted to receive an input sign...

H. J. Whitehouse J. M. Alsup

1979-01-01

45

Chief Seattle's Speech Revisited  

ERIC Educational Resources Information Center

Indian orators have been saying good-bye for more than three hundred years. John Eliot's "Dying Speeches of Several Indians" (1685), as David Murray notes, inaugurates a long textual history in which "Indians... are most useful dying," or, as in a number of speeches, bidding the world farewell as they embrace an undesired but apparently inevitable…

Krupat, Arnold

2011-01-01

46

Research in speech communication.  

PubMed

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806

Flanagan, J

1995-10-24

47

Private Speech in Ballet  

ERIC Educational Resources Information Center

Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

Johnston, Dale

2006-01-01

48

Egocentric Speech Reconsidered.  

ERIC Educational Resources Information Center

A range-of-language-use model is proposed as an alternative conceptual framework to a stage model of egocentric speech. The range-of-language-use model is proposed to clarify the meaning of the term egocentric speech, to examine the validity of stage assumptions, and to explain the existence of contextual variation in the form of children's…

Braunwald, Susan R.

49

Tracking Speech Sound Acquisition  

ERIC Educational Resources Information Center

This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

Powell, Thomas W.

2011-01-01

50

Automatic speech recognition  

NASA Astrophysics Data System (ADS)

Great strides have been made in the development of automatic speech recognition (ASR) technology over the past thirty years. Most of this effort has been centered around the extension and improvement of Hidden Markov Model (HMM) approaches to ASR. Current commercially available and industry systems based on HMMs can perform well for certain situational tasks that restrict variability, such as phone dialing or limited voice commands. However, the holy grail of ASR systems is performance comparable to humans--in other words, the ability to automatically transcribe unrestricted conversational speech spoken by an infinite number of speakers under varying acoustic environments. This goal is far from being reached. Key to the success of ASR is effective modeling of variability in the speech signal. This tutorial will review the basics of ASR and the various ways in which our current knowledge of speech production, speech perception and prosody can be exploited to improve robustness at every level of the system.

Espy-Wilson, Carol

2005-04-01
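The HMM approach this tutorial abstract centers on rests on one small core computation: scoring an observation sequence against a model. As a hedged illustration (the two states, toy symbols, and all probabilities below are invented for this sketch, not taken from the record), the forward algorithm looks like:

```python
# Minimal sketch of the HMM forward algorithm at the heart of most
# HMM-based ASR systems: compute P(observations | model) by summing
# over all hidden-state paths. All model values here are toy numbers.

states = ["S1", "S2"]
start = {"S1": 0.6, "S2": 0.4}              # initial state probabilities
trans = {"S1": {"S1": 0.7, "S2": 0.3},      # transition probabilities
         "S2": {"S1": 0.4, "S2": 0.6}}
emit = {"S1": {"a": 0.5, "b": 0.5},         # emission probabilities
        "S2": {"a": 0.1, "b": 0.9}}

def forward(obs):
    """Return the likelihood of the observation sequence under the model."""
    # Initialize with start probability times first emission.
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    # Recurse: sum over predecessor states, then emit the next symbol.
    for o in obs[1:]:
        alpha = {s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][o]
                 for s in states}
    return sum(alpha.values())

likelihood = forward(["a", "b", "a"])
```

For this toy model, forward(["a"]) is 0.6·0.5 + 0.4·0.1 = 0.34, and the likelihoods of all sequences of a given length sum to one, since the model is properly normalized.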

51

Speech, Language & Hearing Association  

NSDL National Science Digital Library

The American Speech-Language-Hearing Association’s (ASHA) mission statement is to “promote the interests of and provide the highest quality services for professionals in audiology, speech-language pathology, and speech and hearing science.” Their website is designed to help ASHA accomplish this task, and is a valuable resource for anyone involved in this industry. The ASHA has been around for 79 years and in that time has created resources for students and the general public, in order to educate people about speech and communication disorders and diseases. The site includes detailed explanations on many diseases and disorders and provides additional resources for those who want to learn more. For students, there are sections with information on various speech, language, and hearing professions; a guide to academic programs; and a useful guide to the Praxis exam required for many of these professions.

2006-12-28

52

Recognition of conversational telephone speech using the JANUS speech engine  

Microsoft Academic Search

Recognition of conversational speech is one of the most challenging speech recognition tasks to date. While recognition error rates of 10% or lower can now be reached on speech dictation tasks over vocabularies in excess of 60,000 words, recognition of conversational speech has persistently resisted most attempts at improvements by way of the proven techniques to date. Difficulties arise from shorter

Torsten Zeppenfeld; Michael Finke; Klaus Ries; Martin Westphal; Alex Waibel

1997-01-01

53

Normalization of the Speech Modulation Spectra for Robust Speech Recognition  

Microsoft Academic Search

In this paper, we study a novel technique that normalizes the modulation spectra of speech signals for robust speech recognition. The modulation spectra of a speech signal are the power spectral density (PSD) functions of the feature trajectories generated from the signal; hence they describe the temporal structure of the features. The modulation spectra are distorted when the speech signal is

Xiong Xiao; Chng Eng Siong; Haizhou Li

2008-01-01
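The "modulation spectra" this abstract defines are simply spectra of feature trajectories taken across frames. A minimal pure-Python sketch (the 100-Hz frame rate, the synthetic trajectory, and its 4-Hz modulation are all invented for illustration, and no normalization of the kind the paper studies is applied) shows how the dominant modulation frequency falls out:

```python
import cmath
import math

# Toy sketch of a modulation spectrum: the spectrum of a feature value
# sampled once per frame. A real system would take the PSD of e.g. an
# MFCC trajectory; here the trajectory is a synthetic 4-Hz sinusoid.

frame_rate = 100.0   # feature frames per second (assumed for the sketch)
n = 100
traj = [math.sin(2 * math.pi * 4.0 * i / frame_rate) for i in range(n)]

def modulation_spectrum(x):
    """Magnitude DFT of a feature trajectory, one value per frequency bin."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

spec = modulation_spectrum(traj)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * frame_rate / n   # the energy sits at the 4-Hz bin
```

Because the trajectory completes exactly four cycles over the 100 analysis frames, the spectrum peaks in bin 4, i.e. at a 4-Hz modulation frequency.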

54

PROBLEMS OF MEASURING SPEECH RATE.  

ERIC Educational Resources Information Center

A discussion was presented on the problems of measuring speech rate, a critical variable in speech compression, both in describing the input to any speech compression system and in characterizing the output. The discussion was limited to speech rate measurement of "oral reading rate" only, and did not deal with the measurement of "spontaneous…

Carroll, John B.

55

Speech Correction in the Schools.  

ERIC Educational Resources Information Center

An introduction to the problems and therapeutic needs of school age children whose speech requires remedial attention, the text is intended for both the classroom teacher and the speech correctionist. General considerations include classification and incidence of speech defects, speech correction services, the teacher as a speaker, the mechanism…

Eisenson, Jon; Ogilvie, Mardel

56

Voice and Speech after Laryngectomy  

ERIC Educational Resources Information Center

The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

2006-01-01

57

The Effect of SpeechEasy on Stuttering Frequency, Speech Rate, and Speech Naturalness  

ERIC Educational Resources Information Center

The effects of SpeechEasy on stuttering frequency, stuttering severity self-ratings, speech rate, and speech naturalness for 31 adults who stutter were examined. Speech measures were compared for samples obtained with and without the device in place in a dispensing setting. Mean stuttering frequencies were reduced by 79% and 61% for the device…

Armson, Joy; Kiefte, Michael

2008-01-01

58

Great American Speeches  

NSDL National Science Digital Library

This new companion site from PBS offers an excellent collection of speeches, some with audio and video clips, from many of the nation's "most influential and poignant speakers of the recorded age." In the Speech Archives, users will find a timeline of significant 20th-century events interspersed with the texts of over 90 speeches, some of which also offer background and audio or video clips. Additional sections of the site include numerous activities for students: two quizzes in the American History Challenge, Pop-Up Trivia, A Wordsmith Challenge, Critics' Corner, and Could You be a Politician?, which allows visitors to try their hand at reading a speech off a teleprompter.

59

Churchill Speech Interactive  

NSDL National Science Digital Library

A number of highly coordinated online "rich content" digital projects have come online in the past few years, and several are indicative of the far-reaching possibilities of such endeavors. One such project is the Churchill Speech Interactive website, a special initiative in Web learning that features a multi-media presentation of Winston Churchill's "Sinews of Peace" speech, delivered at Westminster College. The speech is best known for introducing the famous phrase "Iron Curtain". Users will find the complete audio of the speech here, enhanced by a clickable interface that provides information about the broader historical context of world events, organized around a number of key themes, such as "Europe in Ruins", "The Atom Bomb", and "Churchill and Europe". Overall, the site is quite fascinating.

60

Secure Digital Speech Communication.  

National Technical Information Service (NTIS)

This invention relates to a secure digital speech communication system, and more particularly to waveform coding with digital samples. A novel form of waveform coding is used, designated a Critical Point Coder (CPC). It operates by transmitting only those p...

G. Benke

1984-01-01

61

Speech and Communication Disorders  

MedlinePLUS

Many disorders can affect our ability to speak and communicate. They range from saying sounds incorrectly to ... speak or understand speech. Causes include Hearing disorders and deafness Voice problems, such as dysphonia or those ...

62

Speech Understanding Systems.  

National Technical Information Service (NTIS)

This project is an effort to develop a continuous speech understanding system which uses syntactic, semantic and pragmatic support from higher level linguistic knowledge sources to compensate for the inherent acoustic indeterminacies in continuous spoken ...

W. A. Woods; M. Bates; G. Brown; B. Bruce; C. Cook

1976-01-01

63

[Dysphagia and speech disorders].  

PubMed

Swallowing disorders in the oral and pharyngeal phase after surgery of the mouth, pharynx or larynx are very often interrelated with speech and voice disorders. The results of diagnostic methods for dysphagia and voice/speech disorders, based on the authors' own material of patients after total laryngectomy, partial tongue resection and cleft palate surgery, are presented. Attention is also paid to other etiological factors of swallowing disorders observed in phoniatric practice. PMID:10391042

Pruszewicz, A; Woźnica, B; Obrebowski, A; Karlik, M

1999-01-01

64

Speech processing using maximum likelihood continuity mapping  

DOEpatents

Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

Hogden, John E. (Santa Fe, NM)

2000-01-01

65

Research on Speech Communications and Automatic Speech Recognition.  

National Technical Information Service (NTIS)

The general areas of research include speech analysis, automatic processing of phonemic strings to orthographic outputs, linguistic theory and language description, and the prosodies of speech. Progress in each of these areas is described. (Author)

J. E. Shoup

1970-01-01

66

Objective speech quality evaluation of real-time speech coders  

NASA Astrophysics Data System (ADS)

This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape-recordings of the input speech and the corresponding output speech of a real-time speech coder.

Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

1984-02-01

67

SpeechSkimmer: a system for interactively skimming recorded speech  

Microsoft Academic Search

Listening to a speech recording is much more difficult than visually scanning a document because of the transient and temporal nature of audio. Audio recordings capture the richness of speech, yet it is difficult to directly browse the stored information. This article describes techniques for structuring, filtering, and presenting recorded speech, allowing a user to navigate and interactively find information

Barry Arons

1997-01-01

68

Differential Diagnosis of Severe Speech Disorders Using Speech Gestures  

ERIC Educational Resources Information Center

The differentiation of childhood apraxia of speech from severe phonological disorder is a common clinical problem. This article reports on an attempt to describe speech errors in children with childhood apraxia of speech on the basis of gesture use and acoustic analyses of articulatory gestures. The focus was on the movement of articulators and…

Bahr, Ruth Huntley

2005-01-01

69

Muscle-based Approach to Speech Therapy  

Microsoft Academic Search

Underlying muscle weakness and instability of the jaw, lips and tongue can lead to poor muscle coordination for speech. When the foundations of speech are compromised, due to various deficits and disorders affecting muscle movement, speech clarity is diminished. A muscle-based approach to speech therapy provides the basic building blocks of speech by addressing speech clarity disorders secondary to

Jennifer A. Bathel

70

[Visual synthesis of speech].  

PubMed

The eyes can become the sole tool of communication for highly disabled patients. With the appropriate technology it is possible to successfully interpret eye movements, increasing the possibilities of patient communication with the use of speech synthesisers. Such a system will have to include a speech synthesiser, an interface for the user to construct the text, and a method of gaze interpretation. In this way, the user will manage the system solely with his or her eyes. This review sets out the state of the art of the three modules that make up a system of this type, and finally it introduces the speech synthesis system (Síntesis Visual del Habla [SiVHa]), which is being developed at the Public University of Navarra. PMID:12886320

Blanco, Y; Villanueva, A; Cabeza, R

2000-01-01

71

Environment-Optimized Speech Enhancement  

Microsoft Academic Search

In this paper, we present a training-based approach to speech enhancement that exploits the spectral statistical characteristics of clean speech and noise in a specific environment. In contrast to many state-of-the-art approaches, we do not model the probability density function (pdf) of the clean speech and the noise spectra. Instead, subband-individual weighting rules for noisy speech spectral amplitudes are separately

Tim Fingscheidt; Suhadi Suhadi; Sorel Stan

2008-01-01

72

Emotive Qualities in Robot Speech.  

National Technical Information Service (NTIS)

This paper explores the expression of emotion in synthesized speech for an anthropomorphic robot. We have adapted several key emotional correlates of human speech to the robot's speech synthesizer to allow the robot to speak in either an angry, calm, disg...

C. Breazeal

2000-01-01

73

Megatrends in Speech Communication: Administration.  

ERIC Educational Resources Information Center

Recurring questions on the discipline of speech communication include whether it is in fact a discipline, whether it justifies its own department, and what job prospects await speech communication graduates. These questions are not unique, but are taken very seriously by most administrators when evaluating speech programs. Reasons underlying most…

Peterson, Brent D.

74

Speech Perception in the Classroom.  

ERIC Educational Resources Information Center

This article discusses how poor room acoustics can make speech inaudible and presents a speech-perception model demonstrating the linkage between the adequacy of classroom acoustics and the development of speech and language systems. It argues that both aspects must be considered when evaluating barriers to listening and learning in a classroom…

Smaldino, Joseph J.; Crandell, Carl C.

1999-01-01

75

Speed of speech and persuasion  

Microsoft Academic Search

The relationship between speaking rate and attitude change was investigated in 2 field experiments with 449 Ss. Manipulations of speech rate were crossed with (a) credibility of the speaker and (b) complexity of the spoken message. Results suggest that speech rate functions as a general cue that augments credibility; rapid speech enhances persuasion, and therefore argues against information-processing interpretations of

Norman Miller; Geoffrey Maruyama; Rex J. Beaber; Keith Valone

1976-01-01

76

Microphones for speech and speech recognition  

NASA Astrophysics Data System (ADS)

Automatic speech recognition (ASR) requires about a 15- to 20-dB signal-to-noise ratio (S/N) for high accuracy, even for small vocabulary systems. This S/N is generally achievable using a telephone handset in normal office or home environments. In the early 1990s, AT&T and the regional telephone companies began using speaker-independent ASR to replace several operator services. The variable distortion in the carbon microphone was not transparent and resulted in reduced ASR accuracy. The linear electret condenser microphone, common in most modern telephones, improved handset performance both in sound quality and ASR accuracy. Hands-free ASR in quiet conditions is a bit more complex because of the increased distance between the microphone and the speech source. Cardioid directional microphones offer some improvement in noisy locations when the noise and desired signals are spatially separated, but this is not the general case and the resulting S/N is not adequate for seamless machine translation. Higher-order directional microphones, when properly oriented with respect to the talker and noise, have shown good improvement over omni-directional microphones. Some ASR results measured in simulated car noise will be presented.

West, James E.

2004-10-01
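The 15- to 20-dB S/N figure quoted in this abstract is a power ratio on a log scale. As a small sketch (the example power values are invented), signal-to-noise ratio in decibels is:

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels, from linear power values."""
    return 10.0 * math.log10(signal_power / noise_power)

# A signal 100x stronger than the noise gives 20 dB, the upper end of
# the range the abstract cites for accurate small-vocabulary ASR.
ratio_db = snr_db(100.0, 1.0)   # 20.0 dB
```

Note the factor of 10 (not 20) because these are power, not amplitude, quantities; for amplitudes the equivalent is 20·log10(A_signal/A_noise).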

77

Comparative evaluation of the speech quality of speech coders and text-to-speech synthesizers  

Microsoft Academic Search

In a joint project called SPIN, which is sponsored by the European Information Technology Program ESPRIT, a speech interface for office automation will be developed and tested. Two specific aspects of such an interface, which are discussed in this paper, have to do with speech store-and-forward and text-to-speech synthesis-by-rule. We will diagnostically evaluate the speech quality of the medium-band coders

L. C. W. Pols; G. W. Boxelaar

1986-01-01

78

Free Speech Yearbook 1979.  

ERIC Educational Resources Information Center

The seven articles in this collection deal with theoretical and practical freedom of speech issues. Topics covered are: the United States Supreme Court, motion picture censorship, and the color line; judicial decision making; the established scientific community's suppression of the ideas of Immanuel Velikovsky; the problems of avant-garde jazz,…

Kane, Peter E., Ed.

79

Free Speech Yearbook 1973.  

ERIC Educational Resources Information Center

The first article in this collection examines civil disobedience and the protections offered by the First Amendment. The second article discusses a study on antagonistic expressions in a free society. The third essay deals with attitudes toward free speech and treatment of the United States flag. There are two articles on media; the first examines…

Barbour, Alton, Ed.

80

Hearing speech in music.  

PubMed

The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings. PMID:21768731

Ekström, Seth-Reino; Borg, Erik

2011-01-01

81

From the Speech Files  

ERIC Educational Resources Information Center

In a speech, "Looking Ahead in Vocational Education," to a group of Hamilton educators, D.O. Davis, Vice-President, Engineering, Dominion Foundries and Steel Limited, Hamilton, Ontario, spoke of the challenge of change and what educators and industry must do to help the future of vocational education. (Editor)

Can Vocat J, 1970

1970-01-01

82

Media Criticism Group Speech  

ERIC Educational Resources Information Center

Objective: To integrate speaking practice with rhetorical theory. Type of speech: Persuasive. Point value: 100 points (i.e., 30 points based on peer evaluations, 30 points based on individual performance, 40 points based on the group presentation), which is 25% of course grade. Requirements: (a) References: 7-10; (b) Length: 20-30 minutes; (c)…

Ramsey, E. Michele

2004-01-01

83

Black History Speech  

ERIC Educational Resources Information Center

The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

Noldon, Carl

2007-01-01

84

Continuous speech recognition  

Microsoft Academic Search

The authors focus on a tutorial description of the hybrid HMM/ANN method. The approach has been applied to large vocabulary continuous speech recognition, and variants are in use by many researchers. The method provides a mechanism for incorporating a range of sources of evidence without strong assumptions about their joint statistics, and may have applicability to much more complex systems

Nelson Morgan; Herve Bourlard

1995-01-01

85

Perceptual Learning in Speech  

ERIC Educational Resources Information Center

This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g.,…

Norris, Dennis; McQueen, James M.; Cutler, Anne

2003-01-01

86

Listening and Compressed Speech.  

ERIC Educational Resources Information Center

Since listening plays such a large role in communication and learning, audio tapes can function in an important fashion in the design and delivery of instruction. In addition, recent research indicates that compressed audio tapes, in which speech is edited electronically by a sampling method so that the words-per-minute rate is increased without…

Arrasjid, Harun

87

System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech  

DOEpatents

Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

Burnett, Greg C. (Livermore, CA); Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

2002-01-01

88

Multilevel Analysis in Analyzing Speech Data  

ERIC Educational Resources Information Center

The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

Guddattu, Vasudeva; Krishna, Y.

2011-01-01

89

Monitoring for speech errors has different functions in inner and overt speech  

Microsoft Academic Search

Abstract In this paper it is argued that monitoring for speecherrors is not the same in inner speech and in overt speech. In inner speech it is meant to prevent the errors from becoming public, in overt speech to repair the damage caused by the errors. It is expected that in inner speech, but not in overt speech, more nonword

Sieb Nooteboom

90

Gisting conversational speech  

Microsoft Academic Search

A novel system for extracting information from stereotyped voice traffic is described. Off-the-air recordings of commercial air traffic control communications are interpreted in order to identify the flights present and determine the scenario (e.g., takeoff, landing) that they are following. The system combines algorithms from signal segmentation, speaker segregation, speech recognition, natural language parsing, and topic classification into a single

J. R. Rohlicek; D. Ayuso; M. Bates; R. Bobrow; A. Boulanger; H. Gish; P. Jeanrenaud; M. Meteer; M. Siu

1992-01-01

91

[Speech changes in dementia].  

PubMed

This review analyzes the spectrum of language deficits commonly encountered in dementia. A specific communication profile is found in dementia of the "cortical" type, such as Alzheimer's disease. With advancing disease lexical, comprehension and pragmatic functions deteriorate, whereas syntax and phonology tend to be preserved. This pattern bears some resemblance to aphasia types like transcortical and Wernicke's aphasia, however, a much broader range of communicative functions is impaired in Alzheimer's disease than in aphasia. Differentiation of dementia and aphasia, especially in elderly patients requires careful neuropsychological assessment of language, memory and other psychological functions. "Subcortical" dementia commonly presents with dysarthria as the leading symptom and linguistic impairment is rarely of crucial importance until late stages. Thus, the interetiologic dissociation of language and speech impairment can be used for dementia differentiation. Aphasia batteries are not sufficient to comprehend the range of language deficits in demented patients. Testing the communication impairment in dementia requires specific tasks for spontaneous speech, naming, comprehension, reading, writing, repetition and motor speech functions. Tasks for verbal learning and metalinguistic abilities should also be performed. Language deficits are frequent initial symptoms of dementia, thus language assessment may be of diagnostic relevance. Many data support the concept that the communication deficit in dementia results from a particular impairment of semantic memory. PMID:1695887

Benke, T; Andree, B; Hittmair, M; Gerstenbrand, F

1990-06-01

92

Applications for Subvocal Speech  

NASA Technical Reports Server (NTRS)

A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous- material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.

Jorgensen, Charles; Betts, Bradley

2007-01-01

93

Semantic Interpretation for Speech Recognition  

NSDL National Science Digital Library

The first working draft of the World Wide Web Consortium's (W3C) Semantic Interpretation for Speech Recognition is now available. The document "defines the process of Semantic Interpretation for Speech Recognition and the syntax and semantics of semantic interpretation tags that can be added to speech recognition grammars." The document is a draft, open for suggestions from W3C members and other interested users.

Lernout & Hauspie Speech Products; Tichelen, Luc V.

2001-01-01

94

Speech Recognition Over IP Networks  

Microsoft Academic Search

This chapter introduces the basic features of speech recognition over an IP-based network. First of all, we review typical lossy packet channel models and several speech coders used for voice over IP, where the performance of a network speech recognition (NSR) system can significantly degrade. Second, several techniques for maintaining the performance of NSR against packet loss are addressed. The

Hong Kook Kim

95

On the Nature of Speech Science.  

National Technical Information Service (NTIS)

The discipline of speech science is considered and the various basic and applied areas of the discipline are discussed. The basic areas encompass the various processes of the physiology of speech production, the acoustical characteristics of speech, inclu...

G. E. Peterson

1967-01-01

96

Controlled Generation for Speech-to-Speech MT Systems  

Microsoft Academic Search

In spoken dialog systems, a well-crafted prompt is important in order to get the user to respond with an expected type of utterance. We identify a new, important area for research in speech-to-speech translation, which focuses on the fact that the output of the MT system serves as the prompt for the user on each end. The

Arendse Bernth

2003-01-01

97

Speech-in-Speech Recognition: A Training Study  

ERIC Educational Resources Information Center

This study aims to identify aspects of speech-in-noise recognition that are susceptible to training, focusing on whether listeners can learn to adapt to target talkers ("tune in") and learn to better cope with various maskers ("tune out") after short-term training. Listeners received training on English sentence recognition in speech-shaped noise…

Van Engen, Kristin J.

2012-01-01

98

Inside a speech recognition machine  

Microsoft Academic Search

Illustration of speech recognition techniques by concentrating on a particular speech recognition system was presented. The system, known as Logos, is designed as a flexible, high-performance, experimental machine for research on recognition methods and application aspects. Algorithms for connected word recognition, on which the system is based, are presented. Implementing such algorithms in computer programs and special purpose equipment

J. S. Bridle

1983-01-01

99

Speech acoustics: How much science?  

PubMed Central

Human vocalizations are sounds made exclusively by a human vocal tract. Among other vocalizations, for example, laughs or screams, speech is the most important. Speech is the primary medium of that supremely human symbolic communication system called language. One of the functions of a voice, perhaps the main one, is to realize language, by conveying some of the speaker's thoughts in linguistic form. Speech is language made audible. Moreover, when phoneticians compare and describe voices, they usually do so with respect to linguistic units, especially speech sounds, like vowels or consonants. It is therefore necessary to understand the structure as well as the nature of speech sounds and how they are described. In order to understand and evaluate speech, it is important to have at least a basic understanding of the science of speech acoustics: how the acoustics of speech are produced, how they are described, and how differences, both between speakers and within speakers, arise in an acoustic output. One of the aims of this article is to facilitate this understanding.

Tiwari, Manjul

2012-01-01

100

Interpersonal Orientation and Speech Behavior.  

ERIC Educational Resources Information Center

Indicates that (1) males with low interpersonal orientation (IO) were least vocally active and expressive and least consistent in their speech performances, and (2) high IO males and low IO females tended to demonstrate greater speech convergence than either low IO males or high IO females. (JD)

Street, Richard L., Jr.; Murphy, Thomas L.

1987-01-01

101

Robust Signal Subspace Speech Classifier  

Microsoft Academic Search

A speech model inspired by the signal subspace approach was recently proposed as a speech classifier with modest results. The method entails, in general, the assemblage of a set of subspace trajectories that consist of the right singular vectors of measurement matrices of the signal under consideration. Given an unknown signal, a simple distortion measure then applies in the classification

Alan W. C. Tan; M. V. C. Rao; B. S. Daya Sagar

2007-01-01
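The signal subspace scheme described above can be sketched directly: stack overlapping frames of a training signal into a measurement matrix, keep its leading right singular vectors as the class subspace, and classify an unknown signal by its projection residual. This is a toy illustration under assumed parameters (frame dimension, subspace rank), not the authors' exact formulation:

```python
import numpy as np

def hankel_matrix(x, dim):
    """Stack overlapping length-`dim` frames of x as rows of a measurement matrix."""
    return np.array([x[i:i + dim] for i in range(len(x) - dim + 1)])

def train_subspace(x, dim, rank):
    """Leading right singular vectors of the measurement matrix span the class subspace."""
    _, _, vt = np.linalg.svd(hankel_matrix(x, dim), full_matrices=False)
    return vt[:rank].T                      # (dim, rank) orthonormal basis

def distortion(x, basis):
    """Mean residual energy of frames of x outside the class subspace."""
    frames = hankel_matrix(x, basis.shape[0])
    proj = frames @ basis @ basis.T         # project frames onto the subspace
    return np.mean(np.sum((frames - proj) ** 2, axis=1))

def classify(x, bases):
    """Assign x to the class whose subspace gives the least distortion."""
    return int(np.argmin([distortion(x, b) for b in bases]))
```

A pure sinusoid occupies a two-dimensional subspace of its delay vectors, so a rank-2 basis trained on one frequency yields near-zero distortion for signals of that frequency and a clearly larger residual for others.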

102

Phonetic Transcription of Disordered Speech.  

ERIC Educational Resources Information Center

This article reviews major approaches to the transcription of disordered speech using the International Alphabet (IPA). Application of selected symbols for transcribing non-English sounds is highlighted in clinical examples, as are commonly used diacritic symbols. Included is an overview of the IPA extensions for transcription of atypical speech,…

Powell, Thomas W.

2001-01-01

103

Distortion measures for speech processing  

Microsoft Academic Search

Several properties, interrelations, and interpretations are developed for various speech spectral distortion measures. The principal results are 1) the development of notions of relative strength and equivalence of the various distortion measures, both in a mathematical sense corresponding to subjective equivalence and in a coding sense when used in minimum distortion or nearest neighbor speech processing systems; 2) the demonstration…

Robert M. Gray; Andres Buzo; Y. Matsuyama

1980-01-01
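Two of the spectral distortion measures commonly compared in this literature can be written in a few lines. A hedged sketch covering the Itakura-Saito divergence and the RMS log-spectral distance, computed between two power spectra; the family of measures the authors study is broader:

```python
import numpy as np

def log_spectral_distance(p, q):
    """RMS distance between two log power spectra, in dB."""
    lp, lq = 10 * np.log10(p), 10 * np.log10(q)
    return np.sqrt(np.mean((lp - lq) ** 2))

def itakura_saito(p, q):
    """Itakura-Saito divergence of spectrum p from model spectrum q.
    Zero iff p == q; note it is asymmetric, unlike a true metric."""
    r = p / q
    return np.mean(r - np.log(r) - 1)
```

The asymmetry of the Itakura-Saito measure is exactly the kind of property (metric vs. non-metric, subjective vs. coding equivalence) the paper's comparisons turn on.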

104

Speech Prosody in Cerebellar Ataxia  

ERIC Educational Resources Information Center

Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

2007-01-01

105

Apraxia of Speech: An overview  

Microsoft Academic Search

Apraxia of speech (AOS) is a motor speech disorder that can occur in the absence of aphasia or dysarthria. AOS has been the subject of some controversy since the disorder was first named and described by Darley and his Mayo Clinic colleagues in the 1960s. A recent revival of interest in AOS is due in part to the fact that

Jennifer Ogar; Hilary Slama; Nina Dronkers; Serena Amici; Maria Luisa Gorno-Tempini

2005-01-01

106

Perceptual Aspects of Cluttered Speech  

ERIC Educational Resources Information Center

The purpose of this descriptive investigation was to explore perceptual judgments of speech naturalness, compared to judgments of articulation, language, disfluency, and speaking rate, in the speech of two youths who differed in cluttering severity. Two groups of listeners, 48 from New York and 48 from West Virginia, judged 93 speaking samples on…

St. Louis, Kenneth O.; Myers, Florence L.; Faragasso, Kristine; Townsend, Paula S.; Gallaher, Amanda J.

2004-01-01

107

Speech Restoration: An Interactive Process  

ERIC Educational Resources Information Center

Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…

Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny

2009-01-01

108

Speech categorization in context: Joint effects of nonspeech and speech precursors  

PubMed Central

The extent to which context influences speech categorization can inform theories of pre-lexical speech perception. Across three conditions, listeners categorized speech targets preceded by speech context syllables. These syllables were presented as the sole context or paired with nonspeech tone contexts previously shown to affect speech categorization. Listeners’ context-dependent categorization across these conditions provides evidence that speech and nonspeech context stimuli jointly influence speech processing. Specifically, when the spectral characteristics of speech and nonspeech context stimuli are mismatched such that they are expected to produce opposing effects on speech categorization the influence of nonspeech contexts may undermine, or even reverse, the expected effect of adjacent speech context. Likewise, when spectrally matched, the cross-class contexts may collaborate to increase effects of context. Similar effects are observed even when natural speech syllables, matched in source to the speech categorization targets, serve as the speech contexts. Results are well-predicted by spectral characteristics of the context stimuli.

Holt, Lori L.

2006-01-01

109

Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder  

ERIC Educational Resources Information Center

Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech

Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

2006-01-01

110

Clear speech: A strategem for improving radio communications and automatic speech recognition in noise  

NASA Astrophysics Data System (ADS)

The acoustic characteristics of conversational speech production and clear speech production were compared for three different talkers. Increases in fundamental frequency, word token duration, and voice level for the clear speech were obtained. These results are compared to the results of similar studies and implications for improved intelligibility of speech and automatic speech recognition are discussed.

Mosko, J. D.

1981-06-01

111

On the Nature of Speech Science.  

ERIC Educational Resources Information Center

In this article the nature of the discipline of speech science is considered and the various basic and applied areas of the discipline are discussed. The basic areas encompass the various processes of the physiology of speech production, the acoustical characteristics of speech, including the speech wave types and the information-bearing acoustic…

Peterson, Gordon E.

112

Infant-Directed Speech Facilitates Word Segmentation  

ERIC Educational Resources Information Center

There are reasons to believe that infant-directed (ID) speech may make language acquisition easier for infants. However, the effects of ID speech on infants' learning remain poorly understood. The experiments reported here assess whether ID speech facilitates word segmentation from fluent speech. One group of infants heard a set of nonsense…

Thiessen, Erik D.; Hill, Emily A.; Saffran, Jenny R.

2005-01-01

113

Audio-Visual Speech Perception Is Special  

ERIC Educational Resources Information Center

In face-to-face conversation speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and…

Tuomainen, J.; Andersen, T.S.; Tiippana, K.; Sams, M.

2005-01-01

114

Infant Perception of Atypical Speech Signals  

ERIC Educational Resources Information Center

The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

Vouloumanos, Athena; Gelfand, Hanna M.

2013-01-01

115

Multifractal nature of unvoiced speech signals  

SciTech Connect

A refinement is made in the nonlinear dynamic modeling of speech signals. Previous research successfully characterized speech signals as chaotic. Here, we analyze fricative speech signals using multifractal measures to determine various fractal regimes present in their chaotic attractors. Results support the hypothesis that speech signals have multifractal measures. © 1996 American Institute of Physics.

Adeyemi, O.A. [Department of Electrical Engineering, University of Rhode Island, Kingston, Rhode Island 02881 (United States); Hartt, K. [Department of Physics, University of Rhode Island, Kingston, Rhode Island 02881 (United States); Boudreaux-Bartels, G.F. [Department of Electrical Engineering, University of Rhode Island, Kingston, Rhode Island 02881 (United States)

1996-06-01

116

Metrical perception of trisyllabic speech rhythms.  

PubMed

The perception of duration-based syllabic rhythm was examined within a metrical framework. Participants assessed the duration patterns of four-syllable phrases set within the stress structure XxxX (an Abercrombian trisyllabic foot). Using on-screen sliders, participants created percussive sequences that imitated speech rhythms and analogous non-speech monotone rhythms. There was a tendency to equalize the interval durations for speech stimuli but not for non-speech. Despite the perceptual regularization of syllable durations, different speech phrases were conceived in various rhythmic configurations, pointing to a diversity of perceived meters in speech. In addition, imitations of speech stimuli showed more variability than those of non-speech. Rhythmically skilled listeners exhibited lower variability and were more consistent with vowel-centric estimates when assessing speech stimuli. These findings enable new connections between meter- and duration-based models of speech rhythm perception. PMID:23417710

Benadon, Fernando

2014-01-01

117

Silog: Speech Input Logon  

NASA Astrophysics Data System (ADS)

Silog is a biometric authentication system that extends the conventional PC logon process using voice verification. Users enter their ID and password using a conventional Windows logon procedure, but then the biometric authentication stage makes a Voice over IP (VoIP) call to a VoiceXML (VXML) server. User interaction with this speech-enabled component then allows the user's voice characteristics to be extracted as part of a simple user/system spoken dialogue. If the captured voice characteristics match those of a previously registered voice profile, then network access is granted. If no match is possible, then a potential unauthorised system access has been detected and the logon process is aborted.

Grau, Sergio; Allen, Tony; Sherkat, Nasser

118

Tools for Research and Education in Speech Science  

Microsoft Academic Search

The Center for Spoken Language Understanding (CSLU) provides free language resources to researchers and educators in all areas of speech and hearing science. These resources are of great potential value to speech scientists for analyzing speech, for diagnosing and treating speech and language problems, for researching and evaluating language technologies, and for training students in the theory and practice of speech science. This article describes language resources…

Ronald A. Cole

1999-01-01

119

Neural Networks for Speech Application.  

National Technical Information Service (NTIS)

This is a general introduction to the reemerging technology called neural networks, and how these networks may provide an important alternative to traditional forms of computing in speech applications. Neural networks, sometimes called Artificial Neural S...

S. A. Luse

1987-01-01

120

Status Report on Speech Research.  

National Technical Information Service (NTIS)

In the report, the table content includes the following: Tiers in articulatory phonology, with some implications for casual speech; Coarticulatory influences on the perceived height of nasal vowels; Domain-final lengthening and foot-level shortening in sp...

M. Studdert-Kennedy

1988-01-01

121

Speech motor skill and stuttering.  

PubMed

The authors review converging lines of evidence from behavioral, kinematic, and neuroimaging data that point to limitations in speech motor skills in people who stutter (PWS). From their review, they conclude that PWS differ from those who do not in terms of their ability to improve with practice and retain practiced changes in the long term, and that they are less efficient and less flexible in their adaptation to lower (motor) and higher (cognitive-linguistic) order requirements that impact on speech motor functions. These findings in general provide empirical support for the position that PWS may occupy the low end of the speech motor skill continuum as argued in the Speech Motor Skills approach (Van Lieshout, Hulstijn, & Peters, 2004). PMID:22106825

Namasivayam, Aravind Kumar; van Lieshout, Pascal

2011-01-01

122

Acute stress reduces speech fluency.  

PubMed

People often report word-finding difficulties and other language disturbances when put in a stressful situation. There is, however, scant empirical evidence to support the claim that stress affects speech productivity. To address this issue, we measured speech and language variables during a stressful Trier Social Stress Test (TSST) as well as during a less stressful "placebo" TSST (Het et al., 2009). Compared to the non-stressful condition, participants showed higher word productivity during the TSST. By contrast, participants paused more during the stressful TSST, an effect that was especially pronounced in participants who produced a larger cortisol and heart rate response to the stressor. Findings support anecdotal evidence of stress-impaired speech production abilities. PMID:24555989

Buchanan, Tony W; Laures-Gore, Jacqueline S; Duff, Melissa C

2014-03-01

123

Why Go to Speech Therapy?  

MedlinePLUS

Why Go To Speech Therapy? By Lisa Scott, Ph.D., Florida State University. Many teens and ... types of therapy work best when you can go on an intensive schedule (i.e., every day ...

124

Writing, Inner Speech, and Meditation.  

ERIC Educational Resources Information Center

Examines the interrelationships among meditation, inner speech (stream of consciousness), and writing. Considers the possibilities and implications of using the techniques of meditation in educational settings, especially in the writing classroom. (RL)

Moffett, James

1982-01-01

125

Perceptual Learning of Interrupted Speech  

PubMed Central

The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise, and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved on the form of speech they were not trained with, and were retainable. Due to the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired and elderly populations.

Benard, Michel Ruben; Baskent, Deniz

2013-01-01

126

[Speech disorders in ENT practice].  

PubMed

The most frequent speech and language disorders that ENT doctors are confronted with are generally known and presented as: delayed speech and language development, dyslalia, dysglossia, rhinolalia, dysarthria, and verbal fluency disorders (stuttering, cluttering). The diagnostic component is always greater than, and quite different from, the therapeutic one. Close cooperation with representatives of phoniatrics and pedaudiology, as well as logopedics and other specialties such as neurology and internal medicine, is highly necessary. PMID:9264604

Seidner, W

1997-04-01

127

Speech Perception as a Multimodal Phenomenon  

PubMed Central

Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal speech information could explain the reported automaticity, immediacy, and completeness of audiovisual speech integration. However, recent findings suggest that speech integration can be influenced by higher cognitive properties such as lexical status and semantic context. Proponents of amodal accounts will need to explain these results.

Rosenblum, Lawrence D.

2009-01-01

128

Impaired motor speech performance in Huntington's disease.  

PubMed

Dysarthria is a common symptom of Huntington's disease and has been reported, besides other features, to be characterized by alterations of speech rate and regularity. However, data on the specific pattern of motor speech impairment and their relationship to other motor and neuropsychological symptoms are sparse. Therefore, the aim of the present study was to describe and objectively analyse different speech parameters with special emphasis on the aspect of speech timing of connected speech and non-speech verbal utterances. 21 patients with manifest Huntington's disease and 21 age- and gender-matched healthy controls had to perform a reading task and several syllable repetition tasks. Computerized acoustic analysis of different variables for the measurement of speech rate and regularity generated a typical pattern of impaired motor speech performance with a reduction of speech rate, an increase of pauses and a marked disability to steadily repeat single syllables. Abnormalities of speech parameters were more pronounced in the subgroup of patients with Huntington's disease receiving antidopaminergic medication, but were also present in the drug-naïve patients. Speech rate related to connected speech and parameters of syllable repetition showed correlations to overall motor impairment, capacity of tapping in a quantitative motor assessment and some score of cognitive function. After these preliminary data, further investigations on patients in different stages of disease are warranted to survey if the analysis of speech and non-speech verbal utterances might be a helpful additional tool for the monitoring of functional disability in Huntington's disease. PMID:24221215

Skodda, Sabine; Schlegel, Uwe; Hoffmann, Rainer; Saft, Carsten

2014-04-01

129

Infant perception of atypical speech signals.  

PubMed

The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how visual context influences infant speech perception. Nine-month-olds heard speech and nonspeech sounds produced by either a human or a parrot, concurrently with 1 of 2 visual displays: a static checkerboard or a static image of a human face. Using an infant-controlled looking task, we examined infants' preferences for speech and nonspeech sounds. Infants listened equally to parrot speech and nonspeech when paired with a checkerboard. However, in the presence of faces, infants listened longer to parrot speech than to nonspeech sounds, such that their preference for parrot speech was similar to their preference for human speech sounds. These data are consistent with the possibility that infants treat parrot speech similarly to human speech relative to nonspeech vocalizations but only in some visual contexts. Like adults, infants may perceive a range of signals as speech. PMID:22709131

Vouloumanos, Athena; Gelfand, Hanna M

2013-05-01

130

Aerodynamic assessment of prosthetic speech aids.  

PubMed

The primary function of a speech aid prosthesis is to provide adequate palatopharyngeal function by preventing nasal emission and hypernasality during oral speech production and permitting sufficient nasal air escape during nasal consonant production. The adequacy of speech aids is often judged subjectively by speech-language pathologists and prosthodontists. However, when oral and laryngeal function are also affected, additional information may be needed for accurate assessment of palatopharyngeal function and optimal prosthetic management. In these instances, aerodynamic measurements can provide information about palatopharyngeal function and guide fabrication and modification of speech aid prostheses to provide adequate palatopharyngeal function for speech. PMID:3863945

Reisberg, D J; Smith, B E

1985-11-01

131

Speech prosody in cerebellar ataxia  

NASA Astrophysics Data System (ADS)

The present study sought an acoustic signature for the speech disturbance recognized in cerebellar degeneration. Magnetic resonance imaging was used for a radiological rating of cerebellar involvement in six cerebellar ataxic dysarthric speakers. Acoustic measures of the [pap] syllables in contrastive prosodic conditions and of normal vs. brain-damaged patients were used to further our understanding both of the speech degeneration that accompanies cerebellar pathology and of speech motor control and movement in general. Pair-wise comparisons of the prosodic conditions within the normal group showed statistically significant differences for four prosodic contrasts. For three of the four contrasts analyzed, the normal speakers showed both longer durations and higher formant and fundamental frequency values in the more prominent first condition of the contrast. The acoustic measures of the normal prosodic contrast values were then used as a model to measure the degree of speech deterioration for individual cerebellar subjects. This estimate of speech deterioration as determined by individual differences between cerebellar and normal subjects' acoustic values of the four prosodic contrasts was used in correlation analyses with MRI ratings. Moderate correlations between speech deterioration and cerebellar atrophy were found in the measures of syllable duration and f0. A strong negative correlation was found for F1. Moreover, the normal model presented by these acoustic data allows for a description of the flexibility of task-oriented behavior in normal speech motor control. These data challenge spatio-temporal theory which explains movement as an artifact of time wherein longer durations predict more extreme movements and give further evidence for gestural internal dynamics of movement in which time emerges from articulatory events rather than dictating those events. This model provides a sensitive index of cerebellar pathology with quantitative acoustic analyses.

Casper, Maureen

132

System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech  

DOEpatents

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

Burnett, Greg C. (Livermore, CA); Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

2006-08-08

133

Current methods of digital speech processing  

NASA Astrophysics Data System (ADS)

The field of digital speech processing includes the areas of speech coding, speech synthesis, and speech recognition. With the advent of faster computation and high speed VLSI circuits, speech processing algorithms are becoming more sophisticated, more robust, and more reliable. As a result, significant advances have been made in coding, synthesis, and recognition, but, in each area, there still remain great challenges in harnessing speech technology to human needs. In the area of speech coding, current algorithms perform well at bit rates down to 16 kbits/sec. Current research is directed at further reducing the coding rate for high-quality speech into the data speed range, even as low as 2.4 kbits/sec. In text-to-speech synthesis, the speech produced is highly intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility in the synthetic speech produced by these systems. Finally, in the area of speech and speaker recognition, present systems provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, and favorable signal-to-noise ratios. Current research is directed at solving the problem of continuous speech recognition for large vocabularies, and at verifying talkers' identities from a limited amount of spoken text.

Rabiner, Lawrence R.; Atal, B. S.; Flanagan, J. L.

1990-05-01

134

Contextual variability during speech-in-speech recognition.  

PubMed

This study examined the influence of background language variation on speech recognition. English listeners performed an English sentence recognition task in either "pure" background conditions in which all trials had either English or Dutch background babble or in mixed background conditions in which the background language varied across trials (i.e., a mix of English and Dutch or one of these background languages mixed with quiet trials). This design allowed the authors to compare performance on identical trials across pure and mixed conditions. The data reveal that speech-in-speech recognition is sensitive to contextual variation in terms of the target-background language (mis)match depending on the relative ease/difficulty of the test trials in relation to the surrounding trials. PMID:24993234

Brouwer, Susanne; Bradlow, Ann R

2014-07-01

135

Speech enhancement using a soft-decision noise suppression filter  

Microsoft Academic Search

One way of enhancing speech in an additive acoustic noise environment is to perform a spectral decomposition of a frame of noisy speech and to attenuate a particular spectral line depending on how much the measured speech plus noise power exceeds an estimate of the background noise. Using a two-state model for the speech event (speech absent or speech present)

R. McAulay; M. Malpass

1980-01-01
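The two-state idea above, attenuate each spectral line according to how much the measured power exceeds the noise estimate, weighted by a speech-presence term, can be sketched in a few lines. This is a simplified stand-in, not the McAulay-Malpass estimator itself; the frame length, the crude speech-presence weight, and the Wiener-style gain rule are all assumptions, and `noise_psd` must have `frame // 2 + 1` bins:

```python
import numpy as np

def soft_decision_suppress(noisy, noise_psd, frame=256):
    """Frame-wise spectral suppression: per-bin gain = (speech-presence
    weight) x (Wiener-like gain), so bins near the noise floor are zeroed
    while strong speech bins pass almost unchanged."""
    out = np.zeros_like(noisy)
    for start in range(0, len(noisy) - frame + 1, frame):
        spec = np.fft.rfft(noisy[start:start + frame])
        # a posteriori SNR: measured power over the noise estimate, minus 1
        snr_post = np.maximum(np.abs(spec) ** 2 / noise_psd - 1.0, 0.0)
        p_speech = snr_post / (1.0 + snr_post)         # crude presence weight
        gain = p_speech * snr_post / (1.0 + snr_post)  # soft decision x Wiener
        out[start:start + frame] = np.fft.irfft(gain * spec, n=frame)
    return out
```

A strong tone sits far above the noise floor, so its bin gets gain close to 1 and survives, while empty bins are driven to zero; the soft-decision weight is what keeps weak, uncertain bins from being amplified as speech.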

136

The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech  

ERIC Educational Resources Information Center

Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

Wayne, Rachel V.; Johnsrude, Ingrid S.

2012-01-01

137

Bilingual aligned corpora for speech to speech translation for Spanish, English and Catalan  

Microsoft Academic Search

In the framework of the EU-funded Project LC-STAR, a set of Language Resources (LR) for all the Speech to Speech Translation components (Speech recognition, Machine Translation and Speech Synthesis) was developed. This paper deals with the development of bilingual corpora in Spanish, US English and Catalan. The corpora were obtained from spontaneous dialogues in one of these three languages which

David Conejero; Alan Lounds; Carmen García-Mateo; Leandro Rodríguez Liñares; Raquel Mochales; Asunción Moreno

2005-01-01

138

Effects of speech therapy and pharmacologic and surgical treatments on voice and speech in parkinson's disease  

Microsoft Academic Search

The purpose of this review was to examine the different treatment approaches for persons with Parkinson's Disease (PD) and to examine the effects of these treatments on speech. Treatment methods reviewed include speech therapy, pharmacological, and surgical. Research from the 1950s through the 1970s did not demonstrate significant improvements following speech therapy. Recent research has shown that speech therapy (when…

GERALYN M. SCHULZ; MEGAN K. GRANT

2000-01-01

139

Speech analysis and synthesis based on pitch-synchronous segmentation of the speech waveform  

Microsoft Academic Search

This report describes a new speech analysis/synthesis method. This new technique does not attempt to model the human speech production mechanism. Instead, we represent the speech waveform directly in terms of the speech waveform defined in a pitch period. A significant merit of this approach is the complete elimination of pitch interference because each pitch-synchronously segmented waveform does not include

George S. Kang; Lawrence J. Fransen

1994-01-01

140

Reading Speech from Still and Moving Faces: The Neural Substrates of Visible Speech  

Microsoft Academic Search

Speech is perceived both by ear and by eye. Unlike heard speech, some seen speech gestures can be captured in stilled image sequences. Previous studies have shown that in hearing people, natural time-varying silent seen speech can access the auditory cortex (left superior temporal regions). Using functional magnetic resonance imaging (fMRI), the present study explored the extent to which this

Gemma A. Calvert; Ruth Campbell

2003-01-01

141

Speech Act Theory and Business Communication Conventions.  

ERIC Educational Resources Information Center

Applies speech act theory to business writing to determine why certain letters and memos succeed while others fail. Specifically, shows how speech act theorist H. P. Grice's rules or maxims illuminate the writing process in business communication. (PD)

Ewald, Helen Rothschild; Stine, Donna

1983-01-01

142

President Kennedy's Speech at Rice University  

NASA Technical Reports Server (NTRS)

This video tape presents unedited film footage of President John F. Kennedy's speech at Rice University, Houston, Texas, September 12, 1962. The speech expresses the commitment of the United States to landing an astronaut on the Moon.

1988-01-01

143

Task Force on Speech Pathology and Audiology.  

National Technical Information Service (NTIS)

This report presents the results and conclusions of a 1972 study performed by the Task Force on Speech Pathology and Audiology. Thirteen educational institutions offering degrees in speech pathology and audiology in Louisiana were surveyed, and completed ...

J. L. Peterson

1973-01-01

144

Speech and Language Problems in Children  

MedlinePLUS

Children vary in their development of speech and language skills. Health professionals have milestones for what's normal. ... it may be due to a speech or language disorder. Language disorders can mean that the child ...

145

Noise Suppression Methods for Robust Speech Processing.  

National Technical Information Service (NTIS)

Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments to develop real time, compressed speech analysis-synthesis algo...

S. F. Boll; D. Pulsipher; W. Done; B. Cox; C. K. Rushforth

1979-01-01

146

Prosodic Information for Speech Understanding Systems.  

National Technical Information Service (NTIS)

The goal of this project was to use prosodic information to aid speech recognition systems. Naturalistic speech data was collected and used to test and develop hypotheses about the relationship of prosodic information to the syntactic and semantic structu...

M. H. O'Malley

1977-01-01

147

Multiple Approaches to Robust Speech Recognition.  

National Technical Information Service (NTIS)

This paper compares several different approaches to robust speech recognition. We review CMU's ongoing research in the use of acoustical pre- processing to achieve robust speech recognition, and we present the results of the first evaluation of pre- proce...

A. Acero; F. Liu; R. M. Stern; T. M. Sullivan; Y. Ohshima

1992-01-01

148

Integration of Speech and Natural Language.  

National Technical Information Service (NTIS)

This report presents our work on integrating speech and natural language processing for speech understanding. It describes the components of the system: the unification grammar and corresponding parser, the higher order intensional logic and the type of s...

D. Ayuso; Y. Chow; A. Haas; R. Ingria; S. Roucos

1988-01-01

149

Units of Speech Perception: Phoneme and Syllable  

ERIC Educational Resources Information Center

Two detection experiments were conducted with short lists of synthetic speech stimuli where phoneme targets were compared to syllable targets. Results suggest that phonemes and syllables are equally basic to speech perception. (Author/RM)

Healy, Alice F.; Cutting, James E.

1976-01-01

150

Parametric trajectory models for speech recognition  

Microsoft Academic Search

The basic motivation for employing trajectory models for speech recognition is that sequences of speech features are statistically dependent and that the effective and efficient modeling of the speech process will incorporate this dependency. In our previous work [1] we presented an approach to modeling the speech process with trajectories. In this paper we continue our development of parametric trajectory models for speech recognition. We extend our models to include

Herbert Gish; Kenney Ng

1996-01-01

151

The motor theory of speech perception reviewed  

Microsoft Academic Search

More than 50 years after the appearance of the motor theory of speech perception, it is timely to evaluate its three main claims that (1) speech processing is special, (2) perceiving speech is perceiving gestures, and (3) the motor system is recruited for perceiving speech. We argue that to the extent that it can be evaluated, the first claim is

Bruno Galantucci; Carol A. Fowler; M. T. Turvey

2006-01-01

152

Nonlinear Statistical Modeling of Speech  

NASA Astrophysics Data System (ADS)

Contemporary approaches to speech and speaker recognition decompose the problem into four components: feature extraction, acoustic modeling, language modeling and search. Statistical signal processing is an integral part of each of these components, and Bayes Rule is used to merge these components into a single optimal choice. Acoustic models typically use hidden Markov models based on Gaussian mixture models for state output probabilities. This popular approach suffers from an inherent assumption of linearity in speech signal dynamics. Language models often employ a variety of maximum entropy techniques, but can employ many of the same statistical techniques used for acoustic models. In this paper, we focus on introducing nonlinear statistical models to the feature extraction and acoustic modeling problems as a first step towards speech and speaker recognition systems based on notions of chaos and strange attractors. Our goal in this work is to improve the generalization and robustness properties of a speech recognition system. Three nonlinear invariants are proposed for feature extraction: Lyapunov exponents, correlation fractal dimension, and correlation entropy. We demonstrate an 11% relative improvement on speech recorded under noise-free conditions, but show a comparable degradation occurs for mismatched training conditions on noisy speech. We conjecture that the degradation is due to difficulties in estimating invariants reliably from noisy data. To circumvent these problems, we introduce two dynamic models to the acoustic modeling problem: (1) a linear dynamic model (LDM) that uses a state space-like formulation to explicitly model the evolution of hidden states using an autoregressive process, and (2) a data-dependent mixture of autoregressive (MixAR) models. Results show that LDM and MixAR models can achieve comparable performance with HMM systems while using significantly fewer parameters. 
Currently we are developing Bayesian parameter estimation and discriminative training algorithms for these new models to improve noise robustness.
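
One of the proposed invariants, the correlation fractal dimension, is conventionally estimated from the slope of log C(r) against log r; the following is a minimal Grassberger-Procaccia-style sketch of the correlation sum C(r) (the embedding parameters and function name are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def correlation_sum(series, dim, delay, radius):
    """Correlation sum C(r): fraction of distinct pairs of
    delay-embedded points lying within `radius` of each other."""
    n = len(series) - (dim - 1) * delay
    # Delay embedding: row t is (x[t], x[t+delay], ..., x[t+(dim-1)*delay]).
    pts = np.stack([series[i * delay : i * delay + n] for i in range(dim)], axis=1)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    mask = ~np.eye(n, dtype=bool)  # exclude self-pairs
    return float(np.mean(d[mask] < radius))
```

The dimension estimate is then the slope of log C(r) versus log r over a suitable range of radii.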

Srinivasan, S.; Ma, T.; May, D.; Lazarou, G.; Picone, J.

2009-12-01

153

Characteristics of Speech Motor Development in Children.  

ERIC Educational Resources Information Center

Pulsed ultrasound was used to study tongue movements in the speech of children from 3 to 11 years of age. Speech data attained were characteristic of systems that can be described by second-order differential equations. Relationships observed in these systems may indicate that speech control involves tonic and phasic muscle inputs. (Author/RH)

Ostry, David J.; And Others

1984-01-01

154

Interventions for Speech Sound Disorders in Children  

ERIC Educational Resources Information Center

With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

2010-01-01

155

Syllable Structure in Dysfunctional Portuguese Children's Speech  

ERIC Educational Resources Information Center

The goal of this work is to investigate whether children with speech dysfunctions (SD) show a deficit in planning some Portuguese syllable structures (PSS) in continuous speech production. Knowledge of which aspects of speech production are affected by SD is necessary for efficient improvement in the therapy techniques. The case-study is focused…

Candeias, Sara; Perdigao, Fernando

2010-01-01

156

Data Acquisition and Modelling in Speech Communication  

Microsoft Academic Search

A framework of speech communication is presented in which natural spontaneous data and speech modelling are interrelated in a cyclic progression towards the goal of elucidating linguistic behaviour. Paralinguistic phenomena are treated as an integral part of a theory of speech communication. Data acquisition design has to adapt to the naturalness requirement. Index Terms: emphasis, communicative function, prosody

Klaus J. Kohler

157

Speech disorders of Parkinsonism: a review  

Microsoft Academic Search

Study of the speech disorders of Parkinsonism provides a paradigm of the integration of phonation, articulation and language in the production of speech. The initial defect in the untreated patient is a failure to control respiration for the purpose of speech and there follows a forward progression of articulatory symptoms involving larynx, pharynx, tongue and finally lips. There is evidence

E M Critchley

1981-01-01

158

Microsoft Windows highly intelligent speech recognizer: Whisper  

Microsoft Academic Search

Since January 1993, the authors have been working to refine and extend Sphinx-II technologies in order to develop practical speech recognition at Microsoft. The result of that work has been the Whisper (Windows Highly Intelligent Speech Recognizer). Whisper represents significantly improved recognition efficiency, usability, and accuracy, when compared with the Sphinx-II system. In addition Whisper offers speech input capabilities for

Xuedong Huang; Alex Acero; Fil Alleva; Mei-Yuh Hwang; Li Jiang; Milind Mahajan

1995-01-01

159

Speech Correction in the Schools. Third Edition.  

ERIC Educational Resources Information Center

The volume, intended to introduce readers to the problems and therapeutic needs of speech impaired school children, first presents general considerations and background knowledge necessary for basic insights of the classroom teacher and the school speech clinician in relation to the speech handicapped child. Discussed are the classification and…

Eisenson, Jon; Ogilvie, Mardel

160

All-pole modeling of degraded speech  

Microsoft Academic Search

This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise. The procedure, based on maximum a posteriori (MAP) estimation techniques is first developed in the absence of noise and related to linear prediction analysis of speech. The modification in the presence of background noise is shown to be
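
In the noise-free case the procedure reduces to standard linear prediction analysis; the following is a minimal autocorrelation-method sketch using the Levinson-Durbin recursion (the function name and frame handling are illustrative, not the authors' MAP formulation):

```python
import numpy as np

def lpc(signal, order):
    """Autocorrelation-method linear prediction.
    Returns predictor coefficients a, with
    x[n] ~ a[0]*x[n-1] + ... + a[order-1]*x[n-order],
    and the final prediction-error power."""
    n = len(signal)
    r = np.array([signal[:n - k] @ signal[k:] for k in range(order + 1)])
    a = np.zeros(order)
    err = r[0]
    for i in range(order):
        # Reflection coefficient for model order i+1.
        k = (r[i + 1] - a[:i] @ r[1:i + 1][::-1]) / err
        a_prev = a.copy()
        a[i] = k
        a[:i] = a_prev[:i] - k * a_prev[:i][::-1]
        err *= 1.0 - k ** 2
    return a, err
```

The all-pole spectrum follows from the estimated coefficients; the paper's contribution is modifying this estimation when additive background noise is present.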

Jae Lim; A. Oppenheim

1978-01-01

161

Speech recognition from adaptive windowing PSD estimation  

Microsoft Academic Search

Speech-recognition technology is embedded in voice-activated routing systems at customer call centers, voice dialing on mobile phones, and many other everyday applications. Consequently, designing a robust speech-recognition system that adapts to acoustic conditions, such as the speaker's speech rate and accent, is of utmost interest. In this paper we present a machine learning approach for speech recognition using the k

Maryam Ravan; Soosan Beheshti

2011-01-01

162

Speech-Song Interface of Chinese Speakers  

ERIC Educational Resources Information Center

Pitch is a psychoacoustic construct crucial in the production and perception of speech and songs. This article is an exploration of the interface of speech and song performance of Chinese speakers. Although parallels might be drawn from the prosodic and sound structures of the linguistic and musical systems, perceiving and producing speech and…

Mang, Esther

2007-01-01

163

JANUS 93: towards spontaneous speech translation  

Microsoft Academic Search

We present first results from our efforts toward translation of spontaneously spoken speech. Improvements include increasing coverage, robustness, generality and speed of JANUS, the speech-to-speech translation system of Carnegie Mellon and Karlsruhe University. The recognition and machine translation engine have been upgraded to deal with requirements introduced by spontaneous human to human dialogs. To allow for development and evaluation of

M. Woszczyna; N. Aoki-Waibel; F. D. Buo; N. Coccaro; K. Horiguchi; T. Kemp; A. Lavie; A. McNair; T. Polzin; I. Rogina; C. P. Rose; T. Schultz; B. Suhm; M. Tomita; A. Waibel

1994-01-01

164

Speech and Hearing Science, Anatomy and Physiology.  

ERIC Educational Resources Information Center

Written for those interested in speech pathology and audiology, the text presents the anatomical, physiological, and neurological bases for speech and hearing. Anatomical nomenclature used in the speech and hearing sciences is introduced and the breathing mechanism is defined and discussed in terms of the respiratory passage, the framework and…

Zemlin, Willard R.

165

Vygotskian Inner Speech and the Reading Process  

ERIC Educational Resources Information Center

There is a paucity of Vygotskian influenced inner speech research in relation to the reading process. Those few studies which have examined Vygotskian inner speech from a reading perspective tend to support the notion that inner speech is an important covert function that is crucial to the reading process and to reading acquisition in general.…

Ehrich, J. F.

2006-01-01

166

Free Speech in the College Community.  

ERIC Educational Resources Information Center

This book discusses freedom of speech issues affecting the college community, in light of "speech codes" imposed by some institutions, new electronic technology such as the Internet, and recent court decisions. Chapter 1 addresses campus speech codes, the advantages and disadvantages of such codes, and their conflict with the First Amendment of…

O'Neil, Robert M.

167

The Modulation Transfer Function for Speech Intelligibility  

Microsoft Academic Search

We systematically determined which spectrotemporal modulations in speech are necessary for comprehension by human listeners. Speech comprehension has been shown to be robust to spectral and temporal degradations, but the specific relevance of particular degradations is arguable due to the complexity of the joint spectral and temporal information in the speech signal. We applied a novel modulation filtering technique to

Taffeta M. Elliott; Frédéric E. Theunissen

2009-01-01

168

Audiovisual Speech Integration and Lipreading in Autism  

ERIC Educational Resources Information Center

Background: During speech perception, the ability to integrate auditory and visual information causes speech to sound louder and be more intelligible, and leads to quicker processing. This integration is important in early language development, and also continues to affect speech comprehension throughout the lifespan. Previous research shows that…

Smith, Elizabeth G.; Bennetto, Loisa

2007-01-01

169

Speech Perception Within an Auditory Cognitive Science  

Microsoft Academic Search

The complexities of the acoustic speech signal pose many significant challenges for listeners. Although perceiving speech begins with auditory processing, investigation of speech perception has progressed mostly independently of study of the auditory system. Nevertheless, a growing body of evidence demonstrates that cross-fertilization between the two areas of research can be productive. We briefly describe research

Lori L. Holt; Andrew J. Lotto

170

Inner speech as a forward model?  

PubMed

Pickering & Garrod (P&G) consider the possibility that inner speech might be a product of forward production models. Here I consider the idea of inner speech as a forward model in light of empirical work from the past few decades, concluding that, while forward models could contribute to it, inner speech nonetheless requires activity from the implementers. PMID:23789938

Oppenheim, Gary M

2013-08-01

171

Current considerations in pediatric speech audiometry  

Microsoft Academic Search

Current considerations in pediatric speech perception assessment are highlighted in this article with a focus on specific test principles and variables that must be addressed when evaluating speech perception performance in children. Existing test materials are reviewed with an emphasis on the level of sensitivity and standardization that they have for accurate assessment of a child's speech perception

Lisa Lucks Mendel

2008-01-01

172

Comparative experiments on large vocabulary speech recognition  

Microsoft Academic Search

This paper describes several key experiments in large vocabulary speech recognition. We demonstrate that, counter to our intuitions, given a fixed amount of training speech, the number of training speakers has little effect on the accuracy. We show how much speech is needed for speaker-independent (SI) recognition in order to achieve the same performance as speaker-dependent (SD) recognition. We demonstrate

Richard Schwartz; Tasos Anastasakos; Francis Kubala; John Makhoul; Long Nguyen; George Zavaliagkos

1993-01-01

173

Speech enhancement using a Bayesian evidence approach  

Microsoft Academic Search

We consider the enhancement of speech corrupted by additive white Gaussian noise. In a Bayesian inference framework, maximum a posteriori (MAP) estimation of the signal is performed, along the lines developed by Lim & Oppenheim (1978). The speech enhancement problem is treated as a signal estimation problem, whose aim is to obtain a MAP estimate of the clean speech signal,

Gaafar M. K. Saleh; Mahesan Niranjan

2001-01-01

174

DEVELOPMENT AND DISORDERS OF SPEECH IN CHILDHOOD.  

ERIC Educational Resources Information Center

The growth, development, and abnormalities of speech in childhood are described in this text designed for pediatricians, psychologists, educators, medical students, therapists, pathologists, and parents. The normal development of speech and language is discussed, including theories on the origin of speech in man and factors influencing the normal…

Karlin, Isaac W.; And Others

175

Speech Synthesis Applied to Language Teaching.  

ERIC Educational Resources Information Center

The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…

Sherwood, Bruce

1981-01-01

176

Auditory models for speech analysis  

NASA Astrophysics Data System (ADS)

This paper reviews the psychophysical basis for auditory models and discusses their application to automatic speech recognition. First an overview of the human auditory system is presented, followed by a review of current knowledge gleaned from neurological and psychoacoustic experimentation. Next, a general framework describes established peripheral auditory models which are based on well-understood properties of the peripheral auditory system. This is followed by a discussion of current enhancements to those models to include nonlinearities and synchrony information as well as other higher auditory functions. Finally, the initial performance of auditory models in the task of speech recognition is examined and additional applications are mentioned.

Maybury, Mark T.

177

Auditory Masking: Enhancement of Non-Air Conduct Speech

Microsoft Academic Search

Besides air conduct speech, the non-air conduct speech may provide some exciting possibility of wide applications. This paper explores a new non-air conduct speech detecting method using millimeter wave radar. However, the combined noise which is introduced in the detecting system corrupted the non-air conduct speech greatly. Therefore, this study proposed an efficient way to enhance the corrupted speech by

Sheng Li; Jian Qi Wang; Ming Niu; Xi Jing Jing

2008-01-01

178

Phonetic and lexical interferences in informational masking during speech-in-speech comprehension  

Microsoft Academic Search

This study investigates masking effects occurring during speech comprehension in the presence of concurrent speech signals. We examined the differential effects of acoustic-phonetic and lexical content of 4- to 8-talker babble (natural speech) or babble-like noise (reversed speech) on word identification. Behavioral results show a monotonic decrease in speech comprehension rates with an increasing number of simultaneous talkers in

Michel Hoen; Fanny Meunier; Claire-léonie Grataloup; François Pellegrino; Nicolas Grimault; Fabien Perrin; Xavier Perrot; Lionel Collet

2007-01-01

179

Speech perception and short-term memory deficits in persistent developmental speech disorder  

Microsoft Academic Search

Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech perception and short-term memory. Nine adults with a persistent familial

Mary Kay Kenney; Dragana Barac-Cikoja; Kimberly Finnegan; Neal Jeffries; Christy L. Ludlow

2006-01-01

180

Speech-to-speech translation based on finite-state transducers  

Microsoft Academic Search

Nowadays, the most successful speech recognition systems are based on stochastic finite-state networks (hidden Markov models and n-grams). Speech translation can be accomplished in a similar way as speech recognition. Stochastic finite-state transducers, which are specific stochastic finite-state networks, have proved very adequate for translation modeling. In this work a speech-to-speech translation system, the EuTRANS system, is presented. The acoustic,
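
A stochastic finite-state transducer of the kind described can be searched with simple dynamic programming over its states; the following toy sketch illustrates the idea (the states, Spanish-English vocabulary, and probabilities are invented for illustration and are not taken from the EuTRANS system):

```python
import math

# transitions[state] = list of (input_word, output_word, next_state, prob)
transitions = {
    0: [("una", "a", 1, 0.9), ("una", "one", 1, 0.1)],
    1: [("habitacion", "room", 2, 1.0)],
}
final_states = {2}

def translate(words, start=0):
    """Return the highest-probability output sequence for `words`,
    or None if no path through the transducer accepts the input."""
    beam = {start: (0.0, [])}  # state -> (log-prob, best output so far)
    for w in words:
        nxt = {}
        for state, (lp, out) in beam.items():
            for win, wout, s2, p in transitions.get(state, []):
                if win != w:
                    continue
                cand = (lp + math.log(p), out + [wout])
                if s2 not in nxt or cand[0] > nxt[s2][0]:
                    nxt[s2] = cand  # keep only the best path per state
        beam = nxt
    best = max((v for s, v in beam.items() if s in final_states), default=None)
    return best[1] if best else None
```

In the full system, acoustic scores from recognition would be combined with the transducer's translation probabilities in the same search.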

F. Casacuberta; D. Llorens; S. Molau; F. Nevado; H. Ney; M. Pastor; A. Sanchis; E. Vidal; J. M. Vilar

2001-01-01

181

Normalizing the Speech Modulation Spectrum for Robust Speech Recognition  

Microsoft Academic Search

This paper presents a novel feature normalization technique for robust speech recognition. The proposed technique normalizes the temporal structure of the feature to reduce the feature variation due to environmental interferences. Specifically, it normalizes the utterance-dependent feature modulation spectrum to a reference function by filtering the feature using a square-root Wiener filter in the temporal domain. We show experimentally that
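
The paper applies a square-root Wiener filter in the temporal domain; a simplified, zero-phase sketch of the same normalization applied directly in the modulation-frequency domain is shown below for a single feature trajectory (the function name and the precomputed `ref_psd` reference are assumptions for illustration):

```python
import numpy as np

def normalize_modulation_spectrum(feat, ref_psd):
    """Rescale one feature trajectory so its modulation power
    spectrum matches a reference function `ref_psd`."""
    spec = np.fft.rfft(feat)
    psd = np.abs(spec) ** 2
    # Per-modulation-frequency gain that maps psd onto ref_psd.
    gain = np.sqrt(ref_psd / np.maximum(psd, 1e-12))
    return np.fft.irfft(gain * spec, n=len(feat))
```

In practice this would be applied per cepstral dimension, with the reference estimated from clean training data.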

Xiong Xiao; Eng Siong Chng; Haizhou Li

2007-01-01

182

Embedding speech in web interfaces  

Microsoft Academic Search

In this paper, we will describe work in progress at the MITRE Corporation on embedding speech-enabled interfaces in Web browsers. This research is part of our work to establish the infrastructure to create Web-hosted versions of prototype multimodal interfaces, both intelligent and otherwise. Like many others, we believe that the Web is the best potential delivery and distribution vehicle

Samuel Bayer

1996-01-01

183

Speech errors across the lifespan  

Microsoft Academic Search

Dell, Burger, and Svec (1997) proposed that the proportion of speech errors classified as anticipations (e.g., “moot and mouth”) can be predicted solely from the overall error rate, such that the greater the error rate, the lower the anticipatory proportion (AP) of errors. We report a study examining whether this effect applies to changes in error rates that occur developmentally

Janet I. Vousden; Elizabeth A. Maylor

2006-01-01

184

Speech Research. Interim Scientific Report.  

ERIC Educational Resources Information Center

The status and progress of several studies dealing with the nature of speech, instrumentation for its investigation, and instrumentation for practical applications is reported on. The period of January 1 through June 30, 1969 is covered. Extended reports and manuscripts cover the following topics: programing for the Glace-Holmes synthesizer,…

Cooper, Franklin S.

185

Careers in Speech Communication: 1990.  

ERIC Educational Resources Information Center

Discusses the increased popularity of communication as an academic major and career choice. Suggests that the change reflects the shift in the United States workplace to an information orientation. Reports results of a study of the jobs held by recipients of communication degrees. Concludes that speech communication as a major offers flexibility…

Wolvin, Andrew D.

1991-01-01

186

Speech Motor Skill and Stuttering  

Microsoft Academic Search

The authors review converging lines of evidence from behavioral, kinematic, and neuroimaging data that point to limitations in speech motor skills in people who stutter (PWS). From their review, they conclude that PWS differ from those who do not in terms of their ability to improve with practice and retain practiced changes in the long term, and that they are

Aravind Kumar Namasivayam; Pascal van Lieshout

2011-01-01

187

Neuronal basis of speech comprehension.  

PubMed

Verbal communication does not rely only on the simple perception of auditory signals. It is rather a parallel and integrative processing of linguistic and non-linguistic information, involving temporal and frontal areas in particular. This review describes the inherent complexity of auditory speech comprehension from a functional-neuroanatomical perspective. The review is divided into two parts. In the first part, structural and functional asymmetry of language-relevant structures will be discussed. The second part of the review will discuss recent neuroimaging studies, which coherently demonstrate that speech comprehension processes rely on a hierarchical network involving the temporal, parietal, and frontal lobes. Further, the results support the dual-stream model for speech comprehension, with a dorsal stream for auditory-motor integration, and a ventral stream for extracting meaning but also the processing of sentences and narratives. Specific patterns of functional asymmetry between the left and right hemisphere can also be demonstrated. The review article concludes with a discussion on interactions between the dorsal and ventral streams, particularly the involvement of motor-related areas in speech perception processes, and outlines some remaining unresolved issues. This article is part of a Special Issue entitled Human Auditory Neuroimaging. PMID:24113115

Specht, Karsten

2014-01-01

188

Speech Errors across the Lifespan  

ERIC Educational Resources Information Center

Dell, Burger, and Svec (1997) proposed that the proportion of speech errors classified as anticipations (e.g., "moot and mouth") can be predicted solely from the overall error rate, such that the greater the error rate, the lower the anticipatory proportion (AP) of errors. We report a study examining whether this effect applies to changes in error…

Vousden, Janet I.; Maylor, Elizabeth A.

2006-01-01

189

Inner Speech Impairments in Autism  

ERIC Educational Resources Information Center

Background: Three experiments investigated the role of inner speech deficit in cognitive performances of children with autism. Methods: Experiment 1 compared children with autism with ability-matched controls on a verbal recall task presenting pictures and words. Experiment 2 used pictures for which the typical names were either single syllable or…

Whitehouse, Andrew J. O.; Maybery, Murray T.; Durkin, Kevin

2006-01-01

190

Free Speech Advocates at Berkeley.  

ERIC Educational Resources Information Center

This study compares highly committed members of the Free Speech Movement (FSM) at Berkeley with the student population at large on 3 sociopsychological foci: general biographical data, religious orientation, and rigidity-flexibility. Questionnaires were administered to 172 FSM members selected by chance from the 10 to 1200 who entered and "sat-in"…

Watts, William A.; Whittaker, David

1966-01-01

191

Linguistic aspects of speech synthesis.  

PubMed

The conversion of text to speech is seen as an analysis of the input text to obtain a common underlying linguistic description, followed by a synthesis of the output speech waveform from this fundamental specification. Hence, the comprehensive linguistic structure serving as the substrate for an utterance must be discovered by analysis from the text. The pronunciation of individual words in unrestricted text is determined by morphological analysis or letter-to-sound conversion, followed by specification of the word-level stress contour. In addition, many text character strings, such as titles, numbers, and acronyms, are abbreviations for normal words, which must be derived. To further refine these pronunciations and to discover the prosodic structure of the utterance, word part of speech must be computed, followed by a phrase-level parsing. From this structure the prosodic structure of the utterance can be determined, which is needed in order to specify the durational framework and fundamental frequency contour of the utterance. In discourse contexts, several factors such as the specification of new and old information, contrast, and pronominal reference can be used to further modify the prosodic specification. When the prosodic correlates have been computed and the segmental sequence is assembled, a complete input suitable for speech synthesis has been determined. Lastly, multilingual systems utilizing rule frameworks are mentioned, and future directions are characterized. PMID:7479807

Allen, J

1995-10-24

192

Speech and Language Developmental Milestones  

MedlinePLUS

... What are the milestones for speech and language development? The first signs of communication occur when an infant learns that a cry will bring food, comfort, and companionship. Newborns also begin to recognize important sounds in their environment, such as the voice of their mother or ...

193

Acoustic Analysis of PD Speech  

PubMed Central

According to the U.S. National Institutes of Health, approximately 500,000 Americans have Parkinson's disease (PD), with roughly another 50,000 receiving new diagnoses each year. 70%–90% of these people also have the hypokinetic dysarthria associated with PD. Deep brain stimulation (DBS) substantially relieves motor symptoms in advanced-stage patients for whom medication produces disabling dyskinesias. This study investigated speech changes as a result of DBS settings chosen to maximize motor performance. The speech of 10 PD patients and 12 normal controls was analyzed for syllable rate and variability, syllable length patterning, vowel fraction, voice-onset time variability, and spirantization. These were normalized by the controls' standard deviation to represent distance from normal and combined into a composite measure. Results show that DBS settings relieving motor symptoms can improve speech, making it up to three standard deviations closer to normal. However, the clinically motivated settings evaluated here show greater capacity to impair, rather than improve, speech. A feedback device developed from these findings could be useful to clinicians adjusting DBS parameters, as a means for ensuring they do not unwittingly choose DBS settings which impair patients' communication.
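
The normalization described, dividing each speech measure's deviation from the control mean by the controls' standard deviation and combining into a composite, can be sketched as follows; the root-mean-square combination rule is an assumption, since the abstract does not give the exact formula:

```python
import numpy as np

def composite_distance(patient, control_mean, control_sd):
    """Z-score each measure against the control group, then combine
    into one distance-from-normal score (RMS of the z-scores)."""
    z = (np.asarray(patient) - np.asarray(control_mean)) / np.asarray(control_sd)
    return float(np.sqrt(np.mean(z ** 2)))
```

A patient matching the control means scores 0; one standard deviation away on every measure scores 1.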

Chenausky, Karen; MacAuslan, Joel; Goldhor, Richard

2011-01-01

194

Perspectives on speech recognition technology.  

PubMed

Speech recognition technology is used in all sorts of applications. However, for radiology, the issues are more complex than merely being able to dial a contact from an address book, and radiologists have been hesitant to embrace the technology, with some preferring the status quo. Speech recognition technology has improved dramatically over the past several years and has been broadly commercialized, yet its use for composing text reports or email has found only limited acceptance. The overriding reason appears to be that most computer users prefer not to talk to their computers: they have learned to compose text documents via a "type-and-organize" methodology rather than composing the document "in their heads" and dictating. Radiologists are still required to dictate their reports, whether digitally, into an analog tape recorder, or via a speech recognition system. The benefits extend to the radiologist's patients and to the radiologist's employers--the hospitals or imaging centers--but it could be said that there is little direct benefit for the radiologist. There is a belief that systems should focus more on improving radiologist efficiency rather than emphasizing cost savings and turnaround time. Integration with existing systems is critical, but any technology, in order to be well accepted by its primary user, needs to benefit that user. Before selecting any speech recognition technology, a radiology administrator should do some research and find answers to several questions that address the basics of speech recognition technology and the companies that provide it. In addition, the radiology administrator must ensure that the facility is prepared to implement the technology and address any workflow- or culture-related issues that may arise. There are a number of opportunities for improvement in speech recognition radiology applications. These include the ongoing need for improved recognition rates, the need to streamline integration with picture archiving and communication system (PACS) and radiology information system (RIS) technologies, and the general need to improve the user interface. In addition to these improvements, one can expect increased adoption of structured reporting technologies within radiology. These techniques allow easier automated extraction of content and more flexible communication and organization of data (such as communication to electronic medical record systems). PMID:15794378

Talton, David

2005-01-01

195

Speech recognition with amplitude and frequency modulations  

NASA Astrophysics Data System (ADS)

Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance.
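Slowly varying AM and FM are commonly derived from a subband signal via the Hilbert transform: the magnitude of the analytic signal gives the envelope (AM), and the derivative of its phase gives the instantaneous frequency (FM). The sketch below illustrates this on a synthetic signal; it is not the authors' exact procedure, and the carrier, band edges, and filter order are arbitrary choices for the example.

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
# Synthetic "speech band": a 1 kHz carrier with slow AM (4 Hz) and FM (3 Hz).
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(
    2 * np.pi * (1000 * t + 20 * np.sin(2 * np.pi * 3 * t)))

# Bandpass filter around the carrier, then form the analytic signal.
sos = butter(4, [500, 1500], btype="bandpass", fs=fs, output="sos")
band = sosfiltfilt(sos, x)
analytic = hilbert(band)

am = np.abs(analytic)                    # slowly varying envelope (AM)
phase = np.unwrap(np.angle(analytic))
fm = np.diff(phase) * fs / (2 * np.pi)   # instantaneous frequency in Hz (FM)

print(f"mean FM ~ {fm.mean():.0f} Hz")   # close to the 1 kHz carrier
```

In a multiband decomposition, this AM/FM extraction would be repeated per spectral band before resynthesizing stimuli with only the desired modulation cues.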

Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

2005-02-01

196

Speech recognition with amplitude and frequency modulations  

PubMed Central

Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance.

Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

2005-01-01

197

Speech recognition with amplitude and frequency modulations.  

PubMed

Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance. PMID:15677723

Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

2005-02-15

198

Speech disorders in right-hemisphere stroke.  

PubMed

Clinical practice shows that right-hemisphere cerebral strokes are often accompanied by one speech disorder or another. The aim of the present work was to analyze published data addressing speech disorders in right-sided strokes. Questions of the lateralization of speech functions are discussed, with particular reference to the role of the right hemisphere in speech activity and the structure of speech pathology in right-hemisphere foci. Clinical variants of speech disorders, such as aphasia, dysprosody, dysarthria, mutism, and stutter are discussed in detail. Types of speech disorders are also discussed, along with the possible mechanisms of their formation depending on the locations of lesions in the axis of the brain (cortex, subcortical structures, stem, cerebellum) and focus size. PMID:20532830

Dyukova, G M; Glozman, Z M; Titova, E Y; Kriushev, E S; Gamaleya, A A

2010-07-01

199

Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.  

PubMed

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

Greene, Beth G; Logan, John S; Pisoni, David B

1986-03-01

200

Adaptive Redundant Speech Transmission over Wireless Multimedia Sensor Networks Based on Estimation of Perceived Speech Quality  

PubMed Central

An adaptive redundant speech transmission (ARST) approach to improve the perceived speech quality (PSQ) of speech streaming applications over wireless multimedia sensor networks (WMSNs) is proposed in this paper. The proposed approach estimates the PSQ as well as the packet loss rate (PLR) from the received speech data. Subsequently, it decides whether the transmission of redundant speech data (RSD) is required in order to assist a speech decoder to reconstruct lost speech signals for high PLRs. According to the decision, the proposed ARST approach controls the RSD transmission, then it optimizes the bitrate of speech coding to encode the current speech data (CSD) and RSD bitstream in order to maintain the speech quality under packet loss conditions. The effectiveness of the proposed ARST approach is then demonstrated using the adaptive multirate-narrowband (AMR-NB) speech codec and ITU-T Recommendation P.563 as a scalable speech codec and the PSQ estimation, respectively. It is shown from the experiments that a speech streaming application employing the proposed ARST approach significantly improves speech quality under packet loss conditions in WMSNs.
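The control loop described above (estimate loss and quality, decide whether to send redundant data, then rebudget the coder bitrate between current and redundant frames) can be sketched roughly as below. The thresholds, bitrate budget, and mode-selection policy are invented for illustration and are not the paper's algorithm; only the listed AMR-NB mode rates are real codec modes.

```python
# Subset of real AMR-NB bitrate modes, in kbit/s.
AMR_NB_MODES_KBPS = [4.75, 5.90, 7.40, 12.2]

def choose_transmission(plr, psq_mos, budget_kbps=12.2):
    """Return (csd_rate, rsd_rate) in kbit/s given the estimated packet-loss
    rate and a MOS-like perceived-speech-quality score (illustrative logic)."""
    send_rsd = plr > 0.05 or psq_mos < 3.0   # hypothetical trigger thresholds
    if not send_rsd:
        return budget_kbps, 0.0              # all budget to current speech data
    # Split the budget: code redundant data at the lowest mode, then pick the
    # highest current-speech mode that still fits within the budget.
    rsd_rate = AMR_NB_MODES_KBPS[0]
    csd_rate = max(m for m in AMR_NB_MODES_KBPS if m + rsd_rate <= budget_kbps)
    return csd_rate, rsd_rate

print(choose_transmission(plr=0.02, psq_mos=3.8))  # -> (12.2, 0.0)
print(choose_transmission(plr=0.12, psq_mos=2.6))  # -> (7.4, 4.75)
```

The key design point is that redundancy is paid for out of the same bitrate budget, so the current-frame coding rate drops only when measured conditions justify it.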

Kang, Jin Ah; Kim, Hong Kook

2011-01-01

201

Speech and language delay in children.  

PubMed

Speech and language delay in children is associated with increased difficulty with reading, writing, attention, and socialization. Although physicians should be alert to parental concerns and to whether children are meeting expected developmental milestones, there currently is insufficient evidence to recommend for or against routine use of formal screening instruments in primary care to detect speech and language delay. In children not meeting the expected milestones for speech and language, a comprehensive developmental evaluation is essential, because atypical language development can be a secondary characteristic of other physical and developmental problems that may first manifest as language problems. Types of primary speech and language delay include developmental speech and language delay, expressive language disorder, and receptive language disorder. Secondary speech and language delays are attributable to another condition such as hearing loss, intellectual disability, autism spectrum disorder, physical speech problems, or selective mutism. When speech and language delay is suspected, the primary care physician should discuss this concern with the parents and recommend referral to a speech-language pathologist and an audiologist. There is good evidence that speech-language therapy is helpful, particularly for children with expressive language disorder. PMID:21568252

McLaughlin, Maura R

2011-05-15

202

Loss tolerant speech decoder for telecommunications  

NASA Technical Reports Server (NTRS)

A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
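The concealment scheme above predicts replacement codec parameters from a past-history buffer. The patented device uses a trained feed-forward network for the one-step prediction; in the sketch below, a simple linear extrapolation stands in for the network, purely to illustrate the buffer-and-predict structure. The frame values are invented.

```python
from collections import deque

HISTORY = 4  # number of past parameter frames to keep in the buffer

def conceal(history):
    """Predict the next parameter vector from the two most recent frames
    (linear extrapolation, standing in for the trained network)."""
    prev, last = history[-2], history[-1]
    return [2 * l - p for l, p in zip(last, prev)]

buffer = deque(maxlen=HISTORY)
# Three decoded frames of (hypothetical) SCA parameters, then one lost frame.
frames = [[1.0, 10.0], [1.5, 11.0], [2.0, 12.0], None]

decoded = []
for frame in frames:
    if frame is None:            # frame lost or in error: extrapolate it
        frame = conceal(buffer)
    buffer.append(frame)         # past-history buffer updated either way
    decoded.append(frame)

print(decoded[-1])  # -> [2.5, 13.0], continuing the parameter trajectory
```

Because the replacement parameters are handed to the decoder in place of the missing frame, the downstream synthesis continues as usual, which is the transparency property the abstract describes.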

Prieto, Jr., Jaime L. (Inventor)

1999-01-01

203

Cluster-based modeling for ubiquitous speech recognition  

Microsoft Academic Search

In order to realize speech recognition systems that can achieve high recognition accuracy for ubiquitous speech, it is crucial to make the systems flexible enough to cope with a large variability of spontaneous speech. This paper investigates two speech recognition methods that can adapt to speech variation using a large number of models trained based on clustering techniques; one automatically

Sadaoki Furui; Tomohisa Ichiba; Takahiro Shinozaki; Edward W. D. Whittaker; Koji Iwano

2005-01-01

204

Prediction and imitation in speech  

PubMed Central

It has been suggested that intra- and inter-speaker variability in speech are correlated. Interlocutors have been shown to converge on various phonetic dimensions. In addition, speakers imitate the phonetic properties of voices they are exposed to in shadowing, repetition, and even passive listening tasks. We review three theoretical accounts of speech imitation and convergence phenomena: (i) the Episodic Theory (ET) of speech perception and production (Goldinger, 1998); (ii) the Motor Theory (MT) of speech perception (Liberman and Whalen, 2000; Galantucci et al., 2006); (iii) Communication Accommodation Theory (CAT; Giles and Coupland, 1991; Giles et al., 1991). We argue that no account is able to explain all the available evidence. In particular, there is a need to integrate low-level, mechanistic accounts (like ET and MT), and higher-level accounts (like CAT). We propose that this is possible within the framework of an integrated theory of production and comprehension (Pickering and Garrod, 2013). Similarly to both ET and MT, this theory assumes parity between production and perception. Uniquely, however, it posits that listeners simulate speakers' utterances by computing forward-model predictions at many different levels, which are then compared to the incoming phonetic input. In our account phonetic imitation can be achieved via the same mechanism that is responsible for sensorimotor adaptation; i.e., the correction of prediction errors. In addition, the model assumes that the degree to which sensory prediction errors lead to motor adjustments is context-dependent. The notion of context subsumes both the preceding linguistic input and non-linguistic attributes of the situation (e.g., the speaker's and listener's social identities, their conversational roles, the listener's intention to imitate).

Gambi, Chiara; Pickering, Martin J.

2013-01-01

205

Speaker Verification Using Coded Speech  

Microsoft Academic Search

The implementation of a pseudo text-independent Speaker Verification system is described. This system was designed to use only information extracted directly from the coded parameters embedded in the ITU-T G.729 bitstream. Experiments were performed over the YOHO database (1). The feature vector as a short-time representation of speech consists of 16 LPC-Cepstral coefficients, as well as

Antonio Moreno-daniel; Biing-Hwang Juang; Juan Arturo Nolazco-flores

2004-01-01

206

Representing Speech Through Musical Notation  

Microsoft Academic Search

The achievements of the inventor Joshua Steele (1700–1791) are twofold. First, he used musical concepts to analyze what modern linguists call “suprasegmentals”—voice-pitch, length, and stress; then, he used musical notation as the basis of a new system for recording and executing speech. Just as scholars of music now rely on musical notation for transcribing music heard during fieldwork, so Steele

Jamie C. Kassler

2005-01-01

207

MVA Processing of Speech Features  

Microsoft Academic Search

In this paper, we investigate a technique consisting of mean subtraction, variance normalization and time sequence filtering. Unlike other techniques, it applies auto-regression moving-average (ARMA) filtering directly in the cepstral domain. We call this technique mean subtraction, variance normalization, and ARMA filtering (MVA) post-processing, and speech features with MVA post-processing are called MVA features. Overall, compared to raw features without
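The MVA chain named above (mean subtraction, variance normalization, and ARMA filtering, applied per utterance directly in the cepstral domain) can be sketched as follows. The filter order and the edge handling are illustrative choices, not necessarily the authors'.

```python
import numpy as np

def mva(features, M=2):
    """Apply MVA post-processing to a (frames x coefficients) feature matrix."""
    # Mean subtraction and variance normalization over time (axis 0).
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # ARMA smoothing of each coefficient trajectory, truncated at the edges:
    # y[t] = (sum of y[t-M..t-1] + sum of z[t..t+M]) / (2M + 1)
    T = len(z)
    y = np.zeros_like(z)
    for t in range(T):
        past = y[max(0, t - M):t].sum(axis=0)
        future = z[t:min(T, t + M + 1)].sum(axis=0)
        y[t] = (past + future) / (2 * M + 1)
    return y

# Illustrative input: 100 frames of 13 cepstral coefficients.
x = np.random.default_rng(0).normal(size=(100, 13))
out = mva(x)
print(out.shape)  # -> (100, 13)
```

The autoregressive term (the sums of already-smoothed outputs) is what distinguishes ARMA filtering from a plain moving average of the normalized features.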

Chia-ping Chen; Jeff A. Bilmes

2007-01-01

208

The Levels of Speech Usage Rating Scale: Comparison of Client Self-Ratings with Speech Pathologist Ratings  

ERIC Educational Resources Information Center

Background: The term "speech usage" refers to what people want or need to do with their speech to fulfil the communication demands in their life roles. Speech-language pathologists (SLPs) need to know about clients' speech usage to plan appropriate interventions to meet their life participation goals. The Levels of Speech Usage is a categorical…

Gray, Christina; Baylor, Carolyn; Eadie, Tanya; Kendall, Diane; Yorkston, Kathryn

2012-01-01

209

The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses  

ERIC Educational Resources Information Center

The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…

Adank, Patti

2012-01-01

210

Imaging applications in speech production research  

NASA Astrophysics Data System (ADS)

The primary focus of speech production research is directed towards obtaining improved understanding and quantitative characterization of the articulatory dynamics, acoustics, and cognition of both normal and pathological human speech. Such efforts are, however, frequently challenged by the lack of appropriate physical and physiological data. A great deal of attention is, hence, given to the development of novel measurement/instrumentation techniques which are desirably non invasive, safe, and do not interfere with normal speech production. Several imaging techniques have been successfully employed for studying speech production. In the first part of this paper, an overview of the various imaging techniques used in speech research such as x-rays, ultrasound, structural and functional magnetic resonance imaging, glossometry, palatography, video fibroscopy and imaging is presented. In the second part of the paper, we describe the results of our efforts to understand and model speech production mechanisms of vowels, fricatives, and lateral and rhotic consonants based on MRI data.

Narayanan, Shrikanth; Alwan, Abeer

1996-04-01

211

Primary progressive aphasia and apraxia of speech.  

PubMed

Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint. PMID:24234355

Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A

2013-09-01

212

[Ergonomical study on Chinese speech warning].  

PubMed

Ergonomical experiments on speech warning under noise background were carried out in 40 healthy males, aged 20-33. Through the determination of auditory reaction time to the Chinese speech warning under dual-tasks and the subjective evaluation of the suitable time length of main warning voice by the subject, the optimum parameters of Chinese speech warning in accordance with space ergonomics were determined. It was found that: suitable time length of main warning voice is 0.35-0.55s, main interval is 0.15-0.35s, speech speed is 4-6 word/s, and sentence interval is 0.2-0.4s. Meanwhile, the analysis of heart rate (HR) and heart rate variability (HRV) demonstrated that the speech warning using aforementioned parameters didn't increase the operator's work load. The results can serve as the objective ergonomical basis and the evaluation criterion for design of speech warning in manned space vehicle. PMID:11541261

Han, D; Zhou, C; Liu, Y; Zhai, Y

1998-02-01

213

Speech Evoked Auditory Brainstem Responses: A New Tool to Study Brainstem Encoding of Speech Sounds  

Microsoft Academic Search

The neural encoding of speech sound begins in the auditory nerve and travels to the auditory brainstem. Non-speech stimuli such as clicks or tone bursts are routinely used to check auditory neural integrity. Recently, speech-evoked auditory brainstem response (ABR) measures are being used as a tool to study the brainstem processing of speech sounds. The aim of

Sujeet Kumar Sinha; Vijayalakshmi Basavaraj

2010-01-01

214

RECENT ADVANCES IN SRI'S IRAQCOMM™ IRAQI ARABIC-ENGLISH SPEECH-TO-SPEECH TRANSLATION SYSTEM  

Microsoft Academic Search

We summarize recent progress on SRI's IraqComm™ Iraqi Arabic-English two-way speech-to-speech translation system. In the past year we made substantial developments in our speech recognition and machine translation technology, leading to significant improvements in both accuracy and speed of the IraqComm system. On the 2008 NIST-evaluation dataset our two-way speech-to-text (S2T) system achieved 6% to 8% absolute improvement in

Murat Akbacak; Horacio Franco; Michael Frandsen; Huda Jameel; Andreas Kathol; Shahram Khadivi; Xin Lei; Arindam Mandal; Saab Mansour; Kristin Precoda; Colleen Richey; Dimitra Vergyri; Wen Wang; Mei Yang; Jing Zheng

215

A Trainable Approach for Multi-Lingual Speech-To-Speech Translation System  

Microsoft Academic Search

This paper presents a statistical speech-to-speech machine translation (MT) system for limited domain applications using a cascaded approach. This architecture allows for the creation of multilingual applications. In this paper, the system architecture and its components, including the speech recognition, parsing, information extraction, translation, natural language generation (NLG) and text-to-speech (TTS) components are described. We have

Y. Gao; J. Sorensen; H. B. Zhou; Z. Diao

216

The neural processing of masked speech  

PubMed Central

Spoken language is rarely heard in silence, and a great deal of interest in psychoacoustics has focused on the ways that the perception of speech is affected by properties of masking noise. In this review we first briefly outline the neuroanatomy of speech perception. We then summarise the neurobiological aspects of the perception of masked speech, and investigate this as a function of masker type, masker level and task.

Scott, Sophie K; McGettigan, Carolyn

2014-01-01

217

Speech, Text and Braille Conversion Technology  

Microsoft Academic Search

This chapter is devoted to the fascinating triangle of conversion technologies that arise between text, speech and Braille. These are enabling technologies that allow speech to be converted into text, as might happen in the creation of a letter; that allow text to be converted into speech, as might happen in the reading of a book for enjoyment; and then,

Rüdiger Hoffmann

218

Neural Network-based Speech Synthesis  

Microsoft Academic Search

Neural networks and other softcomputing methodologies (mainly fuzzy logic and genetic algorithms) prove optimal in solving pattern-matching problems in audio- and speech-processing applications. Very little research has been done that targets softcomputing audio and speech synthesis. Softcomputing computational models can be an optimal audio-synthesis solution for reducing memory and computing-power requirements. A neural network-based text-to-speech processor is proposed and compared

David Frontini; Mario Malcangi

2006-01-01

219

Developing Client-Server Speech Translation Platform  

Microsoft Academic Search

This paper describes a client-server speech translation platform designed for use at mobile terminals. Because terminals and servers are connected via a 3G public mobile phone networks, speech translation services are available at various places with thin client. This platform realizes hands-free communication and robustness for real use of speech translation in noisy environments. A microphone array and new noise

Tohru Shimizu; Yutaka Ashikari; Toshiyuki Takezawa; Masahide Mizushima; Gen-ichiro Kikui; Yutaka Sasaki; Satoshi Nakamura

2006-01-01

220

Continuous Speech Recognition for Clinicians  

PubMed Central

The current generation of continuous speech recognition systems claims to offer high accuracy (greater than 95 percent) speech recognition at natural speech rates (150 words per minute) on low-cost (under $2000) platforms. This paper presents a state-of-the-technology summary, along with insights the authors have gained through testing one such product extensively and other products superficially. The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add an additional 400 terms to get recognition rates of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to get a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute “twice a day” for “bid” when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians' offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal “usable” hardware configuration (which keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a “speech quality” sound card (e.g., SoundBlaster, $99). 
Anything less powerful will result in the system lagging behind the speaking rate. The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary. This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription.

Zafar, Atif; Overhage, J. Marc; McDonald, Clement J.

1999-01-01

221

Spotlight on Speech Codes 2007: The State of Free Speech on Our Nation's Campuses  

ERIC Educational Resources Information Center

Last year, the Foundation for Individual Rights in Education (FIRE) conducted its first-ever comprehensive study of restrictions on speech at America's colleges and universities, "Spotlight on Speech Codes 2006: The State of Free Speech on our Nation's Campuses." In light of the essentiality of free expression to a truly liberal education, its…

Foundation for Individual Rights in Education (NJ1), 2007

2007-01-01

222

The response of the apparent receptive speech disorder of Parkinson's disease to speech therapy  

Microsoft Academic Search

Eleven patients with Parkinson's disease were tested for prosodic abnormality, on three tests of speech production (of angry, questioning, and neutral statement forms), and four tests of appreciation of the prosodic features of speech and facial expression. The tests were repeated after a control period of two weeks without speech therapy and were not substantially different. After two weeks of

S Scott; F I Caird

1984-01-01

223

Speech and Language Skills of Parents of Children with Speech Sound Disorders  

ERIC Educational Resources Information Center

Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…

Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry

2007-01-01

224

JANUS: a speech-to-speech translation system using connectionist and symbolic processing strategies  

Microsoft Academic Search

The authors present JANUS, a speech-to-speech translation system that utilizes diverse processing strategies including dynamic programming, stochastic techniques, connectionist learning, and traditional AI knowledge representation approaches. JANUS translates continuously spoken English utterances into Japanese and German speech utterances. The overall system performance on a corpus of conference registration conversations is 87%. Two versions of JANUS are compared: one using a

Alex Waibel; Ajay N. Jain; Arthur E. McNair; Hiroaki Saito; Alexander G. Hauptmann; Joe Tebelskis

1991-01-01

225

DELAYED SPEECH AND LANGUAGE DEVELOPMENT, PRENTICE-HALL FOUNDATIONS OF SPEECH PATHOLOGY SERIES.  

ERIC Educational Resources Information Center

Written for speech pathology students and professional workers, the book begins by defining language and speech and tracing the development of speech and language from the infant through the 4-year-old. Causal factors of delayed development are given, including central nervous system impairment and associated behavioral clues and language…

WOOD, NANCY E.

226

Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses  

ERIC Educational Resources Information Center

The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be protected by the…

Foundation for Individual Rights in Education (NJ1), 2012

2012-01-01

227

The Social and Private Worlds of Speech: Speech for Inter- and Intramental Activity  

ERIC Educational Resources Information Center

During a study designed to examine the processes of learning English as an additional language as manifest in the interactive behaviour of small groups of bilingual school children playing specially designed board games, several instances of "private speech" were captured. Private speech is commonly described as speech addressed to the self for…

Smith, Heather J.

2007-01-01

228

Private and Inner Speech and the Regulation of Social Speech Communication  

ERIC Educational Resources Information Center

To further investigate the possible regulatory role of private and inner speech in the context of referential social speech communications, a set of clear and systematically applied measures is needed. This study addresses this need by introducing a rigorous method for identifying private speech and certain sharply defined instances of inaudible…

San Martin Martinez, Conchi; Boada i Calbet, Humbert; Feigenbaum, Peter

2011-01-01

229

Statistical modeling of infant-directed versus adult-directed speech: Insights from speech recognition  

NASA Astrophysics Data System (ADS)

Studies on infant speech perception have shown that infant-directed speech (motherese) exhibits exaggerated acoustic properties, which are assumed to guide infants in the acquisition of phonemic categories. Training an automatic speech recognizer on such data might similarly lead to improved performance since classes can be expected to be more clearly separated in the training material. This claim was tested by training automatic speech recognizers on adult-directed (AD) versus infant-directed (ID) speech and testing them under identical versus mismatched conditions. 32 mother-infant conversations and 32 mother-adult conversations were used as training and test data. Both sets of conversations included a set of cue words containing unreduced vowels (e.g., sheep, boot, top, etc.), which mothers were encouraged to use repeatedly. Experiments on continuous speech recognition of the entire data set showed that recognizers trained on infant-directed speech did not perform significantly better than those trained on adult-directed speech. However, isolated word recognition experiments focusing on the above-mentioned cue words showed that the drop in performance of the ID-trained speech recognizer on AD test speech was significantly smaller than vice versa, suggesting that speech with over-emphasized phonetic contrasts may indeed constitute better training material for speech recognition. [Work supported by CMBL, University of Washington.]

Kirchhoff, Katrin; Schimmel, Steven

2003-10-01

230

A computational auditory scene analysis system for speech segregation and robust speech recognition  

Microsoft Academic Search

A conventional automatic speech recognizer does not perform well in the presence of multiple sound sources, while human listeners are able to segregate and recognize a signal of interest through auditory scene analysis. We present a computational auditory scene analysis system for separating and recognizing target speech in the presence of competing speech or noise. We estimate, in two

Yang Shao; Soundararajan Srinivasan; Zhaozhang Jin; DeLiang Wang

2010-01-01

231

Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis  

ERIC Educational Resources Information Center

Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

2009-01-01

232

Stability and Composition of Functional Synergies for Speech Movements in Children with Developmental Speech Disorders  

ERIC Educational Resources Information Center

The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of syllables spa(/spa[image omitted]/) and paas(/pa[image omitted]s/) by 10 6- to 9-year-olds with developmental speech

Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

2011-01-01

233

Speech development in children after cochlear implantation.  

PubMed

We evaluated the long-term speech intelligibility of young deaf children after cochlear implantation (CI). A prospective study on 47 consecutively implanted deaf children with up to 5 years of cochlear implant use was performed. The study was conducted at a pediatric tertiary referral center for CI. All children in the study were prelingually deaf, and each received an implant before beginning a program of auditory verbal therapy. A speech intelligibility rating scale evaluated the spontaneous speech of each child before and at frequent intervals for 5 years after implantation. After cochlear implantation, the speech intelligibility rating increased significantly each year for 3 years (P < 0.05). For the first year, the average rating remained at "prerecognizable words" or "unintelligible speech". Two years after implantation, the children's speech was intelligible to a listener who concentrates and lip-reads (category 3). At the 4- and 5-year intervals, 71.5 and 78% of children had speech intelligible to all listeners (category 5), respectively. Thus, 5 years after rehabilitation, the mode and median of the speech intelligibility rating were five. Congenital and prelingually deaf children gradually develop intelligible speech that does not plateau 5 years after implantation. PMID:17639444

Bakhshaee, Mehdi; Ghasemi, Mohammad Mahdi; Shakeri, Mohammad Taghi; Razmara, Narjes; Tayarani, Hamid; Tale, Mohammad Reza

2007-11-01

234

Adjusting dysarthric speech timing using neural nets  

Microsoft Academic Search

Speech timing problems associated with dysarthria often involve the presence of periods of extraneous silence and non-speech sounds, as well as inappropriately timed or misplaced speech gestures. This study evaluated the performance of neural networks in detecting the presence of inappropriate or non-speech sounds and extraneous silence. The 'opt' neural network program [E. Barnard and R. Cole, OGC Tech. Report No. CSE 89-014], which uses a conjugate gradient algorithm to adjust

Shirley M. Peters; H. Timothy Bunnell

1991-01-01

235

Speech evaluation for patients with cleft palate.  

PubMed

Children with cleft palate are at risk for speech problems, particularly those caused by velopharyngeal insufficiency. There may be an additional risk of speech problems caused by malocclusion. This article describes the speech evaluation for children with cleft palate and how the results of the evaluation are used to make treatment decisions. Instrumental procedures that provide objective data regarding the function of the velopharyngeal valve, and the 2 most common methods of velopharyngeal imaging, are also described. Because many readers are not familiar with phonetic symbols for speech phonemes, Standard English letters are used for clarity. PMID:24607192

Kummer, Ann W

2014-04-01

236

Post-laryngectomy speech respiration patterns  

PubMed Central

Objectives The goal of this study was to determine if speech breathing changes over time in laryngectomy patients using an electrolarynx, and to explore the potential of using respiratory signals to control an artificial voice source. Methods Respiratory patterns during serial speech tasks (counting, days of the week) with an electrolarynx were prospectively studied in six individuals across their first 1–2 years after total laryngectomy, as well as in an additional eight individuals at least 1 year post-laryngectomy using inductance plethysmography. Results In contrast to normal speech that is only produced during exhalation, all individuals were found to engage in inhalation during speech production, with those studied longitudinally displaying increased occurrences of inhalation during speech production with time post-laryngectomy. These trends appear to be stronger for individuals who used an electrolarynx as their primary means of oral communication rather than tracheoesophageal (TE) speech, possibly due to continued dependence on respiratory support for the production of TE speech. Conclusions Our results indicate that there are post-laryngectomy changes in electrolarynx speech breathing behaviors. This has implications for designing improved electrolarynx communication systems, which could use signals derived from respiratory function as one of many potential physiologically based sources for more natural control of electrolarynx speech.

Stepp, Cara E.; Heaton, James T.; Hillman, Robert E.

2012-01-01

237

Speech synthesis with artificial neural networks  

NASA Astrophysics Data System (ADS)

The application of neural nets to speech synthesis is considered. In speech synthesis, the main efforts so far have been to master the grapheme to phoneme conversion. During this conversion symbols (graphemes) are converted into other symbols (phonemes). Neural networks, however, are especially competitive for tasks in which complex nonlinear transformations are needed and sufficient domain specific knowledge is not available. The conversion of text into speech parameters appropriate as input for a speech generator seems such a task. Results of a pilot study in which an attempt is made to train a neural network for this conversion are presented.

Weijters, Ton; Thole, Johan

1992-10-01

238

Multilingual Phoneme Models for Rapid Speech Processing System Development.  

National Technical Information Service (NTIS)

Current speech recognition systems tend to be developed only for commercially viable languages. The resources needed for a typical speech recognition system include hundreds of hours of transcribed speech for acoustic models and 10 to 100 million words of...

E. G. Hansen

2006-01-01

239

Speech and Language Disorders in the School Setting  

MedlinePLUS

Frequently Asked Questions: Speech and Language Disorders in the School Setting. What types of speech and language disorders affect school-age children? Do speech-language ...

240

A Resource Manual for Speech and Hearing Programs in Oklahoma.  

ERIC Educational Resources Information Center

Administrative aspects of the Oklahoma speech and hearing program are described, including state requirements, school administrator role, and organizational and operational procedures. Information on speech and language development and remediation covers language, articulation, stuttering, voice disorders, cleft palate, speech improvement,…

Oklahoma State Dept. of Education, Oklahoma City.

241

DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM Documentation. NIST Speech Disc 1-1.1.  

National Technical Information Service (NTIS)

The Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition sy...

J. S. Garofolo; L. F. Lamel; W. M. Fisher; J. G. Fiscus; D. S. Pallett

1993-01-01

242

Open Microphone Speech Understanding: Correct Discrimination Of In Domain Speech  

NASA Technical Reports Server (NTRS)

An ideal spoken dialogue system listens continually and determines which utterances were spoken to it, understands them, and responds appropriately while ignoring the rest. This paper outlines a simple method for achieving this goal, which involves trading a slightly higher false rejection rate of in-domain utterances for a higher correct rejection rate of Out of Domain (OOD) utterances. The system recognizes semantic entities specified by a unification grammar which is specialized by Explanation Based Learning (EBL), so that it only uses rules which are seen in the training data. The resulting grammar has probabilities assigned to each construct so that overgeneralizations are not a problem. The resulting system only recognizes utterances which reduce to a valid logical form which has meaning for the system and rejects the rest. A class N-gram grammar has been trained on the same training data. This system gives good recognition performance and offers good Out of Domain discrimination when combined with the semantic analysis. The resulting systems were tested on a Space Station Robot Dialogue Speech Database and a subset of the OGI conversational speech database. Both systems run in real time on a PC laptop, and the present performance allows continuous listening with an acceptably low false acceptance rate. This type of open microphone system has been used in the Clarissa procedure reading and navigation spoken dialogue system, which is being tested on the International Space Station.

Hieronymus, James; Aist, Greg; Dowding, John

2006-01-01

243

Method and apparatus for obtaining complete speech signals for speech recognition applications  

NASA Technical Reports Server (NTRS)

The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.

Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)

2009-01-01
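The buffering mechanism described in this record can be sketched in a few lines of Python using `collections.deque`. This is an illustrative model only, not the patented implementation; the class and method names are invented for the example.

```python
from collections import deque

class CircularFrameBuffer:
    """Keep only the most recent audio frames, so that speech occurring
    just before a push-to-talk command is not lost (sketch of the idea)."""

    def __init__(self, max_frames):
        self.frames = deque(maxlen=max_frames)  # oldest frames drop off automatically

    def append(self, frame):
        self.frames.append(frame)

    def frames_before_command(self, n):
        """Return up to n frames recorded just before the user command."""
        return list(self.frames)[-n:]

# Simulate a continuously recorded audio stream of numbered frames.
buf = CircularFrameBuffer(max_frames=5)
for i in range(12):
    buf.append(f"frame{i}")

# On a 'start recognition' command, prepend the 3 most recent frames so the
# onset of the utterance is included in the signal sent to the recognizer.
augmented = buf.frames_before_command(3)
print(augmented)  # ['frame9', 'frame10', 'frame11']
```

Because `deque(maxlen=...)` discards the oldest entries on overflow, the buffer's memory use stays bounded no matter how long the system listens.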

244

Analysis of Fundamental Frequency Pattern in Speech.  

National Technical Information Service (NTIS)

The fundamental frequency pattern varies in speech. For example, when we speak angrily, the fundamental frequency is high and the intonation becomes deeper. Is there a rule about the pattern and speech? The way to analyze the fundamental frequency p...

T. Okudera; Y. Takasawa

1993-01-01
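The record's description of the analysis method is truncated. One common way to extract a fundamental frequency pattern from voiced speech (an illustration only, not necessarily the method used in this report) is autocorrelation peak-picking:

```python
import math

def estimate_f0(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency of one voiced frame by finding
    the lag with maximum autocorrelation in the plausible pitch range."""
    lag_min = int(sample_rate / f_max)   # shortest plausible pitch period
    lag_max = int(sample_rate / f_min)   # longest plausible pitch period
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        corr = sum(signal[t] * signal[t + lag]
                   for t in range(len(signal) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag

sr = 8000
# Synthetic 'voiced' frame: a 200 Hz fundamental plus a weaker harmonic.
frame = [math.sin(2 * math.pi * 200 * t / sr)
         + 0.4 * math.sin(2 * math.pi * 400 * t / sr)
         for t in range(400)]
print(round(estimate_f0(frame, sr)))  # prints 200
```

Tracking this estimate frame by frame over an utterance yields the fundamental frequency contour the abstract refers to.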

245

Noise estimation techniques for robust speech recognition  

Microsoft Academic Search

Two new techniques are presented to estimate the noise spectra or the noise characteristics for noisy speech signals. No explicit speech pause detection is required. Past noisy segments of just about 400 ms duration are needed for the estimation. Thus the algorithm is able to quickly adapt to slowly varying noise levels or slowly changing noise spectra. These techniques can

H. G. Hirsch; C. Ehrlicher

1995-01-01

246

Speech Recognition with Primarily Temporal Cues  

Microsoft Academic Search

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants,

Robert V. Shannon; Fan-Gang Zeng; Vivek Kamath; John Wygonski; Michael Ekelid

1995-01-01
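The envelope-modulated-noise manipulation this record describes can be sketched in plain Python. This is a simplified, single-band illustration rather than the authors' processing chain: it uses a moving-average smoother in place of a proper low-pass filter and a uniform-noise carrier, whereas the study processed several frequency bands and summed them.

```python
import math
import random

def envelope(signal, window=32):
    """Full-wave rectify, then smooth with a moving average: a crude
    low-pass filter that keeps only the slow amplitude contour."""
    rect = [abs(x) for x in signal]
    out = []
    for i in range(len(rect)):
        lo = max(0, i - window + 1)
        out.append(sum(rect[lo:i + 1]) / (i + 1 - lo))
    return out

def vocode_band(band_signal, rng):
    """Replace the band's spectral fine structure with noise while keeping
    its temporal envelope (the cue shown to support recognition)."""
    env = envelope(band_signal)
    return [e * rng.uniform(-1.0, 1.0) for e in env]

# A 100 Hz 'speech band' with a rising-then-falling amplitude contour.
n = 1000
contour = [math.sin(math.pi * i / n) for i in range(n)]
band = [c * math.sin(2 * math.pi * 100 * i / 8000)
        for i, c in enumerate(contour)]

vocoded = vocode_band(band, random.Random(0))
# The noise carrier destroys spectral detail, yet the vocoded signal's
# amplitude still rises and falls with the original contour.
```

In the full technique, each band-limited noise is re-filtered to its band and the bands are summed, so only the per-band temporal envelopes survive.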

247

Speech recognition with amplitude and frequency modulations  

Microsoft Academic Search

Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited

Fan-Gang Zeng; Kaibao Nie; Ginger S. Stickney; Ying-Yee Kong; Michael Vongphoe; Ashish Bhargave; Chaogang Wei; Keli Cao

2005-01-01

248

Localization of Sublexical Speech Perception Components  

ERIC Educational Resources Information Center

Models of speech perception are in general agreement with respect to the major cortical regions involved, but lack precision with regard to localization and lateralization of processing units. To refine these models we conducted two Activation Likelihood Estimation (ALE) meta-analyses of the neuroimaging literature on sublexical speech perception.…

Turkeltaub, Peter E.; Coslett, H. Branch

2010-01-01

249

A distributed system architecture for speech recognition  

Microsoft Academic Search

This paper is an overview of the software environment (called AGORA) that is being developed at Carnegie-Mellon University to support the development of connected, speaker-independent speech recognition systems. This includes the design, implementation, testing, and fast (parallel) execution of speech recognition programs composed of a large number of independent algorithms. Systems described within AGORA can be executed on a

F. Alleva; R. Bisiani; S. Forin; R. Lerner

1986-01-01

250

Building Searchable Collections of Enterprise Speech Data.  

ERIC Educational Resources Information Center

The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

251

Pronunciation Modeling for Large Vocabulary Speech Recognition  

ERIC Educational Resources Information Center

The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy in automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the…

Kantor, Arthur

2010-01-01

252

Pronunciation Modeling for Spontaneous Mandarin Speech Recognition  

Microsoft Academic Search

Pronunciation variations in spontaneous speech can be classified into complete changes and partial changes. A complete change is the replacement of a canonical phoneme by another alternative phone, such as 'b' being pronounced as 'p'. Partial changes are variations within the phoneme, such as nasalization, centralization, and voicing. Most current work in pronunciation modeling for spontaneous Mandarin speech remains at

Yi Liu; Pascale Fung

2004-01-01

253

SPEECH LEVELS IN VARIOUS NOISE ENVIRONMENTS  

EPA Science Inventory

The goal of this study was to determine average speech levels used by people when conversing in different levels of background noise. The non-laboratory environments where speech was recorded were: high school classrooms, homes, hospitals, department stores, trains and commercial...

254

Unit Selection Speech Synthesis in Noise  

Microsoft Academic Search

The paper presents an approach to unit selection speech synthesis in noise. The approach is based on a modification of the speech synthesis method originally published in A.W. Black and P. Taylor (1997), where the distance of a candidate unit from its cluster center is used as the unit selection cost. We found out that using an additional measure evaluating

M. Cernak

2006-01-01

255

UNIT SELECTION SPEECH SYNTHESIS IN NOISE  

Microsoft Academic Search

The paper presents an approach to unit selection speech synthesis in noise. The approach is based on a modification of the speech synthesis method originally published in (1), where the distance of a candidate unit from its cluster center is used as the unit selection cost. We found out that using an additional measure evaluating intelligibility for the

Milos Cernak

256

The Role of Speech in Language.  

ERIC Educational Resources Information Center

This book reports the proceedings of the conference on the role of speech in language, the fifth conference in the "Communicating by Language" Series, sponsored by the Growth and Development Branch of the National Institute of Child Health and Human Development. The focus of the first group of papers is on the development of speech in man and…

Kavanagh, James F., Ed.; Cutting, James E., Ed.

257

Toddlers' recognition of noise-vocoded speech  

PubMed Central

Despite their remarkable clinical success, cochlear-implant listeners today still receive spectrally degraded information. Much research has examined normally hearing adult listeners' ability to interpret spectrally degraded signals, primarily using noise-vocoded speech to simulate cochlear implant processing. Far less research has explored infants' and toddlers' ability to interpret spectrally degraded signals, despite the fact that children in this age range are frequently implanted. This study examines 27-month-old typically developing toddlers' recognition of noise-vocoded speech in a language-guided looking study. Children saw two images on each trial and heard a voice instructing them to look at one item (“Find the cat!”). Full-spectrum sentences or their noise-vocoded versions were presented with varying numbers of spectral channels. Toddlers showed equivalent proportions of looking to the target object with full-speech and 24- or 8-channel noise-vocoded speech; they failed to look appropriately with 2-channel noise-vocoded speech and showed variable performance with 4-channel noise-vocoded speech. Despite accurate looking performance for speech with at least eight channels, children were slower to respond appropriately as the number of channels decreased. These results indicate that 2-yr-olds have developed the ability to interpret vocoded speech, even without practice, but that doing so requires additional processing. These findings have important implications for pediatric cochlear implantation.

Newman, Rochelle; Chatterjee, Monita

2013-01-01

258

Neural network approach to speech pathology  

Microsoft Academic Search

A speech problem can be caused by different reasons, from psychological to organic. The existing diagnosis of speech pathologies relies on skilled doctors, who can often diagnose by simply listening to the patient. We show that neural networks can simulate this ability and thus provide an automated (preliminary) diagnosis

Antonio P. Salvatore; N. Thome; C. M. Gorss; Michael P. Cannito

1999-01-01

259

Nonlinear, Biophysically-Informed Speech Pathology Detection  

Microsoft Academic Search

This paper reports a simple nonlinear approach to online acoustic speech pathology detection for automatic screening purposes. Straightforward linear preprocessing followed by two nonlinear measures, based parsimoniously upon the biophysics of speech production, combined with subsequent linear classification, achieves an overall normal/pathological detection performance of 91.4%, and over 99% with rejection of 15% ambiguous cases. This compares favourably with more

Max Little; P. McSharry; I. Moroz; S. Roberts

2006-01-01

260

Variant and invariant characteristics of speech movements  

Microsoft Academic Search

Upper lip, lower lip, and jaw kinematics during select speech behaviors were studied in an attempt to identify potential invariant characteristics associated with this highly skilled motor behavior. Data indicated that speech motor actions are planned and executed presumably in terms of relatively invariant combined multimovement gestures. In contrast, the individual upper lip, lower lip, and jaw movements and their

V. L. Gracco; J. H. Abbs

1986-01-01

261

The motor theory of speech perception revised  

Microsoft Academic Search

A motor theory of speech perception, initially proposed to account for results of early experiments with synthetic speech, is now extensively revised to accommodate recent findings, and to relate the assumptions of the theory to those that might be made about other perceptual modes. According to the revised theory, phonetic information is perceived in a biologically distinct system, a

ALVIN M. LIBERMAN; IGNATIUS G. MATTINGLY

1985-01-01

262

Noise suppression methods for robust speech processing  

NASA Astrophysics Data System (ADS)

Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments during the reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable, secure compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include the understanding of how environmental noise degrades narrowband coded speech, development of appropriate real-time noise suppression algorithms, and development of speech parameter identification methods that consider signal contamination as a fundamental element in the estimation process. This report describes the research and results in the areas of noise suppression using dual-input adaptive noise cancellation, articulation rate change techniques, and spectral subtraction, and describes an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded helicopter speech by 10.6 points. In addition, summaries are included of prior studies in Constant-Q signal analysis and synthesis, perceptual modelling, speech activity detection, and pole-zero modelling of noisy signals. Three recent studies in speech modelling using the critical band analysis-synthesis transform and using splines are then presented. Finally, a list of major publications generated under this contract is given.

Boll, S. F.; Kajiya, J.; Youngberg, J.; Petersen, T. L.; Ravindra, H.; Done, W.; Cox, B. V.; Cohen, E.

1981-04-01
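The spectral subtraction step mentioned in this abstract can be illustrated with a minimal sketch. The assumptions here are illustrative: a naive O(n²) DFT, and a noise-only recording standing in for the averaged noise estimate; none of this reflects the report's actual LPC-10 pipeline.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (fine for a small demo frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def spectral_subtract(noisy_mag, noise_mag, floor=0.0):
    """Core of spectral subtraction: subtract the estimated noise magnitude
    in each bin, clamping negative results to a spectral floor."""
    return [max(m - nm, floor) for m, nm in zip(noisy_mag, noise_mag)]

rng = random.Random(1)
n = 64
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]   # stands in for 'speech'
noise = [0.3 * rng.uniform(-1, 1) for _ in range(n)]
noisy = [s + w for s, w in zip(tone, noise)]

noisy_mag = [abs(c) for c in dft(noisy)]
noise_mag = [abs(c) for c in dft(noise)]  # in practice: averaged over speech pauses

clean_mag = spectral_subtract(noisy_mag, noise_mag)
# Bins 8 and 56 (the tone) keep most of their energy; noise-only bins
# drop toward zero after subtraction.
```

A real system would recombine `clean_mag` with the noisy phase and overlap-add successive frames; a small positive `floor` is usually kept to limit "musical noise" artifacts.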

263

An Acquired Deficit of Audiovisual Speech Processing  

ERIC Educational Resources Information Center

We report a 53-year-old patient (AWF) who has an acquired deficit of audiovisual speech integration, characterized by a perceived temporal mismatch between speech sounds and the sight of moving lips. AWF was less accurate on an auditory digit span task with vision of a speaker's face as compared to a condition in which no visual information from…

Hamilton, Roy H.; Shenton, Jeffrey T.; Coslett, H. Branch

2006-01-01

264

Second Language Learners and Speech Act Comprehension  

ERIC Educational Resources Information Center

Recognizing the specific speech act (Searle, 1969) that a speaker performs with an utterance is a fundamental feature of pragmatic competence. Past research has demonstrated that native speakers of English automatically recognize speech acts when they comprehend utterances (Holtgraves & Ashley, 2001). The present research examined whether this…

Holtgraves, Thomas

2007-01-01

265

The Oral Speech Mechanism Screening Examination (OSMSE).  

ERIC Educational Resources Information Center

Although speech-language pathologists are expected to be able to administer and interpret oral examinations, there are currently no screening tests available that provide careful administration instructions and data for intra-examiner and inter-examiner reliability. The Oral Speech Mechanism Screening Examination (OSMSE) is designed primarily for…

St. Louis, Kenneth O.; Ruscello, Dennis M.

266

Assessing Speech Discrimination in Individual Infants  

ERIC Educational Resources Information Center

Assessing speech discrimination skills in individual infants from clinical populations (e.g., infants with hearing impairment) has important diagnostic value. However, most infant speech discrimination paradigms have been designed to test group effects rather than individual differences. Other procedures suffer from high attrition rates. In this…

Houston, Derek M.; Horn, David L.; Qi, Rong; Ting, Jonathan Y.; Gao, Sujuan

2007-01-01

267

Nonstationary spectral modeling of voiced speech  

Microsoft Academic Search

The main purpose of this paper is to present a novel model for voiced speech. The classical model, which is being used in many applications, assumes local stationarity, and consequently imposes a simple and well known line structure to the short-time spectrum of voiced speech. The model derived in this paper allows for local non-stationarities not only in terms of

L. Almeida; J. Tribolet

1983-01-01

268

Decreasing Communication Anxiety through Public Speech Training.  

ERIC Educational Resources Information Center

A study investigated the time honored notion that students who successfully complete a basic course in public speaking will experience less anxiety about speech situations, and will have significantly different emotional reactions to potential public speaking situations. Scales to measure communication apprehension, speech anxiety, and pleasure,…

Biggers, Thompson

269

Enhancement and bandwidth compression of noisy speech  

Microsoft Academic Search

Over the past several years there has been considerable attention focused on the problem of enhancement and bandwidth compression of speech degraded by additive background noise. This interest is motivated by several factors including a broad set of important applications, the apparent lack of robustness in current speech-compression systems and the development of several potentially promising and practical solutions. One

J. S. Lim; A. V. Oppenheim

1979-01-01

270

Robust speech recognizer using multiclass SVM  

Microsoft Academic Search

In this paper a robust speech recognizer is presented based on features obtained from the speech signal and also from the image of the speaker. The features were combined by simple concatenation, resulting in composed feature vectors to train the models corresponding to each class. For recognition, the classification process relies on a very effective algorithm, namely the multiclass SVM.

Inge Gavat; Gabriel Costache; Claudia Iancu

2004-01-01

271

Speech Fluency in Fragile X Syndrome  

ERIC Educational Resources Information Center

The present study investigated the dysfluencies in the speech of nine French speaking individuals with fragile X syndrome. Type, number, and loci of dysfluencies were analysed. The study confirms that dysfluencies are a common feature of the speech of individuals with fragile X syndrome but also indicates that the dysfluency pattern displayed is…

Van Borsel, John; Dor, Orianne; Rondal, Jean

2008-01-01

272

Speech Processing Technology in Second Language Testing  

Microsoft Academic Search

The purpose of the study described in this article was to investigate the effectiveness of one application of speech recognition technology in the assessment of spoken English and overall English proficiency. The application referred to is called Versant and is a fully automated test of English as a second language, a test which utilizes speech recognition technology rather than a

Marina Dodigovic

2009-01-01

273

Acquisition of Imitative Speech by Schizophrenic Children  

Microsoft Academic Search

Two mute schizophrenic children were taught imitative speech within an operant conditioning framework. The training procedure consisted of a series of increasingly fine verbal discriminations; the children were rewarded for closer and closer reproductions of the attending adults' speech. We found that reward delivered contingent upon imitation was necessary for development of imitation. Furthermore, the newly established imitation was shown

O. Ivar Lovaas; John P. Berberich; Bernard F. Perloff; Benson Schaeffer

1966-01-01

274

A prosodically guided speech understanding strategy  

Microsoft Academic Search

Our strategy for computer understanding of speech uses prosodic features to break up continuous speech into sentences and phrases and locate stressed syllables in those phrases. The most reliable phonetic data are obtained by performing a distinguishing features analysis within the stressed syllables and by locating sibilants and other robust information in unstressed syllables. The numbers and locations of syntactic

WAYNE A. LEA; MARK F. MEDRESS; TOBY E. SKINNER

1975-01-01

275

How Should a Speech Recognizer Work?  

ERIC Educational Resources Information Center

Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that…

Scharenborg, Odette; Norris, Dennis; ten Bosch, Louis; McQueen, James M.

2005-01-01

276

Cleft Palate. Foundations of Speech Pathology Series.  

ERIC Educational Resources Information Center

Designed to provide an essential core of information, this book treats normal and abnormal development, structure, and function of the lips and palate and their relationships to cleft lip and cleft palate speech. Problems of personal and social adjustment, hearing, and speech in cleft lip or cleft palate individuals are discussed. Nasal resonance…

Rutherford, David; Westlake, Harold

277

Speech-Language Program Review External Report.  

ERIC Educational Resources Information Center

A study evaluated the Speech-Language Program in school district #68, Nanaimo, British Columbia, Canada. An external evaluator visited the district and spent 4 consecutive days observing speech-language pathologists (SLPs); interviewing teachers, parents, and administrators; and examining records. Results indicated an extremely positive response to…

Nussbaum, Jo

278

Uncertainty in training large vocabulary speech recognizers  

Microsoft Academic Search

We propose a technique for annotating data used to train a speech recognizer. The proposed scheme is based on labeling only a single frame for every word in the training set. We make use of the virtual evidence (VE) framework within a graphical model to take advantage of such data. We apply this approach to a large vocabulary speech recognition

Amarnag Subramanya; Chris Bartels; Jeff Bilmes; Patrick Nguyen

2007-01-01

279

What Makes ESL Students' Speech Sound Unacceptable?  

ERIC Educational Resources Information Center

A study of the gravity of non-native speakers' speech errors, particularly as viewed in the workplace, was based on two assumptions: that certain features of spoken English contribute more to speech acceptability than others, and that native speakers have an internalized, ordered list of criteria for making judgments about non-native speakers'…

Browning, Gari

280

Pulmonic Ingressive Speech in Shetland English  

ERIC Educational Resources Information Center

This paper presents a study of pulmonic ingressive speech, a severely understudied phenomenon within varieties of English. While ingressive speech has been reported for several parts of the British Isles, New England, and eastern Canada, thus far Newfoundland appears to be the only locality where researchers have managed to provide substantial…

Sundkvist, Peter

2012-01-01

281

Hypnosis and the Reduction of Speech Anxiety.  

ERIC Educational Resources Information Center

The purposes of this paper are (1) to review the background and nature of hypnosis, (2) to synthesize research on hypnosis related to speech communication, and (3) to delineate and compare two potential techniques for reducing speech anxiety--hypnosis and systematic desensitization. Hypnosis has been defined as a mental state characterised by…

Barker, Larry L.; And Others

282

Speech vs. singing: infants choose happier sounds.  

PubMed

Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119

Corbeil, Marieve; Trehub, Sandra E; Peretz, Isabelle

2013-01-01

283

Speech Interference Level and Aircraft Acoustical Environment  

Microsoft Academic Search

Basic concepts and formulation of the speech interference level (SIL) measure are discussed and the implications of the use of SIL to measure aircraft cabin environment are analyzed. Intelligibility tests with both words and phrases indicate that serious interference with speech can be demonstrated by adding supposedly unimportant frequencies to the SIL criterion masking band. Innocuous effects are also demonstrated

John J. Dreher; William E. Evans

1960-01-01
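The SIL measure discussed above is, in its common later (ANSI) form, simply the arithmetic mean of the octave-band sound pressure levels at 500, 1000, 2000, and 4000 Hz; the exact bands used in this 1960 paper may differ. A minimal sketch with invented cabin-noise levels:

```python
# Speech interference level (SIL) as an average of octave-band sound pressure
# levels.  The four-band 500/1000/2000/4000 Hz form is the later ANSI
# definition; the bands in the 1960 paper may differ.  Levels are illustrative.

def speech_interference_level(band_levels_db, bands=(500, 1000, 2000, 4000)):
    """Arithmetic mean of the octave-band levels (dB) for the chosen bands."""
    return sum(band_levels_db[b] for b in bands) / len(bands)

# Hypothetical cabin-noise octave-band levels in dB SPL, keyed by centre frequency.
cabin = {250: 78.0, 500: 74.0, 1000: 70.0, 2000: 66.0, 4000: 62.0}
print(speech_interference_level(cabin))  # (74 + 70 + 66 + 62) / 4 = 68.0 dB
```

The paper's point can be read directly off this formula: energy outside the chosen masking bands contributes nothing to the SIL value, even when it does interfere with (or is innocuous to) actual speech intelligibility.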

284

Noise-immune multisensor transduction of speech  

Microsoft Academic Search

Two types of configurations of multiple sensors were developed, tested and evaluated in speech recognition applications for robust performance in high levels of acoustic background noise: One type combines the individual sensor signals to provide a single speech signal input, and the other provides several parallel inputs. For single-input systems, several configurations of multiple sensors were developed and tested. Results

Vishu R. Viswanathan; Claudia M. Henry; Alan G. Derr; Salim Roucos; Richard M. Schwartz

1986-01-01

285

Milton's "Areopagitica" Freedom of Speech on Campus  

ERIC Educational Resources Information Center

The author discusses the content in John Milton's "Areopagitica: A Speech for the Liberty of Unlicensed Printing to the Parliament of England" (1985) and provides parallelism to censorship practiced in higher education. Originally published in 1644, "Areopagitica" makes a powerful--and precocious--argument for freedom of speech and against…

Sullivan, Daniel F.

2006-01-01

286

Irrelevant speech effects and sequence learning.  

PubMed

The irrelevant speech effect is the finding that performance on serial recall tasks is impaired by the presence of irrelevant background speech. According to the object-oriented episodic record (O-OER) model, this impairment is due to a conflict of order information from two different sources: the seriation of the irrelevant speech and the rehearsal of the order of the to-be-remembered items. We tested the model's prediction that irrelevant speech should impair performance on other tasks that involve seriation. Experiments 1 and 2 verified that both an irrelevant speech effect and a changing state effect would obtain in a between-subjects design in which a standard serial recall measure was used, allowing employment of a between-subjects design in subsequent experiments. Experiment 3 showed that performance on a sequence-learning task was impaired by the presence of irrelevant speech, and Experiment 4 verified that performance is worse when the irrelevant speech changes more (the changing state effect). These findings support the prediction made by the O-OER model that one essential component to the irrelevant speech effect is serial order information. PMID:17533889

Farley, Lisa A; Neath, Ian; Allbritton, David W; Surprenant, Aimée M

2007-01-01

287

Scaffolded-Language Intervention: Speech Production Outcomes  

ERIC Educational Resources Information Center

This study investigated the effects of a scaffolded-language intervention using cloze procedures, semantically contingent expansions, contrastive word pairs, and direct models on speech abilities in two preschoolers with speech and language impairment speaking African American English. Effects of the lexical and phonological characteristics (i.e.,…

Bellon-Harn, Monica L.; Credeur-Pampolina, Maggie E.; LeBoeuf, Lexie

2013-01-01

288

Biosignal Processing Applications for Speech Processing  

Microsoft Academic Search

Speech is a biosignal that is amenable to general biosignal processing methodologies such as frequency domain processing. This is supported today by the availability of inexpensive digital multimedia hardware and by the developments of the theoretical aspects of signal processing. However, sound processing must also be regarded through the prism of the psychoacoustic reality of the human hearing system. Speech

Stefan Pantazi

289

Teaching Speech to Your Language Delayed Child.  

ERIC Educational Resources Information Center

Intended for parents, the booklet focuses on the speech and language development of children with language delays. The following topics are among those considered: the parent's role in the initial diagnosis of deafness, intellectual handicap, and neurological difficulties; diagnoses and single causes of difficulty with speech; what to say to…

Rees, Roger J.; Pryor, Jan, Ed.

1980-01-01

290

Repeated Speech Errors: Evidence for Learning  

ERIC Educational Resources Information Center

Three experiments elicited phonological speech errors using the SLIP procedure to investigate whether there is a tendency for speech errors on specific words to reoccur, and whether this effect can be attributed to implicit learning of an incorrect mapping from lemma to phonology for that word. In Experiment 1, when speakers made a phonological…

Humphreys, Karin R.; Menzies, Heather; Lake, Johanna K.

2010-01-01

291

The Effects of TV on Speech Education  

ERIC Educational Resources Information Center

Generally, the speaking aspect is not properly debated when discussing the positive and negative effects of television (TV), especially on children. So, to highlight this point, this study was first initiated by asking the question: "What are the effects of TV on speech?" and secondly, to transform the effects that TV has on speech in a…

Gocen, Gokcen; Okur, Alpaslan

2013-01-01

292

Fighting Words. The Politics of Hateful Speech.  

ERIC Educational Resources Information Center

This book explores issues typified by a series of hateful speech events at Kean College (New Jersey) and on other U.S. campuses in the early 1990s, by examining the dichotomies that exist between the First and the Fourteenth Amendments and between civil liberties and civil rights, and by contrasting the values of free speech and academic freedom…

Marcus, Laurence R.

293

Bibliographic Annual in Speech Communication, 1972.  

ERIC Educational Resources Information Center

This is the third annual volume devoted to recording graduate work in speech communication, providing abstracts of doctoral dissertations, and making available specialized bibliographies. The first section is a bibliography on speech and language acquisition prepared by Joseph A. DeVito. The second is a compilation by Clark S. Marlor of source…

Shearer, Ned A., Ed.

294

Improving robustness of speech recognition systems  

NASA Astrophysics Data System (ADS)

Current Automatic Speech Recognition (ASR) systems fail to perform nearly as well as human speech recognition due to their lack of robustness against speech variability and noise contamination. The goal of this dissertation is to investigate these critical robustness issues, put forth different ways to address them and finally present an ASR architecture based upon these robustness criteria. Acoustic variations adversely affect the performance of current phone-based ASR systems, in which speech is modeled as 'beads-on-a-string', where the beads are the individual phone units. While phone units are distinctive in the cognitive domain, they vary in the physical domain, and their variation occurs due to a combination of factors including speech style, speaking rate, etc.; a phenomenon commonly known as 'coarticulation'. Traditional ASR systems address such coarticulatory variations by using contextualized phone-units such as triphones. Articulatory phonology accounts for coarticulatory variations by modeling speech as a constellation of constricting actions known as articulatory gestures. In such a framework, speech variations such as coarticulation and lenition are accounted for by gestural overlap in time and gestural reduction in space. To realize a gesture-based ASR system, articulatory gestures have to be inferred from the acoustic signal. At the initial stage of this research, a study was performed using synthetically generated speech to obtain a proof-of-concept that articulatory gestures can indeed be recognized from the speech signal. It was observed that having vocal tract constriction trajectories (TVs) as an intermediate representation facilitated the gesture recognition task from the speech signal. Presently no natural speech database contains articulatory gesture annotation; hence an automated iterative time-warping architecture is proposed that can annotate any natural speech database with articulatory gestures and TVs. 
Two natural speech databases: X-ray microbeam and Aurora-2 were annotated, where the former was used to train a TV-estimator and the latter was used to train a Dynamic Bayesian Network (DBN) based ASR architecture. The DBN architecture used two sets of observation: (a) acoustic features in the form of mel-frequency cepstral coefficients (MFCCs) and (b) TVs (estimated from the acoustic speech signal). In this setup the articulatory gestures were modeled as hidden random variables, hence eliminating the necessity for explicit gesture recognition. Word recognition results using the DBN architecture indicate that articulatory representations not only can help to account for coarticulatory variations but can also significantly improve the noise robustness of ASR system.

Mitra, Vikramjit

295

Reconstructing Speech from Human Auditory Cortex  

PubMed Central

How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.

Pasley, Brian N.; David, Stephen V.; Mesgarani, Nima; Flinker, Adeen; Shamma, Shihab A.; Crone, Nathan E.; Knight, Robert T.; Chang, Edward F.

2012-01-01

296

Speech technology in 2001: new research directions.  

PubMed Central

Research in speech recognition and synthesis over the past several decades has brought speech technology to a point where it is being used in "real-world" applications. However, despite the progress, the perception remains that the current technology is not flexible enough to allow easy voice communication with machines. The focus of speech research is now on producing systems that are accurate and robust but that do not impose unnecessary constraints on the user. This chapter takes a critical look at the shortcomings of the current speech recognition and synthesis algorithms, discusses the technical challenges facing research, and examines the new directions that research in speech recognition and synthesis must take in order to form the basis of new solutions suitable for supporting a wide range of applications.

Atal, B S

1995-01-01

297

Free Speech Movement Digital Archive  

NSDL National Science Digital Library

The Free Speech Movement that began on the Berkeley campus of the University of California in 1964 sparked a groundswell of student protests and campus-based social activism that would later spread across the United States for the remainder of the decade. With a substantial gift from Stephen M. Silberstein in the late 1990s, the University of California Berkeley Library began an ambitious program to document the role of those students and other participants who gave a coherent and organized voice to the Free Speech Movement. The primary documents provided here are quite extensive and include transcriptions of legal defense documents, leaflets passed out by members of the movement, letters from administrators and faculty members regarding the movement and student unrest, and oral histories. The site also provides a detailed bibliography of material dealing with the movement and a chronology of key events within its early history. Perhaps the most engaging part of the site is the Social Activism Sound Recording Project, which features numerous audio clips of faculty and academic senate debates, student protests, and discussions that were recorded during this period.

1998-01-01

298

Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech  

NASA Astrophysics Data System (ADS)

There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS, suggesting that for children in this age range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). 
Likewise, whereas segmental and lexical inconsistency were moderately-highly correlated, even the most highly-related segmental and lexical measures agreed on only 76% of classifications (i.e., to CAS and PD). Finally, VOT analyses revealed that CAS utilized a distinct distribution pattern relative to PD and TD. Discussion frames the current findings within a profile of CAS and provides a validated list of criteria for the differential diagnosis of CAS and PD.

Iuzzini, Jenya
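The VOT analyses mentioned above rest on a simple quantity: the time from the plosive's burst release to the onset of voicing. A minimal sketch, assuming the two event times have already been located in the waveform; the 25 ms short-lag/long-lag boundary below is a conventional approximation, not a value from this study.

```python
# Voice onset time (VOT): interval from burst release to voicing onset.
# Event times are supplied directly here; in practice they are located from
# the waveform/spectrogram.  The 25 ms threshold is a rough convention for
# separating English short-lag (voiced-like) from long-lag (voiceless-like) stops.

def vot_ms(burst_time_s, voicing_onset_s):
    """VOT in milliseconds (positive = voicing lags the burst)."""
    return (voicing_onset_s - burst_time_s) * 1000.0

def lag_category(vot, threshold_ms=25.0):
    return "long-lag (voiceless-like)" if vot >= threshold_ms else "short-lag (voiced-like)"

vot_b = vot_ms(0.100, 0.110)   # burst at 100 ms, voicing at 110 ms -> ~10 ms
vot_p = vot_ms(0.300, 0.360)   # burst at 300 ms, voicing at 360 ms -> ~60 ms
print(vot_b, lag_category(vot_b))
print(vot_p, lag_category(vot_p))
```

Pooling such measurements per speaker gives the voicing distributions whose group differences (CAS vs. PD vs. TD) the study examines.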

299

E-learning-based speech therapy: a web application for speech training.  

PubMed

In The Netherlands, a web application for speech training, E-learning-based speech therapy (EST), has been developed for patients with dysarthria, a speech disorder resulting from acquired neurological impairments such as stroke or Parkinson's disease. In this report, the EST infrastructure and its potentials for both therapists and patients are elucidated. EST provides patients with dysarthria the opportunity to engage in intensive speech training in their own environment, in addition to undergoing the traditional face-to-face therapy. Moreover, patients with chronic dysarthria can use EST to independently maintain the quality of their speech once the face-to-face sessions with their speech therapist have been completed. This telerehabilitation application allows therapists to remotely compose speech training programs tailored to suit each individual patient. Moreover, therapists can remotely monitor and evaluate changes in the patient's speech. In addition to its value as a device for composing, monitoring, and carrying out web-based speech training, the EST system compiles a database of dysarthric speech. This database is vital for further scientific research in this area. PMID:20184455

Beijer, Lilian J; Rietveld, Toni C M; van Beers, Marijn M A; Slangen, Robert M L; van den Heuvel, Henk; de Swart, Bert J M; Geurts, Alexander C H

2010-03-01

300

Gender difference in speech intelligibility using speech intelligibility tests and acoustic analyses  

PubMed Central

PURPOSE The purpose of this study was to compare men with women in terms of speech intelligibility, to investigate the validity of objective acoustic parameters related to speech intelligibility, and to establish reference data for future prosthodontic studies. MATERIALS AND METHODS Twenty men and women served as subjects in the present study. After recording of sample sounds, speech intelligibility tests by three speech pathologists and acoustic analyses were performed. Comparison of the speech intelligibility test scores and acoustic parameters such as fundamental frequency, fundamental frequency range, formant frequency, formant ranges, vowel working space area, and vowel dispersion was made between men and women. In addition, the correlations between the speech intelligibility values and acoustic variables were analyzed. RESULTS Women showed significantly higher speech intelligibility scores than men, and there were significant differences between men and women in most of the acoustic parameters used in the present study. However, the correlations between the speech intelligibility scores and acoustic parameters were low. CONCLUSION The speech intelligibility test and acoustic parameters used in the present study were effective in differentiating male voices from female voices, and their values may be used in future studies of patients involved with maxillofacial prosthodontics. However, further studies are needed on the correlation between speech intelligibility tests and objective acoustic parameters.

2010-01-01
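One of the acoustic parameters above, vowel working space area, is commonly computed as the area of the polygon spanned by the corner vowels in the F1-F2 plane. A sketch using the shoelace formula, with invented formant values rather than the study's data:

```python
# Vowel working space area: area of the polygon whose vertices are the corner
# vowels' (F2, F1) coordinates, via the shoelace formula.  Formant values
# below are illustrative, not from the study.

def polygon_area(points):
    """Shoelace area for 2-D vertices given in order around the polygon."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# Hypothetical corner vowels /i/, /a/, /u/ as (F2, F1) in Hz, in order.
corners = [(2300.0, 300.0), (1200.0, 800.0), (800.0, 350.0)]
print(polygon_area(corners))  # vowel space area in Hz^2
```

A larger area generally reflects more distinct vowel articulation, which is one mechanism by which such a parameter could relate to intelligibility scores.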

301

Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.  

PubMed

Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. PMID:23891732

Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

2013-10-01
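The reliability figures quoted above (k>.58, k>.68) are Cohen's kappa, chance-corrected agreement between two raters. A minimal computation from two parallel lists of scale levels; the ratings below are invented, not data from the study:

```python
# Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).
# Chance agreement is estimated from each rater's marginal label frequencies.
from collections import Counter

def cohens_kappa(r1, r2):
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)  # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical Viking Speech Scale levels (I-IV) assigned by two raters.
rater1 = ["I", "I", "II", "III", "IV", "II", "I", "III"]
rater2 = ["I", "II", "II", "III", "IV", "II", "I", "III"]
print(round(cohens_kappa(rater1, rater2), 3))
```

Values above roughly .6 are conventionally read as substantial agreement, which is the interpretation the abstract applies to its k statistics.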

302

TOOLS FOR RESEARCH AND EDUCATION IN SPEECH SCIENCE  

Microsoft Academic Search

The Center for Spoken Language Understanding (CSLU) provides free language resources to researchers and educators in all areas of speech and hearing science. These resources are of great potential value to speech scientists for analyzing speech, for diagnosing and treating speech and language problems, for researching and evaluating language technologies, and for training students in the theory and practice of

Ronald A. Cole

303

Speech Enhancement Based on Hilbert-Huang Transform Theory  

Microsoft Academic Search

Speech enhancement is effective in solving the problem of noisy speech. The Hilbert-Huang transform (HHT) is efficient at describing the local features of dynamic signals and is a new and powerful theory for time-frequency analysis. Based on HHT theory, this paper introduces a new method of speech enhancement to improve speech quality and the signal-to-noise ratio

Xiaojie Zou; Xueyao Li; Rubo Zhang

2006-01-01

304

The benefit of speech enhancement to the hearing impaired  

Microsoft Academic Search

Decreased speech intelligibility in background noise is a common complaint of most hearing impaired in comparison to quiet conditions. Speech enhancement algorithms widely used in hearing aids are often used with some success to improve speech intelligibility for the hearing impaired. It is evident from the literature that not all hearing impaired benefit from speech enhancement algorithms. From the literature

N. Fink; M. Furst; C. Muchnik

2008-01-01

305

Tracking Change in Children with Severe and Persisting Speech Difficulties  

ERIC Educational Resources Information Center

Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…

Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

2013-01-01

306

Decoding speech in the presence of other sources  

Microsoft Academic Search

The statistical theory of speech recognition introduced several decades ago has brought about low word error rates for clean speech. However, it has been less successful in noisy conditions. Since extraneous acoustic sources are present in virtually all everyday speech communication conditions, the failure of the speech recognition model to take noise into account is perhaps the most serious

J. P. Barker; M. P. Cooke; Daniel P. W. Ellis

2005-01-01

307

Speech Characteristics Associated with Three Genotypes of Ataxia  

ERIC Educational Resources Information Center

Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…

Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana

2011-01-01

308

Speech Rate and Fluency in Children and Adolescents  

Microsoft Academic Search

Reduced speech fluency is frequent in clinical paediatric populations, an unexplained finding. To investigate age-related effects on speech fluency variables, we analysed samples of narrative speech (picture description) of 308 healthy children, aged 5 to 17 years, and studied their relation to verbal fluency tasks. All studied measures showed significant developmental effects. Speech rate and verbal fluency scores increased,

Isabel Pavão Martins; Rosário Vieira; Clara Loureiro; M. Emilia Santos

2007-01-01

309

Visual and Auditory Input in Second-Language Speech Processing  

ERIC Educational Resources Information Center

The majority of studies in second-language (L2) speech processing have involved unimodal (i.e., auditory) input; however, in many instances, speech communication involves both visual and auditory sources of information. Some researchers have argued that multimodal speech is the primary mode of speech perception (e.g., Rosenblum 2005). Research on…

Hardison, Debra M.

2010-01-01

310

Check List of Books and Equipment in Speech.  

ERIC Educational Resources Information Center

This list of books, equipment, and supplies in speech offers several hundred resources selected by individual advertisers. The resources are divided into such categories as fundamentals of speech; public address; communication; radio, television, and film; theatre; speech and hearing disorders; speech education; dictionaries and other references;…

Speech Communication Association, Annandale, VA.

311

Recent Developments in Speech Motor Research into Stuttering  

Microsoft Academic Search

This paper discusses recent speech motor research into stuttering within the framework of a speech production model. There seems to be no support for the claim that stutterers differ from nonstutterers in assembling motor plans for speech. However, physiological data suggest that stutterers may at least have different ways of initiating and controlling speech movements. It is hypothesized that stuttering

Herman F. M. Peters; Wouter Hulstijn

2000-01-01

312

Fantasy Play in Preschool Classrooms: Age Differences in Private Speech.  

ERIC Educational Resources Information Center

Private speech is speech overtly directed to a young child's self and not directly spoken to another listener. Private speech develops differently during fantasy play than constructive play. This study examined age differences in the amount of fantasy play in the preschool classroom and in the amount and type of private speech that occurs during…

Kirby, Kathleen Campano

313

Carnival—Combining Speech Technology and Computer Animation  

Microsoft Academic Search

Speech is a powerful information technology and the basis of human interaction. By emitting streams of buzzing, popping, and hissing noises from our mouths, we transmit thoughts, intentions, and knowledge of the world from one mind to another. We're accustomed to thinking of speech as an acoustic, auditory phenomenon. However, speech is also visible. Although the primary function of speech

Michael Berger; Gregor Hofer; Hiroshi Shimodaira

2011-01-01

314

Review: The speech corpus and database of Japanese dialects  

Microsoft Academic Search

The Speech Corpus and Database of Japanese Dialects (SCDJD) is a recording of readings of words, phrases, sentences, and texts in Japanese dialects. The focus of the speech material is on prosody, in particular on accentual variations and, to a lesser extent, on intonation. In addition to the dialectal materials, SCDJD contains speech of the minority language Ainu, traditional Japanese singing, school children's speech, and speech by the foreign

Yasuko Nagano-madsen

315

Mel-frequency cepstral coefficient analysis in speech recognition  

Microsoft Academic Search

Speech recognition is a major topic in speech signal processing. Speech recognition is considered one of the most popular and reliable biometric technologies used in automatic personal identification systems. Speech recognition systems are used for a variety of applications such as multimedia browsing tools, access centres, security, and finance. They allow people working in active environments to use a computer. For

Chin Kim On; Paulraj M. Pandiyan; Sazali Yaacob; Azali Saudi

2006-01-01
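The MFCC front end this abstract refers to follows a standard recipe: framing and windowing, power spectrum, mel filterbank, log compression, and a DCT. As an illustration only (not the authors' implementation; frame size, hop, and filter count are arbitrary choices), a minimal NumPy sketch:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr, frame_len=400, hop=160, n_filters=26, n_ceps=13):
    n_fft = 512
    frames = [signal[s:s + frame_len] * np.hamming(frame_len)
              for s in range(0, len(signal) - frame_len + 1, hop)]
    frames = np.array(frames)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    logmel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the filterbank axis gives the cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2.0 * n_filters)))
    return logmel @ dct.T

sr = 16000
t = np.arange(sr) / sr
feats = mfcc(np.sin(2 * np.pi * 440 * t), sr)   # one second of a 440 Hz tone
print(feats.shape)
```

With a one-second signal at 16 kHz and a 25 ms frame / 10 ms hop, this yields 98 frames of 13 coefficients each.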

316

Speech systems classification based on frequency of binary word features  

Microsoft Academic Search

This paper presents a robust method to classify speech communication systems on the basis of the speech coding technique used in digitizing the speech. The method works for both clear and noisy speech conditions, when the level of noise may be unknown or known in terms of a bit-alteration level of up to 30%, and can identify coding in as short as

S. Maithant; Muiya Din

2004-01-01

317

Phonemic Characteristics of Apraxia of Speech Resulting from Subcortical Hemorrhage  

ERIC Educational Resources Information Center

Reports describing subcortical apraxia of speech (AOS) have received little consideration in the development of recent speech processing models because the speech characteristics of patients with this diagnosis have not been described precisely. We describe a case of AOS with aphasia secondary to basal ganglia hemorrhage. Speech-language symptoms…

Peach, Richard K.; Tonkovich, John D.

2004-01-01

318

Speechdat multilingual speech databases for teleservices: across the finish line  

Microsoft Academic Search

The goal of the SpeechDat project is to develop spoken language resources for speech recognisers suited to realise voice-driven teleservices. SpeechDat created speech databases for all official languages of the European Union and some major dialectal varieties and minority languages. The size of the databases ranges between 500 and 5000 speakers. In total 20 databases are recorded

Harald Höge; Christoph Draxler; Henk van den Heuvel; Finn Tore Johansen; Eric Sanders; Herbert S. Tropf

1999-01-01

319

Realization of embedded speech recognition module based on STM32  

Microsoft Academic Search

Speech recognition is the key to realizing man-machine interface technology. In order to improve the accuracy of speech recognition and implement the module on an embedded system, an embedded speaker-independent isolated-word speech recognition system based on ARM is designed after analyzing speech recognition theory. The system uses the DTW algorithm and improves the algorithm using a parallelogram to extract characteristic parameters

Qinglin Qu; Liangguang Li

2011-01-01
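The DTW matching at the heart of such isolated-word recognizers can be illustrated with a generic sketch. This is the textbook dynamic-programming algorithm, not the STM32 module's implementation or its parallelogram-constrained variant; the toy templates and labels are hypothetical:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two feature
    sequences a (n x d) and b (m x d), Euclidean local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Best of insertion, deletion, and match predecessors.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def recognize(utterance, templates):
    """Return the label of the nearest stored template under DTW."""
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))

# Toy 1-D "feature" sequences standing in for real frame features.
templates = {
    "up":   np.array([[0.0], [1.0], [2.0], [3.0]]),
    "down": np.array([[3.0], [2.0], [1.0], [0.0]]),
}
test_seq = np.array([[0.0], [0.0], [1.0], [2.0], [2.0], [3.0]])
print(recognize(test_seq, templates))  # → up
```

Because DTW warps the time axis, the slower test sequence still matches the "up" template with zero total cost.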

320

A THAI SPEECH TRANSLATION SYSTEM FOR MEDICAL DIALOGS  

Microsoft Academic Search

In this paper we present our activities towards a Thai speech-to-speech translation system. We investigated the design and implementation of a prototype system. For this purpose we carried out research on bootstrapping a Thai speech recognition system, developing a translation component, and building an initial Thai synthesis system using our existing tools. The language adaptation techniques

Tanja Schultz; Dorcas Alexander; Alan W Black; Kay Peterson; Sinaporn Suebvisai; Alex Waibel

321

Optimization in Speech-Centric Information Processing: Criteria and techniques  

Microsoft Academic Search

Automatic speech recognition (ASR) is an enabling technology for a wide range of information processing applications including speech translation, voice search (i.e., information retrieval with speech input), and conversational understanding. In these speech-centric applications, the output of ASR as

Xiaodong He; Li Deng

2012-01-01

322

Hearing smiles and smile suppression in natural speech  

Microsoft Academic Search

That we can hear smiles in speech is an established finding. However, smiles in natural speech can be of many different kinds, serving different social functions. Previous research has focused on only one category of smile, using either smiles "posed" during speech or degraded samples of smiling speech (to disguise the content of utterances). The present study used naturally occurring

Amy K. Drahota; Vasudevi Reddy

2003-01-01

323

On the Dynamics of Casual and Careful Speech.  

ERIC Educational Resources Information Center

Comparative statistical data are presented on speech dynamic (as contrasted with lexical and rhetorical) aspects of major speech styles. Representative samples of story retelling, lectures, speeches, sermons, interviews, and panel discussions serve to determine posited differences between casual and careful speech. Data are drawn from 15,393…

Hieke, A. E.

324

Monkey Lipsmacking Develops Like the Human Speech Rhythm  

ERIC Educational Resources Information Center

Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…

Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.

2012-01-01

325

The ambassador's speech: A particularly Hellenistic genre of oratory  

Microsoft Academic Search

The ambassador's speech assumed great importance during the Hellenistic period and became a distinct genre of deliberative oratory. Although there are no genuine ambassador's speeches extant, one can construct a model speech of this type by comparing ambassador's speeches in the Greek historians, especially Polybius.

Cecil W. Wooten

1973-01-01

326

The Effectiveness of Clear Speech as a Masker  

ERIC Educational Resources Information Center

Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

2010-01-01

327

Contemporary Reflections on Speech-Based Language Learning  

ERIC Educational Resources Information Center

In "The Relation of Language to Mental Development and of Speech to Language Teaching," S.G. Davidson displayed several timeless insights into the role of speech in developing language and reasons for using speech as the basis for instruction for children who are deaf and hard of hearing. His understanding that speech includes more than merely…

Gustafson, Marianne

2009-01-01

328

High quality time-scale modification for speech  

Microsoft Academic Search

We present a new and simple method for speech rate modification that yields high quality rate-modified speech. Earlier algorithms either required a significant amount of computation for good quality output speech or resulted in poor quality rate-modified speech. The algorithm we describe allows arbitrary linear or nonlinear scaling of the time axis. The algorithm operates in the time domain using

Salim Roucos

1985-01-01
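The paper's specific high-quality algorithm is not reproduced here; as a generic illustration of time-axis scaling in the time domain, a naive overlap-add (OLA) stretcher can be sketched as follows (frame and hop sizes are arbitrary assumptions, and OLA lacks the pitch synchronization that better methods add):

```python
import numpy as np

def ola_stretch(x, rate, frame=1024, hop_out=256):
    """Naive overlap-add time-scale modification: read analysis frames
    every hop_out*rate samples, write them every hop_out samples.
    rate > 1 shortens the signal, rate < 1 lengthens it."""
    hop_in = int(hop_out * rate)
    win = np.hanning(frame)
    n_frames = max((len(x) - frame) // hop_in + 1, 0)
    out = np.zeros(n_frames * hop_out + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = x[i * hop_in : i * hop_in + frame] * win
        out[i * hop_out : i * hop_out + frame] += seg
        norm[i * hop_out : i * hop_out + frame] += win
    # Divide out the summed window envelope to restore amplitude.
    return out / np.maximum(norm, 1e-8)

sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 200 * t)     # one second of a 200 Hz tone
y = ola_stretch(x, rate=0.5)        # roughly twice as long, same pitch
print(len(x), len(y))
```

Nonlinear scaling follows the same pattern by letting the analysis hop vary per frame instead of being fixed.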

329

Ahab's Speeches: Bombs or Bombastics? A Rhetorical Criticism.  

ERIC Educational Resources Information Center

In an attempt to define rhetorical discourse, the paper examines the speeches of Ahab, the main character from Herman Melville's book, "Moby-Dick." The paper first determines if Ahab's speeches actually fall into the category of rhetorical discourse by examining his major speeches, and then ascertains whether his speeches are bombs (successful…

Fadely, Dean

330

Brain-Computer Interfaces for Speech Communication  

PubMed Central

This paper briefly reviews current silent speech methodologies for normal and disabled individuals. Current techniques utilizing electromyographic (EMG) recordings of vocal tract movements are useful for physically healthy individuals but fail for tetraplegic individuals who do not have accurate voluntary control over the speech articulators. Alternative methods utilizing EMG from other body parts (e.g., hand, arm, or facial muscles) or electroencephalography (EEG) can provide capable silent communication to severely paralyzed users, though current interfaces are extremely slow relative to normal conversation rates and require constant attention to a computer screen that provides visual feedback and/or cueing. We present a novel approach to the problem of silent speech via an intracortical microelectrode brain computer interface (BCI) to predict intended speech information directly from the activity of neurons involved in speech production. The predicted speech is synthesized and acoustically fed back to the user with a delay under 50 ms. We demonstrate that the Neurotrophic Electrode used in the BCI is capable of providing useful neural recordings for over 4 years, a necessary property for BCIs that need to remain viable over the lifespan of the user. Other design considerations include neural decoding techniques based on previous research involving BCIs for computer cursor or robotic arm control via prediction of intended movement kinematics from motor cortical signals in monkeys and humans. Initial results from a study of continuous speech production with instantaneous acoustic feedback show the BCI user was able to improve his control over an artificial speech synthesizer both within and across recording sessions. The success of this initial trial validates the potential of the intracortical microelectrode-based approach for providing a speech prosthesis that can allow much more rapid communication rates.

Brumberg, Jonathan S.; Nieto-Castanon, Alfonso; Kennedy, Philip R.; Guenther, Frank H.

2010-01-01

331

Review of Visual Speech Perception by Hearing and Hearing-Impaired People: Clinical Implications  

ERIC Educational Resources Information Center

Background: Speech perception is often considered specific to the auditory modality, despite convincing evidence that speech processing is bimodal. The theoretical and clinical roles of speech-reading for speech perception, however, have received little attention in speech-language therapy. Aims: The role of speech-read information for speech

Woodhouse, Lynn; Hickson, Louise; Dodd, Barbara

2009-01-01

332

A study on noisy speech recognition  

NASA Astrophysics Data System (ADS)

This paper describes a method of deriving an effective scheme for recognizing human speech in noisy aerospace environments. An effective relaxation operator for noisy speech recognition is derived. The characteristics of this proposed scheme are: (1) the scheme can eliminate ambiguity in recognizing noisy and distorted speech and (2) the scheme is derived in a parallel processing mode and is applicable to higher-speed processing equipment. In addition, the scheme will be expandable to solve the problem of connected-word recognition. In a set of experiments, this scheme gave many positive results. Some of these results are presented.

Yamamoto, Hiromichi; Isobe, Toshio; Homma, Kohzo

333

My Speech Problem, Your Listening Problem, and My Frustration: The Experience of Living with Childhood Speech Impairment  

ERIC Educational Resources Information Center

Purpose: The purpose of this article was to understand the experience of speech impairment (speech sound disorders) in everyday life as described by children with speech impairment and their communication partners. Method: Interviews were undertaken with 13 preschool children with speech impairment (mild to severe) and 21 significant others…

McCormack, Jane; McLeod, Sharynne; McAllister, Lindy; Harrison, Linda J.

2010-01-01

334

Speech Clarity Index (?): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy  

NASA Astrophysics Data System (ADS)

It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess the speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (?) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a certain word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of ? as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations were done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that ? is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.

Kayasith, Prakasith; Theeramunkong, Thanaruk

335

78 FR 63152 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...  

Federal Register 2010, 2011, 2012, 2013

...speech-to-speech (STS), ASCII/Baudot- compatible services, Spanish-to-Spanish, and call-release. With respect to waivers that...speech-to-speech, ASCII/Baudot-compatible services, Spanish-to-Spanish, and call release. 18. VCO...

2013-10-23

336

Speech pitch determination based on Hilbert-Huang transform  

Microsoft Academic Search

Pitch determination is an essential part of speech recognition and speech processing. In this paper, a new pitch determination method based on the Hilbert-Huang Transform (HHT) is presented. The assumption of linearity of the speech-production process and short-time stationarity of speech signals, which is generally employed in recent studies on speech recognition, is no longer needed, and hence the non-linearity of

Hai Huang; Jiaqiang Pan

2006-01-01
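The HHT-based method itself is not reproduced here; for contrast, the conventional short-time autocorrelation pitch tracker, which embodies exactly the linearity and stationarity assumptions the authors set aside, can be sketched as (frame length and pitch search range are illustrative choices):

```python
import numpy as np

def autocorr_pitch(frame, sr, fmin=60.0, fmax=400.0):
    """Short-time autocorrelation pitch estimate: pick the lag with
    maximal autocorrelation inside the plausible pitch-period range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)     # lag search bounds
    lag = lo + np.argmax(ac[lo:hi + 1])
    return sr / lag

sr = 16000
t = np.arange(int(0.04 * sr)) / sr          # one 40 ms analysis frame
frame = np.sin(2 * np.pi * 200 * t)         # synthetic 200 Hz "voiced" frame
print(autocorr_pitch(frame, sr))
```

On this synthetic frame the peak lands at an 80-sample lag, i.e. about 200 Hz; real voiced speech needs extra care (voicing decisions, octave-error checks) that this sketch omits.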

337

The Use of Pitch Prediction in Speech Coding  

Microsoft Academic Search

Two major types of correlations are present in a speech signal. These are known as near-sample redundancies and distant-sample redundancies. Near-sample redundancies are those which are present among speech samples that are close together. Distant-sample redundancies are due to the inherent periodicity of voiced speech. Predictive speech coders make use of these correlations in the speech signal to enhance coding

Ravi P. Ramachandran
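The distant-sample redundancy described above is typically removed with a one-tap long-term ("pitch") predictor: choose a lag T and gain b that minimize the energy of the residual e(n) = s(n) - b*s(n-T). A minimal sketch (the lag range and the synthetic test signal are illustrative assumptions, not values from the chapter):

```python
import numpy as np

def long_term_predictor(s, lag_min=40, lag_max=120):
    """One-tap pitch predictor: for each candidate lag T, the optimal
    gain is b = <s(n), s(n-T)> / <s(n-T), s(n-T)>; keep the (T, b)
    pair with the lowest residual energy over the analysis frame."""
    best = (lag_min, 0.0, np.inf)
    for T in range(lag_min, lag_max + 1):
        cur, past = s[T:], s[:-T]
        num = np.dot(cur, past)
        den = np.dot(past, past)
        b = num / den if den > 0 else 0.0
        err = np.dot(cur, cur) - b * num    # residual energy at optimal gain
        if err < best[2]:
            best = (T, b, err)
    return best[0], best[1]

sr = 8000
t = np.arange(int(0.03 * sr)) / sr
# Periodic "voiced" frame: 100 Hz fundamental plus one harmonic,
# so the true pitch period is 80 samples.
s = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
T, b = long_term_predictor(s)
print(T, b)
```

In a real coder this long-term stage runs on the residual of the short-term (near-sample) predictor, not on the raw waveform as in this toy example.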

338

Children use visual speech to compensate for non-intact auditory speech.  

PubMed

We investigated whether visual speech fills in non-intact auditory speech (excised consonant onsets) in typically developing children from 4 to 14 years of age. Stimuli with the excised auditory onsets were presented in the audiovisual (AV) and auditory-only (AO) modes. A visual speech fill-in effect occurs when listeners experience hearing the same non-intact auditory stimulus (e.g., /-b/ag) as different depending on the presence/absence of visual speech such as hearing /bag/ in the AV mode but hearing /ag/ in the AO mode. We quantified the visual speech fill-in effect by the difference in the number of correct consonant onset responses between the modes. We found that easy visual speech cues /b/ provided greater filling in than difficult cues /g/. Only older children benefited from difficult visual speech cues, whereas all children benefited from easy visual speech cues, although 4- and 5-year-olds did not benefit as much as older children. To explore task demands, we compared results on our new task with those on the McGurk task. The influence of visual speech was uniquely associated with age and vocabulary abilities for the visual speech fill-in effect but was uniquely associated with speechreading skills for the McGurk effect. This dissociation implies that visual speech, as processed by children, is a complicated and multifaceted phenomenon underpinned by heterogeneous abilities. These results emphasize that children perceive a speaker's utterance rather than the auditory stimulus per se. In children, as in adults, there is more to speech perception than meets the ear. PMID:24974346

Jerger, Susan; Damian, Markus F; Tye-Murray, Nancy; Abdi, Hervé

2014-10-01

339

Janus-III: speech-to-speech translation in multiple languages  

Microsoft Academic Search

This paper describes JANUS-III, our most recent version of the JANUS speech-to-speech translation system. We present an overview of the system and focus on how system design facilitates speech translation between multiple languages, and allows for easy adaptation to new source and target languages. We also describe our methodology for evaluation of end-to-end system performance with a variety of source

Alon Lavie; Alex Waibel; Lori Levin; Michael Finke; Donna Gates; M. Gavalda; Torsten Zeppenfeld; Puming Zhan

1997-01-01

340

Real-time lexical competitions during speech-in-speech comprehension  

Microsoft Academic Search

This study aimed at characterizing the cognitive processes that come into play during speech-in-speech comprehension by examining lexical competitions between target speech and concurrent multi-talker babble. We investigated the effects of number of simultaneous talkers (2, 4, 6 or 8) and of the token frequency of the words that compose the babble (high or low) on lexical decision to target

Véronique Boulenger; Michel Hoen; Emmanuel Ferragne; François Pellegrino; Fanny Meunier

2010-01-01

341

Optimizing speech/non-speech classifier design using AdaBoost  

Microsoft Academic Search

We propose a new method to design speech/non-speech classifiers for voice activity detection and robust endpoint detection using the adaptive boosting (AdaBoost) algorithm. The method uses a combination of simple base classifiers through the AdaBoost algorithm and a set of optimized speech features combined with spectral subtraction. The key benefits of this method are the simple implementation and low computational

Oh-Wook Kwon; Te-Won Lee

2003-01-01
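The "combination of simple base classifiers" idea can be illustrated with standard AdaBoost over decision stumps. The two frame features here (an energy-like value and a ZCR-like value) and the synthetic data are hypothetical stand-ins, not the authors' optimized feature set:

```python
import numpy as np

def train_adaboost(X, y, n_rounds=20):
    """Minimal AdaBoost with decision stumps (threshold on one feature).
    X: (n, d) features; y: labels in {-1, +1}. Returns the ensemble as
    a list of (alpha, feature_index, threshold, sign) tuples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(d):                      # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] >= thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weak-learner weight
        pred = sign * np.where(X[:, j] >= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)          # reweight misclassified frames
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * s * np.where(X[:, j] >= t, 1, -1)
                for a, j, t, s in ensemble)
    return np.sign(score)

# Synthetic frames: "speech" = high energy / low ZCR, "noise" = the reverse.
rng = np.random.default_rng(0)
speech = np.column_stack([rng.normal(5.0, 1.0, 50), rng.normal(0.1, 0.05, 50)])
noise = np.column_stack([rng.normal(1.0, 1.0, 50), rng.normal(0.5, 0.05, 50)])
X = np.vstack([speech, noise])
y = np.concatenate([np.ones(50), -np.ones(50)])
clf = train_adaboost(X, y)
acc = (predict(clf, X) == y).mean()
print(acc)
```

Each round reweights the frames the current stump gets wrong, so later stumps concentrate on the hard boundary cases.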

342

Converging toward a common speech code: imitative and perceptuo-motor recalibration processes in speech production  

PubMed Central

Auditory and somatosensory systems play a key role in speech motor control. In the act of speaking, segmental speech movements are programmed to reach phonemic sensory goals, which in turn are used to estimate actual sensory feedback in order to further control production. The adult's tendency to automatically imitate a number of acoustic-phonetic characteristics in another speaker's speech however suggests that speech production not only relies on the intended phonemic sensory goals and actual sensory feedback but also on the processing of external speech inputs. These online adaptive changes in speech production, or phonetic convergence effects, are thought to facilitate conversational exchange by contributing to setting a common perceptuo-motor ground between the speaker and the listener. In line with previous studies on phonetic convergence, we here demonstrate, in a non-interactive situation of communication, online unintentional and voluntary imitative changes in relevant acoustic features of acoustic vowel targets (fundamental and first formant frequencies) during speech production and imitation. In addition, perceptuo-motor recalibration processes, or after-effects, occurred not only after vowel production and imitation but also after auditory categorization of the acoustic vowel targets. Altogether, these findings demonstrate adaptive plasticity of phonemic sensory-motor goals and suggest that, apart from sensory-motor knowledge, speech production continuously draws on perceptual learning from the external speech environment.

Sato, Marc; Grabski, Krystyna; Garnier, Maeva; Granjon, Lionel; Schwartz, Jean-Luc; Nguyen, Noel

2013-01-01

343

Treating visual speech perception to improve speech production in non- fluent aphasia  

PubMed Central

Background and Purpose Several recent studies have revealed modulation of the left frontal lobe speech areas not only during speech production, but also for speech perception. Crucially, the frontal lobe areas highlighted in these studies are the same ones that are involved in non-fluent aphasia. Based on these findings, this study examined the utility of targeting visual speech perception to improve speech production in non-fluent aphasia. Methods Ten patients with chronic non-fluent aphasia underwent computerized language treatment utilizing picture-word matching. To examine the effect of visual speech perception upon picture naming, two treatment phases were compared – one which included matching pictures to heard words and another where pictures were matched to heard words accompanied by a video of the speaker’s mouth presented on the computer screen. Results The results revealed significantly improved picture naming of both trained and untrained items following treatment when it included a visual speech component (i.e. seeing the speaker’s mouth). In contrast, the treatment phase where pictures were only matched to heard words did not result in statistically significant improvement of picture naming. Conclusions The findings suggest that focusing on visual speech perception can significantly improve speech production in non-fluent aphasia and may provide an alternative approach to treat a disorder where speech production seldom improves much in the chronic phase of stroke.

Fridriksson, Julius; Baker, Julie M.; Whiteside, Janet; Eoute, David; Moser, Dana; Vesselinov, Roumen; Rorden, Chris

2008-01-01

344

Phrase-level speech simulation with an airway modulation model of speech production  

PubMed Central

Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulations of words and phrases are demonstrated.

Story, Brad H.

2012-01-01

345

Coherence and the speech intelligibility index  

NASA Astrophysics Data System (ADS)

The speech intelligibility index (SII) (ANSI S3.5-1997) provides a means for estimating speech intelligibility under conditions of additive stationary noise or bandwidth reduction. The SII concept for estimating intelligibility is extended in this paper to include broadband peak-clipping and center-clipping distortion, with the coherence between the input and output signals used to estimate the noise and distortion effects. The speech intelligibility predictions using the new procedure are compared with intelligibility scores obtained from normal-hearing and hearing-impaired subjects for conditions of additive noise and peak-clipping and center-clipping distortion. The most effective procedure divides the speech signal into low-, mid-, and high-level regions, computes the coherence SII separately for the signal segments in each region, and then estimates intelligibility from a weighted combination of the three coherence SII values.

Kates, James M.; Arehart, Kathryn H.

2005-04-01
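The input–output coherence that drives this procedure can be illustrated with the Welch-averaged magnitude-squared coherence estimator in SciPy. The signals here are synthetic stand-ins for speech, and the sketch omits the paper's band-importance weighting and three-level split:

```python
import numpy as np
from scipy.signal import coherence

# Magnitude-squared coherence between a "clean" input and a degraded
# output: values near 1 mean the output is a linearly filtered version
# of the input; values below 1 indicate added noise or distortion.
rng = np.random.default_rng(1)
fs = 16000
x = rng.standard_normal(fs)                 # stand-in for a speech signal
y_clean = 0.8 * x                           # pure gain: fully coherent
y_noisy = 0.8 * x + 0.5 * rng.standard_normal(fs)   # additive noise

f, c_clean = coherence(x, y_clean, fs=fs, nperseg=512)
f, c_noisy = coherence(x, y_noisy, fs=fs, nperseg=512)
print(c_clean.mean(), c_noisy.mean())
```

For the noisy case the per-band coherence settles near the signal-to-total-power ratio (about 0.64/0.89 here), which is what makes it usable as a joint noise-and-distortion estimate.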

346

The motor theory of speech perception reviewed  

PubMed Central

More than 50 years after the appearance of the motor theory of speech perception, it is timely to evaluate its three main claims that (1) speech processing is special, (2) perceiving speech is perceiving gestures, and (3) the motor system is recruited for perceiving speech. We argue that to the extent that it can be evaluated, the first claim is likely false. As for the second claim, we review findings that support it and argue that although each of these findings may be explained by alternative accounts, the claim provides a single coherent account. As for the third claim, we review findings in the literature that support it at different levels of generality and argue that the claim anticipated a theme that has become widespread in cognitive science.

GALANTUCCI, BRUNO; FOWLER, CAROL A.; TURVEY, M. T.

2009-01-01

347

Respiratory Sinus Arrhythmia During Speech Production  

PubMed Central

The amplitude of the respiratory sinus arrhythmia (RSA) was investigated during a reading aloud task to determine whether alterations in respiratory control during speech production affect the amplitude of RSA. Changes in RSA amplitude associated with speech were evaluated by comparing RSA amplitudes during reading aloud with those obtained during rest breathing. A third condition, silent reading, was included to control for potentially confounding effects of cardiovascular responses to cognitive processes involved in the process of reading. Calibrated respiratory kinematics, electrocardiograms (ECGs), and speech audio signals were recorded from 18 adults (9 men, 9 women) during 5-min trials of each condition. The results indicated that the increases in respiratory duration, lung volume, and inspiratory velocity associated with reading aloud were accompanied by similar increases in the amplitude of RSA. This finding provides support for the premise that sensorimotor pathways mediating metabolic respiration are actively modulated during speech production.

Reilly, Kevin J.; Moore, Christopher A.

2014-01-01

348

Noise and Speech Interference: Proceedings of Minisymposium.  

National Technical Information Service (NTIS)

Several papers are presented which deal with the psychophysical effects of interference with speech and listening activities by different forms of noise masking and filtering. Special attention was given to the annoyance such interruptions cause, particul...

W. T. Shepherd

1975-01-01

349

Neural Network Classifiers for Speech Recognition.  

National Technical Information Service (NTIS)

Neural nets offer an approach to computation that mimics biological nervous systems. Algorithms based on neural nets have been proposed to address speech recognition tasks which humans perform with little apparent effort. In this reprint, neural net class...

R. S. Lippman

1988-01-01

350

Synthesis of Speech from Unrestricted Text.  

National Technical Information Service (NTIS)

It is often desirable to be able to convert arbitrary English text to natural and intelligible sounding speech. This transformation is facilitated by first obtaining the common underlying abstract linguistic representation which relates to both text and s...

J. Allen

1975-01-01

351

Noise Suppression Methods for Robust Speech Processing.  

National Technical Information Service (NTIS)

Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments for the research program funded to develop real time, compresse...

S. F. Boll; D. Pulsipher; W. Done; B. Cox; J. Kajiya

1978-01-01

352

Semi-Automated Speech Transcription System Study.  

National Technical Information Service (NTIS)

This report describes preliminary explorations towards the design of a semi-automatic transcription system. Current transcription practices were studied and are described in this report. The promising results of several speech recognition experiments as w...

J. Baker

1994-01-01

353

Megatrends in Speech Communication: Theory and Research.  

ERIC Educational Resources Information Center

Identifies shifts in the speech communication field that include bridging the diversity in communication theory and research, increasing attention to policy implications and technology factors in communication research, etc. (PD)

Littlejohn, Stephen W.

1985-01-01

354

Speech Intelligibility in Naval Aircraft Radios.  

National Technical Information Service (NTIS)

A study was made to determine how speech intelligibility in naval aircraft radio communications is affected by cockpit noise, by the microphone, helmet, and microphone used by the pilot, and by the vocabulary employed. Using six standard word lists, speec...

J. C. Webster; C. R. Allen

1972-01-01

355

Noise-immune multisensor transduction of speech  

NASA Astrophysics Data System (ADS)

Two types of configurations of multiple sensors were developed, tested and evaluated in speech recognition application for robust performance in high levels of acoustic background noise: One type combines the individual sensor signals to provide a single speech signal input, and the other provides several parallel inputs. For single-input systems, several configurations of multiple sensors were developed and tested. Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show that each of the two-sensor configurations tested outperforms the constituent individual sensors in high noise. Also presented are results comparing the performance of two-sensor configurations and individual sensors in speaker-dependent, isolated-word speech recognition tests performed using a commercial recognizer (Verbex 4000) in simulated fighter aircraft cockpit noise.

Viswanathan, Vishu R.; Henry, Claudia M.; Derr, Alan G.; Roucos, Salim; Schwartz, Richard M.

1986-08-01

356

Reading Machine: From Text to Speech.  

National Technical Information Service (NTIS)

A machine with unrestricted vocabulary, that is capable of converting printed text into connected speech in real time, would be extremely useful to blind people. The problems in implementing such a machine are mainly (1) character recognition, (2) convers...

F. F. Lee

1969-01-01

357

Parsing Conversational Speech Using Enhanced Segmentation.  

National Technical Information Service (NTIS)

The lack of sentence boundaries and presence of disfluencies pose difficulties for parsing conversational speech. This work investigates the effects of automatically detecting these phenomena on a probabilistic parser's performance. We demonstrate that a ...

C. Chelba; J. G. Kahn; M. Ostendorf

2004-01-01

358

Discriminating languages by speech-reading  

Microsoft Academic Search

The goal of this study was to explore the ability to discriminate languages using the visual correlates of speech (i.e., speech-reading). Participants were presented with silent video clips of an actor pronouncing two sentences (in Catalan and/or Spanish) and were asked to judge whether the sentences were in the same language or in different languages. Our results established that Spanish-Catalan

Salvador Soto-Faraco; Jordi Navarra; Whitney M. Weikum; Athena Vouloumanos; Núria Sebastián-Gallés; Janet F. Werker

2007-01-01

359

Speaker-independent continuous speech dictation  

Microsoft Academic Search

In this paper we report progress made at LIMSI in speaker-independent large vocabulary speech dictation using newspaper speech corpora. The rec- ognizer makes use of continuous density HMM with Gaussian mixture for acoustic modeling and n-gram statistics estimated on the newspaper texts for language modeling. Acoustic modeling uses cepstrum-based features, context- dependent phone models (intra and interword), phone duration models,

Jean-luc Gauvain; Lori Faith Lamel; Gilles Adda; Martine Adda-decker

1994-01-01

360

Switching Auxiliary Chains for Speech Recognition  

Microsoft Academic Search

This letter investigates the problem of incorporating auxiliary information, e.g., pitch, zero crossing rate (ZCR), and rate-of-speech (ROS), for speech recognition using dynamic Bayesian networks. In this letter, we propose switching auxiliary chains for exploiting different auxiliary information tailored to different phonetic states. The switching function can be specified by a priori knowledge or, more flexibly, be learned from data

Hui Lin; Zhijian Ou

2007-01-01

361

Graphical Models and Automatic Speech Recognition  

Microsoft Academic Search

Graphical models provide a promising paradigm to study both existing and novel techniques for automatic speech recognition. This paper first provides a brief overview of graphical models and their uses as statistical models. It is then shown that the statistical assumptions behind many pattern recognition techniques commonly used as part of a speech recognition system can be described by a

Jeffrey A. Bilmes

362

An unrestricted vocabulary Arabic speech synthesis system  

Microsoft Academic Search

A method for synthesizing Arabic speech has been developed which uses a reasonably sized set of subphonetic elements as the synthesis units to allow synthesis of unlimited-vocabulary speech of good quality. The synthesis units have been defined after a careful study of the phonetic properties of modern standard Arabic, and they consist of central steady-state portions of vowels, central steady-state

YOUSIF A. EL-IMAM

1989-01-01

363

Large vocabulary natural language continuous speech recognition  

Microsoft Academic Search

A description is presented of the authors' current research on automatic speech recognition of continuously read sentences from a naturally-occurring corpus: office correspondence. The recognition system combines features from their current isolated-word recognition system and from their previously developed continuous-speech recognition system. It consists of an acoustic processor, an acoustic channel model, a language model, and a linguistic decoder. Some

L. R. Bahl; R. Bakis; J. Bellegarda; P. F. Brown; D. Burshtein; S. K. Das; P. V. de Souza; P. S. Gopalakrishnan; F. Jelinek; D. Kanevsky; R. L. Mercer; A. J. Nadas; D. Nahamoo; M. A. Picheny

1989-01-01

364

Large vocabulary continuous speech recognition using HTK  

Microsoft Academic Search

HTK is a portable software toolkit for building speech recognition systems using continuous density hidden Markov models, developed by the Cambridge University Speech Group. One particularly successful type of system uses mixture density tied-state triphones. We have used this technique for the 5k/20k word ARPA Wall Street Journal (WSJ) task. We have extended our approach from using word-internal

P. C. Woodland; J. J. Odell; V. Valtchev; S. J. Young

1994-01-01

365

Neural restoration of degraded audiovisual speech  

PubMed Central

When speech is interrupted by noise, listeners often perceptually “fill-in” the degraded signal, giving an illusion of continuity and improving intelligibility. This phenomenon involves a neural process in which the auditory cortex (AC) response to onsets and offsets of acoustic interruptions is suppressed. Since meaningful visual cues behaviorally enhance this illusory filling-in, we hypothesized that during the illusion, lip movements congruent with acoustic speech should elicit a weaker AC response to interruptions relative to static (no movements) or incongruent visual speech. AC response to interruptions was measured as the power and inter-trial phase consistency of the auditory evoked theta band (4-8 Hz) activity of the electroencephalogram (EEG) and the N1 and P2 auditory evoked potentials (AEPs). A reduction in the N1 and P2 amplitudes and in theta phase-consistency reflected the perceptual illusion at the onset and/or offset of interruptions regardless of visual condition. These results suggest that the brain engages filling-in mechanisms throughout the interruption, which repairs degraded speech lasting up to ~250 ms following the onset of the degradation. Behaviorally, participants perceived greater speech continuity over longer interruptions for congruent compared to incongruent or static audiovisual streams. However, this specific behavioral profile was not mirrored in the neural markers of interest. We conclude that lip-reading enhances illusory perception of degraded speech not by altering the quality of the AC response, but by delaying it during degradations so that longer interruptions can be tolerated.

Shahin, Antoine J.; Kerlin, Jess R.; Bhat, Jyoti; Miller, Lee M.

2012-01-01

366

Motor movement matters: the flexible abstractness of inner speech  

PubMed Central

Inner speech is typically characterized as either the activation of abstract linguistic representations or a detailed articulatory simulation that lacks only the production of sound. We present a study of the ‘speech errors’ that occur during the inner recitation of tongue-twister like phrases. Two forms of inner speech were tested: inner speech without articulatory movements and articulated (mouthed) inner speech. While mouthing one’s inner speech could reasonably be assumed to require more articulatory planning, prominent theories assume that such planning should not affect the experience of inner speech and consequently the errors that are ‘heard’ during its production. The errors occurring in articulated inner speech exhibited the phonemic similarity effect and lexical bias effect, two speech-error phenomena that, in overt speech, have been localized to an articulatory-feature processing level and a lexical-phonological level, respectively. In contrast, errors in unarticulated inner speech did not exhibit the phonemic similarity effect—just the lexical bias effect. The results are interpreted as support for a flexible abstraction account of inner speech. This conclusion has ramifications for the embodiment of language and speech and for the theories of speech production.

Oppenheim, Gary M.; Dell, Gary S.

2010-01-01

367

Perceptual Bias in Speech Error Data Collection: Insights from Spanish Speech Errors  

ERIC Educational Resources Information Center

This paper studies the reliability and validity of naturalistic speech errors as a tool for language production research. Possible biases when collecting naturalistic speech errors are identified and specific predictions derived. These patterns are then contrasted with published reports from Germanic languages (English, German and Dutch) and one…

Perez, Elvira; Santiago, Julio; Palma, Alfonso; O'Seaghdha, Padraig G.

2007-01-01

368

Superhuman multi-talker speech recognition: the IBM 2006 speech separation challenge system  

Microsoft Academic Search

We describe a system for model based speech separation which achieves super-human recognition performance when two talkers speak at similar levels. The system can separate the speech of two speakers from a single channel recording with remarkable results. It incorporates a novel method for performing two-talker speaker identification and gain estimation. We extend the method of model based high resolution

Trausti T. Kristjansson; John R. Hershey; Peder A. Olsen; Steven J. Rennie; Ramesh A. Gopinath

2006-01-01

369

Speech Motor Programming in Apraxia of Speech: Evidence from a Delayed Picture-Word Interference Task  

ERIC Educational Resources Information Center

Purpose: Apraxia of speech (AOS) is considered a speech motor programming impairment, but the specific nature of the impairment remains a matter of debate. This study investigated 2 hypotheses about the underlying impairment in AOS framed within the Directions Into Velocities of Articulators (DIVA; Guenther, Ghosh, & Tourville, 2006) model: The…

Mailend, Marja-Liisa; Maas, Edwin

2013-01-01

370

Autonomic and Emotional Responses of Graduate Student Clinicians in Speech-Language Pathology to Stuttered Speech  

ERIC Educational Resources Information Center

Background: Fluent speakers and people who stutter manifest alterations in autonomic and emotional responses as they view stuttered relative to fluent speech samples. These reactions are indicative of an aroused autonomic state and are hypothesized to be triggered by the abrupt breakdown in fluency exemplified in stuttered speech. Furthermore,…

Guntupalli, Vijaya K.; Nanjundeswaran, Chayadevie; Dayalu, Vikram N.; Kalinowski, Joseph

2012-01-01

371

Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders  

ERIC Educational Resources Information Center

Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

Klein, Harriet B.; Liu-Shea, May

2009-01-01

372

Dramatic Effects of Speech Task on Motor and Linguistic Planning in Severely Dysfluent Parkinsonian Speech  

ERIC Educational Resources Information Center

In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…

Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

2012-01-01

373

Role of binaural hearing in speech intelligibility and spatial release from masking using vocoded speech.  

PubMed

A cochlear implant vocoder was used to evaluate relative contributions of spectral and binaural temporal fine-structure cues to speech intelligibility. In Study I, stimuli were vocoded, and then convolved through head related transfer functions (HRTFs) to remove speech temporal fine structure but preserve the binaural temporal fine-structure cues. In Study II, the order of processing was reversed to remove both speech and binaural temporal fine-structure cues. Speech reception thresholds (SRTs) were measured adaptively in quiet, and with interfering speech, for unprocessed and vocoded speech (16, 8, and 4 frequency bands), under binaural or monaural (right-ear) conditions. Under binaural conditions, as the number of bands decreased, SRTs increased. With decreasing number of frequency bands, greater benefit from spatial separation of target and interferer was observed, especially in the 8-band condition. The present results demonstrate a strong role of the binaural cues in spectrally degraded speech, when the target and interfering speech are more likely to be confused. The nearly normal binaural benefits under present simulation conditions and the lack of order of processing effect further suggest that preservation of binaural cues is likely to improve performance in bilaterally implanted recipients. PMID:19894832

Garadat, Soha N; Litovsky, Ruth Y; Yu, Gongqiang; Zeng, Fan-Gang

2009-11-01
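As a hedged illustration of the vocoder processing described in the record above, here is a minimal noise-excited vocoder sketch in Python/NumPy. The band count, logarithmic band edges, and frame-RMS envelope extraction are illustrative assumptions, and the HRTF convolution step used in the study is omitted:

```python
import numpy as np

def vocode(x, fs, n_bands=8, frame=256):
    """Minimal noise-excited vocoder sketch: split the signal into
    n_bands frequency bands, extract each band's frame-wise RMS
    envelope, and use the envelopes to modulate band-limited noise,
    discarding the temporal fine structure."""
    rng = np.random.default_rng(0)
    # Illustrative log-spaced band edges from 100 Hz to Nyquist.
    edges = np.logspace(np.log10(100), np.log10(fs / 2), n_bands + 1)
    out = np.zeros_like(x, dtype=float)
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(X * mask, len(x))
        # Frame-wise RMS envelope of this band.
        env = np.repeat(
            [np.sqrt(np.mean(band[i:i + frame] ** 2))
             for i in range(0, len(x), frame)], frame)[:len(x)]
        # Band-limited noise carrier modulated by the envelope.
        carrier = np.fft.irfft(np.fft.rfft(rng.standard_normal(len(x))) * mask,
                               len(x))
        out += env * carrier
    return out
```

Because only the slowly varying envelopes survive, reducing `n_bands` degrades spectral detail in the same spirit as the 16-, 8-, and 4-band conditions in the study.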

374

Visual Speech Feature Extraction From Natural Speech for Multi-modal ASR.  

National Technical Information Service (NTIS)

Improving the accuracy of speech recognition technology by addition of visual information is the key approach to multi-modal ASR research. In this work, we address two important issues, which are lip tracking and the visual speech feature extraction algor...

J. N. Gowdy; S. Gurbuz

2002-01-01

375

Speech intelligibility during respirator wear: influences of respirator speech diaphragm size and background noise.  

PubMed

This study assessed the effect of respirator speech device size on speech intelligibility and the impact of background noise on respirator communications effectiveness. Thirty-five subjects completed modified rhyme test (MRT) speech intelligibility testing procedures with and without a respirator under background noises of 40, 60, and 80 dBA. Respirator wear conditions included the use of one unmodified and three mechanical speech diaphragms modified to reduce the surface area of the vibrating inner membrane available for sound transmission. Average MRT scores decreased linearly as background noise levels increased for all conditions. Lower MRT scores were observed for all respirator speech diaphragm conditions compared to the nonrespirator condition within each noise category. Average MRT scores differed significantly between the unmodified speech diaphragm and one with a 70% reduced surface area with a 40-dBA background noise. However, MRT scores were similar between the modified and unmodified diaphragms at both the 60- and 80-dBA noise levels. These findings provide evidence that alternate designs of mechanical-type respirator speech devices can be achieved without further degradation of speech sound transmission. PMID:14674794

Caretti, David M; Strickler, Linda C

2003-01-01

376

Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses  

ERIC Educational Resources Information Center

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

Foundation for Individual Rights in Education (NJ1), 2010

2010-01-01

377

Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses  

ERIC Educational Resources Information Center

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

Foundation for Individual Rights in Education (NJ1), 2011

2011-01-01

378

Plasticity in the human speech motor system drives changes in speech perception.  

PubMed

Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594

Lametti, Daniel R; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M; Ostry, David J

2014-07-30

379

Design of speech illumina mentor (SIM) for teaching speech to the hearing impaired  

Microsoft Academic Search

A computer based speech training program named SIM (Speech Illumina Mentor) has been developed targeted mainly for use by young school-aged hearing impaired children. The program has two major components: a training module which presents the child with examples of stored articulations and a practice module which features a voice recognition system to evaluate elicited productions in a game format.

A. J. A. Soleymani; M. J. McCutcheon; M. H. Southwood

1997-01-01

380

WSJCAMO: a British English speech corpus for large vocabulary continuous speech recognition  

Microsoft Academic Search

A significant new speech corpus of British English has been recorded at Cambridge University. Derived from the Wall Street Journal text corpus, WSJCAMO constitutes one of the largest corpora of spoken British English currently in existence. It has been specifically designed for the construction and evaluation of speaker-independent speech recognition systems. The database consists of 140 speakers each speaking about

Tony Robinson; Jeroen Fransen; David Pye; Jonathan Foote; Steve Renals

1995-01-01

381

Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses  

ERIC Educational Resources Information Center

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…

Foundation for Individual Rights in Education (NJ1), 2009

2009-01-01

382

Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.  

PubMed

Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes. PMID:23998347

Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

2014-07-01

383

Speech processing based on short-time Fourier analysis  

SciTech Connect

Short-time Fourier analysis (STFA) is a mathematical technique that represents nonstationary signals, such as speech, music, and seismic signals in terms of time-varying spectra. This representation provides a formalism for such intuitive notions as time-varying frequency components and pitch contours. Consequently, STFA is useful for speech analysis and speech processing. This paper shows that STFA provides a convenient technique for estimating and modifying certain perceptual parameters of speech. As an example of an application of STFA of speech, the problem of time-compression or expansion of speech, while preserving pitch and time-varying frequency content is presented.

Portnoff, M.R.

1981-06-02
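As a hedged sketch of the STFA-based time-compression/expansion application mentioned above, the following minimal phase vocoder stretches or compresses speech while preserving pitch and time-varying frequency content. Frame length, hop size, and Hann windowing are assumptions for illustration, not parameters from the paper:

```python
import numpy as np

def stretch(x, rate, n_fft=1024, hop=256):
    """Phase-vocoder time-scale modification via short-time Fourier
    analysis. rate > 1 compresses, rate < 1 expands; pitch is preserved
    because frame spectra are resampled in time, not in frequency."""
    win = np.hanning(n_fft)
    # Analysis: short-time Fourier transform with hop-sized steps.
    frames = np.array([np.fft.rfft(win * x[i:i + n_fft])
                       for i in range(0, len(x) - n_fft, hop)])
    # Resample frame positions at the new rate, accumulating phase so
    # the time-varying frequency components stay coherent.
    t = np.arange(0, len(frames) - 1, rate)
    phase = np.angle(frames[0])
    out = np.zeros(int(len(t) * hop + n_fft))
    for k, pos in enumerate(t):
        i = int(pos)
        a, b = frames[i], frames[i + 1]
        # Interpolate magnitude; advance phase by the measured increment.
        frac = pos - i
        mag = (1 - frac) * np.abs(a) + frac * np.abs(b)
        frame = np.fft.irfft(mag * np.exp(1j * phase))
        phase += np.angle(b) - np.angle(a)
        # Synthesis: windowed overlap-add at the fixed hop size.
        out[k * hop:k * hop + n_fft] += win * frame
    return out
```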

384

Study on achieving speech privacy using masking noise  

NASA Astrophysics Data System (ADS)

This study focuses on achieving speech privacy using a meaningless steady masking noise. The most effective index for achieving a satisfactory level of speech privacy was selected from two candidates, spectral distance and the articulation index; based on the results, spectral distance was chosen as the most practical. Next, speech, together with masking noise at sound pressure levels corresponding to various speech privacy levels, was presented to subjects, who judged the psychological impression of the particular speech privacy level. Theoretical calculations were in good agreement with the experimental results.

Tamesue, Takahiro; Yamaguchi, Shizuma; Saeki, Tetsuro

2006-11-01

385

Recent developments in speech motor research into stuttering.  

PubMed

This paper discusses recent speech motor research into stuttering within the framework of a speech production model. There seems to be no support for the claim that stutterers differ from nonstutterers in assembling motor plans for speech. However, physiological data suggest that stutterers may at least have different ways of initiating and controlling speech movements. It is hypothesized that stuttering may be the result of a deficiency in speech motor skill. Furthermore, objections to the use of stuttering frequency as a severity index are formulated and future developments in the assessment of speech motor behavior in stuttering are described. PMID:10474010

Peters, H F; Hulstijn, W; Van Lieshout, P H

2000-01-01

386

Segmenting words from natural speech: subsegmental variation in segmental cues.  

PubMed

Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We use this new representation to re-evaluate a key computational model of word segmentation. One finding is that high levels of phonetic variability degrade the model's performance. While robustness to phonetic variability may be intrinsically valuable, this finding needs to be complemented by parallel studies of the actual abilities of children to segment phonetically variable speech. PMID:20307345

Rytting, C Anton; Brew, Chris; Fosler-Lussier, Eric

2010-06-01

387

Speech Pathology in Ancient India--A Review of Sanskrit Literature.  

ERIC Educational Resources Information Center

The paper is a review of ancient Sanskrit literature for information on the origin and development of speech and language, speech production, normality of speech and language, and disorders of speech and language and their treatment. (DB)

Savithri, S. R.

1987-01-01

388

42 CFR 485.715 - Condition of participation: Speech pathology services.  

Code of Federal Regulations, 2010 CFR

...Condition of participation: Speech pathology services. 485.715 Section 485...Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology...

2010-10-01

389

42 CFR 485.715 - Condition of participation: Speech pathology services.  

Code of Federal Regulations, 2013 CFR

...Condition of participation: Speech pathology services. 485.715 Section 485...Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology...

2013-10-01

390

42 CFR 485.715 - Condition of participation: Speech pathology services.  

Code of Federal Regulations, 2010 CFR

...Condition of participation: Speech pathology services. 485.715 Section 485...Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology...

2009-10-01

391

Cortical entrainment to continuous speech: functional roles and interpretations  

PubMed Central

Auditory cortical activity is entrained to the temporal envelope of speech, which corresponds to the syllabic rhythm of speech. Such entrained cortical activity can be measured from subjects naturally listening to sentences or spoken passages, providing a reliable neural marker of online speech processing. A central question still remains to be answered about whether cortical entrained activity is more closely related to speech perception or non-speech-specific auditory encoding. Here, we review a few hypotheses about the functional roles of cortical entrainment to speech, e.g., encoding acoustic features, parsing syllabic boundaries, and selecting sensory information in complex listening environments. It is likely that speech entrainment is not a homogeneous response and these hypotheses apply separately for speech entrainment generated from different neural sources. The relationship between entrained activity and speech intelligibility is also discussed. A tentative conclusion is that theta-band entrainment (4–8 Hz) encodes speech features critical for intelligibility while delta-band entrainment (1–4 Hz) is related to the perceived, non-speech-specific acoustic rhythm. To further understand the functional properties of speech entrainment, a splitter’s approach will be needed to investigate (1) not just the temporal envelope but what specific acoustic features are encoded and (2) not just speech intelligibility but what specific psycholinguistic processes are encoded by entrained cortical activity. Similarly, the anatomical and spectro-temporal details of entrained activity need to be taken into account when investigating its functional properties.

Ding, Nai; Simon, Jonathan Z.

2014-01-01

392

Robust speech coding using microphone arrays  

NASA Astrophysics Data System (ADS)

To achieve robustness and efficiency for voice communication in noise, the noise suppression and bandwidth compression processes are combined to form a joint process using input from an array of microphones. An adaptive beamforming technique with a set of robust linear constraints and a single quadratic inequality constraint is used to preserve the desired signal and to cancel directional plus ambient noise in a small room environment. This robustly constrained array processor is found to be effective in limiting signal cancelation over a wide range of input SNRs (-10 dB to +10 dB). The resulting intelligibility gains (8-10 dB) provide significant improvement to subsequent CELP coding. In addition, the desired speech activity is detected by estimating Target-to-Jammer Ratios (TJR) using subband correlations between different microphone inputs or using signals within the Generalized Sidelobe Canceler directly. These two novel techniques of speech activity detection for coding are studied thoroughly in this dissertation. Each is subsequently incorporated with the adaptive array and a 4.8 kbps CELP coder to form a Variable Bit Rate (VBR) coder with noise canceling and Spatial Voice Activity Detection (SVAD) capabilities. This joint noise suppression and bandwidth compression system demonstrates large improvements in desired speech quality after coding, accurate desired speech activity detection in various types of interference, and a reduction in the information bits required to code the speech.

Li, Zhao

1998-09-01
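The constrained adaptive beamformer described above is elaborate; as a hedged illustration of the basic array-processing idea only, here is a delay-and-sum sketch. The steering delays toward the desired talker are assumed known, and the adaptive constraints and CELP coding stages are omitted:

```python
import numpy as np

def delay_and_sum(mics, delays_samples):
    """Delay-and-sum beamforming: align each microphone signal by its
    steering delay toward the look direction, then average. Coherent
    speech from the look direction adds in phase, while off-axis noise
    adds incoherently and is attenuated."""
    aligned = [np.roll(m, -d) for m, d in zip(mics, delays_samples)]
    return np.mean(aligned, axis=0)
```

Note that `np.roll` wraps at the array edges; a real implementation would pad or use fractional-delay filters instead.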

393

Distinct developmental profiles in typical speech acquisition  

PubMed Central

Three- to five-year-old children produce speech that is characterized by a high level of variability within and across individuals. This variability, which is manifest in speech movements, acoustics, and overt behaviors, can be input to subgroup discovery methods to identify cohesive subgroups of speakers or to reveal distinct developmental pathways or profiles. This investigation characterized three distinct groups of typically developing children and provided normative benchmarks for speech development. These speech development profiles, identified among 63 typically developing preschool-aged speakers (ages 36–59 mo), were derived from the children's performance on multiple measures. The profiles were obtained by submitting to a k-means cluster analysis 72 measures spanning three levels of speech analysis: behavioral (e.g., task accuracy, percentage of consonants correct), acoustic (e.g., syllable duration, syllable stress), and kinematic (e.g., variability of movements of the upper lip, lower lip, and jaw). Two of the discovered group profiles were distinguished by measures of variability but not by phonemic accuracy; the third group of children was characterized by relatively low phonemic accuracy but not by an increase in measures of variability. Analyses revealed that of the original 72 measures, 8 key measures were sufficient to best distinguish the 3 profile groups.

Campbell, Thomas F.; Shriberg, Lawrence D.; Green, Jordan R.; Abdi, Herve; Rusiewicz, Heather Leavy; Venkatesh, Lakshmi; Moore, Christopher A.

2012-01-01

394

Auditory perception bias in speech imitation  

PubMed Central

In an experimental study, we explored the role of auditory perception bias in vocal pitch imitation. Psychoacoustic tasks involving a missing fundamental indicate that some listeners are attuned to the relationship between all the higher harmonics present in the signal, which supports their perception of the fundamental frequency (the primary acoustic correlate of pitch). Other listeners focus on the lowest harmonic constituents of the complex sound signal which may hamper the perception of the fundamental. These two listener types are referred to as fundamental and spectral listeners, respectively. We hypothesized that the individual differences in speakers' capacity to imitate F0 found in earlier studies, may at least partly be due to the capacity to extract information about F0 from the speech signal. Participants' auditory perception bias was determined with a standard missing fundamental perceptual test. Subsequently, speech data were collected in a shadowing task with two conditions, one with a full speech signal and one with high-pass filtered speech above 300 Hz. The results showed that perception bias toward fundamental frequency was related to the degree of F0 imitation. The effect was stronger in the condition with high-pass filtered speech. The experimental outcomes suggest advantages for fundamental listeners in communicative situations where F0 imitation is used as a behavioral cue. Future research needs to determine to what extent auditory perception bias may be related to other individual properties known to improve imitation, such as phonetic talent.

Postma-Nilsenova, Marie; Postma, Eric

2013-01-01

395

Speech reception thresholds in various interference conditions  

NASA Astrophysics Data System (ADS)

Speech intelligibility is integral to human verbal communication; however, our understanding of the effects of competing noise, room reverberation, and frequency range restriction is incomplete. Using virtual stimuli, the dependence of intelligibility threshold levels on the extent of room reverberation, the relative locations of speech target and masking noise, and the available frequency content of the speech and the masking noise is explored. Speech-shaped masking noise and target sentences have three spectral conditions: wideband, high pass above 2-kHz, and low pass below 2-kHz. The 2-kHz cutoff was chosen to approximately bisect the range of frequencies most important in speech, and the high pass noise condition simulates high-frequency hearing loss. Reverberation conditions include a pseudo-anechoic case, a moderately reverberant ``classroom'' case, and a very reverberant ``bathroom'' case. Both binaural and monaural intelligibility are measured. Preliminary results show that source separation decreases thresholds, reverberation increases thresholds, and low frequency noise reverberates more in the rooms, contributing to increasing thresholds along with the effects of the upward spread of masking. The energetic effects of reverberation are explored. [Work supported by NIH DC00100.]

Carr, Suzanne P.; Colburn, H. Steven

2001-05-01

396

Recognizing hesitation phenomena in continuous, spontaneous speech  

NASA Astrophysics Data System (ADS)

Spontaneous speech differs from read speech in speaking rate and hesitation. In natural, spontaneous speech, people often start talking and then think along the way; at times, this causes the speech to have hesitation pauses (both filled and unfilled) and restarts. Results are reported on all types of pauses in a widely-used speech database, for both hesitation pauses and semi-intentional pauses. A distinction is made between grammatical pauses (at major syntactic boundaries) and ungrammatical ones. Different types of unfilled pauses cannot be reliably separated based on silence duration, although grammatical pauses tend to be longer. In the prepausal word before ungrammatical pauses, there were few continuation rises in pitch, whereas 80 percent of the grammatical pauses were accompanied by a prior fundamental frequency rise of 10-40 Hz. Identifying the syntactic function of such hesitation phenomena can improve recognition performance by eliminating from consideration some of the hypotheses proposed by an acoustic recognizer. Results presented allow simple identification of filled pauses (such as uhh, umm) and their syntactic function.

O'Shaughnessy, Douglas

397

Speech Enhancement Using Gaussian Scale Mixture Models  

PubMed Central

This paper presents a novel probabilistic approach to speech enhancement. Instead of a deterministic logarithmic relationship, we assume a probabilistic relationship between the frequency coefficients and the log-spectra. The speech model in the log-spectral domain is a Gaussian mixture model (GMM). The frequency coefficients obey a zero-mean Gaussian whose covariance equals the exponential of the log-spectra. This results in a Gaussian scale mixture model (GSMM) for the speech signal in the frequency domain, since the log-spectra can be regarded as scaling factors. The probabilistic relation between frequency coefficients and log-spectra allows these to be treated as two random variables, both to be estimated from the noisy signals. Expectation-maximization (EM) was used to train the GSMM, and Bayesian inference was used to compute the posterior signal distribution. Because exact inference of this full probabilistic model is computationally intractable, we developed two approaches to enhance the efficiency: the Laplace method and a variational approximation. The proposed methods were applied to enhance speech corrupted by Gaussian noise and speech-shaped noise (SSN). For both approximations, signals reconstructed from the estimated frequency coefficients provided higher signal-to-noise ratio (SNR), and those reconstructed from the estimated log-spectra produced lower word recognition error rates because the log-spectra fit the inputs to the recognizer better. Our algorithms effectively reduced the SSN, which algorithms based on spectral analysis were not able to suppress.

Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.

2011-01-01
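
The generative model the abstract describes can be sketched as follows; all GMM parameters here are hypothetical placeholders (the actual system learns them from training speech via EM):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GMM over log-spectra: K components, D frequency bins.
K, D = 3, 8
weights = np.array([0.5, 0.3, 0.2])            # mixture weights
means = rng.normal(0.0, 1.0, size=(K, D))      # component means of log-spectra
stds = np.full((K, D), 0.25)                   # component std devs

def sample_gsmm(n_frames):
    """Draw frames from the Gaussian scale mixture model:
    log-spectrum s ~ GMM, frequency coefficient x ~ N(0, exp(s))."""
    frames = np.empty((n_frames, D))
    for i in range(n_frames):
        k = rng.choice(K, p=weights)                   # pick mixture component
        s = rng.normal(means[k], stds[k])              # sample a log-spectrum
        frames[i] = rng.normal(0.0, np.exp(0.5 * s))   # variance = exp(s)
    return frames

x = sample_gsmm(1000)
```

Because the variance of each coefficient is the exponential of a random log-spectrum, the log-spectra act exactly as the "scaling factors" the abstract mentions.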

398

Speech Production as State Feedback Control  

PubMed Central

Spoken language exists because of a remarkable neural process. Inside a speaker's brain, an intended message gives rise to neural signals activating the muscles of the vocal tract. The process is remarkable because these muscles are activated in just the right way that the vocal tract produces sounds a listener understands as the intended message. What is the best approach to understanding the neural substrate of this crucial motor control process? One of the key recent modeling developments in neuroscience has been the use of state feedback control (SFC) theory to explain the role of the CNS in motor control. SFC postulates that the CNS controls motor output by (1) estimating the current dynamic state of the thing (e.g., arm) being controlled, and (2) generating controls based on this estimated state. SFC has successfully predicted a great range of non-speech motor phenomena, but as yet has not received attention in the speech motor control community. Here, we review some of the key characteristics of speech motor control and what they say about the role of the CNS in the process. We then discuss prior efforts to model the role of CNS in speech motor control, and argue that these models have inherent limitations – limitations that are overcome by an SFC model of speech motor control which we describe. We conclude by discussing a plausible neural substrate of our model.

Houde, John F.; Nagarajan, Srikantan S.

2011-01-01

399

Irrelevant speech effects and statistical learning.  

PubMed

Immediate serial recall of visually presented verbal stimuli is impaired by the presence of irrelevant auditory background speech, the so-called irrelevant speech effect. Two of the three main accounts of this effect place restrictions on when it will be observed, limiting its occurrence either to items processed by the phonological loop (the phonological loop hypothesis) or to items that are not too dissimilar from the irrelevant speech (the feature model). A third, the object-oriented episodic record (O-OER) model, requires only that the memory task involves seriation. The present studies test these three accounts by examining whether irrelevant auditory speech will interfere with a task that does not involve the phonological loop, does not use stimuli that are compatible with those to be remembered, but does require seriation. Two experiments found that irrelevant speech led to lower levels of performance in a visual statistical learning task, offering more support for the O-OER model and posing a challenge for the other two accounts. PMID:19370483

Neath, Ian; Guérard, Katherine; Jalbert, Annie; Bireta, Tamra J; Surprenant, Aimée M

2009-08-01

400

Electroglottographic and perceptual evaluation of tracheoesophageal speech.  

PubMed

To optimize tracheoesophageal (TO) speech after total laryngectomy, it is vital to have a robust tool of assessment to help investigate deficiencies, document changes, and facilitate therapy. We sought to evaluate and validate electroglottography (EGG) as an important tool in the multidimensional assessment of TO speech. This study is a cross-sectional study of the largest cohort of TO speakers treated by a single surgeon. A second group of normal laryngeal speakers served as a control group. EGG analysis of both groups using connected speech and sustained vowels was performed. Two trained expert raters undertook perceptual evaluation using two accepted scales. EGG measures were then analyzed for correlation with treatment variables. A separate correlation analysis was performed to identify EGG measures that may be associated with perceptual dimensions. Our data from EGG analysis are similar to data obtained from conventional acoustic signal analysis of TO speakers. Sustained vowel and connected speech parameters were poorer in TO speakers than in normal laryngeal speakers. In perceptual evaluation, only grade (G) of the GRBAS scale and Overall Voice Quality appeared reproducible and reliable. T stage, pharyngeal reconstruction and method of closure, cricopharyngeal myotomy, and postoperative complications appear to be correlated with the EGG measures. Five voice measures (jitter, shimmer, average frequency, normalized noise energy, and irregularity) correlated well with the key dimensions of perceptual assessment. EGG is an important assessment tool of TO speech, and can now be reliably used in a clinical setting. PMID:17490856

Kazi, Rehan; Kanagalingam, Jeeve; Venkitaraman, Ramachandran; Prasad, Vyas; Clarke, Peter; Nutting, Christopher M; Rhys-Evans, Peter; Harrington, Kevin J

2009-03-01

401

Music and speech prosody: a common rhythm  

PubMed Central

Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress).

Hausen, Maija; Torppa, Ritva; Salmela, Viljami R.; Vainio, Martti; Sarkamo, Teppo

2013-01-01

402

Tuned to the Signal: The Privileged Status of Speech for Young Infants  

ERIC Educational Resources Information Center

Do young infants treat speech as a special signal, compared with structurally similar non-speech sounds? We presented 2- to 7-month-old infants with nonsense speech sounds and complex non-speech analogues. The non-speech analogues retain many of the spectral and temporal properties of the speech signal, including the pitch contour information…

Vouloumanos, Athena; Werker, Janet F.

2004-01-01

403

An integrated approach to improving noisy speech perception  

NASA Astrophysics Data System (ADS)

For a number of practical purposes and tasks, experts have to decode speech recordings of very poor quality. A combination of techniques is proposed to improve the intelligibility and quality of distorted speech messages and thus facilitate their comprehension. Along with the application of noise cancellation and speech signal enhancement techniques that remove and/or reduce various kinds of distortion and interference (primarily unmasking and normalization in the time and frequency domains), the approach incorporates optimal listener expert tactics based on selective listening, nonstandard binaural listening, accounting for short-term and long-term human ear adaptation to noisy speech, as well as some methods of speech signal enhancement to support speech decoding during listening. The approach integrating the suggested techniques ensures high-quality final results and has successfully been applied by Speech Technology Center experts and by numerous other users, mainly forensic institutions, to perform noisy speech record decoding for courts, law enforcement and emergency services, accident investigation bodies, etc.

Koval, Serguei; Stolbov, Mikhail; Smirnova, Natalia; Khitrov, Mikhail

2002-05-01

404

Speech-enabled augmented reality supporting mobile industrial maintenance  

Microsoft Academic Search

The SEAR (speech-enabled AR) framework uses flexible and scalable vision-based localization techniques to offer maintenance technicians a seamless multimodal user interface. The user interface juxtaposes a graphical AR view with a context-sensitive speech dialogue.

Stuart Goose; Sandra Sudarsky; Xiang Zhang; Nassir Navab

2003-01-01

405

Status Report on Speech Research, 1 January-31 March 1982.  

National Technical Information Service (NTIS)

This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation and practical applications. Manuscripts cover the following topics: Speech perception and memory coding in relation...

A. M. Liberman

1982-01-01

406

Speech Articulation Disorders Among Military Dependent School Children.  

National Technical Information Service (NTIS)

This report presents the results of a survey pertaining to speech articulation disorders among military dependent children attending the Ayer, Massachusetts, public school system in grades 1-6. Report indicates a higher speech articulation disorder rate a...

D. E. Gordon

1970-01-01

407

Manpower Resources and Needs in Speech Pathology/Audiology.  

National Technical Information Service (NTIS)

The study describes the demographic and professional characteristics of the speech pathology/audiology work force and those students currently in training; identifies current patterns of manpower utilization in speech pathology and audiology; estimates th...

1974-01-01

408

Getting Your Employer to Cover Speech, Language and Hearing Services  

MedlinePLUS

... shows that 82% of Fortune 1000 companies cover speech-language pathology and audiology services. Some employers may ask why ... clinics, hospitals, health departments, and private practices provide speech-language pathology and audiology services beyond what may be available ...

409

Adding Speech, Language, and Hearing Benefits to Your Policy  

MedlinePLUS

... Benefits Employers, Insurers, and Labor Unions: Adding Speech, Language, and Hearing Benefits to Your Policy This section ... and labor unions-with information on adding speech, language, and hearing benefits to your health insurance policy. ...

410

On the quality of synthetic speech evaluation and improvements  

NASA Astrophysics Data System (ADS)

Human letter-to-sound conversion is modeled in a text-to-speech system. The purpose of the study is to assess some limitations of Linear Predictive Coding (LPC) as a scheme for speech analysis, manipulation, and synthesis, and to explore ways to remove some of the drawbacks of LPC. The intelligibility of synthetic speech, an important attribute of speech quality, is studied. A perception experiment was conducted in which an articulation test without noise and a Monosyllabic Adaptive Speech Interference Test (MASIT) were used to evaluate the intelligibility of nine different speech coding schemes. The naturalness of synthetic speech is studied. The preservation of the identity of a speaker, as a feature of a high-quality speech coding scheme, is studied. The relative importance of coded vocal tract and voice source information for perceived speaker identity is investigated.

Eggen, Josephus Hubertus

1992-06-01

411

What Is Voice? What Is Speech? What Is Language?  

MedlinePLUS

... What Is Voice? What Is Speech? What Is Language? On this page: Voice Speech Language Where can ... may occur in children who have developmental disabilities. Language Language is the expression of human communication through ...

412

Enhancement of Speech Intelligibility for the Hearing Impaired.  

National Technical Information Service (NTIS)

A set of computer programs was tested that can noticeably improve speech intelligibility for the hearing impaired. The processing first emphasizes speech features by removing noise and pitch irregularities from vowels and by adaptively enhancing the chara...

J. M. Kates

1981-01-01

413

"Thoughts Concerning Education": John Locke On Teaching Speech  

ERIC Educational Resources Information Center

Locke's suggestions for more effective speech instruction have gone largely unnoticed. Consequently, it is the purpose of this article to consider John Locke's criticisms, theory and specific methods of speech education. (Author)

Baird, John E.

1971-01-01

414

Effects of Interior Aircraft Noise on Speech Intelligibility and Annoyance.  

National Technical Information Service (NTIS)

Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceiv...

K. S. Pearsons R. L. Bennett

1977-01-01

415

Some Effects of Training on the Perception of Synthetic Speech  

PubMed Central

The present study was conducted to determine the effects of training on the perception of synthetic speech. Three groups of subjects were tested with synthetic speech using the same tasks before and after training. One group was trained with synthetic speech. A second group went through the identical training procedures using natural speech. The third group received no training. Although performance of the three groups was the same prior to training, significant differences on the post-test measures of word recognition were observed: the group trained with synthetic speech performed much better than the other two groups. A six-month follow-up indicated that the group trained with synthetic speech displayed long-term retention of the knowledge and experience gained with prior exposure to synthetic speech generated by a text-to-speech system.

Schwab, Eileen C.; Nusbaum, Howard C.; Pisoni, David B.

2012-01-01

416

A Rating of Doctoral Programs in Speech Communication, 1976  

ERIC Educational Resources Information Center

Reviews a survey evaluation of speech communication doctoral programs existing in 1976. Available from: ACA Bulletin, Robert Hall, Editor, Speech Communication Association, 5205 Leesburg Pike, Suite 1001, Falls Church, VA 22041. (MH)

Edwards, Renee; Barker, Larry

1977-01-01

417

Graduate Programs in Speech Communication: A Position Paper  

ERIC Educational Resources Information Center

Details a position paper concerning the major focus of graduate programs in speech communication. Available from: ACA Bulletin, Robert Hall, Editor, Speech Communication Association, 5205 Leesburg Pike, Suite 1001, Falls Church, VA 22041. (MH)

Goldberg, Alvin A.

1977-01-01

418

Parsing the phonological loop: Activation timing in the dorsal speech stream determines accuracy in speech reproduction  

PubMed Central

Summary Despite significant research and important clinical correlates, direct neural evidence for a phonological loop linking speech perception, short-term memory and production remains elusive. To investigate these processes, we acquired whole-head magnetoencephalographic (MEG) recordings from human subjects performing a variable-length syllable sequence reproduction task. The MEG sensor data was source-localized using a time-frequency spatially adaptive filter, and we examined the time-courses of cortical oscillatory power and the correlations of oscillatory power with behavior, between onset of the audio stimulus and the overt speech response. We found dissociations between time-courses of behaviorally relevant activations in a network of regions falling largely within the dorsal speech stream. In particular, verbal working memory load modulated high gamma power (HGP) in both Sylvian-Parietal-Temporal (Spt) and Broca’s Areas. The time-courses of the correlations between HGP and subject performance clearly alternated between these two regions throughout the task. Our results provide the first evidence of a reverberating input-output buffer system in the dorsal stream underlying speech sensorimotor integration, consistent with recent phonological loop, competitive queuing and speech-motor control models. These findings also shed new light on potential sources of speech dysfunction in aphasia and neuropsychiatric disorders, identifying anatomically and behaviorally dissociable activation time-windows critical for successful speech reproduction.

Herman, Alexander B.; Houde, John F.; Vinogradov, Sophia; Nagarajan, Srikantan

2013-01-01

419

American Speech-Language-Hearing Association  

NSDL National Science Digital Library

The American Speech-Language-Hearing Association (ASHA) is the national professional, scientific, and credentialing association for more than 166,000 members in fields like audiology and speech-language pathology. New users might want to slide on over to the Information For area. Here they will find thematic sections for audiologists, students, academic programs, and the general public. Also on the homepage are six areas of note, including Publications, Events, Advocacy, and Continuing Education. In the Publications area, visitors can look over best-practice documents, listen to a podcast series, and also learn more about ASHA's academic journals, which include the American Journal of Audiology and the Journal of Speech, Language, and Hearing Research. [KMG]

2013-01-01

420

Computer speech synthesis: its status and prospects.  

PubMed

Computer speech synthesis has reached a high level of performance, with increasingly sophisticated models of linguistic structure, low error rates in text analysis, and high intelligibility in synthesis from phonemic input. Mass market applications are beginning to appear. However, the results are still not good enough for the ubiquitous application that such technology will eventually have. A number of alternative directions of current research aim at the ultimate goal of fully natural synthetic speech. One especially promising trend is the systematic optimization of large synthesis systems with respect to formal criteria of evaluation. Speech recognition has progressed rapidly in the past decade through such approaches, and it seems likely that their application in synthesis will produce similar improvements. PMID:7479804

Liberman, M

1995-10-24

421

Motor speech deficit following carotid endarterectomy.  

PubMed Central

Stroke as a complication of carotid endarterectomy has been extensively reviewed. Considerably less attention has been directed to local injuries of the cranial nerves and their branches. Verta, Hertzer, Imparato, DeWeese, and Matsumoto have reported experience with these injuries. DeWeese found a 9.7% rate of cranial nerve injury, while in Hertzer's series, 15% of patients had nerve dysfunction in the early postendarterectomy period. In 1980, Liapis in a preliminary report found that when postoperative examination was supplemented by detailed evaluation by speech pathologists, the incidence of early abnormalities reached 27%. The purpose of this study was to expand upon Liapis' early observation and to clarify the contribution of the speech pathologists in identifying cranial nerve dysfunctions, specifically those resulting in motor speech abnormalities, following carotid endarterectomy.

Evans, W E; Mendelowitz, D S; Liapis, C; Wolfe, V; Florence, C L

1982-01-01

422

Motor speech deficit following carotid endarterectomy.  

PubMed

Stroke as a complication of carotid endarterectomy has been extensively reviewed. Considerably less attention has been directed to local injuries of the cranial nerves and their branches. Verta, Hertzer, Imparato, DeWeese, and Matsumoto have reported experience with these injuries. DeWeese found a 9.7% rate of cranial nerve injury, while in Hertzer's series, 15% of patients had nerve dysfunction in the early postendarterectomy period. In 1980, Liapis in a preliminary report found that when postoperative examination was supplemented by detailed evaluation by speech pathologists, the incidence of early abnormalities reached 27%. The purpose of this study was to expand upon Liapis' early observation and to clarify the contribution of the speech pathologists in identifying cranial nerve dysfunctions, specifically those resulting in motor speech abnormalities, following carotid endarterectomy. PMID:7125731

Evans, W E; Mendelowitz, D S; Liapis, C; Wolfe, V; Florence, C L

1982-10-01

423

Template based low data rate speech encoder  

NASA Astrophysics Data System (ADS)

The 2400-b/s linear predictive coder (LPC) is currently being widely deployed to support tactical voice communication over narrowband channels. However, there is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems. An 800-b/s voice encoding algorithm is presented which is an extension of the 2400-b/s LPC. To construct template tables, speech samples of 420 speakers uttering 8 sentences each were excerpted from the Texas Instruments - Massachusetts Institute of Technology (TIMIT) Acoustic-Phonetic Speech Data Base. Speech intelligibility of the 800-b/s voice encoding algorithm measured by the diagnostic rhyme test (DRT) is 91.5 for three male speakers. This score compares favorably with the 2400-b/s LPC of a few years ago.

Fransen, Lawrence

1993-09-01
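
The LPC analysis underlying such vocoders can be sketched with the textbook autocorrelation method and Levinson-Durbin recursion (a generic illustration, not the 800-b/s algorithm itself; the model order and the synthetic test signal are assumptions):

```python
import numpy as np

def lpc(frame, order=10):
    """Estimate prediction-error filter coefficients a[0..order]
    (a[0] = 1) by the autocorrelation method with Levinson-Durbin."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][: order + 1]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[1:i][::-1]) / err   # reflection coefficient
        a[1:i] = a[1:i] + k * a[1:i][::-1]          # update lower-order coeffs
        a[i] = k
        err *= 1.0 - k * k                          # prediction error energy
    return a, err

# Synthetic test: a stable AR(2) process whose true coefficients
# the recursion should approximately recover.
rng = np.random.default_rng(1)
x = np.zeros(4000)
e = rng.normal(size=4000)
for n in range(2, 4000):
    x[n] = 1.3 * x[n - 1] - 0.6 * x[n - 2] + e[n]
a, _ = lpc(x, order=2)
# Expect a close to [1, -1.3, 0.6], i.e. x[n] + a[1]x[n-1] + a[2]x[n-2] is
# approximately the white excitation e[n].
```

In a real vocoder these coefficients (or an equivalent representation) are quantized per frame; the template-table scheme above goes further and transmits indices into precomputed tables.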

424

Method of speech enhancement based on Hilbert-Huang transform  

Microsoft Academic Search

A new speech enhancement method based on the Hilbert-Huang transform (HHT) is proposed. The Hilbert-Huang transform is a new and powerful theory for time-frequency analysis and is efficient at describing the local features of dynamic signals. On the basis of basic speech enhancement methods and the HHT algorithm, a new method of speech enhancement is introduced. Using the EMD algorithm, firstly the speech…

Xueyao Li; Xiaojie Zou; Rubo Zhang; Guanqun Liu

2008-01-01

425

Enhancing Emotion Recognition from Speech through Feature Selection  

Microsoft Academic Search

In the present work we aim at performance optimization of a speaker-independent emotion recognition system through a speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion Challenge, we studied the relative importance of the individual speech parameters, and based on their ranking, a subset of speech parameters that offered advantageous performance was selected.

Theodoros Kostoulas; Todor Ganchev; Alexandros Lazaridis; Nikos Fakotakis

2010-01-01
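
Ranking-based feature selection of the kind described can be sketched with a simple two-class Fisher criterion (an illustrative stand-in, not the Interspeech 2009 feature set or the paper's actual ranking method; the data here are synthetic):

```python
import numpy as np

def fisher_scores(X, y):
    """Score each feature by a two-class Fisher criterion:
    (between-class mean gap)^2 / pooled within-class variance."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12   # avoid divide-by-zero
    return num / den

# Synthetic two-class data: feature 0 is informative, the rest are noise.
rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 6))
X[:, 0] += 2.0 * y

scores = fisher_scores(X, y)
top = np.argsort(scores)[::-1][:3]   # keep the 3 highest-ranked features
```

A classifier trained only on the retained columns `X[:, top]` then plays the role of the reduced-parameter recognizer whose performance the study compares against the full set.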

426

A segmental speech model with applications to word spotting  

Microsoft Academic Search

The authors present a segmental speech model that explicitly models the dynamics in a variable-duration speech segment by using a time-varying trajectory model of the speech features in the segment. Each speech segment is represented by a set of statistics which includes a time-varying trajectory, a residual error covariance around the trajectory, and the number of frames in the segment.

Herbert Gish; Kenney Ng

1993-01-01

427

Statistical feature evaluation for classification of stressed speech  

Microsoft Academic Search

The variations in speech production due to stress have an adverse effect on the performance of speech and speaker recognition algorithms. In this work, different speech features, such as Sinusoidal Frequency Features (SFF), Sinusoidal Amplitude Features (SAF), Cepstral Coefficients (CC) and Mel Frequency Cepstral Coefficients (MFCC), are evaluated to find out their relative effectiveness in representing stressed speech. Different…

H. Patro; G. Senthil Raja; S. Dandapat

2007-01-01

428

Speech recognition in advanced rotorcraft - Using speech controls to reduce manual control overload  

NASA Technical Reports Server (NTRS)

An experiment has been conducted to ascertain the usefulness of helicopter pilot speech controls and their effect on time-sharing performance, motivated by multiple-resource theories of attention, which predict that time-sharing should be more efficient with mixed manual and speech controls than with all-manual ones. The test simulation involved an advanced, single-pilot scout/attack helicopter. The performance and subjective workload levels obtained supported the claimed utility of speech recognition-based controls; specifically, time-sharing performance was improved while preparing a data-burst transmission of information during helicopter hover.

Vidulich, Michael A.; Bortolussi, Michael R.

1988-01-01

429

System And Method For Characterizing Voiced Excitations Of Speech And Acoustic Signals, Removing Acoustic Noise From Speech, And Synthesizi  

DOEpatents

The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

Burnett, Greg C. (Livermore, CA); Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

2006-04-25

430

Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems  

Microsoft Academic Search

Automatic speech recognition systems work reasonably well under clean conditions but become fragile in practical applications involving real-world environments. To date, most approaches dealing with environmental noise in speech systems are based on assumptions concerning the noise, or on collecting and training on a specific noise condition, rather than exploring the nature of the noise. As such, speech recognition…

Murat Akbacak; John H. L. Hansen

2007-01-01

431

Acoustic Studies of Dysarthric Speech: Methods, Progress, and Potential.  

ERIC Educational Resources Information Center

Describes the major types of acoustic analysis available for the study of speech, identifies equipment and other components needed for a modern speech-analysis laboratory, and lists possible measurements for various aspects of phonation, articulation, and resonance, as they might be seen in neurologically disordered speech. (Author/DB)

Kent, Ray D.; Weismer, Gary; Kent, Jane F.; Vorperian, Houri K.; Duffy, Joseph R.

1999-01-01

432

Auditory Long Latency Responses to Tonal and Speech Stimuli  

ERIC Educational Resources Information Center

Purpose: The effects of type of stimuli (i.e., nonspeech vs. speech), speech (i.e., natural vs. synthetic), gender of speaker and listener, speaker (i.e., self vs. other), and frequency alteration in self-produced speech on the late auditory cortical evoked potential were examined. Method: Young adult men (n = 15) and women (n = 15), all with…

Swink, Shannon; Stuart, Andrew

2012-01-01

433

Use of Computer Speech Technologies To Enhance Learning.  

ERIC Educational Resources Information Center

Discusses the design of an innovative learning system that uses new technologies for the man-machine interface, incorporating a combination of Automatic Speech Recognition (ASR) and Text To Speech (TTS) synthesis. Highlights include using speech technologies to mimic the attributes of the ideal tutor and design features. (AEF)

Ferrell, Joe

1999-01-01

434

RECOVERING OF PACKET LOSS FOR DISTRIBUTED SPEECH RECOGNITION  

Microsoft Academic Search

This work deals with the packet loss problem in a Distributed Speech Recognition architecture. A packet loss simulation model is first proposed in order to simulate different channel degradation conditions. Under these conditions, the performance of our continuous French speech recognition system is evaluated for packets containing different numbers of speech feature vectors. Several reconstruction strategies to recover lost information…

Pedro Mayorga; Richard Lamy; Laurent Besacier

2002-01-01
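The lost-packet reconstruction idea in the abstract above can be illustrated with a minimal sketch: linearly interpolating missing speech feature vectors from their nearest received neighbours. This is a generic illustration only; the paper's specific reconstruction strategies are not detailed here, and the function name is hypothetical.

```python
import numpy as np

def interpolate_lost_frames(features, lost):
    """Fill lost feature vectors by linear interpolation between the
    nearest received frames (a generic sketch, not the exact strategies
    evaluated in the paper)."""
    out = np.asarray(features, dtype=float).copy()
    received = [i for i in range(len(out)) if i not in lost]
    for dim in range(out.shape[1]):
        out[lost, dim] = np.interp(lost, received, out[received, dim])
    return out

# Toy example: five 2-dim feature frames, frame 2 lost in transit.
feats = np.array([[0., 0.], [1., 2.], [9., 9.], [3., 6.], [4., 8.]])
rec = interpolate_lost_frames(feats, [2])
```

The corrupted frame is replaced by the midpoint of its received neighbours, which is the simplest of the repetition/interpolation family of strategies used in distributed speech recognition.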

435

Elements of a Plan-Based Theory of Speech Acts  

Microsoft Academic Search

This paper explores the truism that people think about what they say. It proposes that, to satisfy their own goals, people often plan their speech acts to affect their listeners' beliefs, goals, and emotional states. Such language use can be modelled by viewing speech acts as operators in a planning system, thus allowing both physical and speech acts to

Philip R. Cohen; C. Raymond Perrault

2003-01-01
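Treating speech acts as planning operators, as the abstract above describes, can be sketched with a STRIPS-style operator that has preconditions and effects. The predicate strings and the INFORM operator below are illustrative, not Cohen and Perrault's exact formalisation.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    """A STRIPS-style operator: applicable when its preconditions hold,
    and applying it adds its effects to the state."""
    name: str
    preconds: frozenset
    effects: frozenset

    def applicable(self, state):
        return self.preconds <= state

    def apply(self, state):
        return state | self.effects

# INFORM(S, H, p): the speaker must believe p; afterwards the hearer
# believes that the speaker believes p (a simplified effect).
inform = Operator(
    name="INFORM(S, H, p)",
    preconds=frozenset({"believes(S, p)"}),
    effects=frozenset({"believes(H, believes(S, p))"}),
)

state = frozenset({"believes(S, p)"})
if inform.applicable(state):
    state = inform.apply(state)
```

Because speech acts and physical acts share this operator form, a single planner can chain both toward a goal, which is the central point of the plan-based theory.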

436

Using on-line altered auditory feedback treating Parkinsonian speech  

NASA Astrophysics Data System (ADS)

Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered feedbacks (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured monologue and reading) under three conditions: baseline, with SE without feedback, and with SE with feedback. The speech samples were randomly presented and rated for speech intelligibility using UPDRS-III item 18 and for speaking rate. The results indicated that SpeechEasy is well tolerated and AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted [Work partially supported by Janus Dev. Group, Inc.].

Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

2005-09-01
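The delayed-auditory-feedback component described above can be sketched as a simple delay line: the speaker hears their own voice lagged by a fixed interval. Devices like the one in the study also apply frequency-altered feedback, which this hypothetical sketch omits.

```python
import numpy as np

def delayed_feedback(x, delay_ms, sr=16000):
    """Return the playback signal heard by the speaker: the input
    delayed by delay_ms milliseconds (zero-padded at the start,
    truncated to the original length)."""
    d = int(sr * delay_ms / 1000)
    return np.concatenate([np.zeros(d), np.asarray(x, dtype=float)])[:len(x)]
```

Typical DAF settings for fluency applications are on the order of tens of milliseconds; the delay here is a free parameter.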

437

Speech Errors in Alzheimer's Disease: Reevaluating Morphosyntactic Preservation  

Microsoft Academic Search

Researchers studying the speech of individuals with probable Alzheimer's disease (PAD) report that morphosyntax is preserved relative to lexical aspects of speech. The current study questions whether dividing all errors into only two categories, morphosyntactic and lexical, is warranted, given the theoretical controversies concerning the production and representation of pronouns and closed-class words in particular. Two experiments compare the speech

Lori J. P. Altmann; Daniel Kempler; Elaine S. Andersen

2001-01-01

438

Central Timing Deficits in Subtypes of Primary Speech Disorders  

ERIC Educational Resources Information Center

Childhood apraxia of speech (CAS) is a proposed speech disorder subtype that interferes with motor planning and/or programming, affecting prosody in many cases. Pilot data (Peter & Stoel-Gammon, 2005) were consistent with the notion that deficits in timing accuracy in speech and music-related tasks may be associated with CAS. This study replicated…

Peter, Beate; Stoel-Gammon, Carol

2008-01-01

439

Dimensions of Early Speech Sound Disorders: A Factor Analytic Study  

ERIC Educational Resources Information Center

The goal of this study was to classify children with speech sound disorders (SSD) empirically, using factor analytic techniques. Participants were 3-7-year olds enrolled in speech/language therapy (N=185). Factor analysis of an extensive battery of speech and language measures provided support for two distinct factors, representing the skill…

Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Stein, Catherine M.; Shriberg, Lawrence D.; Iyengar, Sudha K.; Taylor, H. Gerry

2006-01-01

440

Developing a Weighted Measure of Speech Sound Accuracy  

ERIC Educational Resources Information Center

Purpose: To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound…

Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

2011-01-01
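The idea of differentially weighting speech sound errors by phonetic accuracy, as in the abstract above, can be sketched with a simple graded scoring function. The error categories and weights below are illustrative assumptions, not the published weighting scheme.

```python
# Hypothetical weights: closer phonetic approximations cost less
# than substitutions, which cost less than outright omissions.
ERROR_WEIGHTS = {"correct": 0.0, "distortion": 0.5,
                 "substitution": 1.0, "omission": 1.5}

def weighted_accuracy(codes):
    """Turn a list of per-sound error codes into a 0-1 accuracy score,
    normalized by the worst possible weighted error mass."""
    worst = max(ERROR_WEIGHTS.values())
    penalty = sum(ERROR_WEIGHTS[c] for c in codes)
    return 1.0 - penalty / (worst * len(codes))
```

Unlike a binary correct/incorrect tally, a distortion here lowers the score less than an omission, which is the core idea of a weighted accuracy measure.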

441

Applications of computer generated expressive speech for communication disorders  

Microsoft Academic Search

This paper focuses on generation of expressive speech, specifically speech displaying vocal affect. Generating speech with vocal affect is important for diagnosis, research, and remediation for children with autism and developmental language disorders. However, because vocal affect involves many acoustic factors working together in complex ways, it is unlikely that we will be able to generate compelling

Jan P. H. van Santen; Lois M. Black; Gilead Cohen; Alexander Kain; Esther Klabbers; Taniya Mishra; Jacques de Villiers; Xiaochuan Niu

2003-01-01

442

Noise adaptation using linear regression for continuous noisy speech recognition  

Microsoft Academic Search

We present an approach for recognising continuous speech in the presence of an additive noise, based on model adaptation. The method consists in transforming the parameters of acoustic models to reduce the acoustic mismatch between a test utterance and a set of clean speech models. We assume that speech is modelled by a set of Stochastic Trajectory Models (STM).

Olivier Siohan; Yifan Gong; Jean-Paul Haton

1995-01-01

443

Perception of Intersensory Synchrony in Audiovisual Speech: Not that Special  

ERIC Educational Resources Information Center

Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired ("unity assumption"). Participants made…

Vroomen, Jean; Stekelenburg, Jeroen J.

2011-01-01

444

Speech reconstruction from mel frequency cepstral coefficients and pitch frequency  

Microsoft Academic Search

This paper presents a novel low complexity, frequency domain algorithm for reconstruction of speech from the mel-frequency cepstral coefficients (MFCC), commonly used by speech recognition systems, and the pitch frequency values. The reconstruction technique is based on the sinusoidal speech representation. A set of sine-wave frequencies is derived using the pitch frequency and voicing decisions, and synthetic phases are then

Dan Chazan; Ron Hoory; Gilad Cohen; Meir Zibulski

2000-01-01
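The first stage of MFCC-based reconstruction can be sketched as inverting the DCT used in MFCC extraction to recover a smoothed mel log-spectrum; the paper then places sinusoids under this envelope using pitch and voicing, which is omitted here. The function name and the orthonormal DCT convention are assumptions of this sketch.

```python
import numpy as np

def mfcc_to_mel_log_spectrum(cepstra, n_mels):
    """Invert an orthonormal DCT-II over a truncated cepstrum to get a
    smoothed mel log-spectrum (envelope recovery only; no phase or
    sinusoid synthesis)."""
    n_ceps = len(cepstra)
    # Orthonormal DCT-II basis: rows = cepstral index, cols = mel band.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    scale = np.full((n_ceps, 1), np.sqrt(2.0 / n_mels))
    scale[0] = np.sqrt(1.0 / n_mels)
    # Sum of cepstral coefficients times their basis rows.
    return (cepstra[:, None] * scale * basis).sum(axis=0)
```

With a full set of coefficients this is an exact inverse; truncating the cepstrum (as recognizers do) yields the smoothed envelope that reconstruction starts from.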

445

Consolidation-Based Speech Translation and Evaluation Approach  

NASA Astrophysics Data System (ADS)

The performance of speech translation systems combining automatic speech recognition (ASR) and machine translation (MT) systems is degraded by redundant and irrelevant information caused by speaker disfluency and recognition errors. This paper proposes a new approach to translating speech recognition results through speech consolidation, which removes ASR errors and disfluencies and extracts meaningful phrases. A consolidation approach is spun off from speech summarization by word extraction from ASR 1-best. We extended the consolidation approach for confusion network (CN) and tested the performance using TED speech and confirmed that the consolidation results preserved more meaningful phrases in comparison with the original ASR results. We applied the consolidation technique to speech translation. To test the performance of consolidation-based speech translation, Chinese broadcast news (BN) speech in RT04 was recognized, consolidated and then translated. The speech translation results via consolidation cannot be directly compared with gold standards in which all words in speech are translated because consolidation-based translations are partial translations. We propose a new evaluation framework for partial translation by comparing them with the most similar set of words extracted from a word network created by merging gradual summarizations of the gold standard translation. The performance of consolidation-based MT results was evaluated using BLEU. We also propose Information Preservation Accuracy (IPAccy) and Meaning Preservation Accuracy (MPAccy) to evaluate consolidation and consolidation-based MT. We confirmed that consolidation contributed to the performance of speech translation.

Hori, Chiori; Zhao, Bing; Vogel, Stephan; Waibel, Alex; Kashioka, Hideki; Nakamura, Satoshi
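The BLEU metric used for evaluation above can be sketched at sentence level: modified n-gram precision combined with a brevity penalty. Real evaluations (as in the paper) use corpus-level, multi-reference BLEU with smoothing; this is a minimal single-reference sketch.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=2):
    """Minimal sentence-level BLEU over token lists: geometric mean of
    modified n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        # Clip each candidate n-gram count by its reference count.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        precisions.append(overlap / max(sum(cand.values()), 1))
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A perfect match scores 1.0 and a fully disjoint hypothesis scores 0.0; partial translations, as the abstract notes, need a different reference construction before such a score is meaningful.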

446

Nonlinear feature based classification of speech under stress  

Microsoft Academic Search

Studies have shown that variability introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help improve the robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators of stress, they are not always consistent. Three new features derived

Guojun Zhou; John H. L. Hansen; James F. Kaiser

2001-01-01

447

Emphasized speech synthesis based on hidden Markov models  

Microsoft Academic Search

This paper presents a statistical approach to synthesizing emphasized speech based on hidden Markov models (HMMs). Context-dependent HMMs are trained using emphasized speech data uttered by intentionally emphasizing an arbitrary accentual phrase in a sentence. To model acoustic characteristics of emphasized speech, new contextual factors describing an emphasized accentual phrase are additionally considered in model training. Moreover, to build HMMs

Kumiko Morizane; Keigo Nakamura; Tomoki Toda; Hiroshi Saruwatari; Kiyohiro Shikano

2009-01-01

448

A speech enhancement method based on Kalman filtering  

Microsoft Academic Search

In this paper, the problem of speech enhancement when only corrupted speech signal is available for processing is considered. For this, the Kalman filtering method is studied and compared with the Wiener filtering method. Its performance is found to be significantly better than the Wiener filtering method. A delayed-Kalman filtering method is also proposed which improves the speech enhancement performance

K. K. Paliwal; Anjan Basu

1987-01-01
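The Kalman filtering approach above can be sketched in its simplest form: an AR(1) model of clean speech observed in additive white noise. The paper uses higher-order AR models and also proposes a delayed variant; the parameter values here are illustrative.

```python
import numpy as np

def kalman_denoise(y, a=0.95, q=1.0, r=1.0):
    """Scalar Kalman filter for x_t = a*x_{t-1} + w_t (process variance q)
    observed as y_t = x_t + v_t (noise variance r). Returns the filtered
    estimate of the clean signal."""
    x_hat, p = 0.0, 1.0
    out = np.empty(len(y))
    for t, yt in enumerate(y):
        # Predict from the AR(1) model.
        x_pred = a * x_hat
        p_pred = a * a * p + q
        # Update with the noisy observation.
        k = p_pred / (p_pred + r)
        x_hat = x_pred + k * (yt - x_pred)
        p = (1 - k) * p_pred
        out[t] = x_hat
    return out
```

Unlike a fixed Wiener filter, the gain k adapts each step to the current estimate uncertainty, which is the source of the performance advantage reported in the abstract.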

449

The Different Functions of Speech in Defamation and Privacy Cases.  

ERIC Educational Resources Information Center

Reviews United States Supreme Court decisions since 1900 to show that free speech decisions often rest on the circumstances surrounding the speech. Indicates that freedom of speech wins out over privacy when social or political function but not when personal happiness is the issue.

Kebbel, Gary

1984-01-01

450

Visually Impaired Persons' Comprehension of Text Presented with Speech Synthesis.  

ERIC Educational Resources Information Center

This study of 48 individuals with visual impairments (16 middle-aged with experience in synthetic speech, 16 middle-aged inexperienced, and 16 older inexperienced) found that speech synthesis, compared to natural speech, generally yielded lower results with respect to memory and understanding of texts. Experience had no effect on performance.…

Hjelmquist, E.; And Others

1992-01-01

451

Cortical activity patterns predict robust speech discrimination ability in noise  

PubMed Central

The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem.

Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

2012-01-01
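The neural classifier described above compares single-trial activity patterns against average templates for known sounds. A minimal nearest-template sketch using Euclidean distance is shown below; the paper's classifier additionally handles relative spike timing without a known stimulus onset, which this sketch omits.

```python
import numpy as np

def classify_trial(trial, templates):
    """Return the label of the template (average spatiotemporal pattern)
    closest to the single-trial response."""
    best, best_d = None, np.inf
    for label, tmpl in templates.items():
        d = np.linalg.norm(trial - tmpl)
        if d < best_d:
            best, best_d = label, d
    return best
```

In practice `trial` and each template would be a flattened neurons-by-time activity matrix; the toy vectors below stand in for those patterns.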

452

Index to NASA news releases and speeches, 1992  

NASA Technical Reports Server (NTRS)

This issue of the Index to NASA News Releases and Speeches contains a listing of news releases distributed by the Office of Public Affairs, NASA Headquarters, and a selected listing of speeches presented by members of the Headquarters staff during 1992. The index is arranged in six sections: subject index, personal names index, news release number index, accession number index, speeches, and news releases.

1993-01-01

453

Index to NASA news releases and speeches, 1990  

NASA Technical Reports Server (NTRS)

This issue of the annual Index to NASA News Releases and Speeches contains a listing of news releases distributed by the Office of Public Affairs, NASA Headquarters, and a selected listing of speeches presented by members of headquarters staff during 1990. The index is arranged in six sections: Subject Index, Personal Names Index, News Release Number Index, Accession Number Index, Speeches, and News Releases.

1991-01-01

454

Rhythmic Priming Enhances the Phonological Processing of Speech  

ERIC Educational Resources Information Center

While natural speech does not possess the same degree of temporal regularity found in music, there is recent evidence to suggest that temporal regularity enhances speech processing. The aim of this experiment was to examine whether speech processing would be enhanced by the prior presentation of a rhythmical prime. We recorded electrophysiological…

Cason, Nia; Schon, Daniele

2012-01-01

455

SPEECH AND HEARING PROGRAMS, ORGANIZATIONAL AND ADMINISTRATIVE MANUAL.  

ERIC Educational Resources Information Center

This handbook outlines the practices and procedures in the operation of the speech and hearing programs in the Montgomery County, Maryland, school system and describes the duties and responsibilities of the supervisor of the speech and hearing programs, speech and hearing therapist, hearing therapist, special class teacher, and audiometrist. A…

Montgomery County Public Schools, Rockville, MD.

456

PROSPECTS FOR SPEECH TECHNOLOGY IN THE OCEANIA REGION  

Microsoft Academic Search

The development of speech technology in the Oceania region is an issue for Australian speech scientists and technologists. In this paper we examine the issues that govern the development of speech technology anywhere, the specific opportunities and inhibiting factors of the Oceania region, and the role that Australia, as the largest and most prosperous nation of the region, can

J Bruce Millar

2000-01-01

457

Speech-Perception-in-Noise Deficits in Dyslexia  

ERIC Educational Resources Information Center

Speech perception deficits in developmental dyslexia were investigated in quiet and various noise conditions. Dyslexics exhibited clear speech perception deficits in noise but not in silence. "Place-of-articulation" was more affected than "voicing" or "manner-of-articulation." Speech-perception-in-noise deficits persisted when performance of…

Ziegler, Johannes C.; Pech-Georgel, Catherine; George, Florence; Lorenzi, Christian

2009-01-01

458

Speech Perception Within an Auditory Cognitive Science Framework  

Microsoft Academic Search

The complexities of the acoustic speech signal pose many significant challenges for listeners. Although perceiving speech begins with auditory processing, investigation of speech perception has progressed mostly independently of study of the auditory system. Nevertheless, a growing body of evidence demonstrates that cross-fertilization between the two areas of research can be productive. We briefly describe research bridging the study of

Lori L. Holt; Andrew J. Lotto

2008-01-01

459

Cerebral mechanisms of prosodic integration: evidence from connected speech  

Microsoft Academic Search

Using functional Magnetic Resonance Imaging (fMRI) and long connected speech stimuli, we addressed the question of neuronal networks involved in prosodic integration by comparing (1) differences in brain activity when hearing connected speech stimuli with high and low degrees of prosodic expression; (2) differences in brain activity in two different diotic listening conditions (normal speech delivery to both ears, i.e.,

Isabelle Hesling; Sylvain Clément; Martine Bordessoules; Michèle Allard

2005-01-01

460

Balancing Act: Teachers' Classroom Speech and the First Amendment.  

ERIC Educational Resources Information Center

Discusses how using "Pickering v. Board of Education" or "Hazelwood School District v Kuhlmeier" as precedents in cases involving classroom speech by teachers has fostered court inconsistency. Proposes a judicial standard that emphasizes adequate notice to teachers when certain speech is proscribed and scrutinizes school-board actions when speech

Daly, Karen C.

2001-01-01

461

Effects of interior aircraft noise on speech intelligibility and annoyance  

Microsoft Academic Search

Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that

K. S. Pearsons; R. L. Bennett

1977-01-01

462

Family Pedigrees of Children with Suspected Childhood Apraxia of Speech  

ERIC Educational Resources Information Center

Forty-two children (29 boys and 13 girls), ages 3-10 years, were referred from the caseloads of clinical speech-language pathologists for suspected childhood apraxia of speech (CAS). According to results from tests of speech and oral motor skills, 22 children met criteria for CAS, including a severely limited consonant and vowel repertoire,…

Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy; Taylor, H. Gerry; Iyengar, Sudha; Shriberg, Lawrence D.

2004-01-01

463

The Nature of Referred Subtypes of Primary Speech Disability  

ERIC Educational Resources Information Center

Of 1100 children referred to a mainstream paediatric speech and language therapy service in a 15-month period (January 1999 to April 2000), 320 had primary speech impairment. No referred child had significant hearing impairment, learning disability or physical disability. This paper describes the nature of the subtypes of speech disability…

Broomfield, J.; Dodd, B.

2004-01-01

464

Speech, Hearing and Language: work in progress Volume 13  

Microsoft Academic Search

Pitch, periodicity and aperiodicity are regarded as important cues for the perception of speech. However, modern CIS cochlear implant speech processors, and recent simulations of these processors, provide no explicit representation of these factors. We have constructed four-channel vocoder processors that manipulate the representation of periodicity and pitch information, and examined the effects on the perception of speech and the ability

Tim Green; Andrew Faulkner; Stuart Rosen

465

Recent Research on the Treatment of Speech Anxiety.  

ERIC Educational Resources Information Center

Apprehension on the part of students who must engage in public speaking figures high on the list of student fears. Speech anxiety has been viewed as a trait--the overall propensity to fear giving speeches--and as a state--the condition of fearfulness on a particular occasion of speech making. Methodology in therapy research should abide by the…

Page, Bill

466

Coping with speech anxiety: The power of positive thinking  

Microsoft Academic Search

This paper reports two studies designed to probe the link between speech anxiety and positive thinking. The first study reconfirms earlier research that speech anxiety is positively correlated with negative thoughts and negatively related to positive thoughts. The second study found that students trained to use visualization reported a higher proportion of positive to negative thoughts and lower speech anxiety

Joe Ayres

1988-01-01

467

Automated Discovery of Speech Act Categories in Educational Games  

ERIC Educational Resources Information Center

In this paper we address the important task of automated discovery of speech act categories in dialogue-based, multi-party educational games. Speech acts are important in dialogue-based educational systems because they help infer the student speaker's intentions (the task of speech act classification) which in turn is crucial to providing adequate…

Rus, Vasile; Moldovan, Cristian; Niraula, Nobal; Graesser, Arthur C.

2012-01-01

468

Synthetic speech stimuli spectrally normalized for nonhuman cochlear dimensions  

Microsoft Academic Search

Behavioral or neural measures of speech encoding are often taken from animals with auditory systems that differ substantially from those of humans. Absolute distance between spectral peaks of speech sounds along the basilar membrane is typically much greater in humans than in smaller animals. To address this problem, a synthesizer was developed for creating speech scaled for nonhuman cochleae in

Michael Kiefte; Keith R. Kluender; William S. Rhode

2002-01-01

469

Speech for the Deaf Child: Knowledge and Use.  

ERIC Educational Resources Information Center

Presented is a collection of 16 papers on speech development, handicaps, teaching methods, and educational trends for the aurally handicapped child. Arthur Boothroyd relates acoustic phonetics to speech teaching, and Jean Utley Lehman investigates a scheme of linguistic organization. Differences in speech production by deaf and normal hearing…

Connor, Leo E., Ed.

470

A new speaker adaptation technique using very short calibration speech  

Microsoft Academic Search

A speaker adaptation technique based on the separation of speech spectra variation sources is developed for improving speaker-independent continuous speech recognition. The variation sources include speaker acoustic characteristics, phonologic characteristics, and contextual dependency of allophones. Statistical methods are formulated to normalize speech spectra based on speaker acoustic characteristics and then adapt mixture Gaussian density phone models based on speaker phonologic

Yunxin Zhao

1993-01-01

471

Noise, unattended speech and short-term memory  

Microsoft Academic Search

Studies of ‘noise pollution’ have typically used unpatterned white noise. The present study compares the effect of noise with that of unattended speech. Three experiments required the immediate serial recall of sequences of nine visually presented digits accompanied by silence, noise or unattended speech in an unfamiliar language. Experiments 1 and 2 showed a clear effect of unattended speech at

Pierre Salamé; Alan Baddeley

1987-01-01

472

Summarization evaluation for text and speech: issues and approaches  

Microsoft Academic Search

This paper surveys current text and speech summarization evaluation approaches. It discusses advantages and disadvantages of these, with the goal of identifying summarization techniques most suitable to speech summarization. Precision/recall schemes, as well as summary accuracy measures which incorporate weightings based on multiple human decisions, are suggested as particularly suitable in evaluating speech summaries. Index Terms: evaluation, text

Ani Nenkova

2006-01-01

473

The Impact of Extrinsic Demographic Factors on Cantonese Speech Acquisition  

ERIC Educational Resources Information Center

This study modeled the associations between extrinsic demographic factors and children's speech acquisition in Hong Kong Cantonese. The speech of 937 Cantonese-speaking children aged 2;4 to 6;7 in Hong Kong was assessed using a standardized speech test. Demographic information regarding household income, paternal education, maternal education,…

To, Carol K. S.; Cheung, Pamela S. P.; McLeod, Sharynne

2013-01-01

474

Normalization on Temporal Modulation Transfer Function for Robust Speech Recognition  

Microsoft Academic Search

In this paper, we proposed a robust speech feature extraction algorithm for automatic speech recognition which reduced the noise effect in the temporal modulation domain. The proposed algorithm has two steps to deal with the time series of cepstral coefficients. The first step adopted a modulation contrast normalization to normalize the temporal modulation contrast of both clean and noisy speech

Xugang Lu; Shigeki Matsuda; Tohru Shimizu; Satoshi Nakamura

2008-01-01

475

Speech, ink, and slides: the interaction of content channels  

Microsoft Academic Search

In this paper, we report on an empirical exploration of digital ink and speech usage in lecture presentation. We studied the video archives of five Master's level Computer Science courses to understand how instructors use ink and speech together while lecturing, and to evaluate techniques for analyzing digital ink. Our interest in understanding how ink and speech are used together

Richard J. Anderson; Crystal Hoyer; Craig Prince; Jonathan Su; Fred Videon; Steven A. Wolfman

2004-01-01

476

Speech Features Extraction Using Cone-Shaped Kernel Distribution  

Microsoft Academic Search

The paper reviews two basic time-frequency distributions, spectrogram and cone-shaped kernel distribution. We study, analyze and compare properties and performance of these quadratic representations on speech signals. Cone-shaped kernel distribution was successfully applied to speech features extraction due to several useful properties in time-frequency analysis of speech signals.

Janez Zibert; France Mihelic; Nikola Pavesic

2002-01-01
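The first of the two time-frequency distributions compared above, the spectrogram, can be sketched as a magnitude-squared short-time Fourier transform. The cone-shaped (Zhao-Atlas-Marks) kernel distribution replaces the windowed FFT with a kernel-smoothed quadratic distribution and is not sketched here; frame and hop sizes below are arbitrary choices.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Magnitude-squared STFT: Hann-windowed overlapping frames,
    one rFFT per frame. Returns an (n_frames, n_bins) array."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2
```

For a pure tone, energy concentrates in the FFT bin nearest the tone frequency, which is how such representations localize speech formants and harmonics.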

477

Improving the speech intelligibility in classrooms  

NASA Astrophysics Data System (ADS)

One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can lead to reduced learning efficiency. Second, they can cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. In addition, inadequate acoustical conditions can induce the use of public address systems, and improper use of such amplifiers or loudspeakers can impair students' hearing. The social costs of poor classroom acoustics, through impaired learning in children, are large. This invisible problem has far-reaching implications for learning, but is easily solved. Much research has been carried out that accurately and concisely summarizes the findings on classroom acoustics, yet a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are essentially based on studies of Western languages. Although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements were made in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests in English, Mandarin and Cantonese, together with a survey, were carried out on students aged from 5 to 22 years. The aim is to investigate the differences in intelligibility between English, Mandarin and Cantonese in Hong Kong classrooms. The relationship between the speech transmission index (STI) and Phonetically Balanced (PB) word scores is further developed.
Together with an empirical relationship between classroom speech intelligibility and variations in reverberation time, indoor ambient noise (background noise level), signal-to-noise ratio, and speech transmission index, the study aims to establish a guideline for improving speech intelligibility in classrooms under any environmental conditions. The study showed that the acoustical conditions of most of the measured classrooms in Hong Kong are unsatisfactory. The selection of materials inside a classroom, especially the acoustic ceiling, is important for improving speech intelligibility at the design stage by shortening the reverberation time. The signal-to-noise ratio should be higher than 11 dB(A) for over 70% speech perception, whether for tonal or non-tonal languages, without the use of a public address system. These unexpected results call for revising the standard design and devising acceptable standards for classrooms in Hong Kong. The study also demonstrates a method for assessing classrooms in other cities with similar environmental conditions.

Lam, Choi Ling Coriolanus
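The signal-to-noise criterion reported above (roughly 11 dB(A) for over 70% speech perception) rests on a simple power ratio, which can be sketched directly; the A-weighting of the measured levels is omitted here.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from sample sequences of the
    speech signal and the background noise: 10*log10(P_signal/P_noise)."""
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)
    p_noise = np.mean(np.asarray(noise, dtype=float) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)
```

Doubling the signal amplitude raises the ratio by about 6 dB, so the 11 dB(A) threshold corresponds to speech energy more than an order of magnitude above the noise floor.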

478

Examining the Influence of Speech Frame Size and Number of Cepstral Coefficients on the Speech Recognition Performance  

Microsoft Academic Search

In the present work we explore the influence of front-end setup on the speech recognition performance. Specifically, we study the dependence between specific parameters of the speech parameterization stage, such as speech frame size and number of Mel-frequency cepstral coefficients (MFCC), and the word error rate (WER). Our comparative evaluation is performed by employing the Sphinx-IV speech recognition engine and

Iosif Mporas; Todor Ganchev; Elias Kotinas; Nikos Fakotakis
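The front-end parameter studied above, speech frame size, determines how the waveform is segmented before MFCC extraction. A sketch of that framing stage is shown below (sample rate and hop size are assumptions; the MFCC computation itself is omitted).

```python
import numpy as np

def frame_signal(x, frame_ms, hop_ms, sr=16000):
    """Split a signal into overlapping analysis frames of frame_ms
    milliseconds taken every hop_ms milliseconds. Larger frames give
    finer frequency resolution but coarser time resolution."""
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    assert len(x) >= frame_len, "signal shorter than one frame"
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len]
                     for i in range(n_frames)])
```

Varying `frame_ms` (and, downstream, the number of cepstral coefficients kept) is exactly the kind of front-end sweep whose effect on word error rate the paper measures.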

479

Recent advances in SRI'S IraqCommTM Iraqi Arabic-English speech-to-speech translation system  

Microsoft Academic Search

We summarize recent progress on SRI's IraqComm™ Iraqi Arabic-English two-way speech-to-speech translation system. In the past year we made substantial developments in our speech recognition and machine translation technology, leading to significant improvements in both accuracy and speed of the IraqComm system. On the 2008 NIST-evaluation dataset our two-way speech-to-text (S2T) system achieved 6% to 8% absolute improvement in

Murat Akbacak; Horacio Franco; Michael W. Frandsen; Sasa Hasan; Huda Jameel; Andreas Kathol; Shahram Khadivi; Xin Lei; Arindam Mandal; Saab Mansour; Kristin Precoda; Colleen Richey; Dimitra Vergyri; Wen Wang; Mei Yang; Jing Zheng

2009-01-01

480

Longitudinal Study of Speech Perception, Speech, and Language for Children with Hearing Loss in an Auditory-Verbal Therapy Program  

ERIC Educational Resources Information Center

This study examined the speech perception, speech, and language developmental progress of 25 children with hearing loss (mean Pure-Tone Average [PTA] 79.37 dB HL) in an auditory verbal therapy program. Children were tested initially and then 21 months later on a battery of assessments. The speech and language results over time were compared with…

Dornan, Dimity; Hickson, Louise; Murdoch, Bruce; Houston, Todd

2009-01-01

481

Linguistic contributions to speech-on-speech masking for native and non-native listeners: language familiarity and semantic content.  

PubMed

This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. PMID:22352516

Brouwer, Susanne; Van Engen, Kristin J; Calandruccio, Lauren; Bradlow, Ann R

2012-02-01

482

Linguistic contributions to speech-on-speech masking for native and non-native listeners: Language familiarity and semantic content  

PubMed Central

This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language.

Brouwer, Susanne; Van Engen, Kristin J.; Calandruccio, Lauren; Bradlow, Ann R.

2012-01-01

483

Effects of Synthetic Speech Output on Requesting and Natural Speech Production in Children with Autism: A Preliminary Study  

ERIC Educational Resources Information Center

Requesting is often taught as an initial target during augmentative and alternative communication intervention in children with autism. Speech-generating devices are purported to have advantages over non-electronic systems due to their synthetic speech output. On the other hand, it has been argued that speech output, being in the auditory…

Schlosser, Ralf W.; Sigafoos, Jeff; Luiselli, James K.; Angermeier, Katie; Harasymowyz, Ulana; Schooley, Katherine; Belfiore, Phil J.

2007-01-01

484

Effects of synthetic speech output on requesting and natural speech production in children with autism: A preliminary study  

Microsoft Academic Search

Requesting is often taught as an initial target during augmentative and alternative communication intervention in children with autism. Speech-generating devices are purported to have advantages over non-electronic systems due to their synthetic speech output. On the other hand, it has been argued that speech output, being in the auditory modality, may not be compatible with the processing preferences of learners

Ralf W. Schlosser; Jeff Sigafoos; James K. Luiselli; Katie Angermeier; Ulana Harasymowyz; Katherine Schooley; Phil J. Belfiore

2007-01-01

485

A Statistical Model-Based Speech Enhancement Using Acoustic Noise Classification for Robust Speech Communication  

NASA Astrophysics Data System (ADS)

In this paper, we present a speech enhancement technique based on ambient noise classification that incorporates the Gaussian mixture model (GMM). The principal parameters of the statistical model-based speech enhancement algorithm, such as the weighting parameter in the decision-directed (DD) method and the long-term smoothing parameter of the noise estimation, are set according to the classified context to ensure the best performance under each noise condition. For real-time context awareness, the noise classification is performed on a frame-by-frame basis using the GMM within a soft-decision framework. The speech absence probability (SAP) is used in detecting speech absence periods and updating the likelihood of the GMM.

Choi, Jae-Hun; Chang, Joon-Hyuk
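The frame-by-frame classification idea in the abstract can be sketched as follows, with one diagonal Gaussian per noise class standing in for a full GMM. The class names, feature dimensions, and training values are hypothetical, not the paper's.

```python
# Sketch: per-frame noise classification with a soft (posterior-based) decision.
# One diagonal Gaussian per class is a single-mixture stand-in for a GMM.
import numpy as np

rng = np.random.default_rng(0)

def log_gauss(x, mean, var):
    # Per-frame log-density of a diagonal-covariance Gaussian.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

# Two hypothetical noise classes over 2-dim features.
models = {
    "babble": (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    "car":    (np.array([3.0, 3.0]), np.array([0.5, 0.5])),
}

# Synthetic frames drawn near the "car" model.
frames = rng.normal([3.0, 3.0], [0.7, 0.7], size=(50, 2))

# Soft decision: per-frame class posteriors, plus log-likelihoods accumulated
# over the utterance so the context estimate smooths over many frames.
names = list(models)
loglik = np.stack([log_gauss(frames, m, v) for m, v in models.values()], axis=1)
post = np.exp(loglik - loglik.max(axis=1, keepdims=True))
post /= post.sum(axis=1, keepdims=True)
context = names[int(np.argmax(loglik.sum(axis=0)))]
print(context)  # "car" for these synthetic frames
```

In the paper's setting the classified context would then select enhancement parameters (the DD weighting, the noise-estimation smoothing) tuned for that noise type.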

486

Synthesizing speech animation by learning compact speech co-articulation models  

Microsoft Academic Search

While speech animation fundamentally consists of a sequence of phonemes over time, sophisticated animation requires smooth interpolation and co-articulation effects, where the preceding and following phonemes influence the shape of a phoneme. Co-articulation has been approached in speech animation research in several ways, most often by simply smoothing the mouth geometry motion over time. Data-driven approaches tend to generate realistic

Zhigang Deng; J. P. Lewis; Ulrich Neumann

2005-01-01

487

Not only Translation Quality: Evaluating the NESPOLE! Speech-to-Speech Translation System along other Viewpoints  

Microsoft Academic Search

Performance and usability of real-world speech-to-speech translation systems, like the one developed within the Nespole! project, are affected by several aspects that go beyond the pure translation quality provided by the Human Language Technology components of the system. In this paper we describe these aspects as viewpoints along which we have evaluated the Nespole! system. Four main issues are investigated:

R. Cattoni; G. Lazzari; N. Mana; F. Pianesi; E. Pianta; ITC-irst Trento; F. Metze; J. McDonough; H. Soltau; E. Costantini; S. Burger; D. Gates; A. Lavie; L. Levin; C. Langley; K. Peterson; T. Schultz; A. Waibel; D. Wallace; L. Besacier; H. Blanchon; D. Vaufreydaz; L. Taddei

488

Implementing SRI's Pashto speech-to-speech translation system on a smart phone  

Microsoft Academic Search

We describe our recent effort implementing SRI's UMPC-based Pashto speech-to-speech (S2S) translation system on a smart phone running the Android operating system. In order to maintain very low latencies of system response on computationally limited smart phone platforms, we developed efficient algorithms and data structures and optimized model sizes for various system components. Our current Android-based S2S system requires less

Jing Zheng; Arindam Mandal; Xin Lei; M. Frandsen; N. F. Ayan; D. Vergyri; Wen Wang; M. Akbacak; K. Precoda

2010-01-01

489

Speech/Non-Speech Detection in Meetings from Automatically Extracted Low Resolution Visual Features  

Microsoft Academic Search

In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues from group meetings. Traditionally, the task of speech/non-speech detection or speaker diarization tries to find who speaks and when from audio features only. Recent work has addressed the problem audio-visually but often with less emphasis on the visual

Anon Anon

490

Analysis of Information in Speech and Its Application in Speech Recognition  

Microsoft Academic Search

Previous work analyzed the information in speech using analysis of variance (ANOVA). ANOVA assumes that sources of information (phone, speaker, and channel) are univariate Gaussian. The sources of information, however, are not unimodal Gaussian. Phones in speech recognition, e.g., are generally modeled using a multi-state, multi-mixture model. Therefore, this work extends ANOVA by assuming phones with 3-state, single-mixture

Sachin S. Kajarekar; Hynek Hermansky

2000-01-01
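The one-way ANOVA decomposition that the abstract starts from, total variance splitting exactly into a between-class part (information carried by the class label) and a within-class part, can be verified on toy data. The three synthetic "phone" classes below are a hypothetical stand-in for real speech features.

```python
# Toy one-way ANOVA check: total variance = between-class + within-class.
import numpy as np

rng = np.random.default_rng(1)
# Synthetic 1-D feature for three "phones" with different means.
groups = [rng.normal(mu, 1.0, 200) for mu in (0.0, 2.0, 4.0)]
x = np.concatenate(groups)

grand = x.mean()
between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups) / len(x)
within = sum(((g - g.mean()) ** 2).sum() for g in groups) / len(x)
total = x.var()  # population variance (denominator N)

print(np.isclose(between + within, total))  # prints True
```

A large between/within ratio means the feature is informative about the class; the paper's extension replaces the single-Gaussian class model this identity assumes with multi-state phone models.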

491

Effects of interior aircraft noise on speech intelligibility and annoyance  

NASA Technical Reports Server (NTRS)

Recordings of the aircraft ambiance from ten different types of aircraft were used in conjunction with four distinct speech interference tests as stimuli to determine the effects of interior aircraft background levels and speech intelligibility on perceived annoyance in 36 subjects. Both speech intelligibility and background level significantly affected judged annoyance. However, the interaction between the two variables showed that above an 85 dB background level the speech intelligibility results had a minimal effect on annoyance ratings. Below this level, people rated the background as less annoying if there was adequate speech intelligibility.

Pearsons, K. S.; Bennett, R. L.

1977-01-01

492

Motivation of Speech and Hearing Handicapped Children.  

ERIC Educational Resources Information Center

An analysis was made of child motivation in structured speech and hearing therapy based on the assumption that children who need this type of help have a conscious desire for improvement. Pertinent literature was surveyed to establish a listing of common motives, desires, and needs of people, and a motivational preference examination was…

Gerstman, Hubert L.; Siegenthaler, Bruce M.

493

Simultaneous Screening for Hearing, Speech and Language.  

National Technical Information Service (NTIS)

The ultimate goals of the project were to: (1) develop a means of screening for speech, language, and hearing problems in a child health setting, using no more time than is ordinarily committed to hearing screening alone; and (2) improve current hearing s...

R. A. Sturner

1992-01-01

494

Speech Recognition Technology for Disabilities Education  

ERIC Educational Resources Information Center

Speech recognition is an alternative to traditional methods of interacting with a computer, such as textual input through a keyboard. An effective system can replace or reduce reliance on standard keyboard and mouse input. This can especially assist dyslexic students who have problems with character or word use and manipulation in a textual…

Tang, K. Wendy; Kamoua, Ridha; Sutan, Victor; Farooq, Omer; Eng, Gilbert; Chu, Wei Chern; Hou, Guofeng

2005-01-01

495

Perception of Speech in Noise: Neural Correlates  

PubMed Central

The presence of irrelevant auditory information (other talkers, environmental noises) presents a major challenge to listening to speech. The fundamental frequency (F0) of the target speaker is thought to provide an important cue for the extraction of the speaker’s voice from background noise, but little is known about the relationship between speech-in-noise (SIN) perceptual ability and neural encoding of the F0. Motivated by recent findings that music and language experience enhance brainstem representation of sound, we examined the hypothesis that brainstem encoding of the F0 is diminished to a greater degree by background noise in people with poorer perceptual abilities in noise. To this end, we measured speech-evoked auditory brainstem responses to /da/ in quiet and two multi-talker babble conditions (two-talker and six-talker) in native English-speaking young adults who ranged in their ability to perceive and recall SIN. Listeners who were poorer performers on a standardized SIN measure demonstrated greater susceptibility to the degradative effects of noise on the neural encoding of the F0. Particularly diminished was their phase-locked activity to the fundamental frequency in the portion of the syllable known to be most vulnerable to perceptual disruption (i.e., the formant transition period). Our findings suggest that the subcortical representation of the F0 in noise contributes to the perception of speech in noisy conditions.

Song, Judy H.; Skoe, Erika; Banai, Karen; Kraus, Nina

2012-01-01

496

BYBLOS: The BBN continuous speech recognition system  

Microsoft Academic Search

In this paper, we describe BYBLOS, the BBN continuous speech recognition system. The system, designed for large vocabulary applications, integrates acoustic, phonetic, lexical, and linguistic knowledge sources to achieve high recognition performance. The basic approach, as described in previous papers [1, 2], makes extensive use of robust context-dependent models of phonetic coarticulation using Hidden Markov Models (HMM). We describe the

Y. Chow; M. Dunham; O. Kimball; M. Krasner; G. Kubala; J. Makhoul; P. Price; S. Roucos; R. Schwartz

1987-01-01

497

The Carolinas Speech Communication Annual, 1997.  

ERIC Educational Resources Information Center

This 1997 issue of "The Carolinas Speech Communication Annual" contains the following articles: "'Bridges of Understanding': UNESCO's Creation of a Fantasy for the American Public" (Michael H. Eaves and Charles F. Beadle, Jr.); "Developing a Communication Cooperative: A Student, Faculty, and Organizational Learning Experience" (Peter M. Kellett…

McKinney, Bruce C.

1997-01-01

498

Exploiting nonacoustic sensors for speech encoding  

Microsoft Academic Search

The intelligibility of speech transmitted through low-rate coders is severely degraded when high levels of acoustic noise are present in the acoustic environment. Recent advances in nonacoustic sensors, including microwave radar, skin vibration, and bone conduction sensors, provide the exciting possibility of both glottal excitation and, more generally, vocal tract measurements that are relatively immune to acoustic disturbances and can

Thomas F. Quatieri; Kevin Brady; Dave Messing; Joseph P. Campbell; William M. Campbell; Michael S. Brandstein; Clifford J. Weinstein; John D. Tardelli; Paul D. Gatewood

2006-01-01

499

Speech Breathing in Children and Adolescents.  

ERIC Educational Resources Information Center

A study of 80 children, aged 7, 10, 13, and 16, found that gender was not an important variable in speech breathing, but age was. The youngest group exhibited such things as larger lung, rib cage, and abdominal volume initiations and terminations for breath groups and fewer syllables per breath group. (Author/JDD)

Hoit, Jeannette D.; And Others

1990-01-01

500

Speech for People with Tracheostomies or Ventilators  

MedlinePLUS

... The Preferred Practice Patterns for the Profession of Speech-Language Pathology outline the common practices followed by SLPs when engaging in various aspects of the profession. The Preferred Practice Patterns for ... assessment and intervention are outlined in Sections 28 and 29.