Sample records for speech

  1. Speech Development

    MedlinePLUS

    ... to assess your child’s speech production and language development and make appropriate therapy recommendations. It is also ... in both the onset of speech and the development of speech sounds during the first 9-24 ...

  2. Sparkling Speeches

    NSDL National Science Digital Library

    2012-12-14

    Sparkling is the word! In this lesson, students will transform an exciting student-created expository essay into an engaging, quality speech using resources from the classroom and the school media center. Students will listen to a remarkable Martin Luther King Jr. speech on YouTube, confer with classmates on speech construction, and use a variety of easy-to-access materials (included with this lesson) while constructing their speeches. The lesson allows for in-depth trials and experiments with expository writing and speech writing. In one exciting option, students may use a "Speech Forum" to safely practice their unique speeches in front of a small, non-assessing audience of fellow students. The final assessments for this memorable lesson are a complete exploration and comprehension of introductions, main ideas with supporting details, and an engaging conclusion, transformed into a student speech, together with a written exam.

  3. Plowing Speech

    E-print Network

    Zla ba sgrol ma

    2009-11-10

    Sman shad plowing speech 1.WAV. Length of track: 00:01:18. Related tracks (include description/relationship if appropriate). Title of track: Plowing Speech. Translation of title: Description (to be used in archive entry): This file contains a ... plowing speech and a discussion about the speech. Genre or type (i.e., epic, song, ritual): plowing speech. Name of recorder (if different from collector): Zla ba sgrol ma. Date of recording: November 10th, 2009. Place of recording: Ci jo Village, Phu ...

  4. Symbolic Speech

    ERIC Educational Resources Information Center

    Podgor, Ellen S.

    1976-01-01

    The concept of symbolic speech emanates from the 1967 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in different strata from verbal communication. (LBH)

  5. Speech Aids

    NASA Technical Reports Server (NTRS)

    1987-01-01

    Designed to assist deaf and hearing-impaired persons in achieving better speech, Resnick Worldwide Inc.'s device provides a visual means of cuing the deaf as a speech-improvement measure. This is done by electronically processing the subjects' sounds and comparing them with optimum values, which are displayed for comparison.

  6. Obama Speeches

    E-print Network

    Hacker, Randi

    2009-02-25

    Broadcast Transcript: Barack Obama is already having a global impact. The Japanese are using his speeches to help teach English. In fact, a book with his collected speeches is currently at the top of the bestseller list on Amazon's Japanese website...

  7. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. The original analog methods of telephony had the disadvantage that the speech signal was corrupted by noise, cross-talk, and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. Digital transmission, on the other hand, is relatively immune to noise, cross-talk, and distortion, primarily because the digital signal can be faithfully regenerated at each repeater purely on the basis of a binary decision. The end-to-end performance of a digital link therefore becomes essentially independent of the length and operating frequency bands of the link, so from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise.

    The need to carry digital speech became extremely important from a service-provision point of view as well. Modern requirements have introduced the need for robust, flexible, and secure services that can carry a multitude of signal types (such as voice, data, and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally.

    The term "speech coding" refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where the speech is reconstructed or synthesized from the received codes. A more generic term, often used interchangeably with speech coding, is "voice coding": the coding techniques are equally applicable to any voice signal, whether or not it carries intelligible information, as the term speech implies. Other commonly used terms are "speech compression" and "voice compression," since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or to reduce storage requirements. In this document the terms speech and voice are used interchangeably.
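    The record above distinguishes coding speech directly as a waveform from coding it as analyzed parameters. As a minimal illustration of the waveform side (my example, not from the record), the sketch below shows μ-law companding, the logarithmic compression used in G.711-style telephony, which lets an 8-bit quantizer approach the voice quality of a much finer linear one:

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """Compress a waveform in [-1, 1] with the mu-law companding curve."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_decode(y, mu=255):
    """Invert the companding curve to recover the waveform."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

# 8-bit quantization of the companded signal: 256 levels instead of the
# thousands a linear quantizer would need for comparable voice quality.
x = np.sin(2 * np.pi * 200 * np.arange(0, 0.01, 1 / 8000))  # 200 Hz tone at 8 kHz
levels = np.round(mu_law_encode(x) * 127) / 127             # quantize to 8 bits
x_hat = mu_law_decode(levels)
print(np.max(np.abs(x - x_hat)))  # small reconstruction error
```

    The tone, sample rate, and 8-bit level count are illustrative assumptions; the companding formulas themselves are the standard μ-law definitions.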

  8. Speech Perception

    E-print Network

    Spoken words exist for mere moments, but from this fleeting acoustic signal we are able to apprehend considerable information. We can decode the linguistic message of the speaker as well as information about her gender, age, region of origin, identity, and emotional state. As adult listeners, we are so adept at speech perception that the ability seems trivial. However, the ease with which we perceive speech belies the complexity of the perceptual, cognitive, and neural mechanisms involved. The primary reason that speech perception is so complex is that there is no straightforward, one-to-one correspondence between a speech segment (e.g., /d/) and its acoustic qualities. About fifty years ago, researchers presumed that there was a simple one-to-one relationship and, based on this hypothesis, attempted to build a reading machine for the blind in which written text was translated into a sound alphabet with sound-by-sound translation. However, even with many hours of training, people could not comprehend the machine’s speech. This failure led to the discovery that speech is not a sequence of discrete sounds in the way that text is a string of separate letters (Liberman, 1996). Acoustic elements of a spoken word (e.g., the three speech segments in ‘dean’, /din/) are not produced discretely. Rather, the acoustic information for ...

  9. Speech production knowledge in automatic speech recognition 

    E-print Network

    King, Simon; Frankel, Joe; Livescu, Karen; McDermott, Erik; Richmond, Korin; Wester, Mirjam

    2007-01-01

    Although much is known about how speech is produced, and research into speech production has resulted in measured articulatory data, feature systems of different kinds and numerous models, speech production knowledge is almost totally ignored...

  10. ROBUST SPEECH RECOGNITION BY INTEGRATING SPEECH SEPARATION AND HYPOTHESIS TESTING

    E-print Network

    Wang, DeLiang "Leon"

    ROBUST SPEECH RECOGNITION BY INTEGRATING SPEECH SEPARATION AND HYPOTHESIS TESTING Soundararajan ... noisy speech is typically preprocessed by speech enhancement algorithms, such as spectral subtraction for speech enhancement followed by recognition of the enhanced speech [4]. The missing ...

  11. Great American Speeches

    NSDL National Science Digital Library

    Ms. Olsen

    2006-11-14

    Watch the video presentations of each of these speeches: the Gettysburg Address; Martin Luther King Jr., "I Have a Dream"; Mario Savio, "Freedom of Speech"; and FDR's New Worker Plan speech. For manuscripts, audio, and video of many other modern and past speeches, follow the link below: American Speech Bank ...

  12. 78 FR 49717 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ...FCC 13-101] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...may be filed electronically using the Internet by accessing the Commission's Electronic...Commission's Speech-to-Speech and Internet Protocol (IP)...

  13. Speech analyzer

    NASA Technical Reports Server (NTRS)

    Lokerson, D. C. (inventor)

    1977-01-01

    A speech signal is analyzed by applying the signal to formant filters which derive first, second, and third signals respectively representing the frequency of the speech waveform in the first, second, and third formants. A first pulse train is derived whose pulse rate approximately represents the average frequency of the first formant; second and third pulse trains are derived whose pulse rates respectively represent zero crossings of the second and third formants. The first formant pulse train is derived by establishing N signal-level bands, where N is an integer at least equal to two. Adjacent signal bands share common boundaries, each of which is a predetermined percentage of the peak level of a complete cycle of the speech waveform.
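    The second- and third-formant pulse trains in this analyzer encode frequency as zero crossings. A hedged sketch of that idea (the pure-tone input, sample rate, and 1500 Hz stand-in frequency are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def zero_crossing_rate(signal, sample_rate):
    """Estimate a dominant frequency from the zero-crossing count,
    in the spirit of the analyzer's formant pulse trains."""
    signs = np.signbit(signal).astype(int)
    crossings = np.sum(np.abs(np.diff(signs)))
    # each full cycle of a sinusoid produces two zero crossings
    duration = len(signal) / sample_rate
    return crossings / (2 * duration)

fs = 8000
t = np.arange(0, 0.5, 1 / fs)
tone = np.sin(2 * np.pi * 1500 * t)         # stand-in for a second-formant band
print(zero_crossing_rate(tone, fs))          # close to 1500
```

    A real formant tracker would band-pass filter the speech first so that each filter output is dominated by one formant; the zero-crossing count then tracks that formant's frequency.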

  14. Keynote Speeches.

    ERIC Educational Resources Information Center

    2000

    This document contains six of the seven keynote speeches from an international conference on vocational education and training (VET) for lifelong learning in the information era. "IVETA (International Vocational Education and Training Association) 2000 Conference, 6-9 August 2000" (K.Y. Yeung) discusses the objectives and activities of Hong…

  15. Speech Problems

    MedlinePLUS

    ... speak, we must coordinate many muscles from various body parts and systems, including the larynx, which contains the vocal cords; the teeth, lips, tongue, and mouth; and the respiratory system. The ability to understand language and produce speech is coordinated by the brain. ...

  16. Subvocal Speech

    NSDL National Science Digital Library

    Science Update

    2004-07-26

    Every word you say is controlled by electrical nerve signals from your brain, which tell your lips, throat, and tongue exactly how to say it. This Science Update lesson deals with how scientists are trying to tap into those silent speech commands.

  17. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  18. SPEECH ENHANCEMENT FOR NOISE-ROBUST SPEECH RECOGNITION

    E-print Network

    Shapiro, Benjamin

    SPEECH ENHANCEMENT FOR NOISE-ROBUST SPEECH RECOGNITION Vikramjit Mitra and Carol Y. Espy-Wilson, Speech Communication Lab (SCL). Introduction: proposes an adaptive speech enhancement technique ... noise-robust speech recognition ... speech enhancement ... noise-robust automated transcription ... corpus "Speech in Speech" ...

  19. SPEECH PARAMETERIZATION FOR AUTOMATIC SPEECH RECOGNITION IN NOISY CONDITIONS

    E-print Network

    SPEECH PARAMETERIZATION FOR AUTOMATIC SPEECH RECOGNITION IN NOISY CONDITIONS Bojana Gaji ... robustness of automatic speech recognition (ASR) systems against additive background noise, by finding speech parameters ... robustness of auditory-based speech parameterization methods, we compare the steps involved with those ...

  20. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-15

    ...FCC 13-101] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...Commission's Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...378-3160, fax: (202) 488-5563, Internet: www.bcpiweb.com. Document...

  1. Speech and Language Impairments

    MedlinePLUS

    ... CFR §300.8(c)(11)] Back to top Development of Speech and Language Skills in Childhood Speech ... a professional. More on the Milestones of Language Development What are the milestones of typical speech-language ...

  2. Speech recognition and understanding

    SciTech Connect

    Vintsyuk, T.K.

    1983-05-01

    This article discusses the automatic processing of speech signals with the aim of finding a sequence of words (speech recognition) or a concept (speech understanding) transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will automatically edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method to the solution of basic problems in the recognition and understanding of speech is presented.
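    The stored-template comparison this abstract describes is the classic dynamic-programming approach to recognition, of which dynamic time warping (DTW) is the standard instance. A minimal sketch, assuming scalar feature sequences and toy "rising"/"falling" templates invented here for illustration (not from the article):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences:
    the dynamic-programming template comparison sketched in the abstract."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # best of insertion, deletion, match -- the classic DP recurrence
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# recognition phase = pick the stored template with the smallest warped distance
templates = {"rising": [0, 1, 2, 3, 4], "falling": [4, 3, 2, 1, 0]}
observed = [0, 0, 1, 1, 2, 3, 4]   # a slower "rising" contour
best = min(templates, key=lambda w: dtw_distance(observed, templates[w]))
print(best)  # -> rising
```

    Real systems compare sequences of spectral feature vectors rather than scalars, but the recurrence is identical.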

  3. REPRESENTING SPEECH RHYTHM

    Microsoft Academic Search

    B. Zellner Keller; E. Keller

    The issue of representing speech rhythm is understood in this paper as the search for relevant primary parameters that will allow the formalisation of speech rhythm. Current speech synthesisers show that phonological models are not satisfactory with respect to the modelling of speech rhythmicity. Our analysis indicates that this may be in part related to the formalisation of rhythmic representation.

  4. Opportunities in Speech Pathology.

    ERIC Educational Resources Information Center

    Newman, Parley W.

    The importance of speech is discussed and speech pathology is described. Types of communication disorders considered are articulation disorders, aphasia, facial deformity, hearing loss, stuttering, delayed speech, voice disorders, and cerebral palsy; examples of five disorders are given. Speech pathology is investigated from these aspects: the…

  5. Speech research directions

    SciTech Connect

    Atal, B.S.; Rabiner, L.R.

    1986-09-01

    This paper presents an overview of the current activities in speech research. The authors discuss the state of the art in speech coding, text-to-speech synthesis, speech recognition, and speaker recognition. In the speech coding area, current algorithms perform well at bit rates down to 9.6 kb/s, and the research is directed at bringing the rate for high-quality speech coding down to 2.4 kb/s. In text-to-speech synthesis, what we currently are able to produce is very intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility to the synthetic speech that these systems produce. Finally, today's systems for speech and speaker recognition provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, etc.

  6. Introduction Speech and non-speech exhibit similar spectrally contrastive

    E-print Network

    Holt, Lori L.

    Introduction: Speech and non-speech exhibit similar spectrally contrastive context effects on speech ... perception of coarticulated speech with contrastively enhanced spectrotemporal ... in 3 noise conditions ... Conclusions: General, non-speech-specific contrast enhancement can benefit recognition of coarticulated speech ...

  7. ROBUST SPEECH RECOGNITION USING MULTIPLE PRIOR MODELS FOR SPEECH RECONSTRUCTION

    E-print Network

    Wang, DeLiang "Leon"

    ROBUST SPEECH RECOGNITION USING MULTIPLE PRIOR MODELS FOR SPEECH RECONSTRUCTION Arun Narayanan ... speech recognition to enhance noisy speech. Typically, a single prior model is trained by pooling ... normalization (CMN), while others preprocess noisy speech using speech enhancement techniques. If noise samples ...

  8. Cochlear implant speech recognition with speech maskers

    Microsoft Academic Search

    Ginger S. Stickney; Fan-Gang Zeng; Ruth Litovsky; Peter Assmann

    2004-01-01

    Speech recognition performance was measured in normal-hearing and cochlear-implant listeners with maskers consisting of either steady-state speech-spectrum-shaped noise or a competing sentence. Target sentences from a male talker were presented in the presence of one of three competing talkers (same male, different male, or female) or speech-spectrum-shaped noise generated from this talker at several target-to-masker ratios. For the normal-hearing listeners,

  9. Trainable videorealistic speech animation

    E-print Network

    Ezzat, Tony F. (Tony Farid)

    2002-01-01

    I describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a pre-determined speech corpus. After ...

  10. Speech disorders - children

    MedlinePLUS

    ... person has problems creating or forming the speech sounds needed to communicate with others. Three common speech ... are disorders in which a person repeats a sound, word, or phrase. Stuttering may be the most ...

  11. Silence, speech, and responsibility

    E-print Network

    Maitra, Ishani, 1974-

    2002-01-01

    Pornography deserves special protections, it is often said, because it qualifies as speech; therefore, no matter what we think of it, we must afford it the protections that we extend to most speech, but don't extend to ...

  12. Apraxia of Speech

    MedlinePLUS

    ... the nervous system ( aphasia ). Developmental apraxia of speech (DAS) occurs in children and is present from birth. ... dyspraxia, articulatory apraxia, and childhood apraxia of speech. DAS is different from what is known as a ...

  13. Speech and Communication Disorders

    MedlinePLUS

    ... to being completely unable to speak or understand speech. Causes include Hearing disorders and deafness Voice problems, ... or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism ...

  14. CHAPTER 1. INTRODUCTION Speech coding or speech compression is one of the important aspects of speech

    E-print Network

    Beex, A. A. "Louis"

    1 CHAPTER 1. INTRODUCTION Speech coding or speech compression is one of the important aspects of speech communications nowadays. Some of the speech communication media that need speech coding are wireless communications and Internet telephony. By coding the speech, the speed to transmit the digitized

  15. Free Speech Yearbook 1978.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  16. Constrained iterative speech enhancement with application to speech recognition

    Microsoft Academic Search

    John H. L. Hansen; Mark A. Clements

    1991-01-01

    The basis of an improved form of iterative speech enhancement for single-channel inputs is sequential maximum a posteriori estimation of the speech waveform and its all-pole parameters, followed by imposition of constraints upon the sequence of speech spectra. The approaches impose intraframe and interframe constraints on the input speech signal. Properties of the line spectral pair representation of speech allow

  17. Recognizing Speech in a Novel The Motor Theory of Speech

    E-print Network

    Paris-Sud XI, Université de

    Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed. Clément Moulin ... ABSTRACT: The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned ...

  18. Speech Recognition: A General Overview.

    ERIC Educational Resources Information Center

    de Sopena, Luis

    Speech recognition is one of five main areas in the field of speech processing. Difficulties in speech recognition include variability in sound within and across speakers, in channel, in background noise, and of speech production. Speech recognition can be used in a variety of situations: to perform query operations and phone call transfers; for…

  19. Springer Handbook on Speech Processing and Speech Communication 1 LOW-BIT-RATE SPEECH CODING

    E-print Network

    Springer Handbook on Speech Processing and Speech Communication: LOW-BIT-RATE SPEECH CODING. Alan ... USA. ABSTRACT: Low-bit-rate speech coding, at rates below 4 kb/s, is needed for both communication and voice storage applications. At such low rates, full encoding of the speech waveform is not possible ...

  20. Reviews: The High School Basic Speech Text

    ERIC Educational Resources Information Center

    Klopf, Donald W.

    1970-01-01

    Critical evaluations of "The Art of Speaking," "Building Better Speech," "The New American Speech," "Public Speaking: The Essentials," "Speak Up!," "Speech: A High School Course," "The Speech Arts," "Speech for All," "Speech for Today," "Speech in Action," and "Speech in American Society." (RD)

  1. Early recognition of speech

    PubMed Central

    Remez, Robert E; Thomas, Emily F

    2013-01-01

    Classic research on the perception of speech sought to identify minimal acoustic correlates of each consonant and vowel. In explaining perception, this view designated momentary components of an acoustic spectrum as cues to the recognition of elementary phonemes. This conceptualization of speech perception is untenable given the findings of phonetic sensitivity to modulation independent of the acoustic and auditory form of the carrier. The empirical key is provided by studies of the perceptual organization of speech, a low-level integrative function that finds and follows the sensory effects of speech amid concurrent events. These projects have shown that the perceptual organization of speech is keyed to modulation; it is fast, unlearned, nonsymbolic, and indifferent to short-term auditory properties, and it requires attention. The ineluctably multisensory nature of speech perception also imposes conditions that distinguish language among cognitive systems. WIREs Cogn Sci 2013, 4:213–223. doi: 10.1002/wcs.1213 PMID:23926454

  2. Hearing or speech impairment - resources

    MedlinePLUS

    Resources - hearing or speech impairment ... The following organizations are good resources for information on hearing impairment or speech impairment: American Speech-Language-Hearing Association - www.asha.org National Dissemination Center for Children ...

  3. Speech Sound Disorders: Articulation and Phonological Processes

    MedlinePLUS

    Speech Sound Disorders: Articulation and Phonological Processes What are speech sound disorders ? Can adults have speech sound disorders ? What ... individuals with speech sound disorders ? What are speech sound disorders? Most children make some mistakes as they ...

  4. Analyzing a Famous Speech

    NSDL National Science Digital Library

    Melissa Weeks Noel

    2012-08-01

    After gaining skill through analyzing a historic and contemporary speech as a class, students will select a famous speech from a list compiled from several resources and write an essay that identifies and explains the rhetorical strategies that the author deliberately chose while crafting the text to make an effective argument. Their analysis will consider questions such as: What makes the speech an argument?, How did the author's rhetoric evoke a response from the audience?, and Why are the words still venerated today?

  5. Distributed processing for speech understanding

    SciTech Connect

    Bronson, E.C.; Siegel, L.

    1983-01-01

    Continuous speech understanding is a highly complex artificial intelligence task requiring extensive computation. This complexity precludes real-time speech understanding on a conventional serial computer. Distributed processing techniques can be applied to the speech understanding task to improve processing speed. In this paper, the speech understanding task and several speech understanding systems are described. Parallel processing techniques are presented and a distributed processing architecture for speech understanding is outlined. 35 references.

  6. Articulatory speech synthesis and speech production modelling

    NASA Astrophysics Data System (ADS)

    Huang, Jun

    This dissertation addresses the problem of speech synthesis and speech production modelling based on the fundamental principles of human speech production. Unlike the conventional source-filter model, which assumes the independence of the excitation and the acoustic filter, we treat the entire vocal apparatus as one system consisting of a fluid dynamic aspect and a mechanical part. We model the vocal tract by a three-dimensional moving geometry. We also model the sound propagation inside the vocal apparatus as a three-dimensional nonplane-wave propagation inside a viscous fluid described by Navier-Stokes equations. In our work, we first propose a combined minimum energy and minimum jerk criterion to estimate the dynamic vocal tract movements during speech production. Both theoretical error bound analysis and experimental results show that this method can achieve a very close match at the target points while avoiding abrupt changes in the articulatory trajectory. Second, a mechanical vocal fold model is used to compute the excitation signal of the vocal tract. The advantage of this model is that it is closely coupled with the vocal tract system based on fundamental aerodynamics. As a result, we can obtain an excitation signal with much more detail than the conventional parametric vocal fold excitation model. Furthermore, strong evidence of source-tract interaction is observed. Finally, we propose a computational model of the fricative and stop types of sounds based on the physical principles of speech production. The advantage of this model is that it uses an exogenous process to model the additional nonsteady and nonlinear effects due to the flow mode, which are ignored by the conventional source-filter speech production model. A recursive algorithm is used to estimate the model parameters. Experimental results show that this model is able to synthesize good quality fricative and stop types of sounds.
Based on our dissertation work, we carefully argue that the articulatory speech production model has the potential to flexibly synthesize natural-quality speech sounds and to provide a compact computational model for speech production that can be beneficial to a wide range of areas in speech signal processing.
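    The dissertation's combined minimum-energy/minimum-jerk criterion is not spelled out in this abstract; as a hedged illustration of the minimum-jerk half alone, the standard closed-form solution for a point-to-point movement (a quintic with zero velocity and acceleration at both endpoints) can be sketched as:

```python
import numpy as np

def min_jerk(x0, xf, T, t):
    """Closed-form minimum-jerk trajectory from rest position x0 to xf in time T:
    the unique quintic with zero velocity and acceleration at both endpoints."""
    tau = np.asarray(t) / T
    return x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)

t = np.linspace(0.0, 0.2, 5)   # a 200 ms articulator gesture (illustrative)
x = min_jerk(0.0, 1.0, 0.2, t)
print(np.round(x, 3))          # smooth S-shaped rise from 0 to 1
```

    This matches the abstract's stated goal: the trajectory hits the target exactly while avoiding abrupt changes, since jerk (the third derivative) is minimized.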

  7. Chief Seattle's Speech Revisited

    ERIC Educational Resources Information Center

    Krupat, Arnold

    2011-01-01

    Indian orators have been saying good-bye for more than three hundred years. John Eliot's "Dying Speeches of Several Indians" (1685), as David Murray notes, inaugurates a long textual history in which "Indians... are most useful dying," or, as in a number of speeches, bidding the world farewell as they embrace an undesired but apparently inevitable…

  8. Packet Speech Systems Technology

    Microsoft Academic Search

    C. J. Weinstein; P. E. Blankenship

    1981-01-01

    The long-range objectives of the Packet Speech Systems Technology Program are to develop and demonstrate techniques for efficient digital speech communication on networks suitable for both voice and data, and to investigate and develop techniques for integrated voice and data communication in packetized networks, including wideband common-user satellite links. Specific areas of concern are: the concentration of statistically fluctuating volumes

  9. Speeches by TAFE Directors.

    ERIC Educational Resources Information Center

    Wilson, Sara, Ed.

    Three directors of TAFE (Technical and Further Education) are represented in this publication. Speeches by Lyall P. Fricker of TAFE, South Australia, are "Innovation in TAFE" and "Tertiary Education for All." Speeches by Allan Pattison of TAFE, New South Wales, include "TAFE in New South Wales: Past Achievements and Future Prospects" and "TAFE and…

  10. Illustrated Speech Anatomy.

    ERIC Educational Resources Information Center

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  11. Private Speech in Ballet

    ERIC Educational Resources Information Center

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  12. Free Speech Yearbook 1975.

    ERIC Educational Resources Information Center

    Barbour, Alton, Ed.

    This issue of the "Free Speech Yearbook" contains the following: "Between Rhetoric and Disloyalty: Free Speech Standards for the Sunshine Soldier" by Richard A. Parker; "William A. Rehnquist: Ideologist on the Bench" by Peter E. Kane; "The First Amendment's Weakest Link: Government Regulation of Controversial Advertising" by Patricia Goss;…

  13. STUDENTS AND FREE SPEECH

    NSDL National Science Digital Library

    dramsden

    2013-04-22

    Free speech is a constitutional right, correct? What about in school? The US Constitution protects everyone, young or old, big or small. As Horton said, "A person is a person, no matter how small." Yet does that mean people can say whatever they want, whenever they want? Does the right to free speech give ...

  14. Free Speech. No. 38.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schorr Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tom Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted for Schorr Inquiry" by Richard Lyons, "Erosion of the…

  15. .ROBUST SPEECH RECOGNITION USING SINGULAR VALUE DECOMPOSITION BASED SPEECH ENHANCEMENT

    E-print Network

    ROBUST SPEECH RECOGNITION USING SINGULAR VALUE DECOMPOSITION BASED SPEECH ENHANCEMENT B. T. Lilly ... the performance of a speech recognition system in the presence of noise ... is to enhance the speech prior to ... is the enhanced speech signal. In the experiments presented in this paper, we use singular value decomposition ...

  16. Constrained iterative speech enhancement with application to automatic speech recognition

    Microsoft Academic Search

    John H. L. Hansen; Mark A. Clements

    1988-01-01

    A set of iterative speech enhancement techniques using spectral constraints is extended and evaluated. The approaches apply inter- and intraframe spectral constraints to ensure optimum speech quality across all classes of speech. Constraints are applied on the basis of the presence of perceptually important speech characteristics found during the enhancement procedure. Results show improvement over past techniques for additive white

  17. EFFECT OF SPEECH CODERS ON SPEECH RECOGNITION PERFORMANCE

    E-print Network

    EFFECT OF SPEECH CODERS ON SPEECH RECOGNITION PERFORMANCE. B. T. Lilly and K. K. Paliwal, School of Microelectronic Engineering, Griffith University, Brisbane, QLD 4111, Australia. ABSTRACT: Speech coders with bitrates … to expect that the performance of speech recognition systems will deteriorate when coded speech is applied…

  18. THE EFFECT OF SPEECH AND AUDIO COMPRESSION ON SPEECH RECOGNITION

    E-print Network

    Boyer, Edmond

    THE EFFECT OF SPEECH AND AUDIO COMPRESSION ON SPEECH RECOGNITION PERFORMANCE. L. Besacier, C. …, Besacier@imag.fr. Abstract: This paper proposes an in-depth look at the influence of different speech and audio codecs on the performance of our continuous speech recognition engine. GSM full rate, G711, G723.1 and MPEG coders…
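    One of the codecs listed above, G.711, is built on logarithmic companding: quiet speech gets finer quantization than loud speech. As a hedged illustration of that idea (the actual standard uses an 8-bit piecewise-linear approximation of this curve, not the continuous formula), mu-law companding and expansion can be sketched as:

    ```python
    import numpy as np

    def mulaw_encode(x, mu=255):
        """Continuous mu-law companding of a signal normalized to [-1, 1]."""
        return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

    def mulaw_decode(y, mu=255):
        """Exact inverse of mulaw_encode (expansion)."""
        return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

    # Round-trip through the companding curve recovers the original signal
    x = np.linspace(-1, 1, 101)
    rt = mulaw_decode(mulaw_encode(x))
    ```

    The compression is what makes the curve useful: a small amplitude like 0.01 is mapped to roughly 0.23, spreading quiet signals across more of the quantizer's range.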

  19. Speech, Language & Hearing Association

    NSDL National Science Digital Library

    The American Speech-Language-Hearing Association’s (ASHA) mission statement is to “promote the interests of and provide the highest quality services for professionals in audiology, speech-language pathology, and speech and hearing science.” Their website is designed to help ASHA accomplish this task, and is a valuable resource for anyone involved in this industry. The ASHA has been around for 79 years and in that time has created resources for students and the general public, in order to educate people about speech and communication disorders and diseases. The site includes detailed explanations on many diseases and disorders and provides additional resources for those who want to learn more. For students, there are sections with information on various speech, language, and hearing professions; a guide to academic programs; and a useful guide to the Praxis exam required for many of these professions.

  20. Estimation of Severity of Speech Disability through Speech Envelope

    E-print Network

    Gudi, Anandthirtha B; Nagaraj, H C; 10.5121/sipij.2011.2203

    2011-01-01

    In this paper, envelope detection of speech is discussed to distinguish the pathological cases of speech disabled children. The speech signal samples of children of age between five to eight years are considered for the present study. These speech signals are digitized and are used to determine the speech envelope. The envelope is subjected to ratio mean analysis to estimate the disability. This analysis is conducted on ten speech signal samples which are related to both place of articulation and manner of articulation. Overall speech disability of a pathological subject is estimated based on the results of above analysis.
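    The envelope detection step described in this abstract can be sketched with a classic detector: full-wave rectification followed by low-pass smoothing. This is a generic illustration, not the authors' exact procedure; the window length and the toy amplitude-modulated signal are assumptions.

    ```python
    import numpy as np

    def speech_envelope(signal, win=200):
        """Classic envelope detector: full-wave rectification followed by a
        moving-average low-pass filter (win=200 samples is ~25 ms at 8 kHz)."""
        rectified = np.abs(signal)
        kernel = np.ones(win) / win
        return np.convolve(rectified, kernel, mode="same")

    # Toy amplitude-modulated "speech": the envelope should track the slow modulation
    fs = 8000
    t = np.arange(fs) / fs
    modulation = 0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)   # 3 Hz syllable-rate envelope
    carrier = np.sin(2 * np.pi * 300 * t)                # 300 Hz voiced carrier
    env = speech_envelope(modulation * carrier)
    ```

    Summary statistics of such an envelope (e.g. ratios of means across segments, as the abstract's "ratio mean analysis" suggests) can then be compared across speakers.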

  1. Voice and Speech after Laryngectomy

    ERIC Educational Resources Information Center

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  2. Delayed Speech or Language Development

    MedlinePLUS

    ... speech-language pathologist will look at a child's speech and language skills within the context of total development. Besides observing ... work with your child at home to improve speech and language skills. Evaluation by a speech-language pathologist may find ...

  3. Emacspeak—direct speech access

    Microsoft Academic Search

    T. V. Raman

    1996-01-01

    Emacspeak is a full-fledged speech output interface to Emacs, and is being used to provide direct speech access to a UNIX workstation. The kind of speech access provided by Emacspeak is qualitatively different from what conventional screen-readers provide: Emacspeak makes applications speak, as opposed to speaking the screen. Emacspeak is the first full-fledged speech output system that…

  4. Speech Perception Dominic W. Massaro

    E-print Network

    Massaro, Dominic

    Speech Perception. Dominic W. Massaro, University of California, Santa Cruz, CA 95060, USA (Massaro@ucsc.edu). This psychological account of speech perception includes … Research and theory indicate that speech perception is a form of pattern recognition that is influenced by multiple … Speech perception warrants an entry…

  5. MARQUETTE UNIVERSITY Speech Signal Enhancement

    E-print Network

    Johnson, Michael T.

    MARQUETTE UNIVERSITY. Speech Signal Enhancement Using a Microphone Array. A thesis submitted … Preface: This thesis describes the design and implementation of a speech enhancement system that uses microphone array beamforming and speech enhancement algorithms applied to a speech signal…

  6. Robust recognition of children's speech

    Microsoft Academic Search

    Alexandros Potamianos; Shrikanth S. Narayanan

    2003-01-01

    Abstract—Developmental changes in speech production introduce age-dependent spectral and temporal variability in the speech signal produced by children. Such variabilities pose challenges for robust automatic recognition of children's speech. Through an analysis of age-related acoustic characteristics of children's speech in the context of automatic speech recognition (ASR), effects such as frequency scaling of spectral envelope parameters are demonstrated. Recognition…

  7. The Effect of SpeechEasy on Stuttering Frequency, Speech Rate, and Speech Naturalness

    ERIC Educational Resources Information Center

    Armson, Joy; Kiefte, Michael

    2008-01-01

    The effects of SpeechEasy on stuttering frequency, stuttering severity self-ratings, speech rate, and speech naturalness for 31 adults who stutter were examined. Speech measures were compared for samples obtained with and without the device in place in a dispensing setting. Mean stuttering frequencies were reduced by 79% and 61% for the device…

  8. Generalization to conversational speech.

    PubMed

    Elbert, M; Dinnsen, D A; Swartzlander, P; Chin, S B

    1990-11-01

    Although changes in children's phonological systems due to treatment have been documented in single-word testing, changes in conversational speech are less well known. Single-word and conversation samples were analyzed for 10 phonologically disordered children, before and after treatment and 3 months later. Results suggest that for most of the children, there were system changes in both single words and in conversational speech. It appears that many phonologically disordered children are able to extend their correct production to conversation without direct treatment on spontaneous speech. PMID:2232748

  9. Portable Speech Synthesizer

    NASA Technical Reports Server (NTRS)

    Leibfritz, Gilbert H.; Larson, Howard K.

    1987-01-01

    Compact speech synthesizer is useful traveling companion to the speech-handicapped. User simply enters statement on keyboard, and synthesizer converts statement into spoken words. Battery-powered and housed in briefcase, easily carried on trips. Unit used on telephones and in face-to-face communication. Synthesizer consists of microcomputer with memory-expansion module, speech-synthesizer circuit, batteries, recharger, dc-to-dc converter, and telephone amplifier. Components, commercially available, fit neatly in 17- by 13- by 5-in. briefcase. Weighs about 20 lb (9 kg) and operates and recharges from ac receptacle.

  10. Maynard Dixon: "Free Speech."

    ERIC Educational Resources Information Center

    Day, Michael

    1987-01-01

    Based on Maynard Dixon's oil painting, "Free Speech," this lesson attempts to expand high school students' understanding of art as a social commentary and the use of works of art to convey ideas and ideals. (JDH)

  11. Speech-Language Pathologists

    MedlinePLUS

    ... options Create and carry out an individualized treatment plan When treating patients, speech-language pathologists typically do ... any changes in a patient’s condition or treatment plan, and, eventually, they complete a final evaluation when ...

  12. Great American Speeches

    NSDL National Science Digital Library

    This new companion site from PBS offers an excellent collection of speeches, some with audio and video clips, from many of the nation's "most influential and poignant speakers of the recorded age." In the Speech Archives, users will find a timeline of significant 20th-century events interspersed with the texts of over 90 speeches, some of which also offer background and audio or video clips. Additional sections of the site include numerous activities for students: two quizzes in the American History Challenge, Pop-Up Trivia, A Wordsmith Challenge, Critics' Corner and Could You be a Politician? which allows visitors to try their hand at reading a speech off of a teleprompter.

  13. A Wedding Speech

    E-print Network

    Bkra shis bzang po

    2009-01-01

    This collection of 77 audio files focuses on weddings and weddings speeches but also contains: folk tales, folk songs, riddles, tongue twisters, and local history from Bang smad Village and Ri sne Village, Bang smad Township, Nyag rong County...

  14. Auditory speech preprocessors

    SciTech Connect

    Zweig, G.

    1989-01-01

    A nonlinear transmission line model of the cochlea (Zweig 1988) is proposed as the basis for a novel speech preprocessor. Sounds of different intensities, such as voiced and unvoiced speech, are preprocessed in radically different ways. The Q's of the preprocessor's nonlinear filters vary with input amplitude, higher Q's (longer integration times) corresponding to quieter sounds. Like the cochlea, the preprocessor acts as a ''subthreshold laser'' that traps and amplifies low level signals, thereby aiding in their detection and analysis. 17 refs.

  15. Computer-generated speech

    SciTech Connect

    Aimthikul, Y.

    1981-12-01

    This thesis reviews the essential aspects of speech synthesis and distinguishes between the two prevailing techniques: compressed digital speech and phonemic synthesis. It then presents the hardware details of the five speech modules evaluated. FORTRAN programs were written to facilitate message creation and retrieval with four of the modules driven by a PDP-11 minicomputer. The fifth module was driven directly by a computer terminal. The compressed digital speech modules (T.I. 990/306, T.S.I. Series 3D and N.S. Digitalker) each contain a limited vocabulary produced by the manufacturers while both the phonemic synthesizers made by Votrax permit an almost unlimited set of sounds and words. A text-to-phoneme rules program was adapted for the PDP-11 (running under the RSX-11M operating system) to drive the Votrax Speech Pac module. However, the Votrax Type'N Talk unit has its own built-in translator. Comparison of these modules revealed that the compressed digital speech modules were superior in pronouncing words on an individual basis but lacked the inflection capability that permitted the phonemic synthesizers to generate more coherent phrases. These findings were necessarily highly subjective and dependent on the specific words and phrases studied. In addition, the rapid introduction of new modules by manufacturers will necessitate new comparisons. However, the results of this research verified that all of the modules studied do possess reasonable quality of speech that is suitable for man-machine applications. Furthermore, the development tools are now in place to permit the addition of computer speech output in such applications.

  16. Speech perception and production

    PubMed Central

    Casserly, Elizabeth D.; Pisoni, David B.

    2012-01-01

    Until recently, research in speech perception and speech production has largely focused on the search for psychological and phonetic evidence of discrete, abstract, context-free symbolic units corresponding to phonological segments or phonemes. Despite this common conceptual goal and intimately related objects of study, however, research in these two domains of speech communication has progressed more or less independently for more than 60 years. In this article, we present an overview of the foundational works and current trends in the two fields, specifically discussing the progress made in both lines of inquiry as well as the basic fundamental issues that neither has been able to resolve satisfactorily so far. We then discuss theoretical models and recent experimental evidence that point to the deep, pervasive connections between speech perception and production. We conclude that although research focusing on each domain individually has been vital in increasing our basic understanding of spoken language processing, the human capacity for speech communication is so complex that gaining a full understanding will not be possible until speech perception and production are conceptually reunited in a joint approach to problems shared by both modes. PMID:23946864

  17. Join Cost for Unit Selection Speech Synthesis 

    E-print Network

    Vepa, Jithendra

    Undoubtedly, state-of-the-art unit selection-based concatenative speech systems produce very high quality synthetic speech. This is due to a large speech database containing many instances of each speech unit, with a varied and natural distribution...

  18. Speech Recognition Via Phonetically Featured Syllables 

    E-print Network

    King, Simon; Stephenson, Todd; Isard, Stephen; Taylor, Paul; Strachan, Alex

    We describe a speech recogniser which uses a speech production-motivated phonetic-feature description of speech. We argue that this is a natural way to describe the speech signal and offers an efficient intermediate parameterisation for use...

  19. Speech processing using maximum likelihood continuity mapping

    DOEpatents

    Hogden, John E. (Santa Fe, NM)

    2000-01-01

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  20. Speech processing using maximum likelihood continuity mapping

    SciTech Connect

    Hogden, J.E.

    2000-04-18

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  1. Towards speech as a knowledge resource

    Microsoft Academic Search

    Eric W. Brown; Savitha Srinivasan; Anni Coden; Dulce B. Ponceleon; James W. Cooper; Arnon Amir; Jan Pieper

    2001-01-01

    Speech is a tantalizing mode of human communication. On the one hand, humans understand speech with ease and use speech to express complex ideas, information, and knowledge. On the other hand, automatic speech recognition with computers is still very hard, and extracting knowledge from speech is even harder. In this paper we motivate the study of speech as a knowledge resource…

  2. SPEECH DECOMPOSITION AND ENHANCEMENT Sungyub Yoo

    E-print Network

    Allen, Jont

    SPEECH DECOMPOSITION AND ENHANCEMENT by Sungyub Yoo (BS, Soonchunhyang University, 1995; MS, …). … to enhance speech intelligibility is examined. Computer algorithms to decompose speech into two different … enhanced speech. The energy of the enhanced speech was adjusted to be equal to the original speech…

  3. Speech Enhancement of Noisy Speech Using Log-Spectral Amplitude Estimator and Harmonic Tunneling

    E-print Network

    Wichmann, Felix

    Speech Enhancement of Noisy Speech Using Log-Spectral Amplitude Estimator and Harmonic Tunneling. In this paper we present a two-stage noise reduction algorithm for speech enhancement. … Noise degrades speech quality and decreases the performance of speech coding and speech recognition systems. In speech enhancement…

  4. Cochlear implant speech recognition with speech maskers

    E-print Network

    Litovsky, Ruth

    Cochlear implant speech recognition with speech maskers. Ginger S. Stickney and Fan-Gang Zeng; accepted 16 May 2004. Speech recognition performance was measured in normal-hearing and cochlear-implant listeners … processed through a noise-excited vocoder designed to simulate a cochlear implant. With unprocessed stimuli…

  5. Speech separation using speaker-adapted eigenvoice speech models

    E-print Network

    Ellis, Dan

    … are not known a priori. The sources are modeled using hidden Markov models (HMMs) and separated using factorial … evaluated on the task defined in the 2006 Speech Separation Challenge [Cooke, M.P., Lee, T.-W., 2008. The 2006 Speech Separation Challenge. Computer Speech and Language] and compared with separation using…

  6. Differential Diagnosis of Severe Speech Disorders Using Speech Gestures

    ERIC Educational Resources Information Center

    Bahr, Ruth Huntley

    2005-01-01

    The differentiation of childhood apraxia of speech from severe phonological disorder is a common clinical problem. This article reports on an attempt to describe speech errors in children with childhood apraxia of speech on the basis of gesture use and acoustic analyses of articulatory gestures. The focus was on the movement of articulators and…

  7. Open Domain Speech Recognition & Translation:Lectures and Speeches

    Microsoft Academic Search

    C. Fugen; M. Kolss; D. Bernreuther; M. Paulik; S. Stuker; S. Vogel; A. Waibel

    2006-01-01

    For years speech translation has focused on the recognition and translation of discourses in limited domains, such as hotel reservations or scheduling tasks. Only recently have research projects been started to tackle the problem of open domain speech recognition and translation of complex tasks such as lectures and speeches. In this paper we present the on-going work at our laboratory…

  8. Large Scale Speech Synthesis Evaluation 

    E-print Network

    Podsiadlo, Monika

    2007-11-11

    In speech synthesis evaluation, it is critical that we know what exactly affects the results of the evaluation rather than employing notions as vague as, say, "good quality speech". So far we have only been able to ...

  9. Development of a speech autocuer

    NASA Technical Reports Server (NTRS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; Mccartney, M. L.

    1980-01-01

    A wearable, visually based prosthesis for the deaf, based upon the proven method for removing lipreading ambiguity known as cued speech, was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  10. A virtual vocabulary speech recognizer

    E-print Network

    Pathe, Peter D

    1983-01-01

    A system for the automatic recognition of human speech is described. A commercially available speech recognizer sees its recognition vocabulary increased through the use of virtual memory management techniques. Central to ...

  11. Speech coding: a tutorial review

    Microsoft Academic Search

    ANDREAS S. SPANIAS

    1994-01-01

    The past decade has witnessed substantial progress towards the application of low-rate speech coders to civilian and military communications as well as computer-related voice applications. Central to this progress has been the development of new speech coders capable of producing high-quality speech at low data rates. Most of these coders incorporate mechanisms to: represent the spectral properties of speech, provide

  12. Abortion and compelled physician speech.

    PubMed

    Orentlicher, David

    2015-03-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. PMID:25846035

  13. SPEECH ENHANCEMENT FOR CROSSTALK INTERFERENCE

    E-print Network

    SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H. L. Hansen. … Based on this noise estimate, a new speech enhancement technique is proposed. The enhancement method … EDICS Code: SPL.SA.1.5, Speech Enhancement. Submitted Jan. 19, 1996, to IEEE Signal Processing Letters.

  14. The State of Free Speech

    Microsoft Academic Search

    Danielle Celermajer

    2008-01-01

    This article surveys the regulation of speech under the government of Prime Minister Howard so as to clarify the understandings of free speech that have been operant and to extrapolate from this debate to consider broader shifts in political forms. It reviews and contrasts liberal and republican conceptions of free speech and argues that what suffered most is the critical

  15. SPEECH CODING: FUNDAMENTALS AND APPLICATIONS

    E-print Network

    Alwan, Abeer

    SPEECH CODING: FUNDAMENTALS AND APPLICATIONS. MARK HASEGAWA-JOHNSON, University of Illinois … 1. INTRODUCTION: Speech coding is the process of obtaining a compact representation of voice … Speech coders have become essential components in telecommunications and in the multimedia infrastructure…

  16. SPEECH ESSENTIAL SKILL OUTCOME STATEMENTS

    E-print Network

    Gering, Jon C.

    SB #3405. SPEECH ESSENTIAL SKILL OUTCOME STATEMENTS. Newly-Proposed Version: A liberally educated … of the speechmaking process, including selecting topics, organizing speeches, using persuasive appeals, and using …. Current Version: Speaking and listening competencies, approved by the governing body of the Speech…

  17. Barack Obama's South Carolina speech

    Microsoft Academic Search

    Alessandro Capone

    2010-01-01

    In this paper, I shall analyze US President Barack Obama's South Carolina victory speech from the perspective of pragmemes. In particular, I shall explore the idea that this speech is constituted by many voices (in other words, it displays polyphony, to use an idea due to Bakhtin, 1981, 1986) and that the audience is part of this speech event, adding…

  18. Castro Speech Databases

    NSDL National Science Digital Library

    The Latin American Network Information Center at the University of Texas provides access to a searchable and browsable database of speeches by Cuban Leader Fidel Castro. It contains "full text of English translations of speeches, interviews, and press conferences by Castro, based upon the records of the Foreign Broadcast Information Service (FBIS), a US government agency responsible for monitoring broadcast and print media in countries throughout the world." Users should note that the search interface, while allowing searching on any of nine types of documents, as well as keyword and date, lacks user guidance. Documents are organized by date. While this is not a repository of all of Castro's speeches, the amount of material at the site makes it valuable to researchers.

  19. Speech spectrogram expert

    SciTech Connect

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90 percent accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (SPEX) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relate to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an English spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.
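    The spectrogram that such a system reasons over is the magnitude of a short-time Fourier transform. A minimal sketch of computing one (the window and hop sizes, and the pure-tone test signal, are illustrative assumptions, not details from this record):

    ```python
    import numpy as np

    def spectrogram(x, n_fft=256, hop=128):
        """Magnitude spectrogram via a Hann-windowed short-time Fourier
        transform: rows are time frames, columns are frequency bins."""
        window = np.hanning(n_fft)
        frames = [x[i:i + n_fft] * window
                  for i in range(0, len(x) - n_fft + 1, hop)]
        return np.abs(np.fft.rfft(frames, axis=1))  # (num_frames, n_fft//2 + 1)

    # A 1 kHz tone at 8 kHz sampling should peak at bin 1000 * n_fft / fs = 32
    fs = 8000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 1000 * t)
    S = spectrogram(tone)
    peak_bin = int(S.mean(axis=0).argmax())
    ```

    Visual features like formant tracks and bursts, which the vision expert would look for, are patterns of energy in this time-frequency matrix.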

  20. Speech Communication and Math

    NSDL National Science Digital Library

    Amick, Patty

    Patty Amick, Cheryl Hawkins, and Lori Trumbo of Greenville Technical College created this resource to connect the art of public speaking with the task of demographic data collection. This course will help students create and interpret charts and graphs using mean, median, mode, and percentages. It will also allow students to recognize flawed surveys and design their own in order to produce valid data, all while writing a persuasive speech to incorporate their findings. This is a great website for educators looking to combine speech communication and math in a very hands-on way.

  1. Figures of Speech

    ERIC Educational Resources Information Center

    Dutton, Yanina; Meyer, Sue

    2007-01-01

    In this article, the authors report that almost one in three adults in the UK have experience of learning a language as an adult, but only four percent are currently doing so--one percent less than in 1999, equivalent to a drop of half a million adults learning languages. Figures of Speech, NIACE's UK-wide survey of language learning, also found a…

  2. Black History Speech

    ERIC Educational Resources Information Center

    Noldon, Carl

    2007-01-01

    The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

  3. Mandarin Visual Speech Information

    ERIC Educational Resources Information Center

    Chen, Trevor H.

    2010-01-01

    While the auditory-only aspects of Mandarin speech are heavily-researched and well-known in the field, this dissertation addresses its lesser-known aspects: The visual and audio-visual perception of Mandarin segmental information and lexical-tone information. Chapter II of this dissertation focuses on the audiovisual perception of Mandarin…

  4. Interlocutor Informative Speech

    ERIC Educational Resources Information Center

    Gray, Jonathan M.

    2005-01-01

    Sharing information orally is an important skill that public speaking classes teach well. However, the author's students report that they do not often see informative lectures, demonstrations, presentations, or discussions that follow the structures and formats of an informative speech as it is discussed in their textbooks. As a result, the author…

  5. Perceptual Learning in Speech

    ERIC Educational Resources Information Center

    Norris, Dennis; McQueen, James M.; Cutler, Anne

    2003-01-01

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g.,…

  6. Microprocessor for speech recognition

    SciTech Connect

    Ishizuka, H.; Watari, M.; Sakoe, H.; Chiba, S.; Iwata, T.; Matsuki, T.; Kawakami, Y.

    1983-01-01

    A new single-chip microprocessor for speech recognition has been developed utilizing multi-processor architecture and a pipelined structure. Using a DP-matching algorithm, the processor recognizes up to 340 isolated words or 40 connected words in real time. 6 references.
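    DP-matching is the dynamic-programming time alignment better known as dynamic time warping (DTW): an unknown utterance is compared to each stored template under a monotonic, elastic alignment. A minimal scalar-feature sketch of the algorithm (not the chip's fixed-point implementation; real recognizers align frame-level feature vectors, not scalars):

    ```python
    import numpy as np

    def dtw_distance(a, b):
        """Minimal cumulative distance between two feature sequences under
        monotonic time alignment (classic DTW dynamic program)."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                # Extend the cheapest of the three allowed predecessor paths
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    # A time-stretched copy of a template aligns perfectly (distance 0),
    # while a different pattern does not.
    template = [0, 1, 2, 3, 2, 1, 0]
    stretched = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
    other = [3, 3, 0, 0, 3, 3, 0]
    ```

    Recognition then picks the vocabulary template with the smallest DTW distance to the input, which is why the alignment's tolerance to speaking-rate variation matters.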

  7. GLOBAL FREEDOM OF SPEECH

    Microsoft Academic Search

    Lars Binderup

    2007-01-01

    It has been suggested that the multicultural nature of modern liberal states (in particular the formation of immigration minorities from other cultures due to the process of globalisation) provides reasons - from a liberal egalitarian perspective - for recognising a civic or democratic norm, as opposed to a legal norm, that curbs exercises of the right to free speech that

  8. Speech Recognition using SVMs

    Microsoft Academic Search

    Mark Gales; Nathan Smith

    2002-01-01

    An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more discriminatory mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some of the problems with the Fisher score.

  9. Women, Speech and Experience

    Microsoft Academic Search

    Kathleen S. Sullivan

    2005-01-01

    As a harbinger of new first amendment doctrine, antipornography feminism offers a puzzling and fruitful study. Pitted against their erstwhile political allies and allying with otherwise adversaries, its adherents do not easily fall into any ideological map. Against civil libertarianism they point to the inequality of speakers and subjects. Against the postmodern discursive appropriation and subversion of speech,…

  10. Sequence Learning & Speech Recognition

    E-print Network

    Keysers, Daniel

    Sequence Learning & Speech Recognition. TU Kaiserslautern & DFKI, Image Understanding and Pattern Recognition, Prof. Dr. Thomas Breuel; presentation by Martin Krämer. Contents: Sequence Learning, Hidden … on System Sciences, Volume 5, 2003. Sequence Learning overview: analysis of a sequence of elements…

  11. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    ERIC Educational Resources Information Center

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  12. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C. (Livermore, CA); Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

    2002-01-01

    Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

  13. Multilevel Analysis in Analyzing Speech Data

    ERIC Educational Resources Information Center

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  14. Speech characteristics in the Kabuki syndrome

    Microsoft Academic Search

    Sheila Upton; Carmella S. Stadter; Pat Landis; Eric A. Wulfsberg

    2003-01-01

    Six children with Kabuki syndrome were studied to investigate speech patterns associated with the syndrome. Each child's speech was characterized with regard to articulation (types of errors and intelligibility), pitch (high or low), loudness (volume of speech), and prosody (general quality of speech that combines rate and inflection). All six children had a history of delayed speech and

  15. Special issue on Speech Enhancement 1. Introduction

    E-print Network

    Gannot, Sharon

    Editorial: Special issue on Speech Enhancement. 1. Introduction. Speech quality is severely degraded ... environment. Speech enhancement algorithms which improve the quality of speech and reduce or eliminate ... of microphones used to collect the acoustic signal and noise, different speech enhancement algorithms have been

  16. Speech enhancement using excitation source information

    Microsoft Academic Search

    B. Yegnanarayana; S. R. Mahadeva Prasanna; K. Sreenivasa Rao

    2002-01-01

    This paper proposes an approach for processing speech from multiple microphones to enhance speech degraded by noise and reverberation. The approach is based on exploiting the features of the excitation source in speech production. In particular, the characteristics of voiced speech can be used to derive a coherently added signal from the linear prediction (LP) residuals of the degraded speech
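The "coherently added signal" this abstract refers to can be illustrated, in a much simplified form, by delay-compensated averaging across microphones. This sketch assumes known integer-sample delays and plain waveforms rather than the LP residuals the paper actually processes; `delay_and_sum` is an illustrative name.

```python
def delay_and_sum(signals, delays):
    """Coherently combine multi-microphone signals: remove each
    channel's known integer-sample delay, then average across
    channels so the target adds in phase and noise averages out."""
    n_out = min(len(s) - d for s, d in zip(signals, delays))
    return [sum(s[d + n] for s, d in zip(signals, delays)) / len(signals)
            for n in range(n_out)]
```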

  17. Speech categorization in context: Joint effects of nonspeech and speech precursors

    E-print Network

    Holt, Lori L.

    Speech categorization in context: Joint effects of nonspeech and speech precursors. Lori L. Holt. The extent to which context influences speech categorization can inform theories of pre-lexical speech perception. Across three conditions, listeners categorized speech targets preceded by speech context

  18. ALTERNATIVE SPEECH COMMUNICATION BASED ON CUED SPEECH Panikos Heracleous, Noureddine Aboutabit, and Denis Beautemps

    E-print Network

    Boyer, Edmond

    ALTERNATIVE SPEECH COMMUNICATION BASED ON CUED SPEECH. Panikos Heracleous, Noureddine Aboutabit, and Denis Beautemps, GIPSA-lab, Speech and Cognition Department, CNRS UMR 5216 / Stendhal University. ... on alternative speech communication based on Cued Speech. Cued Speech is a visual mode of communication that uses

  19. NOISE ADAPTIVE SPEECH RECOGNITION WITH ACOUSTIC MODELS TRAINED FROM NOISY SPEECH EVALUATED ON AURORA-2 DATABASE

    E-print Network

    NOISE ADAPTIVE SPEECH RECOGNITION WITH ACOUSTIC MODELS TRAINED FROM NOISY SPEECH EVALUATED the noise adaptive speech recognition for noisy speech recognition in non-stationary noise to the situation that acoustic models are trained from noisy speech. We justify it by that the noise adaptive speech recognition

  20. Neurophysiology of speech differences in childhood apraxia of speech.

    PubMed

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016

  1. Predicting confusions and intelligibility of noisy speech

    E-print Network

    Messing, David P. (David Patrick), 1979-

    2007-01-01

    Current predictors of speech intelligibility are inadequate for making predictions of speech confusions caused by acoustic interference. This thesis is inspired by the need for a capability to understand and predict speech ...

  2. Speech Recognition: How Do We Teach It?

    ERIC Educational Resources Information Center

    Barksdale, Karl

    2002-01-01

    States that growing use of speech recognition software has made voice writing an essential computer skill. Describes how to present the topic, develop basic speech recognition skills, and teach speech recognition outlining, writing, proofreading, and editing. (Contains 14 references.) (SK)

  3. An Articulatory Speech-Prosthesis System

    E-print Network

    Wee, Keng Hoong

    We investigate speech-coding strategies for brain-machine-interface (BMI) based speech prostheses. We present an articulatory speech-synthesis system using an experimental integrated-circuit vocal tract that models the ...

  4. SSML: A speech synthesis markup language. 

    E-print Network

    Taylor, Paul A; Isard, Amy

    1997-01-01

    This paper describes the Speech Synthesis Markup Language, SSML, which has been designed as a platform independent interface standard for speech synthesis systems. The paper discusses the need for standardisation in speech ...

  5. SpeechBot

    NSDL National Science Digital Library

    This new experimental search engine from Compaq indexes over 2,500 hours of content from 20 popular American radio shows. Using its speech recognition software, Compaq creates "a time-aligned 'transcript' of the program and build[s] an index of the words spoken during the program." Users can then search the index by keyword or advanced search. Search returns include the text of the clip, a link to a longer transcript, the relevant audio clip in RealPlayer format, the entire program in RealPlayer format, and a link to the radio show's Website. The index is updated daily. Please note that, while SpeechBot worked fine on Windows/NT machines, the Scout Project was unable to access the audio clips using Macs.

  6. Headphone localization of speech

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1993-01-01

    Three-dimensional acoustic display systems have recently been developed that synthesize virtual sound sources over headphones based on filtering by head-related transfer functions (HRTFs), the direction-dependent spectral changes caused primarily by the pinnae. In this study, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with nonindividualized HRTFs. About half of the subjects 'pulled' their judgments toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgments; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by nonindividualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
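The HRTF-based synthesis the study describes amounts to filtering a mono signal with a left/right pair of head-related impulse responses. A minimal sketch: the HRIRs below are toy stand-ins (a bare delay-and-attenuation mimicking interaural differences), not measured transfer functions.

```python
def convolve(x, h):
    """Direct-form FIR convolution of signal x with impulse response h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

def render_binaural(mono, hrir_left, hrir_right):
    """Spatialize a mono signal by filtering it with a pair of
    head-related impulse responses, one per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

With measured, direction-dependent HRIRs in place of the toy ones, the same two convolutions per source are the core of a headphone spatial-audio display.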

  7. Semantic Interpretation for Speech Recognition

    NSDL National Science Digital Library

    Lernout & Hauspie Speech Products.

    The first working draft of the World Wide Web Consortium's (W3C) Semantic Interpretation for Speech Recognition is now available. The document "defines the process of Semantic Interpretation for Speech Recognition and the syntax and semantics of semantic interpretation tags that can be added to speech recognition grammars." The document is a draft, open for suggestions from W3C members and other interested users.

  8. Autoregressive HMMs for speech synthesis

    E-print Network

    Shannon, Matt; Byrne, William

    2009-09-07

    “Graphical models and automatic speech recognition,” in Mathematical foundations of speech and language processing, M. Johnson, S. Khudanpur, M. Ostendorf, and R. Rosenfeld, Eds. Springer-Verlag, 2004. [8] K. Chin and P. Woodland, “Maximum mutual information... on Acoustics, Speech and Signal Processing, vol. 38, no. 2, pp. 220–225, 1990. [6] P. Woodland, “Hidden Markov models using vector linear prediction and discriminative output distributions,” in Proc. ICASSP 1992, vol. 1, 1992, pp. 509–512. [7] J. Bilmes...

  9. Binaural Speech Segregation

    Microsoft Academic Search

    Nicoleta Roman; DeLiang Wang

    2003-01-01

    It is relatively easy for a human listener to attend to a particular speaker at a cocktail party in the presence of other speakers, music and environmental sounds. To perform this task, the human listener needs to separate the target speech from a mixture of multiple concurrent sources reflected by various surfaces. This process is referred to as auditory scene

  10. Speech Accent Archive

    NSDL National Science Digital Library

    The George Mason University archive of speech accents is a tool for linguists, speech pathologists, phoneticians, engineers who train speech recognition machines, and even interested laypeople. Volunteers who are native English speakers and non-native speakers were asked to read an elicitation paragraph in English that "uses common English words, but contains a variety of difficult English sounds and sound sequences." Visitors will quickly have the paragraph memorized while exploring different accents. There are several ways for visitors to find accents to listen to, one of which is by clicking on a map of the world, labeled "atlas/regions", or by language, labeled "language/speakers". Once visitors have chosen a region or language, the gender, and birthplace of the speaker will appear. Age and other data, such as "other languages" and "age of English onset", are provided to visitors when the link to a speaker is chosen. The "Generalizations" section contains "general rules that describe a speaker's accent", and they are based on General American English (GAE).

  11. Applications for Subvocal Speech

    NASA Technical Reports Server (NTRS)

    Jorgensen, Charles; Betts, Bradley

    2007-01-01

    A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous- material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.

  12. Speech rhythm: a metaphor?

    PubMed

    Nolan, Francis; Jeon, Hae-Sung

    2014-12-19

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep 'prominence gradient', i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a 'stress-timed' language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow 'syntagmatic contrast' between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms. PMID:25385774

  13. Speech-in-Speech Recognition: A Training Study

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.

    2012-01-01

    This study aims to identify aspects of speech-in-noise recognition that are susceptible to training, focusing on whether listeners can learn to adapt to target talkers ("tune in") and learn to better cope with various maskers ("tune out") after short-term training. Listeners received training on English sentence recognition in speech-shaped noise…

  14. SPEECH ENHANCEMENT IN THE DFT DOMAIN USING LAPLACIAN SPEECH PRIORS

    Microsoft Academic Search

    Rainer Martin; Colin Breithaupt

    2003-01-01

    In this paper we consider optimal estimators for speech enhancement in the Discrete Fourier Transform (DFT) domain. We derive an analytical solution for estimating complex DFT coefficients in the MMSE sense when the clean speech DFT coefficients are Laplacian distributed and the DFT coefficients of the noise are Gaussian or Laplacian distributed. We show that these estimators have
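The Laplacian-prior MMSE estimators derived in this paper have more involved closed forms, but the baseline they generalize is easy to sketch: under Gaussian priors, the MMSE estimate of a complex DFT coefficient reduces to scaling the noisy coefficient by a per-bin Wiener gain. The sketch below assumes the speech and noise variances for the bin are already known (in practice they must be estimated).

```python
def wiener_gain(speech_var, noise_var):
    """Per-bin Wiener gain: the Gaussian-prior MMSE estimator of a
    complex DFT coefficient scales the noisy coefficient by this."""
    return speech_var / (speech_var + noise_var)

def enhance_bin(noisy_coeff, speech_var, noise_var):
    """Apply the MMSE (Wiener) gain to one noisy complex DFT bin."""
    return wiener_gain(speech_var, noise_var) * noisy_coeff
```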

  15. A speech locked loop for cochlear implants and speech prostheses

    E-print Network

    Wee, Keng Hoong

    We have previously described a feedback loop that combines an auditory processor with a low-power analog integrated-circuit vocal tract to create a speech-locked-loop. Here, we describe how the speech-locked loop can help ...

  16. Enhancing Peer Feedback and Speech Preparation: The Speech Video Activity

    ERIC Educational Resources Information Center

    Opt, Susan

    2012-01-01

    In the typical public speaking course, instructors or assistants videotape or digitally record at least one of the term's speeches in class or lab to offer students additional presentation feedback. Students often watch and self-critique their speeches on their own. Peers often give only written feedback on classroom presentations or completed…

  17. Interpersonal Orientation and Speech Behavior.

    ERIC Educational Resources Information Center

    Street, Richard L., Jr.; Murphy, Thomas L.

    1987-01-01

    Indicates that (1) males with low interpersonal orientation (IO) were least vocally active and expressive and least consistent in their speech performances, and (2) high IO males and low IO females tended to demonstrate greater speech convergence than either low IO males or high IO females. (JD)

  18. SPEECH--MAN'S NATURAL COMMUNICATION.

    ERIC Educational Resources Information Center

    DUDLEY, HOMER; AND OTHERS

    SESSION 63 OF THE 1967 INSTITUTE OF ELECTRICAL AND ELECTRONIC ENGINEERS INTERNATIONAL CONVENTION BROUGHT TOGETHER SEVEN DISTINGUISHED MEN WORKING IN FIELDS RELEVANT TO LANGUAGE. THEIR TOPICS INCLUDED ORIGIN AND EVOLUTION OF SPEECH AND LANGUAGE, LANGUAGE AND CULTURE, MAN'S PHYSIOLOGICAL MECHANISMS FOR SPEECH, LINGUISTICS, AND TECHNOLOGY AND…

  19. American Studies through Folk Speech.

    ERIC Educational Resources Information Center

    Pedersen, E. Martin

    1993-01-01

    American slang reflects diversity, imagination, self-confidence, and optimism of the American people. Its vitality is due in part to the guarantee of free speech and lack of a national academy of language or of any official attempt to purify American speech, in part to Americans' historic geographic mobility. Such "folksay" includes riddles and…

  20. Visual Speech And Speaker Recognition

    E-print Network

    Visual Speech And Speaker Recognition. Juergen Luettin, Department of Computer Science, University ... and their temporal dependencies by hidden Markov models. The models are trained using the EM-algorithm and speech mixture models and hidden Markov models. The proposed methods were evaluated for lip localisation, lip

  1. Speech Prosody in Cerebellar Ataxia

    ERIC Educational Resources Information Center

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  2. Use of Speech Recognition Software

    Microsoft Academic Search

    Marc J Haxer; Leslie W Guinn; Norman D Hogikyan

    2001-01-01

    Speech recognition software for the personal or office computer is a relatively new area of technology. As the number of these products has increased so has use of this software. Some individuals will employ speech recognition systems due to difficulty with the conventional keyboard and mouse interface; others will use it for perceived efficiency or simply novelty. Regardless of the

  3. Speech acoustics: How much science?

    PubMed Central

    Tiwari, Manjul

    2012-01-01

    Human vocalizations are sounds made exclusively by a human vocal tract. Among other vocalizations, for example, laughs or screams, speech is the most important. Speech is the primary medium of that supremely human symbolic communication system called language. One of the functions of a voice, perhaps the main one, is to realize language, by conveying some of the speaker's thoughts in linguistic form. Speech is language made audible. Moreover, when phoneticians compare and describe voices, they usually do so with respect to linguistic units, especially speech sounds, like vowels or consonants. It is therefore necessary to understand the structure as well as the nature of speech sounds and how they are described. In order to understand and evaluate speech, it is important to have at least a basic understanding of the science of speech acoustics: how the acoustics of speech are produced, how they are described, and how differences, both between speakers and within speakers, arise in an acoustic output. One of the aims of this article is to try to facilitate this understanding. PMID:22690047

  4. Methods of Teaching Speech Recognition

    ERIC Educational Resources Information Center

    Rader, Martha H.; Bailey, Glenn A.

    2010-01-01

    Objective: This article introduces the history and development of speech recognition, addresses its role in the business curriculum, outlines related national and state standards, describes instructional strategies, and discusses the assessment of student achievement in speech recognition classes. Methods: Research methods included a synthesis of…

  5. Recent Advancements in Speech Enhancement

    Microsoft Academic Search

    Yariv Ephraim; Israel Cohen

    Speech enhancement is a long standing problem with numerous applications ranging from hearing aids, to coding and automatic recognition of speech signals. In this survey paper we focus on enhancement from a single microphone, and assume that the noise is additive and statistically independent of the signal. We present the principles that guide researchers working in this area, and provide

  6. Distortion measures for speech processing

    Microsoft Academic Search

    ROBERT M. GRAY; ANDRES BUZO; Y. Matsuyama

    1980-01-01

    Several properties, interrelations, and interpretations are developed for various speech spectral distortion measures. The principal results are 1) the development of notions of relative strength and equivalence of the various distortion measures both in a mathematical sense corresponding to subjective equivalence and in a coding sense when used in minimum distortion or nearest neighbor speech processing systems; 2) the demonstration
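One concrete member of the family of spectral distortion measures this paper compares is the Itakura-Saito distortion, shown here as a minimal sketch over discrete power spectra. Note that it is asymmetric in its arguments, one of the properties such comparisons have to account for.

```python
import math

def itakura_saito(p, q):
    """Mean Itakura-Saito distortion between two power spectra,
    given as equal-length sequences of positive values."""
    return sum(a / b - math.log(a / b) - 1.0
               for a, b in zip(p, q)) / len(p)
```

The measure is zero only when the spectra coincide and grows without bound as one spectrum's bins dominate the other's, which is why it behaves differently from symmetric measures in nearest-neighbor (minimum-distortion) coding.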

  7. Speech Perception and Language Acquisition

    E-print Network

    Mehler, Jacques

    Speech Perception and Language Acquisition in the First Year of Life. Judit Gervain and Jacques Mehler. ... in the domains of speech perception, phonological development, word learning, morphosyntactic acquisition ... acquisition that integrates behavioral, cognitive, neural, and evolutionary considerations and proposes

  8. Speaker Clustering in Speech Recognition

    Microsoft Academic Search

    Olga Grebenskaya; Tomi Kinnunen; Pasi Fränti

    2005-01-01

    The paper presents a combination of speaker and speech recognition techniques aiming to improve speech recognition rates. This combination is done by clustering the speaker models created from the training material. A speaker model is a codebook obtained by the Vector Quantization (VQ) approach. We propose a metaclustering algorithm to group codebooks into clusters and calculate the centroid codebooks. The latter are thought
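The building block here, a VQ codebook trained per speaker, can be sketched with Lloyd's algorithm (k-means) on 1-D data. This is only the underlying VQ step, not the paper's metaclustering of codebooks, and the naive first-k initialization is an assumption of the sketch.

```python
def vq_codebook(data, k, iters=20):
    """Train a 1-D VQ codebook with Lloyd's algorithm (k-means).
    Naive initialization: the first k data points."""
    centroids = list(data[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            # assign each sample to its nearest codeword
            nearest = min(range(k), key=lambda j: (x - centroids[j]) ** 2)
            clusters[nearest].append(x)
        # move each codeword to its cluster mean (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids
```

The metaclustering described in the abstract would then treat whole codebooks, rather than scalar samples, as the objects being clustered.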

  9. Audiovisual Speech Recalibration in Children

    ERIC Educational Resources Information Center

    van Linden, Sabine; Vroomen, Jean

    2008-01-01

    In order to examine whether children adjust their phonetic speech categories, children of two age groups, five-year-olds and eight-year-olds, were exposed to a video of a face saying /aba/ or /ada/ accompanied by an auditory ambiguous speech sound halfway between /b/ and /d/. The effect of exposure to these audiovisual stimuli was measured on…

  10. Speech Restoration: An Interactive Process

    ERIC Educational Resources Information Center

    Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny

    2009-01-01

    Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…

  11. IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING 1 Model-Based Speech Enhancement with Improved

    E-print Network

    So, Hing-Cheung

    IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING. Model-Based Speech Enhancement ... approach to enhance noisy speech using an analysis-synthesis framework. Target speech is reconstructed. Initially, we propose an analysis-synthesis framework for speech enhancement based on harmonic noise

  12. Empirical Distributions of DFT-Domain Speech Coefficients Based on Estimated Speech Variances

    E-print Network

    Empirical Distributions of DFT-Domain Speech Coefficients Based on Estimated Speech Variances. Timo ... obtained from a short-time discrete Fourier transform (DFT) in the context of speech enhancement frameworks. The distribution of clean speech spectral coefficients is of great importance for speech enhancement algorithms

  13. Kalman-filtering speech enhancement method based on a voiced-unvoiced speech model

    Microsoft Academic Search

    Zenton Goh; Kah-Chye Tan; B. T. G. Tan

    1999-01-01

    In this work, we are concerned with optimal estimation of clean speech from its noisy version based on a speech model we propose. We first propose a (single) speech model which satisfactorily describes voiced and unvoiced speech and silence (i.e., pauses between speech utterances), and also allows for exploitation of the long term characteristics of noise. We then reformulate the
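The Kalman-filtering approach this abstract builds on can be sketched in its simplest scalar form: clean speech is modeled as an AR(1) process observed in additive noise, and the filter alternates predict and update steps per sample. This is a bare illustration under an assumed AR(1) model with known parameters, far simpler than the voiced-unvoiced model the paper proposes.

```python
def kalman_enhance(noisy, a, q, r):
    """Scalar Kalman filter assuming an AR(1) clean-speech model
    s[n] = a*s[n-1] + w[n], w ~ N(0, q), observed as
    y[n] = s[n] + v[n], v ~ N(0, r). Returns the filtered estimates."""
    s_hat, p = 0.0, 1.0  # state estimate and its variance
    out = []
    for y in noisy:
        # predict from the AR(1) model
        s_pred = a * s_hat
        p_pred = a * a * p + q
        # update with the noisy observation
        k = p_pred / (p_pred + r)
        s_hat = s_pred + k * (y - s_pred)
        p = (1.0 - k) * p_pred
        out.append(s_hat)
    return out
```

With r = 0 (noise-free observations) the gain is 1 and the filter passes the signal through; as r grows, the output leans increasingly on the AR prediction, which is the enhancement effect.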

  14. Evaluating speech intelligibility enhancement for HMM-based synthetic speech in noise

    E-print Network

    Edinburgh, University of

    Evaluating speech intelligibility enhancement for HMM-based synthetic speech in noise. Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King, The Centre for Speech Technology Research, University of Edinburgh. It is possible to increase the intelligibility of speech in noise by enhancing the clean speech signal

  15. Exemplar-based speech enhancement and its application to noise-robust automatic speech recognition

    E-print Network

    Virtanen, Tuomas

    Exemplar-based speech enhancement and its application to noise-robust automatic speech recognition. ... an exemplar-based technique for speech enhancement of noisy speech is proposed. The technique works by finding a sparse ... at http://www.cs.tut.fi/~tuomasv/. Index Terms: speech enhancement, exemplar-based, noise robustness

  16. Preferred track: Speech Motor Control Title: Specificity of speech sensori-motor learning

    E-print Network

    Paris-Sud XI, Université de

    Preferred track: Speech Motor Control. Title: Specificity of speech sensori-motor learning ... of speech production. Current research focuses on speech sensori-motor learning and its consequences for speech perception. David Ostry, Professor at McGill University and Senior Scientist at Haskins

  17. An open source speech synthesis module for a visual-speech recognition system

    E-print Network

    Paris-Sud XI, Université de

    An open source speech synthesis module for a visual-speech recognition system. S. Manitsaris, B. ... Conference, 23-27 April 2012, Nantes, France. A Silent Speech Interface (SSI) is a voice replacement technology that permits speech communication without vocalization. The visual-speech recognition engine

  18. Research paper Speech discrimination after early exposure to pulsed-noise or speech

    E-print Network

    Kilgard, Michael P.

    Research paper: Speech discrimination after early exposure to pulsed-noise or speech. Kamalini G. ... these changes are severe enough to alter neural representations and behavioral discrimination of speech. ... pulsed-noise or speech. Both groups of rats were trained to discriminate speech sounds when they were young adults

  19. Assessment of language impact to speech privacy in closed offices

    Microsoft Academic Search

    Yong Ma Ma; Daryl J. Caswell; Liming Dai; Jim T. Goodchild

    2002-01-01

    Speech privacy is the opposite concept of speech intelligibility and can be assessed by the predictors of speech intelligibility. Based on the existing standards and the research to date, most objective assessments for speech privacy and speech intelligibility, such as articulation index (AI) or speech intelligibility index (SII), speech transmission index (STI), and sound early-to-late ratio (C50), are evaluated by

  20. Speech Patterns and Racial Wage Inequality

    Microsoft Academic Search

    Jeffrey Grogger

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the wages of young workers. Even after controlling

  1. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  2. Phonetic Recalibration Only Occurs in Speech Mode

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  3. Speech Writing and Improving Public Speaking Skills.

    ERIC Educational Resources Information Center

    Haven, Richard P.

    A course in speech writing (preparing speeches for delivery by another person) is critical to the development of public speaking skills for college students. Unlike the traditional public speaking course, speech writing classes emphasize the preparation of the content of a speech over the delivery of the message. Students develop the ability to…

  4. ON THE NATURE OF SPEECH SCIENCE.

    ERIC Educational Resources Information Center

    PETERSON, GORDON E.

    IN THIS ARTICLE THE NATURE OF THE DISCIPLINE OF SPEECH SCIENCE IS CONSIDERED AND THE VARIOUS BASIC AND APPLIED AREAS OF THE DISCIPLINE ARE DISCUSSED. THE BASIC AREAS ENCOMPASS THE VARIOUS PROCESSES OF THE PHYSIOLOGY OF SPEECH PRODUCTION, THE ACOUSTICAL CHARACTERISTICS OF SPEECH, INCLUDING THE SPEECH WAVE TYPES AND THE INFORMATION-BEARING ACOUSTIC…

  5. Audio-Visual Speech Perception Is Special

    ERIC Educational Resources Information Center

    Tuomainen, J.; Andersen, T.S.; Tiippana, K.; Sams, M.

    2005-01-01

    In face-to-face conversation speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and…

  6. Speech Perception Within an Auditory Cognitive Science

    E-print Network

    Holt, Lori L.

    Speech Perception Within an Auditory Cognitive Science Framework. Lori L. Holt and Andrew J. Lotto. ... speech begins with auditory processing, investigation of speech perception has progressed mostly independently of the study of general auditory processing and speech perception, showing that the latter is constrained

  7. The unimportance of phase in speech enhancement

    Microsoft Academic Search

    D. Wang; Jae Lim

    1982-01-01

    The importance of Fourier transform phase in speech enhancement is considered. Results indicate that a more accurate estimation of phase is unwarranted in speech enhancement at the S/N ratios where the intelligibility scores of unprocessed speech range from 5 to 95 percent, if the phase estimate is used to reconstruct speech by combining it with an independently estimated magnitude or
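The reconstruction strategy this result concerns, pairing an independently estimated magnitude with the unprocessed noisy phase, is a one-liner per DFT bin. A minimal sketch (the estimated magnitudes here are simply given, not produced by any enhancement algorithm):

```python
import cmath

def magnitude_phase_combine(est_magnitudes, noisy_coeffs):
    """Rebuild complex DFT coefficients from an independently
    estimated magnitude and the phase of the noisy coefficients."""
    return [m * cmath.exp(1j * cmath.phase(z))
            for m, z in zip(est_magnitudes, noisy_coeffs)]
```

The finding quoted above is that, over a wide range of S/N ratios, refining the phase fed into this combination buys little intelligibility, so effort is better spent on the magnitude estimate.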

  8. Statistical-model-based speech enhancement systems

    Microsoft Academic Search

    YARIV EPHRAIM

    1992-01-01

    Since the statistics of the speech signal as well as of the noise are not explicitly available, and the most perceptually meaningful distortion measure is not known, model-based approaches have recently been extensively studied and applied to the three basic problems of speech enhancement: signal estimation from a given sample function of noisy speech, signal coding when only noisy speech

  9. Emerging Technologies Speech Tools and Technologies

    ERIC Educational Resources Information Center

    Godwin-Jones, Robert

    2009-01-01

    Using computers to recognize and analyze human speech goes back at least to the 1970's. Developed initially to help the hearing or speech impaired, speech recognition was also used early on experimentally in language learning. Since the 1990's, advances in the scientific understanding of speech as well as significant enhancements in software and…

  10. REVERBERANT SPEECH ENHANCEMENT USING CEPSTRAL PROCESSING

    E-print Network

    Kabal, Peter

    S14.25 REVERBERANT SPEECH ENHANCEMENT USING CEPSTRAL PROCESSING. Duncan Bees, Maier Blostein. The dereverberation of acoustically reverberant speech has potential application to the enhancement of speech… …microphone reverberant speech enhancement typically requires prior knowledge of h(n) and subsequent inverse…

  11. Knowledge based speech analysis and enhancement

    Microsoft Academic Search

    Cory Myers; Alan Oppenheim; Randall Davis; Webster Dove

    1984-01-01

    This paper describes a system for speech analysis and enhancement which combines signal processing and symbolic processing in a closely coupled manner. The system takes as input both a noisy speech signal and a symbolic description of the speech signal. The system attempts to reconstruct the original speech waveform using symbolic processing to help model the signal and to guide

  12. ANNUAL SPEECH PATHOLOGY HONOURS RESEARCH MINICONFERENCE 2012

    E-print Network

    ANNUAL SPEECH PATHOLOGY HONOURS RESEARCH MINICONFERENCE 2012. Every year the Speech Pathology… …dyslexia, childhood apraxia of speech, stuttering and dysphagia. Date: Monday, 15th October 2012. Time: 1 pm. All interested are welcome. This invitation extends to Speech Pathology…

  13. Multifractal nature of unvoiced speech signals

    SciTech Connect

    Adeyemi, O.A. [Department of Electrical Engineering, University of Rhode Island, Kingston, Rhode Island 02881 (United States); Hartt, K. [Department of Physics, University of Rhode Island, Kingston, Rhode Island 02881 (United States); Boudreaux-Bartels, G.F. [Department of Electrical Engineering, University of Rhode Island, Kingston, Rhode Island 02881 (United States)

    1996-06-01

    A refinement is made in the nonlinear dynamic modeling of speech signals. Previous research successfully characterized speech signals as chaotic. Here, we analyze fricative speech signals using multifractal measures to determine various fractal regimes present in their chaotic attractors. Results support the hypothesis that speech signals have multifractal measures. © 1996 American Institute of Physics.

  14. NEUROSYSTEMS Cortical activity patterns predict robust speech

    E-print Network

    Kilgard, Michael P.

    NEUROSYSTEMS: Cortical activity patterns predict robust speech discrimination ability in noise. Keywords: neural basis of speech, rodent, similarities between human and animal, speech in noise, temporal integration. Abstract: The neural mechanisms that support speech discrimination…

  15. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  16. Personalising speech-to-speech translation in the EMIME project 

    E-print Network

    Kurimo, Mikko; Byrne, William; Dines, John; Garner, Philip N.; Gibson, Matthew; Guan, Yong; Hirsimaki, Teemu; Karhila, Reima; King, Simon; Liang, Hui; Oura, Keiichiro; Saheer, Lakshmi; Shannon, Matt; Shiota, Sayaka; Tian, Jilei; Tokuda, Keiichi; Wester, Mirjam; Wu, Yi-Jian; Yamagishi, Junichi

    2010-01-01

    In the EMIME project we have studied unsupervised cross-lingual speaker adaptation. We have employed an HMM statistical framework for both speech recognition and synthesis which provides transformation mechanisms to adapt ...

  17. The contribution of sensitivity to speech rhythm and non-speech rhythm to early reading development

    Microsoft Academic Search

    Andrew J. Holliman; Clare Wood; Kieron Sheehy

    2010-01-01

    Both sensitivity to speech rhythm and non-speech rhythm have been associated with successful phonological awareness and reading development in separate studies. However, the extent to which speech rhythm, non-speech rhythm and literacy skills are interrelated has not been examined. As a result, five- to seven-year-old English-speaking children were assessed on measures of speech rhythm sensitivity, non-speech rhythm sensitivity (both receptive…

  18. New speech enhancement techniques for low bit rate speech coding

    Microsoft Academic Search

    R. Martin; Richard V. Cox

    1999-01-01

    In this paper we present novel solutions for pre-processing noisy speech prior to low bit rate speech coding. We strive especially to improve the estimation of spectral parameters and to reduce the additional algorithmic delay caused by the enhancement pre-processor. While the former is achieved using a new adaptive limiting algorithm for the a priori signal-to-noise ratio (SNR) estimate, the

  19. Speech recovery device

    DOEpatents

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  20. Speech Accent Archive

    NSDL National Science Digital Library

    Developed by a team of researchers at George Mason University, this rather fascinating site contains speech samples of 259 individuals from different language backgrounds reading the same paragraph. The languages represented include Portuguese, Sardinian, Polish, and Urdu. Clicking on any one of the languages takes users to an individualized page that contains a sound bar, some basic demographic information, a phonetic transcription of the sample reading, and a link to the speaker's phonological generalizations. In many cases, there are examples of several different speakers from different regions for each linguistic tradition. The site avoids excessive technical jargon and provides basic descriptions of linguistic phenomena such as vowel raising, consonant deletion, and voicing change. Finally, the site provides information on how to make an effective voice recording and links to additional resources.

  1. Silog: Speech Input Logon

    NASA Astrophysics Data System (ADS)

    Grau, Sergio; Allen, Tony; Sherkat, Nasser

    Silog is a biometric authentication system that extends the conventional PC logon process using voice verification. Users enter their ID and password using a conventional Windows logon procedure, but then the biometric authentication stage makes a Voice over IP (VoIP) call to a VoiceXML (VXML) server. User interaction with this speech-enabled component then allows the user's voice characteristics to be extracted as part of a simple user/system spoken dialogue. If the captured voice characteristics match those of a previously registered voice profile, then network access is granted. If no match is possible, then a potential unauthorised system access has been detected and the logon process is aborted.

  2. Prosody in apraxia of speech.

    PubMed

    Boutsen, Frank R; Christman, Sarah S

    2002-11-01

    Prosody is a complex process that involves modulation of pitch, loudness, duration, and linearity in the acoustic stream to serve linguistic and affective communication goals. It arises from the interaction of distributed neural networks that may be anatomically and functionally lateralized. Intrinsic prosody is mediated largely through left hemisphere mechanisms and encompasses those elements of linguistic microstructure (e.g., syllabic magnitudes and durations, basic consonantal and vocalic gesture specifications, and so on) that yield the segmental aspects of speech. Extrinsic prosody is processed primarily by right hemisphere (RH) mechanisms and involves manipulation of intonation across longer perceptual groupings. Intrinsic prosody deficits can lead to several core symptoms of speech apraxia, such as difficulty with utterance initiation and syllable transitionalization, and may lead to the establishment of inappropriate syllable boundaries. The intrinsic prosody profiles associated with acquired apraxia of speech, developmental speech apraxia, and ataxic dysarthria may aid in the clinical differentiation of these disorders. PMID:12461724

  3. Letter-based speech synthesis 

    E-print Network

    Watts, Oliver; Yamagishi, Junichi; King, Simon

    2010-01-01

    Initial attempts at performing text-to-speech conversion based on standard orthographic units are presented, forming part of a larger scheme of training TTS systems on features that can be trivially extracted from text. We evaluate the possibility...

  4. Is Private Speech Really Private? 

    E-print Network

    Smith, Ashley

    2011-01-01

    This study sought to answer the question “is private speech really private?” by assessing if participants spoke more to themselves when in the company of the experimenter or when they were alone. The similarity between ...

  5. Speech processing: An evolving technology

    SciTech Connect

    Crochiere, R.E.; Flanagan, J.L.

    1986-09-01

    As we enter the information age, speech processing is emerging as an important technology for making machines easier and more convenient for humans to use. It is both an old and a new technology - dating back to the invention of the telephone and forward, at least in aspirations, to the capabilities of HAL in 2001. Explosive advances in microelectronics now make it possible to implement economical real-time hardware for sophisticated speech processing - processing that formerly could be demonstrated only in simulations on main-frame computers. As a result, fundamentally new product concepts - as well as new features and functions in existing products - are becoming possible and are being explored in the marketplace. As the introductory piece to this issue, the authors draw a brief perspective on the evolving field of speech processing and assess the technology in the the three constituent sectors: speech coding, synthesis, and recognition.

  6. [Speech disorders in ENT practice].

    PubMed

    Seidner, W

    1997-04-01

    The speech and language disorders that ENT doctors are most frequently confronted with are generally known and are presented here: delayed speech and language development, dyslalia, dysglossia, rhinolalia, dysarthria, and verbal fluency disorders (stuttering, cluttering). The diagnostic component is always greater than, and quite different from, the therapeutic one. Close cooperation with representatives of phoniatrics and pedaudiology, as well as logopedics and other specialities such as neurology and internal medicine, is highly necessary. PMID:9264604

  7. An improved speech transmission index for intelligibility prediction

    E-print Network

    An improved speech transmission index for intelligibility prediction. Belinda Schwerin, Kuldip… Keywords: Speech transmission index; Modulation transfer function; Speech enhancement; Objective evaluation; Speech intelligibility; Short-time modulation spectrum. 1. Introduction: The enhancement of speech corrupted by noise has…

  8. Neural pathways for visual speech perception.

    PubMed

    Bernstein, Lynne E; Liebenthal, Einat

    2014-01-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611

  9. Child directed speech, speech in noise and hyperarticulated speech in the Pacific Northwest

    NASA Astrophysics Data System (ADS)

    Wright, Richard; Carmichael, Lesley; Beckford Wassink, Alicia; Galvin, Lisa

    2001-05-01

    Three types of exaggerated speech are thought to be systematic responses to accommodate the needs of the listener: child-directed speech (CDS), hyperspeech, and the Lombard response. CDS (e.g., Kuhl et al., 1997) occurs in interactions with young children and infants. Hyperspeech (Johnson et al., 1993) is a modification in response to listeners' difficulties in recovering the intended message. The Lombard response (e.g., Lane et al., 1970) is a compensation for increased noise in the signal. While all three result from adaptations to accommodate the needs of the listener, and therefore should share some features, the triggering conditions are quite different, and the styles should therefore exhibit differences in their phonetic outcomes. While CDS has been the subject of a variety of acoustic studies, it has never been studied in the broader context of the other "exaggerated" speech styles. A large crosslinguistic study was undertaken that compares speech produced under four conditions: spontaneous conversations, CDS aimed at 6-9-month-old infants, hyperarticulated speech, and speech in noise. This talk will present some findings for North American English as spoken in the Pacific Northwest. The measures include f0, vowel duration, F1 and F2 at vowel midpoint, and intensity.

  10. Speech enhancement using a soft-decision noise suppression filter

    Microsoft Academic Search

    R. McAulay; M. Malpass

    1980-01-01

    One way of enhancing speech in an additive acoustic noise environment is to perform a spectral decomposition of a frame of noisy speech and to attenuate a particular spectral line depending on how much the measured speech plus noise power exceeds an estimate of the background noise. Using a two-state model for the speech event (speech absent or speech present)
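
    The attenuation rule this abstract describes, scaling each spectral line by how far its measured power exceeds the noise estimate, can be sketched in a few lines of numpy. This is plain power spectral subtraction with a gain floor, not the full McAulay-Malpass soft-decision estimator (which additionally weights the gain by the probability that speech is present in the frame):

    ```python
    import numpy as np

    def suppress(noisy_frame, noise_psd, floor=0.01):
        """Attenuate each spectral line according to how much its measured
        power exceeds the background-noise estimate."""
        spec = np.fft.rfft(noisy_frame)
        power = np.abs(spec) ** 2
        # Fraction of each line's power attributed to speech, floored to
        # limit musical-noise artifacts from zeroed bins.
        gain = np.maximum(1.0 - noise_psd / np.maximum(power, 1e-12), floor)
        return np.fft.irfft(np.sqrt(gain) * spec, len(noisy_frame))

    # Demo: a tone in white noise of known variance.
    rng = np.random.default_rng(1)
    fs, n = 8000, 256
    t = np.arange(n) / fs
    clean = np.sin(2 * np.pi * 250 * t)
    noisy = clean + 0.5 * rng.standard_normal(n)
    noise_psd = 0.25 * n  # expected |FFT bin|^2 of white noise with variance 0.25
    enhanced = suppress(noisy, noise_psd)
    ```

    Bins dominated by noise are attenuated toward the floor, while the high-power tone bin passes almost unchanged, so the enhanced frame sits much closer to the clean one.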

  11. Iterative and sequential Kalman filter-based speech enhancement algorithms

    Microsoft Academic Search

    Sharon Gannot; David Burshtein; Ehud Weinstein

    1998-01-01

    Speech quality and intelligibility might significantly deteriorate in the presence of background noise, especially when the speech signal is subject to subsequent processing. In particular, speech coders and automatic speech recognition (ASR) systems that were designed or trained to act on clean speech signals might be rendered useless in the presence of background noise. Speech enhancement algorithms have therefore attracted
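
    As a sketch of the Kalman-filtering idea behind such enhancement algorithms, the following models the clean signal as a first-order autoregression observed in white noise. The paper's algorithms use higher-order AR models whose parameters are themselves estimated from the noisy speech; the one-pole model and all parameter values here are illustrative only.

    ```python
    import numpy as np

    def kalman_denoise(y, a, q, r):
        """Scalar Kalman filter for an AR(1) state s[k] = a*s[k-1] + w[k]
        (process noise variance q) observed as y[k] = s[k] + v[k]
        (observation noise variance r)."""
        s_hat, p = 0.0, 1.0
        out = np.empty_like(y)
        for k, yk in enumerate(y):
            # Predict the next state and its error variance.
            s_pred = a * s_hat
            p_pred = a * a * p + q
            # Correct with the noisy observation.
            gain = p_pred / (p_pred + r)
            s_hat = s_pred + gain * (yk - s_pred)
            p = (1.0 - gain) * p_pred
            out[k] = s_hat
        return out

    # Demo: simulate an AR(1) "speech" signal in white noise.
    rng = np.random.default_rng(3)
    a, q, r = 0.95, 0.1, 1.0
    n = 2000
    s = np.zeros(n)
    for k in range(1, n):
        s[k] = a * s[k - 1] + np.sqrt(q) * rng.standard_normal()
    y = s + np.sqrt(r) * rng.standard_normal(n)
    denoised = kalman_denoise(y, a, q, r)
    ```

    The filtered estimate tracks the slowly varying state and rejects most of the observation noise; the iterative schemes in the paper alternate this filtering step with re-estimation of the AR parameters.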

  12. INCORPORATING SPECTRAL SUBTRACTION AND NOISE TYPE FOR UNVOICED SPEECH SEGREGATION

    E-print Network

    Wang, DeLiang "Leon"

    INCORPORATING SPECTRAL SUBTRACTION AND NOISE TYPE FOR UNVOICED SPEECH SEGREGATION. Ke Hu and De… Monaural speech enhancement methods enhance noisy speech based on certain assumptions or models of speech and interference [8]. Speech enhancement methods improve speech quality; however, they have a limited ability…

  13. Non-Intrusive GMM-Based Speech Quality Measurement

    Microsoft Academic Search

    Tiago H. Falk; Qingfeng Xu; Wai-Yip Chan

    2005-01-01

    We propose a non-intrusive speech quality measurement algorithm based on using Gaussian-mixture probability models of features of undegraded speech signals as an artificial reference model of “clean” speech behaviour. The consistency between the features of the test speech signal and the reference model serves as an indicator of speech quality. Consistency measures are calculated and mapped to an objective speech
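
    The consistency idea in this abstract can be sketched in a few lines: fit a reference density to features of undegraded speech, then use the average log-likelihood of test features under that reference as the quality indicator. For brevity this sketch uses a single diagonal Gaussian and synthetic "features" rather than the paper's Gaussian mixture over real speech features:

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Stand-in "features" of undegraded speech (e.g. cepstral coefficients):
    # here just draws from a fixed distribution, not real speech features.
    clean_feats = rng.normal(0.0, 1.0, size=(5000, 4))

    # Reference model: one diagonal Gaussian fitted to the clean features.
    mu = clean_feats.mean(axis=0)
    var = clean_feats.var(axis=0)

    def consistency(feats):
        """Average log-likelihood of feats under the clean-speech model.
        Higher = more consistent with clean speech = higher predicted quality."""
        ll = -0.5 * (np.log(2 * np.pi * var) + (feats - mu) ** 2 / var)
        return ll.sum(axis=1).mean()

    test_clean = rng.normal(0.0, 1.0, size=(1000, 4))
    test_degraded = rng.normal(0.0, 2.0, size=(1000, 4))  # "noisy" feature shift
    print(consistency(test_clean) > consistency(test_degraded))  # prints True
    ```

    Because no clean reference signal is needed at test time, the measure is non-intrusive: degradation shows up purely as reduced likelihood under the pre-trained model.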

  14. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C. (Livermore, CA); Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

    2006-08-08

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  15. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2004-03-23

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  16. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-02-14

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  17. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, John F. (Berkeley, CA); Ng, Lawrence C. (Danville, CA)

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  18. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, J.F.; Ng, L.C.

    1998-03-17

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.

  19. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    ERIC Educational Resources Information Center

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  20. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  1. Defensive Speech, or, Can We Use Speech Acts in Language Education?

    ERIC Educational Resources Information Center

    Mey, Jacob

    Using a language effectively involves both expressing oneself and acting upon one's surroundings. Linguistic theory has focused on the study of speech acts, but speech activity is basically pragmatic, that is, the conversational context determines the individual speech acts and not the other way around. Theorizing about the speech act independent…

  2. LARGE-VOCABULARY CHINESE TEXT/SPEECH INFORMATION RETRIEVAL USING MANDARIN SPEECH QUERIES

    E-print Network

    Wang, Hsin-Min

    LARGE-VOCABULARY CHINESE TEXT/SPEECH INFORMATION RETRIEVAL USING MANDARIN SPEECH QUERIES. Bo-ren Bai… …via the Internet. This paper deals with the problem of Chinese text and Mandarin speech information retrieval with Mandarin speech queries. Instead of using the syllable-based information alone, the word…

  3. Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition

    E-print Network

    Keywords: Robust speech recognition; MMSE estimation; Speech enhancement. …methods fall under two categories: front-end speech/feature enhancement and back-end model adaptation. …are interested in methods that perform enhancement on the speech signal. Several methods falling…

  4. Speech enhancement using super-Gaussian speech models and noncausal a priori SNR estimation

    E-print Network

    Cohen, Israel

    Speech enhancement using super-Gaussian speech models and noncausal a priori SNR estimation. Israel… …that the performance of noncausal estimation, when applied to the problem of speech enhancement, is better… …has a smaller effect on the enhanced speech signal when using the noncausal a priori SNR estimator…

  5. A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition

    E-print Network

    Whelan, Paul F.

    A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition. Abstract: This paper presents the development of a novel visual speech recognition (VSR) system based on a new representation that extends the standard viseme concept (referred to in this paper as the Visual Speech Unit…

  6. The Relationship between Speech Perception and Auditory Organisation: Studies with Spectrally Reduced Speech

    E-print Network

    Barker, Jon

    The Relationship between Speech Perception and Auditory Organisation: Studies with Spectrally Reduced Speech. Jon Barker. Abstract: Listeners are remarkably adept at recognising speech that has undergone extensive spectral reduction. Natural speech can be reproduced using as few as three time-varying sinusoids…
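
    The three-sinusoid claim is easy to demonstrate: a sine-wave replica is built by summing a few sinusoids whose frequencies and amplitudes follow formant tracks. A minimal numpy sketch with made-up trajectories (a rough vowel-glide shape, not tracks measured from real speech):

    ```python
    import numpy as np

    fs, dur = 8000, 0.5
    t = np.arange(int(fs * dur)) / fs

    # Each sinusoid tracks one formant's frequency (Hz) over time,
    # with a fixed relative amplitude. Trajectories are illustrative.
    formant_tracks = [
        (np.linspace(700, 300, t.size), 1.0),    # F1-like glide
        (np.linspace(1200, 2200, t.size), 0.5),  # F2-like glide
        (np.linspace(2500, 2900, t.size), 0.25), # F3-like glide
    ]

    replica = np.zeros_like(t)
    for freqs, amp in formant_tracks:
        # Integrate the instantaneous frequency to get the phase.
        phase = 2 * np.pi * np.cumsum(freqs) / fs
        replica += amp * np.sin(phase)
    ```

    Despite discarding the harmonic structure and broadband spectrum entirely, such replicas remain intelligible to listeners once they are heard as speech, which is the phenomenon the thesis investigates.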

  7. Synthesizing Fast Speech by Implementing Multi-Phone Units in Unit Selection Speech Synthesis

    E-print Network

    Möbius, Bernd

    Synthesizing Fast Speech by Implementing Multi-Phone Units in Unit Selection Speech Synthesis. Donata Moers, Igor Jauk, Bernd Möbius, Petra Wagner. Abstract: This paper presents a new approach to synthesizing fast speech in unit selection synthesis. After…

  8. Linking speech errors and generative phonological theory

    E-print Network

    Reber, Paul J.

    Running Head: SPEECH ERRORS AND PHONOLOGICAL THEORY. Linking speech errors and generative phonological theory. Abstract: Speech errors are a critical source of data on the tacit… …in spontaneous speech, in experimental paradigms such as tongue twisters, and those…

  9. The Contribution of Sensitivity to Speech Rhythm and Non-Speech Rhythm to Early Reading Development

    ERIC Educational Resources Information Center

    Holliman, Andrew J.; Wood, Clare; Sheehy, Kieron

    2010-01-01

    Both sensitivity to speech rhythm and non-speech rhythm have been associated with successful phonological awareness and reading development in separate studies. However, the extent to which speech rhythm, non-speech rhythm and literacy skills are interrelated has not been examined. As a result, five- to seven-year-old English-speaking children…

  10. Can Speech Recognizers Measure the Effectiveness of Encoding Algorithms for Digital Speech Transmission?

    E-print Network

    Can Speech Recognizers Measure the Effectiveness of Encoding Algorithms for Digital Speech Transmission? …telephony, often convey human speech in a highly encoded form. Methods that rely on human subjects to… …highly encoded speech. For this reason, researchers investigate alternative means to objectively measure…

  11. What can Visual Speech Synthesis tell Visual Speech Recognition? Michael M. Cohen and Dominic W. Massaro

    E-print Network

    Massaro, Dominic

    What can Visual Speech Synthesis tell Visual Speech Recognition? Michael M. Cohen and Dominic W. Massaro. Abstract: We consider the problem of speech recognition given visual and auditory information, and discuss some of the ways that speech synthesis can provide assistance. Three possible contributions…

  12. PINPOINTING PRONUNCIATION ERRORS IN CHILDREN'S SPEECH: EXAMINING THE ROLE OF THE SPEECH RECOGNIZER

    E-print Network

    Eskenazi, Maxine

    PINPOINTING PRONUNCIATION ERRORS IN CHILDREN'S SPEECH: EXAMINING THE ROLE OF THE SPEECH RECOGNIZER. Carnegie Speech Company, Pittsburgh, PA. Abstract: In speech recognition, when a system created for one application is used…

  13. Speech Enhancement Using a… (IEEE Transactions on Speech and Audio Processing, Vol. 9, No. 7, October 2001, p. 731)

    E-print Network

    Texas at Dallas, University of

    Abstract: This paper presents a sinusoidal model based algorithm for enhancement of speech… Index Terms: Sinusoidal speech model, speech and noise, speech enhancement, speech quality. I. INTRODUCTION: In general, the need… (e.g., [24]), the sinusoidal model has not received the same level of attention in a speech enhancement…

  14. Structural Representation of Speech for Phonetic Classification 

    E-print Network

    Gutkin, Alexander; King, Simon

    This paper explores the issues involved in using symbolic metric algorithms for automatic speech recognition(ASR), via a structural representation of speech. This representation is based on a set of phonological distinctive ...

  15. Articulatory features for robust visual speech recognition

    E-print Network

    Saenko, Ekaterina, 1976-

    2004-01-01

    This thesis explores a novel approach to visual speech modeling. Visual speech, or a sequence of images of the speaker's face, is traditionally viewed as a single stream of contiguous units, each corresponding to a phonetic ...

  16. Multimodal speech recognition with ultrasonic sensors

    E-print Network

    Zhu, Bo, M. Eng. Massachusetts Institute of Technology

    2008-01-01

    Ultrasonic sensing of articulator movement is an area of multimodal speech recognition that has not been researched extensively. The widely-researched audio-visual speech recognition (AVSR), which relies upon video data, ...

  17. Speech and Language Problems in Children

    MedlinePLUS

    Children vary in their development of speech and language skills. Health professionals have milestones for what's normal. ... it may be due to a speech or language disorder. Language disorders can mean that the child ...

  18. Speech-Language Therapy (For Parents)

    MedlinePLUS

    ... your child may have a problem with certain speech or language skills. Or perhaps while talking to your child, you ... correct pronunciation and use repetition exercises to build speech and language skills. Articulation therapy: Articulation, or sound production, exercises involve ...

  19. Resources for speech synthesis of Viennese varieties 

    E-print Network

    Pucher, Michael; Neubarth, Friedrich; Strom, Volker; Moosmuller, Sylvia; Hofer, Gregor; Kranzler, Christian; Schuchmann, Gudrun; Schabus, Dietmar

    2010-01-01

    This paper describes our work on developing corpora of three varieties of Viennese for unit selection speech synthesis. The synthetic voices for Viennese varieties, implemented with the open domain unit selection speech ...

  20. Speech Recognition: Its Place in Business Education.

    ERIC Educational Resources Information Center

    Szul, Linda F.; Bouder, Michele

    2003-01-01

    Suggests uses of speech recognition devices in the classroom for students with disabilities. Compares speech recognition software packages and provides guidelines for selection and teaching. (Contains 14 references.) (SK)

  1. Speech rhythm guided syllable nuclei detection

    E-print Network

    Glass, James R.

    In this paper, we present a novel speech-rhythm-guided syllable-nuclei location detection algorithm. As a departure from conventional methods, we introduce an instantaneous speech rhythm estimator to predict possible regions ...

  2. Speech synthesis by phonological structure matching. 

    E-print Network

    Taylor, Paul; Black, Alan W

    1999-01-01

    This paper presents a new technique for speech synthesis by unit selection. The technique works by specifying the synthesis target and the speech database as phonological trees, and using a selection algorithm which ...

  3. A newly devised speech accumulator.

    PubMed

    Ryu, S; Komiyama, S; Kannae, S; Watanabe, H

    1983-01-01

    Voice therapy is often most effective for treating patients with vocal cord polyp, polypoid degeneration and singer's nodule. However, little is known about the total speaking times in 1 day, the ratio of speech per hour and the sound level during speech, in individual patients. If these parameters can be readily detected, it could be clarified how speaking times or patterns are related to a particular voice disorder and/or what instruction a doctor had given a patient. We devised a speech accumulator which records the vibration time of the vocal cords by a small contact microphone attached to the neck, but does not record the actual speech, thus preserving the privacy of the individual. The time of speech, at any sound level, can be read digitally at any time. This system was clinically used for 11 patients and proved to be most useful. The longest speaking time in 1 day was 182 min, for a bus guide, and the shortest time was 33 min for an office clerk. PMID:6341919
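
    A minimal sketch of the accumulator idea in this record: tally total speaking time by thresholding short-frame energy, keeping only the voiced/unvoiced decision rather than the audio itself. The frame length and energy threshold below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def accumulate_speaking_time(signal, sample_rate, frame_ms=20, energy_threshold=0.01):
    """Total seconds in which short-frame energy exceeds a threshold.

    Only the per-frame voiced decision is retained, never the audio,
    mirroring the privacy-preserving design described in the record.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energies = np.mean(frames ** 2, axis=1)   # mean power per frame
    voiced = energies > energy_threshold      # crude phonation detector
    return voiced.sum() * frame_ms / 1000.0

# One second of "speech" (a loud sine) followed by one second of silence
sr = 8000
t = np.arange(sr) / sr
signal = np.concatenate([0.5 * np.sin(2 * np.pi * 150 * t), np.zeros(sr)])
total = accumulate_speaking_time(signal, sr)   # counts only the voiced second
```

    A real device would run this incrementally on the microphone stream; the array version above just shows the accounting.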

  4. Formant speech synthesis: improving production quality

    Microsoft Academic Search

    NEAL B. PINTO; DONALD G. CHILDERS; AJIT L. LALWANI

    1989-01-01

    The authors describe analysis and synthesis methods for improving the quality of speech produced by D.H. Klatt's (J. Acoust. Soc. Am., vol.67, p.971-95, 1980) software formant synthesizer. Synthetic speech generated using an excitation waveform resembling the glottal volume-velocity was found to be perceptually preferred over speech synthesized using other types of excitation. In addition, listeners ranked speech tokens synthesized with …

  5. Interventions for Speech Sound Disorders in Children

    ERIC Educational Resources Information Center

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  6. An Approach to Speech Driven Animation

    Microsoft Academic Search

    Ningping Sun; Kaori Suigetsu; Toru Ayabe

    2008-01-01

    In the production of Japanese-style anime, lip movement during speech is usually reduced to a simple 'open' and 'close' of the mouth because of high production costs. In this paper we provide an approach to speech-driven lip animation for Japanese-style anime. First we discuss previous work in digital speech signal processing …

  7. Improving Speech Production with Adolescents and Adults.

    ERIC Educational Resources Information Center

    Whitehead, Brenda H.; Barefoot, Sidney M.

    1992-01-01

    This paper deals with the specific problems of the adolescent and adult hearing-impaired individual who wishes to improve and develop his or her expressive speech ability. Considered are issues critical to the learning process, intervention strategies for improving speech production, and speech production as one part of communication competency.…

  8. Speech Patterns and Racial Wage Inequality

    Microsoft Academic Search

    Jeffrey Grogger

    2008-01-01

    Speech patterns differ substantially between whites and African Americans. I collect and analyze data on speech patterns to understand the role they may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and ASVAB scores. They are also highly correlated with the wages of young workers. Black speakers whose

  9. The Varieties of Speech to Young Children

    ERIC Educational Resources Information Center

    Huttenlocher, Janellen; Vasilyeva, Marina; Waterfall, Heidi R.; Vevea, Jack L.; Hedges, Larry V.

    2007-01-01

    This article examines caregiver speech to young children. The authors obtained several measures of the speech used to children during early language development (14-30 months). For all measures, they found substantial variation across individuals and subgroups. Speech patterns vary with caregiver education, and the differences are maintained over…

  10. Distinguishing deceptive from non-deceptive speech

    Microsoft Academic Search

    Julia Hirschberg; Stefan Benus; Jason M. Brenier; Frank Enos; Sarah Friedman; Sarah Gilman; Cynthia Girand; Martin Graciarenay; Andreas Katholy; Laura Michaelis; Bryan L. Pellom; Elizabeth Shriberg; Andreas Stolcke

    2005-01-01

    To date, studies of deceptive speech have largely been confined to descriptive studies and observations from subjects, researchers, or practitioners, with few empirical studies of the specific lexical or acoustic/prosodic features which may characterize deceptive speech. We present results from a study seeking to distinguish deceptive from non-deceptive speech using machine learning techniques on features extracted …

  11. Multistep coding of speech parameters for compression

    Microsoft Academic Search

    Ladan Baghai-Ravary; Steve W. Beet

    1998-01-01

    This paper presents specific new techniques for coding of speech representations and a new general approach to coding for compression that directly utilizes the multidimensional nature of the input data. Many methods of speech analysis yield a two-dimensional (2-D) pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are shown here

  12. Speech and Hearing Science, Anatomy and Physiology.

    ERIC Educational Resources Information Center

    Zemlin, Willard R.

    Written for those interested in speech pathology and audiology, the text presents the anatomical, physiological, and neurological bases for speech and hearing. Anatomical nomenclature used in the speech and hearing sciences is introduced and the breathing mechanism is defined and discussed in terms of the respiratory passage, the framework and…

  13. Speech Perception in Individuals with Auditory Neuropathy

    ERIC Educational Resources Information Center

    Zeng, Fan-Gang; Liu, Sheng

    2006-01-01

    Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN: Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…

  14. GRAPHICAL MODELS AND AUTOMATIC SPEECH RECOGNITION

    E-print Network

    Bilmes, Jeff

    GRAPHICAL MODELS AND AUTOMATIC SPEECH RECOGNITION. Jeffrey A. Bilmes. Abstract: Graphical models … for speech recognition and language processing can also be simply described by a graph, including … language modeling. Introduction: Since its inception, the field of automatic speech recognition (ASR) …

  15. Communicating by Language: The Speech Process.

    ERIC Educational Resources Information Center

    House, Arthur S., Ed.

    This document reports on a conference focused on speech problems. The main objective of these discussions was to facilitate a deeper understanding of human communication through interaction of conference participants with colleagues in other disciplines. Topics discussed included speech production, feedback, speech perception, and development of…

  16. Acoustics of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  17. Reported Speech in Talking Race on Campus.

    ERIC Educational Resources Information Center

    Buttny, Richard

    1997-01-01

    Examines the "reported speech" (in which speakers quote the speech of others or themselves) of college students concerning race. Finds that African Americans discursively portray Whites as unwilling to admit racism and stereotyping; and Whites frame African Americans as exaggerating racism or overemphasizing their ethnicity. Claims reported speech

  18. Codebook constrained Wiener filtering for speech enhancement

    Microsoft Academic Search

    T. V. Sreenivas; Pradeep Kirnapure

    1996-01-01

    Speech enhancement using iterative Wiener filtering has been shown to require interframe and intraframe constraints in all-pole parameter estimation. We show that a clean-speech VQ codebook is more effective in providing intraframe constraints and, hence, better convergence of the iterative filtering scheme. Satisfactory speech enhancement results are obtained with a small codebook of size 128, and the algorithm is effective …
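
    The codebook constraint in this record can be sketched in a toy spectral-domain form: at each iteration, the clean-spectrum estimate is snapped to the nearest entry of a small clean-speech codebook before the Wiener gain is rebuilt, keeping the estimate inside the space of plausible speech spectra. The spectra, the 2-entry codebook, and the log-spectral distance are invented for illustration, not taken from the paper (which constrains all-pole parameters).

```python
import numpy as np

def codebook_wiener(noisy_psd, noise_psd, codebook, iters=5):
    """Iterative Wiener filter whose clean-spectrum estimate is constrained
    to the nearest codebook entry at every iteration."""
    clean_est = np.maximum(noisy_psd - noise_psd, 1e-8)  # spectral-subtraction init
    best = 0
    for _ in range(iters):
        # Constraint step: snap to the closest clean-speech codebook spectrum
        dists = np.sum((np.log(codebook) - np.log(clean_est)) ** 2, axis=1)
        best = int(np.argmin(dists))
        # Filtering step: Wiener gain built from the constrained spectrum
        gain = codebook[best] / (codebook[best] + noise_psd)
        clean_est = np.maximum(gain * noisy_psd, 1e-8)
    return clean_est, best

codebook = np.array([[1.0, 4.0, 1.0],   # "vowel-like" toy spectrum
                     [4.0, 1.0, 4.0]])  # "fricative-like" toy spectrum
noise_psd = np.full(3, 0.5)
noisy_psd = codebook[0] + noise_psd     # a vowel-like frame plus noise
est, chosen = codebook_wiener(noisy_psd, noise_psd, codebook)
```

    Because every iterate is projected onto the codebook, the estimate cannot drift toward implausible spectra, which is the convergence benefit the abstract refers to.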

  19. Signal Subspace Methods for Speech Enhancement

    E-print Network

    Signal Subspace Methods for Speech Enhancement. Ph.D. Thesis, Peter S. K. Hansen, IMM, Lyngby, Denmark, 1997. The title of this thesis is Signal Subspace Methods for Speech Enhancement, where techniques from the areas of signal processing …

  20. Intelligibility enhancement of synthetic speech in noise

    E-print Network

    Edinburgh, University of

    Intelligibility enhancement of synthetic speech in noise. Cássia Valentini Botinhão. … of a hidden Markov model (HMM-) based speech synthesis system that allows for flexible enhancement strategies … with noise-independent enhancement approaches based on the acoustics of highly intelligible speech …

  1. An Adaptive KLT Approach for Speech Enhancement

    Microsoft Academic Search

    Afshin Rezayee; Saeed Gazor

    1999-01-01

    An adaptive Karhunen-Loeve transform tracking based algorithm is proposed for enhancement of speech degraded by colored additive interference. This algorithm decomposes noisy speech into its components along the axes of a KLT-based vector space of the clean speech. It is observed that the noise energy is disparately distributed along each eigenvector. These energies are obtained from noise samples gathered from silence intervals between speech …
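
    The decomposition described in this record can be illustrated with a toy example: project noisy frames onto the eigenvectors of a clean-speech covariance estimate, measure noise energy along each eigenvector from noise-only ("silence") samples, attenuate each component with a Wiener-style gain, and reconstruct. The signals, dimensions, and the white (rather than colored) noise below are simplifying assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 8, 2000

# Toy "clean speech": a rank-2 correlated process; noise: additive white
# Gaussian (the paper addresses colored interference)
basis = rng.standard_normal((dim, 2))
clean = basis @ rng.standard_normal((2, n))
noise = 0.5 * rng.standard_normal((dim, n))
noisy = clean + noise

# KLT basis from the clean-speech covariance (estimated offline in practice)
eigvals, eigvecs = np.linalg.eigh(np.cov(clean))
signal_energy = np.maximum(eigvals, 0.0)

# Noise energy along each eigenvector, from noise-only "silence" samples
noise_energy = np.mean((eigvecs.T @ noise) ** 2, axis=1)

# Wiener-style gain per KLT component, then reconstruct in signal space
gain = signal_energy / (signal_energy + noise_energy)
enhanced = eigvecs @ (gain[:, None] * (eigvecs.T @ noisy))

mse_noisy = np.mean((noisy - clean) ** 2)
mse_enhanced = np.mean((enhanced - clean) ** 2)
```

    Because the clean signal occupies only two eigen-directions, the gains in the remaining directions are near zero and most of the noise is removed there.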

  2. Iterative Speech Enhancement With Spectral Constraints

    E-print Network

    Texas at Dallas, University of

    Iterative Speech Enhancement With Spectral Constraints. John H. Hansen and Mark A. Clements, Georgia … An iterative speech enhancement technique based on spectral constraints is presented in this paper … estimate of a speech waveform in additive white noise. The new approach applies inter- and intra- …

  3. A perceptually motivated approach for speech enhancement

    Microsoft Academic Search

    Yi Hu; Philipos C. Loizou

    2003-01-01

    A new perceptually motivated approach is proposed for enhancement of speech corrupted by colored noise. The proposed approach takes into account the frequency masking properties of the human auditory system and reduces the perceptual effect of the residual noise. This new perceptual method is incorporated into a frequency-domain speech enhancement method and a subspace-based speech enhancement method. A better power

  4. Syllable Structure in Dysfunctional Portuguese Children's Speech

    ERIC Educational Resources Information Center

    Candeias, Sara; Perdigao, Fernando

    2010-01-01

    The goal of this work is to investigate whether children with speech dysfunctions (SD) show a deficit in planning some Portuguese syllable structures (PSS) in continuous speech production. Knowledge of which aspects of speech production are affected by SD is necessary for efficient improvement in the therapy techniques. The case-study is focused…

  5. Repetitive speech phenomena in Parkinson's disease

    Microsoft Academic Search

    Th Benke; C Hohenstein; W Poewe; B Butterworth

    2000-01-01

    OBJECTIVES: Repetitive speech phenomena are morphologically heterogeneous iterations of speech which have been described in several neurological disorders such as vascular dementia, progressive supranuclear palsy, Wilson's disease, and Parkinson's disease, and which are presently only poorly understood. The present prospective study investigated repetitive speech phenomena in Parkinson's disease to describe their morphology, assess their prevalence, and to establish their relation with …

  6. A Unified Theoretical Bayesian Model of Speech

    E-print Network

    Boyer, Edmond

    A Unified Theoretical Bayesian Model of Speech Communication. Clément Moulin-Frier, Jean-Luc Schwartz, Julien Diard, Pierre Bessière. GIPSA-Lab, Speech and Cognition Department (ex-ICP), UMR … Abstract: … models and theories in speech communication, this paper proposes an original Bayesian framework able to express each …

  7. Speech Recognition Experiments Silicon Auditory Models

    E-print Network

    Lazzaro, John

    Speech Recognition Experiments with Silicon Auditory Models. John Lazzaro and John Wawrzynek. … these representations. We report on a speech recognizer that uses this system for feature extraction, and we evaluate the performance of this speech recognition system on a speaker-independent 13-word recognition task.

  8. Listener Effort for Highly Intelligible Tracheoesophageal Speech

    ERIC Educational Resources Information Center

    Nagle, Kathy F.; Eadie, Tanya L.

    2012-01-01

    The purpose of this study was to determine whether: (a) inexperienced listeners can reliably judge listener effort and (b) whether listener effort provides unique information beyond speech intelligibility or acceptability in tracheoesophageal speech. Twenty inexperienced listeners made judgments of speech acceptability and amount of effort…

  9. ABOUT SPEECH MOTOR CONTROL COMPLEXITY Pascal Perrier

    E-print Network

    Paris-Sud XI, Université de

    ABOUT SPEECH MOTOR CONTROL COMPLEXITY. Pascal Perrier, Institut de la Communication Parlée. 20/07/2005. Abstract: A key issue in research about speech motor control is the level of complexity of the speech motor system … including the complex tongue-jaw biomechanics? Or would more simple internal …

  10. Speech & Hearing Clinic College of Science

    E-print Network

    Hickman, Mark

    Speech & Hearing Clinic, College of Science, Department of Communication Disorders. How to contact us: please contact the Speech and Hearing Clinic during business hours to make an appointment. Clinical … will tell you what we have learned about your child's speech and language and phonological awareness skills …

  11. AUDIOVISUAL SPEECH SYNTHESIS Barry-John Theobald

    E-print Network

    Theobald, Barry-John

    AUDIOVISUAL SPEECH SYNTHESIS. Barry-John Theobald, School of Computing Sciences, University of East Anglia, bjt@cmp.uea.ac.uk. Abstract: The ultimate goal of audiovisual speech synthesis is to create a machine that is able to articulate human-like audiovisual speech from text. There has been much interest …

  12. Audiovisual Asynchrony Detection in Human Speech

    ERIC Educational Resources Information Center

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  13. All-pole modeling of degraded speech

    Microsoft Academic Search

    Jae Lim; A. Oppenheim

    1978-01-01

    This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise. The procedure, based on maximum a posteriori (MAP) estimation techniques is first developed in the absence of noise and related to linear prediction analysis of speech. The modification in the presence of background noise is shown to be

  14. Investigating Holistic Measures of Speech Prosody

    ERIC Educational Resources Information Center

    Cunningham, Dana Aliel

    2012-01-01

    Speech prosody is a multi-faceted dimension of speech which can be measured and analyzed in a variety of ways. In this study, the speech prosody of Mandarin L1 speakers, English L2 speakers, and English L1 speakers was assessed by trained raters who listened to sound clips of the speakers responding to a graph prompt and reading a short passage.…

  15. Perception of Speech Reflects Optimal Use of Probabilistic Speech Cues

    ERIC Educational Resources Information Center

    Clayards, Meghan; Tanenhaus, Michael K.; Aslin, Richard N.; Jacobs, Robert A.

    2008-01-01

    Listeners are exquisitely sensitive to fine-grained acoustic detail within phonetic categories for sounds and words. Here we show that this sensitivity is optimal given the probabilistic nature of speech cues. We manipulated the probability distribution of one probabilistic cue, voice onset time (VOT), which differentiates word initial labial…
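
    The "optimal use" claim in this record can be illustrated with an ideal-observer sketch: model VOT for each category as a Gaussian and apply Bayes' rule. The category means, standard deviations, and prior below are illustrative assumptions, not the study's values; the qualitative point is that wider cue distributions yield a shallower, less categorical response curve.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_p(vot_ms, mean_b=0.0, mean_p=50.0, sd=10.0, prior_p=0.5):
    """Ideal-observer posterior for /p/ given a VOT value (equal-variance Gaussians)."""
    like_p = gaussian_pdf(vot_ms, mean_p, sd) * prior_p
    like_b = gaussian_pdf(vot_ms, mean_b, sd) * (1 - prior_p)
    return like_p / (like_p + like_b)

boundary = posterior_p(25.0)            # exactly ambiguous VOT -> 0.5
steep = posterior_p(30.0, sd=5.0)       # narrow categories: near-categorical
shallow = posterior_p(30.0, sd=20.0)    # wide categories: graded response
```

    Manipulating `sd` here plays the role of the probability-distribution manipulation in the study: the more the category distributions overlap, the more graded the optimal response.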

  16. Speech separation using speaker-adapted eigenvoice speech models

    Microsoft Academic Search

    Ron J. Weiss; Daniel P. W. Ellis

    2010-01-01

    We present a system for model-based source separation for use on single channel speech mixtures where the precise source characteristics are not known a priori. The sources are modeled using hidden Markov models (HMM) and sepa- rated using factorial HMM methods. Without prior speaker models for the sources in the mixture it is difficult to exactly resolve the individual sources

  17. Speech Perception in Children with Speech Output Disorders

    ERIC Educational Resources Information Center

    Nijland, Lian

    2009-01-01

    Research in the field of speech production pathology is dominated by describing deficits in output. However, perceptual problems might underlie, precede, or interact with production disorders. The present study hypothesizes that the level of the production disorders is linked to level of perception disorders, thus lower-order production problems…

  18. Animating speech: an automated approach using speech synthesised by rules

    Microsoft Academic Search

    David R. Hill; Andrew Pearce; Brian Wyvill

    1988-01-01

    This paper is concerned with the problem of animating computer drawn images of speaking human characters, and particularly with the problem of reducing the cost of adequate lip synchronisation. Since the method is based upon the use of speech synthesis by rules, extended to manipulate facial parameters, and there is also a need to gather generalised data about facial expressions

  19. NTIMIT: a phonetically balanced, continuous speech, telephone bandwidth speech database

    Microsoft Academic Search

    C. Jankowski; A. Kalyanswamy; S. Basson; J. Spitz

    1990-01-01

    The creation of the network TIMIT (NTIMIT) database, which is the result of transmitting the TIMIT database over the telephone network, is described. A brief description of the TIMIT database is given, including characteristics useful for speech analysis and recognition. The hardware and software required for the transmission of the database is described. The geographic distribution of the TIMIT utterances

  20. Pulse Vector-Excitation Speech Encoder

    NASA Technical Reports Server (NTRS)

    Davidson, Grant; Gersho, Allen

    1989-01-01

    Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.

  1. Phrase-programmable digital speech system

    SciTech Connect

    Raymond, W.J.; Morgan, R.L.; Miller, R.L.

    1987-01-27

    This patent describes a phrase speaking computer system having a programmable digital computer and a speech processor, the speech processor comprising: a voice synthesizer; a read/write speech data segment memory; a read/write command memory; control processor means including processor control programs and logic connecting to the memories and to the voice synthesizer, arranged to scan the command memory and to respond to command data entries stored therein by transferring corresponding speech data segments from the speech data segment memory to the voice synthesizer; data conveyance means, connecting the computer to the command memory and the speech data segment memory, for transferring the command data entries supplied by the computer into the command memory and for transferring the speech data segments supplied by the computer into the speech data segment memory; and an enable signal line connecting the computer to the speech processor and arranged to initiate the operation of the processor control programs and logic when the enable signal line is enabled by the computer; the programmable computer including speech control programs controlling the operation of the computer, including data conveyance command sequences that cause the computer to supply command data entries to the data conveyance means, and speech processor enabling command sequences that cause the computer to energize the enable signal line.

  2. Adaptive Redundant Speech Transmission over Wireless Multimedia Sensor Networks Based on Estimation of Perceived Speech Quality

    PubMed Central

    Kang, Jin Ah; Kim, Hong Kook

    2011-01-01

    An adaptive redundant speech transmission (ARST) approach to improve the perceived speech quality (PSQ) of speech streaming applications over wireless multimedia sensor networks (WMSNs) is proposed in this paper. The proposed approach estimates the PSQ as well as the packet loss rate (PLR) from the received speech data. Subsequently, it decides whether the transmission of redundant speech data (RSD) is required in order to assist a speech decoder to reconstruct lost speech signals for high PLRs. According to the decision, the proposed ARST approach controls the RSD transmission, then it optimizes the bitrate of speech coding to encode the current speech data (CSD) and RSD bitstream in order to maintain the speech quality under packet loss conditions. The effectiveness of the proposed ARST approach is then demonstrated using the adaptive multirate-narrowband (AMR-NB) speech codec and ITU-T Recommendation P.563 as a scalable speech codec and the PSQ estimation, respectively. It is shown from the experiments that a speech streaming application employing the proposed ARST approach significantly improves speech quality under packet loss conditions in WMSNs. PMID:22164086
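
    The control loop described in this record can be sketched as a small decision function: estimate the packet loss rate (PLR) and perceived speech quality (PSQ), decide whether redundant speech data (RSD) is needed, and split a fixed bit budget between the current and redundant frames. The thresholds and the bitrate subset below are illustrative assumptions, not values from the paper.

```python
AMR_NB_MODES = [4.75, 5.9, 7.4, 12.2]  # subset of AMR-NB bitrates, kbit/s

def plan_transmission(plr, psq_mos, budget_kbps=12.2,
                      plr_threshold=0.05, mos_threshold=3.5):
    """Return (send_rsd, csd_rate, rsd_rate) for the next packet."""
    send_rsd = plr > plr_threshold or psq_mos < mos_threshold
    if not send_rsd:
        # Spend the whole budget on the current speech data (CSD).
        return False, max(r for r in AMR_NB_MODES if r <= budget_kbps), 0.0
    # Reserve the lowest-rate mode for the redundant copy (RSD) and pick
    # the highest CSD mode that still fits inside the budget.
    rsd_rate = AMR_NB_MODES[0]
    csd_rate = max(r for r in AMR_NB_MODES if r + rsd_rate <= budget_kbps)
    return True, csd_rate, rsd_rate

clean_link = plan_transmission(plr=0.01, psq_mos=4.0)   # no redundancy needed
lossy_link = plan_transmission(plr=0.15, psq_mos=2.8)   # add a redundant copy
```

    The design trade-off mirrors the abstract: under loss, some CSD bitrate is sacrificed so a low-rate redundant copy can help the decoder reconstruct lost frames.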

  3. Contrastive Analysis and Impromptu Speech.

    ERIC Educational Resources Information Center

    Sajavaara, Kari

    Naturalistic impromptu speech is difficult for researchers to analyze because it is difficult to observe. Further, it is impossible to create two similar communicative situations across languages for contrastive analysis. The experimental compromise is the extension of traditional contrastive methodology from elements of grammar to elements of…

  4. DISTRIBUTED SPEECH RECOGNITION WIRA GUNAWAN

    E-print Network

    Hasegawa-Johnson, Mark

    The author thanks Professor Mark Hasegawa-Johnson for the opportunity to do research in speech. Topics covered include autoregressive modeling, LPC cepstrum coefficients, and Dynamic Time Warping (DTW); applications include … technology to enable customers to obtain weather information, stock quotes, business news, sports news …

  5. Speech Errors across the Lifespan

    ERIC Educational Resources Information Center

    Vousden, Janet I.; Maylor, Elizabeth A.

    2006-01-01

    Dell, Burger, and Svec (1997) proposed that the proportion of speech errors classified as anticipations (e.g., "moot and mouth") can be predicted solely from the overall error rate, such that the greater the error rate, the lower the anticipatory proportion (AP) of errors. We report a study examining whether this effect applies to changes in error…

  6. Perception of the speech code

    Microsoft Academic Search

    A. M. Liberman; F. S. Cooper; D. P. Shankweiler; M. Studdert-Kennedy

    1967-01-01

    Man could not perceive speech well if each phoneme were cued by a unit sound. In fact, many phonemes are encoded so that a single acoustic cue carries information in parallel about successive phonemic segments. This reduces the rate at which discrete sounds must be perceived, but at the price of a complex relation between cue and phoneme: cues vary

  7. Inner Speech Impairments in Autism

    ERIC Educational Resources Information Center

    Whitehouse, Andrew J. O.; Maybery, Murray T.; Durkin, Kevin

    2006-01-01

    Background: Three experiments investigated the role of inner speech deficit in cognitive performances of children with autism. Methods: Experiment 1 compared children with autism with ability-matched controls on a verbal recall task presenting pictures and words. Experiment 2 used pictures for which the typical names were either single syllable or…

  8. Speech Research. Interim Scientific Report.

    ERIC Educational Resources Information Center

    Cooper, Franklin S.

    The status and progress of several studies dealing with the nature of speech, instrumentation for its investigation, and instrumentation for practical applications is reported on. The period of January 1 through June 30, 1969 is covered. Extended reports and manuscripts cover the following topics: programing for the Glace-Holmes synthesizer,…

  9. Speech enhancement using psychoacoustic criteria

    Microsoft Academic Search

    D. Tsoukalas; M. Paraskevas; J. Mourjopoulos

    1993-01-01

    The technique uses the auditory masking threshold to extract information for the audible noise components. Those components are then removed using adaptive nonlinear spectral modification. The main advantage of such an approach is that the speech signal is not affected by processing. In addition, very little information on the features of the noise is required. The proposed method was found

  10. Speech based automatic lie detection

    Microsoft Academic Search

    M. E. Gadallah; Matar A. Matar; Ayman F. Algezawi

    1999-01-01

    This work studies the effect of the emotions that is experienced due to a guilt situation on different vocal parameters in an attempt to identify whether or not the suspect is lying. The homomorphic speech processing is applied to extract the vocal parameters related to the source excitation such as: pitch, pitch power and vowel duration and those related to

  11. Distance measures for speech processing

    Microsoft Academic Search

    J. Markel

    1976-01-01

    The properties and interrelationships among four measures of distance in speech processing are theoretically and experimentally discussed. The root mean square (rms) log spectral distance, cepstral distance, likelihood ratio (minimum residual principle or delta coding (DELCO) algorithm), and a cosh measure (based upon two nonsymmetrical likelihood ratios) are considered. It is shown that the cepstral measure bounds the rms log

  12. Embedding speech into virtual realities

    NASA Technical Reports Server (NTRS)

    Bohn, Christian-Arved; Krueger, Wolfgang

    1993-01-01

    In this work a speaker-independent speech recognition system is presented, which is suitable for implementation in Virtual Reality applications. The use of an artificial neural network in connection with a special compression of the acoustic input leads to a system which is robust, fast, easy to use, and needs no additional hardware besides common VR equipment.

  13. Temporal characteristics of speech: the effect of age and speech style.

    PubMed

    Bóna, Judit

    2014-08-01

    Aging affects temporal characteristics of speech. It is still a question how these changes occur in different speech styles which require various cognitive skills. In this paper speech rate, articulation rate, and pauses of 20 young and 20 old speakers are analyzed in four speech styles: spontaneous narrative, narrative recalls, a three-participant conversation, and reading aloud. Results show that age has a significant effect only on speech rate, articulation rate, and frequency of pauses. Speech style has a higher effect on temporal parameters than speakers' age. PMID:25096134
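
    The two tempo measures analyzed in this record differ only in how pauses are counted: speech rate includes pause time in the denominator, articulation rate excludes it. A worked example, with invented sample durations:

```python
def tempo_measures(n_syllables, phonation_s, pause_s):
    """Speech rate counts pauses in the denominator; articulation rate does not."""
    speech_rate = n_syllables / (phonation_s + pause_s)      # syllables/s overall
    articulation_rate = n_syllables / phonation_s            # syllables/s while talking
    return speech_rate, articulation_rate

# 60 syllables produced in 12 s of phonation plus 3 s of silent pauses
speech_rate, articulation_rate = tempo_measures(60, 12.0, 3.0)
```

    With these figures the speaker articulates at 5 syllables/s but, once pauses are included, the overall speech rate drops to 4 syllables/s; more or longer pauses widen the gap between the two measures.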

  14. Applying nonlinear dynamics features for speech-based fatigue detection

    Microsoft Academic Search

    Jarek Krajewski; David Sommer; Thomas Schnupp; Tom Laufenberg; Christian Heinze; Martin Golz

    2010-01-01

    This paper describes a speech signal processing method to measure fatigue from speech. The advantages of this real-time approach are that obtaining speech data is non-obtrusive and free from sensor application and calibration efforts. Applying methods of Nonlinear Dynamics (NLD) provides additional information regarding the dynamics and structure of fatigued speech compared to the commonly applied speech emotion recognition feature …

  15. Segmenting Words from Natural Speech: Subsegmental Variation in Segmental Cues

    ERIC Educational Resources Information Center

    Rytting, C. Anton; Brew, Chris; Fosler-Lussier, Eric

    2010-01-01

    Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We…

  16. THE COMPREHENSION OF RAPID SPEECH BY THE BLIND, PART III.

    ERIC Educational Resources Information Center

    FOULKE, EMERSON

    A REVIEW OF THE RESEARCH ON THE COMPREHENSION OF RAPID SPEECH BY THE BLIND IDENTIFIES FIVE METHODS OF SPEECH COMPRESSION--SPEECH CHANGING, ELECTROMECHANICAL SAMPLING, COMPUTER SAMPLING, SPEECH SYNTHESIS, AND FREQUENCY DIVIDING WITH THE HARMONIC COMPRESSOR. THE SPEECH CHANGING AND ELECTROMECHANICAL SAMPLING METHODS AND THE NECESSARY APPARATUS HAVE…

  17. Narrowband to wideband conversion of speech using GMM based transformation

    Microsoft Academic Search

    Kun-Youl Park; Hyung Soon Kim

    2000-01-01

    Reconstruction of wideband speech from its narrowband version is an attractive issue, since it can enhance the speech quality without modifying the existing communication networks. This paper proposes a new recovery method of wideband speech from narrowband speech. In the proposed method, the narrowband spectral envelope of input speech is transformed to a wideband spectral envelope based on the Gaussian

  18. Speech lab in a box: a Mandarin speech toolbox to jumpstart speech related research

    Microsoft Academic Search

    Eric Chang; Yu Shi; Jian-Lai Zhou; Chao Huang

    2001-01-01

    The necessity of gathering data has been an impediment for researchers and students who are interested in getting started in the fields related to speech recognition. We are proposing a new approach of distributing data that is designed to quickly help researchers and students achieve a set of baseline results to build upon. Furthermore, by leveraging publicly available programs, all

  19. Brain activation abnormalities during speech and non-speech in stuttering speakers

    PubMed Central

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M.J.; Ludlow, Christy L.

    2009-01-01

    Although stuttering is regarded as a speech-specific disorder, there is a growing body of evidence suggesting that subtle abnormalities in the motor planning and execution of non-speech gestures exist in stuttering individuals. We hypothesized that people who stutter (PWS) would differ from fluent controls in their neural responses during motor planning and execution of both speech and non-speech gestures that had auditory targets. Using fMRI with sparse sampling, separate BOLD responses were measured for perception, planning, and fluent production of speech and non-speech vocal tract gestures. During both speech and non-speech perception and planning, PWS had less activation in the frontal and temporoparietal regions relative to controls. During speech and non-speech production, PWS had less activation than the controls in the left superior temporal gyrus (STG) and the left pre-motor areas (BA 6) but greater activation in the right STG, bilateral Heschl’s gyrus (HG), insula, putamen, and precentral motor regions (BA 4). Differences in brain activation patterns between PWS and controls were greatest in the females and less apparent in males. In conclusion, similar differences in PWS from the controls were found during speech and non-speech; during perception and planning they had reduced activation while during production they had increased activity in the auditory area on the right and decreased activation in the left sensorimotor regions. These results demonstrated that neural activation differences in PWS are not speech-specific. PMID:19401143

  20. Speech and language delay in children.

    PubMed

    McLaughlin, Maura R

    2011-05-15

    Speech and language delay in children is associated with increased difficulty with reading, writing, attention, and socialization. Although physicians should be alert to parental concerns and to whether children are meeting expected developmental milestones, there currently is insufficient evidence to recommend for or against routine use of formal screening instruments in primary care to detect speech and language delay. In children not meeting the expected milestones for speech and language, a comprehensive developmental evaluation is essential, because atypical language development can be a secondary characteristic of other physical and developmental problems that may first manifest as language problems. Types of primary speech and language delay include developmental speech and language delay, expressive language disorder, and receptive language disorder. Secondary speech and language delays are attributable to another condition such as hearing loss, intellectual disability, autism spectrum disorder, physical speech problems, or selective mutism. When speech and language delay is suspected, the primary care physician should discuss this concern with the parents and recommend referral to a speech-language pathologist and an audiologist. There is good evidence that speech-language therapy is helpful, particularly for children with expressive language disorder. PMID:21568252

  1. Loss tolerant speech decoder for telecommunications

    NASA Technical Reports Server (NTRS)

    Prieto, Jr., Jaime L. (Inventor)

    1999-01-01

    A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
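
    The patent's extrapolation step trains an FIR multi-layer feed-forward network on the past-history buffer; the sketch below stands in for that trained network with fixed linear FIR taps, purely to illustrate one-step extrapolation of a codec parameter track (the tap values and array shapes are assumptions):

```python
import numpy as np

def extrapolate_frame(history, taps):
    """One-step extrapolation of speech-codec parameters from stored
    past-history data. A fixed linear FIR predictor stands in here for
    the patent's trained feed-forward network.
    history: (n_taps, n_params) array, oldest frame first."""
    return taps @ history   # predicted parameter vector for the lost frame

# linear-trend taps: predict x[t+1] = 2*x[t] - x[t-1]
taps = np.array([0.0, -1.0, 2.0])
history = np.array([[0.10, 1.0],
                    [0.20, 1.0],
                    [0.30, 1.0]])   # two parameter tracks over 3 frames
```

    A rising track continues its trend (0.1, 0.2, 0.3 → 0.4) while a constant track stays put, which is the qualitative behavior that makes the concealed frame blend smoothly with its neighbors.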

  2. Some articulatory details of emotional speech

    NASA Astrophysics Data System (ADS)

    Lee, Sungbok; Yildirim, Serdar; Bulut, Murtaza; Kazemzadeh, Abe; Narayanan, Shrikanth

    2005-09-01

    Differences in speech articulation among four emotion types (neutral, anger, sadness, and happiness) are investigated by analyzing tongue tip, jaw, and lip movement data collected from one male and one female speaker of American English. The data were collected using an electromagnetic articulography (EMA) system while the subjects produced simulated emotional speech. Pitch, root-mean-square (rms) energy, and the first three formants were estimated for vowel segments. For both speakers, angry speech exhibited the largest rms energy and the largest articulatory activity in terms of displacement range and movement speed. Happy speech is characterized by the largest pitch variability; it has higher rms energy than neutral speech, but its articulatory activity is comparable to, or less than, that of neutral speech. That is, happy speech is more prominent in voicing activity than in articulation. Sad speech exhibits the longest sentence duration and lower rms energy; however, its articulatory activity is no less than that of neutral speech. Interestingly, for the male speaker, articulation for vowels in sad speech is consistently more peripheral (i.e., more forwarded displacements) when compared to other emotions; this does not hold for the female speaker. These and other results will be discussed in detail with associated acoustics and perceived emotional qualities. [Work supported by NIH.]

  3. The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses

    ERIC Educational Resources Information Center

    Adank, Patti

    2012-01-01

    The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…

  4. Speech Enhancement for Android (SEA): A Speech Processing Demonstration Tool for Android Based Smart Phones and Tablets

    E-print Network

    This paper presents a speech processing platform which can be used to demonstrate and investigate speech enhancement methods. This platform is called Speech Enhancement for Android (SEA), and has been developed

  5. SYNTHETIC VISUAL SPEECH DRIVEN FROM AUDITORY SPEECH Eva Agelfors, Jonas Beskow, Björn Granström, Magnus Lundeberg, Giampiero Salvi,

    E-print Network

    Beskow, Jonas

    Department of Speech, Music and Hearing, KTH, Sweden ({eva, beskow, bjorn, magnusl, giampi, kalle, tobias}@speech.kth.se, www.speech.kth.se/teleface/). ABSTRACT: We have developed two different methods for using auditory

  6. Gifts of Speech: Women's Speeches from Around the World

    NSDL National Science Digital Library

    Leon, Liz Kent

    2012-09-13

    The Gifts of Speech site brings together speeches given by women from all around the world. The site is under the direction of Liz Linton Kent Leon, who is the electronic resources librarian at Sweet Briar College. First-time users may wish to click on the How To area to learn how to navigate the site. Of course, the FAQ area is a great way to learn about the site as well, and it should not be missed as it tells the origin story of the site. In the Collections area, visitors can listen in to all of the Nobel Lectures delivered by female recipients and look at a list of the top 100 speeches in American history as determined by a group of researchers at the University of Wisconsin-Madison and Texas A&M University. Users will also want to use the Browse area to look over talks by women from Robin Abrams to Begum Khaleda Zia, the former prime minister of the People's Republic of Bangladesh.

  7. Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds

    PubMed Central

    Agnew, Z.K.; McGettigan, C.; Scott, S.K.

    2012-01-01

    Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557

  8. The application of naturalistic conversation training to speech production in children with speech disabilities.

    PubMed Central

    Camarata, S

    1993-01-01

    The purpose of this experiment was to test the effectiveness of including speech production into naturalistic conversation training for 2 children with speech production disabilities. A multiple baseline design across behaviors (target phonemes) and across subjects (for the same phoneme) indicated that naturalistic conversation training resulted in improved spontaneous speech production. The implications of these findings are discussed relative to existing models of speech production training and other aspects of communication disorders. PMID:8331014

  9. Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments

    Microsoft Academic Search

    David Malah; Richard V. Cox; Anthony J. Accardi

    1999-01-01

    Speech enhancement algorithms which are based on estimating the short-time spectral amplitude of the clean speech have better performance when a soft-decision gain modification, depending on the a priori probability of speech absence, is used. In reported works a fixed probability, q, is assumed. Since speech is non-stationary and may not be present in every frequency bin when voiced, we
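
    The soft-decision idea the abstract builds on can be sketched as follows. The gain rule and likelihood-ratio form here are simplified stand-ins, not the paper's estimator; the fixed speech-absence probability q is exactly the assumption the paper replaces with a tracked, time-varying estimate:

```python
import numpy as np

def soft_decision_gain(snr_post, q=0.3):
    """Sketch of soft-decision gain modification: a Wiener-style gain is
    attenuated by the posterior probability of speech presence, derived
    from a fixed a-priori speech-absence probability q.
    snr_post: a-posteriori SNR estimate per frequency bin (linear)."""
    xi = np.maximum(snr_post - 1.0, 1e-3)          # crude a-priori SNR
    gain = xi / (1.0 + xi)                         # Wiener gain
    # simplified generalized likelihood ratio for speech presence
    lam = (1.0 - q) / q * np.exp(xi) / (1.0 + xi)
    p_speech = lam / (1.0 + lam)                   # posterior presence prob.
    return p_speech * gain
```

    Bins with high SNR receive a gain near the plain Wiener value; low-SNR bins are further attenuated because speech presence there is doubtful.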

  10. [Ergonomical study on Chinese speech warning].

    PubMed

    Han, D; Zhou, C; Liu, Y; Zhai, Y

    1998-02-01

    Ergonomic experiments on speech warning under background noise were carried out in 40 healthy males, aged 20-33. Through the determination of auditory reaction time to Chinese speech warnings under dual tasks, and the subjects' evaluation of the suitable time length of the main warning voice, the optimum parameters of Chinese speech warning in accordance with space ergonomics were determined. It was found that the suitable time length of the main warning voice is 0.35-0.55 s, the main interval is 0.15-0.35 s, the speech speed is 4-6 words/s, and the sentence interval is 0.2-0.4 s. Meanwhile, analysis of heart rate (HR) and heart rate variability (HRV) demonstrated that speech warning using the aforementioned parameters did not increase the operator's workload. The results can serve as an objective ergonomic basis and evaluation criterion for the design of speech warnings in manned space vehicles. PMID:11541261

  11. Primary Progressive Aphasia and Apraxia of Speech

    PubMed Central

    Jung, Youngsin; Duffy, Joseph R.; Josephs, Keith A.

    2014-01-01

    Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: non-fluent/agrammatic, semantic, and logopenic variants of primary progressive aphasia. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. Here, we review clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech. The distinctions among these disorders will be crucial since accurate diagnosis will be important from a prognostic and therapeutic standpoint. PMID:24234355

  12. Adding Speech, Language, and Hearing Benefits to Your Policy

    MedlinePLUS

    ... for 182,000 members and affiliates who are audiologists; speech-language pathologists; speech, language, and hearing scientists; ... students. Read more Connect with ASHA Information For Audiologists Speech Language Pathologists Students Faculty Contact Us The ...

  13. PERFORMANCE OF NONLINEAR SPEECH ENHANCEMENT USING PHASE SPACE RECONSTRUCTION

    E-print Network

    Povinelli, Richard J.

    Michael T. Johnson … enhancement methods. The proposed nonlinear methods are compared with traditional speech enhancement …-Malah filtering, as had been suggested by previous studies. 1. INTRODUCTION. Speech enhancement methods endeavor

  14. Binaural model-based speech intelligibility enhancement and assessment in

    E-print Network

    Binaural model-based speech intelligibility enhancement and assessment in hearing aids. Table-of-contents excerpt: beamforming and the effect on binaural cues and speech intelligibility; 2.3.4 Cepstral smoothing of masks; 2.4 Binaural CASA speech

  15. Concept-to-speech synthesis by phonological structure matching 

    E-print Network

    Taylor, Paul

    2000-04-15

    This paper presents a new way of generating synthetic-speech waveforms from a linguistic description. The algorithm is presented as a proposed solution to the speech-generation problem in a concept-to-speech system. Off-line, ...

  16. Apraxia of speech: an overview.

    PubMed

    Ogar, Jennifer; Slama, Hilary; Dronkers, Nina; Amici, Serena; Gorno-Tempini, Maria Luisa

    2005-12-01

    Apraxia of speech (AOS) is a motor speech disorder that can occur in the absence of aphasia or dysarthria. AOS has been the subject of some controversy since the disorder was first named and described by Darley and his Mayo Clinic colleagues in the 1960s. A recent revival of interest in AOS is due in part to the fact that it is often the first symptom of neurodegenerative diseases, such as primary progressive aphasia and corticobasal degeneration. This article will provide a brief review of terminology associated with AOS, its clinical hallmarks and neuroanatomical correlates. Current models of motor programming will also be addressed as they relate to AOS and finally, typical treatment strategies used in rehabilitating the articulation and prosody deficits associated with AOS will be summarized. PMID:16393756

  17. Headphone localization of speech stimuli

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1991-01-01

    Recently, three dimensional acoustic display systems have been developed that synthesize virtual sound sources over headphones based on filtering by Head-Related Transfer Functions (HRTFs), the direction-dependent spectral changes caused primarily by the outer ears. Here, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with non-individualized HRTFs. About half of the subjects 'pulled' their judgements toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgements; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by non-individualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
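
    In the time domain, the HRTF filtering described above reduces to convolving the source signal with a left/right pair of head-related impulse responses (HRIRs). A minimal sketch; the toy HRIR values are invented for illustration, not measured responses:

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Render a headphone virtual source by convolving a mono signal
    with a left/right HRIR pair (time-domain HRTF filtering).
    Returns a (2, n) stereo array, both channels padded to equal length."""
    n = len(mono) + max(len(hrir_left), len(hrir_right)) - 1
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right])

# toy HRIRs for a source on the left: right ear delayed and attenuated
mono = np.array([1.0, 0.5, 0.25])
stereo = spatialize(mono, np.array([1.0]), np.array([0.0, 0.6]))
```

    The interaural delay and level difference encoded in the HRIR pair are the azimuth cues the study found listeners could still exploit with non-individualized filters.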

  18. Language processing for speech understanding

    NASA Astrophysics Data System (ADS)

    Woods, W. A.

    1983-07-01

    This report considers language understanding techniques and control strategies that can be applied to provide higher-level support to aid in the understanding of spoken utterances. The discussion is illustrated with concepts and examples from the BBN speech understanding system, HWIM (Hear What I Mean). The HWIM system was conceived as an assistant to a travel budget manager, a system that would store information about planned and taken trips, travel budgets and their planning. The system was able to respond to commands and answer questions spoken into a microphone, and was able to synthesize spoken responses as output. HWIM was a prototype system used to drive speech understanding research. It used a phonetic-based approach, with no speaker training, a large vocabulary, and a relatively unconstraining English grammar. Discussed here is the control structure of the HWIM and the parsing algorithm used to parse sentences from the middle-out, using an ATN grammar.

  19. Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling

    E-print Network

    Glass, James R.

    … filter for speech enhancement, hidden Markov models for speech reconstruction, and speaker … All rights reserved. Keywords: Speech enhancement; Speaker modeling; Speech recognition; Missing-feature theory

  20. A Serial Prediction Component for Speech Timing

    Microsoft Academic Search

    Eric Keller; Brigitte Zellner-keller

    Durational serial interactions are not generally incorporated into contemporary predictive models of timing for speech synthesis. In this study, an anti-correlational factor applied at the syllable level was identified for syllable lags occurring within roughly 500 ms. As applied to synthetic speech, a strongly anti-correlational effect appears to lend a pleasant, "swingy" effect to the speech output, while the absence of such an effect results in a more "regimented" style

  1. Speech enhancement based on temporal processing

    Microsoft Academic Search

    Hynek Hermansky; Eric A. Wan; Carlos Avendano

    1995-01-01

    Finite impulse response (FIR) Wiener-like filters are applied to time trajectories of the cubic-root compressed short-term power spectrum of noisy speech recorded over cellular telephone communications. Informal listenings indicate that the technique brings a noticeable improvement to the quality of processed noisy speech while not causing any significant degradation to clean speech. Alternative filter structures are being investigated as well
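
    The processing pipeline the abstract describes can be sketched as below: compress the short-term power spectrum with a cube root, FIR-filter each frequency bin's time trajectory, and expand back. The filter taps here are whatever the caller supplies; the paper derives Wiener-like taps from data, which this sketch does not attempt:

```python
import numpy as np

def filter_trajectories(power_spec, fir):
    """Temporal processing sketch: cube-root compress a short-term power
    spectrogram, FIR-filter each bin's time trajectory, then expand back.
    power_spec: (frames, bins) array of short-term power values;
    fir: 1-D filter taps applied along the time (frame) axis."""
    compressed = power_spec ** (1.0 / 3.0)
    filtered = np.apply_along_axis(
        lambda traj: np.convolve(traj, fir, mode="same"), 0, compressed)
    return np.maximum(filtered, 0.0) ** 3   # clip before undoing compression
```

    A unit-sum smoothing filter leaves a stationary trajectory unchanged while attenuating frame-to-frame fluctuations, which is the intuition behind filtering noisy trajectories rather than individual frames.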

  2. Speech enhancement using linear prediction residual

    Microsoft Academic Search

    B. Yegnanarayana; Carlos Avendaño; Hynek Hermansky; P. Satyanarayana Murthy

    1999-01-01

    Abstract. In this paper we propose a method for enhancement of speech in the presence of additive noise. The objective is to selectively enhance the high signal-to-noise ratio (SNR) regions in the noisy speech in the temporal and spectral domains, without causing significant distortion in the resulting enhanced speech. This is proposed to be done at three different levels. (a) At the gross level,

  3. Speech enhancement based conceptually on auditory evidence

    Microsoft Academic Search

    Yan Ming Cheng; Douglas O'Shaughnessy

    1991-01-01

    A new idea, enhancing speech based on auditory evidence, is explored for the problem of enhancing speech degraded by stationary and nonstationary additive white noise. Distinguishing different objectives for heavy and light noise interference, two related algorithms are developed. For speech degraded by heavy noise, the improvement in signal-to-noise ratio (SNR) is as high as 12 dB; for lightly noisy

  4. Subjective Comparison of Speech Enhancement Algorithms

    Microsoft Academic Search

    Yi Hu; Philipos C. Loizou

    2006-01-01

    We report on the development of a noisy speech corpus suitable for evaluation of speech enhancement algorithms. This corpus is used for the subjective evaluation of 13 speech enhancement methods encompassing four classes of algorithms: spectral subtractive, subspace, statistical-model based and Wiener algorithms. The subjective evaluation was performed by Dynastat, Inc. using the ITU-T P.835 methodology designed to evaluate the

  5. Phonetically sensitive discriminants for improved speech recognition

    Microsoft Academic Search

    G. R. Doddington

    1989-01-01

    A phonetically sensitive transformation of speech features has yielded significant improvement in speech-recognition performance. This (linear) transformation of the speech feature vector is designed to discriminate against out-of-class confusion data and is a function of phonetic state. Evaluation of the technique on the TI/NBS connected digit database demonstrates word (sentence) error rates of 0.5% (1.5%) for unknown-length strings and 0.2%

  6. Network Speech Systems Technology Program

    Microsoft Academic Search

    C. J. Weinstein

    1980-01-01

    This report documents work performed during FY 1980 on the DCA-sponsored Network Speech Systems Technology Program. The areas of work reported are: (1) communication systems studies in Demand-Assignment Multiple Access (DAMA), voice/data integration, and adaptive routing, in support of the evolving Defense Communications System (DCS) and Defense Switched Network (DSN); (2) a satellite/terrestrial integration design study including the functional design

  7. Speed and Accuracy of Rapid Speech Output by Adolescents with Residual Speech Sound Errors Including Rhotics

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Edwards, Mary Louise

    2009-01-01

    Children with residual speech sound errors are often underserved clinically, yet there has been a lack of recent research elucidating the specific deficits in this population. Adolescents aged 10-14 with residual speech sound errors (RE) that included rhotics were compared to normally speaking peers on tasks assessing speed and accuracy of speech

  8. Speech and Language Skills of Parents of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry

    2007-01-01

    Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…

  9. Adaptive coding and optimal speech storage in digital speech interpolation systems

    Microsoft Academic Search

    P. S. Tiwari; K. C. Harit

    1990-01-01

    One of the important issues in speech interpolation is that of speech freeze-out, which becomes a very serious consideration in multihop networks. The paper discusses the implementation of two such techniques, namely speech storage and variable bit-rate strategies, with their independent and combined effects on the performance of multihop DSI networks. The trade-off between the effective increased channel capacity and

  10. A Self-Transcribing Speech Corpus: Collecting Continuous Speech with an Online Educational Game

    Microsoft Academic Search

    Alexander Gruenstein; Ian McGraw; Andrew Sutherland

    We describe a novel approach to collecting orthographically transcribed continuous speech data through the use of an online educational game called Voice Scatter, in which players study flashcards by using speech to match terms with their definitions. We analyze a corpus of 30,938 utterances, totaling 27.63 hours of speech, collected during the first 22 days that Voice Scatter was

  11. Stability and Composition of Functional Synergies for Speech Movements in Children with Developmental Speech Disorders

    ERIC Educational Resources Information Center

    Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

    2011-01-01

    The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of the syllables spa (/spa[image omitted]/) and paas (/pa[image omitted]s/) by 10 6- to 9-year-olds with developmental speech

  12. Impaired Speech Perception in Poor Readers: Evidence from Hearing and Speech Reading

    Microsoft Academic Search

    Beatrice de Gelder; Jean Vroomen

    1998-01-01

    The performance of 14 poor readers on an audiovisual speech perception task was compared with 14 normal subjects matched on chronological age (CA) and 14 subjects matched on reading age (RA). The task consisted of identifying synthetic speech varying in place of articulation on an acoustic 9-point continuum between /ba/ and /da/ (Massaro & Cohen, 1983). The acoustic speech events

  13. Shall we mix synthetic speech and human speech?: impact on users' performance, perception, and attitude

    Microsoft Academic Search

    Li Gong; Jennifer Lai

    2001-01-01

    Because it is impractical to record human voice for ever-changing dynamic content such as email messages and news, many commercial speech applications use human speech for fixed prompts and synthetic speech (TTS) for the dynamic content. However, this mixing approach may not be optimal from a consistency perspective. A 2-condition between-group experiment (N = 24) was conducted to compare

  14. Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2012

    2012-01-01

    The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be protected by the…

  15. Construction of a Rated Speech Corpus of L2 Learners' Spontaneous Speech

    ERIC Educational Resources Information Center

    Yoon, Su-Youn; Pierce, Lisa; Huensch, Amanda; Juul, Eric; Perkins, Samantha; Sproat, Richard; Hasegawa-Johnson, Mark

    2009-01-01

    This work reports on the construction of a rated database of spontaneous speech produced by second language (L2) learners of English. Spontaneous speech was collected from 28 L2 speakers representing six language backgrounds and five different proficiency levels. Speech was elicited using formats similar to that of the TOEFL iBT and the Speaking…

  16. LARGE-VOCABULARY CHINESE TEXT/SPEECH INFORMATION RETRIEVAL USING MANDARIN SPEECH QUERIES

    E-print Network

    Wang, Hsin-Min

    Bo-ren Bai. This paper deals with the problem of Chinese text and Mandarin speech information retrieval with Mandarin speech queries, a topic extensively studied in recent years [1-2].

  17. Private and Inner Speech and the Regulation of Social Speech Communication

    ERIC Educational Resources Information Center

    San Martin Martinez, Conchi; Boada i Calbet, Humbert; Feigenbaum, Peter

    2011-01-01

    To further investigate the possible regulatory role of private and inner speech in the context of referential social speech communications, a set of clear and systematically applied measures is needed. This study addresses this need by introducing a rigorous method for identifying private speech and certain sharply defined instances of inaudible…

  18. Speaking of Speech with the Disciplines: Collaborative Discussions about Collaborative Speech

    ERIC Educational Resources Information Center

    Compton, Josh

    2010-01-01

    As Lecturer of Speech in the Institute for Writing and Rhetoric at Dartmouth College, I have joined an ongoing conversation about speech that spans disciplines. This article takes a step back from looking at communication across the curriculum as a program and instead looks at one of the earliest stages of the process--conversations about speech

  19. Speech Analysis and Synthesis by Linear Prediction of the Speech Wave

    Microsoft Academic Search

    B. S. Atal; SUZANNE L. HANAUER

    1971-01-01

    We describe a procedure for efficient encoding of the speech wave by representing it in terms of time-varying parameters related to the transfer function of the vocal tract and the characteristics of the excitation. The speech wave, sampled at 10 kHz, is analyzed by predicting the present speech sample as a linear combination of the 12 previous samples. The 12
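The prediction step described above — each speech sample modeled as a linear combination of the 12 previous samples — can be sketched with the autocorrelation method of linear prediction. This is a minimal illustration on a synthetic autoregressive signal, not Atal and Hanauer's full analysis-synthesis system.

```python
import numpy as np

def lpc_coefficients(signal, order=12):
    """LPC by the autocorrelation method: solve the Toeplitz normal
    equations R a = r for the predictor coefficients a."""
    n = len(signal)
    r = np.array([signal[:n - k] @ signal[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def predict(signal, a):
    """Predict each sample as a linear combination of the previous len(a) samples."""
    order = len(a)
    pred = np.zeros_like(signal)
    for t in range(order, len(signal)):
        pred[t] = a @ signal[t - order:t][::-1]   # most recent sample first
    return pred

# Toy check on a synthetic 2nd-order autoregressive "speech-like" signal
rng = np.random.default_rng(0)
x = np.zeros(2000)
e = rng.normal(scale=0.01, size=2000)
for t in range(2, 2000):
    x[t] = 1.8 * x[t - 1] - 0.9 * x[t - 2] + e[t]

a = lpc_coefficients(x, order=12)
residual = x - predict(x, a)   # prediction error is small once dynamics are captured
```

In a full LPC coder the residual (excitation) and the slowly varying coefficients, rather than the waveform itself, are what get transmitted.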

  20. A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition

    E-print Network

    Sollich, Peter

A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition. Abstract: This work proposes a novel support vector machine (SVM) based robust automatic speech recognition framework operating on high-dimensional acoustic waveforms. A key issue is selecting the appropriate SVM kernels for classification in frequency
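As a toy illustration of SVM-based classification of acoustic frames with a kernel choice, here is a generic scikit-learn sketch; the sinusoidal "frame" data and the RBF kernel are invented stand-ins, not the paper's subband representation or kernels.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def frames(freq, n):
    """Generate n noisy 64-sample waveform frames of a sinusoid at `freq` Hz
    (sampled at 1 kHz) with random phase — a toy stand-in for speech frames."""
    t = np.arange(64) / 1000.0
    return np.array([np.sin(2 * np.pi * freq * t + rng.uniform(0, 2 * np.pi))
                     + 0.1 * rng.normal(size=64) for _ in range(n)])

# Two hypothetical "phone classes" distinguished only by frequency content
X = np.vstack([frames(50.0, 200), frames(200.0, 200)])
y = np.repeat([0, 1], 200)
idx = rng.permutation(400)
X, y = X[idx], y[idx]

# RBF-kernel SVM trained directly on the raw waveform vectors
clf = SVC(kernel="rbf", gamma="scale").fit(X[:300], y[:300])
acc = clf.score(X[300:], y[300:])   # held-out accuracy
```

The kernel choice matters here because the raw-waveform classes are not linearly separable (random phase rotates each frame within its class manifold); a nonlinear kernel lets the SVM separate the two manifolds.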

  1. GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features

    E-print Network

    Paris-Sud XI, Université de

GMM Mapping Of Visual Features of Cued Speech From Speech Spectral Features. Zuheng Ming, Denis… This paper presents a method based on GMM modeling to map acoustic speech spectral features to visual features of Cued Speech. The approach is innovative and differs from the classic text-to-visual approach. Two different training methods for GMM
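The general idea of GMM-based feature mapping can be sketched as follows: fit one GMM on stacked (source, target) feature vectors, then map a new source vector through the conditional mean of the target given the source. This is a minimal illustration of the generic technique on invented 1-D data, not necessarily the paper's exact formulation or features.

```python
import numpy as np
from scipy.special import softmax
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=4, seed=0):
    """Fit a single GMM on stacked (source, target) feature vectors."""
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=seed).fit(np.hstack([X, Y]))

def gmm_map(gmm, X, dx):
    """Map source features to target features via the conditional mean:
    E[y|x] = sum_k p(k|x) (mu_y_k + S_yx_k S_xx_k^{-1} (x - mu_x_k))."""
    mu_x, mu_y = gmm.means_[:, :dx], gmm.means_[:, dx:]
    S_xx = gmm.covariances_[:, :dx, :dx]
    S_yx = gmm.covariances_[:, dx:, :dx]
    K = gmm.n_components
    Y_hat = np.zeros((len(X), gmm.means_.shape[1] - dx))
    for i, x in enumerate(X):
        # responsibilities p(k | x) under the marginal GMM over the source
        log_p = np.array([multivariate_normal.logpdf(x, mu_x[k], S_xx[k])
                          for k in range(K)])
        resp = softmax(np.log(gmm.weights_) + log_p)
        for k in range(K):
            cond = mu_y[k] + S_yx[k] @ np.linalg.solve(S_xx[k], x - mu_x[k])
            Y_hat[i] += resp[k] * cond
    return Y_hat

# Toy check: learn the mapping y = 2x + 0.5 from paired samples
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 1))
Y = 2 * X + 0.5 + 0.05 * rng.normal(size=(2000, 1))
gmm = fit_joint_gmm(X, Y)
Y_hat = gmm_map(gmm, X[:200], dx=1)
```

Each mixture component acts as a local linear regressor; the responsibilities blend them smoothly across the source space.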

  2. Vocoders and Speech Perception: Uses of Computer-Based Speech Analysis-Synthesis in Stimulus Generation.

    ERIC Educational Resources Information Center

    Tierney, Joseph; Mack, Molly

    1987-01-01

    Stimuli used in research on the perception of the speech signal have often been obtained from simple filtering and distortion of the speech waveform, sometimes accompanied by noise. However, for more complex stimulus generation, the parameters of speech can be manipulated, after analysis and before synthesis, using various types of algorithms to…

  3. Dual stream speech recognition using articulatory syllable models

    Microsoft Academic Search

    Antti Puurula; Dirk Van Compernolle

    2010-01-01

Recent theoretical developments in neuroscience suggest that sublexical speech processing occurs via two parallel processing pathways. According to this Dual Stream Model of Speech Processing, speech is processed both as sequences of speech sounds and as articulations. We attempt to revise the “beads-on-a-string” paradigm of Hidden Markov Models in Automatic Speech Recognition (ASR) by implementing a system for dual stream speech

  4. Speech Enhancement based on Compressive Sensing Algorithm

    NASA Astrophysics Data System (ADS)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

Various methods for speech enhancement have been proposed over the years, with design efforts focused mainly on quality and intelligibility. This paper proposes a novel speech enhancement method using compressive sensing (CS), a paradigm for acquiring signals that is fundamentally different from uniform-rate digitization followed by compression, which is often used for transmission or storage. CS reduces the number of degrees of freedom of a sparse or compressible signal by permitting only certain configurations of large and zero/small coefficients, together with structured sparsity models. CS therefore provides a way of reconstructing a compressed version of the speech in the original signal from only a small number of linear, non-adaptive measurements. The overall algorithm is evaluated on speech quality using informal listening tests and the Perceptual Evaluation of Speech Quality (PESQ). Experimental results show that the CS algorithm performs very well across a wide range of speech tests and provides good noise suppression relative to conventional approaches without obvious degradation of speech quality.
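The core CS idea — reconstructing a sparse signal from a small number of linear, non-adaptive measurements — can be illustrated with orthogonal matching pursuit (OMP). This is a generic sketch on a synthetic sparse vector, not the authors' enhancement algorithm.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily select the column of A most
    correlated with the residual, then re-fit all selected columns by
    least squares."""
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(sparsity):
        if np.linalg.norm(residual) < 1e-10:
            break
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Toy check: recover a k-sparse vector from m << n random measurements
rng = np.random.default_rng(1)
n, m, k = 256, 128, 6
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ x_true                             # linear, non-adaptive measurements
x_rec = omp(A, y, k)
```

With a Gaussian measurement matrix and enough measurements relative to the sparsity, the greedy recovery is exact up to numerical precision.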

  5. SESSION INDEPENDENT NON-AUDIBLE SPEECH RECOGNITION USING SURFACE ELECTROMYOGRAPHY

    E-print Network

    Schultz, Tanja

SESSION INDEPENDENT NON-AUDIBLE SPEECH RECOGNITION USING SURFACE ELECTROMYOGRAPHY. Lena Maier… Challenges in surface electromyography based speech recognition ensue from repositioning electrodes between recordings

  6. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    ERIC Educational Resources Information Center

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  7. Speech perception as an active cognitive process.

    PubMed

    Heald, Shannon L M; Nusbaum, Howard C

    2014-01-01

One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or therapy.
PMID:24672438

  8. Speech perception as an active cognitive process

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or therapy.
PMID:24672438

  9. Speech discrimination after early exposure to pulsed-noise or speech.

    PubMed

    Ranasinghe, Kamalini G; Carraway, Ryan S; Borland, Michael S; Moreno, Nicole A; Hanacik, Elizabeth A; Miller, Robert S; Kilgard, Michael P

    2012-07-01

    Early experience of structured inputs and complex sound features generate lasting changes in tonotopy and receptive field properties of primary auditory cortex (A1). In this study we tested whether these changes are severe enough to alter neural representations and behavioral discrimination of speech. We exposed two groups of rat pups during the critical period of auditory development to pulsed-noise or speech. Both groups of rats were trained to discriminate speech sounds when they were young adults, and anesthetized neural responses were recorded from A1. The representation of speech in A1 and behavioral discrimination of speech remained robust to altered spectral and temporal characteristics of A1 neurons after pulsed-noise exposure. Exposure to passive speech during early development provided no added advantage in speech sound processing. Speech training increased A1 neuronal firing rate for speech stimuli in naïve rats, but did not increase responses in rats that experienced early exposure to pulsed-noise or speech. Our results suggest that speech sound processing is resistant to changes in simple neural response properties caused by manipulating early acoustic environment. PMID:22575207

  10. The Effects of Stimulus Variability on the Perceptual Learning of Speech and Non-Speech Stimuli

    PubMed Central

    Banai, Karen; Amitay, Sygal

    2015-01-01

    Previous studies suggest fundamental differences between the perceptual learning of speech and non-speech stimuli. One major difference is in the way variability in the training set affects learning and its generalization to untrained stimuli: training-set variability appears to facilitate speech learning, while slowing or altogether extinguishing non-speech auditory learning. We asked whether the reason for this apparent difference is a consequence of the very different methodologies used in speech and non-speech studies. We hypothesized that speech and non-speech training would result in a similar pattern of learning if they were trained using the same training regimen. We used a 2 (random vs. blocked pre- and post-testing) × 2 (random vs. blocked training) × 2 (speech vs. non-speech discrimination task) study design, yielding 8 training groups. A further 2 groups acted as untrained controls, tested with either random or blocked stimuli. The speech task required syllable discrimination along 4 minimal-pair continua (e.g., bee-dee), and the non-speech stimuli required duration discrimination around 4 base durations (e.g., 50 ms). Training and testing required listeners to pick the odd-one-out of three stimuli, two of which were the base duration or phoneme continuum endpoint and the third varied adaptively. Training was administered in 9 sessions of 640 trials each, spread over 4–8 weeks. Significant learning was only observed following speech training, with similar learning rates and full generalization regardless of whether training used random or blocked schedules. No learning was observed for duration discrimination with either training regimen. We therefore conclude that the two stimulus classes respond differently to the same training regimen. 
A reasonable interpretation of the findings is that speech is perceived categorically, enabling learning in either paradigm, while the different base durations are not well-enough differentiated to allow for categorization, resulting in disruption to learning. PMID:25714552

  11. Localization of Sublexical Speech Perception Components

    ERIC Educational Resources Information Center

    Turkeltaub, Peter E.; Coslett, H. Branch

    2010-01-01

    Models of speech perception are in general agreement with respect to the major cortical regions involved, but lack precision with regard to localization and lateralization of processing units. To refine these models we conducted two Activation Likelihood Estimation (ALE) meta-analyses of the neuroimaging literature on sublexical speech perception.…

  12. Reliability of Speech Diadochokinetic Test Measurement

    ERIC Educational Resources Information Center

    Gadesmann, Miriam; Miller, Nick

    2008-01-01

    Background: Measures of articulatory diadochokinesis (DDK) are widely used in the assessment of motor speech disorders and they play a role in detecting abnormality, monitoring speech performance changes and classifying syndromes. Although in clinical practice DDK is generally measured perceptually, without support from instrumental methods that…

  13. Repeated Speech Errors: Evidence for Learning

    ERIC Educational Resources Information Center

    Humphreys, Karin R.; Menzies, Heather; Lake, Johanna K.

    2010-01-01

    Three experiments elicited phonological speech errors using the SLIP procedure to investigate whether there is a tendency for speech errors on specific words to reoccur, and whether this effect can be attributed to implicit learning of an incorrect mapping from lemma to phonology for that word. In Experiment 1, when speakers made a phonological…

  14. Robust Speech Recognition Under Noisy Ambient

    E-print Network

CHAPTER 6: Robust Speech Recognition Under Noisy Ambient Conditions. Kuldip K. Paliwal. ABSTRACT: Automatic speech recognition is critical in natural human-centric interfaces for ambient intelligence. Keywords: matching, model combination, speaker adaptation, microphone array. 6.1 INTRODUCTION: Ambient intelligence

  15. Variability in the clear speech intelligibility advantage

    NASA Astrophysics Data System (ADS)

    Konopka, Kenneth; Smiljanic, Rajka; Bradlow, Ann

    2005-09-01

    The overall intelligibility advantage for sentences produced in clear versus conversational speech is well-documented. This study looked at recognition accuracy across words in early and late positions in semantically anomalous and meaningful sentences spoken in clear versus conversational speaking styles. For both sentence types, the results showed the expected overall intelligibility advantage for clear speech over conversational speech. For the semantically anomalous sentences, in both speaking styles, a decline in keyword identification rate was observed with words earlier in the sentence being more accurately recognized than words later in the sentence. Furthermore, the intelligibility advantage for clear over conversational speech remained relatively constant across word positions. For the meaningful sentences, the decline in keyword identification rate across word positions was observed for conversational speech only. Meaningful sentences spoken in clear speech yielded a high, relatively stable word identification rate across position-in-sentence, resulting in a larger clear speech intelligibility benefit for words late in the sentence than for words early in the sentence. These results suggest that for typical meaningful sentences, the acoustic-phonetic enhancements of clear speech and the availability of semantic-contextual information combine to ``boost'' the intelligibility of words in late sentence positions.

  16. Speech recognition with amplitude and frequency modulations

    Microsoft Academic Search

    Fan-Gang Zeng; Kaibao Nie; Ginger S. Stickney; Ying-Yee Kong; Michael Vongphoe; Ashish Bhargave; Chaogang Wei; Keli Cao

    2005-01-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited
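One common way to derive slowly varying AM and FM from a signal is through the analytic signal: its magnitude gives the amplitude envelope and the derivative of its phase gives the instantaneous frequency. This is a generic decomposition sketch on a synthetic AM tone, not necessarily the exact extraction used in the study.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the frequency-domain Hilbert transform:
    zero the negative frequencies and double the positive ones."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    if n % 2 == 0:
        h[n // 2] = 1
        h[1:n // 2] = 2
    else:
        h[1:(n + 1) // 2] = 2
    return np.fft.ifft(X * h)

fs = 8000.0
t = np.arange(4096) / fs
carrier_hz, am_hz = 1000.0, 5.0
# Synthetic test tone: 1 kHz carrier with a slow 5 Hz amplitude modulation
x = (1 + 0.5 * np.sin(2 * np.pi * am_hz * t)) * np.cos(2 * np.pi * carrier_hz * t)

z = analytic_signal(x)
am = np.abs(z)                           # amplitude envelope (AM)
phase = np.unwrap(np.angle(z))
fm = np.diff(phase) / (2 * np.pi) * fs   # instantaneous frequency in Hz (FM)
```

For this AM tone the recovered envelope tracks 1 + 0.5 sin(2π·5t) and the instantaneous frequency stays near the 1 kHz carrier, apart from small edge effects.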

  17. Speech Recognition with Primarily Temporal Cues

    Microsoft Academic Search

    Robert V. Shannon; Fan-Gang Zeng; Vivek Kamath; John Wygonski; Michael Ekelid

    1995-01-01

    Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants,
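The manipulation described — extracting temporal envelopes from broad frequency bands and using them to modulate noises of the same bandwidths — can be sketched as a simple FFT-based noise vocoder. The band edges, envelope cutoff, and filtering method below are illustrative choices, not the study's exact parameters.

```python
import numpy as np

def bandpass(x, lo, hi, fs):
    """Crude FFT-based band-pass filter: zero bins outside [lo, hi)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    X[(f < lo) | (f >= hi)] = 0
    return np.fft.irfft(X, n=len(x))

def envelope(x, fs, cutoff=50.0):
    """Temporal envelope: full-wave rectify, then low-pass (same FFT trick)."""
    return np.maximum(bandpass(np.abs(x), 0.0, cutoff, fs), 0.0)

def noise_vocode(x, fs, edges=(100, 800, 1500, 2500, 4000)):
    """Replace spectral detail in each band with band-limited noise carrying
    only that band's temporal envelope (after Shannon et al., 1995)."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(x, lo, hi, fs), fs)
        noise = bandpass(rng.normal(size=len(x)), lo, hi, fs)
        out += env * noise
    return out

# Demo on a synthetic amplitude-modulated tone standing in for speech
fs = 8000.0
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 4 * t))
y = noise_vocode(x, fs)   # temporal envelope cues kept, spectral detail removed
```

The output preserves each band's temporal envelope while presenting only noise within the band, which is exactly the degradation the study used to isolate temporal cues.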

  18. Speech reinforcement based on partial masking effect

    Microsoft Academic Search

    Jong Won Shin; Yu Gwang Jin; Seung Seop Park; Nam Soo Kim

    2009-01-01

    Perceived quality of the speech signal deteriorates significantly in the presence of ambient noise. In this paper, based on the analysis that the partial masking effect is a main source of the quality degradation when interfering signals are present, we propose a novel approach to enhance the perceived quality of speech signal when the ambient noise cannot be directly controlled

  19. The Need for a Speech Corpus

    ERIC Educational Resources Information Center

    Campbell, Dermot F.; McDonnell, Ciaran; Meinardi, Marti; Richardson, Bunny

    2007-01-01

    This paper outlines the ongoing construction of a speech corpus for use by applied linguists and advanced EFL/ESL students. In the first part, sections 1-4, the need for improvements in the teaching of listening skills and pronunciation practice for EFL/ESL students is noted. It is argued that the use of authentic native-to-native speech is…

  20. Speech and Language Delays in Identical Twins.

    ERIC Educational Resources Information Center

    Bentley, Pat

    Following a literature review on speech and language development of twins, case studies are presented of six sets of identical twins screened for entrance into kindergarten. Five sets of the twins and one boy from the sixth set failed to pass the screening test, particularly the speech and language section, and were referred for therapy to correct…

  1. Intelligibility predictors and neural representation of speech

    Microsoft Academic Search

    Bryce E. Lobdell; Jont B. Allen; Mark Hasegawa-Johnson

    2011-01-01

Intelligibility predictors tell us a great deal about human speech perception, in particular which acoustic factors strongly affect human behavior, and which do not. A particular intelligibility predictor, the Articulation Index (AI), is interesting because it models human behavior in noise, and its form has implications about representation of speech in the brain. Specifically, the Articulation Index implies that a

  2. Visual Speech Synthesis by Morphing Visemes

    E-print Network

    Ezzat, Tony

    1999-05-01

    We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a small set of images spanning a large range of mouth ...

  3. Speech Fluency in Fragile X Syndrome

    ERIC Educational Resources Information Center

    Van Borsel, John; Dor, Orianne; Rondal, Jean

    2008-01-01

    The present study investigated the dysfluencies in the speech of nine French speaking individuals with fragile X syndrome. Type, number, and loci of dysfluencies were analysed. The study confirms that dysfluencies are a common feature of the speech of individuals with fragile X syndrome but also indicates that the dysfluency pattern displayed is…

  4. The Neural Substrates of Infant Speech Perception

    ERIC Educational Resources Information Center

    Homae, Fumitaka; Watanabe, Hama; Taga, Gentaro

    2014-01-01

    Infants often pay special attention to speech sounds, and they appear to detect key features of these sounds. To investigate the neural foundation of speech perception in infants, we measured cortical activation using near-infrared spectroscopy. We presented the following three types of auditory stimuli while 3-month-old infants watched a silent…

  5. The Karlsruhe-Verbmobil speech recognition engine

    Microsoft Academic Search

    Michael Finke; Petra Geutner; Hermann Hild; Thomas Kemp; Klaus Ries; Martin Westphal

    1997-01-01

    Verbmobil, a German research project, aims at machine translation of spontaneous speech input. The ultimate goal is the development of a portable machine translator that will allow people to negotiate in their native language. Within this project the University of Karlsruhe has developed a speech recognition engine that has been evaluated on a yearly basis during the project and shows

  6. Hypnosis and the Reduction of Speech Anxiety.

    ERIC Educational Resources Information Center

    Barker, Larry L.; And Others

    The purposes of this paper are (1) to review the background and nature of hypnosis, (2) to synthesize research on hypnosis related to speech communication, and (3) to delineate and compare two potential techniques for reducing speech anxiety--hypnosis and systematic desensitization. Hypnosis has been defined as a mental state characterised by…

  7. Brain-computer interfaces for speech communication

    Microsoft Academic Search

    Jonathan S. Brumberg; Alfonso Nieto-Castanon; Philip R. Kennedy; Frank H. Guenther

    2010-01-01

    This paper briefly reviews current silent speech methodologies for normal and disabled individuals. Current techniques utilizing electromyographic (EMG) recordings of vocal tract movements are useful for physically healthy individuals but fail for tetraplegic individuals who do not have accurate voluntary control over the speech articulators. Alternative methods utilizing EMG from other body parts (e.g., hand, arm, or facial muscles) or

  8. Affective Speech Elicited With a Computer Game

    Microsoft Academic Search

    Tom Johnstone; Carien M. van Reekum; Kathryn Hird; Kim Kirsner; Klaus R. Scherer

    2005-01-01

    To determine the degree to which emotional changes in speech reflect factors other than arousal, such as valence, the authors used a computer game to induce natural emotional speech. Voice samples were elicited following game events that were either conducive or obstructive to the goal of winning and were accompanied by either pleasant or unpleasant sounds. Acoustic analysis of the

  9. European Portuguese MRI based speech production studies

    Microsoft Academic Search

    Paula Martins; Inês Carbone; Alda Pinto; Augusto Silva; António J. S. Teixeira

    2008-01-01

    Knowledge of the speech production mechanism is essential for the development of speech production models and theories. Magnetic resonance imaging delivers high quality images of soft tissues, has multiplanar capacity and allows for the visualization of the entire vocal tract. To our knowledge, there are no complete and systematic magnetic resonance imaging studies of European Portuguese production. In this study,

  10. TOWARDS AUTOMATIC SPEECH RECOGNITION IN ADVERSE ENVIRONMENTS

    Microsoft Academic Search

    D. Dimitriadis; N. Katsamanis; P. Maragos; G. Papandreou; V. Pitsikalis

    Some of our research efforts towards building Automatic Speech Recognition (ASR) systems designed to work in real-world conditions are presented. The methods we pro- pose exhibit improved performance in noisy environments and offer robustness against speaker variability. Advanced nonlinear signal processing techniques, modulation- and chaotic-based, are utilized for auditory feature extraction. The auditory features are complemented with visual speech cues

  11. Speech enhancement for hands-free terminals

    Microsoft Academic Search

Nedelko Grbić; Sven Nordholm; A. Johansson

    2001-01-01

    This paper discusses signal processing methods for speech extraction in use with voice communication applications such as personal digital assistants (PDA), mobile telephone terminals and personal computers. The user is distant from the device and thus the speech signal entering the device may be subject to reverberation and may be disturbed by background noise. The proposed structure consists of a

  12. Enhancing Speech Discrimination through Stimulus Repetition

    ERIC Educational Resources Information Center

    Holt, Rachael Frush

    2011-01-01

    Purpose: To evaluate the effects of sequential and alternating repetition on speech-sound discrimination. Method: Typically hearing adults' discrimination of 3 pairs of speech-sound contrasts was assessed at 3 signal-to-noise ratios using the change/no-change procedure. On change trials, the standard and comparison stimuli differ; on no-change…

  13. Calibration of Confidence Measures in Speech Recognition

    Microsoft Academic Search

    Dong Yu; Jinyu Li; Li Deng

    2011-01-01

Most speech recognition applications in use today rely heavily on confidence measures for making optimal decisions. In this paper, we aim to answer the question: what can be done to improve the quality of confidence measures if we cannot modify the speech recognition engine? The answer provided in this paper is a post-processing step called confidence calibration, which can
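Calibrating confidence scores as a post-processing step, without touching the recognizer, can be illustrated generically with Platt-style logistic calibration. The simulated scores, the cubic miscalibration, and the logistic map below are assumptions for illustration, not the paper's method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical setup: raw confidence scores from a black-box recognizer,
# with binary labels marking whether each recognized word was correct.
rng = np.random.default_rng(0)
raw = rng.uniform(0, 1, size=5000)
# Simulated miscalibration: the true probability of correctness is raw**3,
# so the raw scores are systematically overconfident.
labels = (rng.uniform(size=5000) < raw ** 3).astype(int)

# Platt-style calibration: fit a logistic map from raw score to P(correct)
calib = LogisticRegression().fit(raw.reshape(-1, 1), labels)
calibrated = calib.predict_proba(raw.reshape(-1, 1))[:, 1]
```

A simple check of calibration quality is the Brier score (mean squared error between predicted probability and outcome); the calibrated scores should score better than the raw ones on this data.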

  14. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    ERIC Educational Resources Information Center

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  15. Building Searchable Collections of Enterprise Speech Data.

    ERIC Educational Resources Information Center

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  16. Hate Speech and the First Amendment

    Microsoft Academic Search

    MICHAEL ISRAEL

    1999-01-01

    A cornerstone of democracy is the First Amendment's protection of free speech. The founding fathers saw this as contributing to democratic government. Ironically, contemporary free speech protects groups such as Nazis, White and Black supremacists, pornographers, gangster rappers, TV violence, and gratuitous film profiteers; in short, these are agents of disorder, and have practically nothing of discourse value. This article

  17. Speech offsets activate the right parietal cortex.

    PubMed

    Hamada, Takashi; Iwaki, Sunao; Kawano, Tsuneo

    2004-09-01

    Speech offsets, i.e., sudden transitions from continuous speech sound to silence, activated both hemispheres differently. In addition to peak activities in the bilateral temporal cortices at about 120 ms after the offsets, the right parietal cortex was activated later irrespective of the stimulated ear. The result was discussed in the context of auditory attention. PMID:15350281

  18. Speech vs. singing: infants choose happier sounds

    PubMed Central

    Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle

    2013-01-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4–13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119

  19. CLEFT PALATE. FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    ERIC Educational Resources Information Center

    RUTHERFORD, DAVID; WESTLAKE, HAROLD

    DESIGNED TO PROVIDE AN ESSENTIAL CORE OF INFORMATION, THIS BOOK TREATS NORMAL AND ABNORMAL DEVELOPMENT, STRUCTURE, AND FUNCTION OF THE LIPS AND PALATE AND THEIR RELATIONSHIPS TO CLEFT LIP AND CLEFT PALATE SPEECH. PROBLEMS OF PERSONAL AND SOCIAL ADJUSTMENT, HEARING, AND SPEECH IN CLEFT LIP OR CLEFT PALATE INDIVIDUALS ARE DISCUSSED. NASAL RESONANCE…

  20. Analog acoustic expression in speech communication

    Microsoft Academic Search

    Hadas Shintel; Howard C. Nusbaum; Arika Okrent

    2006-01-01

    We present the first experimental evidence of a phenomenon in speech communication we call “analog acoustic expression.” Speech is generally thought of as conveying information in two distinct ways: discrete linguistic-symbolic units such as words and sentences represent linguistic meaning, and continuous prosodic forms convey information about the speaker’s emotion and attitude, intended syntactic structure, or discourse structure. However, there

  1. Milton's "Areopagitica" Freedom of Speech on Campus

    ERIC Educational Resources Information Center

    Sullivan, Daniel F.

    2006-01-01

    The author discusses the content in John Milton's "Areopagitica: A Speech for the Liberty of Unlicensed Printing to the Parliament of England" (1985) and provides parallelism to censorship practiced in higher education. Originally published in 1644, "Areopagitica" makes a powerful--and precocious--argument for freedom of speech and against…

  2. Speech-driven cartoon animation with emotions

    Microsoft Academic Search

    Yan Li; Feng Yu; Ying-qing Xu; Eric Chang; Heung-yeung Shum

    2001-01-01

    In this paper, we present a cartoon face animation system for multimedia HCI applications. We animate face cartoons not only from input speech, but also based on emotions derived from speech signal. Using a corpus of over 700 utterances from different speakers, we have trained SVMs (support vector machines) to recognize four categories of emotions: neutral, happiness, anger and sadness.

  3. Pronunciation Modeling for Large Vocabulary Speech Recognition

    ERIC Educational Resources Information Center

    Kantor, Arthur

    2010-01-01

    The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy in automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the…

  4. How Should a Speech Recognizer Work?

    ERIC Educational Resources Information Center

    Scharenborg, Odette; Norris, Dennis; ten Bosch, Louis; McQueen, James M.

    2005-01-01

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that…

  5. Enhancement of speech corrupted by acoustic noise

    Microsoft Academic Search

    M. Berouti; R. Schwartz; J. Makhoul

    1979-01-01

    This paper describes a method for enhancing speech corrupted by broadband noise. The method is based on the spectral noise subtraction method. The original method entails subtracting an estimate of the noise power spectrum from the speech power spectrum, setting negative differences to zero, recombining the new power spectrum with the original phase, and then reconstructing the time waveform. While
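
    The original method summarized above is compact enough to sketch directly. The fragment below is an illustrative NumPy implementation of plain spectral subtraction (half-wave rectification of the power difference, resynthesis with the noisy phase, overlap-add), not the paper's improved variant; the frame length, hop, and Hann window are arbitrary assumptions:

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, frame_len=256, hop=128):
    """Basic spectral subtraction (illustrative sketch, not Berouti
    et al.'s refined variant)."""
    window = np.hanning(frame_len)
    # Estimate the noise power spectrum from one noise-only frame.
    noise_power = np.abs(np.fft.rfft(noise_est[:frame_len] * window)) ** 2
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        frame = noisy[start:start + frame_len] * window
        spec = np.fft.rfft(frame)
        # Subtract the noise power estimate; set negative differences to zero.
        clean_power = np.maximum(np.abs(spec) ** 2 - noise_power, 0.0)
        # Recombine the new magnitude with the original (noisy) phase.
        clean_spec = np.sqrt(clean_power) * np.exp(1j * np.angle(spec))
        # Reconstruct the time waveform by overlap-add.
        out[start:start + frame_len] += np.fft.irfft(clean_spec, n=frame_len)
    return out
```

    The hard zero floor in this basic scheme is known to produce "musical noise"; the paper's refinements (oversubtraction and a spectral floor) address exactly that.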

  6. Voice Modulations in German Ironic Speech

    ERIC Educational Resources Information Center

    Scharrer, Lisa; Christmann, Ursula; Knoll, Monja

    2011-01-01

    Previous research has shown that in different languages ironic speech is acoustically modulated compared to literal speech, and these modulations are assumed to aid the listener in the comprehension process by acting as cues that mark utterances as ironic. The present study was conducted to identify paraverbal features of German "ironic criticism"…

  7. Second Language Learners and Speech Act Comprehension

    ERIC Educational Resources Information Center

    Holtgraves, Thomas

    2007-01-01

    Recognizing the specific speech act (Searle, 1969) that a speaker performs with an utterance is a fundamental feature of pragmatic competence. Past research has demonstrated that native speakers of English automatically recognize speech acts when they comprehend utterances (Holtgraves & Ashley, 2001). The present research examined whether this…

  8. The motor theory of speech perception revised

    Microsoft Academic Search

    ALVIN M. LIBERMAN; IGNATIUS G. MATTINGLY

    1985-01-01

    A motor theory of speech perception, initially proposed to account for results of early experiments with synthetic speech, is now extensively revised to accommodate recent findings, and to relate the assumptions of the theory to those that might be made about other perceptual modes. According to the revised theory, phonetic information is perceived in a biologically distinct system, a

  9. Investigation on Mandarin broadcast news speech recognition

    Microsoft Academic Search

    Mei-Yuh Hwang; Xin Lei; Wen Wang; Takahiro Shinozaki

    2006-01-01

    This paper describes our efforts in building a competitive Mandarin broadcast news speech recognizer. We successfully incorporated the most popular speech technologies into our system. More importantly, we present two novel algorithms in smoothing pitch features and segmenting Chinese characters into word units. Additionally, we propose to borrow the principle of pointwise mutual information for creating a Chinese

  10. Neural Networks for Distant Speech Recognition

    E-print Network

    Edinburgh, University of

    Slide-deck excerpt (only fragments recoverable): Neural Networks for Distant Speech Recognition, Steve Renals, Centre for Speech Technology Research. Topics include multiple distant microphone (MDM) systems in 2014 (neural networks, less beamforming, and less adaptation), deep neural networks, and the perceptron (Rosenblatt).

  11. An Acquired Deficit of Audiovisual Speech Processing

    ERIC Educational Resources Information Center

    Hamilton, Roy H.; Shenton, Jeffrey T.; Coslett, H. Branch

    2006-01-01

    We report a 53-year-old patient (AWF) who has an acquired deficit of audiovisual speech integration, characterized by a perceived temporal mismatch between speech sounds and the sight of moving lips. AWF was less accurate on an auditory digit span task with vision of a speaker's face as compared to a condition in which no visual information from…

  12. Disfluencies in the Analysis of Speech Data.

    ERIC Educational Resources Information Center

    Naro, Anthony Julius; Scherre, Maria Marta Pereira

    1996-01-01

    Discusses a study of concord phenomena in spoken Brazilian Portuguese. Findings indicate the presence of disfluencies, including apparent corrections, in about 15% of the relevant tokens in the corpus of recorded speech data. It is concluded that speech is not overly laden with errors, and there is nothing in the data to mislead the language…

  13. Speech enhancement by nonlinear multiband envelope filtering

    Microsoft Academic Search

    Thomas Langhans; Hans Werner Strube

    1982-01-01

    Starting from a known relation between the modulation transfer function (of a room) and speech intelligibility, we tried to enhance speech corrupted by reverberation or noise by suitably filtering the envelope (power) signals in critical frequency bands. Two methods, based on FFT overlap-adding and on linear prediction, respectively, have been developed. Linear envelope filtering was not successful for either pre- or postprocessing, even

  14. Iterative speech enhancement with spectral constraints

    Microsoft Academic Search

    John H. Hansen; Mark A. Clements

    1987-01-01

    A new and improved iterative speech enhancement technique based on spectral constraints is presented in this paper. The iterative technique, originally formulated by Lim and Oppenheim, attempts to solve for the maximum likelihood estimate of a speech waveform in additive white noise. The new approach applies inter- and intra-frame spectral constraints to ensure convergence to reasonable values and hence improve

  15. Reverberant speech enhancement using cepstral processing

    Microsoft Academic Search

    D. Bees; M. Blostein; P. Kabal

    1991-01-01

    Complex cepstral deconvolution is applied to acoustic dereverberation. It is found that traditional cepstral techniques fail in acoustic dereverberation because segmentation errors in the time domain prevent accurate cepstral computation. An algorithm for speech dereverberation which incorporates a novel approach to the segmentation and windowing procedure for speech is presented. Averaging in the cepstrum is exploited to increase the separation
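
    The cepstral machinery underlying this approach is easy to illustrate. The toy fragment below is a sketch of the general idea, not the paper's segmentation-aware algorithm: the real cepstrum of a frame is the inverse FFT of its log magnitude spectrum, and an echo appears as a peak at its delay, which liftering could then attenuate before resynthesis:

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum.
    An echo at lag d shows up as peaks at quefrencies d, 2d, ..."""
    spectrum = np.fft.fft(frame)
    # Small floor avoids log(0) at spectral nulls.
    return np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)).real

# A unit impulse plus a half-amplitude echo at lag 32 ...
x = np.zeros(256)
x[0], x[32] = 1.0, 0.5
c = real_cepstrum(x)
# ... yields its largest non-zero-quefrency cepstral peak at lag 32.
echo_lag = int(np.argmax(c[1:128])) + 1
```

    Dereverberation along these lines zeroes or shrinks such peaks in the cepstrum and inverts the transform; the paper's point is that this only works if framing and windowing are handled carefully.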

  16. Evaluation of objective measures for speech enhancement

    Microsoft Academic Search

    Yi Hu; Philipos C. Loizou

    2006-01-01

    In this paper, we evaluate the performance of several objective measures in terms of predicting the quality of noisy speech en- hanced by noise suppression algorithms. The objective measures considered a wide range of distortions introduced by four types of real-world noise at two SNRs by four classes of speech enhance- ment algorithms: spectral subtractive, subspace, statistical-model based and Wiener

  17. Hidden Markov Models for Speech Recognition

    Microsoft Academic Search

    B. H. Juang; L. R. Rabiner

    1991-01-01

    The use of hidden Markov models for speech recognition has become predominant in the last several years, as evidenced by the number of published papers and talks at major speech conferences. The reasons this method has become so popular are the inherent statistical (mathematically precise) framework; the ease and availability of training algorithms for estimating the parameters of the models
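
    The statistical framework the authors highlight rests on efficiently computable quantities such as the likelihood of an observation sequence. As a minimal sketch (a toy discrete-emission HMM, not any particular recognizer), the forward algorithm computes that likelihood in log space:

```python
import numpy as np

def forward_loglik(log_pi, log_A, log_B, obs):
    """Forward algorithm: log P(obs) under a discrete-emission HMM.
    log_pi: (S,) initial-state log-probs; log_A: (S, S) transition
    log-probs (row i -> column j); log_B: (S, V) emission log-probs."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # alpha'_j = logsumexp_i(alpha_i + log a_ij) + log b_j(o)
        scores = alpha[:, None] + log_A
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0)) + log_B[:, o]
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())
```

    With all distributions uniform over two states and two symbols, every length-T sequence has probability 0.5**T, a convenient sanity check; the training algorithms the abstract mentions (Baum-Welch re-estimation) are built on top of these same forward quantities.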

  18. Scaffolded-Language Intervention: Speech Production Outcomes

    ERIC Educational Resources Information Center

    Bellon-Harn, Monica L.; Credeur-Pampolina, Maggie E.; LeBoeuf, Lexie

    2013-01-01

    This study investigated the effects of a scaffolded-language intervention using cloze procedures, semantically contingent expansions, contrastive word pairs, and direct models on speech abilities in two preschoolers with speech and language impairment speaking African American English. Effects of the lexical and phonological characteristics (i.e.,…

  19. Pitch-Learning Algorithm For Speech Encoders

    NASA Technical Reports Server (NTRS)

    Bhaskar, B. R. Udaya

    1988-01-01

    Adaptive algorithm detects and corrects errors in sequence of estimates of pitch period of speech. Algorithm operates in conjunction with techniques used to estimate pitch period. Used in such parametric and hybrid speech coders as linear predictive coders and adaptive predictive coders.

  20. SPEECH LEVELS IN VARIOUS NOISE ENVIRONMENTS

    EPA Science Inventory

    The goal of this study was to determine average speech levels used by people when conversing in different levels of background noise. The non-laboratory environments where speech was recorded were: high school classrooms, homes, hospitals, department stores, trains and commercial...

  1. Performing speech recognition research with hypercard

    NASA Technical Reports Server (NTRS)

    Shepherd, Chip

    1993-01-01

    The purpose of this paper is to describe a HyperCard-based system for performing speech recognition research and to instruct Human Factors professionals on how to use the system to obtain detailed data about the user interface of a prototype speech recognition application.

  2. Speech as Process: A Case Study.

    ERIC Educational Resources Information Center

    Brooks, Robert D.; Scheidel, Thomas M.

    1968-01-01

    In order to test the internal evaluative processes and not merely the final reactions of an audience to a speaker, 97 Caucasian college students expressed their attitudes toward Malcolm X while listening to a 25-minute tape-recorded speech by him. Eight 30-second silent intervals at natural pauses in the speech gave the students time to respond…

  3. Why Impromptu Speech Is Easy To Understand.

    ERIC Educational Resources Information Center

    Le Feal, K. Dejean

    Impromptu speech is characterized by the simultaneous processes of ideation (the elaboration and structuring of reasoning by the speaker as he improvises) and expression in the speaker. Other elements accompany this characteristic: division of speech flow into short segments, acoustic relief in the form of word stress following a pause, and both…

  4. Speech masking and cancelling and voice obscuration

    DOEpatents

    Holzrichter, John F.

    2013-09-10

    A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.

  5. Anatomy and Physiology of the Speech Mechanism.

    ERIC Educational Resources Information Center

    Sheets, Boyd V.

    This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body," "Basic Concepts Linked with the Speech Mechanism," "The Nervous System," "The Respiratory System--Sound-Power Source," "The…

  6. Pulmonic Ingressive Speech in Shetland English

    ERIC Educational Resources Information Center

    Sundkvist, Peter

    2012-01-01

    This paper presents a study of pulmonic ingressive speech, a severely understudied phenomenon within varieties of English. While ingressive speech has been reported for several parts of the British Isles, New England, and eastern Canada, thus far Newfoundland appears to be the only locality where researchers have managed to provide substantial…

  7. Enhancement and bandwidth compression of noisy speech

    Microsoft Academic Search

    J. S. Lim; A. V. Oppenheim

    1979-01-01

    Over the past several years there has been considerable attention focused on the problem of enhancement and bandwidth compression of speech degraded by additive background noise. This interest is motivated by several factors including a broad set of important applications, the apparent lack of robustness in current speech-compression systems and the development of several potentially promising and practical solutions. One

  8. Speech after Mao: Literature and Belonging

    ERIC Educational Resources Information Center

    Hsieh, Victoria Linda

    2012-01-01

    This dissertation aims to understand the apparent failure of speech in post-Mao literature to fulfill its conventional functions of representation and communication. In order to understand this pattern, I begin by looking back on the utility of speech for nation-building in modern China. In addition to literary analysis of key authors and works,…

  9. Suppressing cocktail party noise for speech acquisition

    Microsoft Academic Search

    Kai Yu; Boling Xu; Mingyang Dai; Chongzhi Yu

    2000-01-01

    A novel and improved iterative speech enhancement approach is presented to suppress the undesired speech or noise in cocktail party environments. The key idea in the proposed approach is that a difference phase-correlation-based criterion function (PCF) is introduced to work as the convergence controller for a modified iterative Wiener filter (MIWF). The interference component is attenuated continually until convergence occurs.

  10. Nonlinear, Biophysically-Informed Speech Pathology Detection

    Microsoft Academic Search

    Max Little; P. McSharry; I. Moroz; S. Roberts

    2006-01-01

    This paper reports a simple nonlinear approach to online acoustic speech pathology detection for automatic screening purposes. Straightforward linear preprocessing followed by two nonlinear measures, based parsimoniously upon the biophysics of speech production, combined with subsequent linear classification, achieves an overall normal/pathological detection performance of 91.4%, and over 99% with rejection of 15% ambiguous cases. This compares favourably with more

  11. Tampa Bay International Business Summit Keynote Speech

    NASA Technical Reports Server (NTRS)

    Clary, Christina

    2011-01-01

    A keynote speech outlining the importance of collaboration and diversity in the workplace. The 20-minute speech describes NASA's challenges and accomplishments over the years and what lies ahead. Topics include: diversity and inclusion principles, international cooperation, Kennedy Space Center planning and development, opportunities for cooperation, and NASA's vision for exploration.

  12. Childhood Apraxia of Speech Family Start Guide

    MedlinePLUS

    ... the child’s cognitive skills. In typical speech/language development, the child’s receptive and expressive skills increase together to a ... Be Involved? If you have concerns about your child’s development in addition to their speech, other professionals may ...

  13. Crossed Apraxia of Speech: A Case Report

    ERIC Educational Resources Information Center

    Balasubramanian, Venu; Max, Ludo

    2004-01-01

    The present study reports on the first case of crossed apraxia of speech (CAS) in a 69-year-old right-handed female (SE). The possibility of occurrence of apraxia of speech (AOS) following right hemisphere lesion is discussed in the context of known occurrences of ideomotor apraxias and acquired neurogenic stuttering in several cases with right…

  14. Sequence Kernels for Speaker and Speech Recognition

    E-print Network

    Gales, Mark

    Slide-deck excerpt (only fragments recoverable): Sequence Kernels for Speaker and Speech Recognition, Mark Gales (work with Martin Layton, Chris …). Overview: support vector machines and kernels; "static" kernels for text-independent speaker verification; sequence (dynamic) kernels, including discrete kernels.

  15. Speech entrainment enables patients with Broca’s aphasia to produce fluent speech

    PubMed Central

    Hubbard, H. Isabel; Hudspeth, Sarah Grace; Holland, Audrey L.; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-01-01

    A distinguishing feature of Broca’s aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect ‘speech entrainment’ and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca’s aphasia. In Experiment 1, 13 patients with Broca’s aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca’s area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. 
Behavioural and functional magnetic resonance imaging data were collected before and after the treatment phase. Patients were able to produce a greater variety of words with and without speech entrainment at 1 and 6 weeks after training. Treatment-related decrease in cortical activation associated with speech entrainment was found in areas of the left posterior-inferior parietal lobe. We conclude that speech entrainment allows patients with Broca’s aphasia to double their speech output compared with spontaneous speech. Neuroimaging results suggest that speech entrainment allows patients to produce fluent speech by providing an external gating mechanism that yokes a ventral language network that encodes conceptual aspects of speech. Preliminary results suggest that training with speech entrainment improves speech production in Broca’s aphasia providing a potential therapeutic method for a disorder that has been shown to be particularly resistant to treatment. PMID:23250889

  16. Reconstructing Speech from Human Auditory Cortex

    PubMed Central

    Pasley, Brian N.; David, Stephen V.; Mesgarani, Nima; Flinker, Adeen; Shamma, Shihab A.; Crone, Nathan E.; Knight, Robert T.; Chang, Edward F.

    2012-01-01

    How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex. PMID:22303281

  17. Voice quality modelling for expressive speech synthesis.

    PubMed

    Monzo, Carlos; Iriondo, Ignasi; Socoró, Joan Claudi

    2014-01-01

    This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameter modelling, along with the well-known prosody parameters (F0, duration, and energy), from a neutral style into a number of expressive ones. The main goal was to validate the usefulness of VoQ in the enhancement of expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify VoQ and prosodic parameters that were extracted from an expressive speech corpus. Perception test results indicated the improvement of the obtained expressive speech styles when VoQ modelling was used along with prosodic characteristics. PMID:24587738

  18. [Speech recognition in the norm and pathology].

    PubMed

    Rusalova, M N; Kislova, O O; Sidorova, O A

    2010-01-01

    The purpose of this paper was to study theoretically and experimentally the clinical and electrophysiological correlates of the human ability to recognize emotions in speech. It was shown that impaired recognition of emotions in speech occurred following lesions of the right temporal region; however, the most serious recognition deficit manifested itself with fronto-temporal localization of the lesion. The comparison of EEG characteristics between two groups of subjects with high and low indices of recognition of emotions in speech revealed a higher level of activation in the posterior temporal regions of the right hemisphere and in the frontal regions of the left hemisphere in the group with the low index of recognition. Clinical data and findings obtained in electrophysiological studies permitted us to conclude that the recognition of emotions in speech involves not only the temporal regions of the right hemisphere, but also the speech centers of the left hemisphere. PMID:20297688

  19. Open Microphone Speech Understanding: Correct Discrimination Of In Domain Speech

    NASA Technical Reports Server (NTRS)

    Hieronymus, James; Aist, Greg; Dowding, John

    2006-01-01

    An ideal spoken dialogue system listens continually, determines which utterances were spoken to it, understands them, and responds appropriately while ignoring the rest. This paper outlines a simple method for achieving this goal, which involves trading a slightly higher false rejection rate of in-domain utterances for a higher correct rejection rate of Out of Domain (OOD) utterances. The system recognizes semantic entities specified by a unification grammar which is specialized by Explanation Based Learning (EBL) so that it only uses rules seen in the training data. The resulting grammar has probabilities assigned to each construct so that overgeneralizations are not a problem. The resulting system only recognizes utterances which reduce to a valid logical form that has meaning for the system and rejects the rest. A class N-gram grammar has been trained on the same training data. This system gives good recognition performance and offers good Out of Domain discrimination when combined with the semantic analysis. The resulting systems were tested on a Space Station Robot Dialogue Speech Database and a subset of the OGI conversational speech database. Both systems run in real time on a PC laptop, and the present performance allows continuous listening with an acceptably low false acceptance rate. This type of open microphone system has been used in the Clarissa procedure reading and navigation spoken dialogue system, which is being tested on the International Space Station.

  20. Review of Visual Speech Perception by Hearing and Hearing-Impaired People: Clinical Implications

    ERIC Educational Resources Information Center

    Woodhouse, Lynn; Hickson, Louise; Dodd, Barbara

    2009-01-01

    Background: Speech perception is often considered specific to the auditory modality, despite convincing evidence that speech processing is bimodal. The theoretical and clinical roles of speech-reading for speech perception, however, have received little attention in speech-language therapy. Aims: The role of speech-read information for speech

  1. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    SciTech Connect

    Hogden, J.

    1996-12-31

    Hidden Markov models (HMMs) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMMs typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMMs better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMMs, but that should more accurately capture the statistical properties of real speech samples--presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMMs. This will allow him to highlight the similarities and differences between HMMs and the proposed technique.

  2. Subtyping Children With Speech Sound Disorders by Endophenotypes

    PubMed Central

    Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.

    2012-01-01

    Purpose The present study examined associations of 5 endophenotypes (i.e., measurable skills that are closely associated with speech sound disorders and are useful in detecting genetic influences on speech sound production), oral motor skills, phonological memory, phonological awareness, vocabulary, and speeded naming, with 3 clinical criteria for classifying speech sound disorders: severity of speech sound disorders, our previously reported clinical subtypes (speech sound disorders alone, speech sound disorders with language impairment, and childhood apraxia of speech), and the comorbid condition of reading disorders. Participants and Method Children with speech sound disorders and their siblings were assessed at early childhood (ages 4–7 years) on measures of the 5 endophenotypes. Severity of speech sound disorders was determined using the z score for Percent Consonants Correct—Revised (developed by Shriberg, Austin, Lewis, McSweeny, & Wilson, 1997). Analyses of variance were employed to determine how these endophenotypes differed among the clinical subtypes of speech sound disorders. Results and Conclusions Phonological memory was related to all 3 clinical classifications of speech sound disorders. Our previous subtypes of speech sound disorders and comorbid conditions of language impairment and reading disorder were associated with phonological awareness, while severity of speech sound disorders was weakly associated with this endophenotype. Vocabulary was associated with mild versus moderate speech sound disorders, as well as comorbid conditions of language impairment and reading disorder. These 3 endophenotypes proved useful in differentiating subtypes of speech sound disorders and in validating current clinical classifications of speech sound disorders. PMID:22844175

  3. The ambassador's speech: A particularly Hellenistic genre of oratory

    Microsoft Academic Search

    Cecil W. Wooten

    1973-01-01

    The ambassador's speech assumed great importance during the Hellenistic period and became a distinct genre of deliberative oratory. Although there are no genuine ambassador's speeches extant, one can construct a model speech of this type by comparing ambassador's speeches in the Greek historians, especially Polybius.

  4. Speech Sound Disorders in a Community Study of Preschool Children

    ERIC Educational Resources Information Center

    McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane

    2013-01-01

    Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…

  5. Speech Characteristics Associated with Three Genotypes of Ataxia

    ERIC Educational Resources Information Center

    Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana

    2011-01-01

    Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…

  6. A review of ASR technologies for children's speech

    Microsoft Academic Search

    Matteo Gerosa; Diego Giuliani; Shrikanth Narayanan; Alexandros Potamianos

    2009-01-01

    In this paper, we review: (1) the acoustic and linguistic properties of children's speech for both read and spontaneous speech, and (2) the developments in automatic speech recognition for children with application to spoken dialogue and multimodal dialogue system design. First, the effect of developmental changes on the absolute values and variability of acoustic correlates is presented for read speech

  7. On the Dynamics of Casual and Careful Speech.

    ERIC Educational Resources Information Center

    Hieke, A. E.

    Comparative statistical data are presented on speech dynamic (as contrasted with lexical and rhetorical) aspects of major speech styles. Representative samples of story retelling, lectures, speeches, sermons, interviews, and panel discussions serve to determine posited differences between casual and careful speech. Data are drawn from 15,393…

  8. TOWARDS RAPID LANGUAGE PORTABILITY OF SPEECH PROCESSING SYSTEMS Tanja Schultz

    E-print Network

    Schultz, Tanja

    Speech processing products in several languages have been widely distributed all over the world. This work aims to rapidly develop automatic speech processing systems in many languages. We successfully built speech and text data…

  9. Experiment in Learning to Discriminate Frequency Transposed Speech.

    ERIC Educational Resources Information Center

    Ahlstrom, K.G.; And Others

    In order to improve speech perception by transposing the speech signals to lower frequencies, to determine which aspects of the information in the acoustic speech signals were influenced by transposition, and to compare two different methods of training speech perception, 44 subjects were trained to discriminate between transposed words or…

  10. Visual and Auditory Input in Second-Language Speech Processing

    ERIC Educational Resources Information Center

    Hardison, Debra M.

    2010-01-01

    The majority of studies in second-language (L2) speech processing have involved unimodal (i.e., auditory) input; however, in many instances, speech communication involves both visual and auditory sources of information. Some researchers have argued that multimodal speech is the primary mode of speech perception (e.g., Rosenblum 2005). Research on…

  11. Computational Differences between Whispered and Non-Whispered Speech

    ERIC Educational Resources Information Center

    Lim, Boon Pang

    2011-01-01

    Whispering is a common type of speech which is not often studied in speech technology. Perceptual and physiological studies show us that whispered speech is subtly different from phonated speech, and is surprisingly able to carry a tremendous amount of information. In this dissertation we consider the question: What makes whispering a good form of…

  12. Listening to talking faces: motor cortical activation during speech perception

    E-print Network

    Coulson, Seana

    Audiovisual speech perception activated a network of brain regions that included cortical motor areas, suggesting that the speech perception process involves a network of multimodal brain regions associated with speech…

  13. Speech enhancement using a constrained iterative sinusoidal model

    Microsoft Academic Search

    Jesper Jensen; John H. L. Hansen

    2001-01-01

    This paper presents a sinusoidal model based algorithm for enhancement of speech degraded by additive broad-band noise. In order to ensure speech-like characteristics observed in clean speech, smoothness constraints are imposed on the model parameters using a spectral envelope surface (SES) smoothing procedure. Algorithm evaluation is performed using speech signals degraded by additive white Gaussian noise. Distortion as measured by

  14. Phase coherence in speech reconstruction for enhancement and coding applications

    Microsoft Academic Search

    T. F. Quatieri; R. J. McAulay

    1989-01-01

    It has been shown that an analysis-synthesis system based on a sinusoidal representation leads to synthetic speech that is essentially perceptually indistinguishable from the original. A change in speech quality has been observed, however, when the phase relation of the sine waves is altered. This occurs in practice when sine waves are processed for speech enhancement and for speech coding.
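
    The sinusoidal analysis-synthesis idea behind these two records can be illustrated with a toy sketch: pick the strongest rFFT bins of a frame and resynthesize it from their measured amplitudes, bin frequencies, and phases. This is an illustrative simplification, not Quatieri and McAulay's system; a real coder tracks peaks across frames and interpolates phase, which this omits.

```python
import numpy as np

def sinusoidal_resynth(frame, k=5):
    """Toy sinusoidal analysis-synthesis: keep the k largest rFFT bins of a
    frame and resynthesize it as a sum of sinusoids with the measured
    amplitudes, bin frequencies, and phases."""
    n = len(frame)
    spec = np.fft.rfft(frame)
    bins = np.argsort(np.abs(spec))[-k:]          # k strongest components
    t = np.arange(n)
    out = np.zeros(n)
    for b in bins:
        amp, ph = np.abs(spec[b]), np.angle(spec[b])
        scale = 1.0 if b in (0, n // 2) else 2.0  # one-sided spectrum scaling
        out += (scale / n) * amp * np.cos(2 * np.pi * b * t / n + ph)
    return out
```

    For a frame made of exact-bin sinusoids the reconstruction is near-perfect; altering the recovered phases (as the record discusses) changes the waveform while leaving the magnitude spectrum intact.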

  15. PSYCHOACOUSTICALLY CONSTRAINED AND DISTORTION MINIMIZED SPEECH ENHANCEMENT ALGORITHM

    E-print Network

    Yoo, Chang D.

    A psychoacoustically constrained and distortion-minimized speech enhancement algorithm is considered. In general, noise reduction leads to speech distortion, and thus the goal of an enhancement algorithm should be to reduce noise and speech distortion so that both…

  16. Usefulness of Phase in Speech Processing Kuldip K. Paliwal

    E-print Network

    Most speech processing applications (such as speech recognition [1, 2] and enhancement [3, 4]) use only the power spectrum; the phase spectrum is totally ignored. Similarly, in speech enhancement systems [3], only the power spectrum is enhanced; the phase spectrum of noisy speech is left untouched. In this paper…

  17. Subjective comparison and evaluation of speech enhancement algorithms

    Microsoft Academic Search

    Yi Hu; Philipos C. Loizou

    2007-01-01

    Making meaningful comparisons between the performance of the various speech enhancement algorithms proposed over the years has been elusive due to lack of a common speech database, differences in the types of noise used and differences in the testing methodology. To facilitate such comparisons, we report on the development of a noisy speech corpus suitable for evaluation of speech enhancement

  18. SPEECH ENHANCEMENT USING MULTI--PULSE EXCITED LINEAR PREDICTION SYSTEM

    E-print Network

    The problem of enhancing speech corrupted by additive white noise, when only noisy speech is available, is of considerable interest. It is shown that for successful enhancement of speech the error-weighting filter should…

  19. SPEECH ENHANCEMENT USING HARMONIC REGENERATION Cyril Plapous 1

    E-print Network

    Paris-Sud XI, Université de

    The problem of enhancing speech degraded by additive noise, when only the noisy speech is available, has been widely studied. Distortion remains in enhanced speech because of the unreliability of estimators at small signal-to-noise ratios. We propose…

  20. Speech enhancement using a mixture-maximum model

    Microsoft Academic Search

    David Burshtein; Sharon Gannot

    2002-01-01

    We present a spectral domain, speech enhancement algorithm. The new algorithm is based on a mixture model for the short time spectrum of the clean speech signal, and on a maximum assumption in the production of the noisy speech spectrum. In the past this model was used in the context of noise robust speech recognition. In this paper we show
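
    Spectral-domain enhancement algorithms like the one described here share a common skeleton: transform to short-time spectra, attenuate the magnitudes, and resynthesize with the noisy phase. Below is a minimal magnitude spectral-subtraction baseline, a generic sketch rather than Burshtein and Gannot's mixture-maximum algorithm; the frame size, oversubtraction factor `alpha`, and spectral floor are illustrative assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_est_frames=5, frame=256, hop=128,
                      alpha=2.0, floor=0.01):
    """Generic magnitude spectral subtraction: estimate the noise spectrum
    from the first few frames, subtract it from each frame's magnitude
    (with a spectral floor), and resynthesize using the noisy phase via
    windowed overlap-add."""
    win = np.hanning(frame)
    n_frames = 1 + (len(noisy) - frame) // hop
    spectra = np.stack([np.fft.rfft(win * noisy[i * hop:i * hop + frame])
                        for i in range(n_frames)])
    noise_mag = np.abs(spectra[:noise_est_frames]).mean(axis=0)  # noise estimate
    mag = np.abs(spectra)
    clean_mag = np.maximum(mag - alpha * noise_mag, floor * mag)  # floor avoids negatives
    clean_spec = clean_mag * np.exp(1j * np.angle(spectra))       # keep noisy phase
    out = np.zeros(len(noisy))
    for i, s in enumerate(clean_spec):
        out[i * hop:i * hop + frame] += np.fft.irfft(s, frame) * win
    return out
```

    On a segment that is pure noise, the subtraction drives most bins to the spectral floor, which is the behavior (and the source of musical-noise artifacts) that model-based methods such as the mixture-maximum approach aim to improve on.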

  1. AUDITORY CODING BASED SPEECH ENHANCEMENT Yao Ren, Michael T. Johnson

    E-print Network

    Johnson, Michael T.

    This paper demonstrates a speech enhancement system based on an efficient auditory coding approach, which better captures the non-stationary characteristics of speech signals than the Fourier transform or wavelet transform. Enhancement is accomplished…

  2. On the use of dynamic spectral parameters in speech recognition

    Microsoft Academic Search

    S. M. Ahadi; H. Sheikhzadeh; R. L. Brennan; G. H. Freeman; E. Chau

    2003-01-01

    Spectral dynamics have attracted the attention of researchers in speech recognition for a long time. As part of the speech feature vector they are found to be useful and hence are almost part of any feature extraction algorithm for speech recognition. However, the usual cepstral dynamics do not directly reflect the dynamics of the speech spectrum, as they are extracted

  3. Speech Communication 14 (1994) 205-229

    E-print Network

    Kabal, Peter

    A measure is presented for quantifying the degree of distortion introduced by a speech coder, comparing an original speech signal and its coded version in an information-theoretic sense. In essence, it evaluates the cross-entropy of the neural firings for the original and coded speech, and a rate-distortion function for speech coding is computed using the Blahut algorithm. Four state-of-the-art speech coders with rates…

  4. Cries and Whispers Classification of Vocal Effort in Expressive Speech

    E-print Network

    Classification of vocal effort in expressive speech raises attractive and challenging issues for speech technologies, e.g. the development of automatic content-based speech processing and speech recognition systems in the context of video games post-production and voice…

  5. BCS 561: Speech Perception and Recognition Spring 2006

    E-print Network

    Instructor: Richard Aslin. This course focuses on human speech perception and recognition. Topics include an overview of phonetics, categorical perception, speech perception by nonhumans and by human infants, perception of nonnative speech sounds, intermodal…

  6. SPEECH RECOGNITION FOR A TRAVEL RESERVATION SYSTEM Hakan Erdogan

    E-print Network

    Erdogan, Hakan

    This paper reports on speech recognition for a spoken dialog system for automatic travel reservations, developed at the IBM TJ Watson Research Center. The system can receive speech input from an analog line or from IP telephony through a web site. The system uses speech…

  7. Speech Perception Lori L. Holt, Ph.D.

    E-print Network

    Holt, Lori L.

    We are so adept at speech perception that the ability seems trivial. However, the ease with which we perceive speech belies the complexity of the perceptual, cognitive and neural mechanisms involved. The primary reason that speech…

  8. The SPHINX-II speech recognition system: an overview

    Microsoft Academic Search

    Xuedong Huang; Fileno Alleva; Hsiao-Wuen Hon; Mei-Yuh Hwang; Ronald Rosenfeld

    1993-01-01

    In order for speech recognizers to deal with increased task perplexity, speaker variation, and environment variation, improved speech recognition is critical. Steady progress has been made along these three dimensions at Carnegie Mellon. In this paper, we review the SPHINX-II speech recognition system and summarize our recent efforts on improved speech recognition.

  9. HMM-based Speech Synthesis Adapted to Listeners' & Talkers' conditions

    E-print Network

    Edinburgh, University of

    Dr Junichi Yamagishi, The Centre for Speech Technology Research, University of Edinburgh (www.cstr.ed.ac.uk). Agenda: 1. Text-to-speech synthesis (TTS) - Hidden Markov model (HMM…

  10. Speech-Background Classification by Using SVM Technique

    Microsoft Academic Search

    Waleed H. Abdulla; Vojislav Kecman; Nik Kasabov

    This paper investigates a novel support vector machines (SVMs) based technique to segregate speech segments from the concurrent background. The goal of the speech segment extraction is to separate the acoustic events of interest in a continuously recorded signal from the other parts of the signal (background). Speech segment extraction is an essential step in many front-end processors in speech

  11. Tracking Change in Children with Severe and Persisting Speech Difficulties

    ERIC Educational Resources Information Center

    Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

    2013-01-01

    Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…

  12. Phonemic Characteristics of Apraxia of Speech Resulting from Subcortical Hemorrhage

    ERIC Educational Resources Information Center

    Peach, Richard K.; Tonkovich, John D.

    2004-01-01

    Reports describing subcortical apraxia of speech (AOS) have received little consideration in the development of recent speech processing models because the speech characteristics of patients with this diagnosis have not been described precisely. We describe a case of AOS with aphasia secondary to basal ganglia hemorrhage. Speech-language symptoms…

  13. An improved automatic lipreading system to enhance speech recognition

    Microsoft Academic Search

    Eric Petajan; Bradford Bischoff; David Bodoff; N. M. Brooke

    1988-01-01

    Current acoustic speech recognition technology performs well with very small vocabularies in noise or with large vocabularies in very low noise. Accurate acoustic speech recognition in noise with vocabularies over 100 words has yet to be achieved. Humans frequently lipread the visible facial speech articulations to enhance speech recognition, especially when the acoustic signal is degraded by noise or hearing

  14. Monkey Lipsmacking Develops Like the Human Speech Rhythm

    ERIC Educational Resources Information Center

    Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.

    2012-01-01

    Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…

  15. The Effectiveness of Clear Speech as a Masker

    ERIC Educational Resources Information Center

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  16. Optimal subband Kalman filter for normal and oesophageal speech enhancement.

    PubMed

    Ishaq, Rizwan; García Zapirain, Begoña

    2014-01-01

    This paper presents the single channel speech enhancement system using subband Kalman filtering by estimating optimal Autoregressive (AR) coefficients and variance for speech and noise, using Weighted Linear Prediction (WLP) and Noise Weighting Function (NWF). The system is applied for normal and Oesophageal speech signals. The method is evaluated by Perceptual Evaluation of Speech Quality (PESQ) score and Signal to Noise Ratio (SNR) improvement for normal speech and Harmonic to Noise Ratio (HNR) for Oesophageal Speech (OES). Compared with previous systems, the normal speech indicates 30% increase in PESQ score, 4 dB SNR improvement and OES shows 3 dB HNR improvement. PMID:25227070
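
    Kalman-filter enhancement methods of this kind model clean speech as an autoregressive process observed in additive noise. The following is a minimal scalar sketch with a first-order AR model, illustrative only and not the paper's subband WLP/NWF system; the parameters `a`, `q`, and `r` are assumed known here, whereas the paper estimates them from the signal.

```python
import numpy as np

def kalman_denoise(y, a=0.95, q=0.1, r=1.0):
    """Minimal scalar Kalman filter: the clean signal is modeled as a
    first-order autoregressive process x[t] = a*x[t-1] + w (variance q),
    observed in additive noise y[t] = x[t] + v (variance r).
    Returns the filtered estimate of x."""
    x_hat, p = 0.0, 1.0
    out = np.empty(len(y), dtype=float)
    for t, yt in enumerate(y):
        # predict step
        x_pred = a * x_hat
        p_pred = a * a * p + q
        # update step
        k = p_pred / (p_pred + r)        # Kalman gain
        x_hat = x_pred + k * (yt - x_pred)
        p = (1.0 - k) * p_pred
        out[t] = x_hat
    return out
```

    When the model matches the signal, the filtered estimate has lower mean-squared error than the raw noisy observations; subband variants apply the same recursion per frequency band with band-specific AR models.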

  17. Free Speech Movement Digital Archive

    NSDL National Science Digital Library

    The Free Speech Movement that began on the Berkeley campus of the University of California in 1964 sparked a groundswell of student protests and campus-based social activism that would later spread across the United States for the remainder of the decade. With a substantial gift from Stephen M. Silberstein in the late 1990s, the University of California Berkeley Library began an ambitious program to document the role of those students and other participants who gave a coherent and organized voice to the Free Speech Movement. The primary documents provided here are quite extensive and include transcriptions of legal defense documents, leaflets passed out by members of the movement, letters from administrators and faculty members regarding the movement and student unrest, and oral histories. The site also provides a detailed bibliography of material dealing with the movement and a chronology of key events within its early history. Perhaps the most engaging part of the site is the Social Activism Sound Recording Project, which features numerous audio clips of faculty and academic senate debates, student protests, and discussions that were recorded during this period.

  18. Speech processing using conditional observable maximum likelihood continuity mapping

    DOEpatents

    Hogden, John; Nix, David

    2004-01-13

    A computer implemented method enables the recognition of speech and speech characteristics. Parameters are initialized of first probability density functions that map between the symbols in the vocabulary of one or more sequences of speech codes that represent speech sounds and a continuity map. Parameters are also initialized of second probability density functions that map between the elements in the vocabulary of one or more desired sequences of speech transcription symbols and the continuity map. The parameters of the probability density functions are then trained to maximize the probabilities of the desired sequences of speech-transcription symbols. A new sequence of speech codes is then input to the continuity map having the trained first and second probability function parameters. A smooth path is identified on the continuity map that has the maximum probability for the new sequence of speech codes. The probability of each speech transcription symbol for each input speech code can then be output.

  19. COMBINING CEPSTRAL NORMALIZATION AND COCHLEAR IMPLANT-LIKE SPEECH PROCESSING FOR MICROPHONE ARRAY-BASED SPEECH RECOGNITION

    E-print Network

    Garner, Philip N.

    This work combines cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition. The original recordings of the Numbers corpus (MONC) are clean and not overlapping. Cochlear implant-like speech processing, which…

  20. Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

    ERIC Educational Resources Information Center

    Cowan, Gloria; Khatchadourian, Desiree

    2003-01-01

    Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…

  1. 1600 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 6, AUGUST 2011 Unvoiced Speech Segregation From Nonspeech

    E-print Network

    Wang, DeLiang "Leon"

    This paper addresses the segregation of unvoiced speech from nonspeech interference via CASA and spectral subtraction. Speech enhancement methods have been proposed to enhance noisy…

  2. 1110 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 6, NOVEMBER 2005 Processing of Reverberant Speech

    E-print Network

    Zotkin, Dmitry N.

    Processing of reverberant speech is important both for time-delay estimation algorithms and for enhancement of speech. Time-delay estimation for speech signals received at an array of microphones and enhancement of the received speech signals are challenging tasks [1], [2]. Time-delay information is also useful to enhance the speech, by combining the output signals from…

  3. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 10, NO. 6, SEPTEMBER 2002 341 Speech Enhancement Using a

    E-print Network

    Gannot, Sharon

    We present a spectral domain speech enhancement algorithm based on a Gaussian mixture model and the MIXMAX model, and compare it to alternative speech enhancement algorithms. Speech quality and intelligibility might significantly deteriorate…

  4. Model-based Noisy Speech Recognition with Environment Parameters Estimated by Noise Adaptive Speech Recognition with Prior

    E-print Network

    We have proposed earlier a noise adaptive speech recognition approach for recognizing speech corrupted by nonstationary noise and channel distortion. In this paper, we extend…

  5. Speech and hearing acoustics at Bell Labs

    NASA Astrophysics Data System (ADS)

    Allen, Jont

    2001-05-01

    A. G. Bell's interest in basic research of speech and hearing was one of the keys to the Bell Lab culture. When the first network circuits were built, speech quality was very low. Research was needed on speech articulation (the probability correct for nonsense speech sounds). George Campbell, a mathematician and ultimate engineer, and expert on Heaviside, extended work of Lord Rayleigh. In 1910 Campbell was the first to generate consonant identification confusion matrices, and show sound grouping (features). Crandall took up this work and attempted (but failed) to define the articulation density over frequency. By 1921 Fletcher had solved Crandall's problem, with the Articulation Index theory, based on the idea of independent feature perception, across frequency and time. In 1929 he wrote his first book, Speech and Hearing, which sold over 5000 copies. His second book, Speech and Hearing in Communications, was first released in 1953, after his retirement. Other key people that worked closely with Fletcher were J. C. Steinberg, Munson, French, Galt, Hartley, Kingsbury, Nyquist, Sivian, White, and Wegel. I will try to introduce each of these people and describe their contributions to the speech and hearing field.

  6. Brain-Computer Interfaces for Speech Communication

    PubMed Central

    Brumberg, Jonathan S.; Nieto-Castanon, Alfonso; Kennedy, Philip R.; Guenther, Frank H.

    2010-01-01

    This paper briefly reviews current silent speech methodologies for normal and disabled individuals. Current techniques utilizing electromyographic (EMG) recordings of vocal tract movements are useful for physically healthy individuals but fail for tetraplegic individuals who do not have accurate voluntary control over the speech articulators. Alternative methods utilizing EMG from other body parts (e.g., hand, arm, or facial muscles) or electroencephalography (EEG) can provide capable silent communication to severely paralyzed users, though current interfaces are extremely slow relative to normal conversation rates and require constant attention to a computer screen that provides visual feedback and/or cueing. We present a novel approach to the problem of silent speech via an intracortical microelectrode brain computer interface (BCI) to predict intended speech information directly from the activity of neurons involved in speech production. The predicted speech is synthesized and acoustically fed back to the user with a delay under 50 ms. We demonstrate that the Neurotrophic Electrode used in the BCI is capable of providing useful neural recordings for over 4 years, a necessary property for BCIs that need to remain viable over the lifespan of the user. Other design considerations include neural decoding techniques based on previous research involving BCIs for computer cursor or robotic arm control via prediction of intended movement kinematics from motor cortical signals in monkeys and humans. Initial results from a study of continuous speech production with instantaneous acoustic feedback show the BCI user was able to improve his control over an artificial speech synthesizer both within and across recording sessions. The success of this initial trial validates the potential of the intracortical microelectrode-based approach for providing a speech prosthesis that can allow much more rapid communication rates. PMID:20204164

  7. Retrieving information on the neurophysiology of speech.

    PubMed

    Kelly, S A

    1988-01-01

    A search for literature on the relationship between the autonomic nervous system and speech was performed via the DIALOG Information Service on four databases: BIOSIS Previews, EMBASE, MEDLINE, and PsycINFO. A doctoral candidate in the Department of Audiology and Speech Sciences at Purdue University reviewed the citations retrieved for relevancy. Then, the coverage of the relevant literature by the four databases was analyzed. MEDLINE yielded the greatest number of relevant citations followed by EMBASE, BIOSIS, and PsycINFO. The four databases showed little overlap in coverage of citations or journal titles. This study indicates that it would be advantageous to search all four databases for the speech scientist. PMID:10304128

  8. Vector Adaptive/Predictive Encoding Of Speech

    NASA Technical Reports Server (NTRS)

    Chen, Juin-Hwey; Gersho, Allen

    1989-01-01

    Vector adaptive/predictive technique for digital encoding of speech signals yields decoded speech of very good quality after transmission at coding rate of 9.6 kb/s and of reasonably good quality at 4.8 kb/s. Requires 3 to 4 million multiplications and additions per second. Combines advantages of adaptive/predictive coding and code-excited linear prediction; the latter yields speech of high quality but requires 600 million multiplications and additions per second at an encoding rate of 4.8 kb/s. Vector adaptive/predictive coding technique bridges gaps in performance and complexity between adaptive/predictive coding and code-excited linear prediction.
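
    Adaptive/predictive coders of this family rest on short-term linear prediction of the speech waveform. The following is a minimal sketch of LPC analysis using the autocorrelation method and the Levinson-Durbin recursion, a textbook building block rather than the NASA coder itself; the model order is an illustrative choice.

```python
import numpy as np

def lpc(frame, order=10):
    """Linear-prediction coefficients via the autocorrelation method and
    the Levinson-Durbin recursion. Returns a[0..order-1] such that
    x[n] is predicted as sum_k a[k] * x[n-1-k], plus the final
    prediction-error energy."""
    n = len(frame)
    r = np.correlate(frame, frame, mode='full')[n - 1:n + order]  # lags 0..order
    a = np.zeros(order)
    err = r[0]
    for i in range(order):
        # reflection coefficient for model order i+1
        k = (r[i + 1] - np.dot(a[:i], r[i:0:-1])) / err
        a[:i + 1] = np.concatenate([a[:i] - k * a[:i][::-1], [k]])
        err *= 1.0 - k * k
    return a, err
```

    The coder transmits the predictor (or an excitation for it) rather than the waveform itself; for a signal that truly follows an AR model, the recovered coefficients match the generating ones and the residual energy is far below the signal energy.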

  9. Acoustic differences among casual, conversational, and read speech

    NASA Astrophysics Data System (ADS)

    Pinnow, DeAnna

    Speech is a complex behavior that allows speakers to use many variations to satisfy the demands connected with multiple speaking environments. Speech research typically obtains speech samples in a controlled laboratory setting using read material, yet anecdotal observations of such speech, particularly from talkers with a speech and language impairment, have identified a "performance" effect in the produced speech which masks the characteristics of impaired speech outside of the lab (Goberman, Recker, & Parveen, 2010). The aim of the current study was to investigate acoustic differences among laboratory read, laboratory conversational, and casual speech through well-defined speech tasks in the laboratory and in talkers' natural environments. Eleven healthy research participants performed lab recording tasks (19 read sentences and a dialogue about their life) and collected natural-environment recordings of themselves over 3-day periods using portable recorders. Segments were analyzed for articulatory, voice, and prosodic acoustic characteristics using computer software and hand counting. The current study results indicate that lab-read speech was significantly different from casual speech: greater articulation range, improved voice quality measures, lower speech rate, and lower mean pitch. One implication of the results is that different laboratory techniques may be beneficial in obtaining speech samples that are more like casual speech, thus making it easier to correctly analyze abnormal speech characteristics with fewer errors.

  10. Speech measurements using a laser Doppler vibrometer sensor: Application to speech enhancement

    Microsoft Academic Search

    Yekutiel Avargel; Israel Cohen

    2011-01-01

    In this paper, we present a remote speech-measurement system, which utilizes an auxiliary laser Doppler vibrometer (LDV) sensor. When focusing on the larynx, this sensor captures useful speech information at low-frequency regions (up to 1.5-2 kHz), and is shown to be immune to acoustical disturbances. For improved speech enhancement, we propose a new algorithm for efficiently combining the signals from

  11. Progressive apraxia of speech as a window into the study of speech planning processes

    Microsoft Academic Search

    Marina Laganaro; Michèle Croisier; Odile Bagou; Frédéric Assal

    We present a 3-year follow-up study of a patient with progressive apraxia of speech (PAoS), aimed at investigating whether the theoretical organization of phonetic encoding is reflected in the progressive disruption of speech. As decreased speech rate was the most striking pattern of disruption during the first 2 years, durational analyses were carried out longitudinally on syllables excised from spontaneous,

  12. Speech Perception and Short Term Memory Deficits in Persistent Developmental Speech Disorder

    PubMed Central

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2008-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech perception and short-term memory. Nine adults with a persistent familial developmental speech disorder without language impairment were compared with 20 controls on tasks requiring the discrimination of fine acoustic cues for word identification and on measures of verbal and nonverbal short-term memory. Significant group differences were found in the slopes of the discrimination curves for first formant transitions for word identification with stop gaps of 40 and 20 ms with effect sizes of 1.60 and 1.56. Significant group differences also occurred on tests of nonverbal rhythm and tonal memory, and verbal short-term memory with effect sizes of 2.38, 1.56 and 1.73. No group differences occurred in the use of stop gap durations for word identification. Because frequency-based speech perception and short-term verbal and nonverbal memory deficits both persisted into adulthood in the speech-impaired adults, these deficits may be involved in the persistence of speech disorders without language impairment. PMID:15896836

  13. Treating visual speech perception to improve speech production in non- fluent aphasia

    PubMed Central

    Fridriksson, Julius; Baker, Julie M.; Whiteside, Janet; Eoute, David; Moser, Dana; Vesselinov, Roumen; Rorden, Chris

    2008-01-01

    Background and Purpose Several recent studies have revealed modulation of the left frontal lobe speech areas not only during speech production, but also for speech perception. Crucially, the frontal lobe areas highlighted in these studies are the same ones that are involved in non-fluent aphasia. Based on these findings, this study examined the utility of targeting visual speech perception to improve speech production in non-fluent aphasia. Methods Ten patients with chronic non-fluent aphasia underwent computerized language treatment utilizing picture-word matching. To examine the effect of visual speech perception upon picture naming, two treatment phases were compared – one which included matching pictures to heard words and another where pictures were matched to heard words accompanied by a video of the speaker’s mouth presented on the computer screen. Results The results revealed significantly improved picture naming of both trained and untrained items following treatment when it included a visual speech component (i.e. seeing the speaker’s mouth). In contrast, the treatment phase where pictures were only matched to heard words did not result in statistically significant improvement of picture naming. Conclusions The findings suggest that focusing on visual speech perception can significantly improve speech production in non-fluent aphasia and may provide an alternative approach to treat a disorder where speech production seldom improves much in the chronic phase of stroke. PMID:19164782

  14. How visual timing and form information affect speech and non-speech processing.

    PubMed

    Kim, Jeesun; Davis, Chris

    2014-10-01

    Auditory speech processing is facilitated when the talker's face/head movements are seen. This effect is typically explained in terms of visual speech providing form and/or timing information. We determined the effect of both types of information on a speech/non-speech task (non-speech stimuli were spectrally rotated speech). All stimuli were presented paired with the talker's static or moving face. Two types of moving face stimuli were used: full-face versions (both spoken form and timing information available) and modified face versions (only timing information provided by peri-oral motion available). The results showed that the peri-oral timing information facilitated response time for speech and non-speech stimuli compared to a static face. An additional facilitatory effect was found for full-face versions compared to the timing condition; this effect only occurred for speech stimuli. We propose the timing effect was due to cross-modal phase resetting; the form effect to cross-modal priming. PMID:25190328

  15. Pronunciation learning for automatic speech recognition

    E-print Network

    Badr, Ibrahim

    2011-01-01

    In many ways, the lexicon remains the Achilles heel of modern automatic speech recognizers (ASRs). Unlike stochastic acoustic and language models that learn the values of their parameters from training data, the baseform ...

  16. Consonant landmark detection for speech recognition

    E-print Network

    Park, Chi-youn, 1981-

    2008-01-01

    This thesis focuses on the detection of abrupt acoustic discontinuities in the speech signal, which constitute landmarks for consonant sounds. Because a large amount of phonetic information is concentrated near acoustic ...

  17. Teaming for Speech and Auditory Training.

    ERIC Educational Resources Information Center

    Nussbaum, Debra B.; Waddy-Smith, Bettie

    1985-01-01

    The article suggests three strategies for the audiologist and speech/communication specialist to use in assisting the preschool teacher to implement a student's individualized education program: (1) demonstration teaming; (2) dual teaming; and (3) rotation teaming. (CL)

  18. Sounds and speech perception Productivity of language

    E-print Network

    Pillow, Jonathan

    Nasal/sinus passages · Lips and teeth · All affect the sound made · Phonetic features · Speech sounds differ on features · Vowel/consonant · Is there an obstruction of the vocal tract · For consonants (vowels have

  19. Assessment in Graded Speech Communication Internships.

    ERIC Educational Resources Information Center

    Weitzel, Al

    Fundamental differences exist between conceptualizing and operationalizing experiential education, in general, and graded speech communication internship classes-for-credit, in particular. Monitoring participation in experiential activities (including intercollegiate athletics, musical or stage production, publication of the campus newspaper, and…

  20. Increasing speech intelligibility in children with autism.

    PubMed

    Koegel, R L; Camarata, S; Koegel, L K; Ben-Tall, A; Smith, A E

    1998-06-01

    Accumulating studies are documenting specific motivational variables that, when combined into a naturalistic teaching paradigm, reliably influence the effectiveness of language teaching interactions for children with autism. However, the effectiveness of this approach has not yet been assessed with respect to improving speech intelligibility. The purpose of this study was to systematically compare two intervention conditions, a Naturalistic approach (which incorporated motivational variables) vs. an Analog (more traditional, structured) approach, with developmentally similar speech sounds equated within and across conditions for each child. Data indicate that although both methods effectively increased correct production of the target sounds under some conditions, functional use of the target sounds in conversation occurred only when the naturalistic procedures were used during intervention. Results are discussed in terms of pivotal variables that may produce improvements in speech sounds during conversational speech. PMID:9656136

  1. Automated Speech Recognition in air traffic control

    E-print Network

    Trikas, Thanassis

    1987-01-01

    Over the past few years, the technology and performance of Automated Speech Recognition (ASR) systems has been improving steadily. This has resulted in their successful use in a number of industrial applications. Motivated ...

  2. Managing to Speak by Managing the Speech.

    ERIC Educational Resources Information Center

    Sussman, Lyle

    1988-01-01

    The essence of giving a good speech is to view it as a managerial problem/opportunity and apply the four management functions to resolve it. These four functions are (1) planning; (2) organizing; (3) motivating; and (4) controlling. (JOW)

  3. Modelling Speech Dynamics with Trajectory-HMMs 

    E-print Network

    Zhang, Le

    2009-01-01

    The conditional independence assumption imposed by the hidden Markov models (HMMs) makes it difficult to model temporal correlation patterns in human speech. Traditionally, this limitation is circumvented by appending ...

  4. Speech therapy and voice recognition instrument

    NASA Technical Reports Server (NTRS)

    Cohen, J.; Babcock, M. L.

    1972-01-01

    Characteristics of an electronic circuit for examining variations in vocal excitation for diagnostic purposes and in speech recognition for determining voice patterns and pitch changes are described. Operation of the circuit is discussed and a circuit diagram is provided.

  5. commanimation: Creating and managing animations via speech

    E-print Network

    Kim, Hana

    A speech-controlled animation system is both a useful application program and a laboratory in which to investigate context-aware applications and the control of errors. The user need not have prior knowledge or ...

  6. Neuropsychology and Genetics of Speech, Language,

    E-print Network

    Carlini, David

    Neuropsychology and Genetics of Speech, Language, and Literacy Disorders Robin L. Peterson, MAa deprivation, and other more severe developmental disorders (such as autism or mental retardation). The first

  7. Intelligibility enhancement of synthetic speech in noise 

    E-print Network

    Valentini Botinhão, Cássia

    2013-11-28

    Providing the correct information in adverse conditions can be crucial to certain applications. Speech that adapts or reacts to different listening conditions can in turn be more expressive and natural. In this work we focus on enhancing the intelligibility...

  8. A Speech After a Circle Dance

    E-print Network

    Bkra shis bzang po

    2009-01-01

    This collection of 77 audio files focuses on weddings and weddings speeches but also contains: folk tales, folk songs, riddles, tongue twisters, and local history from Bang smad Village and Ri sne Village, Bang smad Township, Nyag rong County...

  9. Speech recognition using linear dynamic models. 

    E-print Network

    Frankel, Joe; King, Simon

    2006-01-01

    The majority of automatic speech recognition (ASR) systems rely on hidden Markov models, in which Gaussian mixtures model the output distributions associated with sub-phone states. This approach, whilst successful, models consecutive feature vectors...

  10. Speech emotional features extraction based on electroglottograph.

    PubMed

    Chen, Lijiang; Mao, Xia; Wei, Pengfei; Compare, Angelo

    2013-12-01

    This study proposes two classes of speech emotional features extracted from electroglottography (EGG) and the speech signal. The power-law distribution coefficients (PLDC) of voiced segment duration, pitch rise duration, and pitch down duration are obtained to reflect the information of vocal fold excitation. The real discrete cosine transform coefficients of the normalized spectrum of the EGG and speech signals are calculated to reflect the information of vocal tract modulation. Two experiments are carried out. One compares the proposed features with traditional features using sequential forward floating search and sequential backward floating search. The other is comparative emotion recognition based on a support vector machine. The results show that the proposed features outperform those commonly used in speaker-independent, content-independent speech emotion recognition. PMID:24047321
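    The real DCT coefficients of a normalized spectrum, as used above for the vocal-tract features, can be computed with a direct DCT-II. A minimal pure-Python sketch follows; the example spectrum is illustrative, not EGG data, and the function name is our own.

```python
import math

def dct_ii(x):
    """Direct (O(N^2)) DCT-II of a real sequence x[0..N-1].

    The low-order coefficients summarize the overall spectral shape,
    which is why a few of them suffice as compact features.
    """
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N))
            for k in range(N)]

# Hypothetical normalized spectrum (sums to 1); coefficient 0 is then
# simply the total energy, i.e. 1.0.
spectrum = [0.4, 0.3, 0.2, 0.1]
coeffs = dct_ii(spectrum)
```

    In practice a library routine (e.g. an FFT-based DCT) would replace the O(N^2) loop, but the coefficients are the same.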

  11. Making speech recognition work on the web

    E-print Network

    Varenhorst, Christopher J

    2011-01-01

    We present an improved Audio Controller for Web-Accessible Multimodal Interface toolkit -- a system that provides a simple way for developers to add speech recognition to web pages. Our improved system offers increased ...

  12. Linear dynamic models for automatic speech recognition 

    E-print Network

    Frankel, Joe

    The majority of automatic speech recognition (ASR) systems rely on hidden Markov models (HMM), in which the output distribution associated with each state is modelled by a mixture of diagonal covariance Gaussians. Dynamic ...
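    The per-state output distribution described above, a mixture of diagonal-covariance Gaussians, can be sketched in a few lines of pure Python. The weights, means, and variances below are illustrative values, not parameters from any trained recognizer.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_logpdf(x, weights, means, variances):
    """Log-density of a mixture of diagonal Gaussians, the usual
    HMM state output model; uses log-sum-exp for stability."""
    log_terms = [math.log(w) + diag_gauss_logpdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
    mx = max(log_terms)
    return mx + math.log(sum(math.exp(t - mx) for t in log_terms))

# Hypothetical 2-component mixture over 2-dimensional feature vectors.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
loglik = gmm_logpdf([0.0, 0.0], weights, means, variances)
```

    In a full recognizer these log-likelihoods feed the Viterbi search over HMM states; the mixture itself is the only part sketched here.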

  13. CHATR: A generic speech synthesis system 

    E-print Network

    Black, Alan W; Taylor, Paul A

    1994-01-01

    This paper describes a generic speech synthesis system called CHATR which is being developed at ATR. CHATR is designed in a modular way so that module parameters and even which modules are actually used may be set and ...

  14. The motor theory of speech perception reviewed

    PubMed Central

    GALANTUCCI, BRUNO; FOWLER, CAROL A.; TURVEY, M. T.

    2009-01-01

    More than 50 years after the appearance of the motor theory of speech perception, it is timely to evaluate its three main claims that (1) speech processing is special, (2) perceiving speech is perceiving gestures, and (3) the motor system is recruited for perceiving speech. We argue that to the extent that it can be evaluated, the first claim is likely false. As for the second claim, we review findings that support it and argue that although each of these findings may be explained by alternative accounts, the claim provides a single coherent account. As for the third claim, we review findings in the literature that support it at different levels of generality and argue that the claim anticipated a theme that has become widespread in cognitive science. PMID:17048719

  15. American Speech-Language-Hearing Association

    MedlinePLUS

    American Speech-Language-Hearing Association (ASHA) Making effective communication, a human right, accessible and achievable for all. ... to assess and treat kids with poor social communication skills. New streaming video course from Sylvia Diehl, ...

  16. Mothers' speech in three social classes

    Microsoft Academic Search

    C. E. Snow; A. Arlman-Rupp; Y. Hassing; J. Jobse; J. Joosten; J. Vorster

    1976-01-01

    Functional and linguistic aspects of the speech of Dutch-speaking mothers from three social classes to their 2-year-old children were studied. Mothers' speech in Dutch showed the same characteristics of simplicity and redundancy found in other languages. In a free play situation, both academic and lower middle class mothers produced more expansions and used fewer imperatives, more substantive deixis, and fewer

  17. Function of infant-directed speech

    Microsoft Academic Search

    Marilee Monnot

    1999-01-01

    The relationship between a biological process and a behavioral trait indicates a proximate mechanism by which natural selection can act. In that context, examining an aspect of infant health is one method of investigating the adaptive significance of infant-directed speech (ID speech), and it could help to explain the widespread use of this communication style. The correlation between infant growth

  18. Towards Quranic reader controlled by speech

    E-print Network

    Yekache, Yacine; Kouninef, Belkacem

    2012-01-01

    In this paper we describe the process of designing a task-oriented continuous speech recognition system for Arabic, based on CMU Sphinx4, to be used in the voice interface of a Quranic reader. The concept of the Quranic reader controlled by speech is presented, and the collection of the corpus and creation of the acoustic model are described in detail, taking into account the specificities of the Arabic language and the desired application.

  19. Electronic speech synthesis with microcomputer control

    NASA Astrophysics Data System (ADS)

    Fegan, D. J.; Grimley, H. M.

    1985-11-01

    An undergraduate physics laboratory project is described which utilizes a dedicated speech processor microchip and a series of read-only memories to produce a limited vocabulary of high quality spoken English under the interactive control of a Commodore 64 microcomputer for use in courses in acoustics. The project affords a cheap and simple introduction to electronic speech synthesis and also offers experience in microcomputer interfacing and control.

  20. Loss of speech after orthotopic liver transplantation.

    PubMed

    Bronster, D J; Boccagni, P; O'Rourke, M; Emre, S; Schwartz, M; Miller, C

    1995-01-01

    Alteration of speech is a rare but distressing complication of orthotopic liver transplantation (OLT). We describe a characteristic speech disorder identified in a large series of consecutive patients undergoing OLT. Between 1988 and 1993, 525 adults underwent OLT. For all recipients with neurologic complications, we reviewed clinical findings, imaging and electrophysiologic test results, and perioperative laboratory data. Five patients (ages 23-52; UNOS status 3-4) exhibited a characteristic pattern of stuttering dysarthria, leading to complete loss of speech production, occasionally with elements of aphasia. In four of the five patients, right-sided focal seizures were subsequently noted. All cases presented within the first 10 postoperative days and improved within 1 month of cessation of cyclosporin (CyA), although halting, monotonous speech was evident to some degree in all five for up to 1 year. There was no correlation between onset of symptoms and CyA levels. None of the patients had clinical or radiologic findings suggestive of central pontine myelinolysis or akinetic mutism. EEG and SPECT scan results were consistent with dysfunction in the left frontotemporoparietal regions of the brain. A characteristic speech disorder, which may be described as cortical dysarthria or speech apraxia, occurs in approximately 1% of adults undergoing OLT. Prompt recognition of this syndrome and temporary cessation of CyA therapy may favorably affect the course. PMID:7626186

  1. Robust coarticulatory modeling for continuous speech recognition

    NASA Astrophysics Data System (ADS)

    Schwartz, R.; Chow, Y. L.; Dunham, M. O.; Kimball, O.; Krasner, M.; Kubala, F.; Makhoul, J.; Price, P.; Roucos, S.

    1986-10-01

    The purpose of this project is to perform research into algorithms for the automatic recognition of individual sounds or phonemes in continuous speech. The algorithms developed should be appropriate for understanding large-vocabulary continuous speech input and are to be made available to the Strategic Computing Program for incorporation in a complete word recognition system. This report describes progress to date in developing phonetic models that are appropriate for continuous speech recognition. In continuous speech, the acoustic realization of each phoneme depends heavily on the preceding and following phonemes: a process known as coarticulation. Thus, while there are relatively few phonemes in English (on the order of fifty or so), the number of possible different acoustic realizations is in the thousands. Therefore, to develop high-accuracy recognition algorithms, one may need to develop literally thousands of relatively distinct phonetic models to represent the various phonetic contexts adequately. Developing a large number of models usually necessitates having a large amount of speech to provide reliable estimates of the model parameters. The major contributions of this work are the development of: (1) a simple but powerful formalism for modeling phonemes in context; (2) robust training methods for the reliable estimation of model parameters by utilizing the available speech training data in a maximally effective way; and (3) efficient search strategies for phonetic recognition while maintaining high recognition accuracy.

  2. Speech Pathology in Ancient India--A Review of Sanskrit Literature.

    ERIC Educational Resources Information Center

    Savithri, S. R.

    1987-01-01

    The paper is a review of ancient Sanskrit literature for information on the origin and development of speech and language, speech production, normality of speech and language, and disorders of speech and language and their treatment. (DB)

  3. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ...Condition of participation: Speech pathology services. 485.715 Section 485...Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology...

  4. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ...false Condition of participation: Speech pathology services. 485.715 ...of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech...

  5. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ...false Condition of participation: Speech pathology services. 485.715 ...of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech...

  6. 42 CFR 485.715 - Condition of participation: Speech pathology services.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ...false Condition of participation: Speech pathology services. 485.715 ...of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech...

  7. Systematic Studies of Modified Vocalization: The Effect of Speech Rate on Speech Production Measures during Metronome-Paced Speech in Persons Who Stutter

    ERIC Educational Resources Information Center

    Davidow, Jason H.

    2014-01-01

    Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control…

  8. Preschool Speech Intelligibility and Vocabulary Skills Predict Long-Term Speech and Language Outcomes Following Cochlear Implantation in Early Childhood

    PubMed Central

    Castellanos, Irina; Kronenberger, William G.; Beer, Jessica; Henning, Shirley C.; Colson, Bethany G.; Pisoni, David B.

    2013-01-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants, but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine if early preschool measures of speech and language performance predict speech-language functioning in long-term users of cochlear implants. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3 – 6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes. PMID:23998347

  9. Study on achieving speech privacy using masking noise

    NASA Astrophysics Data System (ADS)

    Tamesue, Takahiro; Yamaguchi, Shizuma; Saeki, Tetsuro

    2006-11-01

    This study focuses on achieving speech privacy using a meaningless steady masking noise. The most effective index for achieving a satisfactory level of speech privacy was selected, choosing between spectral distance and the articulation index. As a result, spectral distance was selected as the best and most practical index for achieving speech privacy. Next, speech, along with masking noise at sound pressure levels corresponding to various speech privacy levels, was presented to subjects, who judged the psychological impression of each speech privacy level. Theoretical calculations were in good agreement with the experimental results.

  10. Relationship between listeners' nonnative speech recognition and categorization abilities.

    PubMed

    Atagi, Eriko; Bent, Tessa

    2015-01-01

    Enhancement of the perceptual encoding of talker characteristics (indexical information) in speech can facilitate listeners' recognition of linguistic content. The present study explored this indexical-linguistic relationship in nonnative speech processing by examining listeners' performance on two tasks: nonnative accent categorization and nonnative speech-in-noise recognition. Results indicated substantial variability across listeners in their performance on both the accent categorization and nonnative speech recognition tasks. Moreover, listeners' accent categorization performance correlated with their nonnative speech-in-noise recognition performance. These results suggest that having more robust indexical representations for nonnative accents may allow listeners to more accurately recognize the linguistic content of nonnative speech. PMID:25618098

  11. Speech processing based on short-time Fourier analysis

    SciTech Connect

    Portnoff, M.R.

    1981-06-02

    Short-time Fourier analysis (STFA) is a mathematical technique that represents nonstationary signals, such as speech, music, and seismic signals in terms of time-varying spectra. This representation provides a formalism for such intuitive notions as time-varying frequency components and pitch contours. Consequently, STFA is useful for speech analysis and speech processing. This paper shows that STFA provides a convenient technique for estimating and modifying certain perceptual parameters of speech. As an example of an application of STFA of speech, the problem of time-compression or expansion of speech, while preserving pitch and time-varying frequency content is presented.
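    The time-varying spectrum that STFA produces can be illustrated with a minimal pure-Python sketch: slide a window along the signal and take the DFT of each frame. The frame length, hop size, and test tone below are illustrative choices, not values from the paper.

```python
import cmath
import math

def stft(signal, frame_len=64, hop=32):
    """Short-time Fourier analysis: Hann-window successive frames of the
    signal and take the DFT of each, giving a (frames x bins) spectrogram
    of magnitudes."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive O(N^2) DFT of one windowed frame; an FFT would be used in practice.
        spectrum = [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                            for n in range(frame_len)))
                    for k in range(frame_len // 2 + 1)]
        frames.append(spectrum)
    return frames

# A 1 kHz tone sampled at 8 kHz: with 64-sample frames the bin spacing is
# 125 Hz, so the energy should concentrate in bin 1000/125 = 8.
fs, f0 = 8000, 1000
tone = [math.sin(2 * math.pi * f0 * n / fs) for n in range(512)]
spectra = stft(tone)
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

    Time-scale modification as described in the abstract then amounts to changing the hop between frames at synthesis time while keeping each frame's spectrum, so pitch and spectral content are preserved.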

  12. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  13. Neural bases of childhood speech disorders: lateralization and plasticity for speech functions during development.

    PubMed

    Liégeois, Frédérique J; Morgan, Angela T

    2012-01-01

    Current models of speech production in adults emphasize the crucial role played by the left perisylvian cortex, primary and pre-motor cortices, the basal ganglia, and the cerebellum for normal speech production. Whether similar brain-behaviour relationships and leftward cortical dominance are found in childhood remains unclear. Here we reviewed recent evidence linking motor speech disorders (apraxia of speech and dysarthria) and brain abnormalities in children and adolescents with developmental, progressive, or childhood-acquired conditions. We found no evidence that unilateral damage can result in apraxia of speech, or that left hemisphere lesions are more likely to result in dysarthria than lesions to the right. The few studies reporting on childhood apraxia of speech converged towards morphological, structural, metabolic or epileptic anomalies affecting the basal ganglia, perisylvian and rolandic cortices bilaterally. Persistent dysarthria, similarly, was commonly reported in individuals with syndromes and conditions affecting these same structures bilaterally. In conclusion, for the first time we provide evidence that long-term and severe childhood speech disorders result predominantly from bilateral disruption of the neural networks involved in speech production. PMID:21827785

  14. The Use of Interpreters by Speech-Language Pathologists Conducting Bilingual Speech-Language Assessments

    ERIC Educational Resources Information Center

    Palfrey, Carol Lynn

    2013-01-01

    The purpose of this non-experimental quantitative study was to explore the practices of speech-language pathologists in conducting bilingual assessments with interpreters. Data were obtained regarding the assessment tools and practices used by speech-language pathologists, the frequency with which they work with interpreters, and the procedures…

  15. Implementing Speech Supplementation Strategies: Effects on Intelligibility and Speech Rate of Individuals with Chronic Severe Dysarthria.

    ERIC Educational Resources Information Center

    Hustad, Katherine C.; Jones, Tabitha; Dailey, Suzanne

    2003-01-01

    A study compared intelligibility and speech rate differences following speaker implementation of 3 strategies (topic, alphabet, and combined topic and alphabet supplementation) and a habitual speech control condition for 5 speakers with severe dysarthria. Combined cues and alphabet cues yielded significantly higher intelligibility scores and…

  16. Autonomic and Emotional Responses of Graduate Student Clinicians in Speech-Language Pathology to Stuttered Speech

    ERIC Educational Resources Information Center

    Guntupalli, Vijaya K.; Nanjundeswaran, Chayadevie; Dayalu, Vikram N.; Kalinowski, Joseph

    2012-01-01

    Background: Fluent speakers and people who stutter manifest alterations in autonomic and emotional responses as they view stuttered relative to fluent speech samples. These reactions are indicative of an aroused autonomic state and are hypothesized to be triggered by the abrupt breakdown in fluency exemplified in stuttered speech. Furthermore,…

  17. Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2011

    2011-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  18. Lexical Stress Modeling for Improved Speech Recognition of Spontaneous Telephone Speech in the JUPITER Domain1

    E-print Network

    greatly reduce the number of competing word candidates [4]. Clearly, lexical stress contains useful information for recognition. Prior work has relied on the recognition of stress patterns to reduce word candidates for large-vocabulary isolated word recognition [4, 5].

  19. Hormones and temporal components of speech: sex differences and effects of menstrual cyclicity on speech

    Microsoft Academic Search

    Sandra P Whiteside; Anna Hanson; Patricia E Cowell

    2004-01-01

    Voice onset time (VOT) is a salient acoustic parameter of speech which signals the ‘voiced’ and ‘voiceless’ status of plosives in English (e.g. the initial sound in ‘bat’ versus the initial sound in ‘pat’). As a micro-temporal acoustic parameter, VOT may be sensitive to changes in hormones which may affect the neuromuscular systems involved in speech production. This study adopted

  20. Cued Speech for Enhancing Speech Perception and First Language Development of Children With Cochlear Implants

    PubMed Central

    Leybaert, Jacqueline; LaSasso, Carol J.

    2010-01-01

    Nearly 300 million people worldwide have moderate to profound hearing loss. Hearing impairment, if not adequately managed, has strong socioeconomic and affective impact on individuals. Cochlear implants have become the most effective vehicle for helping profoundly deaf children and adults to understand spoken language, to be sensitive to environmental sounds, and, to some extent, to listen to music. The auditory information delivered by the cochlear implant remains non-optimal for speech perception because it delivers a spectrally degraded signal and lacks some of the fine temporal acoustic structure. In this article, we discuss research revealing the multimodal nature of speech perception in normally-hearing individuals, with important inter-subject variability in the weighting of auditory or visual information. We also discuss how audio-visual training, via Cued Speech, can improve speech perception in cochlear implantees, particularly in noisy contexts. Cued Speech is a system that makes use of visual information from speechreading combined with hand shapes positioned in different places around the face in order to deliver completely unambiguous information about the syllables and the phonemes of spoken language. We support our view that exposure to Cued Speech before or after the implantation could be important in the aural rehabilitation process of cochlear implantees. We describe five lines of research that are converging to support the view that Cued Speech can enhance speech perception in individuals with cochlear implants. PMID:20724357